31.IJAEST Vol No 5 Issue No 2 VLSI Architecture and ASIC Implementation of ICE Encryption Algorithm 310 314
Abstract- In modern security, safe cryptographic algorithms that can be implemented in hardware are a necessity. This paper proposes a hardware architecture for the implementation of the Information Concealment Engine (ICE) encryption algorithm. ICE was designed for use in software applications. Those applications are slow due to the use of modular arithmetic, so faster implementations are needed, and these can be achieved in hardware. The system operates for the encryption process and has been optimized for low hardware resources and for high speed. The paper discusses ICE, DES and Triple DES and compares their performance. The proposed architecture has been implemented as an ASIC in 180 nm technology.

Keywords- ICE, Montgomery Multiplication, Modular multiplication, Systolic architecture

I. INTRODUCTION
In this paper we discuss ICE, DES and Triple DES, and compare the performance of all three.

The ICE algorithm was designed for use in software applications. Those applications, however, are slow due to the use of modular arithmetic [3], so the need for faster implementations is great. That can be achieved through hardware implementations. Since hardware implementations are generally faster and more reliable than software implementations, the outcome of a hardware design is even more interesting.

In this paper, the architecture and the VLSI implementation of the ICE encryption algorithm are proposed. The system operates for both the encryption and decryption processes and has been optimized for low hardware resources and for high-speed performance. The proposed architecture has very encouraging performance results in terms of speed and throughput. This makes the design very useful in current applications that use DES as the base of a cryptographic protocol [2].

ICE is a Feistel network with a block size of 64 bits. The standard ICE algorithm takes a 64-bit key and has 16 rounds. A fast variant, Thin-ICE, uses only 8 rounds. An open-ended variant, ICE-n, uses 16n rounds with a 64n-bit key. A published attack on Thin-ICE recovers the secret key using 2^23 chosen plaintexts with a 25% success probability. If 2^27 chosen plaintexts are used, the probability can be improved to 95%. For the standard version of ICE, an attack on 15 out of 16 rounds was found, requiring 2^56 work and at most 2^56 chosen plaintexts.

... three days. The machine cost less than $250,000 and searched over 88 billion keys per second.

B. Triple DES
The Triple-DES variant was developed after it became clear that DES by itself was too easy to crack. It uses three 56-bit DES keys, giving a total key length of 168 bits. Encryption using Triple-DES is simply

ciphertext = E_K3(D_K2(E_K1(plaintext)))

that is, DES encrypt with K1, DES decrypt with K2, then DES encrypt with K3. Because Triple-DES applies the DES algorithm three times (hence the name), Triple-DES takes
... right half and a 60-bit subkey are fed into the function F. The output of F is XORed with the left half, and then the halves are swapped. This is the Transformation Round of the ICE algorithm. This process is repeated for 16 rounds [2], [3]; however, the final swap is left out. The decryption process is the same, except that the subkeys are used in reverse order.

The Feistel structure has three advantages. First, it gives a one-to-one mapping between plaintext and ciphertext, which is necessary for a cipher to be decryptable. Second, Feistel ciphers have been publicly cryptanalysed for more than two decades, and no systematic weakness has been uncovered. Finally, Feistel ciphers are reasonably fast and simple to implement in software. Speed and simplicity were two important design aims for ICE [3].

III. THE F FUNCTION
The ICE F function is similar in structure to the one used in DES, with the exception of the keyed permutation described below [2]. The function as a whole is illustrated in Figure 1.

Figure 1. ICE F Function

A. The Expansion Function E
The 32-bit plaintext half is expanded to four 10-bit values, E1, E2, E3 and E4, in the following manner [2], [3].
E1 = P1 P0 P31 P30 P29 P28 P27 P26 P25 P24
E2 = P25 P24 P23 P22 P21 P20 P19 P18 P17 P16
E3 = P17 P16 P15 P14 P13 P12 P11 P10 P9 P8
E4 = P9 P8 P7 P6 P5 P4 P3 P2 P1 P0
This expansion function was chosen because four 10-bit values were needed for the S-boxes, and it was reasonably fast to implement in software [3].

B. Keyed Permutation
After expansion, a keyed permutation is applied. The permutation subkey is 20 bits long and is used to swap bits between E1 and E3, and between E2 and E4. When an odd-numbered bit of the permutation subkey is set, the corresponding bits of E1 and E3 are swapped; when an even-numbered bit is set, the corresponding bits of E2 and E4 are swapped. For example, if bit 1 of the subkey is set, bit 0 of E1 and E3 will be swapped. If bit 2 of the subkey is set, bit 1 of E2 and E4 will be swapped [2], [3].
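The expansion and the bit swapping described above can be sketched in Python. This is a minimal illustration and not the hardware design of this paper: the function names `expand` and `keyed_swap` are hypothetical, P0 is assumed to be the least significant bit of the 32-bit half, and the way the 20-bit permutation subkey is split into one 10-bit mask per pair is left to the caller because that mapping is an assumption here.

```python
from typing import Tuple


def expand(p: int) -> Tuple[int, int, int, int]:
    """Expand a 32-bit half into four 10-bit values E1..E4 (P0 assumed to be the LSB)."""
    p &= 0xFFFFFFFF
    e1 = ((p & 0x3) << 8) | (p >> 24)   # P1 P0 P31..P24
    e2 = (p >> 16) & 0x3FF              # P25..P16
    e3 = (p >> 8) & 0x3FF               # P17..P8
    e4 = p & 0x3FF                      # P9..P0
    return e1, e2, e3, e4


def keyed_swap(a: int, b: int, mask: int) -> Tuple[int, int]:
    """Swap the bits of a and b at every position where mask has a 1."""
    t = (a ^ b) & mask
    return a ^ t, b ^ t


# Usage: expand a half, then swap bit 0 between E1 and E3.
# The mask value is illustrative only.
e1, e2, e3, e4 = expand(0x80000001)
e1, e3 = keyed_swap(e1, e3, 0b0000000001)
```

The XOR-and-mask form of the swap touches only the selected bit positions, which in hardware corresponds to one 2-to-1 multiplexer per bit pair.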
The values E1, E2, E3 and E4, after being permuted, are XORed with 40 bits of the subkey and then used as input to the four S-boxes S1, S2, S3 and S4. Each S-box takes a 10-bit input and produces an 8-bit output [3].
D. The S-Boxes
The S-boxes in ICE are similar in structure to those used in LOKI [4] in their use of Galois field exponentiation. Each S-box takes a 10-bit input X. Bits X9 and X0 are concatenated to form the row selector R. Bits X8...X1 are concatenated to form the 8-bit column selector C. For each row R, there is an XOR offset value O_R and a Galois field prime P_R [2], [3]. The 8-bit output of an S-box for an input X is given by (C XOR O_R)^7 mod P_R under 8-bit Galois field arithmetic. The exponent 7 was chosen because exponentiation by 7 is a one-to-one function [2], [3]. The XOR offsets for each row in each S-box are given in Table 1, while the prime numbers are specified in Table 2.
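The S-box computation above can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and the offsets and field polynomial used in the usage lines are placeholders (the polynomial 0x11B is a well-known irreducible polynomial of degree 8), not the values of Table 1 and Table 2.

```python
from typing import Sequence


def gf_mult(a: int, b: int, poly: int) -> int:
    """Multiply a and b in GF(2^8), reducing modulo the 9-bit polynomial poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
        if a & 0x100:       # degree 8 reached: reduce
            a ^= poly
    return result


def gf_exp7(x: int, poly: int) -> int:
    """Raise x to the 7th power in GF(2^8) using four multiplications."""
    x2 = gf_mult(x, x, poly)
    x3 = gf_mult(x, x2, poly)
    x6 = gf_mult(x3, x3, poly)
    return gf_mult(x, x6, poly)


def sbox(x: int, offsets: Sequence[int], polys: Sequence[int]) -> int:
    """Map a 10-bit input to an 8-bit output: (C XOR O_R)^7 mod P_R."""
    row = ((x >> 8) & 0x2) | (x & 0x1)   # R = X9 X0
    col = (x >> 1) & 0xFF                # C = X8..X1
    return gf_exp7(col ^ offsets[row], polys[row])


# Placeholder parameters, NOT the values from Tables 1 and 2.
DEMO_OFFSETS = (0x00, 0x01, 0x02, 0x03)
DEMO_POLYS = (0x11B, 0x11B, 0x11B, 0x11B)
y = sbox(0b1000000011, DEMO_OFFSETS, DEMO_POLYS)
```

Because 7 and 255 share no common factor, exponentiation by 7 permutes the 256 field elements whenever the modulus is irreducible, which is why each row of the S-box is one-to-one in C.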
Table 1. The S-Box offset values
Table 2. The S-Box Galois Field Prime values

From the analysis of ICE, the main design effort lies in the ICE transformation round, especially in the implementation of the F function, shown in Figure 3. The F function has four parts: the expansion function E, the keyed permutation, the S-boxes and the permutation function P [2].
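The transformation round can be illustrated with a generic Feistel skeleton. This is a sketch under stated assumptions rather than the ICE datapath: `toy_f` is a hypothetical stand-in for the F function, the subkeys are arbitrary integers, and the key schedule is not modelled. It shows only the round structure: XOR the output of F into the left half, swap, omit the final swap, and decrypt by reversing the subkey order.

```python
from typing import Callable, Sequence

MASK32 = 0xFFFFFFFF


def feistel_encrypt(block: int, subkeys: Sequence[int],
                    f: Callable[[int, int], int]) -> int:
    """Run one Feistel pass over a 64-bit block, one round per subkey."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in subkeys:
        # XOR F(right, k) into the left half, then swap the halves.
        left, right = right, left ^ (f(right, k) & MASK32)
    # The final swap is left out, so the halves are emitted in swapped order.
    return (right << 32) | left


def feistel_decrypt(block: int, subkeys: Sequence[int],
                    f: Callable[[int, int], int]) -> int:
    """Decryption is the same pass with the subkeys in reverse order."""
    return feistel_encrypt(block, list(reversed(subkeys)), f)


def toy_f(half: int, subkey: int) -> int:
    """Hypothetical round function used only to exercise the structure."""
    return ((half * 0x9E3779B1) ^ subkey) & MASK32


subkeys = list(range(1, 17))            # 16 illustrative round subkeys
ciphertext = feistel_encrypt(0x0123456789ABCDEF, subkeys, toy_f)
plaintext = feistel_decrypt(ciphertext, subkeys, toy_f)
```

Note that `toy_f` does not need to be invertible; the Feistel structure alone guarantees that decryption recovers the plaintext.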
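... subkeys are used in reverse order [2].

5. E = MM(D, A, P)
6. Out = MM(E, 1, P)

Function MM(A, B, M)
1. R = 0
2. For k = 0 to n-1 do begin
3.   q = (R0 + Ak B0) mod b
4.   R = R + Ak B + q M
5.   R = R / b
   end
6. Return R
End

Here Ak is the k-th digit of A, and R0 and B0 are the least significant digits of R and B.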
This algorithm is a modified version of the original Montgomery multiplication algorithm. The base b is taken as radix 2 (b = 2) and R = 2^n, where n is the bit length of the operands A and B [2].
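A bit-serial radix-2 Montgomery multiplication of this kind can be sketched in Python as follows. This is a minimal software model, assuming an odd modulus M below 2^n and operands below M; the names `mont_mul` and `mod_mul` are hypothetical, and the sketch keeps the textbook final conditional subtraction, which a hardware design may arrange to avoid.

```python
def mont_mul(a: int, b: int, m: int, n: int) -> int:
    """Return a * b * 2^(-n) mod m for odd m < 2^n and 0 <= a, b < m."""
    r = 0
    for k in range(n):
        a_k = (a >> k) & 1
        q = (r + a_k * b) & 1            # makes r + a_k*b + q*m even
        r = (r + a_k * b + q * m) >> 1   # exact division by the radix 2
    if r >= m:                           # r < 2m holds here, so one subtraction suffices
        r -= m
    return r


def mod_mul(a: int, b: int, m: int, n: int) -> int:
    """Plain a * b mod m built from two Montgomery multiplications."""
    r2 = pow(2, 2 * n, m)                # precomputed constant R^2 mod m
    return mont_mul(mont_mul(a, b, m, n), r2, m, n)


# Usage with an illustrative 8-bit odd modulus.
product = mod_mul(123, 200, 239, 8)
```

The second call in `mod_mul` multiplies by R^2 mod M to cancel the factor 2^(-n) left by the first call.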
Figure 3. The ICE Transformation Round
... the generated partial product. This solution is quite slow, as the final result is only available after n clock cycles, where n is the size of the operands [9].

The advantage of the Montgomery calculation is that no subtractions are needed to reduce the intermediate results. The disadvantage is that Montgomery modular multiplication calculates A·B·2^(-n) mod M instead of A·B mod M, so two Montgomery modular multiplications are required for one modular multiplication [6]. The algorithm computes the product of two integers modulo a third one without performing division by M. It yields the reduced product using a series of additions [9].

Function MMcs(A, B, M)
1. C2in = 0, C1in = 0, Sin = 0
2. For k = 0 to n-1 do begin
3.   q = (Sin0 + C1in0 + C2in0 + Ak B0) mod 2
4.   C1 + C2 + S = C2in + C1in + Sin + Ak B + q M
5.   C2in = C2/2, C1in = C1/2, Sin = S/2
   end
6. Return C2in/2, C1in/2 and Sin/2

The Carry2 signal of a PE is connected to the next PE of the next row of PEs, the Carry1 signal is connected to the same PE of the next row (the same column), while the Sum signal is connected to the previous PE of the next row, starting from bit 0 [8].

Figure 5. The Systolic Conventional Architecture

VI. PERFORMANCE
The proposed architecture has been captured using Verilog HDL. All the internal components of the design were synthesized, placed and routed using Cadence tools for an ASIC target. The VLSI synthesis results are shown in Table 3, which contains the frequency, area and power comparisons.

Table 3. Frequency, Area and Power comparisons

Encryption Algorithm   Freq (GHz)   Power (pW)   Area
ICE                    3            68631.1      571191
DES                    1            76236.8      293458
TRIPLE DES             1            251144.4     223921

VII. CONCLUSIONS
ICE is a symmetric key block cipher designed for software applications. An efficient architecture for its VLSI implementation is proposed in this paper. It is designed for high clock speed and minimized area resources using the Montgomery multiplication method. It is shown that the ICE algorithm used for this architecture can be implemented in hardware. The ASIC implementation is a system that has an external clock of 3 GHz and a throughput of 12 Gbits/sec. Compared with other popular encryption algorithms, the performance of the ICE implementation is better than that of most block encryption algorithm implementations.

REFERENCES
[1] B. Schneier, Applied Cryptography: Protocols, Algorithms and Source Code in C, 2nd ed., John Wiley & Sons, New York, 1996.
[2] A. P. Fournaris, N. Sklavos and O. Koufopavlou, "VLSI Architecture and FPGA Implementation of ICE Encryption Algorithm,"
[3] M. Kwan, "The Design of the ICE Encryption Algorithm," in Proc. of Fast Software Encryption Workshop, 1997.
[4] L. Brown, J. Pieprzyk and J. Seberry, "LOKI: A Cryptographic Primitive for Authentication and Secrecy Applications," Advances in Cryptology, AUSCRYPT '90 Proceedings, Springer-Verlag, pp. 229-236, 1990.
[5] P. L. Montgomery, "Modular multiplication without trial division," Mathematics of Computation, vol. 44, no. 170, pp. 519-521, 1985.
[6] D. N. Amanor, C. Paar, J. Pelzl, V. Bunimov and M. Schimmler, "Efficient hardware architectures for Modular Multiplication on FPGAs,"
[7] C. K. Koc, T. Acar and B. S. Kaliski, "Analyzing and Comparing Montgomery Multiplication Algorithms," IEEE Micro, vol. 16, no. 3, pp. 26-33, June 1996.
[8] A. P. Fournaris and O. Koufopavlou, "Montgomery Modular Multiplication Architectures and hardware implementations for an RSA cryptosystem,"
[9] N. Nedjah and L. de Macedo Mourelle, "Systolic hardware Implementation for the Montgomery Modular Multiplication,"
[10] S. E. Eldridge and C. D. Walter, "Hardware implementation of Montgomery's Modular Multiplication Algorithm," IEEE Transactions on Computers, 42(6):619-624, 1993.
[11] C. D. Walter, "Montgomery Exponentiation Needs No Final Subtraction," Electronics Letters, vol. 35, no. 21, pp. 1831-1832, October 1999.
[12] D. J. Guan, "Montgomery Algorithm for Modular Multiplication," Lecture notes, National Sun Yat Sen University, 2001.
[13] T. H. Cormen, C. E. Leiserson and R. L. Rivest, Introduction to Algorithms, The MIT Press, Cambridge, 1990.
[14] S. Gueron, "Enhanced Montgomery Multiplication," in Proc. of Workshop on Cryptographic Hardware and Embedded Systems, San Francisco, August 13-15, 2002.
[15] Taek-Won Kwan, Chang-Seok you, Won-seok Heo, Yong-kyu Kang, and