LEA A 128-Bit Block Cipher For Fast Encryption On Common Processors-English
LEA A 128-Bit Block Cipher For Fast Encryption On Common Processors-English
Processors
Deukjo Hong1 , Jung-Keun Lee1 , Dong-Chan Kim1 , Daesung Kwon1 , Kwon Ho Ryu1 , and Dong-Geon Lee2
1
Attached Institute of ETRI
{hongdj, jklee, dongchan, ds kwon, jude}@ensec.re.kr
2
Information Security & IoT Laboratory, Pusan National University
[email protected]
Abstract. We propose a new block cipher LEA, which has 128-bit block size and 128, 192, or 256-bit key
size. It provides a high-speed software encryption on general-purpose processors. Our experiments show that
LEA is faster than AES on Intel, AMD, ARM, and ColdFire platforms. LEA can be also implemented to have
tiny code size. Its hardware implementation has a competitive throughput per area. It is secure against all the
existing attacks on block ciphers.
Key words: LEA, Block cipher, Fast encryption
1 Introduction
CPUs and operating systems are continuously developing, and many computing devices work much better than
before, with such powerful resources. For example, smart portable devices like smart phones and tablet PCs do not
only replace mobile phones but also allow to enjoy various cloud computing and social network services. With those
applications, the amount of the private data which people create for their business and life will be significantly
increasing. Another example is a smart meter, which is a basic unit of an advanced metering infrastructure in
a smart grid, recording consumption of electric energy, gathering data for remote reporting, and communicating
with the utility for monitoring and billing purpose. For the convenience of management, smart meters are often
implemented to perform tasks in software with small CPUs [18].
Those data mentioned in the above examples are usually important information which must be protected from
various threats in networks. It implies that the wide use of software applications significantly causes the necessity
of cryptographic systems on software platforms. With this consideration, we have been interested in a software
encryption. Software encryptions are easier to deploy and more cost-effective than hardware ones in many cases.
In particular, when a new encryption service is required for already deployed computing environments, a software
encryption is more suitable than a hardware one.
A block cipher is one of the most widely used cryptographic primitives. It is applied to data encryption, message
authentication, random bit generation, message hashing and so on. Presumably, the most widely used block cipher
in the world is AES [27] which has been established as various international standards. AES shows good performance
figures on most software and hardware platforms and is generally considered to be secure after surviving about 15
years of comprehensive cryptanalysis though some weaknesses have been found. Since AES, many block ciphers
have been designed for hardware lightweight implementation. Some of them were standardized as ISO lightweight
cryptography (ISO/IEC 29192-2). The main feature of the lightweight block ciphers is the efficient hardware im-
plementation with low resource. In order to achieve that goal, most of them use simple structures with small block
sizes and large number of rounds. However, those design approaches usually lead to low performance, and is far
from our consideration for software encryption. Consequently, we have designed a new block cipher providing a fast
encryption on common software platforms.
1.1 Contribution
We propose a new block cipher LEA. It has the block size of 128 bits and the key size of 128, 192, or 256 bits. We
denote the algorithms with 128-bit, 192-bit, and 256-bit keys by LEA-128, LEA-192, and LEA-256, respectively.
The structure of LEA has the following features.
1. LEA consists of only ARX (modular Addition, bitwise Rotation, and bitwise XOR) operations for 32-bit words.
Those operations are well-supported and fast in many 32-bit and 64-bit platforms. Moreover, we suppose that
the usage of 32-bit and 64-bit processors will grow rapidly compared to 8-bit or 16-bit ones.
2. The ARX operations contribute to the encryption and key schedule procedures in efficient and parallel way. Our
arrangement of operations does not only lead to fast software encryption and small size code, but also strong
resistance against the attacks using the properties of a particular operation.
3. The last round function of LEA is the same as other round functions, while many block cipher including DES
and AES have special last round functions which are somewhat different from other round functions. This is
for the encryption speed in both software and hardware because we think the block cipher encryption is more
frequently used than decryption.
4. The key schedule of LEA has a simple structure without any interleaving between 32-bit key words. It is good
for the efficiency, and does not cause any weakness.
Security. Our goal for the security of LEA is to get the resistance against all the existing attacks for block ciphers
and to provide enough security margin. To achieve this goal, we firstly found the minimum number R of rounds for
LEA to resist against all the known cryptanalytic techniques for each key size. Then we determined the number of
rounds of LEA as around 3R/2 to prepare for the unknown attacks to appear in future.
Efficiency. LEA provides a fast encryption on many platforms. Our experiments measuring the speed for one-block
encryption on the platforms of Intel, AMD, ARM and ColdFire show that even a C level implementation of LEA is
very fast. It implies that the evaluation of LEA encryption requires the light overhead to CPUs. Note that the light
overhead can lead to the low power consumption which is useful for the devices based on batteries. The optimized
implementation of LEA-128 for one-block encryption is faster than those of AES-128 publicly reported [25, 47], on
our test platforms. To objectivity, we used the announced facts for comparison instead of implementing AES.
LEA can be implemented with SIMD operations supported by Intel and AMD CPUs such that it encrypts
4 blocks simultaneously. It is useful for the highly fast encryption with ECB or CTR modes under a powerful
environment like a server-based computing. Our experiments on Intel Core 2 Quad Q6600 and Intel Core i7-800
show that the speed of the 4-block SIMD implementation of LEA-128 is about 2 times and 1.7 times faster than
the best records of the multi-block encryption codes of AES-128 [35], respectively.
We also found that LEA is implemented with a small code-size. The small-size implementation is useful in
a memory-limited environment. LEA-128 is implemented with less than 600 and 750 bytes on the platforms of
ARM926EJ-S and ColdFire MCF5213, respectively, while AES-128 is known to be implemented with around 2,400
and 960 bytes on the platforms of ARM7TDMI and ColdFire v2, respectively.
Comparison with Other Ciphers. We compare LEA to other ciphers in order to explain why it is meaningful to
propose this new block cipher.
– AES. AES was designed based on design and analysis techniques by 2000, and cryptanalysis of block ciphers
has been continuously researched and developed. Recent several attacks have pointed out some weaknesses for
AES. In [11], Biryukov et al. presented a chosen-key distinguisher for full 14-round AES-256 and converted it to
a key-recovery attack for a weak key class with the complexities of 2131 time and 265 memory. In [10], Biryukov
and Khovratovich presented related-key boomerang attacks on full 14-round AES-256 with 299.5 time and data
complexities and AES-192 with 2176 time and 2123 data complexities. In [14], Bogdanov et al. used biclique
techniques to make key recovery attacks on full AES-128, AES-192, and AES-256 with time complexities 2126.1 ,
2189.7 , and 2254.4 , respectively. LEA is designed based on the latest design and analysis techniques and we
checked that LEA is secure and has sufficient security margin against all the existing attacks.
Furthermore, as we already mentioned, LEA provides better software encryptions in speed and size on many
platforms than AES.
– Block ciphers with ARX structure. TEA [56] and XTEA [46] are Feistel block ciphers with simple round
function and key schedule. Their encryption speeds are not fast because they have the block length of 64 bits
shorter than LEA and 64 rounds more than LEA. Additionally, there are full-round attacks [37, 58] on TEA
and XXTEA [57], which is the third algorithm of TEA family.
At the AES competition, RC6 [49] was regarded as faster than Rijndael [21], which is the AES winner, on
many software platforms. However, parallelism offered by modern CPUs is not exploited well with RC6, and
the performance of recent implementation of AES exceeds that of RC6.
HIGHT [31] is a lightweight block cipher based on 8-bit ARX operations. So, it is not suitable for fast encryption
on 32-bit CPUs. Recently, full-round attacks on HIGHT have been published [32, 41].
Hash functions often adopt the ARX structure for the high performance on various platforms, similarly to our
design goal [2, 26]. Most of them have block ciphers as a component for building compression and hash functions.
They are even secure against attacks for block ciphers. However, hash functions and block ciphers are different
in the usage. In particular, most block ciphers in the hash functions have much larger block and key sizes than
those usually required for the security and application of block ciphers.
Recently, NSA published two block cipher families SPECK and SIMON [3]. SPECK is a typical ARX cipher
and SIMON consists of ANDs, rotations, and XORs. They have various parameters. The algorithms with 128-
bit block are comparable with LEA. The Performance of LEA is faster than SIMON in both 32- and 64-bit
processors. Since SPECK uses 64-bit addition with 128-bit block, its performance exceeds that of LEA only in
64-bit processors but LEA is more suitable for most 32-bit processors.
– Lightweight block ciphers. Many lightweight block ciphers like HIGHT [31], PRESENT [13], LED [30], and
Piccolo [51] have short block size and large number of rounds and their software encryptions are usually not
fast. Although [43] provides fast bitslice implementation of PRESENT and Piccolo, SIMD implementation of
LEA is faster than them. Furthermore, a short block size is not proper for encrypting huge data because some
modes of operation can allow security leakage like a ciphertext-matching attack.
KLEIN [28] is designed to be faster than AES on 8-bit and 16-bit platforms, while our targets are 32-bit and
64-bit platforms. CLEFIA [52] has the same block and key size as AES and the performance of its software
encryption is close to that of AES on AMD Athlon TM Processor 4000+. However, as far as we know, it does
not claim higher software performance than AES. Recently, PRINCE [16] was proposed as a low-latency block
cipher which has good performance in software and hardware implementations, but its security goal is somewhat
different from that for the general-purpose block ciphers.
– Stream ciphers. Several stream ciphers such as Salsa20 [4] are based on ARX operations, but we think
the block cipher is not totally comparable to the stream cipher because they do not always have the same
applications.
1.2 Organization
The remaining part is organized as follows. Section 2 describes the specification of LEA. In Section 3, we introduce
design principles. In Section 4, we present the security analysis results for existing cryptanalytic techniques. In
Section 5, we explain the implementation results. Section 6 is the conclusion of our paper.
2 Specification of LEA
LEA is a block cipher with 128-bit block. Key size is 128-bit, 192-bit, and 256-bit. The number of rounds is 24 for
128-bit keys, 28 for 192-bit keys, and 32 for 256-bit keys. In Section 2.1, we introduce notations which are often
used in this paper. We explain how the key schedule generates round keys from the master key in Section 2.3. We
explain how the encryption procedure converts a plaintext to a ciphertext in Section 2.4. We omit the description
of the decryption procedure because it is simply considered as the inverse of the encryption procedure.
2.1 Notations
– P : a 128-bit plaintext, consisting of four 32-bit words P = P [0]kP [1]kP [2]kP [3]
– C: a 128-bit ciphertext, consisting of four 32-bit words C = C[0]kC[1]kC[2]kC[3]
– Xi : a 128-bit intermediate value (an input of i-th round in the encryption function), consisting of four 32-bit
words Xi = Xi [0]kXi [1]kXi [2]kXi [3]
– Len(x): the bit-length of a string x
– K: a master key. It is denoted as a concatenation of 32-bit words. K = (K[0], K[1], K[2], K[3]) when Len(K) =
128; K = K[0]kK[1]k · · · kK[5] when Len(K) = 192; K = K[0]kK[1]k · · · kK[7] when Len(K) = 256
– r: the number of rounds. r = 24 when Len(K) = 128; r = 28 when Len(K) = 192; r = 32 when Len(K) = 256
– RK: the concatenation of all round keys, defined by RK = RK0 kRK1 k · · · kRKr−1 where RKi is the 192-bit
round key for the i-th round. Each RKi consists of six 32-bit words RKi = RKi [0]kRKi [1]k · · · kRKi [5]
– x ⊕ y: XOR (eXclusive OR) of bit strings x and y with same length
– x y: Addition modulo 232 of 32-bit strings x and y
– ROLi (x): the i-bit left rotation on a 32-bit value x
– RORi (x): the i-bit right rotation on a 32-bit value x
2.2 State Representation
Let a[0], a[1], ..., be representation of arrays of bytes. The bytes and the bit ordering within bytes are derived from
the 128-bit input sequence input0 , input1 , ... as follows:
a[i] = {input8i , input8i+1 , ..., input8i+7 }.
All the operations in the LEA algorithm are 32-bit-word-oriented. The 128-bit plaintext P of LEA is represented
as an array of four 32-bit words P [0], P [1], P [2], P [3]. Each P [i] is taken for the input bytes a[0], a[1], ..., a[15] as
follows:
P [i] = a[4i + 3]ka[4i + 2]ka[4i + 1]ka[4i] for 0 ≤ i ≤ 3.
The key K of LEA is also represented as an array of 32-bit words K[0], K[1], ..., and taken for the input bytes in
the same way. Table 1 shows how bits and bytes in the word indexed by 0 are numbered.
Constants. The key schedule uses several constants for generating round keys, which are defined as
δ[0] = 0xc3ef e9db, δ[1] = 0x44626b02,
δ[2] = 0x79e27c8a, δ[3] = 0x78df 30ec,
δ[4] = 0x715ea49e, δ[5] = 0xc785da0a,
δ[6] = 0xe04ef 22a, δ[7] = 0xe5c40957.
√
They are obtained from hexadecimal expression of 766995, where 76, 69, and 95 are ASCII codes of ‘L,’ ‘E,’ and
‘A.’
Key Schedule with a 128-Bit Key. Let K = (K[0], K[1], K[2], K[3]) be a 128-bit key. We set T [i] = K[i] for 0 ≤ i < 4.
Round key RKi = (RKi [0], RKi [1], ..., RKi [5]) for 0 ≤ i < 24 are produced through the following relations:
T [0] ← ROL1 (T [0] ROLi (δ[i mod 4])),
T [1] ← ROL3 (T [1] ROLi+1 (δ[i mod 4])),
T [2] ← ROL6 (T [2] ROLi+2 (δ[i mod 4])),
T [3] ← ROL11 (T [3] ROLi+3 (δ[i mod 4])),
RKi ← (T [0], T [1], T [2], T [1], T [3], T [1]).
Key Schedule with a 192-Bit Key. Let K = (K[0], K[1], ..., K[5]) be a 192-bit key. We set T [i] = K[i] for 0 ≤ i < 6.
Round key RKi = (RKi [0], RKi [1], ..., RKi [5]) for 0 ≤ i < 28 are produced through the following relations:
T [0] ← ROL1 (T [0] ROLi (δ[i mod 6])),
T [1] ← ROL3 (T [1] ROLi+1 (δ[i mod 6])),
T [2] ← ROL6 (T [2] ROLi+2 (δ[i mod 6])),
T [3] ← ROL11 (T [3] ROLi+3 (δ[i mod 6])),
T [4] ← ROL13 (T [4] ROLi+4 (δ[i mod 6])),
T [5] ← ROL17 (T [5] ROLi+5 (δ[i mod 6])),
RKi ← (T [0], T [1], T [2], T [3], T [4], T [5]).
Key Schedule with a 256-Bit Key. Let K = (K[0], K[1], ..., K[7]) be a 256-bit key. We set T [i] = K[i] for 0 ≤ i < 8.
Round key RKi = (RKi [0], RKi [1], ..., RKi [5]) for 0 ≤ i < 32 are produced through the following relations:
The encryption procedure of LEA consists of 24 rounds for 128-bit keys, 28 rounds for 192-bit keys, and 32 rounds
for 256-bit keys. For r rounds, it encrypts a 128-bit plaintext P = (P [0], P [1], P [2], P [3]) to a 128-bit ciphertext
C = (C[0], C[1], C[2], C[3]).
Initialization. Set the 128-bit intermediate value X0 to the plaintext P . Run the key schedule to generate r round
keys.
Iterating Rounds. The 128-bit output Xi+1 = (Xi+1 [0], ..., Xi+1 [3]) of the i-th round for 0 ≤ i ≤ r − 1 is computed
as
Xi+1 [0] ← ROL9 ((Xi [0] ⊕ RKi [0]) (Xi [1] ⊕ RKi [1])),
Xi+1 [1] ← ROR5 ((Xi [1] ⊕ RKi [2]) (Xi [2] ⊕ RKi [3])),
Xi+1 [2] ← ROR3 ((Xi [2] ⊕ RKi [4]) (Xi [3] ⊕ RKi [5])),
Xi+1 [3] ← Xi [0].
Finalization. The ciphertext C is produced from the finally obtained Xr after round iteration in the following way:
Efficient round structure with 32-bit ARX operations. The round function of LEA consists of ARX operations.
Especially, we used 32-bit ARX operations instead of 8-bit ones because 32-bit operations are more popular than
8-bit ones and we think that most processors will be developed to support 32-bit operations even in resource-
constrained devices. It has just three internal computation modules including two key XORs, one addition, and one
bitwise rotation. We adopt the addition modulo 232 as a nonlinear function with two 32-bit inputs and one 32-bit
output1 . Round key XORs are used for randomizing the inputs of the nonlinear functions, the bitwise rotations and
the word-wise swap are used for diffusion. The simple and efficient structure of LEA provides both tiny-size code
and high-speed code. In spite of its simplicity, it has nice nonlinearity and diffusion effect to give a proper number
of rounds for good performance.
Encryption is more useful than decryption. Unexpectedly, there are not many modes of operation which need
the decryption function. For example, ISO/IEC 9797-1, ISO/IEC 10116, and ISO/IEC 19772 specify 6 message
authentication modes, 5 encryption modes, and 6 authenticated-encryption modes of block ciphers, respectively.
However, only ECB, CBC, and OCB modes need both encryption and decryption functions2 . It implies that the block
cipher encryption is more widely and frequently used than the block cipher decryption. With this consideration, we
did not care for the balance of the speed between encryption and decryption. Note that most block ciphers usually
have the special last round different from other rounds for efficiency in the implementation of decryption, while the
last round of LEA is not special but has the same structure as the other rounds for efficiency of the encryption-only
modes. Nevertheless, the decryption speed of LEA is still competitive with most block ciphers.
Choice of rotations. We chose the rotations in encryption procedure such that it has the strong diffusion property.
Firstly, for the parameters (a, b, c), we set the round function with input Xi , output Xi+1 , and round key RKi as
follows.
Xi+1 [0] ← ROLa ((Xi [0] ⊕ RKi [0]) (Xi [1] ⊕ RKi [1])),
Xi+1 [1] ← ROLb ((Xi [1] ⊕ RKi [2]) (Xi [2] ⊕ RKi [3])),
Xi+1 [2] ← ROLc ((Xi [2] ⊕ RKi [4]) (Xi [3] ⊕ RKi [5])),
Xi+1 [3] ← Xi [0].
Then, we linearized the LEA encryption algorithm by replacing the additions with XORs, and searched the XOR
differential characteristics (XDCs) of the linearized structure for all possible 32
3 candidates of (a, b, c). Note that
ROLb = ROR32−b and ROLc = ROR32−c . One of our searching strategies is to start from a middle round with low
Hamming weight of differences. As a result, we found that for each candidate of (a, b, c) there exists a 11-round XDC
whose probability is not lower than 2−128 . The probability is estimated under the assumption that every addition is
independent. Note that this assumption is not stronger than any other block ciphers because each addition can be
regarded as a nonlinear function with two 32-bit inputs and a 32-bit output and because XORing subkeys with the
inputs of nonlinear functions is the most popular way to combine key materials to encryption body. We found 32
candidates of (a, b, c) which have 12-round XDCs with the probability of 2−128 or 2−129 as best ones. We optimized
these characteristics such that both of the first and last rounds are not linearized. We chose (9, 27, 29) because
it made only differential characteristics with the probability 2−128 as best ones, and because the number of such
characteristics is fewer than any other candidates.
Additionally, we considered short characteristics for the boomerang attack, and found that the maximum number
of rounds having differential characteristic with the probability greater than 2−32 is 7 over all (a, b, c), and so does
it for the case (a, b, c) = (9, 27, 29).
As a different approach for the same goal, we can also regard the linearized rounds as a linear code. So, we
tried to get good differential characteristics by applying Canteaut-Chabaud method [17] to search code words with
minimum weight code words, but we could not find better differential characteristics than the first approach.
1
On the other hand, round key XORs and rotations work as nonlinear functions for the adversary using add-differences
2
We leave the fifth mode, ‘Encrypt-then-MAC’ in [33] out of the discussion because it uses general notions of encryption
and MAC
Simple key schedule. We adopt a very simple structure for the key schedule. It does not mix the words of the key
and has no avalanche effect in key bits at all. Nevertheless, our security analysis show that it protects LEA from
the attacks such as slide attack [12], related-key attack [5], related-key boomerang attack [10, 11], biclique attack
[14], rotational attack [38] and so on. The simplicity of the key schedule provides efficiency in small-size hardware
and software implementations.
4 Security Analysis
We analyzed the security of LEA for existing cryptanalytic techniques by searching, constructing, or exploiting
various characteristics such as differential and linear trails. For each attack, firstly, we found the maximum number
N of rounds where there exists an available characteristic, and then constructed the best N -round characteristic.
We determined the number of rounds making the algorithm secure against each attack, considering the difference
propagation of the round function and the arrangement of the round key words, as follows.
1. If the characteristic is N -round and holds with the probability between 0 and 1, then the secure number of
rounds is N + 3 for 128-bit keys, N + 4 for 192-bit keys, and N + 5 for 256-bit keys.
2. If the characteristic is N -round and holds with the probability 0 or 1, then the secure number of rounds is N + 4
for 128-bit keys, N + 5 for 192-bit keys, and N + 6 for 256-bit keys.
COSIC made an evaluation report for LEA, too, independently of us [20]. We will explain some of their security
analysis results. The whole main analysis results are summarized at Table 2. We also discuss other attacks not listed
in Table 2.
As we mentioned in Section 3, the probability of the best 12-round differential characteristic which we have found is
estimated at most as 2−128 . Since it is not available for the attack, we searched 11-round differential characteristic
with the same way; firstly find the XOR-linearized differential characteristics with high probabilities and then
optimize it by removing the linearity in the differential paths of the first and the last rounds. As a result, the best
found ones of 11-round differential characteristics have the probability 2−98 and the following form:
– Input difference: 80000234 α0402214 β0401205 γ0400281, where α ∈ {4, c}, β ∈ {4, c}, and γ = β ⊕ 1,
– Output difference: η800000a 88aaa00a 220202ζ0 00200050, where η ∈ {4, c} and ζ ∈ {2, 6}.
We can apply one of these characteristics to 11 rounds from Round 0 to Round 10, and attack 12 rounds for
128-bit keys. This attack recovers 96 bits of the round key RK11 in the last round, Round 11 with very high signal-
to-noise ratio, and requires around 2100 plaintexts, 284 encryptions, and the memory for 276 bytes. Extending it to
13-round attack is not successful because
– If one applies the 11-round characteristic to the first 11 rounds from Round 0 to Round 10 and tries to recover
partial bits of RK12 , the round key of Round 12, he will be in trouble with the poor filtering and it leads to
the bad signal-to-noise ratio.
– If one applies the 11-round characteristic to Round 1 to Round 11 and tries to recover partial bits of RK0 , the
round key of Round 0 and RK12 , the round key of Round 12, he will face too much guessed key bits or too
weak filtering to attack.
We consider the possibility of 13-round attack for 192-bit keys and 14-round attack for 256-bit keys, respectively.
Using a set of many differential characteristics with relatively high probabilities instead of a best one, we can
increase the probability from 2−98 to 2−91.9 . This is a kind of truncated differential characteristic [39], which can
be used for reducing some of complexities for the above differential attack, but not be helpful for increasing the
number of the attacked rounds. Analyses with other types of differences [20] have been tried but not found any
critical weaknesses.
where RK is a vector composed of all round keys. We denote the probability that (1) is satisfied, by p, and let
ε = p − 1/2. ε is called the bias of (1). A linear attack using a linear approximation has the data complexity of
O(ε−2 ).
It is not easy to find a good linear approximation for long rounds of LEA. Wallén’s work [55] shows that in
the masks of a linear approximation for modular additions the absolute value of the bias tends to decrease as the
highest nonzero bit of the masks is close to the most significant. The combination of the bitwise rotations in LEA
encryption significantly disturbs the appearance of linear approximations with good biases. We searched the linear
approximations in such a way that the propagation of linear masks is suppressed as strong as possible. Consequently,
we found 10-round linear approximation with ε = 2−46 and 11-round linear approximation with ε = 2−63 . We can
use Matsui’s algorithm 1 and the 11-round linear approximation to get 1-bit information about round keys for 11
rounds with O(2126 ) known plaintexts, and we can use Matsui’s algorithm 2 and the 10-round linear approximation
to make a 11-round key recovery attack with O(292 ) known plaintexts.
Recently, the attacks using zero correlation approximations have been introduced [15], which is a counter part of the
impossible differential attack in linear cryptanalysis. The best key recovery attacks in single-key setting based on
zero correlation approximations have been made for TEA and XTEA. Since LEA has the use of ARX operations in
common with TEA and XTEA, one may suspect the vulnerability of LEA against zero correlation attack. However,
we found that a 7-round zero correlation approximation is constructed from 3-round forward and 4-round backward
approximations, and it is difficult to construct much longer zero correlation approximations than 7 rounds. Based on
the 7-round zero correlation approximations, we consider the possibility of 9-round attack for 128-bit keys, 10-round
attack for 192-bit keys, and 11-round attack for 256-bit keys, respectively.
The best differential probability for 7 rounds is 2−27 . The best 7-round one has the following differences of input
and output.
We construct a 14-round boomerang characteristic from the best 7-round differential characteristic. There are some
round-skip techniques maximizing the number of rounds of the boomerang characteristic [10, 24], but they do not
work for LEA. It is the best one which we have found ever. For 128-bit keys, we can use it to make an attack on
at most 15 rounds with 2116.3 plaintexts. We could not find a proper attack on 16 rounds due to increased data
complexity and worsened filtering. The amplified boomerang [36] or rectangle attacks [7] do not seem to improve
our attacks significantly. We consider the possibility of 16-round attack for 192-bit keys and 17-round attack for
256-bit keys, respectively.
4.5 Impossible Differential Attack
Impossible differential attack [6] uses differential characteristics with probability of 0. They are usually constructed
from miss-in-the-middle combination with forward and backward truncated differential characteristics with proba-
bility of 1. For LEA, the best impossible differential characteristics are 10 rounds, constructed with 6-round forward
and 4-round backward truncated differential characteristics with probability of 1, which is reported in [20].
For 128-bit keys, we can use the 10-round impossible differential characteristics to make a 11-round attack to
derive a partial information of the last round key. one may make a 12-round attack by using a set of specially chosen
plaintexts or constructing a key-recovering process. We consider the possibility of 13-round attack for 192-bit keys,
and 14-round attack for 256-bit keys, respectively.
Recently, some kinds of meet-in-the-middle attacks have made impressive cryptanalytic results for block ciphers
and hash functions. We checked that meet-in-the-middle attack techniques are not applicable to LEA very well.
A basic meet-in-the-middle attack [23] is disturbed since there is no separation of long rounds. The meet-in-the-
middle pseudo-preimage attack [1, 50] does not work for even half rounds. The partial-matching and initial-structure
techniques are not efficient in LEA.
Rotational cryptanalysis [38] is attractively available on ARX-based structures. We examined the resistance of
LEA against rotational cryptanalysis for the single-key model and the related-key model in which two keys form
a rotational pair. We found that key XORs in the encryption procedure and constant XORs in the key schedule
prevent rotational characteristics from being constructed for long rounds.
Algebraic attack [19] forms an overdefined system of equations derived from the block cipher. Several algorithms
are proposed for solving it, but they fail to find a right solution for existing block ciphers. We think they hardly
work for LEA, too.
We have studied various existing cryptanalytic techniques for block ciphers in order to analyze the security of LEA.
Although some characteristics we mentioned can be somewhat upgraded by new technologies, it is unlikely to find a
new attack to improve significantly the results in Table 2, as long as we did not miss critical weakness of LEA. We
determined the number of rounds for LEA-128 based on the above security analysis such that the security margin
to the whole rounds ratio is greater than 30%. For LEA-192 and LEA-256, we added 4 and 8 rounds, respectively
to the rounds of LEA-128, considering the difference of key schedules and security criteria.
5 Implementation
We have implemented LEA on various 32-bit and 64-bit software platforms. We have focused on LEA-128 since the
speed decreases almost in proportion to the number of rounds.
On ARM platforms, we can implement LEA without register-spilling and most of the bit rotations can be
processed without costing any clock cycle thanks to the barrel shifter. Thus we get remarkably high throughput
compared to other block ciphers both in encryption and decryption.
On Intel/AMD platforms, we can also implement LEA without register-spilling and, due to the highly parallel
structure of LEA round function, we also get high encryption speed. Moreover, by utilizing SIMD(Single Instruction
Multiple Data) instructions inherent in most of recent Intel/AMD platforms, we can get even higher throughput
for parallel modes of LEA.
On ARM and ColdFire platforms, we have measured the compactness of LEA. Since the round function of LEA
consists of a small number ARX operations without S-box, the code size of LEA on these platforms is quite small
compared to other block ciphers with the same block size.
We have also estimated the efficiency of LEA on some 8-bit platforms and confirmed that LEA has sound
performance on these platforms.
ARM platforms. ARM processors are the most widely used 32-bit embedded processors. They support rotate,
multiple load/store instructions as well as most arithmetic and logical ones. Comparison with the speed-optimized
implementation of AES on comparable platforms is given in Table 3.
Table 4 shows the comparison with the code-size-optimized implementation of AES.
Table 3. Speed of LEA-128 and AES-128 on ARM platform
Intel and AMD Platforms. Most of recent Intel/AMD CPUs have 3 pipelines. Since LEA consists of 24 rounds and
each round can be expressed as a sequence of 16 instructions, the minimal cycle cost of LEA encryption is expected
to be around 128. Comparison with 32-bit implementation of AES is given in Table 5. Decryption is slower than
encryption since decryption is processed rather serially. We note that AES is faster than all other well-known block
ciphers with similar block and key size on these platforms.
Most of recent Intel/AMD processors support SIMD extensions at least up to SSE2. Thus, basic 32-bit operations
like XOR, ADD, SHIFT can be performed very efficiently in parallel. Moreover, the latency and throughput of SIMD
instructions are close to those of corresponding 32-bit-wise instructions on recent processors. Since LEA is described
as a combination of XOR, ADD, and ROTATE, it is straightforward to implement parallel modes of LEA using
SSE2 to process 4 or 8 blocks simultaneously.
Comparison with SIMD implementations of AES (not using AES instruction set) is given in Table 6.
ColdFire platforms. ColdFire processors are 32-bit microprocessors targeted towards embedded systems. LEA shows
lower performance here than on ARM platforms since load/store and rotate operation are performed less efficiently:
They do not support rotate, multiple load/store instructions and the shift instruction can shift only by up to 8 bits.
We have implemented speed-optimized and size-optimized LEA on MCF5213. Comparison with implementation of
AES on comparable platform is given in Table 7. We note that LEA runs faster than hardware-accelerated AES.
8-bit and 16-bit Platforms. Though LEA is designed to achieve high performance in 32-bit platforms. We have also
analyzed the performance of LEA on Advanced Virtual RISC(AVR), which are among the most favorable 8-bit
platforms. LEA is estimated to run at around 3,040 cycles for encryption on AVR AT90USB82/162 where AES
best record is 1,993 cycles [47]. We suppose that the performance of LEA is comparable to that of AES on low-end
8-bit or 16-bit platforms, both in speed and code size.
We have implemented LEA-128 with Verilog HDL and synthesized to ASIC with fully verifying the correctness
of front-end and back-end design. For HDL implementation and verification of our design, we have used Mentor
Modelsim 6.5f for RTL simulation and Synopsys Design Compiler Ver. B-2008.09-SP5 for its synthesis. Our RTL
level design result of LEA is synthesized to ASIC with the UMC 0.13µm standard cell library and 100MHz operating
frequency.
LEA-128 AES-128
Platform Encryption Decryption Encryption
Intel Core 2 Quad Q6600 9.29 14.83 12.20 [25]
Intel Core i5-2500 9.29 14.52 11.35 [25]
AMD Phenom II X4 965 8.85 14.50 10.35 [25]
AMD Opteron 6176 SE 8.55 14.05 N/A
Table 6. SIMD implementations of LEA-128 and AES-128
Since the LEA consists of the small number of simple operations such as bit XOR, rotation and 32-bit adder
without complex operations such as S-box, it can be implemented with low hardware resources. The LEA can also
achieve high performance for its short critical path characteristics. The operational blocks for the round function
and key scheduling are so regular that we can achieve these operations with low hardware resources by using its
basic operational blocks repetitively.
Table 8 shows the hardware complexity of two different implementations of LEA-128 encryption module: One is
the area-optimized and the other is the FOM-optimized (throughput/area).The area-optimized implementation of
LEA has 3,826 GE and 168 clock cycles, and the FOM-optimized has 5,426 GE and 24 clock cycles. We can see that
the LEA encryption algorithm has relatively lightweight key scheduling and encryption block (Round Function)
from this table.
Area(GE)
Block
Area-opt. FOM-opt.
Constants Generation 970 964
Control Unit 75 54
Key Scheduling 400 695
State Register 920 1,037
Key Register 998 1,037
Round Function 450 1,080
Others 23 559
Total Block 3,826 5,426
Table 9 compares our hardware implementation results of LEA-128 encryption to other 128-bit key block ciphers
with view point of FOM.
6 Conclusion
We have proposed a new block cipher LEA, which has 128-bit block size and 128, 192, or 256-bit key size. LEA
provides a high-speed software encryption on general-purpose processors. It can be also implemented to have tiny
code size. Its hardware implementation has a competitive throughput per area. It is secure against all the existing
attacks. In spite of the remarkable implementation results presented in this paper, we believe that the they have
room for further optimizations.
Table 9. Hardware implementation of LEA-128 encryption algorithm and its comparison to that of other 128-bit key block
ciphers
References
1. K. Aoki and Y. Sasaki, “Meet-in-the-Middle Preimage Attacks Against Reduced SHA-0 and SHA-1,” CRYPTO 2009,
pp. 70–89, Springer, 2009.
2. J.-P. Aumasson, L. Henzen, W. Meier, and R. C.-W. Phan, “SHA-3 Proposal BLAKE,” Submission to NIST (Round 3),
2010.
3. R. Beaulieu, D. Shors, J. Smith, S. Treatman-Clar, B. Weeks, and L. Wingers, “The SIMON and SPECK Families of
Lightweight Block Ciphers,” IACR Cryptology ePrint Archive: Report 2013/404, 2013.
4. D. J. Bernstein, “The Salsa20 Stream Cipher,” SKEW 2005 — Symmetric Key Encryption Workshop, 2005.
5. E. Biham, “New Types of Cryptanalysic Attacks Using Related Keys,” EUROCRYPT’93, LNCS 765, pp. 398–409,
Springer, 1994.
6. E. Biham, A. Biryukov, and A. Shamir, “Cryptanalysis of Skipjack Reduced to 31 Rounds Using Impossible Differentials,”
EUROCRYPT’99, LNCS 1592, pp. 12–23, Springer, 1999.
7. E. Biham, O. Dunkelman, and N. Keller, “The Rectangle Attack — Rectangling the Serpent,” EUROCRYPT 2001,
LNCS 2045, pp. 340–357, Springer, 2001.
8. E. Biham, O. Dunkelman, and N. Keller, “Enhancing Differential-Linear Cryptanalysis,” ASIACRYPT 2002, LNCS
2501, pp. 254–266, Springer, 2002.
9. E. Biham and A. Shamir, Differential Cryptanalysis of the Data Encryption Standard, Springer, 1993.
10. A. Biryukov and D. Khovratovich, “Related-Key Cryptanalysis of the Full AES-192 and AES-256,” ASIACRYPT 2009,
LNCS, pp. 1–18, Springer, 2009.
11. A. Biryukov, D. Khovratovich, and I. Nikolic, “Distinguisher and Related-Key Attack on the Full AES-256,” In S. Halevi
(Ed.), CRYPTO 2009, LNCS 5677, pp. 231–249, Springer, 2009.
12. A. Biryukov and D. Wagner, “Slide Attacks,” FSE’99, LNCS 1636, pp. 245–259, Springer, 1999.
13. A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M.J.B. Robshaw, Y. Seurin, and C. Vikkelsoe,
“PRESENT: An Ultra-Lightweight Block Cipher,” In P. Paillier and I. Verbauwhede (Eds.), CHES 2007, LNCS 4727,
pp. 450–466, Springer, 2007.
14. A. Bogdanov, D. Khovratovich, and C. Rechberger, “Biclique Cryptanalysis of the Full AES,” ASIACRYPT 2011, LNCS
7073, pp. 344–371, Springer, 2011.
15. A. Bogdanov and M. Wang, “Zero Correlation Linear Cryptanalysis with Reduced Data Complexity,” FSE 2012, LNCS
7549, pp. 29–48, Springer, 2012.
16. J. Borghoff, A. Canteaut, T. Güneysu, E. B. Kavun, M. Knezevic, L. R. Knudsen, G. Leander, V. Nikov, C. Paar,
C. Rechberger, P. Rombouts, S. S. Thomsen, and T. Yalçm, “PRINCE — A Low-Latency Block Cipher for Pervasive
Computing Applications,” ASIACRYPT 2012, LNCS 7658, pp. 208–255, Springer, 2012.
17. A. Canteaut and F. Chabaud, “A New Algorithm for Finding Minimum-Weight Words in a Linear Code: Application to
McEliece’s Cryptosystem and to Narrow-Sense BCH Codes of Length 511,” IEEE Transactions on Information Theory,
Vol. 44, No. 1, pp. 367–378, 1998.
18. Certicom White Paper Series, “Critical Infrastructure Protection for AMI Using a Comprehensive Security Platform,”
Februrary 2009.
19. N. Courtois and J. Pieprzyk, “Cryptanalysis of Block Ciphers with Overdefined Systems of Equations,” In Y. Zheng
(Ed.), ASIACRYPT 2002, LNCS 2501, pp. 267–287, Springer, 2002.
20. COSIC, Final Report: Security Evaluation of the Block Cipher LEA, 2011.
21. J. Daemen and V. Rijmen, The Design of Rijndael: AES . The Advanced Encryption Standard, Springer, 2002.
22. M. Darnall and D. Kuhlman, “AES Software Implementation on ARM7TDMI,” Indocrypt 2006, LNCS 4329, pp. 424–435,
2006.
23. W. Diffie and M. Hellman, “Exhaustive Cryptanalysis of the NBS Data Encryption Standard,” Computer, Vol. 10, No.
6, pp. 74–84, 1977.
24. O. Dunkelman, N. Keller, and A. Shamir, “A Practical Time Related-Key Attack on the KASUMI Cryptosystem Used
in GSM and 3G Telephony,” In T. Rabin (Ed.), CRYPTO 2010, LNCS 6223, pp. 393–410, Springer, 2010.
25. eBACS: ECRYPT Benchmarking of Cryptographic Systems, bench.cr.yp.to.
26. N. Ferguson, S. Lucks, B. Schneier, DougWhiting, M. Bellare, Tadayoshi Kohno, J. Callas, and Jesse Walker, “The Skein
Hash Function Family,” Submission to NIST (Round 3), 2010.
27. ADVANCED ENCRYPTION STANDARD (AES), Federal Information Processing Standards Publication 197, November
26, 2001.
28. Z. Gong, S. Nikova, and Y.-W. Law, “KLEIN: A New Family of Lightweight Block Ciphers,” RFIDSec 2011, LNCS 7055,
pp. 1–18, Springer-Verlag, 2011.
29. S. Gueron, “Intel’s Advanced Encryption Standard (AES) Instructions Set,” White Paper, Rev. 2.0, Intel Corporation,
April 2009.
30. J. Guo, T. Peyrin, A. Poschmann, and M. Robshaw, “The LED Block Cipher,” In B. Preneel and T. Takagi (Eds.),
CHES 2011, LNCS 6917, pp. 326–341, Springer, 2011.
31. D. Hong, J. Sung., S. Hong, J. Lim, S. Lee, B. Koo, C. Lee, D. Chang, J. Lee, K. Jeong, H. Kim, J. Kim, and S. Chee,
“HIGHT: A New Block Cipher Suitable for Low-Resource Device,” In L. Goubin and M. Matsui (Eds.), CHES 2006,
LNCS 4249, pp. 46–59. Springer, 2006.
32. D. Hong, B. Koo, and D. Kwon, “Biclique Attack on the Full HIGHT,” ICISC 2011, LNCS 7259, pp. 365–374, Springer-
Verlag, 2011.
33. ISO/IEC 19772, “Information technology — Security techniques — Authenticated encryption,” 2009.
34. G. Jakimoski and Y. Desmedt, “Related-Key Differential Cryptanalysis of 192-bit Key AES Variants,” In M. Matsui and
R. Zuccherato (Eds.), SAC 2004, LNCS 3006, pp. 208–221, Springer, 2004.
35. E. Käsper and P. Schwabe, “Faster and Timing-Attack Resistant AES-GCM,” CHES 2009, LNCS 5747, pp. 1–27, 2009.
36. J. Kelsey, T. Kohno, and B. Schneier, “Amplified Boomerang Attacks Against Reduced-Round MARS and Serpent,”
FSE 2000, LNCS 1978, pp. 75–93, Springer-Verlag, 2000.
37. J. Kelsey, B. Schneier, and D. Wagner, “Related-Key Cryptanalysis of 3-WAY, Biham-DES, CAST, DES-X, NewDES,
RC2, and TEA,” ICICS 1997, LNCS 1334, pp. 233–246, 1997.
38. D. Khovratovich and I. Nikolic, “Rotational Cryptanalysis of ARX,” FSE 2010, LNCS 6147, pp. 333–346, Springer, 2010.
39. L. R. Knudsen, “Truncated and Higher Order Differentials,” In B. Preneel (Ed.), FSE’94, LNCS 1008, pp. 196–211,
Springer, 1995.
40. L. R. Knudsen and D. Wagner, “Integral Cryptanalysis,” In J. Daemen and V. Rijmen (Eds.), FSE 2002, LNCS 2365,
pp. 112–127, Springer, 2002.
41. B. Koo, D. Hong, and D. Kwon, “Related-Key Attack on the Full HIGHT,” In K. -H. Rhee and D. Nyang (Eds.), ICISC
2010, LNCS 6829, pp. 49–67, Springer, 2010.
42. H. Lipmaa and S. Moriai, “Efficient Algorithms for Computing Differential Properties of Addition,” FSE 2001, LNCS
2355, pp. 336–350, Springer, 2001.
43. S. Matsuda and S.Moriai, “Lightweight Cryptography for the Cloud: Exploit the Power of Bitslice Implementation,”
CHES 2012, LNCS 7428, pp. 408–425, Springer, 2012.
44. M. Matsui, “Linear Cryptanalysis Method for DES Cipher,” In T. Helleseth (Ed.), EUROCRYPT’93, LNCS 765, pp.
386–397, Springer, 1994.
45. A. Moradi, A. Poschmann, S. Ling, C. Parr, and H. Wang, “Pushing the Limits: A Very Compact and a Threshold
Implementation of AES,” Eurocrypt 2011, LNCS 6632, pp. 69–88, Springer, 2011.
46. R. M. Needham and D. J. Wheeler, “TEA Extensions,” Technical Report, Computer Laboratory, University of Cam-
bridge, October 1997.
47. D. Osvik, J. Bos, D. Stefan, and D. Canright, “Fast Software AES Encryption,” FSE 2010, LNCS 6147, pp. 75–93, 2010.
48. Available at realtimelogic.com/products/sharkssl/Coldfire-80Mhz/
49. R. L. Rivest, M. J. B. Robshaw, R. Sidney, and Y. L. Yin, “Thr RC6 Block Cipher,” 1998.
50. Y. Sasaki and K. Aoki, “Finding Preimges in Full MD5 Faster Than Exhaustive Search,” In A. Joux (Ed.), EUROCRYPT
2009, LNCS 5479, pp. 282–296, Springer, 2009.
51. K. Shibutani, T. Isobe, H. Hiwatari, A. Mitsuda, T. Akishita, and T. Shirai, “Piccolo: An Ultra-Lightweight Blockcipher,”
CHES 2011, LNCS 6917, pp. 342–357, Springer, 2011.
52. T. Shirai, K. Shibutani, T. Akishita, S. Moriai, and T. Iwata, “The 128-Bit Blockcipher CLEFIA (Extended Abstract),”
FSE 2007, LNCS 4593, pp. 181–195, Springer, 2007.
53. T. Suzaki, K. Minematsu, S. Morioka, and E. Kobayasi, “Twine: A lightweight, versatile block cipher,” In Proceedings
of ECRYPT Workshop on Lightweight Cryptography, 2011.
54. D. Wagner, “The Boomerang Attack,” FSE’99, LNCS 1636, pp. 156–170, Springer, 1999.
55. Johan Wallén, “On the Differential and Linear Properties of Addition,” Master’s thesis, Helsinki University of Technology,
Laboratory for Theoretical Computer Science, November 2003.
56. D. J. Wheeler and R. M. Needham, “TEA, a Tiny Encryption Algorithm, In B. Preneel (Ed.), FSE’94, LNCS 1008, pp.
363–366, Springer, 1995.
57. D. J. Wheeler and R. M. Needham, “Correction of XTEA,” Technical Report, Computer Laborarory, Universoty of
Cambridge, October 1998.
58. E. Yarrkov, “Cryptanalysis of XXTEA,” IACR Cryptology ePrint Archive 2010/254, 2010.
A Differential Characteristic
Let ∆Xi be the XOR difference of Xi , and let pi beQ the probability of ∆Xi → ∆Xi+1 . The probability p of an
r−1
r-round differential characteristic is computed as p = i=0 pi .
Table 10 shows the 11-round differential characteristic with the probability of 2−98 . The differences in the table
are denoted in hexadecimal.
i ∆Xi pi
0 80000234 α0402214 β0401205 γ0400281 2−22
1 80400080 8a000080 82000210 80000234 2−14
2 80000014 80400014 80400004 80400080 2−9
3 80000000 80000000 80000010 80000014 2−3
4 00000000 80000000 80000000 80000000 1
5 00000100 00000000 00000000 00000000 2−1
6 00020000 00000000 00000000 00000100 2−2
7 04000000 00000000 00000020 00020000 2−4
8 00000008 00000001 00004004 04000000 2−8
9 00001200 28000200 80800800 00000008 2−12
10 00200050 05440050 10100101 00001200 2−23
11 η800000a 88aaa00a 220202ζ0 00200050
The 7-round differential characteristic with the probability of 2−27 , discarding the first two rounds and the last
two rounds is used for constructing a 14-round boomerang characteristic.
B Linear Approximation
Let Γ Xi be the mask of Xi , and let εi = pi − 1/2 be the bias of the linear approximation
α0i · (Xi [0] ⊕ RKi [0]) ⊕ α1i · (Xi [1] ⊕ RKi [1]) = α2i · ROR9 (Xi+1 [0]), pαi = 1/2 + εαi , (3)
β0i · (Xi [1] ⊕ RKi [2]) ⊕ β1i · (Xi [2] ⊕ RKi [3]) = β2i · ROL5 (Xi+1 [1]), pβ i = 1/2 + εβi , (4)
γ0i · (Xi [2] ⊕ RKi [4]) ⊕ γ1i · (Xi [3] ⊕ RKi [5]) = γ2i · ROL3 (Xi+1 [2]), pγ i = 1/2 + εγi . (5)
Qr−1
Let ε be the bias of an r-round linear approximation. Note that εi = 4εαi εβi εγ i and ε = 2r−1 i=0 εi by Piling-Up
Lemma [44].
Table 11 shows the 11-round linear approximation with the biases of 2−62 . The masks in the table are denoted
in hexadecimal.
Table 12 shows one of three 10-round impossible differential characteristic reported in [20]. ‘1’ and ‘0’ mean the
single bits 1 and 0 in the XOR difference. ‘x’ means an unknown bit.
Table 11. 11-round linear approximation with the bias ε = 2−62