AES Based Hash
by
Martin Schläffer
A PhD Thesis
Presented to the Faculty of Computer Science in Partial Fulfillment of the
Requirements for the PhD Degree
Assessors
Prof. Dr. Ir. Vincent Rijmen (TU Graz, Austria)
Prof. Dr. Lars Ramkilde Knudsen (DTU, Denmark)
March 2011
Acknowledgements
First of all, I would like to thank my supervisor Vincent Rijmen for his excellent
guidance throughout my PhD studies. Most importantly, thank you
for integrating me into the Krypto group while I was still looking for research
directions at the beginning of my PhD. Many thanks also for providing interest-
ing research topics, for your support while I was working on my own ideas, for
numerous scientific discussions and for our always entertaining Krypto meetings.
I would also like to thank Lars R. Knudsen for being my external reviewer
and especially for inviting me into the Grøstl team. It is a special honor to be
part of such a prominent team, and the competition is also more exciting with a
submission of my own. Thank you for your hospitality at the Mathematics department
at DTU in Denmark, for playing football together, and for sharpening my mind
in keeping my emails short.
During my studies, I had the pleasure to work in the IAIK Krypto group.
Thank you all for introducing me to the secrets of cryptanalysis and for the great
research atmosphere. Without the deep knowledge in this team, many new at-
tacks would not have been possible. Especially, I would like to thank Florian
Mendel for sharing many research ideas and for showing me his efficiency in per-
forming daily tasks. Special thanks also go to Mario Lamberger, Tomislav Nad,
Norbert Pramstaller and Christian Rechberger for many interesting discussions
on cryptography, mathematics, implementations, and life.
I would also like to thank all guests who have visited the Krypto group during
my studies. Special thanks go to Kazumaro Aoki for introducing me to the fine
details of assembly optimizations, and to Sebastiaan Indesteege and Søren S.
Thomsen for lots of discussions and their help while I was visiting their groups.
Many thanks go to all members of COSIC at K.U.Leuven and the Mathematics
department at DTU, who took care of me during my numerous visits in Leuven
and while I was staying in Copenhagen.
Special thanks go to Elisabeth Oswald for introducing me to cryptography
and to all members of the IAIK VLSI group, who hosted me during my Master’s
thesis and at the beginning of my PhD studies. Thank you all for patiently
answering all my questions on implementation security.
I would also like to thank all people with whom I had many interesting
research discussions. Especially, thanks go to all my coauthors: Jean-Philippe
Aumasson, Praveen Gauravaram, Sebastiaan Indesteege, Emilia Käsper, Dmitry
Khovratovich, ChangKyun Kim, Lars R. Knudsen, Mario Lamberger, Gaëtan
Leurent, Krystian Matusiewicz, Florian Mendel, SangJae Moon, Tomislav Nad,
Marı́a Naya-Plasencia, Ivica Nikolic, Svetla Nikova, Rune S. Ødegård, Elisabeth
Martin Schläffer
Graz, March 2011
Table of Contents
Abstract
Acknowledgements
List of Tables
Notation
1 Introduction
1.1 Cryptographic Hash Functions
1.2 Cryptanalysis of Hash Functions
1.3 The NIST SHA-3 Competition
1.4 Outline of this Thesis
1.5 Main Contributions
8 Conclusions
Bibliography
List of Figures
List of Abbreviations
AES Advanced Encryption Standard
AES-NI Intel AES new instructions
ARM Advanced RISC Machine
DM Davies-Meyer mode of operation
MD Merkle-Damgård
MP Miyaguchi-Preneel mode of operation
MMO Matyas-Meyer-Oseas mode of operation
MMX Multi Media Extension
NEON ARM Advanced SIMD extension
NIST National Institute of Standards and Technology
SHA Secure Hash Algorithm
SSE Streaming SIMD Extensions
AVX Advanced Vector Extensions
1
Introduction
Historically, cryptography has been the art of hiding information. Secret in-
formation is usually protected using symmetric encryption algorithms such as
block ciphers or stream ciphers. Next to secrecy (confidentiality), three other
fundamental security goals are provided by cryptography: data integrity, authentication
and non-repudiation. In the early years, it was believed that encryption
algorithms could also provide data integrity and authentication, which is not the
case in general. For a detailed treatment of these security goals we refer to the
Handbook of Applied Cryptography [MvOV96].
In this thesis we analyze cryptographic hash functions. These primitives are
used to provide data integrity and authentication. More specifically, using cryp-
tographic hash functions, the problem of data integrity and authentication of a
long message can be reduced to that of a much shorter hash value. Instead of
protecting the integrity or authenticating the (sometimes) very long message,
only the hash value needs to be protected, which is usually much more efficient.
Therefore, hash functions are used in a large number of applications and crypto-
graphic protocols. For example, when used with digital signatures only a short
hash value needs to be signed instead of the full message.
More specifically, Merkle [Mer79] has defined three main security requirements
for hash functions: collision resistance, second-preimage resistance and preimage
resistance. Figure 1.1 illustrates these requirements. Besides being secure,
a hash function should also be very efficiently computable.
[Figure 1.1: Illustration of the three security requirements for a hash function with hash values in {0, 1}^n.]
The size of the hash value is usually denoted by $n$ and the output space of
a hash function therefore has a size of $2^n$. Due to the compressing structure of
a hash function we cannot avoid messages which result in the same hash value.
For an ideal hash function with ideal security we can find (second-) preimages
by trying out about $2^n$ input messages. Due to the birthday paradox, collisions
can be found using only about $2^{n/2}$ distinct input messages. Therefore, common hash
sizes range between $n = 128$ and $n = 512$ bits. In this case, it is practically
infeasible to find collisions or preimages by exhaustive search. Nevertheless, a hash
function is considered broken if these ideal requirements are not met.
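The birthday bound can be observed experimentally on a deliberately small hash size. The following is a minimal sketch, assuming a toy hash obtained by truncating SHA-256 to n bits; the function names `truncated_hash` and `find_collision` are hypothetical and chosen only for this illustration.

```python
import hashlib
from itertools import count


def truncated_hash(message: bytes, n_bits: int) -> int:
    """Toy n-bit hash: the first n_bits of SHA-256 (illustration only)."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)


def find_collision(n_bits: int):
    """Return (m1, m2, tries) with m1 != m2 and equal truncated hash values."""
    seen = {}  # hash value -> first message that produced it
    for i in count():
        message = i.to_bytes(8, "big")
        h = truncated_hash(message, n_bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message


if __name__ == "__main__":
    n = 24
    m1, m2, tries = find_collision(n)
    # The number of tries is expected to be of the order of 2^(n/2).
    print(f"collision after {tries} messages, 2^(n/2) = {2 ** (n // 2)}")
```

For n = 24 the search typically stops after a few thousand messages, far fewer than the roughly 2^24 messages a preimage search on the same toy hash would need.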
Today’s most popular hash functions have originated from the MD4 fam-
ily of hash functions. These designs are based on the three simple operations
ADD, ROTATE and XOR (ARX), which are repeated for a large number of
rounds. The first member of this family, MD4 [Riv92a], was proposed by Rivest
in 1990 and weaknesses were found only one year later by den Boer and
Bosselaers in [dBB91], and later by Dobbertin in [Dob96a, Dob98]. Shortly after
MD4, Rivest proposed a strengthened version, MD5 [Riv92b], in 1991. Both hash
functions have a rather small hash value size of 128 bits. For MD5 as well, small
weaknesses were found early by den Boer and Bosselaers in [dBB93] and
Dobbertin in [Dob96c]. Despite these weaknesses, neither MD4 nor MD5 could
be broken for a surprisingly long time (until 2004) and MD5 is still used in
many applications.
Probably due to these early discovered weaknesses and the small hash size,
the National Institute of Standards and Technology (NIST) proposed a new
Secure Hash Algorithm SHA-0 (initially called SHA) in 1993 [Nat93]. Two years
later, a strengthened version SHA-1 was published and standardized [Nat95]. In
these two hash functions, the hash size has been increased to 160 bits. The first
results on SHA-0 were published by Chabaud and Joux in 1998 [CJ98].
In this work, new techniques and a collision attack on the full SHA-0 with a
complexity of $2^{61}$ were published. It took 6 more years to improve this attack
and apply it to other hash functions as well. In the meantime, NIST proposed
another new ARX-based hash function family, SHA-2, with hash sizes between
224 and 512 bits.
At the Crypto 2004 rump session, Wang et al. suddenly announced practical
collisions for MD4 [WLF+05], MD5 [WY05] and SHA-0 [WYY05c], as well as
a collision attack with complexity $2^{69}$ for SHA-1 [WYY05b]. Wang et al.
used several new and powerful techniques to break these hash functions. The
publication of their ground-breaking attacks started a run on hash function
cryptanalysis. The results of Wang et al. have been improved, extended and
applied to other hash functions in many publications. However, a practical collision
(a real colliding message pair) is still missing for the full SHA-1. Currently,
the claimed complexity is about $2^{63}$ [WYY05a] and practical collisions have been
published for 73 out of 80 steps [Gre10].
For SHA-2, (practical) collision attacks are still known for only 24 out of 64
steps [IMPS09, SS08]. Furthermore, theoretical preimages for 43 out of 64 steps
can be constructed for SHA-256 with a very high complexity of $2^{254.9}$ [AGM+09].
Nevertheless, it is possible that an extension of the attacks of Wang et al. can
also be used to break SHA-2 in the near future. Therefore, NIST decided this
time to find the next SHA-3 standard using an open competition, similar to
the AES competition.
SHA-3. [Nat07b]. The new SHA-3 should replace SHA-1 and be used in ad-
dition to SHA-2. The competition is held similarly to the AES competition,
in which Rijndael [DR99a] was selected as the new block cipher standard AES
[Nat01]. The deadline for submissions was October 31st, 2008 and the minimum
requirements have been published in [Nat07a]. The main concern is of course
security, but the future SHA-3 standard should also be as fast as SHA-2 on most
current and future platforms to get a high acceptance rate in industry.
NIST received 64 submissions and some hash functions were broken
quickly, the first one within 24 hours [Wil08]. In December 2008, 51 candidates
were selected for Round 1 [Nat08] and the cryptographic community
has published many new and interesting attacks on these hash functions
since then. At the end of Round 1, about one half of the candidates were broken
or showed serious weaknesses. To focus the cryptanalysis effort on a
small number of candidates, NIST selected 14 SHA-3 candidates to advance into
Round 2 in July 2009 [Nat09].
Among these 14 Round 2 candidates many different design strategies are
present. Some designs are ARX-based or AES-based, or use small 4-bit S-
boxes or Boolean functions. There are block cipher-based and permutation-
based hash functions with very different properties and requirements for their
building blocks. The 14 Round 2 candidates are Blake, Blue Midnight Wish,
CubeHash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-
3, SIMD and Skein. Some of these hash functions have been analyzed thoroughly
and for others almost no results have been published. At the end of Round 2,
no remaining candidate has been broken or has shown really serious weaknesses.
Most results have been published only on building blocks and only in some cases
round-reduced hash function attacks have been shown.
Nevertheless, NIST had to reduce the number of candidates further and
chose 5 finalists in December 2010 [Nat10]. The finalists are Blake [AHMP11],
Grøstl [GKM+11], JH [Wu11], Keccak [BDPV11b] and Skein [FLS+11]. At the
beginning of Round 3, small tweaks were allowed and all finalists have been
changed. NIST did not choose candidates with questionable security
or really slow hash functions. Furthermore, NIST tried to choose a balanced
set of finalists such that a single new attack is not likely to break many of the
finalists. After another year of focused analysis, NIST intends to choose the
final SHA-3 algorithm in the middle of 2012 and standardize SHA-3 at the end
of 2012.
For an overview of all publicly known SHA-3 candidates and cryptanalysis
results we refer to the SHA-3 Zoo¹, which is maintained by the ECRYPT II
project. Since a good future SHA-3 standard should also have good performance,
automatic software benchmarks are provided through the eBASH framework² of the
ECRYPT II project. Anybody can submit new optimized code for any SHA-3
candidate, which will then be benchmarked on a large number of machines.
¹ http://ehash.iaik.tugraz.at/wiki/The_SHA-3_Zoo
² http://bench.cr.yp.to/ebash.html
based hash function with 6 parallel AES based permutations and a linear message
expansion. Since the AES round transformations are directly used in permutations
with a larger size than the AES state, the diffusion is not optimal. As
in the case of ECHO, a rebound attack with multiple inbound phases and
outbound phases can be applied. However, until today no hash function attack
on Lane has been published.
In Chapter 8, we give a brief summary of the rebound attacks on AES based
hash functions covered in this thesis and on other AES based hash functions.
We also discuss in which cases multiple inbound or multiple outbound phases
are possible. Since the same attack strategy also applies to other hash function
designs, we briefly discuss this extension. Finally, we discuss open problems,
future work and further interesting research directions.
are given in [AKK+10] and [KNPRS10]. Most of the results have been published
at international conferences.
The author was also involved in the design of the hash function Grøstl
[GKM+11], which was selected as one of the 5 finalists in the SHA-3 competition.
Grøstl is one of the most extensively analyzed SHA-3 candidates.
This is due to the early cryptanalysis results of the design team, which have
been extended by external cryptanalysis. Additionally, the fastest known Grøstl
implementation has been developed in [RS11] using the Intel AES-NI instructions
and new optimized implementation techniques.
Finally, the provably side-channel resistant threshold implementation
technique of Nikova et al. [NRR06] has also been analyzed and improved. In [NRS08],
the first formulas for the glitch-free and efficient implementation of a block cipher
have been shown. The results have been extended and published in the Journal
of Cryptology [NRS11], which led to increasing interest in the side-channel
community. In the meantime, very secure threshold implementations have been
published by other groups for the block cipher Present [PMK+11] and AES
[MPL+11].
2
Analysis of Cryptographic Hash Functions
In this chapter, we give a brief introduction to cryptographic hash functions,
discuss their requirements and provide some important attack strategies. In
Section 2.1, we define cryptographic hash functions and discuss their main se-
curity requirements. Furthermore, we present the most commonly used design
strategies for hash functions and compression functions, and discuss attacks on
these important building blocks. In Section 2.2, we provide some generic attack
methods which can be applied to any hash function.
Probably the most powerful attacks on hash functions are differential attacks.
Wang et al. have broken MD5 and SHA-1 using differential attacks, and differential
attacks are also the main focus of this thesis. Therefore, we give a detailed
introduction to this powerful tool for the analysis of cryptographic primitives in
Section 2.3. Finally, in Section 2.4 we describe a new tool for the differential
analysis of cryptographic hash functions, the rebound attack [MRST09]. Using
the rebound attack, the cryptanalysis of AES-based hash functions in particular
has been improved significantly in recent years [MPRS09, LMR+09, MNPN+09,
GP10, MRST10, Pey10, SLW+10, ITP10].
attacks work for any function, a cryptographic hash function is said to be ideal
(regarding these three requirements) if the generic bounds are met.
Unfortunately, these three requirements are not enough for a cryptographic
hash function to be secure in any application. For example, length extension
attacks and also near-collisions [MvOV96] are possible, even if these requirements
are met. There are several other important properties. To cover them all, the
random oracle model has been introduced in [BR93]. A random oracle is a
function which outputs a random hash value for any new input message. If
the same message is used again, it outputs the previously used corresponding
hash value. Due to the limited internal state of a practical hash function it
can never be a random oracle. However, it should be infeasible to distinguish a
cryptographic hash function from a random oracle up to the generic bound for
any attack.
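The behaviour of a random oracle described above can be simulated by lazy sampling. The following is a minimal sketch, assuming an n-bit output; the class name `RandomOracle` and its interface are hypothetical. The table of previous answers grows with every new query, which illustrates why a function with limited internal state cannot be a random oracle.

```python
import secrets


class RandomOracle:
    """Lazily sampled random oracle with n-bit outputs (simulation only)."""

    def __init__(self, n_bits: int):
        self.n_bits = n_bits
        self.table = {}  # message -> previously returned hash value

    def query(self, message: bytes) -> int:
        # A new message receives a fresh uniformly random value;
        # a repeated message receives the value stored earlier.
        if message not in self.table:
            self.table[message] = secrets.randbits(self.n_bits)
        return self.table[message]


if __name__ == "__main__":
    oracle = RandomOracle(32)
    first = oracle.query(b"message")
    print(first == oracle.query(b"message"))
```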
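[Figure: iterated hash function construction. The message blocks M1, M2, M3, ..., Mt and the optional inputs (s, ci, ti) are processed by the compression function f, starting from IV with w-bit chaining values; the output transformation g produces the n-bit hash value H(M).]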
More formally, let $H : \{0,1\}^* \to \{0,1\}^n$ be an iterated hash function based
on a compression function $f : \{0,1\}^v \to \{0,1\}^w$ and an output transformation
$g : \{0,1\}^w \to \{0,1\}^n$. Then, we split the message $M$ into $t$ equally sized
message blocks $M_1 \| M_2 \| \dots \| M_t$ of size $m$. To ensure that the message length is
a multiple of $m$, an unambiguous padding rule is applied to $M$. Sometimes, other
additional inputs to the compression function are used in iterated constructions
such as a salt $s$, counter $c_i$ or tweak input $t_i$. Then, the hash value $h = H(M)$
is computed as follows:
$$
\begin{aligned}
H_0 &= IV \\
H_i &= f(H_{i-1}, M_i, c_i, t_i, s) \quad \text{for } 1 \le i \le t \qquad (2.1) \\
h &= g(H_t).
\end{aligned}
$$
The $w$-bit intermediate variable $H_i$ is called the chaining value and is initialized
with a predefined initial value $IV$. Together, all inputs to the compression
function have size $v$ and if no salt, counter and tweak input is used, we have
$v = m + w$.
The security of such an iterated hash function depends on the bitsize $w$ of
the intermediate chaining values $H_i$, on the security of the compression function
$f$ and on the output transformation $g$. Informally, we need a more secure compression
function for smaller chaining values, but at least $w \ge n$ to avoid trivial
(collision) attacks. The most commonly used strategy is the Merkle-Damgård
design principle [Dam89, Mer89]. In this case, the chaining value can be as small
as the final hash size (w = n) but the compression function should be designed
more securely. To be more precise, the Merkle-Damgård reduction proof states
that if a compression function is collision resistant, also the resulting iterated
hash function is collision resistant. The Merkle-Damgård strengthening [LM92]
further requires that the length of the message is included in the padding and
the initial value IV is fixed to some predefined constant to avoid some simple
long-message attacks [LM92, Win84] and fixed-point attacks [Pre93].
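The iteration of Equation (2.1) together with Merkle-Damgård strengthening can be sketched in a few lines. This is a toy illustration under stated assumptions, not a secure design: the compression function `compress` is a stand-in built from truncated SHA-256, the sizes m = w = n = 64 bits are hypothetical, no salt, counter or tweak is used, and the output transformation g is the identity.

```python
import hashlib

BLOCK_BYTES = 8   # message block size m = 64 bits (toy parameter)
CHAIN_BYTES = 8   # chaining value size w = 64 bits (toy parameter)
IV = bytes(CHAIN_BYTES)  # fixed, predefined initial value


def compress(chaining: bytes, block: bytes) -> bytes:
    """Stand-in compression function f: {0,1}^(w+m) -> {0,1}^w."""
    return hashlib.sha256(chaining + block).digest()[:CHAIN_BYTES]


def pad(message: bytes) -> bytes:
    """Unambiguous padding: 0x80, zero bytes, then the 64-bit message bit length."""
    length = (8 * len(message)).to_bytes(8, "big")
    padded = message + b"\x80"
    padded += bytes(-len(padded) % BLOCK_BYTES)
    return padded + length


def iterated_hash(message: bytes) -> bytes:
    """H_0 = IV, H_i = f(H_{i-1}, M_i), h = g(H_t) with g the identity."""
    padded = pad(message)
    h = IV
    for i in range(0, len(padded), BLOCK_BYTES):
        h = compress(h, padded[i:i + BLOCK_BYTES])
    return h


if __name__ == "__main__":
    print(iterated_hash(b"abc").hex())
```

Including the message length in the last block means that two messages of different length never share the same final block, which is the property the strengthening relies on.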
The Merkle-Damgård design principle still has some non-ideal properties
for chaining values of size w = n without output transformation. The most
important weaknesses are the length extension property [Dam89, Mer89], mul-
ticollision attacks [Jou04], long message second-preimage attacks [KS05] and
herding attacks [KK06]. In recent years, many proposals and extensions to the
Merkle-Damgård construction have been made to reduce these problems. Some
examples are wide-pipe constructions [Luc05] which increase the chaining value
size to w > n such as Chop-MD [CDMP05], the HAIFA framework [BD07]
which includes additional inputs (salt, counter) to the compression function, or
the ROX [ANPS07] and EMD [BR06] constructions, which are multi-property
preserving constructions [CDMP05].
Another approach is to increase the size of the internal chaining value even
further to w > 2n. In this case, the compression function is allowed to be
invertible and does not need to be collision resistant anymore. The security
and reduction proofs are based on the large size of the chaining value and other
(ideal) properties of the compressing part of the hash function. Bertoni, Daemen,
Peeters, and Van Assche have defined and formalized the sponge construction
[BDPV07, BDPV08]. In that work, also reduction proofs for the sponge con-
struction are given. For example, the construction cannot be distinguished from
a random oracle if the underlying permutation is a random permutation.
2.1. Cryptographic Hash Functions 13
Figure 2.2: Three main block cipher modes to construct compression functions.
is Davies-Meyer since it allows different chaining value and message block sizes.
Also BLAKE [AHMP11], one of the five SHA-3 finalists, can be considered as a
dedicated hash function based on a weak block cipher (see Figure 2.3a).
Figure 2.3: Schematic view of the iterated compression function of the five
SHA-3 finalists. The main building blocks are either a block cipher E or 1-2
permutations P, Q. Other parts of the construction are XORs (⊕) and concatenations
(||).
every submitted hash function provides an easy way to reduce (or increase) the
number of rounds in the hash function.
to [Rec09]). In free-start collision attacks, both the messages and the chaining
values are allowed to be different and we have $f(H_{i-1}, M_i) = f(H_{i-1}^*, M_i^*)$. In
semi-free-start collision attacks, both chaining values need to be equal and we
have $f(H_{i-1}, M_i) = f(H_{i-1}, M_i^*)$. Note that semi-free-start collision attacks are
more difficult attacks and also not trivial for sponge or sponge-like constructions.
Additionally, near-collisions or other distinguishers of the compression func-
tion can be of some interest. Some hash function designs use proofs which require
an ideal compression function. In turn, many (recent) cryptanalytic attempts
are to distinguish a compression function from an ideal compression function.
However, for many newly designed hash functions and SHA-3 candidates, near-collisions
or distinguishers of the compression function are less important. The output
transformation (or a subsequent compression function call) destroys these
properties and the proofs used do not require an ideal compression function.
The resulting function $P(N)$ is shown in Figure 2.4. To find a collision with a
probability of at least $P(N) \ge 50\%$ for a function $f$ with $n$-bit output size, we
need to evaluate $f$ about
$$N \approx \sqrt{2 \cdot \ln 2} \cdot 2^{n/2}$$
times. In the remainder of this thesis, we will usually omit the constant factor
and use the asymptotic complexity $\Theta(2^{n/2})$ or simply $2^{n/2}$ instead.
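The constant factor in this estimate can be checked numerically. The sketch below assumes uniformly distributed n-bit outputs and computes the exact collision probability as one minus the probability that all N values are distinct; the function names are hypothetical.

```python
import math


def collision_probability(num_evaluations: int, n_bits: int) -> float:
    """Exact probability of at least one collision among N uniform n-bit values."""
    space = 2 ** n_bits
    p_no_collision = 1.0
    for i in range(num_evaluations):
        # The (i+1)-th value must avoid the i values drawn before it.
        p_no_collision *= (space - i) / space
    return 1.0 - p_no_collision


def evaluations_for_half(n_bits: int) -> int:
    """Smallest integer above sqrt(2 ln 2) * 2^(n/2)."""
    return math.ceil(math.sqrt(2 * math.log(2)) * 2 ** (n_bits / 2))


if __name__ == "__main__":
    for n in (16, 20, 24):
        big_n = evaluations_for_half(n)
        print(n, big_n, collision_probability(big_n, n))
```

For the small values of n used here, the computed probabilities lie close to one half, in line with the approximation.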
[Figure 2.4: Plot of the collision probability P(N), ranging from 0% to 100%, as a function of the number of evaluations N.]
and the needed time and memory complexity is given by the following Lemma:
Lemma 2.1 (Merging Lists). Let $L_1$ and $L_2$ be two lists of size $|L_1| = 2^r$
and $|L_2| = 2^s$. Then, the complexity to compute and store all $2^t$ solutions of
$L_{12} = L_1 \bowtie_l L_2$ with $2^t = 2^{r+s-l}$ is given as follows:
Using Lemma 2.1, the total complexity is $2^{n/3}$ in time and memory. If we
increase the size of the initial lists to $2^{n/2}$ we can find $2^{n/2}$ solutions with a time
and memory complexity of $2^{n/2}$, or with an average complexity of 1:
$$
\begin{aligned}
L_1 \bowtie_{n/2} L_2 &: \; 2^{n/2} \times 2^{n/2} \times 2^{-n/2} = 2^{n/2} \\
L_3 \bowtie_{n/2} L_4 &: \; 2^{n/2} \times 2^{n/2} \times 2^{-n/2} = 2^{n/2} \\
L_{12} \bowtie_{n/2} L_{34} &: \; 2^{n/2} \times 2^{n/2} \times 2^{-n/2} = 2^{n/2}.
\end{aligned}
$$
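A list merge of this kind can be sketched with a hash table. The sketch assumes that the join condition is equality on the low l bits of the list elements (the exact condition depends on the attack); the function name `merge_lists` is hypothetical. For random lists of sizes 2^r and 2^s, about 2^(r+s-l) pairs are expected to survive.

```python
import random
from collections import defaultdict


def merge_lists(list1, list2, l_bits):
    """Return all pairs (x1, x2) that agree on their low l_bits bits."""
    mask = (1 << l_bits) - 1
    buckets = defaultdict(list)
    for x1 in list1:
        buckets[x1 & mask].append(x1)  # index list1 by the matched bits
    return [(x1, x2) for x2 in list2 for x1 in buckets.get(x2 & mask, [])]


if __name__ == "__main__":
    rng = random.Random(2011)
    r = s = l = 10
    list1 = [rng.getrandbits(32) for _ in range(2 ** r)]
    list2 = [rng.getrandbits(32) for _ in range(2 ** s)]
    solutions = merge_lists(list1, list2, l)
    print(len(solutions), "solutions, expected about", 2 ** (r + s - l))
```

Building the table costs one pass over the first list and the lookup one pass over the second list, plus the work needed to output the solutions.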
2.3.1 Overview
Differential cryptanalysis was first published by Biham and Shamir for the block
cipher DES in 1990 [BS90, BS91]. Their results led to differential attacks on the
full DES [BS92] and the technique was applied to many other block ciphers, stream
ciphers and also hash functions. The first results on hash functions were published by
den Boer and Bosselaers on MD5 [dBB93], Dobbertin on MD4 [Dob96a, Dob98]
and Chabaud and Joux on SHA-0 [CJ98]. A very natural target for differential
attacks is the collision resistance of a hash function. In this case, a non-zero
input difference should result in a zero output difference. To the surprise of
the cryptographic community, Wang et al. were even able to show attacks on
the full hash functions MD4, RIPEMD, MD5 and SHA-1 by presenting collision
attacks using differential cryptanalysis [WLF+05, WY05, WYY05b] (also
see Section 2.3.3). Today, the designers of every newly proposed cryptographic
primitive have to argue, or better prove, that their design is secure against
differential cryptanalysis.
Differential attacks are dedicated attacks, usually very specialized and
exploiting the internal structure of a design. The main idea is to consider the
propagation of differences between a pair of inputs without knowing the actual
values of the pairs. The propagation of differences is usually predicted over
multiple rounds. The sequence of differences in each round is then
called the (differential) characteristic, differential trail or differential path. The
probability of a characteristic is the fraction of input pairs which conform to,
follow or show the differences of a characteristic. A cryptanalyst tries to
construct high-probability characteristics for a cryptographic primitive since then
many right pairs exist. In this case, it is expected to be easier to find one or
more right pairs, which is usually the final goal of an attack.
In the last 20 years, differential cryptanalysis has improved in several ways.
Many types of differences have been invented and customized to fit the prim-
itive under attack. Some important types of differences are XOR differences
[BS90], modular differences [Dob96a], signed bit differences [WY05, WYY05b]
and truncated differences [Knu94]. Additionally, new types of attacks have been
invented and/or combined with differential cryptanalysis, for example linear-
differential attacks [CJ98], differential-linear attacks [LH94], impossible differ-
ential attacks [BKR97, BBS99], the boomerang attack [Wag99] or the rectangle
attack [BDK01]. Especially for hash functions, new clever techniques have been
developed, refined and extended to find differential characteristics and right
pairs more efficiently. Examples are automated differential path search tech-
niques [SO06, DR06b], advanced message modification [WY05, WYY05b] or the
rebound attack [MRST09].
2.3.2 Preliminaries
In the differential analysis of cryptographic primitives, the most common dif-
ferences to consider are XOR (bitwise) differences. Then, we get the following
definition of a difference:
Definition 2.2 (XOR Difference). Let $a$ and $a^*$ be two $n$-bit vectors. Then the
$n$-bit XOR difference is defined by
$$\Delta a = \Delta(a, a^*) = a \oplus a^*. \qquad (2.4)$$
2.3.2.1 Differentials
In differential cryptanalysis, we consider the propagation of differences
through (sub-)functions of a cryptographic primitive and we get the following
basic definitions:
Definition 2.3 (Differential). A differential $D$ for an $n$ to $m$ bit function $f$
consists of an $n$-bit input difference $\Delta a$ and an $m$-bit output difference $\Delta b$. The
differential is denoted by
$$\Delta a \to \Delta b,$$
or if the function is not clear from the context by
$$\Delta a \xrightarrow{f} \Delta b.$$
Definition 2.4 (Number of Right Pairs). The number of right pairs (or cardinality
[DR07b]) $N_f(\Delta a \to \Delta b)$ of a differential $D = \Delta a \to \Delta b$ is the number
of pairs with input difference $\Delta a$ and output difference $\Delta b$ ($\#S$ denotes the
number of elements in a set $S$):
$$N_f(\Delta a \to \Delta b) = \#\{(a, a^*) \mid a \oplus a^* = \Delta a \text{ and } f(a) \oplus f(a^*) = \Delta b\} \qquad (2.5)$$
When analyzing cryptographic functions, we are often interested in the num-
ber of right pairs for all possible input and output differences. For functions with
small n, m we can simply list all combinations using the differential distribution
table.
Definition 2.5 (Differential Distribution Table (DDT)). Let $f$ be an $n$ to $m$
bit function. The differential distribution table of $f$ is a $2^n \times 2^m$ table whose
entries are the number of right pairs $N_f(\Delta a \to \Delta b)$ for all differentials $\Delta a \to \Delta b$.
The rows of the table are indexed by the input difference $\Delta a$ and the columns
are indexed by the output difference $\Delta b$.
The top row of the differential distribution table always contains the elements
$2^n, 0, 0, \dots, 0$ and the sum of each row is always $2^n$. Since XOR differences
are symmetric ($\Delta a = a \oplus a^* = a^* \oplus a$), only even values occur in the table.
Furthermore, there are $2^n - 1$ possible non-zero input differences and for each of
these differences, $2^{n-1}$ pairs exist. In [DR07b], Daemen and Rijmen have further
analyzed the distribution of the number of right pairs for a random function and
have proven the following theorem and corollary:
Theorem 2.2 ([DR07b]). For a random $n$-bit to $m$-bit function, the number of
right pairs $N_f(\Delta a \to \Delta b)$ of a differential $\Delta a \to \Delta b$ is a random variable with
binomial distribution $B(2^{n-1}, 2^{-m})$.
Corollary. For $n \ge 5$ and $|n - m|$ small, the number of right pairs can be
approximated by a Poisson distribution with $\lambda = 2^{n-m-1}$.
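The differential distribution table of Definition 2.5 and the row properties stated before Theorem 2.2 can be checked on a small example. The sketch below counts ordered pairs $(a, a^*)$ as in Equation (2.5); as an illustrative assumption it uses the 4-bit S-box of the block cipher Present as the function $f$, and the name `differential_distribution_table` is hypothetical.

```python
def differential_distribution_table(sbox, n_bits, m_bits):
    """Build the 2^n x 2^m table of right-pair counts N_f(da -> db)."""
    size_in, size_out = 1 << n_bits, 1 << m_bits
    ddt = [[0] * size_out for _ in range(size_in)]
    for a in range(size_in):
        for delta_a in range(size_in):
            # Pair (a, a*) with a* = a XOR delta_a and its output difference.
            delta_b = sbox[a] ^ sbox[a ^ delta_a]
            ddt[delta_a][delta_b] += 1
    return ddt


# 4-bit S-box of Present, used here only as a small example function.
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

if __name__ == "__main__":
    table = differential_distribution_table(SBOX4, 4, 4)
    for row in table:
        print(" ".join(f"{entry:2d}" for entry in row))
```

The first printed row contains 16 followed by zeros, every row sums to 16, and all entries are even, matching the properties stated in the text for n = 4.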
For each differential ∆a → ∆b only a fraction of all pairs (a, a∗ ) with input
difference ∆a are right pairs. This fraction is called the differential probability
(DP) or difference propagation probability of a differential ∆a → ∆b and defined
as follows:
2.3.2.4 Conditions
For each differential with differential probability not equal to 1 or 0 we can derive
a set of equations which can be used to describe the right pairs. We call such
an equation a condition of a differential or characteristic. For example, a simple
condition is to list all right pairs.
The main advantage of conditions is that they can usually be derived easily
for differentials on small sub-functions. Using these conditions, we can approx-
imate the differential probability and the expected number of right pairs. By
multiplying these probabilities of the sub-functions, we can get a quite good
approximation for the differential probability of the whole characteristic. Fur-
thermore, conditions can be especially useful when finding right message pairs
for a colliding differential characteristic of a hash function.
the more freedom an attacker has in executing the attack. For this reason, we
often talk about degrees of freedom in an attack. The degrees of freedom (or
simply called freedom) is defined as follows:
As for any type of difference, we can also define the (approximate)
differential probability and the (expected) number of right pairs for truncated
differences. Note that a truncated differential is a collection of many differentials
and the truncated differential probability is the sum of the probabilities of all
its differentials.
Truncated differentials are particularly useful if they fit the structure of a
cryptographic primitive. For example, byte-wise truncated differentials can be
very useful in the analysis of byte-oriented primitives. Also the AES-based hash
function Grindahl [KRT07] has been broken using truncated differentials [Pey07].
For S-box based, byte-wise functions, it is common to consider only two types
of differences. The difference of a particular S-box or byte is either non-zero or
zero and we define:
Definition 2.13 (Active S-box). An S-box (or byte) with non-zero (input)
difference is called active and otherwise, non-active.
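In code, the active/non-active pattern of Definition 2.13 is simply the set of byte positions in which two states differ. The following is a minimal sketch for byte-oriented states of equal length; the function name `active_bytes` is hypothetical.

```python
from typing import List


def active_bytes(state: bytes, state_star: bytes) -> List[bool]:
    """Return True for every byte position with non-zero XOR difference."""
    if len(state) != len(state_star):
        raise ValueError("both states must have the same length")
    return [a ^ b != 0 for a, b in zip(state, state_star)]


if __name__ == "__main__":
    s = bytes(range(16))
    t = bytearray(s)
    t[0] ^= 0x01   # introduce a difference in byte 0
    t[5] ^= 0xFF   # and another one in byte 5
    pattern = active_bytes(s, bytes(t))
    print("".join("1" if active else "0" for active in pattern))
```

The actual value of a non-zero difference is ignored, which is exactly the information a byte-wise truncated difference discards.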
2.4.1 Overview
The basic rebound attack consists of two main phases, called inbound and out-
bound phase, as shown in Figure 2.5. According to these phases, the compres-
sion function, internal block cipher or permutation of a hash function is split
into three sub-parts. Let $E$ be a block cipher, then we get $E = E_{fw} \circ E_{in} \circ E_{bw}$.
Hence, the part of the inbound phase is placed in the middle of the cipher and
the two parts of the outbound phase are placed next to the inbound part. In
the outbound phase, two high-probability (truncated) differential trails are con-
structed, which are then connected in the inbound phase. Similar to message
modification, the freedom in the message, key-inputs or (internal) state variables
is used to efficiently fulfill many conditions of a differential trail.
The idea of placing the most expensive part of the differential trail in the
middle was previously used in the cryptanalysis of the compression function of
MD5 [Dob96b] and the hash function Tiger [KL06, MPR+ 06, MR07]. Also,
inside-out techniques have been used by Wagner as an application of second
order differentials in the cryptanalysis of block ciphers in the Boomerang attack
[Wag99].
Figure 2.5: A schematic view of the rebound attack. The attack consists of an
inbound and two outbound phases.
Figure 2.6: Schematic of the rebound attack with multiple inbound and multiple
outbound phases.
34 Chapter 3. The Rebound Attack on AES-Based Permutations
3.1.3 Decryption
For the AES decryption, inverse round transformations in reverse order are ap-
plied. Also the round keys have to be computed in reverse order. InvShiftRows
rotates right instead of left and since the AES S-box is based on the inversion
in $GF(2^8)$, only the affine transformation needs to be changed to its inverse
in InvSubBytes. Also the coefficients of the InvMixColumns transformation are
different.
3.2. Differential Properties of AES Round Transformations 35
3.2.1 SubBytes
Many differential properties of an S-box S can be derived from its differential
distribution table (DDT) [BS91] (also see Section 2.3.2). For each of the $2^{16}$
input/output differentials (∆x, ∆y), the differential distribution table gives the
number of solutions x or right pairs (x, ∆y) for the equation
$S(x) \oplus S(x \oplus \Delta x) = \Delta y.$
The partial differential distribution table of the AES S-box is shown in Table 3.1.
For a good S-box, the non-uniformity of the DDT and hence, the non-zero entries
in the table should be small and evenly distributed. In the DDT of the AES
S-box only the values 0, 2, 4 and 256 occur, with frequencies 33150, 32130, 255 and
1, respectively. The last value corresponds to the zero differential (∆x, ∆y) = (0, 0), for
which any x is a solution. In the majority of all cases, there are either no or
exactly two right pairs. If there is no right pair, the corresponding differential
is called an impossible differential. If there are two right pairs, the differential
probability for the respective differential (∆x, ∆y) is $P_S = 2 \cdot 2^{-8} = 2^{-7}$. In
some rare cases, exactly 4 solutions exist and these differentials have a maximum
differential probability of $P_S^{max} = 4 \cdot 2^{-8} = 2^{-6}$.
Further properties of the AES S-box (and its inverse), which can be deduced
from the differential distribution table are:
Table 3.1: An excerpt of the differential distribution table (DDT) for the AES
S-box in hexadecimal notation.
∆x \ ∆y 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F ...
00 256 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
01 0 2 0 0 2 0 2 0 2 2 2 2 2 2 2 2 ...
02 0 0 0 2 2 2 2 2 0 0 0 2 2 2 0 2 ...
03 0 0 2 0 2 2 0 0 2 0 2 2 2 0 0 0 ...
04 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 ...
05 0 0 0 0 2 0 0 0 4 2 0 0 2 2 0 0 ...
06 0 2 0 0 2 2 0 0 0 2 0 2 4 0 2 0 ...
07 0 2 0 0 0 0 2 0 2 2 2 0 0 0 0 0 ...
08 0 0 2 2 0 0 0 0 0 2 0 2 2 0 0 2 ...
09 0 0 2 2 0 0 2 2 0 0 0 0 2 0 2 0 ...
0A 0 0 2 2 4 0 2 2 0 2 2 0 2 0 0 2 ...
0B 0 2 0 0 0 0 0 2 2 2 0 0 2 2 2 2 ...
0C 0 0 2 2 0 0 2 2 2 2 2 0 0 2 0 2 ...
0D 0 2 2 0 0 0 2 2 2 2 2 2 0 0 0 2 ...
0E 0 0 2 2 2 2 0 2 2 0 2 2 0 0 0 0 ...
0F 0 0 2 2 2 0 0 0 2 0 2 0 2 0 0 2 ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
• For a given non-zero input (output) difference of the S-box, the number of
  possible output (input) differences is 127.
• For each possible non-zero differential (∆x, ∆y), the number of solutions is
  either 2 or 4.
• For a fixed possible differential (∆x, ∆y), the AES S-box and its inverse
  always behave linearly, since there are only 2 or 4 right pairs possible (see
  [DR07a] and Section 3.6.2 for more details).
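The DDT statistics quoted above can be reproduced with a few lines of Python. The following is a minimal sketch, not an optimized implementation: it rebuilds the AES S-box from its definition (inversion in GF(2^8) followed by the affine map) and counts the table entries. All function and variable names are our own choices for illustration.

```python
from collections import Counter

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def aes_sbox():
    """AES S-box: inversion in GF(2^8) followed by the affine transformation."""
    table = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):      # x^254 equals x^-1 for non-zero x
                inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):     # XOR of the four left rotations
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(y ^ 0x63)
    return table

def build_ddt(sbox):
    """ddt[dx][dy] = number of x with S(x) xor S(x xor dx) = dy."""
    ddt = [[0] * 256 for _ in range(256)]
    for dx in range(256):
        for x in range(256):
            ddt[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return ddt

SBOX = aes_sbox()
DDT = build_ddt(SBOX)
# How often each entry value (0, 2, 4, 256) occurs in the whole table.
ENTRY_FREQUENCIES = Counter(v for row in DDT for v in row)
# Number of possible output differences for each non-zero input difference.
POSSIBLE_OUTPUTS = {sum(1 for v in row if v) for row in DDT[1:]}
```

Each non-zero row contains 126 entries with value 2 and one entry with value 4, which gives the 127 possible output differences stated in the list above.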
3.2.2 ShiftRows
The ShiftRows transformation moves bytes and thus, differences to different po-
sitions of a row but does not change their value. Due to its good diffusion
property, ShiftRows moves the 4 active bytes of a fully active column to 4 different
columns of the state. Hence, ShiftRows ensures that 4 active bytes (or differ-
ences) of one column are processed independently by the subsequent MixColumns
transformation.
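As a small illustration of this property, the sketch below applies ShiftRows to an activity pattern (1 marks an active byte) in which only the first column is active. The state is stored as a list of rows; this representation and the helper names are assumptions made for the example.

```python
def shift_rows(state):
    """AES ShiftRows: row r of the 4x4 state is rotated left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def active_columns(state):
    """Indices of all columns that contain at least one active byte."""
    return [c for c in range(4) if any(state[r][c] for r in range(4))]

# Activity pattern with a fully active first column.
FULL_COLUMN = [[1, 0, 0, 0] for _ in range(4)]
SHIFTED = shift_rows(FULL_COLUMN)
```

After ShiftRows, every column holds exactly one of the four active bytes, so each of them enters a different MixColumns column transformation.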
3.2.3 MixColumns
MixColumns consists of 4 parallel transformations which are applied to each col-
umn of the state. When we talk about properties of MixColumns, we usually refer
to a single column transformation. Since MixColumns is a linear transformation,
the propagation of XOR differences through MixColumns is deterministic. The
propagation of an input (or output) difference ∆x = x ⊕ x∗ only depends on the
difference ∆x and is independent of the values x and x∗. Additionally, for every
n × n MDS mapping, choosing any n bytes of the input and output uniquely
determines the remaining n bytes.
Table 3.2 shows the differential probability for the propagation of truncated
differences from a to b active bytes for fixed positions. For example, a truncated
difference with exactly one active byte will propagate to a truncated difference
with 4 active bytes with a probability of 1. On the other hand, a truncated
difference with 4 active bytes can result in any truncated difference between 1
and 4 active bytes after MixColumns. The probability of a transition from 4 to
1 active byte with fixed position is approximately $2^{-24}$, since we need 3 out of
8 bytes to be zero.
a\b   0   1           2           3          4
0     1   0           0           0          0
1     0   0           0           0          1
2     0   0           0           $2^{-8}$   0.996
3     0   0           $2^{-16}$   $2^{-8}$   0.996
4     0   $2^{-24}$   $2^{-16}$   $2^{-8}$   0.996
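Some of these transitions can be checked directly for a single MixColumns column. The following sketch (helper names are ours) enumerates byte differences and counts, for fixed positions, how many of them follow the transitions 1 → 4, 2 → 3 and 4 → 1. The 4 → 1 case is counted from the output side using the inverse matrix, since enumerating all 255^4 fully active inputs would be too slow for an example.

```python
from math import log2

MC = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
INV_MC = [[14, 11, 13, 9], [9, 14, 11, 13], [13, 9, 14, 11], [11, 13, 9, 14]]

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def apply(matrix, column):
    """Multiply a 4-byte column (of differences) by a 4x4 matrix over GF(2^8)."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

def weight(column):
    """Number of active (non-zero) bytes."""
    return sum(1 for byte in column if byte)

# 1 -> 4: every single active input byte activates all four output bytes.
ONE_TO_FOUR = all(
    weight(apply(MC, [v if i == pos else 0 for i in range(4)])) == 4
    for pos in range(4) for v in range(1, 256))

def count_two_to_three():
    """Inputs active in bytes 0 and 1 whose output is zero exactly in byte 0."""
    count = 0
    for v0 in range(1, 256):
        for v1 in range(1, 256):
            out = apply(MC, [v0, v1, 0, 0])
            if out[0] == 0 and weight(out) == 3:
                count += 1
    return count

TWO_TO_THREE = count_two_to_three()
# 4 -> 1 with output byte 0 active: invert each single-byte output difference.
FOUR_TO_ONE = sum(
    1 for v in range(1, 256) if weight(apply(INV_MC, [v, 0, 0, 0])) == 4)

LOG2_P_2_TO_3 = log2(TWO_TO_THREE / 255**2)   # close to -8
LOG2_P_4_TO_1 = log2(FOUR_TO_ONE / 255**4)    # close to -24
```

In both probabilistic cases exactly 255 differences satisfy the transition, which yields 255/255^2 ≈ 2^-8 and 255/255^4 ≈ 2^-24 for fixed positions.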
3.3.2 SuperBoxes
In the previous section, we have determined conditions for two rounds of AES by
analyzing independent 32-bit chunks of SubBytes followed by MixColumns and
another SubBytes operation. This sequence of operations is called a SuperBox
and allows us to analyze parts of two AES rounds independently. Differential properties
of the AES SuperBox have been analyzed in detail in [DR06a]. However,
3.4. Finding Good Differential Trails 39
in that work, the addition of a secret key is considered in between the non-linear
SubBytes layers.
In hash function cryptanalysis, the key is usually not secret and often con-
stant. In this case, a SuperBox is a non-linear 32-bit S-box with fixed differential
properties. However, the DDT is too big to evaluate completely. In the following,
we describe some techniques which can be used to efficiently find right pairs for
a given SuperBox differential. Some techniques have special requirements, for
example they need a list of many differences on one side of the SuperBox, or
some (byte) differences need to be zero.
The simplest and most straightforward technique is to exhaustively search
through all $2^{32}$ input values of a SuperBox. In this case we will find all
solutions with a complexity of $2^{32}$. Note that each differential is only possible
with a probability of about $2^{-4}$, but, as for the S-boxes, we get about $2^4$
pairs for each valid differential (see Section 3.5.2). However, during an attack it
is sometimes desired to reduce the complexity as much as possible at the cost of
more memory requirements and precomputation steps. Note that in a theoretical
attack on hash functions, we can usually still precompute the whole differential
distribution table (DDT) of the AES SuperBox. The memory requirements are
$2^{64}$ but we can look up whether a differential is possible and also retrieve the
corresponding input pairs with a complexity of one table lookup.
with a the number of active bytes in the first state, b the number of active bytes
in the second state and $r_i$ the i-th round of AES. Due to the MDS property of
MixColumns, we either get a + b ≥ 5 or a = b = 0, for one round $r_i$ of AES.
Note that the same holds for every column of MixColumns. Hence, for a = 1 we
always get:
$1 \xrightarrow{r_i} 4.$
[Figure: four AES rounds between the states S0 and S4, each applying SubBytes (SB), ShiftRows (SR), MixColumns (MC) and AddRoundKey (AK).]
The main advantage of using truncated differential trails in AES is that
there are truncated differential trails with a differential probability of 1. For
example, the following truncated differential trail is fulfilled with probability 1
for any random input pair with one active byte:
$1 \xrightarrow{r_1} 4 \xrightarrow{r_2} 16$
Such truncated differential properties are used in the rebound attack (see
Section 3.5). Note that for a truncated differential trail to be useful in an
attack, we need to observe some non-random property at the input and output.
This is usually the case if the input and output states are not fully active. For
example, we can extend the (minimum) 4-round truncated differential trail to a
(minimum) 7-round trail as follows (note that the last MixColumns is omitted in
AES):
$T_7 = 4 \xrightarrow{r_1} 1 \xrightarrow{r_2} 4 \xrightarrow{r_3} 16 \xrightarrow{r_4} 4 \xrightarrow{r_5} 1 \xrightarrow{r_6} 4 \xrightarrow{r_7} 4$ (3.6)
3.5. The Basic Rebound Attack 41
In the following sections, we will use the rebound attack and variants of this
(minimum) truncated differential trail to get attacks on AES based hash func-
tions, permutations, or block ciphers.
In the known-key setting [KR07] or in many AES based hash functions, the
key input is known and constant. In this case, the number of possible inputs
is limited by the block size and by the truncated difference at the input of the
state update. For $T_7$, the total number of input pairs is $2^{128} \cdot 255^4 \approx 2^{160}$. It
follows from Lemma 2.4 of Section 2.3.2 that the expected number of right pairs
is only:
$E[N_f(T_7)] = 2^{160} \cdot 2^{-144} = 2^{16}$ (3.8)
Note that for a 4-round differential trail with 25 active S-boxes (Section 3.4.1),
the expected number of right pairs is $2^{128} \cdot 2^{-150} = 2^{-22}$ and thus,
a right pair most likely does not exist. A similar situation occurs if we try to
extend the 7-round truncated differential trail in the middle. Even if we reduce
only twice from 16 to 4 active bytes, the expected number of right pairs for the
trail
$4 \xrightarrow{r_1} 16 \xrightarrow{r_2} 4 \xrightarrow{r_3} 16 \xrightarrow{r_4} 4$
is only $2^{160} \cdot 1 \cdot 2^{-24 \cdot 4} \cdot 1 \cdot 2^{-24 \cdot 4} = 2^{-32}$. Therefore, such a (sub-)trail cannot
be used in an attack, unless an additional input (e.g. a non-constant key or salt
value) is added in the middle.
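The two estimates above follow the same bookkeeping, which can be written down as a short calculation. The sketch below assumes the simplified model used in the text: every byte that MixColumns must map to zero costs a factor of about 2^-8, and all other transitions hold with probability 1. The per-round lists of zero bytes are our reading of the two trails.

```python
from math import log2

def log2_trail_probability(zero_bytes_per_round):
    """log2 of the trail probability: each forced zero byte costs about 2^-8."""
    return -8 * sum(zero_bytes_per_round)

# 2^128 state values times 255^4 differences on the 4 active input bytes.
LOG2_INPUT_PAIRS = 128 + 4 * log2(255)

# T7: 4 -> 1 -> 4 -> 16 -> 4 -> 1 -> 4 -> 4 (no MixColumns in the last round).
T7_ZERO_BYTES = [3, 0, 0, 12, 3, 0, 0]
# Trail extended in the middle: 4 -> 16 -> 4 -> 16 -> 4.
MIDDLE_ZERO_BYTES = [0, 12, 0, 12]

LOG2_PAIRS_T7 = LOG2_INPUT_PAIRS + log2_trail_probability(T7_ZERO_BYTES)
LOG2_PAIRS_MIDDLE = LOG2_INPUT_PAIRS + log2_trail_probability(MIDDLE_ZERO_BYTES)
```

With these inputs the expected number of right pairs is about 2^16 for the 7-round trail and about 2^-32 for the trail with two reductions from 16 to 4 active bytes.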
Figure 3.2: We apply the basic rebound attack to this minimum 7-round trun-
cated differential trail (black bytes are active). We start the attack in the mid-
dle with the inbound phase (red) and proceed outwards in the outbound phase
(blue).
of $2^{-48}$. The complexity can be further reduced using the improved rebound
techniques of Section 3.6 and Section 3.7.
Figure 3.3: Detailed round transformations for the 2-round truncated differential
trail of the inbound phase.
Of course there are many techniques to find solutions for the inbound phase
efficiently, but one simple approach is as follows: We start the inbound phase
with differences in state $S_3^{SR}$ and $S_4^{MC}$ (see Figure 3.3). Remember that the
probability of propagation from 4 → 16 active bytes through MixColumns is 1.
In other words, any choice of non-zero differences in $S_3^{SR}$ and $S_4^{MC}$ results in
a state with fully active bytes at $S_3$ and $S_4^{SB}$. Then, we just need to find S-box
differentials such that the whole trail of the inbound phase is satisfied. In detail,
we get one valid pair as follows:
1. Precompute the differential distribution table (DDT) of the AES S-box.
Also compute and store the corresponding values for each S-box differential.
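The precomputation of step 1 can be sketched as follows. Instead of storing only the number of solutions, the table maps every possible S-box differential to the list of values that follow it, so that a match in the inbound phase costs one lookup per byte. The dictionary layout and the function names are assumptions made for this illustration.

```python
from collections import defaultdict

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def aes_sbox():
    """AES S-box: inversion in GF(2^8) followed by the affine transformation."""
    table = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(y ^ 0x63)
    return table

def build_solution_table(sbox):
    """Map each possible differential (dx, dy), dx != 0, to its solutions x."""
    table = defaultdict(list)
    for dx in range(1, 256):
        for x in range(256):
            table[(dx, sbox[x] ^ sbox[x ^ dx])].append(x)
    return dict(table)

def match_sbox(table, dx, dy):
    """Return all x with S(x) xor S(x xor dx) = dy (empty if impossible)."""
    return table.get((dx, dy), [])

SBOX = aes_sbox()
SOLUTIONS = build_solution_table(SBOX)
```

A call such as `match_sbox(SOLUTIONS, dx, dy)` then returns either an empty list (impossible differential) or 2 or 4 values for the corresponding byte.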
number of active bytes of the trail in the outbound phase. In the truncated
differential trail of Figure 3.2 we get 4 ← 1 ← 4 in backward direction and
4 → 1 → 4 → 4 in forward direction. The differential probability of these
truncated differential trails is given as follows:
$P(4 \xleftarrow{r_1} 1 \xleftarrow{r_2} 4) = 1 \cdot 2^{-24} = 2^{-24}$
$P(4 \xrightarrow{r_5} 1 \xrightarrow{r_6} 4 \xrightarrow{r_7} 4) = 2^{-24} \cdot 1 \cdot 1 = 2^{-24}$
Note that we have only two probabilistic MixColumns transformations with a
total probability of $2^{-48}$. Hence, we can find a right pair for the whole 7-round
truncated differential trail by constructing $2^{48}$ pairs for the inbound phase and
propagating them outwards in the outbound phase. The total complexity is
about $2^{48}$ evaluations of the AES state update.
[Figure: the states $SB_i^{in}$, $SB_i^{out}$, $MC_i^{in}$ and $MC_i^{out}$ of rounds i = 1, ..., 4, connected by $SB_i$, $SR_i$ and $MC_i$. The labels 1 to 6 mark the bytes whose differences are determined in the corresponding step of the procedure described below.]
Column 1. We start with the differences of the first column (marked by “1”
in states $MC_2^{in}$ and $MC_2^{out}$) of the MixColumns operation of round 2 ($MC_2$). Since
3 input byte differences are required to be zero, choosing one of the remaining 5
non-zero differences uniquely determines all other differences of $MC_2$. Since the
ShiftRows and AddRoundKey operations are linear, we get the same differences
for the bytes marked by “1” in states $SB_2^{out}$ and $SB_3^{in}$. It follows that we can
choose from 255 non-zero differences for the first byte of $SB_3^{in}$, and this choice
determines all differences marked by “1” between states $SB_2^{out}$ and $SB_3^{in}$.
Column 2. Next, we continue with the differences of the first column of $MC_3$
(marked by “2” in states $MC_3^{in}$ and $MC_3^{out}$). Again, 3 differences of $MC_3$ are zero
and choosing one byte determines all differences of the first column of $MC_3$. Note
that the input of the first column of $SB_3$ and thus, the difference of $SB_3^{in}[0, 0]$, has
already been fixed in the previous step. Due to the differential behavior of the
AES S-box (see Section 3.2.1), we can choose from only 127 differences for the
corresponding output byte of $SB_3$ ($SB_3^{out}[0, 0]$). Choosing one of these possible
127 differences uniquely determines all differences marked by “2” between states
$SB_3^{out}$ and $SB_4^{in}$.
3.6. Solving Linearly for Pairs 47
Column 4-5. We proceed with the second column of $MC_3$, marked by “4” in
states $MC_3^{in}$ and $MC_3^{out}$. Note that the input bytes of two S-boxes ($SB_3^{in}[0, 1]$ and
$SB_3^{in}[3, 0]$) have already been fixed due to Column 1 and Column 3. These two
input differences restrict the number of possible differences for the output of $SB_3$
(bytes marked by “4”) to about $256/2^2 = 64$ values. We continue with the third
column of $MC_2$ (marked by “5”). Two output differences of the corresponding S-box
$SB_3$ have already been fixed and thus, we can choose from about 64 possible
differences for the input bytes marked by “5” in $SB_3^{in}$ as well.
Column 6-8. This procedure continues for all 4 columns of each of the
two MixColumns transformations $MC_2$ and $MC_3$. The approximate number
of possible S-box differences for $SB_3^{in}$ and $SB_3^{out}$ is halved for each additional
MixColumns column and is shown in Table 3.3.
$MC_1$ and $MC_4$. Until now, we have determined differences for the states $SB_2^{out}$,
$SB_3^{in}$, $SB_3^{out}$ and $SB_4^{in}$. Since all differences in $SB_2^{out}$ and $SB_4^{in}$ have already been
determined, we have only about $255/2^8 \approx 1$ difference left for $SB_2^{in}$ and $SB_4^{out}$.
Note that choosing the difference for one byte determines all other differences
as well due to the restrictions by MixColumns.
Note that we can find one possible differential characteristic with a complexity
of about one, since we filter through each MixColumns and S-box transformation
only once. The total number of possible differential trails can be determined
by considering the number of choices we have at the input and output of S-box
$SB_3$, the input of S-box $SB_2$ and the output of S-box $SB_4$. The approximate
number of choices are listed in Table 3.3 and by multiplying these numbers we
can get up to $\sim 2^{64}$ possible differential trails or starting points for the next
phase.
Table 3.3: The approximate number of possible choices for the differences at the
input and output of the 3 S-boxes $SB_2$, $SB_3$ and $SB_4$.
$SB_2^{in}$   $SB_3^{in}$   $SB_3^{out}$   $SB_4^{out}$
16            255           127            8
8             127           64             4
4             64            32             2
2             32            16             1
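The total of about 2^64 starting points is simply the product of all entries of Table 3.3. A short check, with dictionary keys chosen by us to mirror the column headers:

```python
from math import log2, prod

# Approximate number of choices per column of Table 3.3.
CHOICES = {
    "SB2_in": [16, 8, 4, 2],
    "SB3_in": [255, 127, 64, 32],
    "SB3_out": [127, 64, 32, 16],
    "SB4_out": [8, 4, 2, 1],
}

# log2 of the number of differential trails (starting points).
LOG2_TRAILS = sum(log2(prod(values)) for values in CHOICES.values())
```

The sum of the four logarithms is roughly 10 + 26 + 22 + 6, which is slightly below 64.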
$k \oplus x \cdot (k \oplus k')$
$SB_4^{in}[0, 0] = (k \oplus x \cdot (k \oplus k')) \cdot L$
$(k \oplus x \cdot (k \oplus k')) \cdot L = a \oplus y \cdot (a \oplus a')$
By doing the same for the other diagonals (corresponding to columns 2-4 of
$MC_3^{in}$) we get a system of 16 equations in 16+4=20 variables which has to be
fulfilled to guarantee that the differential trail holds in the forward direction. In
a similar way we also get a system of 16 linear equations in 20 variables by going
backward from $SB_3^{in}$ to $SB_2^{out}$. However, since the values of $SB_3^{in}$ and $SB_3^{out}$ are
related, we get in total a system of 64 equations in 24 variables by combining
them. In other words, to find a valid pair, we have to backtrack and try about
$2^{40}$ differential trails and thus, solve the linear system of equations $2^{40}$ times.
Since we can start with up to $2^{64}$ differential trails, we can only find about
$2^{64-40} = 2^{24}$ pairs after the linear solving step.
In the case of AES, we get a better complexity if we first fix the differential
trail for rounds 1-3 (1 → 4 → 16 → 4) and then, solve for right pairs. In this
case, we get only 32 conditions and the complexity to solve for one pair is about
$2^{12}$. Since we need to repeat the attack $2^{24}$ times to fulfill the last MixColumns
operation we get a total complexity of only $2^{36}$ in this case.
Note that the attack works similarly if we use 4 possible input pairs for the
S-box. By choosing the differences in the previous step (Section 3.6.1) in a
way that maximizes the number of differentials with 4 possible pairs for the S-box,
the overall complexity can be reduced slightly (by about $2^2$ to $2^5$). The total
3.7. Time-Memory Trade-Offs using SuperBoxes 49
complexity of the attack is given by the number of times we need to solve the
resulting linear system of equations. We assume here that this corresponds to
about one call to the AES. Hence, the complexity is approximately $2^{36}$ to find
a right pair and thus, a distinguisher for the 7-round path.
[Figure: truncated differential trail over the states S0 to S8.]
tion 3.5.2 with SuperBoxes instead of S-boxes. The inbound phase using Super-
Box matches is shown in Figure 3.6. The order of the SubBytes and ShiftRows
transformation in r4 has been swapped to get a better view on the SuperBox. In
the case of AES, this table has a size of about $2^{64}$. For SHA-3 candidates which
use the AES round transformations as a building block, this time and memory
complexity is still beyond any generic attacks on the hash function. The precomputation
complexity to build the DDT is $2^{64}$. Once the table has been built,
the average complexity to find one right pair is 1.
As shown in Section 3.5.2, a random column or SuperBox differential is possible
with a probability of about $2^{-4}$. Hence, we need to try about $2^{16}$ differentials
in the inbound phase to find a possible differential between state $S_3'$ and $S_5^{SB}$.
The main advantage of this method is that we need only one possible differential
to find a right pair. This fact can be quite important in some restricted attacks.
Figure 3.6: The 3-round truncated differential trail and the inbound phase using
SuperBoxes.
(a) $4 \xrightarrow{r_2} 1$ (memory $2^8$). (b) $4 \xrightarrow{r_2} 3$ (memory $2^{24}$).
Figure 3.7: Two 3-round truncated differential trails with non-full active Super-
Box matches in the inbound phase.
points are needed to find right pairs with an average complexity of 1 and memory
requirements of $2^{8 \cdot \min\{x,y\}}$. Recently, another technique has been published and
implemented which can find values for x → 4 active bytes with complexity $2^{4+7 \cdot x}$
in the case of AES [JF11]. The memory complexity in this case is $2^{16}$.
3.8 Summary
In this chapter, we have applied the rebound attack to the AES in the known-
key setting which corresponds to the permutation setting of many AES-based
hash functions. We have analyzed the differential properties of the round trans-
formations in detail, discussed how to find good truncated differential trails and
computed the expected number of right pairs of a trail.
The rebound attack consists of two main phases, the inbound and outbound
phase. In the outbound phase, the propagation is probabilistic through the
MixColumns transformation and the probability can easily be deduced from the
truncated differential trail. In the inbound phase, we can use the available
freedom in choosing the values of the state. Various techniques have been shown
with slightly different requirements. An overview of the techniques for generic
state sizes of r × c with s-bit S-boxes and SuperBox matches of size r · s bits
with x → y active bytes is given in Table 3.4. We assume that a random S-box
differential is possible with probability $2^{-1}$. In general, we can find one right pair
with an average complexity of one for any valid 3-round truncated differential
trail with most of these techniques. Details vary in memory requirements or the
complexity of finding the first right pair.
Table 3.4: Overview of different techniques to find right pairs for the 3-round
inbound phase with average complexity 1. The number of active bytes are the
same for each SuperBox and given for one SuperBox. (1) Differential distribu-
tion table. (2) Time-memory trade-off [LMR+ 09, GP10]. (3) Non-full active
SuperBoxes with x + y ≥ r + 1 [SLW+ 10]. (4) Start-from-the-middle techniques
[MPRS09, LMR+ 10]. (5) Linear solving technique [MPRS09].
The simplification that the inbound phase can be solved with an average
complexity of 1 for 3 rounds significantly improves the description of a rebound
attack. In this case, the complexity of an attack can be derived almost imme-
diately from a given truncated differential path (for example, see Figure 3.5).
However, one should of course check carefully whether all requirements of an attack are
met. The number of starting points, the conditions and the complexity to find
the first pair need to be considered. Furthermore, the truncated differential trail
should be valid and have enough freedom such that right pairs in every phase
of the attack can be found. Nevertheless, the rebound attack simplifies and im-
proves the analysis of AES-based primitives, which is shown in the attacks of
the following chapters.
Future work is to improve and extend the inbound phase. This is especially
possible for permutations with a less optimal diffusion than in the AES. First at-
tempts have been published by Naya-Plasencia in [Nay10] and other techniques
have been applied to the SHA-3 candidates Luffa [DSW09] in [KNPRS10] and
JH [Wu08] in [RTV10]. By using multiple inbound phases (see Section 2.4.5
and Chapters 6 and 7), the 8-round known-key distinguisher could be
extended to a 9-round chosen-key distinguisher, similarly to the attack on the
compression function of Whirlpool [LMR+ 09]. Also, the application of the rebound
attack to other primitives and the provable resistance against the rebound
attack are open problems.
4 Design, Security and Implementation of the Hash Function Grøstl
Since December 2010, the hash function Grøstl [GKM+ 11] is one of 5 final-
ists in the NIST SHA-3 competition [Nat07b]. Grøstl is an iterated wide-pipe
design with a permutation-based compression function. The permutations are
constructed using similar design principles as in the AES. We describe Grøstl
in detail in Section 4.1. We briefly discuss the security of the Grøstl hash func-
tion and its components in Section 4.2. Since the permutations in Grøstl are
based on AES, similar implementation techniques apply and are described in
Section 4.3. In that section, we also present a new byte-slicing technique which
allows us to implement Grøstl efficiently using the new Intel AES and AVX
instructions. More details of these implementations are given in [RS11].
$H_0 = IV$
$H_i = f(H_{i-1}, M_i)$ for $1 \le i \le t$
$h = \Omega(H_t)$.
[Figure: the compression function f, which maps $H_{i-1}$ and $M_i$ to $H_i$ using the permutations P and Q.]
where $\mathrm{trunc}_n(x)$ discards all but the least significant n bits of x. The output
transformation is also shown in Figure 4.2.
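The iteration above can be sketched in Python as follows. The compression function f(h, m) = P(h ⊕ m) ⊕ Q(m) ⊕ h and the output transformation Ω(x) = trunc_n(P(x) ⊕ x) follow the Grøstl construction, but `toy_p` and `toy_q` are hypothetical placeholder bijections and NOT the Grøstl permutations. Padding is omitted, and we assume that the least significant bits are the trailing bytes of the state.

```python
def xor(a, b):
    """Bytewise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def toy_p(state):
    """Placeholder for P (assumption): rotate the bytes, add a constant."""
    rotated = state[1:] + state[:1]
    return bytes((b + 0x3C) & 0xFF for b in rotated)

def toy_q(state):
    """Placeholder for Q (assumption): rotate the bytes, complement them."""
    rotated = state[-1:] + state[:-1]
    return bytes(b ^ 0xFF for b in rotated)

def compress(h, m, p=toy_p, q=toy_q):
    """f(h, m) = P(h xor m) xor Q(m) xor h."""
    return xor(xor(p(xor(h, m)), q(m)), h)

def output_transform(h, n_bytes, p=toy_p):
    """Omega(h) = trunc_n(P(h) xor h), keeping the trailing n_bytes bytes."""
    return xor(p(h), h)[-n_bytes:]

def iterate(iv, blocks, n_bytes=32):
    """H_0 = IV, H_i = f(H_{i-1}, M_i), h = Omega(H_t)."""
    h = iv
    for m in blocks:
        h = compress(h, m)
    return output_transform(h, n_bytes)

# One all-zero 512-bit block with an all-zero IV (illustration only).
DIGEST = iterate(bytes(64), [bytes(64)])
```

Replacing the two placeholders by real permutations and adding the padding rule would turn this skeleton into the hash function. The chaining value stays 512 bits wide and only the final output is truncated, which is the wide-pipe property mentioned above.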
4.1. Description of Grøstl 57
Figure 4.3: One round of one permutation of the Grøstl-256 hash function.
4.1.4.1 AddRoundConstant
The AddRoundConstant (AC) transformation XORs a round-dependent constant
to one row of the state. The constant and the row are different for P and Q.
Additionally, a round-independent constant 0xff is XORed to every byte in Q.
The XOR constants for round i are shown in Figure 4.4.
4.1.4.2 SubBytes
The SubBytes (SB) transformation applies the AES S-box to each byte of the
state.
4.1.4.3 ShiftBytes
ShiftBytes (SH) cyclically rotates the bytes of row r to the left by σ[r] positions
with different values for P and Q in Grøstl-256 and Grøstl-512. We get the
0i 1i 2i 3i 4i 5i 6i 7i 0i 1i 2i 3i 4i 5i 6i 7i 8i 9i ai bi ci di ei fi
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
fi ei di ci bi ai 9i 8i fi ei di ci bi ai 9i 8i 7i 6i 5i 4i 3i 2i 1i 0i
Figure 4.5: SubBytes substitutes each byte of the state using the AES S-box.
4.1.4.4 MixBytes
MixBytes (MB) is a linear diffusion layer, which multiplies each column A of the
state with a constant, circulant 8 × 8 MDS matrix B with branch number 9:
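$A = B \times A$
where
$B = \begin{pmatrix}
2 & 2 & 3 & 4 & 5 & 3 & 5 & 7 \\
7 & 2 & 2 & 3 & 4 & 5 & 3 & 5 \\
5 & 7 & 2 & 2 & 3 & 4 & 5 & 3 \\
3 & 5 & 7 & 2 & 2 & 3 & 4 & 5 \\
5 & 3 & 5 & 7 & 2 & 2 & 3 & 4 \\
4 & 5 & 3 & 5 & 7 & 2 & 2 & 3 \\
3 & 4 & 5 & 3 & 5 & 7 & 2 & 2 \\
2 & 3 & 4 & 5 & 3 & 5 & 7 & 2
\end{pmatrix}.$
Figure 4.7: The MixBytes transformation multiplies each column of the state by
a constant MDS matrix B with branch number 9.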
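Because B is circulant, it is fully described by its first row. The following sketch builds B from that row and multiplies one state column by it in GF(2^8), using the AES reduction polynomial as in the rest of the design. The helper names are our own.

```python
FIRST_ROW = [2, 2, 3, 4, 5, 3, 5, 7]
# Row r of the circulant matrix is the first row rotated right by r positions.
B = [FIRST_ROW[-r:] + FIRST_ROW[:-r] for r in range(8)]

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def mix_bytes_column(column):
    """Multiply one 8-byte column of the state by the matrix B."""
    out = []
    for row in B:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

def weight(column):
    """Number of active (non-zero) bytes."""
    return sum(1 for byte in column if byte)

# A single active input byte always activates all 8 output bytes (1 + 8 = 9).
SINGLE_BYTE_WEIGHTS = {
    weight(mix_bytes_column([v if i == pos else 0 for i in range(8)]))
    for pos in range(8) for v in range(1, 256)}
```

The last check only confirms the easiest consequence of branch number 9, namely that one active input byte leads to 8 active output bytes. It does not verify the MDS property in general.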
4.2 Security
Grøstl is a design with security proofs on the hash function, compression func-
tion and permutation. These proofs show that the construction of these com-
ponents is sound. Additionally, Grøstl is a failure-tolerant design. A distin-
guishing attack on the permutation most likely does not lead to an attack on
the compression function. Similarly, attacks on the (full) compression function
do not lead to attacks on the hash function due to the wide-pipe design. In
the following, we give a brief overview of the security of the building blocks in
Grøstl. For more details, we refer to the Grøstl specification [GKM+ 11]. All
details about rebound attacks on Grøstl are given in Chapter 5.
with complexity $2^{\ell/2}$. Note that using cycle finding algorithms, preimages for
the compression function can also be found in a memoryless way.
which is a 4-sum of value zero. Note that this also implies that H1 ⊕ H2 =
H3 ⊕ H4 = ∆1 and we get
4.2.3 Permutations
The AES-based permutations in Grøstl have been designed strictly according
to the wide-trail design strategy [DR02]. In both Grøstl-256 and Grøstl-512,
the branch number of MixBytes is 9 and ShiftBytes moves bytes of each column
to 8 different columns. It follows from [DR02, Theorem 9.5.1] that in any 4-
round differential or linear trail at least $9^2 = 81$ S-boxes are active. Hence,
$(P_S^{max})^z = (2^{-6})^{81} = 2^{-486}$ upper bounds the expected differential probability
of any 4-round differential trail ($2^{-972}$ for any 8-round trail) and there is very
little chance that a classical differential (or linear) attack can be successful.
4.3.1 Table-Based
For the AES, a table-based approach to efficiently compute the combined
SubBytes and MixColumns has been proposed in [DR99b]. The same approach
can also be applied to Grøstl. Using this technique, at least one table lookup is
needed for each S-box. The MixBytes transformation is computed in parallel for
rows of the state and can be combined with the S-box lookup. This approach
is most efficient if the column size matches the register size. This is the case
on 32-bit platforms for AES and on 64-bit platforms for Grøstl. Since many
current and future small-scale 32-bit processors also provide 64-bit instructions
(MMX, NEON), Grøstl can also be implemented efficiently on these platforms.
where $b_0 = [b_{00}, b_{10}, \dots, b_{70}]^T$ is the resulting 64-bit value of the first column
computation. The input bytes $a_{ij}$ are extracted from the state according to
the ShiftBytes transformation and the S-box S(x) is applied to these bytes prior
to the matrix multiplication of MixBytes. Expanding the matrix multiplication
then gives:
b00 2 · S(a00 ) 2 · S(a11 ) 3 · S(a22 ) 4 · S(a33 )
b10 7 · S(a00 ) 2 · S(a11 ) 2 · S(a22 ) 3 · S(a33 )
b20
5 · S(a00 )
7 · S(a11 )
2 · S(a22 )
2 · S(a33 )
b30
= 3 · S(a00 )
⊕ 5 · S(a11 )
⊕ 7 · S(a22 )
⊕ 2 · S(a33 )
⊕
b40
5 · S(a00 )
3 · S(a11 )
5 · S(a22 )
7 · S(a33 )
b50 4 · S(a00 ) 5 · S(a11 ) 3 · S(a22 ) 5 · S(a33 )
b60 3 · S(a00 ) 4 · S(a11 ) 5 · S(a22 ) 3 · S(a33 )
b70 2 · S(a00 ) 3 · S(a11 ) 4 · S(a22 ) 5 · S(a33 )
5 · S(a44 ) 3 · S(a55 ) 5 · S(a66 ) 7 · S(a77 )
4 · S(a44 ) 5 · S(a55 ) 3 · S(a66 ) 5 · S(a77 )
3 · S(a44 )
4 · S(a55 )
5 · S(a66 )
3 · S(a77 )
2 · S(a44 )
⊕ 3 · S(a55 )
⊕ 4 · S(a66 )
⊕ 5 · S(a77 )
2 · S(a44 )
2 · S(a55 )
3 · S(a66 )
4 · S(a77 )
7 · S(a44 ) 2 · S(a55 ) 2 · S(a66 ) 3 · S(a77 )
5 · S(a44 ) 7 · S(a55 ) 2 · S(a66 ) 2 · S(a77 )
3 · S(a44 ) 5 · S(a55 ) 7 · S(a66 ) 2 · S(a77 )
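which simplifies to
$b_0 = T_0(a_{00}) \oplus T_1(a_{11}) \oplus \dots \oplus T_7(a_{77})$
where the tables $y = T_i(x)$ contain 8-bit to 64-bit lookups of the S-box together
with the 8 multipliers of MixBytes. For example, for the first table $T_0$ we get:
$T_0(x) = [2 \cdot S(x), 7 \cdot S(x), 5 \cdot S(x), 3 \cdot S(x), 5 \cdot S(x), 4 \cdot S(x), 3 \cdot S(x), 2 \cdot S(x)]^T$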
Extracting a single byte from a word can be implemented using one bit-shift
and one masking (logical and) instruction. Many processors also provide
instructions to directly access a single byte of a word. Then, the computation of
one column consists of only 8 table-lookups, 8 XORs (7 XORs for MB, 1 XOR
for AC), and 8 SHIFTs with 8 ANDs if no instruction to extract single bytes $a_{ij}$
from the 64-bit column values $a_j = [a_{0j}, a_{1j}, \dots, a_{7j}]^T$ is available.
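The table-based column computation can be sketched as follows. The eight tables combine the S-box with one column of B each, and one output column is then obtained from 8 lookups and 7 XORs. The sketch assumes that row 0 is packed into the most significant byte of the 64-bit word and leaves out AddRoundConstant and the ShiftBytes indexing. The function `column_direct` exists only to cross-check the tables.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def aes_sbox():
    """AES S-box: inversion in GF(2^8) followed by the affine transformation."""
    table = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        table.append(y ^ 0x63)
    return table

def pack(column):
    """Pack 8 bytes into a 64-bit word, row 0 in the most significant byte."""
    word = 0
    for byte in column:
        word = (word << 8) | byte
    return word

FIRST_ROW = [2, 2, 3, 4, 5, 3, 5, 7]
B = [FIRST_ROW[-r:] + FIRST_ROW[:-r] for r in range(8)]
SBOX = aes_sbox()
# T[j][x]: column j of B multiplied by S(x), packed into one 64-bit word.
T = [[pack([gf_mul(B[r][j], s) for r in range(8)]) for s in SBOX]
     for j in range(8)]

def column_via_tables(diagonal):
    """One output column from the 8 input bytes a_00, a_11, ..., a_77."""
    word = 0
    for j, a in enumerate(diagonal):
        word ^= T[j][a]
    return word

def column_direct(diagonal):
    """Reference computation: S-box first, then the multiplication by B."""
    s = [SBOX[a] for a in diagonal]
    out = []
    for r in range(8):
        acc = 0
        for j in range(8):
            acc ^= gf_mul(B[r][j], s[j])
        out.append(acc)
    return pack(out)
```

Each table has 256 entries of 8 bytes, so the eight tables together occupy 16 KB in this straightforward layout.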
4.3. Efficient Implementation Techniques 65
Figure 4.8: For the T-table approach, the Grøstl-256 state is stored column-wise
in 64-bit registers.
The same T-table approach can also be used for efficient implementations
on 32-bit processors. In this case, we split up the computation into an upper
part and lower part. We need to split up the tables $T_i$ into one table $T_i'$ storing
the upper 32 bits and one table $T_i''$ storing the lower 32 bits. Due to the cyclic
structure of the MixBytes transformation matrix, the tables $T_i'$ can be reused to
look up also the lower 32 bits since we have $T_i'' = T'_{(i+4) \bmod 8}$. Hence, we get
Since the number of table-lookups and XORs double for the 32-bit T-table
implementation, we get a lower bound of 40 cycles/byte for Grøstl-256 and
56 cycles/byte for Grøstl-512 if no parallel table-lookups are possible. However,
many current and future 32-bit processors have 64-bit instruction set extensions
such as MMX for Intel/AMD processors [Int11b] and NEON for ARM processors
[ARM11].
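The reuse of the upper-half tables in the 32-bit variant rests on a property of the multipliers alone, since the S-box factor is common to both halves. A minimal check, assuming that the "upper 32 bits" correspond to rows 0 to 3 of the column and the "lower 32 bits" to rows 4 to 7:

```python
FIRST_ROW = [2, 2, 3, 4, 5, 3, 5, 7]
# Row r of the circulant matrix B is the first row rotated right by r positions.
B = [FIRST_ROW[-r:] + FIRST_ROW[:-r] for r in range(8)]

def column(j):
    """Multipliers stored in table T_j: column j of the matrix B."""
    return [B[r][j] for r in range(8)]

UPPER = [column(j)[:4] for j in range(8)]   # multipliers of the tables T'_j
LOWER = [column(j)[4:] for j in range(8)]   # multipliers of the tables T''_j

# T''_i uses the same multipliers as T'_{(i+4) mod 8}.
RELATION_HOLDS = all(LOWER[i] == UPPER[(i + 4) % 8] for i in range(8))
```

For example, the lower half of column 0 and the upper half of column 4 both contain the multipliers 5, 4, 3, 2, so only the eight upper-half tables need to be stored.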
Future work is to reduce the number of ALU instructions, for example using
128-bit registers to halve the number of XORs. This could be particularly
useful for the AMD implementation with parallel table-lookups since the ALU
instructions are the bottleneck of the implementation.
Figure 4.9: For the AES-NI implementation, the Grøstl-256 state is stored
row-wise in xmm registers to compute each column 16 times in parallel.
4.3.2.2 AddRoundConstant
The AddRoundConstant transformation XORs a round-dependent row-wise con-
stant to the first row in P and the last row in Q, and a round-independent
constant to each row of Q. Since the Grøstl state is stored in row-ordering,
these constants can be added efficiently in parallel to each column of the state.
4.3.2.3 SubBytes
SubBytes is usually the most difficult transformation to implement efficiently in
a byte-slice implementation. As already mentioned, for w-bit registers we need
an efficient method to compute w/8 parallel AES S-box lookups. This results
in only one (parallel) table lookup in the case of 8-bit implementations (w = 8).
Unfortunately, for larger register sizes, parallel table-lookups are usually non-
trivial.
Although Grøstl does not use the same MDS matrix as the AES, Grøstl
can still take advantage of the Intel AES new instruction set extension (AES-
NI). Since no MixColumns transformation is applied in the last round of the
AES, Intel also provides an AESENCLAST instruction. This instruction is able to
compute 16 AES S-boxes with a throughput of only 1 cycle and a latency of 4
cycles. The byte-shuffling of the AESENCLAST instruction can be reversed and
computed together with the ShiftBytes transformation (see Section 4.3.2.4).
For processors without AES instructions, another method to efficiently
compute many AES S-box lookups in parallel has been published by Mike Hamburg
in [Ham09] and first implemented for Grøstl by Çağdaş Çalik in [Çal10]. This
vperm implementation uses small log tables of the finite field GF(2^4) to
efficiently compute the inverse in GF(2^8) of the AES S-box. The log tables for
the multiplication and inverse in GF(2^4) consist of 4-bit table lookups which
can be implemented efficiently using 128-bit registers and byte-shuffling
operations (e.g. using the PSHUFB instruction). Using the vperm implementation,
we can compute 16 AES S-box lookups in fewer than 10 cycles. An additional
advantage of the vperm implementation is that we can multiply the resulting
output by a constant in GF(2^8) for free, which is useful for the MixBytes
transformation.
4.3.2.4 ShiftBytes
Since ShiftBytes just moves bytes within one row of Grøstl, this transformation
can be implemented only using byte-shuffling instructions. If AESENCLAST is used
to compute the S-box lookups, we need to correct the ShiftRows transformation
of the last round in AES. These two byte-shufflings can be combined into a single
PSHUFB instruction. Note that any ShiftBytes rotation constants could be used
for P and Q at no additional cost.
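Combining two byte shuffles into one can be illustrated with a small model of a PSHUFB-style shuffle. The register layout (low 8 bytes hold a row of P, high 8 bytes hold a row of Q), the rotation amounts and all names below are hypothetical and serve only to show the mask composition; they are not the layout of an actual implementation.

```python
def shuffle(x, mask):
    """Model of a byte shuffle: output byte i is input byte mask[i]."""
    return [x[m] for m in mask]

def compose(first, second):
    """Single mask equivalent to applying `first` and then `second`."""
    return [first[m] for m in second]

# AES ShiftRows on a column-major 16-byte state, written as a shuffle mask.
SHIFT_ROWS = [(i % 4) + 4 * ((i // 4 + i % 4) % 4) for i in range(16)]

# Inverse mask: undoes the byte movement performed by ShiftRows.
INV_SHIFT_ROWS = [0] * 16
for i, m in enumerate(SHIFT_ROWS):
    INV_SHIFT_ROWS[m] = i

def row_rotation_mask(rot_p: int, rot_q: int):
    """Rotate the low 8 bytes left by rot_p and the high 8 bytes by rot_q.

    Assumed layout: bytes 0..7 belong to P, bytes 8..15 belong to Q.
    """
    p = [(j + rot_p) % 8 for j in range(8)]
    q = [8 + (j + rot_q) % 8 for j in range(8)]
    return p + q

rotation = row_rotation_mask(1, 3)            # hypothetical rotation amounts
combined = compose(INV_SHIFT_ROWS, rotation)  # one mask instead of two

x = list(range(16))
two_steps = shuffle(shuffle(x, INV_SHIFT_ROWS), rotation)
one_step = shuffle(x, combined)
print(one_step == two_steps)
```

Because the combined mask is just another permutation of 16 byte positions, its cost does not depend on the rotation constants chosen.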
4.3.2.5 MixBytes
The MixBytes transformation is the most costly transformation in a byte-slice
implementation of Grøstl. We need to combine the 8 rows of the Grøstl state
according to the MixBytes matrix multiplication. A naive approach needs to
multiply the bytes of all 8 rows by the 5 occurring multipliers. Then, we need
5 · 8 = 40 multiplications by 2 and 7 · 8 = 56 XORs to compute MixBytes.
Usually, the multiplication is between 3 (ATmega163) and 5 (Intel Core) times
as expensive as a simple XOR. Therefore, we usually use only the multipliers 1,
2, and 4. The resulting multiplication matrix is given in Table 4.3.
In this case, we need 14 · 8 = 112 XORs but only 16 multiplications by 2.
Note that the hash function Whirlpool needs 80 XORs and 24 multiplications by
2 to compute its 8 × 8 MDS matrix multiplication which results in a higher cost
on desktop processors. Furthermore, the MixBytes transformation in Grøstl
has been designed to reduce the number of XORs by increasing the Hamming
weight for the constants in the MDS matrix. This allows the computation of
many temporary results to save XOR operations. In the following, we show
two optimized variants to compute MixBytes efficiently. Note that the total
cost depends on the target platform, so different variants can be more
efficient on different platforms.
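The decomposition into the multipliers 1, 2 and 4 can be sketched as follows. This is an unoptimized illustration, not one of the optimized variants of this section: it assumes the MixBytes matrix circ(02, 02, 03, 04, 05, 03, 05, 07) over GF(2^8) with the AES polynomial, and the function names are hypothetical. Each output byte is computed as x ⊕ 2·(y ⊕ 2·z), which uses two doublings per row, i.e. 16 multiplications by 2 per column.

```python
def xtime(a: int) -> int:
    """Multiply by 2 in GF(2^8) modulo 0x11B."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def gf_mul(a: int, b: int) -> int:
    """General GF(2^8) multiplication, used only as a reference."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r

# Assumed first row of the circulant MixBytes matrix.
C = (2, 2, 3, 4, 5, 3, 5, 7)

def mixbytes_reference(a):
    """Direct matrix multiplication with all five multipliers."""
    out = []
    for i in range(8):
        acc = 0
        for j in range(8):
            acc ^= gf_mul(C[(j - i) % 8], a[j])
        out.append(acc)
    return out

def mixbytes_124(a):
    """Same result using only XORs and doublings (multipliers 1, 2, 4)."""
    out = []
    for i in range(8):
        x = y = z = 0
        for j in range(8):
            c = C[(j - i) % 8]
            if c & 1:
                x ^= a[j]   # contributes with factor 1
            if c & 2:
                y ^= a[j]   # contributes with factor 2
            if c & 4:
                z ^= a[j]   # contributes with factor 4
        out.append(x ^ xtime(y ^ xtime(z)))
    return out

column = [0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF]
print(mixbytes_124(column) == mixbytes_reference(column))
```

The sketch does not share temporary results between rows, so its XOR count is higher than that of the optimized sequences discussed in the text.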
t = 2 · a0 + 2 · a2 + a5 + 4 · a7 + a7 (4.4)
   |    a0     |    a1     |    a2     |    a3     |    a4     |    a5     |    a6     |    a7     |
   | 4   2   1 | 4   2   1 | 4   2   1 | 4   2   1 | 4   2   1 | 4   2   1 | 4   2   1 | 4   2   1 |
b0 | −  •^1  − | −  •^2  − | −  •^1 •^9 | •^d  −  − | •^d  − •^2 | −  •^9 •^1 | •^2  − •^2 | •^1 •^2 •^1 |
b1 | •^5 •^1 •^5 | −  •^a  − | −  •^1  − | −  •^5 •^b | •^d  −  − | •^5  − •^1 | −  •^b •^a | •^1  − •^1 |
b2 | •^5  − •^5 | •^7 •^2 •^7 | −  •^c  − | −  •^5  − | −  •^7 •^2 | •^5  −  − | •^2  − •^2 | −  •^2 •^c |
b3 | −  •^1 •^3 | •^7  − •^7 | •^3 •^1 •^3 | −  •^3  − | −  •^7  − | −  •^3 •^1 | •^d  −  − | •^1  − •^1 |
b4 | •^d  − •^3 | −  •^a •^4 | •^3  − •^3 | •^4 •^3 •^4 | −  •^4  − | −  •^3  − | −  •^4 •^a | •^d  −  − |
b5 | •^d  −  − | •^6  − •^4 | −  •^c •^9 | •^4  − •^4 | •^6 •^4 •^6 | −  •^9  − | −  •^4  − | −  •^6 •^c |
b6 | −  •^8 •^3 | •^6  −  − | •^3  − •^3 | −  •^3 •^b | •^6  − •^6 | •^8 •^3 •^8 | −  •^b  − | −  •^6  − |
b7 | −  •^8  − | −  •^2 •^4 | •^d  −  − | •^4  − •^4 | −  •^4 •^2 | •^8  − •^8 | •^2 •^4 •^2 | −  •^2  − |
Table 4.4: The MixBytes computation separated for factors 1, 2 and 4. a_i denote
the input bytes and b_i = b_{i,1} ⊕ b_{i,2} ⊕ b_{i,4} are the output bytes. A “•” marks
those inputs (a_i, 2 · a_i, 4 · a_i) which are added to get the intermediate results
b_{i,j}. Superscripts denote the order in which temporary values are computed.
The results for factor 2 are computed by multiplying the results of factor 1 by 2
where b_{i,2} = 2 · b_{(i+3) mod 8, 1}.
we remove the already added terms, we continue with the greedy approach until
only single terms are left. Using this approach we found a sequence of computing
MixBytes which requires only 66 XORs and 16 multiplications by two. This
sequence is shown in Table 4.4 using superscript numbers to denote the order
of computing temporary results. It is still an open problem to find the smallest
number of XORs needed to compute MixBytes in Grøstl.
Exploiting XOR Parallelism. For processors with more than one ALU, a
MixBytes computation with the minimum number of instructions does not
necessarily result in the fastest implementation. For example, modern desktop
CPUs contain 3 ALUs which can compute 3 independent XORs in parallel. Currently,
the MixBytes computation contains many dependencies such that the ALU
parallelism cannot be fully exploited. Additionally, parallel XOR computation
can also be used if even wider registers are available, for example using the
Intel AVX extension. Hence, there is still room for improvement since about 70%
of the time is spent on the computation of MixBytes.
4.4 Summary
In this chapter, we have presented the SHA-3 finalist Grøstl. Since Grøstl
is based on the AES, many cryptanalysis and implementation results can be
reused and applied also to Grøstl. Grøstl has proofs for the construction
and the permutations are provably resistant against standard differential and
linear attacks. Furthermore, since Grøstl has no key schedule, the freedom of
an attacker is limited and, for example, related-key or similar attacks are not
possible. We have also discussed three efficient implementation techniques and
shown that Grøstl can also be implemented efficiently using the new Intel AES
and AVX instructions.
5 Applying the Rebound Attack to Grøstl
In this chapter we apply the rebound attack to the hash function Grøstl
[GKM+11], which is one of the 5 finalists of the SHA-3 competition. The
iterated hash function Grøstl is based on a wide-pipe compression function and
has a non-invertible output transformation. Since the wide-pipe compression
function of Grøstl is known to be non-random, many distinguishers exist and
the hash function has been designed with this fact in mind. With ℓ denoting
the output size of the compression function, for example, collision attacks
in 2^{ℓ/3} time or 2^{ℓ/4} permutation queries, memoryless preimage attacks in
time 2^{ℓ/2}, and very efficient distinguishers are known [GKM+11]. Hence, the
strong output transformation with truncation is an important part of the design.
The rebound attack [MRST09] has been developed together with the design
of Grøstl. Both the rebound attack and the Grøstl design simplify the use of
the available freedom. While the goal of the rebound attack is to efficiently use
all available freedom, Grøstl has been designed to limit the freedom that can
be used in an attack. For example, Grøstl consists only of two permutations
without key schedule inputs and each permutation strictly follows the wide-trail
design strategy. Hence, no complicated attacks are needed which use freedom of
a key schedule and no sparse paths exist which may provide additional freedom
for an attacker.
The clean design of Grøstl and the simple application of the rebound attack
provide additional assurance to the security of Grøstl. Once the basics of
the rebound attack are known (see Chapter 3), one can quickly understand
the rebound attacks on Grøstl by just looking at the figures in the following
sections. Moreover, one can determine the complexity of the attack and think
of extensions or variants without the need of complicated tools.
In Section 5.1, we first apply the basic rebound attack on AES (see
Section 3.5) to the Grøstl-256 permutation and describe each step in detail. We
also analyze the various time-memory trade-offs (see Section 3.7) to efficiently
find pairs for the permutation. The results are distinguishers for up to 8 rounds
of the Grøstl-256 permutation and output transformation. In Section 5.2, we show
how to use these results to get semi-free-start collisions for 6 rounds of the
compression function of Grøstl-256. In Section 5.3, we apply the rebound attack to
the Grøstl hash function and get collisions for 3 rounds, since only one half of
the freedom is available in an attack on the hash function. In Section 5.4, we also
apply all rebound attacks to Grøstl-512. Finally, we summarize the analysis of
Grøstl in Section 5.5.
Note that in the last round of the competition, Grøstl has been tweaked
to increase its security margin. The initial submission without tweak is called
Grøstl-0 and various rebound attacks on round-reduced versions of Grøstl-0
have been presented in a series of papers. In Appendix A, we briefly
present the results on Grøstl-0 which can be viewed as a slightly simplified
variant of Grøstl. The results of this chapter have been published in
[MRST09, MPRS09, MRST10]. External cryptanalysis of Grøstl has been published
in [GP09, Pey10, SLW+10, ITP10].
the path probabilistically. Therefore, we aim for a sparse path with a high
probability in the outbound phase.
In the following subsections, we briefly describe how to construct good trun-
cated differential paths for the Grøstl-256 permutation and compute their ex-
pected number of pairs. We then use these paths in the rebound attack on the
permutation and apply 3 different inbound phases to the 8 × 8 state of Grøstl-
256. Finally, we discuss the outbound phase and present the best known rebound
attacks on the Grøstl-256 permutation and output transformation.
The truncated differential path is also shown in Figure 5.1. Note that for an
attack on the permutation we need to observe some non-random property at the
input and at the output. The path has a state that is not fully active at the
input but, contrary to the AES, the MixBytes transformation in the last round of
Grøstl is not omitted. Since MixBytes is a linear transformation, some
non-random properties can still be observed at the output of the permutation if
the number of active bytes is small prior to the last MixBytes transformation.
Such non-random properties have been analyzed in detail using the subspace
distinguisher in [LMR+09].
[Figure: truncated differential path through the states P0 to P7, with the
round transformations AC, SB, SH and MB between consecutive states.]
We can efficiently find pairs for this path by extending the inbound phase by one
round using SuperBox matches instead of S-box matches (also see Section 3.3.2).
Of course, also for Grøstl different time-memory trade-offs are possible but
the memory complexity is in general higher due to the larger MixBytes trans-
formation. We describe the different techniques for Grøstl in more detail in
Section 5.1.2.3.
[Figure: truncated differential path through the states P0 to P8, with the
round transformations AC, SB, SH and MB between consecutive states.]
Figure 5.2: Extending the minimum truncated differential path by one more fully
active state in the middle. Also for this 8-round path, the expected number of
right pairs is only 2^{16}.
Note that we get the same probabilities for the 8-round path in rounds r1, r5
and r6 since we reduce by the same number of active bytes. Therefore, also the
expected number of pairs for the 8-round path is 2^{16}.
[Figure: three truncated differential paths for the permutation P, drawn as
state sequences with the round transformations AC, SB, SH and MB between
consecutive states: (a) a 7-round path through P0 to P7, (b) and (c) 9-round
paths through P0 to P9.]
(b) A 9-round truncated differential path which is most likely impossible. The
probability that at least one right pair exists is only 2^{-432}.
(c) A 9-round truncated differential path with an expected number of 2^{16}
right pairs. It is unknown how to efficiently find pairs for this path.
Figure 5.3: Three alternative truncated differential paths for the permutation
P of Grøstl-256. The middle path is an impossible truncated differential path.
even hash function attacks. However, one might also consider other truncated
differential paths which could be used to extend the number of rounds, improve
the complexity, or find more solutions in an attack on Grøstl-256. Figure 5.3
shows 3 such truncated differential paths and in the following, we analyze which
paths can or cannot be used for an attack.
For many attacks, we need to be able to efficiently construct more than 2^{16}
pairs for one permutation. This is the case for compression function collisions on
Grøstl-0 (see Section A.1.1) but may also happen for distinguishers, depending
on the property observed at the input and output. Figure 5.3a shows a 7-round
path for the permutation P where we can find more than 2^{16} right pairs. For
this path, the MixBytes transformations in round r1 and r5 are probabilistic and
the expected number of right pairs is then:
Figure 5.3b shows an extension of the 8-round path to 9 rounds. Note that
a similar path has been successfully used in the compression function attacks on
Whirlpool [LMR+ 09]. However, this path is not possible in the case of Grøstl.
In Whirlpool, the freedom in the inputs of the key-schedule has been used to
control the propagation of truncated differences according to such a path. In
Grøstl, no key-schedule inputs exist and the truncated differential path can
only be controlled indirectly by the permutation input. However, for this path
the expected number of right pairs is far below one. The probability that a right
Finally, Figure 5.3c shows a third 9-round path which could be used to get
a distinguisher for the reduced Grøstl-256 permutation. This path has three
fully active states in the middle and the expected number of right pairs is 2^{16}.
However, it is still an open problem to find a right pair for this path with a
complexity smaller than in the generic case.
[Figure: inbound phase between the transformations SH, AC, SB, MB, SH, AC, MB,
solved with average complexity 1.]
Figure 5.4: The inbound phase of the attack on the Grøstl-256 compression
function using 8-bit S-box matches. The input and output of one S-box is
highlighted.
5.1. The Rebound Attack on the Grøstl-256 Permutation 79
For a single S-box, the probability that a random S-box differential exists is
about one half, which can be verified by computing the difference distribution
table (DDT) of the AES S-box (see Section 3.2.1 for more details). For one
column, we get a valid differential with a probability of about 2^{-8}. Hence, we
need to try all 255 non-zero differences for each active byte in P_3^{SH} to get a
valid differential for all 8 S-boxes of each column. If no match is found, we need
to restart from the beginning. Remember that for each valid S-box differential,
we get at least two (in some cases 4) right byte values such that the differential
holds.
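The statement about the DDT can be checked with a few lines of code. The following sketch builds the AES S-box from its standard definition (field inversion followed by the affine map) and inspects one DDT row per non-zero input difference; the helper names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def aes_sbox():
    """Build the AES S-box from field inversion and the affine map."""
    inv = [0] * 256
    for x in range(1, 256):
        for y in range(1, 256):
            if gf_mul(x, y) == 1:
                inv[x] = y
                break
    sbox = []
    for x in range(256):
        b = inv[x]
        s = b
        for k in range(1, 5):
            s ^= ((b << k) | (b >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

def ddt_row(sbox, din: int):
    """Count, for each output difference, the inputs x with that difference."""
    row = [0] * 256
    for x in range(256):
        row[sbox[x] ^ sbox[x ^ din]] += 1
    return row

SBOX = aes_sbox()
row = ddt_row(SBOX, 0x01)
possible = sum(1 for count in row if count > 0)
print(possible, "of 256 output differences are possible")
print("entry values that occur:", sorted(set(row)))
```

Each non-zero entry of a row gives the number of byte values for which the differential holds, which is where the "at least two, in some cases 4" solutions come from.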
We get at least 2^{64} right pairs for the whole inbound phase with a complexity
of about 8 · 2^{8} column operations. Furthermore, we can choose and start from
about 2^{64} differences for the active bytes in P_4. Hence, we can construct up
to 2^{128} pairs that follow the truncated differential path of the inbound phase
between state P_3^{SH} and P_4 with an average complexity of 1. The memory
complexity is only 2^{16} and the minimum complexity to find the first pair is
only about 2^{8} Grøstl round transformations.
Filtering for the differential path fixes the input and output differences of the
active S-boxes in round r3 and r4. We get a 7-bit condition for each of these
8 + 64 active S-boxes which results in a 504-bit condition (also see Section 3.6.2).
Since we have 512 free bits for the state, the linear system of equations is
underdefined and we expect to find a solution by solving the linear system of
equations only once.
In the 4-round case, we solve for pairs according to the following part of the
truncated differential path:
\[ 1 \xrightarrow{r_2} 8 \xrightarrow{r_3} 64 \xrightarrow{r_4} 8 \xrightarrow{r_5} 1 \]
In this case, we get 7-bit conditions for 8 + 64 + 8 active S-boxes which results
in a 560-bit condition. Since we have only 512 free bits for the state, the linear
system of equations is overdefined and we do not immediately get a solution.
Instead, we need to solve the system for about 2^{48} differential paths which
results in a complexity of approximately 2^{48} with negligible memory requirements.
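As a worked check of the condition counts in the two cases, assuming each active S-box contributes exactly 7 independent bit conditions and the state offers 512 free bits:

```latex
\[
\begin{aligned}
\text{3-round case:}\quad & 7 \cdot (8 + 64) = 504 \le 512
  && \Rightarrow \text{underdefined, about one solving attempt}, \\
\text{4-round case:}\quad & 7 \cdot (8 + 64 + 8) = 560 > 512
  && \Rightarrow 560 - 512 = 48 \text{ excess conditions}, \\
 & && \phantom{\Rightarrow{}} \text{about } 2^{48} \text{ differential paths to try}.
\end{aligned}
\]
```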
[Figure: inbound phase with 8 → 64 → 64 → 8 active bytes, solved with average
complexity 1.]
Figure 5.5: The inbound phase on the Grøstl-256 compression function using
64-bit matches with one SuperBox being highlighted.
We get the best time-memory trade-off for Grøstl using the technique of
Section 3.7.3 which has first been published in [LMR+09], applied to Grøstl in
[MRST10] and independently been found in [GP10]. This technique is explained
in detail in Section 3.7.3 for the case of the AES SuperBox (32 bits) and scales up
to the Grøstl SuperBox (64 bits). Using this technique the inbound phase has
a time and memory complexity of 2^{64} to find 2^{64} solutions. Hence, the average
complexity is 1 if 2^{64} or more pairs are computed. Note that this technique
only works if we are able to construct about 2^{64} differences for one side of the
SuperBox. This is possible for the given path of the permutation since we have
8 active bytes in state P_3^{SH} as well as P_5.
To summarize, we can find one pair for the given 3-round inbound phase with
an average complexity of one. Note that two times 8 bytes are active at
the input and output of the inbound phase and we expect to get one right pair
for each starting difference (also see Section 3.5.2 and Section 3.7. Hence, we
can construct at most 2^{16} right pairs). Hence, the complexity to distinguish 8
rounds of the permutation is about 2^{120} permutation evaluations.
In [SLW+10] another truncated differential path has been proposed to analyze
the Grøstl permutation. In that work, the truncated differential path has
no single fully active state but many half-active states in the middle rounds. This
gives slightly more freedom in the middle of the path to reduce the complexity
of a distinguishing attack on the permutation to 2^{48} with memory requirements
of 2^{8}. The drawback of this method is that such a path also has more active
bytes at the input and output of the permutation. Hence, the complexity of
an equivalent generic attack is reduced as well. Moreover, it is less likely that
these larger truncated differences can be used to extend the distinguisher to an
attack on the compression or hash function.
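[Figure: compression function diagram with the inputs Hi−1 and Mi (difference
∆Mi), the permutations P and Q, and the output Hi.]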
Figure 5.6: Active permutations and inputs (red) to get semi-free-start collisions
for the compression function of Grøstl.
Hence, we only need to ensure that the following condition on the differences
holds, which can be fulfilled using the birthday effect:
[Figure: truncated differential paths through Q0 to Q6 (input Mi) and P0 to P6
(input Hi−1, output Hi), with the round transformations AC, SB, SH and MB.]
Figure 5.7: A truncated differential path to get collisions for the compression
function. The number of active bytes at the input and prior to the last MixBytes
transformation is 1.
Extending this truncated differential path is difficult for three reasons. First,
the pattern of active bytes needs to be the same at the input and output of each
permutation. This is not the case if we extend the path by one round in any
direction. Second, extending the path also reduces the expected number of right
pairs and the degrees of freedom and an attack is often not possible anymore.
In Appendix A, we present an analysis of Grøstl-0 which has the same rotation
constants in both P and Q. In this case the second effect is indeed the limiting
factor of an attack. The third reason is the complexity of an attack which gets
too high if we want to extend the number of rounds.
5.2. Attacks on the Compression function of Grøstl-256 85
5.2.3.1 Path 1
The most straightforward approach is to consider only one active byte prior to
the first and after the last SubBytes layer. This way, the ShiftBytes transforma-
tions do not change the pattern of active bytes in the first and last round and
we can get a collision at the output of the compression function. The number
of active bytes for each round in both P and Q is then given as follows:
\[ 1 \xrightarrow{r_1} 8 \xrightarrow{r_2} 64 \xrightarrow{r_3} 64 \xrightarrow{r_4} 8 \xrightarrow{r_5} 1 \xrightarrow{r_6} 8 \]
The truncated differential path is shown in Figure 5.8. Note that the path and
also the pattern of active bytes is still similar in P and Q.
Next, we verify if the truncated differential path is valid, i.e. if the expected
number of right pairs is at least 1. This number can be computed by multiplying
the total number of input pairs by the probability that the truncated differential
path is followed for each input pair. The given path is only probabilistic in the
MixBytes transformations of round r4 and r5 , and in the XOR at the output.
Hence, the expected number of pairs is given as follows:
\[ 2^{8\cdot(64+1)} \cdot 2^{8\cdot 64} \cdot 2^{-8\cdot 56} \cdot 2^{-8\cdot 56} \cdot 2^{-8\cdot 7} \cdot 2^{-8\cdot 7} \cdot 2^{-8\cdot 1} = 2^{16} \]
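The bookkeeping behind the expected number of right pairs can be sketched as a small helper working with base-2 logarithms. The function name and its parameters are hypothetical; the sketch assumes that the input terms of the expression above are 8·(64+1) and 8·64 bits, that a MixBytes propagation reducing the number of active bytes from a to b costs a factor 2^{-8·(a−b)}, and that the XOR at the output costs 8 bits per matching byte.

```python
def log2_expected_pairs(free_input_bytes, reductions, xor_match_bytes):
    """Return log2 of the expected number of right pairs.

    free_input_bytes: number of bytes that can be chosen freely for a pair
    reductions:       list of (active_before, active_after) MixBytes steps
    xor_match_bytes:  number of bytes that must cancel in the output XOR
    """
    log2_pairs = 8 * free_input_bytes
    for before, after in reductions:
        log2_pairs -= 8 * (before - after)
    log2_pairs -= 8 * xor_match_bytes
    return log2_pairs

# 6-round compression function path: the same path is used in P and Q,
# so each reduction (64 -> 8 in r4, 8 -> 1 in r5) appears twice.
compression_path1 = log2_expected_pairs(
    free_input_bytes=(64 + 1) + 64,
    reductions=[(64, 8), (64, 8), (8, 1), (8, 1)],
    xor_match_bytes=1,
)
print("log2(expected right pairs) =", compression_path1)
```

A non-negative result indicates that at least one right pair is expected, which is the validity criterion used in the text.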
In the compression function attack, we first compute pairs for each
permutation independently and then match the input and output differences using
a birthday attack. By computing the inbound phase with SuperBox matches,
we can find pairs for the three middle rounds r2, r3 and r4 with an average
complexity of 1 and memory requirements of 2^{64}. For each permutation, we
independently propagate the resulting pairs outwards and get one active byte at
the input (P0, Q0) and one active byte after round r5 (P5, Q5) with a complexity
of 2^{2·56} = 2^{112}. To get a semi-free-start collision, the 1-byte differences at the
input, and the 1-byte differences prior to the last MixBytes transformation need
to be equal. This 16-bit condition can be fulfilled with a complexity of 2^{8} using
the birthday effect. In total, the complexity to get a semi-free-start collision for
6 rounds of Grøstl-256 is 2^{112} · 2^{8} = 2^{120} in time with memory requirements of
2^{64}.
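The total can be retraced step by step. The split into two 8 → 1 propagations per permutation, each succeeding with probability 2^{-56}, is an interpretation of the 2^{2·56} term and is stated here as an assumption:

```latex
\[
\underbrace{2^{56} \cdot 2^{56}}_{\text{two } 8 \to 1 \text{ propagations}} = 2^{2 \cdot 56} = 2^{112},
\qquad
\underbrace{2^{16/2}}_{\text{birthday match on 16 bits}} = 2^{8},
\qquad
2^{112} \cdot 2^{8} = 2^{120}.
\]
```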
[Figure: truncated differential path through the states Q0 to Q6 (input Mi),
with the round transformations AC, SB, SH and MB.]
5.2.3.2 Path 2
Note that another path with more active bytes can also be used to get a
semi-free-start collision for 6 rounds with the same complexity. This time, we
use two truncated differential paths in P and Q where the fully active state does
not occur in the same round. Hence, the numbers of active bytes in P and Q are
different and given as follows:
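\[ Q:\; 8 \xrightarrow{r_1} 1 \xrightarrow{r_2} 8 \xrightarrow{r_3} 64 \xrightarrow{r_4} 56 \xrightarrow{r_5} 8 \xrightarrow{r_6} 64 \]
\[ P:\; 8 \xrightarrow{r_1} 56 \xrightarrow{r_2} 64 \xrightarrow{r_3} 8 \xrightarrow{r_4} 1 \xrightarrow{r_5} 8 \xrightarrow{r_6} 64 \]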
[Figure: truncated differential paths through Q0 to Q6 (input Mi) and P0 to P6
(input Hi−1, output Hi); the annotated complexities are 1, 2^{56}, 2^{32} and
average 1.]
When applying the rebound attack to this path, we can solve the inbound
phase for rounds r1, r2 and r3 in permutation P, and for rounds r3, r4 and r5
in permutation Q independently and with average complexity 1. The memory
requirements are 2^{64} again. In each permutation, we have one propagation
through MixBytes from 8 to 1 active byte which has a complexity of 2^{56} in each
case. This time, we get a 128-bit condition such that the differences of the 8
active bytes at the input and output (prior to MixBytes) cancel each other. Using
a birthday attack we can match the differences with a complexity of 2^{64} in time
and memory. In total, the complexity for this semi-free-start collision attack on
6 rounds is again 2^{56} · 2^{64} = 2^{120} in time with memory requirements of 2^{64}.
[Figure: the permutations P and Q drawn next to each other as the state sequence
P3, P2, P1, P0, Q0, Q1, Q2, Q3, with the inputs IV and M1 and the output H1.]
Figure 5.10: The truncated differential path to get a collision for the hash func-
tion of Grøstl. The permutations are shown next to each other. This way,
the rebound attack on the hash function is viewed very similar to the rebound
attack on the compression function.
The rebound attack on the hash function of Grøstl is actually quite similar
to the attack on the compression function. We can do a basic inbound phase
again since the S-boxes of the first round in P and Q are completely independent.
Furthermore, we can add one more round in either P or Q to do independent
64-bit matches in the inbound phase as well. In this case, the resulting sequence
of transformations is similar to that of the Grøstl SuperBox. The 64-bit matches
of the hash function attack consist of an additional inverse SubBytes layer which
results in a (keyed) differential match on SB^{-1} - SB - MB - SB instead of SB
- MB - SB. Figure 5.10 and Figure 5.11 show these round transformations and
the first column of the 64-bit matches in more detail.
[Figure: inbound phase (average complexity 1) and outbound phases through the
states Q0, Q1, Q2^{SB}, Q2 (input M1) and P0, P1^{SB}, P1, P2 (input IV).]
Figure 5.11: The inbound phase of the attack on the hash function Grøstl-256.
The first 64-bit match is highlighted.
need a small number of active bytes at the output such that the complexity of
the attack is low. Due to the different shift values of ShiftBytes, it is difficult
to construct good truncated differential paths for both P and Q such that the
output patterns are the same. However, in the following we present two such
truncated differential paths which lead to a collision attack for 3 out of 10 rounds
of the hash function.
5.3.2.1 Path 1
The simplest case is to consider only one active byte prior to MixBytes in
the last round of each permutation. Then we immediately get the minimum
3-round truncated differential path given in Figure 5.12, with fully active states
at the input of each permutation.
Next, we need to verify if the truncated differential path is valid, i.e. if we
have enough freedom such that the expected number of right pairs is at least 1.
The expected number of right pairs can be computed by multiplying the total
number of input pairs by the probability that the truncated differential path is
followed for each input pair. For the truncated differential path of Figure 5.12,
the total number of input pairs depends on the number of pairs for the message
Mi and for the chaining input Hi−1 or initial value (IV ). The probability of the
given truncated differential path is determined by the probabilistic propagation
in the MixBytes transformations of round r1 and r2 and in the final XOR at the
output. For example, in the MixBytes transformation of round r2 in permutation
Q, the path reduces from 8 → 1 active bytes which happens with a probability
of about 2^{-56}. Hence, the expected number of right pairs of the truncated
differential path given in Figure 5.12 can be computed as follows:
\[ 2^{8\cdot(64+64)} \cdot 1 \cdot 2^{-8\cdot 56} \cdot 2^{-8\cdot 56} \cdot 2^{-8\cdot 7} \cdot 2^{-8\cdot 7} \cdot 2^{-8\cdot 1} = 2^{8} \]
[Figure: truncated differential paths through Q0 to Q3 (input M1) and P0 to P3
(input IV, output H1); the annotated complexities are average 1, 2^{8} and 2^{56}.]
Figure 5.12: The truncated differential path to get a collision attack on 3 out
of 10 rounds for the hash function of Grøstl-256. The inbound phase (red) can
be solved with average complexity 1, the outbound phase (blue) with a total
complexity of about 2^{56} · 2^{8} = 2^{64}. The first SuperBox in the inbound phase is
shown by red rectangles.
We use the rebound attack to find pairs for the truncated differential paths
in P and Q. First, we compute pairs for the inbound phase between rounds
r1 and r2 in Q and round r1 in P. Note that in this path, the SuperBoxes
are not fully active. In this case, the memory complexity of the attack can
be reduced significantly. Both techniques of [MPRS09] and [SLW+10] can be
applied with negligible memory requirements (for more details, see Section 3.7.4
and Section 5.1.2.4). In any case, the complexity to find a conforming input pair
according to the truncated differential path until state Q2 in permutation Q,
and until state P1 in permutation P is 1 on average. We compute 2^{64} such pairs
and propagate them outwards. With a probability of 2^{-56} we get one active
byte in P2 and with a probability of 2^{-8} also the 1-byte differences in the last
round prior to MixBytes are equal. Hence, we get a collision for 3 rounds of
the hash function with a total complexity of 2^{64} in time and negligible memory
requirements.
5.3.2.2 Path 2
Again we can use a second truncated differential path which has the same time
complexity, but higher memory complexities. We still mention this path here
since it could be interesting in future analysis of the Grøstl-256 hash function.
The path is constructed in a similar way as the second path of the compression
function attacks on Grøstl-256 and given in Figure 5.13. Note that the pattern
of active bytes in Q2 can be determined from the pattern in P2 by the relation
\[ Q_2 \leftarrow \mathrm{ShiftBytes}_Q^{-1} \circ \mathrm{ShiftBytes}_P \circ P_2 \]
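The relation between the two patterns of active bytes can be sketched on an 8 × 8 boolean state. The rotation constants below (row i of P rotated left by i positions, rows of Q rotated by 1, 3, 5, 7, 0, 2, 4, 6) are taken from the Grøstl-256 specification as an assumption, since they are not listed in this section, and the function names and the example pattern are hypothetical.

```python
# Assumed ShiftBytes rotation constants of Grøstl-256 (one value per row).
SIGMA_P = (0, 1, 2, 3, 4, 5, 6, 7)
SIGMA_Q = (1, 3, 5, 7, 0, 2, 4, 6)

def shift_bytes(state, sigma):
    """Rotate row i of an 8x8 state to the left by sigma[i] positions."""
    return [[row[(j + sigma[i]) % 8] for j in range(8)]
            for i, row in enumerate(state)]

def inv_shift_bytes(state, sigma):
    """Rotate row i of an 8x8 state to the right by sigma[i] positions."""
    return [[row[(j - sigma[i]) % 8] for j in range(8)]
            for i, row in enumerate(state)]

def q_pattern_from_p(p_pattern):
    """Apply ShiftBytes_P and then the inverse of ShiftBytes_Q."""
    return inv_shift_bytes(shift_bytes(p_pattern, SIGMA_P), SIGMA_Q)

# Hypothetical pattern of active bytes in P2: the main diagonal is active.
p2 = [[i == j for j in range(8)] for i in range(8)]
q2 = q_pattern_from_p(p2)

for row in q2:
    print("".join("x" if active else "." for active in row))
```

By construction, applying ShiftBytes of Q to the derived pattern gives the same result as applying ShiftBytes of P to the original one, so both permutations see identical patterns after ShiftBytes.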
[Figure: truncated differential paths through Q0 to Q3 (input M1) and P0 to P3
(input IV, output H1); the annotated complexities are average 1, 2^{64} and 1.]
Again, we first verify if the truncated differential path is valid and compute
the expected number of right pairs. The path is probabilistic in the MixBytes
transformations of round r1 and r2 in Q, in the MixBytes transformations of
round r1 in P , and in the XOR at the output. Hence, the expected number of
pairs is given as follows:
\[ 2^{8\cdot(64+64)} \cdot 1 \cdot 2^{-8\cdot 8} \cdot 2^{-8\cdot 56} \cdot 2^{-8\cdot 49} \cdot 2^{-8\cdot 8} = 2^{56} \]
optimal truncated differential paths which balance the complexity and available
freedom of an attack slightly better. Due to the higher security level, Grøstl-
512 has 14 instead of 10 rounds, although the best currently known attacks are
on the same number of rounds as for Grøstl-256.
Figure 5.14: Impossible truncated differential path for an attack on the Grøstl-512 compression function. The number of
active bytes in the MixBytes transformation of round r3 is below 9 for most columns. For the highlighted column (or SuperBox),
the number of active bytes at the input and output of MixBytes is only 5.
[Figure: truncated differential path through the states P0, P1, P2, P3^{SH}, P3,
P4^{SH}, P4, P5 with the round transformations AC, SB, SH and MB.]
5.4. Application to Grøstl-512
Figure 5.15: To get a possible truncated differential path for the Grøstl-512 compression function, we need at least 2 active
bytes in either state P0 or P5 . Note that the number of active bytes at the MixBytes layer in round r3 is at least 9 for every
column. One column including the 64-bit SuperBox match is highlighted.
[Figure: truncated differential paths through Q0, Q1^{SH}, Q2, Q3, Q4 (input M1)
and P0, P1^{SH}, P2, P3, P4 (input IV, output H1), marked as impossible.]
Figure 5.16: Impossible truncated differential path for an attack on the Grøstl-512 hash function. For a possible truncated
differential path, the pattern of active bytes has to be the same in state P0 and Q0 . To get a valid truncated differential path,
93
we need at least 3 active bytes in state P3 and Q3 such that both P0 and Q0 are fully active (also see Figure 5.18).
94 Chapter 5. Applying the Rebound Attack to Grøstl
[Figure: truncated differential path over 6 rounds of permutation Q (message input Mi, states Q0 to Q6); each round applies AC, SB, SH and MB.]
Again, we use the rebound attack to find pairs for each truncated differential
path in P and Q. We compute pairs for each permutation independently and
match the input and output differences using a birthday attack. By computing
the inbound phase with SuperBox matches, we can find pairs for the three middle
rounds r2, r3 and r4 with an average complexity of 1 and memory requirements
of $2^{64}$. For each permutation, we independently propagate the resulting pairs
outwards and get one active byte at the input (P0, Q0) and one active byte after
round r5 (P5, Q5) with a complexity of $2^{3\cdot 56} = 2^{168}$. To get a semi-free-start
collision, the 1-byte differences at the input, and the 2-byte differences prior
to the last MixBytes transformation need to be equal. This 24-bit condition
can be fulfilled with a complexity of $2^{12}$ using the birthday effect. In total,
the complexity to get a semi-free-start collision for 6 rounds of Grøstl-512 is
$2^{168} \cdot 2^{12} = 2^{180}$ in time with memory requirements of $2^{64}$.
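The complexity figures of this attack follow from simple exponent arithmetic. Here is a small sketch of that calculation; the helper `log2_birthday` is a hypothetical name and assumes the usual rule of thumb that a k-bit match between two lists costs about $2^{k/2}$.

```python
def log2_birthday(condition_bits):
    """log2 of the work needed to match a k-bit condition via the birthday effect."""
    return condition_bits / 2


log2_outbound = 3 * 56                   # propagating pairs outwards: 2^(3*56)
log2_match = log2_birthday(8 * (1 + 2))  # 1-byte and 2-byte differences: 24 bits
total = log2_outbound + log2_match

print(log2_outbound, log2_match, total)  # 168 12.0 180.0
```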
$Q_2 \leftarrow \mathrm{ShiftBytes}_Q^{-1} \circ \mathrm{ShiftBytes}_P \circ P_2$
We also verify if this truncated differential path is valid and compute the
expected number of right pairs. The path is probabilistic in the MixBytes trans-
formations of round r1 and r2 of both P and Q, and in the XOR at the output.
Hence, the expected number of right pairs is given as follows:
$$2^{8\cdot(128+128)} \cdot 1 \cdot 2^{-8\cdot 104} \cdot 2^{-8\cdot 104} \cdot 2^{-8\cdot 21} \cdot 2^{-8\cdot 21} \cdot 2^{-8\cdot 3} = 2^{24}$$
Again we use the rebound attack to find right pairs according to this trun-
cated differential path. First, we compute pairs for the inbound phase between
rounds r1 and r2 in Q and round r1 in P . The complexity to find a solution for
the truncated differential path until state Q2 in permutation Q, and until state
P1 in permutation P is 1 on average with memory requirements of $2^{64}$ for a
standard SuperBox match. Using non-full active SuperBox matches or by solving
linearly for pairs, we can significantly reduce the memory requirements to $2^{16}$.
We compute $2^{192}$ such pairs and propagate them outwards. With a probability
of $2^{-168}$ we get 3 active bytes in state P2 and with a probability of $2^{-24}$ the
3-byte differences in the last round prior to MixBytes are equal. Hence, we get
a collision for 3 rounds of the hash function with a total complexity of $2^{192}$ in
time with negligible memory requirements.
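The total of $2^{192}$ is the reciprocal of the product of the two outbound probabilities. A minimal sketch of this step, where the function name `log2_expected_trials` is an illustrative assumption and the two events are treated as independent:

```python
def log2_expected_trials(*log2_probabilities):
    """log2 of the expected number of trials until all independent events occur."""
    return -sum(log2_probabilities)


log2_p_three_active = -3 * 56   # 3 active bytes in state P2: 2^-168
log2_p_equal_diff = -3 * 8      # equal 3-byte differences: 2^-24
log2_pairs_needed = log2_expected_trials(log2_p_three_active, log2_p_equal_diff)

print(log2_pairs_needed)  # 192
```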
Figure 5.18: The truncated differential path to get a collision attack on 3 out
of 14 rounds for the hash function of Grøstl-512. The inbound phase (red) can
be solved with average complexity 1, the outbound phase (blue) with a total
complexity of about $2^{3\cdot 56} \cdot 2^{3\cdot 8} = 2^{192}$. The first SuperBox in the inbound phase
is shown by red rectangles.
5.5 Summary
In this chapter, we have analyzed the SHA-3 finalist Grøstl in detail. We have
applied various rebound attacks to different versions of Grøstl. For the final
round version (with tweak) we get hash function collisions for 3 rounds and
compression function collisions for 6 rounds for both Grøstl-256 (10 rounds)
and Grøstl-512 (14 rounds). Using the rebound attack, distinguishers for 8
rounds of the permutation and output transformation can be constructed. Also
the initial submission Grøstl-0 has been analyzed in detail. All these results
show that Grøstl still has a high security margin.
Grøstl consists of two permutations which strictly follow the wide-trail de-
sign strategy. Hence, no sparse (truncated) differential paths exist for Grøstl.
Furthermore, in block cipher-based designs the freedom in round keys can be
used to control the internal state. This is not possible in a permutation-based
design such as Grøstl. Both effects limit the degrees of freedom which can
be used in an attack. Due to this limited freedom, no rebound attack with
multiple inbound and outbound phases is possible. Note that such attacks are
possible for the block cipher-based hash function Whirlpool (see [LMR+ 09]), or
for permutation-based hash functions where sparse truncated differential paths
exist, such as the SHA-3 candidates ECHO (see Chapter 6) and LANE (see
Chapter 7).
6 Multiple Inbound and Multiple Outbound Phases in ECHO
In this chapter, we analyze the hash function ECHO [BBG+ 08] which is one of
the 14 second round candidates of the NIST SHA-3 competition. ECHO is a
wide-pipe, AES-based design which transforms 128-bit words similarly to how AES
transforms bytes. Inside these 128-bit words, two AES rounds are used. The
compression function of ECHO consists of one large 2048-bit permutation with
feed-forward and a simple compressing finalization function. Prior to the work
described in this chapter, most cryptanalytic results of ECHO were limited to
the internal permutation [GP10, MPRS09] and to reduced variants of the wide-
pipe compression function [Pey10]. The compression function results have been
published by the designers of ECHO and cover up to 4 out of 8 rounds of ECHO-256
and 6 out of 10 rounds of ECHO-512.
In the following, we extend the analysis to the hash function of ECHO and
present collisions for up to 5 out of 8 rounds in the case of ECHO-256. Fur-
thermore, we provide improved attacks on the compression function for up to
7 out of 8 rounds of ECHO-256 and 7 out of 10 with chosen salt. The main
improvement is to construct a new type of sparse truncated differential paths
where at most one fourth of each ECHO state is active. In all previous paths,
at least one state was fully active. The construction of sparse paths is possi-
ble by combining the last MixColumns transformation of the second AES round
with the BigMixColumns transformation of an ECHO round and analyzing the re-
sulting SuperMixColumns transformation. The attack itself is a rebound attack
[MRST09] with multiple inbound phases and multiple outbound phases. Similar
attacks have been applied to the SHA-3 candidate LANE [MNPN+ 09] and the
hash function Whirlpool [LMR+ 09].
98 Chapter 6. Multiple Inbound and Multiple Outbound Phases in ECHO
Since the truncated differential paths are very sparse, we have enough free-
dom to merge the solutions of multiple inbound phases. Using multiple in-
bound phases, we can control more distant parts of much longer truncated dif-
ferential paths than in a standard rebound attack including SuperBox analy-
sis [GP10, LMR+ 09, MRST10] or the techniques proposed in [MPRS09]. Al-
though ECHO has a rather good diffusion, the 4 big ECHO columns, the 64
SuperBoxes or the 16 SuperMixColumns transformations within one round are
always independent. We can exploit this property and apply even general-
ized birthday techniques [Wag02] to efficiently merge independent solutions of
multiple inbound phases. The results of this chapter have been published in
[Sch10b, Sch10c, Sch10a].
$$H_0 = IV$$
$$H_i = f(H_{i-1}, M_i, c_i, s) \quad \text{for } 1 \le i \le t$$
$$h = \mathrm{trunc}_n(H_t).$$
The message block size is 1536 bits for ECHO-256 and 1024 bits for ECHO-512, and
the message is padded by adding a single 1 followed by zeros to fill up the block
size. Note that the last 18 bytes of the last message block always contain the
2-byte hash output size, followed by the 16-byte message length.
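The iteration and padding described above can be sketched for ECHO-256 as follows. This is only an illustration under several assumptions that are not taken from the text: the message is byte-aligned (so the single 1 bit becomes the byte 0x80), all integers are encoded little-endian, the salt is 16 bytes, the counter is simply the block index, and `toy_f` is a stand-in built from SHA-512, not the ECHO compression function.

```python
import hashlib

BLOCK_BYTES = 1536 // 8   # message block size of ECHO-256
CHAIN_BYTES = 512 // 8    # chaining variable size of ECHO-256


def pad(msg: bytes, hash_bits: int = 256) -> bytes:
    """Append a 1 bit, zeros, the 2-byte hash size and the 16-byte bit length."""
    out = msg + b"\x80"
    out += b"\x00" * ((BLOCK_BYTES - 18 - len(out)) % BLOCK_BYTES)
    # Byte order is an assumption of this sketch.
    return out + hash_bits.to_bytes(2, "little") + (8 * len(msg)).to_bytes(16, "little")


def toy_f(h_prev: bytes, block: bytes, counter: int, salt: bytes) -> bytes:
    """Hypothetical placeholder for f; it only has the right input/output sizes."""
    data = h_prev + block + counter.to_bytes(16, "little") + salt
    return hashlib.sha512(data).digest()


def iterate(msg: bytes, iv: bytes = bytes(CHAIN_BYTES),
            salt: bytes = bytes(16), n_bits: int = 256) -> bytes:
    """H_0 = IV, H_i = f(H_{i-1}, M_i, c_i, s), h = trunc_n(H_t)."""
    padded = pad(msg, n_bits)
    h = iv
    for offset in range(0, len(padded), BLOCK_BYTES):
        counter = offset // BLOCK_BYTES + 1   # hypothetical counter value
        h = toy_f(h, padded[offset:offset + BLOCK_BYTES], counter, salt)
    return h[: n_bits // 8]                   # which n bits are kept is assumed
```

For example, `pad(b"abc")` yields exactly one 192-byte block whose last 18 bytes hold the hash size and the message length.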
The compression function of ECHO uses one internal 2048-bit permutation
P which manipulates 128-bit words similarly to how AES manipulates bytes. The
permutation consists of 8 rounds in the case of ECHO-256 and has 10 rounds for
ECHO-512. The internal state of the permutation P can be modeled as a 4 × 4
matrix of 128-bit words. We denote one ECHO state by Si . Each 128-bit word (or
AES state) is indexed by [r, c], with rows r ∈ {0, ..., 3} and columns c ∈ {0, ..., 3}
of the ECHO state.
The 2048-bit input of the permutation (which is also tweaked by the counter
ci and the salt s) consists of the previous chaining variable Hi−1 and the current
message block Mi, concatenated to each other. After the last round of the permutation,
a feed-forward (FF) is applied to get the preliminary output V :
To get the 512-bit chaining variable Hi for ECHO-256, all columns of the ECHO
output state V are XORed. In the case of ECHO-512, the 1024-bit chaining
variable Hi is the XOR of the two left and the two right columns of V . The
6.2. Truncated Differential Analysis of ECHO 99
The linear diffusion layer BigMixColumns (BMC) mixes the AES states of
each ECHO column by the same MDS matrix MMC but applied to those
bytes with equal position inside the AES states.
Figure 6.1: The sparse truncated differential path for 4 rounds of ECHO. By 1,
D, C, F we denote the pattern and number of active bytes in each AES state
(also see [GP10]). A 1 denotes an AES state with only one active byte, a D an
active diagonal (4 active bytes), a C an active column (4 active bytes) and an F
denotes a full active state (16 active bytes). Note that a maximum of 64 bytes
are active in each single ECHO state.
Also, the same sequence of active bytes holds for 4 rounds of each AES state. In
previous analysis of ECHO, truncated differential paths have been used with 16
active bytes in those AES states where the ECHO state has also 16 active words.
In these attacks always one full active state with 256 active S-boxes was used.
In the following, we show how to construct sparse truncated differential paths
with a maximum of 64 active bytes in each single ECHO state.
The main idea is to place AES states with only one active S-box into those
ECHO rounds with 16 active words. This way, the number of total active bytes (or
S-boxes) can be greatly reduced. The resulting 4-round truncated differential
path of ECHO is given in Figure 6.1 and consists of only 245 active S-boxes.
Since one round of ECHO consists of two AES rounds, it follows that the full
active AES states result in those rounds of ECHO with 4 active words. The
ECHO state with only one active AES state contains only one active byte. Note
that in the attacks on ECHO, we use this truncated differential path with small
modifications to improve the overall complexity of the attacks.
Figure 6.2: The two super-round transformations of ECHO: SuperBox (top, red)
and SuperMixColumns (bottom, green) with adjacent byte shuffling operations
(ShiftRows and BigShiftRows).
6.2.3 SuperBox
The properties of the AES SuperBox [DR06a] have already been discussed in
Section 3.3.2. Since one round of ECHO consists of two consecutive AES rounds
we use SuperBoxes in our analysis as well. Using SuperBoxes, we can represent
the two AES rounds of ECHO using a single non-linear layer and two adjacent
byte shuffling layers. The second MixColumns transformation is moved to the
SuperMixColumns transformation. Then, the only non-linear part of one ECHO
round consists of 64 parallel and independent 32-bit SuperBox transformations
(see Figure 6.2).
This separation of AES rounds into parallel 32-bit SuperBoxes allows us to
efficiently find right pairs for a given (truncated) differential path. If we do not
care about memory, we can simply pre-compute and store the whole differential
distribution table (DDT) of the AES SuperBox with a time and memory
complexity of $2^{64}$ as described in Section 3.3.2. Remember that the DDT stores
which input/output differentials of the SuperBox are possible. Furthermore, all
input values for a given valid differential are stored in the table. Note that
in ECHO, each SuperBox is keyed in the middle by the counter value. Hence,
we need different DDTs for all SuperBoxes with different keys. To reduce the
memory requirements and the maximum time to find values for given SuperBox
differentials, also other time-memory trade-offs as given in Section 3.7 can be
used.
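To make the role of the DDT concrete, the following sketch builds such a table for a 4-bit S-box instead of the 32-bit SuperBox, so that it runs instantly. The S-box (the one used in the PRESENT cipher) is only a small stand-in chosen for this example, and the dictionary layout is an assumption of the sketch, not the data structure used in the attacks.

```python
from collections import defaultdict

# 4-bit stand-in for the 32-bit SuperBox (S-box of PRESENT, illustration only).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def build_ddt(sbox):
    """Map each possible (input diff, output diff) to the input values that follow it."""
    ddt = defaultdict(list)
    for din in range(1, len(sbox)):
        for x in range(len(sbox)):
            dout = sbox[x] ^ sbox[x ^ din]
            ddt[(din, dout)].append(x)
    return ddt


ddt = build_ddt(SBOX)
nonzero = (len(SBOX) - 1) ** 2
print(f"{len(ddt)} of {nonzero} non-zero differentials are possible")

# Looking up a differential directly returns the values of all right pairs.
din, dout = next(iter(ddt))
print(din, dout, ddt[(din, dout)])
```

Since a keyed SuperBox behaves like a different S-box for each key, one such table would be needed per key, which matches the remark on the counter value above.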
6.2.4 SuperMixColumns
The SuperMixColumns transformation combines four MixColumns transfor-
mations of the second AES round with 4 MixColumns transformations of
BigMixColumns in the same 1 × 16 column slice of the ECHO state (see Fig-
ure 6.2). We denote by a column slice the 16 bytes of the same 1-byte wide
column of the 16 × 16 ECHO state. Note that the BigMixColumns transformation
consists of 16×4 parallel MixColumns transformations. Each of these MixColumns
transformations mixes those four bytes of an ECHO column, which have the same
position in the four AES states. Using the alternative description of ECHO (see
Figure 6.2), it is easy to see that four MixColumns operations of the second
AES round work on the same column slice as four MixColumns operations of
BigMixColumns. We combine these eight MixColumns transformations to get a
SuperMixColumns transformation on a 1-byte wide column slice of ECHO.
We have determined the 16 × 16 matrix MSMC of the SuperMixColumns
transformation which is applied to the ECHO state instead of MixColumns and
BigMixColumns. This matrix can be computed by the Kronecker product of two
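AES MixColumns matrices $M_{MC}$:
$$M_{SMC} = M_{MC} \otimes M_{MC} =
\begin{pmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 3 \\ 3 & 1 & 1 & 2 \end{pmatrix}
\otimes
\begin{pmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 3 \\ 3 & 1 & 1 & 2 \end{pmatrix} =$$
$$\left(\begin{array}{cccccccccccccccc}
4 & 6 & 2 & 2 & 6 & 5 & 3 & 3 & 2 & 3 & 1 & 1 & 2 & 3 & 1 & 1 \\
2 & 4 & 6 & 2 & 3 & 6 & 5 & 3 & 1 & 2 & 3 & 1 & 1 & 2 & 3 & 1 \\
2 & 2 & 4 & 6 & 3 & 3 & 6 & 5 & 1 & 1 & 2 & 3 & 1 & 1 & 2 & 3 \\
6 & 2 & 2 & 4 & 5 & 3 & 3 & 6 & 3 & 1 & 1 & 2 & 3 & 1 & 1 & 2 \\
2 & 3 & 1 & 1 & 4 & 6 & 2 & 2 & 6 & 5 & 3 & 3 & 2 & 3 & 1 & 1 \\
1 & 2 & 3 & 1 & 2 & 4 & 6 & 2 & 3 & 6 & 5 & 3 & 1 & 2 & 3 & 1 \\
1 & 1 & 2 & 3 & 2 & 2 & 4 & 6 & 3 & 3 & 6 & 5 & 1 & 1 & 2 & 3 \\
3 & 1 & 1 & 2 & 6 & 2 & 2 & 4 & 5 & 3 & 3 & 6 & 3 & 1 & 1 & 2 \\
2 & 3 & 1 & 1 & 2 & 3 & 1 & 1 & 4 & 6 & 2 & 2 & 6 & 5 & 3 & 3 \\
1 & 2 & 3 & 1 & 1 & 2 & 3 & 1 & 2 & 4 & 6 & 2 & 3 & 6 & 5 & 3 \\
1 & 1 & 2 & 3 & 1 & 1 & 2 & 3 & 2 & 2 & 4 & 6 & 3 & 3 & 6 & 5 \\
3 & 1 & 1 & 2 & 3 & 1 & 1 & 2 & 6 & 2 & 2 & 4 & 5 & 3 & 3 & 6 \\
6 & 5 & 3 & 3 & 2 & 3 & 1 & 1 & 2 & 3 & 1 & 1 & 4 & 6 & 2 & 2 \\
3 & 6 & 5 & 3 & 1 & 2 & 3 & 1 & 1 & 2 & 3 & 1 & 2 & 4 & 6 & 2 \\
3 & 3 & 6 & 5 & 1 & 1 & 2 & 3 & 1 & 1 & 2 & 3 & 2 & 2 & 4 & 6 \\
5 & 3 & 3 & 6 & 3 & 1 & 1 & 2 & 3 & 1 & 1 & 2 & 6 & 2 & 2 & 4
\end{array}\right)$$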
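The Kronecker product above can be recomputed in a few lines. The sketch below assumes multiplication in GF(2^8) modulo the AES polynomial $x^8 + x^4 + x^3 + x + 1$; the helper names `gf_mul` and `kron` are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


M_MC = [[2, 3, 1, 1],
        [1, 2, 3, 1],
        [1, 1, 2, 3],
        [3, 1, 1, 2]]


def kron(a, b):
    """Kronecker product of two square matrices with entries in GF(2^8)."""
    n, m = len(a), len(b)
    return [[gf_mul(a[i // m][j // m], b[i % m][j % m]) for j in range(n * m)]
            for i in range(n * m)]


M_SMC = kron(M_MC, M_MC)
for row in M_SMC:
    print(" ".join(str(v) for v in row))
```

The printed rows can be compared directly with the 16 × 16 matrix given above; for instance the entry 5 arises as 3 · 3 in GF(2^8).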
Note that the optimal branch number of a 16 × 16 matrix is 17, which could
For these transformations, we can find right pairs for any valid truncated differ-
ential path with an average complexity of 1. Note that not all differentials of a
truncated differential path are possible differentials (see Section 3.2 and 3.3.2).
Therefore, we usually need to try many starting differentials such that a right
pair can be constructed. For each SuperBox, a differential is possible with a
probability of about $2^{-4}$. Hence, for each active SuperBox we need to try about
$2^{4}$ differentials to find the first right pair. However, for each possible differential,
the expected number of right pairs is $2^{4}$. We can generalize this for the whole
state and for x active SuperBoxes, we need to construct $2^{4x}$ starting differentials
and then get $2^{4x}$ right pairs.
Again, we can use many different techniques to find these right pairs (see
Table 3.4 of Section 3.8). In all subsequent attacks on ECHO, the memory complexity
of the final phase is at least $2^{64}$. Therefore, using the DDT of the SuperBox does
not increase the total memory requirements. The advantage of using the DDT
is that we only need the minimum number of starting points to find the first
right pair. On the other hand, for a practical implementation of the attacks
a slightly higher time complexity but lower memory requirements could be more
appropriate [JF11]. Nevertheless, in the following attacks we assume that one
right pair can be computed with average complexity one and the complexity to
find the first right pair is $2^{4x}$ for an inbound phase with x active SuperBoxes.
1 http://magma.maths.usyd.edu.au/magma/
−8 · (3 + 4 · 12 + 16 · 3 + 4 · 3 + 4 · 12 + 3 · 4) = −8 · 171
and we get about 712 degrees of freedom for this 4-round truncated differential
path.
Note that in this path we keep the number of active bytes low as described in
Section 6.2.1. Except for the beginning and end, at most one fourth of the ECHO
state is active and therefore, we have enough freedom to find many solutions.
We do not allow differences in the chaining input (blue) and in the padding
(cyan). The last 16 bytes (one AES state) of the padding contain the message
length, and the two bytes above contain the 2-byte value with the hash size.
Note that the AES states containing the chaining values (blue) and padding
(cyan) do not get mixed with other AES states until the first BigMixColumns
transformation. Since the lower half of the state (row 2 and 3) is truncated, we
force all differences to be in the lower half of the message as well.
[Figure: truncated differential path for the attack on the ECHO-256 hash function, from the inputs H and M through the states S1 to S37 (round transformations ShiftRows, SubBytes, MixColumns) to BigFinal and Trunc.]
6.3. Attacks on the ECHO-256 Hash Function 107
merge the solutions of the two inbound phases by determining the remaining
(white) values using a generalized birthday attack on 4 independent columns of
the state. Note that in some cases, the probability to find one solution is only
close to one. However, for the sake of simplicity we assume it is one, since we
have enough freedom in the attack to repeat all phases with different starting
points to get one solution on average.
we need at least $2^{16}$ starting differentials for each column to find the first right
pair.
The difference in S14 is already fixed due to the yellow inbound phase but we
can still choose from $2^{32}$ differences for each active AES state in S7. As shown
in Section 6.2.5, we can find one pair on average for each starting difference in
the inbound phase. We independently iterate through all $2^{32}$ starting differences
for the 1st, 2nd and 3rd column, and through all $2^{64}$ starting differences for the
4th column of state S7. We get $2^{32}$ right pairs for each of the first three columns
and $2^{64}$ pairs for the 4th column. The total complexity to find all these pairs is
$2^{64}$ in time and memory.
For each resulting right pair, the values and differences of the red and black
bytes between state S7 and S14 can be computed. Furthermore, the truncated
differential path in backward direction, except for two cyan bytes in the first
states, is fulfilled. In the next phase, we partially merge the right pairs of the
yellow and red inbound phase, but first we determine the conditions for this
merge.
2 2 4 3 3 6 1 1 2 1 1 2 c3
On the right side, we have the constant values c0 , c1 , c2 , c3 which are determined
by A0 , A1 , A2 , A3 and B0 , B1 , B2 , B3 and we get for example:
The matrix of this linear system has rank 3 (instead of 4) and therefore, we
only get a solution with a probability of $2^{-8}$ for given $A_i$, $B_i$. We can solve this
system of equations by transforming the system into echelon form. We get:
$$\left(\begin{array}{cccccccccccc}
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right) \cdot X =
\begin{pmatrix} c'_0 \\ c'_1 \\ c'_2 \\ c'_3 \end{pmatrix} \tag{6.3}$$
where the values $c'_0, c'_1, c'_2, c'_3$ are linear combinations of $c_0, c_1, c_2, c_3$. From the
last equation, we get the 8-bit condition $c'_3 = 0$. In [JF11], this 8-bit condition
has been derived and is given as follows:
$$2 \cdot A_0 + 3 \cdot A_1 + A_2 + A_3 = 14 \cdot B_0 + 11 \cdot B_1 + 13 \cdot B_2 + 9 \cdot B_3. \tag{6.4}$$
Similar 8-bit conditions exist for all 16 column slices. In total, each right pair
of the red and yellow inbound phases results in a 128-bit condition on the whole
SuperMixColumns transformation between state S14 and S16.
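The rank deficiency can be confirmed numerically. The sketch below rests on an assumption inferred from the surviving row 2 2 4 3 3 6 1 1 2 1 1 2 of the system: the coefficient matrix consists of rows 0 to 3 of $M_{SMC}$ restricted to the twelve columns whose index is not a multiple of 4. All helper names are illustrative, and arithmetic is in GF(2^8) modulo the AES polynomial.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Brute-force inverse, sufficient for a field with 256 elements."""
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def gf_rank(rows):
    """Rank of a matrix over GF(2^8) by Gauss-Jordan elimination."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = gf_inv(m[rank][col])
        m[rank] = [gf_mul(inv, v) for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [v ^ gf_mul(factor, p) for v, p in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


M_MC = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]

# Rows 0..3 of M_SMC = M_MC (x) M_MC, keeping columns 4*j + l with l in {1, 2, 3}.
system = [[gf_mul(M_MC[0][j], M_MC[k][l]) for j in range(4) for l in range(1, 4)]
          for k in range(4)]

print(system[3])        # [2, 2, 4, 3, 3, 6, 1, 1, 2, 1, 1, 2]
print(gf_rank(system))  # 3
```

Under this assumption the four 4 × 3 blocks are scalar multiples of the same three columns of $M_{MC}$, which explains why the rank cannot exceed 3.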
$$2 \cdot A_0 + 3 \cdot A_1 = A_2 + A_3 + 14 \cdot B_0 + 11 \cdot B_1 + 13 \cdot B_2 + 9 \cdot B_3. \tag{6.5}$$
Then, we apply the left-hand side to the elements of $L_1$ and the right-hand side
to elements of $L_2$ and sort $L_1$ according to the bytes to be matched. Finally, we
just iterate through all elements of $L_2$ and collect the $2^{32}$ pairs which satisfy the
128-bit condition. These $2^{32}$ pairs are then partial right pairs for the combined
red and yellow inbound phase. The complexity of this part can probably be
further reduced using the techniques proposed in [JF11].
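The list merge described here can be illustrated on toy data. Note that the sketch indexes the first list with a hash table instead of sorting it, which serves the same purpose of fast lookup; the functions `left_value` and `right_value` are made-up 4-bit stand-ins for the left-hand and right-hand sides of the condition.

```python
from collections import defaultdict


def merge_on_condition(list1, list2, lhs, rhs):
    """Return all pairs (a, b) with lhs(a) == rhs(b) without testing every pair."""
    index = defaultdict(list)
    for a in list1:                 # index list1 by the value to be matched
        index[lhs(a)].append(a)
    return [(a, b) for b in list2 for a in index.get(rhs(b), [])]


def left_value(a):
    return (3 * a) & 0xF            # hypothetical stand-in for the left-hand side


def right_value(b):
    return (5 * b + 1) & 0xF        # hypothetical stand-in for the right-hand side


pairs = merge_on_condition(range(64), range(64), left_value, right_value)
print(len(pairs))
```

The work is proportional to the sizes of the two lists plus the number of matches, rather than to the product of the list sizes.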
Figure 6.4: States used to merge the two inbound phases with the chaining
values. The merge inbound phase consists of three parts. Brown bytes show
values already determined (1st part) and gray values are chosen at random (2nd
part). Green, blue, yellow and red bytes show independent values used in the
generalized birthday attack (3rd part) and cyan bytes represent values with the
target conditions.
SuperMixColumns. For the first column slice, the system is given as follows:
The free variables in this system are $x_6, \ldots, x_{11}$ (green). The values $A_0$, $A_1$,
$A_2$, $A_3$, $B_0$, $B_1$, $B_2$, $B_3$ (brown) have been determined by the first or second
inbound phase, and the values $L_0$, $L_1$, $L_2$ (lightgray) and $L'_0$, $L'_1$, $L'_2$ (gray)
are determined by the choice of arbitrary values in state S7. Also this resulting
linear system of equations has rank 3 and we can proceed as in the 1st part of
the merge inbound phase and we get:
$$\begin{pmatrix}
3 & 1 & 1 & 3 & 1 & 1 \\
2 & 3 & 1 & 2 & 3 & 1 \\
1 & 2 & 3 & 1 & 2 & 3 \\
1 & 1 & 2 & 1 & 1 & 2
\end{pmatrix} \cdot
\begin{pmatrix} x_6 \\ x_7 \\ x_8 \\ x_9 \\ x_{10} \\ x_{11} \end{pmatrix} =
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix} \tag{6.6}$$
The resulting linear 8-bit equation to get a solution for this system can be
separated into terms depending on values of $L_i$ and on $L'_i$, and we get
For all other 16 column slices and fixed positions of gray bytes, we get matrices
of rank 3 as well. In total, we get 16 8-bit conditions and the probability to
find a solution for a given choice of gray and lightgray values in state S14 and
S16 is $2^{-128}$. However, we can find a solution to these linear equations using
the birthday effect and a meet-in-the-middle attack (see Section 2.2.2) with a
complexity of $2^{64}$ in time and memory.
We start by choosing $2^{64}$ values for each of the big first (gray) and second
(lightgray) column in state S7. We compute these values independently forward
to state S14 and store them in two lists $L$ and $L'$. We also separate all equations
of the 128-bit condition into parts depending only on values of $L$ and $L'$. We
apply the resulting functions $f_1$, $f_2$, $f_3$ to the elements of lists $L_i$ and $L'_i$, and
merge two lists $L \bowtie_{128} L'$ using the birthday effect (see Section 2.2.3).
$$\left( I_8 \mid 0_{8 \times 8} \right) \cdot M_{SMC} \cdot
(a, 0, 0, 0,\; b, 0, 0, 0,\; c, 0, 0, 0,\; d, 0, 0, 0)^{T} =
\begin{pmatrix}
4 & 6 & 2 & 2 \\
2 & 3 & 1 & 1 \\
2 & 3 & 1 & 1 \\
6 & 5 & 3 & 3 \\
2 & 4 & 6 & 2 \\
1 & 2 & 3 & 1 \\
1 & 2 & 3 & 1 \\
3 & 6 & 5 & 3
\end{pmatrix} \cdot
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}$$
Here $(I_8 \mid 0_{8 \times 8})$ is the 8 × 16 truncation matrix and $M_{SMC}$ is the 16 × 16 SuperMixColumns matrix given in Section 6.2.4.
Analyzing the resulting matrix Mcomb for all 4 active column slices shows that
in each case, the rank of Mcomb is 2 instead of 4. This reduces the dimension
of the vector space in each active column slice from 32 to 16. Since we have 4
active columns, the total dimension of the vector space at the output of the hash
function is 64.
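One way to see why the rank drops to 2 is the following short derivation. It is a sketch under the assumption that rows and columns of $M_{SMC} = M_{MC} \otimes M_{MC}$ are indexed as $4i+k$ and $4j+l$, so that the corresponding entry is $m_{i,j} \cdot m_{k,l}$ with $m_{i,j}$ the entries of $M_{MC}$. In the first active column slice, the active input bytes sit at $l = 0$ and the truncation keeps the rows with $i \in \{0, 1\}$.

```latex
\begin{align*}
(M_{comb})_{4i+k,\,j} &= m_{i,j} \cdot m_{k,0},
  \qquad i \in \{0,1\},\; k, j \in \{0,\dots,3\} \\
M_{comb} &=
  \begin{pmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \end{pmatrix}
  \otimes
  \begin{pmatrix} 2 \\ 1 \\ 1 \\ 3 \end{pmatrix} \\
\operatorname{rank}(M_{comb}) &= 2 \cdot 1 = 2
\end{align*}
```

The rank of a Kronecker product is the product of the ranks of its factors, and the column vector contributes rank 1. For the other active column slices only the selected column of $M_{MC}$ changes, so the same argument applies.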
We use [LMR+ 09, Equation 19] to compute the complexity of a generic
distinguishing attack on the ECHO-256 hash function. We get the parameters N = 256
(hash function output size), n = 64 (dimension of vector space) and $t = 2^{32}$
(number of outputs in vector space) for the subspace distinguisher. Then, the
generic complexity to construct $2^{32}$ elements in a vector space of dimension 64
is about $2^{111.8}$ compression function evaluations. Remember that in our attack
on ECHO we also get $2^{32}$ pairs in a vector space of the same dimension. Hence,
the total complexity for the subspace distinguisher on 5 rounds of the ECHO-256
hash function is about $2^{96}$ compression function evaluations with memory
requirements of $2^{64}$.
as we have in the hash function case in forward direction. Then, the full active
ECHO state is located in the middle round and we can construct attacks for up
to 7 rounds for the compression functions of ECHO-256 (see Figure 6.5).
which corresponds to 800 degrees of freedom. Note that this is much more than
for the paths given in [MPRS09] and [Pey10].
Figure 6.5: The truncated differential path to get collisions for 6 rounds and
near-collisions for 7 rounds of the ECHO-256 compression function. Black bytes
are active, red bytes are values computed in the 1st inbound phase, yellow bytes
in the 2nd, blue bytes in the 3rd and green bytes in the 4th inbound or 2nd
outbound phase, and cyan bytes in the 3rd outbound phase. Purple bytes are
determined in the 1st outbound phase and gray bytes are chosen in the merge
inbound phase.
the diagonal bytes of the salt such that the values match. We sort the resulting
list according to the 4-byte salt value and repeat the same for all 4 BigColumns
of state S23 . Then, we just need to iterate through all 4 lists and search for
matching salt values. Note that for some salt values, we will get no solution, but
for some we will get more than one solution. On average, we expect to get $2^{32}$
matching pairs with a complexity of $2^{32}$ with chosen diagonal bytes of the salt.
function with a complexity of $2^{160}$ in time and $2^{128}$ memory and with chosen
salt.
Note that we can use almost the same attack to construct $2^{32}$ near-collisions
with a zero difference in the same 320 bits. Again we need to satisfy the 96-bit
condition in the cyan bytes in the last round. However, this time we require that
the overlapping 4-byte differences in the feed-forward cancel each other. This
32-bit condition ensures that we get only 4 × 6 = 24 active bytes at the output
of the compression function for $2^{32}$ pairs with a total complexity of $2^{160}$ in time
and $2^{128}$ memory and with chosen salt.
6.5 Summary
In this chapter, we have presented a detailed analysis of the ECHO hash function.
We provide collision attacks for up to 5 out of 8 rounds of the ECHO-256 hash
function. Furthermore, we have improved the analysis of the ECHO compression
functions to get attacks for up to 7 (out of 8) rounds of ECHO-256. We expect
that similar compression function results can also be obtained for ECHO-512.
In our improved attacks we combine the MixColumns transformation of the
second AES round with the subsequent BigMixColumns transformation to a com-
bined SuperMixColumns transformation. This allows us to construct very sparse
truncated differential paths. In these paths, at most one fourth of the bytes
are active throughout the whole computation of ECHO. Note that truncated dif-
ferential paths with non-full active states have also been used in the full com-
pression function attacks on Whirlpool and Lane. However, the rather good
diffusion in the single permutation in ECHO does not provide completely inde-
pendent parts covering more than one round. Nevertheless, we are able to ap-
ply a rebound attack with multiple inbound phases to ECHO. Using generalized
birthday techniques applied to the 4 independent columns or the 16 independent
SuperMixColumns transformations we are able to efficiently merge these inbound
phases.
Note that in the given attacks on ECHO, the available freedom in the compression
and hash function attacks is not yet fully used. As a rough estimation,
we need the freedom of about 1/4 of the 2048-bit ECHO state for each inbound
phase. The hash function attack on 5 rounds consists of 2 inbound phases and
1/4 of the state is determined by the chaining input. The compression function
attack on 7 rounds consists of 3 big and one small inbound phase which together,
need about 3/4 of the freedom. Hence, in both attacks about 3/4 of the degrees
of freedom are used. However, it is unknown how the remaining freedom could
be used in attacks on more rounds.
Future work includes the search for even sparser truncated differential paths
and the improvement of the given attacks by using the available freedom. Also
the separate search for differences and values as proposed in [MPRS09] and
[KNPRS10] may be used to improve the complexity of additional inbound phases.
Finally, an improvement of the low complexity, full round distinguisher published
in [SLW+ 10] using a rebound attack with multiple inbound phases may lead to a
distinguisher on the full 8 round compression function of ECHO-256.
7 Semi-Free-Start Collisions for the Full Compression Function of Lane
In this chapter, we apply the rebound attack to the SHA-3 candidate Lane.
Lane [Ind08] is a single-pipe, iterative hash function based on the Merkle-
Damgård design principle [Dam89, Mer89]. The permutation-based compression
function consists of 6 parallel lanes and a linear message expansion. The permu-
tations of each lane are based on the round transformations of the AES. Lane
has been first analyzed in [WFW09] using the rebound attack. In that work,
semi-free-start collisions for 3 rounds of Lane-256 and 4 rounds of Lane-512
are proposed. Also, a hash function attack for 3 rounds of Lane-512 is given.
In this chapter, we use sparser truncated differential paths and are able to
apply a rebound attack with multiple inbound phases. The results are semi-
free-start collisions for the full 6 rounds of Lane-256, and for the full 8 rounds
of Lane-512. Beside multiple inbound phases, the main idea of this improved
rebound attack on Lane is to search for solutions of each lane independently.
Furthermore, we use a truncated differential path such that a collision at the end,
and a valid expanded message at the input can be found mostly independently
as well. This allows us to use the birthday effect at multiple levels and find
collisions for the full compression function with a relatively low complexity. The
results of this chapter have been published in [MNPN+ 09].
124 Chapter 7. Collisions for the Full Compression Function of LANE
that supports four digest sizes (224, 256, 384 and 512 bits) and the use of a salt.
Since Lane-224 and Lane-256 are rather similar except for truncation, we write
Lane-256 whenever we refer to both of them. The same holds for Lane-384 and
Lane-512.
The hashing of a message proceeds as follows. First, the initial chaining value
H−1 , of size 256 bits for Lane-256, and 512 bits for Lane-512, is set to an initial
value that depends on the digest size n and the optional salt value S. At the same
time, the message is padded and split into message blocks Mi of length 512 bits
for Lane-256, and 1024 bits for Lane-512. Then, a compression function f is
applied iteratively to process message blocks one by one as Hi = f (Hi−1 , Mi , Ci ),
where Ci is a counter that indicates the number of message bits processed so far.
Finally, after all message blocks are processed, the final digest is derived from
the last chaining value, the message length and the salt by an additional call to
the compression function.
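The iteration Hi = f(Hi−1, Mi, Ci) described above can be sketched in a few lines. This is only an illustration and not the Lane reference implementation: the compression function is passed in as a placeholder, the names iterate_compression and block_bits are hypothetical, and padding, the derivation of H−1 from n and S, and the final output call are not modeled. It is also an assumption that the counter Ci already includes the bits of block Mi.

```python
from typing import Callable, Sequence

# Placeholder type for a compression function f(H_prev, M_i, C_i) -> H_i.
Compress = Callable[[int, int, int], int]


def iterate_compression(f: Compress, iv: int, blocks: Sequence[int],
                        block_bits: int) -> int:
    """Apply f block by block and return the last chaining value."""
    h = iv
    processed_bits = 0
    for block in blocks:
        # Assumption: the counter counts the bits processed so far,
        # including the current block.
        processed_bits += block_bits
        h = f(h, block, processed_bits)
    return h
```

Any callable with three integer arguments can be used as f to trace the sequence of chaining values and counters.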
Figure 7.1: Overview of the compression function of Lane.
Figure 7.2: Pseudocode for the round transformations used in the Lane permutations.
pansion of Lane ensures that in a differential attack at least 4 lanes are active.
In Lane-256, the 512-bit message block Mi is split into four 128-bit blocks m0 ,
m1 , m2 , m3 and the 256-bit chaining value Hi−1 is split into two 128-bit words
h0 , h1 as follows m0 ||m1 ||m2 ||m3 ← Mi , h0 ||h1 ← Hi−1 . Then, six more 128-bit
words a0 , a1 , b0 , b1 , c0 , c1 are computed
a0 = h0 ⊕ m0 ⊕ m1 ⊕ m2 ⊕ m3 , a1 = h1 ⊕ m0 ⊕ m2 ,
b0 = h0 ⊕ h1 ⊕ m0 ⊕ m2 ⊕ m3 , b1 = h0 ⊕ m1 ⊕ m2 , (7.1)
c0 = h0 ⊕ h1 ⊕ m0 ⊕ m1 ⊕ m2 , c1 = h0 ⊕ m0 ⊕ m3 .
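Equation (7.1) translates directly into code. The sketch below models each 128-bit word as a Python integer; the function names expand_lane256 and expanded_difference are illustrative and not taken from the Lane specification. Since the expansion is linear over XOR, the difference of two expanded messages equals the expansion of the input differences, which is the property a differential attack works with.

```python
from typing import Tuple

Pair = Tuple[int, int]


def expand_lane256(h: Pair, m: Tuple[int, int, int, int]) -> Tuple[Pair, Pair, Pair]:
    """Compute (a0, a1), (b0, b1), (c0, c1) as in Equation (7.1)."""
    h0, h1 = h
    m0, m1, m2, m3 = m
    a0 = h0 ^ m0 ^ m1 ^ m2 ^ m3
    a1 = h1 ^ m0 ^ m2
    b0 = h0 ^ h1 ^ m0 ^ m2 ^ m3
    b1 = h0 ^ m1 ^ m2
    c0 = h0 ^ h1 ^ m0 ^ m1 ^ m2
    c1 = h0 ^ m0 ^ m3
    return (a0, a1), (b0, b1), (c0, c1)


def expanded_difference(h: Pair, m, m_other) -> Tuple[Pair, Pair, Pair]:
    """XOR difference of the expanded words for two messages, same chaining value."""
    first = expand_lane256(h, m)
    second = expand_lane256(h, m_other)
    return tuple((x0 ^ y0, x1 ^ y1) for (x0, x1), (y0, y1) in zip(first, second))
```

With a fixed chaining value, expanded_difference(h, m, m') gives the same result as expand_lane256((0, 0), Δm) for Δm = m ⊕ m'.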
SC256 (x0 ||x1 || . . . ||x7 ) = x0 ||x1 ||x4 ||x5 ||x2 ||x3 ||x6 ||x7
SC512 (x0 ||x1 || . . . ||x15 ) = x0 ||x4 ||x8 ||x12 ||x1 ||x5 ||x9 ||x13 ||
x2 ||x6 ||x10 ||x14 ||x3 ||x7 ||x11 ||x15 .
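The two SwapColumns permutations can be written as simple index maps. In this sketch a state is a list of columns x0, x1, ...; the function names are illustrative. For SC512 the index rule 4·(j mod 4) + ⌊j/4⌋ is just a compact way of writing the order listed above.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def swap_columns_256(x: Sequence[T]) -> List[T]:
    """SC256: exchange columns x2, x3 with columns x4, x5."""
    if len(x) != 8:
        raise ValueError("SC256 expects 8 columns")
    return [x[0], x[1], x[4], x[5], x[2], x[3], x[6], x[7]]


def swap_columns_512(x: Sequence[T]) -> List[T]:
    """SC512: output column j is input column 4 * (j % 4) + j // 4."""
    if len(x) != 16:
        raise ValueError("SC512 expects 16 columns")
    return [x[4 * (j % 4) + j // 4] for j in range(16)]
```

Both maps are involutions, so applying either of them twice returns the original column order.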
Figure 7.3: Outline of the rebound attack on Lane. In the attack we first find
partial inputs such that the truncated differential path is satisfied (red) and fulfill
the first half of the message expansion (green). Then, we search for colliding
differences at the output of the P -lanes (cyan) and fulfill the second half of the
message expansion (yellow).
Figure 7.4: The inbound phase for Lane-256 (left) and Lane-512 (right). Black
bytes are active, gray bytes fixed by solutions of the inbound phase.
Furthermore, Equation (7.1) gives for the differences in the expanded message
words (a0 , a1 ) and (c0 , c1 ):
Besides the differences, we also need to match the values in the message
expansion. Since we aim for a semi-free-start collision, we can freely choose the
chaining value (h0 , h1 ) such that the conditions on (a0 , a1 ) are satisfied:
h0 = a0 ⊕ m0 ⊕ m1 ⊕ m2 ⊕ m3 , h1 = a1 ⊕ m0 ⊕ m2
That means we have conditions on the input (c0 , c1 ) left, which we need to match
with the message words m0 , m1 , m2 and m3 . Since we can vary lanes P0 ,P2 and
P4 ,P5 independently in the following attacks, we can satisfy these conditions by
merging the results of both sides. Using the equations of the message expansion,
we get for (c0 , c1 ) using the values of (a0 , a1 ):
c0 = a0 ⊕ a1 ⊕ m0 ⊕ m2 ⊕ m3 , c1 = a0 ⊕ m1 ⊕ m2
m0 ⊕ m2 ⊕ m3 = c0 ⊕ a0 ⊕ a1 , m1 ⊕ m2 = c1 ⊕ a0 (7.5)
To merge the two sides, we will compute, store and compare the following values
using independent lists:
v1 = c0 ⊕ a0 ⊕ a1 , v2 = c1 ⊕ a0 , v3 = m0 ⊕ m2 ⊕ m3 , v4 = m1 ⊕ m2
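The merge of the two sides can be illustrated as a hash join on (v1, v2) = (v3, v4). The following is a toy sketch with hypothetical names: the left list holds candidate values (a0, a1, c0, c1), the right list holds candidate message words (m0, m1, m2, m3), and words are plain integers. For every match, the chaining value is then fixed by the relations for h0 and h1 given above.

```python
from collections import defaultdict


def merge_sides(left, right):
    """Return all ((h0, h1), (m0, m1, m2, m3)) satisfying Equation (7.5)."""
    # Index the left side by (v1, v2).
    table = defaultdict(list)
    for a0, a1, c0, c1 in left:
        v1 = c0 ^ a0 ^ a1
        v2 = c1 ^ a0
        table[(v1, v2)].append((a0, a1))

    matches = []
    for m0, m1, m2, m3 in right:
        # Look up the right side by (v3, v4).
        v3 = m0 ^ m2 ^ m3
        v4 = m1 ^ m2
        for a0, a1 in table.get((v3, v4), []):
            # The chaining value is free and absorbs the conditions on (a0, a1).
            h0 = a0 ^ m0 ^ m1 ^ m2 ^ m3
            h1 = a1 ^ m0 ^ m2
            matches.append(((h0, h1), (m0, m1, m2, m3)))
    return matches
```

Building the table costs one pass over the left list and each lookup is a constant-time operation on average, which is why the cost of such a merge is dominated by the list sizes and the number of matches.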
are quite high. We assume that the time and memory complexities can be
reduced by a more careful merging process of the different lists and by using
better time-memory trade-offs as given in Section 3.7. Furthermore, collision
attacks on the hash function might be possible for at least half the number of
rounds.
Figure 7.5: The truncated differential path for 6 rounds of Lane-256. Black bytes are active, red (gray) bytes correspond
to the first inbound phase, gray (dark gray) bytes to the second inbound phase and blue (light gray) bytes are used to find
input (state #4) and output (state #5) of the SubBytes layer in between. We
get at least 2^{96} solutions for the inbound phase with a complexity of 2^{96} (see
Section 7.2.2). For each result, only the red and black bytes in Figure 7.5 are
determined, i.e. the differences as well as the actual values of the bytes are found.
Note that we have chosen the position of active bytes in state #0, such that at
least one term of Equation (7.2) or (7.4) is zero for each byte. At this point, we
can compute backwards to state #0 and independently verify the condition on
one byte of the input differences:
The condition on each of these bytes is fulfilled with a probability of 2^{-8} and we
store the 2^{88} valid results of each lane P0, P2, P4 and P5 in the corresponding
lists L0, L2, L4 and L5. Note that we store the values and differences of state
#10 (red and black bytes) in these lists, since we need to merge these bytes with
the second inbound phase in the following. For an efficient merging step, the
lists are stored in hash tables (or sorted) according to the bytes to be merged
(differences and values of active bytes in state #10).
gray bytes) are determined. We compute and store the 2^{56} input values and
differences of state #0 in lists L0, L2, L4 and L5. Although we still do not know
half of the state, each of these input pairs conforms to the whole truncated
differential path from state #0 to state #24 with a probability of 1. In other
words, we know that in state #24, there are at most the given bytes active.
These conditions are fulfilled with a probability of 2^{-32} and by merging two lists
(L0 and L2) of size 2^{56}, we get 2^{56} × 2^{56} × 2^{-32} = 2^{80} valid matches which we
store in list L02. We repeat the same for lane P4 and P5 by merging lists L4 and
L5. We get 2^{80} matches for list L45 as well, since we need to fulfill the 32-bit
conditions on the differences of the following 4 bytes:
Again, if we use hash tables or the previous lists are sorted according to the
bytes to match, the merge operation can be performed very efficiently. Hence,
the total complexity to produce the lists L02 and L45 is determined by their final
size and requires an effort of around 2^{80} computations.
Note that we also need to fulfill the conditions on the values of the states.
Remember that we can freely choose the chaining values (h0 , h1 ) to satisfy the
values in the first 16 bytes of the message expansion (a0 , a1 ). To fulfill the
conditions on the 16 bytes of (c0 , c1 ) we need to satisfy Equation (7.5) using the
corresponding values v1 , v2 , v3 and v4 . Hence, we need to find a match for the
following values and differences by merging lists L02 and L45 :
Since we have 2^{80} elements in each list and conditions on 160 bits, we expect to
find 2^{80} × 2^{80} × 2^{-160} = 1 result. This result satisfies the message expansion for
all lanes and is a solution for the truncated differential path of each active lane
between state #0 and state #24. However, we do not get a collision at the end
of the P -lanes yet, since we do not know the differences of state #24.
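The counting argument used in these merging steps is always the same: two lists of sizes 2^x and 2^y with a c-bit condition yield about 2^{x+y-c} matches. The sketch below states this in log2 form and adds a small brute-force check on toy lists. The function names are illustrative, and the toy check uses a 4-bit condition on 8-bit values rather than the sizes of the attack.

```python
from collections import Counter


def log2_expected_matches(log2_size_a: int, log2_size_b: int, condition_bits: int) -> int:
    """log2 of the expected number of matches when merging two lists."""
    return log2_size_a + log2_size_b - condition_bits


def count_matches(list_a, list_b, key) -> int:
    """Count pairs (x, y) with key(x) == key(y) using a hash table on list_a."""
    buckets = Counter(key(x) for x in list_a)
    return sum(buckets[key(y)] for y in list_b)


# Values quoted in the text for Lane-256.
lanes = log2_expected_matches(56, 56, 32)     # lists L0 and L2 into L02
sides = log2_expected_matches(80, 80, 160)    # lists L02 and L45
```

For uniformly distributed keys the toy count agrees with the formula: two lists of 2^8 values with a 4-bit condition give 2^{12} matching pairs.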
The black, red and gray bytes represent values which have already been
determined by the previous parts of the attack.
The blue bytes represent values not yet determined and can be used to
vary the differences in state #22.
To find a collision between two lanes, we can still choose 2^{64} values for the blue
bytes in state #7 of each lane and store these results in lists L0, L2, L4 and
L5. Note that for these 2^{64} values, we get only 2^{32} different values for the two
free bytes in the first and fifth column of state #18. Hence, we can only iterate
through 2^{32} differences in state #22 for each lane. However, this is enough to
find one colliding difference for each side, since 2^{32} × 2^{32} × 2^{-64} = 1. By repeating
this step 2^{32} times for each side, we expect 2^{64} × 2^{64} × 2^{-64} = 2^{64} results for
each merged list L02 and L45.
7.3. Compression Function Attacks on Lane 135
7.3.1.8 Complexity
Let us find the complexity of the whole attack. The first inbound phase requires
2^{96} computations and 2^{88} memory, the second inbound requires 2^{96} computations
and 2^{96} memory, and the merging of the inbound phases requires 2^{88} hash table
lookups and 2^{56} memory. Obviously, the second inbound phase and the merge
inbound phases can be united to lower the memory requirement of these three
steps. Namely, we create the lists L0, L2, L4 and L5 in the first inbound phase.
Then, for each differential path of the second inbound phase, instead of storing
it in a list, we immediately check if it can be merged with some differential from
the lists. Only if it can be merged, we do the outbound phase and compute state
#0. Hence, the first three steps of our attack require around 2^{96} computations
and 2^{88} memory. The merge lanes step requires 2^{80} computations and memory.
The message expansion steps require 2^{80} computations, while the find collisions
steps require 2^{32} computations. Hence, the total attack complexity is around
2^{96} computations and 2^{88} memory. Note that the cost of each computation is
never greater than the cost of one compression function evaluation. Therefore,
the complexity to find a semi-free-start collision for all 6 rounds of Lane-256 is
about 2^{96} compression function evaluations and 2^{88} memory.
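The bookkeeping of this paragraph can be reproduced with a few lines of arithmetic on the exponents. The sketch below uses only the figures quoted above; the variable names are illustrative. It reports the dominant term, which is what "around 2^{96}" refers to, and also the exact log2 of the sum to show that the lower-order terms add only about one bit.

```python
import math

# log2 of the time of each step, as quoted in the text.
TIME_LOG2 = {
    "first inbound": 96,
    "second inbound": 96,
    "merge inbound phases": 88,
    "merge lanes": 80,
    "message expansion": 80,
    "find collisions": 32,
}

# log2 of the memory after uniting the second inbound and merge inbound phases.
MEMORY_LOG2 = {
    "first three steps": 88,
    "merge lanes": 80,
}


def dominant(log2_costs) -> int:
    """Exponent of the largest term."""
    return max(log2_costs.values())


def log2_sum(log2_costs) -> float:
    """Exact log2 of the sum of all terms."""
    return math.log2(sum(2 ** e for e in log2_costs.values()))


time_dominant = dominant(TIME_LOG2)
memory_dominant = dominant(MEMORY_LOG2)
time_exact = log2_sum(TIME_LOG2)
```

Because two steps cost 2^{96} each, the exact sum is slightly above 2^{97}; the text rounds this to the dominant term.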
1. First Inbound Phase: Apply the inbound phase at the beginning of the
truncated differential path (state #2 to state #7) for each lane P0 , P2 , P4 ,
P5 independently.
2. Merge Lanes: Merge the two neighboring lanes P0 ,P2 and P4 ,P5 and
satisfy the corresponding differences of the message expansion.
3. Message Expansion: Merge the two sides (P0 , P2 ) and (P4 , P5 ) and
satisfy the remaining conditions on the message expansion (differences and
values).
4. Second Inbound Phase: Apply the inbound phase in the middle of each
lane again (state #10 to state #15).
5. Merge Inbound Phases: Merge the results of the two inbound phases.
6. Starting Points: Choose random values for the brown bytes in state #7
to get enough starting points for the subsequent phases.
7. Merge Lanes: Merge the values of the starting points for the two neigh-
boring lanes P0 ,P2 and P4 ,P5 and satisfy the corresponding differences of the
message expansion.
8. Message Expansion: Merge the two sides (P0 , P2 ) and (P4 , P5 ) and
satisfy the remaining conditions on the message expansion (differences and
values) for the starting points.
9. Third Inbound Phase: Apply the inbound phase at the end of each lane
for a third time (state #18 to state #23).
10. Merge Inbound Phases: Merge the results of the three inbound phases
and use the remaining freedom in between.
11. Find Collisions: Merge the corresponding two lanes to find a collision
for each side (P0 , P2 ) and (P4 , P5 ) independently.
12. Message Expansion: Merge the two sides (P0 , P2 ) and (P4 , P5 ) and
satisfy the conditions on the message expansion of the remaining bytes.
Figure 7.6: The truncated differential path for 8 rounds of Lane-512. Lane P0
shows the plain truncated differential path, lane P2 other possible truncated
differential paths and lane P4 and P5 are used to describe the attack.
The conditions on each of the lanes are fulfilled with a probability of 2^{-16} and we
store the 2^{68} valid matches of each lane P0, P2, P4 and P5 in the corresponding
lists L0, L2, L4 and L5.
Since this match is fulfilled with a probability of 2^{-48} and we merge two lists of
size 2^{68}, we get 2^{68} × 2^{68} × 2^{-48} = 2^{88} valid matches which we store in L02. We
repeat the same for lanes P4 and P5 and merge lists L4 and L5. We get 2^{88} matches
for list L45, since we need to fulfill conditions on differences of 6 bytes as well:
Remember that we can freely choose the chaining values (h0 , h1 ) to satisfy the
values in the first 16 bytes of the message expansion (a0 , a1 ). To fulfill the
conditions on the 16 bytes of (c0 , c1 ) we need to find matches for the following
values and differences using lists L02 and L45 :
8 bytes of v1 from L02 with v3 from L45 ,
8 bytes of v2 from L02 with v4 from L45 ,
6 bytes of differences in L02 and in L45 .
Since we have 2^{88} elements in each list and conditions on 176 bits, we expect to
find 2^{88} × 2^{88} × 2^{-176} = 1 result. This result satisfies the message expansion for
all lanes and is a solution for the truncated differential path of each active lane
between state #0 and state #10.
both lists, we expect 2^{128} × 2^{128} × 2^{-128} = 2^{128} matching pairs which we store
in list Ls . We will use these values in a later phase of the attack.
7.3.2.13 Complexity
The total complexity of the rebound attack on Lane-512 is determined by the
merging step after the third inbound phase. This step has a complexity of
2^{96} compression function evaluations and is repeated 2^{128} times. The memory
requirements are determined by the largest lists, which are L′02 and L′45 (or Ls)
with a size of 2^{128}. Hence, the total complexity to find a semi-free-start collision
for Lane-512 is about 2^{128} · 2^{96} = 2^{224} compression function evaluations and
2^{128} in memory.
7.4 Summary
In this chapter, we have applied the rebound attack to the hash function Lane.
In the attack we use a truncated differential path with differences concentrating
mostly in one half (or quarter) of the lanes. Due to the relatively slow diffusion
of the parallel AES rounds, we are able to solve parts of the lanes independently.
First, we search for differences and values (for parts of the state) according to
the truncated differential path and also satisfy the message expansion. Then,
we choose values which can be changed such that the truncated differential path
and the corresponding message expansion still hold. The freedom in these values is
then used to search for a collision at the end of the lanes without violating the
differential path or message expansion.
In the case of Lane-256 we can merge two inbound phases since at most
one half of the state is active. In Lane-512, only one quarter of the states is
active and we are able to merge 3 inbound phases. Note that in theory there is
enough freedom to merge 4 inbound phases in Lane-512. However, this seems
to be difficult since the diffusion gets better the more rounds are covered in an
attack. Another option to improve the attacks is to consider differential paths
with two subsequent full active AES states and use SuperBox techniques (see
Section 3.7) to find conforming pairs. However, this results in more dense paths
and it is more difficult to merge multiple inbound phases.
144 Chapter 8. Conclusions
146 Appendix A. Analysis of Grøstl-0
However, the expected number of solutions for this path is far below 1. Again,
this can be verified by multiplying the total number of input pairs and the proba-
bility of the path. The 8-round path is probabilistic in the MixBytes transforma-
tions of rounds r1, r4, r5, and in the XOR at the output. Hence, the probability
that a right pair for this truncated differential path exists is very small and given
as follows:
2^{8·(64+8)} · 2^{8·64} · 2^{-8·7} · 2^{-8·7} · 2^{-8·56} · 2^{-8·56} · 2^{-8·7} · 2^{-8·7} · 2^{-8·8} = 2^{-96}
Note that also removing the last round gives an invalid truncated differential
path.
We only get a valid truncated differential path if we reduce from 8 → 1
active byte only once in each permutation. This results in an attack on 7 rounds
of the Grøstl-0-256 compression function. We use a truncated differential path
with two full active states in the middle, as in the attack on Grøstl-256. We can
extend the path by one round in backward direction and match 8-byte differences
at the input. In forward direction we can only reduce the path to 8 active bytes
(but for two rounds) and get a full active state at the output. The detailed path
is given in Figure A.2 and the sequence of active bytes in each round ri of each
permutation is given as follows:
8 --r1--> 1 --r2--> 8 --r3--> 64 --r4--> 64 --r5--> 8 --r6--> 8 --r7--> 64
A.1. Using the Same Truncated Differential Path 147
Figure A.2: The truncated differential path for the semi-free-start collision on 7
rounds of the compression function of Grøstl-256.
For this 7-round colliding truncated differential path we first compute the
expected number of right pairs. The path is probabilistic in the MixBytes trans-
formation of round r1 and r4 in each of P and Q, as well as in the XOR operation
at the output of the compression function. Hence, the expected number of semi-
free-start collisions we can get for the truncated differential path of Figure A.2
is:
2^{8·(64+8)} · 2^{8·64} · 2^{-8·7} · 2^{-8·7} · 2^{-8·56} · 2^{-8·56} · 2^{-8·8} = 2^{16}
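The expected number of right pairs is a product of powers of two, so it can be checked by summing exponents. The sketch below does this for the 7-round path and for the 8-round path discussed before it, using only the factors that appear in the two products of this appendix; the function and constant names are illustrative.

```python
def log2_expected_pairs(log2_inputs, log2_transitions) -> int:
    """Sum the exponents of the input count and of all probabilistic steps."""
    return sum(log2_inputs) + sum(log2_transitions)


# Number of input pairs: 2^{8*(64+8)} * 2^{8*64}.
INPUTS = [8 * (64 + 8), 8 * 64]

# 7-round path: two 8 -> 1 transitions, two 64 -> 8 transitions, output match.
SEVEN_ROUND = [-8 * 7, -8 * 7, -8 * 56, -8 * 56, -8 * 8]

# 8-round path: two additional 8 -> 1 transitions.
EIGHT_ROUND = [-8 * 7] * 4 + [-8 * 56] * 2 + [-8 * 8]

seven = log2_expected_pairs(INPUTS, SEVEN_ROUND)
eight = log2_expected_pairs(INPUTS, EIGHT_ROUND)
```

The two extra 8 → 1 transitions of the 8-round path cost 2 · 56 = 112 bits, which is exactly the gap between 2^{16} and 2^{-96}.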
Also for Grøstl-0 we first find pairs for each permutation independently and
use the birthday effect to get colliding differences at the input and output of the
compression function. The inbound phase of the attack is the same as for Grøstl-
256 (see Section 5.2.3) and we can get one pair with an average complexity of
one and memory requirements of 2^{64}. The solutions of the inbound phase are
propagated outwards in the outbound phase. Note that the propagation in the
two rounds r1 and r6 is for free. We need to fulfill one 8 → 1 MixBytes transition
in round r2 with probability 2^{-56}, and a birthday match on 2 · 64 = 128 bits
at the input and output with complexity 2^{64}. Hence, the total complexity to
get semi-free-start collisions for 7 rounds of Grøstl-0-256 is 2^{64} · 2^{56} = 2^{120}
compression function evaluations with memory requirements of 2^{64} due to the
SuperBox match in the inbound phase and the birthday match in the outbound
phase.
The path can be constructed by carefully placing the positions of active bytes
in round r6 such that the two active bytes are shifted into the same column in
round r7 . However, the expected number of right pairs for such a path is only
−8·7
8·(128+8)
|2 {z } · 2
8·128
| {z } · |2 · 2−8·7} · 2| −8·14·8{z
{z · 2−8·14·8} · |2−8·48{z
· 2−8·48} · |2−8·16
{z } = 2
−16
Figure A.4: Truncated differential path for the collision attack on 4 rounds of
the Grøstl-0-256 hash function.
Note that for Grøstl-0 we can use two full active states in each of P and Q
since the first ShiftBytes in P and Q cancel out when going around the input.
Hence, the columns of almost two rounds can be solved independently in the in-
bound phase (see Figure A.5). The technique is similar to the SuperBox match,
since we just do independent 64-bit matches again. These two consecutive Su-
perBoxes (in both P and in Q) are completely independent between state Q_2^{SB}
and P_2^{SB}. In other words, this time we have a longer non-linear 64-bit SuperBox
with the following sequence of transformations (starting from Q):
Also for this construction, we can find one right pair for the inbound phase with
an average complexity of one and memory requirements of 2^{64}.
Figure A.5: The inbound phase of the attack on the hash function Grøstl-0-256
with one 64-bit match (two SuperBoxes) being highlighted.
In the outbound phase, each of the pairs constructed in the inbound phase
is propagated to the output of each permutation with a probability of one. To
get a zero output difference for the hash function, the 8-byte differences prior
to the last MixBytes need to be the same which happens with a probability of
2^{-64}. Hence, the complexity of this collision attack on the Grøstl-0-256 hash
function is 2^{64} in both time and memory.
Note that using the previous techniques a collision attack on 5 rounds ac-
cording to the following truncated differential path for both P and Q is not
possible:
64 --r1--> 64 --r2--> 8 --r3--> 1 --r4--> 8 --r5--> 64
Each of the two 8 → 1 transitions of MixBytes in round r3 has a probability of
2^{-56}. Together with the probabilistic match on 64 bits at the end of the path,
the total complexity is 2^{56+56+64} = 2^{176} which exceeds the generic complexity
for a collision attack on Grøstl-0-256.
Figure A.6: Truncated differential path for the collision attack on 5 rounds of the
Grøstl-512 hash function. An additional first block is used to generate enough
freedom for the attack to succeed.
We can get the needed additional freedom for a 5 round collision attack by
prepending a first message block. The collision attack works as follows. First we
choose an arbitrary first message block. Then, we repeat the inbound phase for
all 2^{128} possible starting points to get 2^{128} solutions. Since the probability of the
outbound phase is 2^{-176} we need to repeat the inbound phase with 2^{48} different
first message blocks to find a collision for 5 rounds. The total complexity of
the attack is about 2^{64+56+56} = 2^{176} compression function evaluations and 2^{64}
memory.
A.2. Considering Differences between P and Q 151
Mi = Q0
Hi−1 = P0 ⊕ Q0 = ∆0 (A.1)
Hi = P10 ⊕ Q10 ⊕ Hi−1 = ∆0 ⊕ ∆10 .
For this truncated differential path between P and Q we also compute the ex-
pected number of right input pairs (Hi−1 , Mi ). Note that pairs are actually input
values to the compression function. In addition to the probabilistic MixBytes
propagation, we also need to match the exact values of the XOR constants in
rounds r3 , r7 and r8 . Hence, for the given path the expected number of right
inputs (Hi−1 , Mi ) is only 1:
Figure A.7: The 10 round truncated differential path which considers differences
between P and Q.
The attack to find this right input is similar to the case of a single permu-
tation (see Section 5.1). The complexity increases since two columns are active
[AB96] Ross J. Anderson and Eli Biham. TIGER: A Fast New Hash Func-
tion. In Dieter Gollmann, editor, FSE, volume 1039 of LNCS, pages
89–97. Springer, 1996.
[AGM+09] Kazumaro Aoki, Jian Guo, Krystian Matusiewicz, Yu Sasaki, and
Lei Wang. Preimages for Step-Reduced SHA-2. In Mitsuru Mat-
sui, editor, ASIACRYPT, volume 5912 of LNCS, pages 578–597.
Springer, 2009.
[AHMP11] Jean-Philippe Aumasson, Luca Henzen, Willi Meier, and Raphael
C.-W. Phan. SHA-3 proposal BLAKE. Submission to NIST
(Round 3), January 2011. Available online:
http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/submissions_rnd3.html.
[AMP10a] Elena Andreeva, Bart Mennink, and Bart Preneel. On the Indif-
ferentiability of the Grøstl Hash Function. In Juan A. Garay and
Roberto De Prisco, editors, SCN, volume 6280 of LNCS, pages
88–105. Springer, 2010.
[AMP10b] Elena Andreeva, Bart Mennink, and Bart Preneel. Security Reduc-
tions of the Second Round SHA-3 Candidates. In Mike Burmester,
Gene Tsudik, Spyros S. Magliveras, and Ivana Ilic, editors, ISC,
volume 6531 of LNCS, pages 39–53. Springer, 2010.
[And08] Elena Andreeva. On LANE Modes of Operation. Technical report,
COSIC, Katholieke Universiteit Leuven, 2008.
156 Bibliography
[BBG+08] Ryad Benadjila, Olivier Billet, Henri Gilbert, Gilles Macario-Rat,
Thomas Peyrin, Matthew J. B. Robshaw, and Yannick Seurin.
SHA-3 Proposal: ECHO. Submission to NIST (Round 1), December
2008. Available online:
http://csrc.nist.gov/groups/ST/hash/sha-3/Round1/submissions_rnd1.html.
[BDK01] Eli Biham, Orr Dunkelman, and Nathan Keller. The Rectangle
Attack - Rectangling the Serpent. In Birgit Pfitzmann, editor,
EUROCRYPT, volume 2045 of LNCS, pages 340–357. Springer,
2001.
[BDPV07] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van
Assche. Sponge functions. ECRYPT Hash Workshop, Barcelona,
Spain, May 24-25, 2007. Available online:
http://sponge.noekeon.org/SpongeFunctions.pdf.
[BDPV08] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Ass-
che. On the Indifferentiability of the Sponge Construction. In
Nigel P. Smart, editor, EUROCRYPT, volume 4965 of LNCS, pages
181–197. Springer, 2008.
[BDPV11a] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Ass-
che. Cryptographic Sponge Functions, January 2011. Available
online: http://sponge.noekeon.org/CSF-0.1.pdf.
[BDPV11b] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van
Assche. The Keccak reference. Submission to NIST (Round 3),
January 2011. Available online:
http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/submissions_rnd3.html.
[BDVP06] Guido Bertoni, Joan Daemen, Gilles Van Assche, and Michaël
Peeters. RadioGatún, a Belt-and-Mill Hash Function. NIST - Second
Cryptographic Hash Workshop, August 24-25, 2006. Available online:
http://csrc.nist.gov/groups/ST/hash/documents/VANASSCHE_RadioGatun_0720.pdf.
[BS92] Eli Biham and Adi Shamir. Differential Cryptanalysis of the Full
16-Round DES. In Ernest F. Brickell, editor, CRYPTO, volume
740 of LNCS, pages 487–496. Springer, 1992.
[Çal10] Çağdaş Çalik. Multi-stream and Constant-time SHA-3 Implementations.
NIST hash function mailing list, December 2010. Available
online: http://www.metu.edu.tr/~ccalik/software.html#sha3.
[CDMP05] Jean-Sébastien Coron, Yevgeniy Dodis, Cécile Malinaud, and
Prashant Puniya. Merkle-Damgård Revisited: How to Construct
a Hash Function. In Victor Shoup, editor, CRYPTO, volume 3621
of LNCS, pages 430–448. Springer, 2005.
[CHK+08] Donghoon Chang, Seokhie Hong, Changheon Kang, Jinkeon
Kang, Jongsung Kim, Changhoon Lee, Jesang Lee, Jongtae Lee,
Sangjin Lee, Yuseop Lee, Jongin Lim, and Jaechul Sung. Arirang.
Submission to NIST (Round 1), December 2008. Available online:
http://csrc.nist.gov/groups/ST/hash/sha-3/Round1/submissions_rnd1.html.
[CLS06] Scott Contini, Arjen K. Lenstra, and Ron Steinfeld. VSH, an Effi-
cient and Provable Collision-Resistant Hash Function. In Serge
Vaudenay, editor, EUROCRYPT, volume 4004 of LNCS, pages
165–182. Springer, 2006.
[Dae09] Joan Daemen. FSE, 2009. personal communication.
[dBB91] Bert den Boer and Antoon Bosselaers. An Attack on the Last Two
Rounds of MD4. In Joan Feigenbaum, editor, CRYPTO, volume
576 of LNCS, pages 194–203. Springer, 1991.
[dBB93] Bert den Boer and Antoon Bosselaers. Collisions for the Compression
Function of MD5. In Tor Helleseth, editor, EUROCRYPT,
volume 765 of LNCS, pages 293–304. Springer, 1993.
[DKT08] Ivan Damgård, Lars R. Knudsen, and Søren S. Thomsen. Dakota-
Hashing from a Combination of Modular Arithmetic and Symmet-
ric Cryptography. In Steven M. Bellovin, Rosario Gennaro, Ange-
los D. Keromytis, and Moti Yung, editors, ACNS, volume 5037 of
LNCS, pages 144–155, 2008.
[Dob96a] Hans Dobbertin. Cryptanalysis of MD4. In Dieter Gollmann, edi-
tor, FSE, volume 1039 of LNCS, pages 53–69. Springer, 1996.
[Dob96b] Hans Dobbertin. Cryptanalysis of MD5 Compress. Technical re-
port, German Information Security Agency, May 1996.
[Dob96c] Hans Dobbertin. The Status of MD5 After a Recent Attack. Cryp-
toBytes, 2(2):1–6, 1996.
[Dob98] Hans Dobbertin. Cryptanalysis of MD4. J. Cryptology, 11(4):253–
271, 1998.
[DR99a] Joan Daemen and Vincent Rijmen. AES Proposal: Rijndael. AES
Algorithm Submission, September 1999. Available online:
http://csrc.nist.gov/archive/aes/rijndael/Rijndael-ammended.pdf.
[DR01] Joan Daemen and Vincent Rijmen. The Wide Trail Design Strat-
egy. In Bahram Honary, editor, IMA Int. Conf., volume 2260 of
LNCS, pages 222–238. Springer, 2001.
[DR02] Joan Daemen and Vincent Rijmen. The Design of Rijndael: AES
- The Advanced Encryption Standard. Springer, 2002.
[DR05] Joan Daemen and Vincent Rijmen. Probability distributions of
Correlation and Differentials in Block Ciphers. Cryptology ePrint
Archive, Report 2005/212, 2005. http://eprint.iacr.org/.
[FLS+ 09] Niels Ferguson, Stefan Lucks, Bruce Schneier, Doug Whiting, Mihir
Bellare, Tadayoshi Kohno, Jon Callas, and Jesse Walker. The Skein
Hash Function Family. Submission to NIST (Round 2), September
2009. Available online: http://csrc.nist.gov/groups/ST/hash/sha-3/Round2/submissions_rnd2.html.
[FLS+ 11] Niels Ferguson, Stefan Lucks, Bruce Schneier, Doug Whiting, Mi-
hir Bellare, Tadayoshi Kohno, Jon Callas, and Jesse Walker. The
Skein Hash Function Family. Submission to NIST (Round 3),
January 2011. Available online: http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/submissions_rnd3.html.
[KK06] John Kelsey and Tadayoshi Kohno. Herding Hash Functions and
the Nostradamus Attack. In Serge Vaudenay, editor, EURO-
CRYPT, volume 4004 of LNCS, pages 183–200. Springer, 2006.
[KL06] John Kelsey and Stefan Lucks. Collisions and Near-Collisions for
Reduced-Round Tiger. In Matthew J. B. Robshaw, editor, FSE,
volume 4047 of LNCS, pages 111–125. Springer, 2006.
[MPL+ 11] Amir Moradi, Axel Poschmann, San Ling, Christof Paar, and
Huaxiong Wang. Pushing the Limits: A Very Compact and a
Threshold Implementation of AES. In EUROCRYPT, 2011. To
appear.
[MPR+ 06] Florian Mendel, Bart Preneel, Vincent Rijmen, Hirotaka Yoshida,
and Dai Watanabe. Update on Tiger. In Rana Barua and Tanja
Lange, editors, INDOCRYPT, volume 4329 of LNCS, pages 63–79.
Springer, 2006.
[MPRS09] Florian Mendel, Thomas Peyrin, Christian Rechberger, and Mar-
tin Schläffer. Improved Cryptanalysis of the Reduced Grøstl Com-
pression Function, ECHO Permutation and AES Block Cipher. In
Michael J. Jacobson Jr., Vincent Rijmen, and Reihaneh Safavi-
Naini, editors, Selected Areas in Cryptography, volume 5867 of
LNCS, pages 16–35. Springer, 2009.
[MR07] Florian Mendel and Vincent Rijmen. Cryptanalysis of the Tiger
Hash Function. In Kaoru Kurosawa, editor, ASIACRYPT, volume
4833 of LNCS, pages 536–550. Springer, 2007.
[NRS11] Svetla Nikova, Vincent Rijmen, and Martin Schläffer. Secure Hard-
ware Implementation of Nonlinear Functions in the Presence of
Glitches. J. Cryptology, 24(2):292–321, 2011.
[Pey07] Thomas Peyrin. Cryptanalysis of Grindahl. In Kaoru Kuro-
sawa, editor, ASIACRYPT, volume 4833 of LNCS, pages 551–567.
Springer, 2007.
[Pey10] Thomas Peyrin. Improved Differential Attacks for ECHO and
Grøstl. In Tal Rabin, editor, CRYPTO, volume 6223 of LNCS,
pages 370–392. Springer, 2010.
[PGV93] Bart Preneel, René Govaerts, and Joos Vandewalle. Hash Functions
Based on Block Ciphers: A Synthetic Approach. In Douglas R.
Stinson, editor, CRYPTO, volume 773 of LNCS, pages 368–378.
Springer, 1993.
[PMK+ 11] Axel Poschmann, Amir Moradi, Khoongming Khoo, Chu-Wee Lim,
Huaxiong Wang, and San Ling. Side-Channel Resistant Crypto for
less than 2,300 GE. J. Cryptology, 24(2):322–345, 2011.
[Pre93] Bart Preneel. Analysis and Design of Cryptographic Hash Func-
tions. PhD thesis, Katholieke Universiteit Leuven, Belgium, 1993.
[QD89a] Jean-Jacques Quisquater and Jean-Paul Delescaille. How Easy is
Collision Search? Application to DES (Extended Summary). In
Jean-Jacques Quisquater and Joos Vandewalle, editors, EURO-
CRYPT, volume 434 of LNCS, pages 429–434. Springer, 1989.
[QD89b] Jean-Jacques Quisquater and Jean-Paul Delescaille. How Easy is
Collision Search. New Results and Applications to DES. In Gilles
Brassard, editor, CRYPTO, volume 435 of LNCS, pages 408–413.
Springer, 1989.
[Rab78] Michael O. Rabin. Digitalized Signatures. Foundations of Secure
Computation, pages 155–168, 1978.
[Rec09] Christian Rechberger. Cryptanalysis of Hash Functions. PhD the-
sis, Graz University of Technology, Austria, 2009.
[Riv92a] Ronald L. Rivest. The MD4 Message-Digest Algorithm. IETF
Request for Comments (RFC) 1320, 1992. Available online at
http://www.ietf.org/rfc/rfc1320.html.
[SS08] Somitra Kumar Sanadhya and Palash Sarkar. New Collision At-
tacks against Up to 24-Step SHA-2. In Dipanwita Roy Chowdhury,
Vincent Rijmen, and Abhijit Das, editors, INDOCRYPT, volume
5365 of LNCS, pages 91–103. Springer, 2008.
[Sta08] Martijn Stam. Beyond Uniformity: Better Security/Efficiency
Tradeoffs for Compression Functions. In David Wagner, editor,
CRYPTO, volume 5157 of LNCS, pages 397–412. Springer, 2008.
[Til08] Stefan Tillich. Bitsliced Implementation of Grøstl-0-256, 2008.
Personal communication, implementation written by Stefan Tillich
and benchmarked in eBash.
[WFW09] Shuang Wu, Dengguo Feng, and Wenling Wu. Cryptanalysis of the
LANE Hash Function. In Michael J. Jacobson Jr., Vincent Rijmen,
and Reihaneh Safavi-Naini, editors, Selected Areas in Cryptogra-
phy, volume 5867 of LNCS, pages 126–140. Springer, 2009.
[Wil08] David A. Wilson. Constructing Second Preimages in the WaMM
Hash Algorithms. NIST hash function mailing list, November
2008. Available online: http://web.mit.edu/dwilson/www/hash/wamm.html.
[WLF+ 05] Xiaoyun Wang, Xuejia Lai, Dengguo Feng, Hui Chen, and Xiuyuan
Yu. Cryptanalysis of the Hash Functions MD4 and RIPEMD.
In Ronald Cramer, editor, EUROCRYPT, volume 3494 of LNCS,
pages 1–18. Springer, 2005.
[Wu08] Hongjun Wu. The Hash Function JH. Submission to NIST
(Round 1), December 2008. Available online: http://csrc.nist.gov/groups/ST/hash/sha-3/Round1/submissions_rnd1.html.
[Wu11] Hongjun Wu. The Hash Function JH. Submission to NIST (Round
3), January 2011. Available online: http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/submissions_rnd3.html.
[WY05] Xiaoyun Wang and Hongbo Yu. How to Break MD5 and Other
Hash Functions. In Ronald Cramer, editor, EUROCRYPT, volume
3494 of LNCS, pages 19–35. Springer, 2005.
[WYY05b] Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu. Finding Collisions
in the Full SHA-1. In Victor Shoup, editor, CRYPTO, volume 3621
of LNCS, pages 17–36. Springer, 2005.
[WYY05c] Xiaoyun Wang, Hongbo Yu, and Yiqun Lisa Yin. Efficient Collision
Search Attacks on SHA-0. In Victor Shoup, editor, CRYPTO,
volume 3621 of LNCS, pages 1–16. Springer, 2005.
Author Index
61–63, 75, 77, 80–82, 96–98, 101, 104, 105, 112, 114, 120
Ristenpart, Thomas 12
Rivest, Ronald L. 3, 10, 13
Robshaw, Matthew J. B. 14, 16, 62, 97–99
Röck, Andrea 6, 28, 45, 53, 121
Rogaway, Phillip 10, 11, 13, 14, 60
Roland, Günther A. 66, 71

Sakiyama, Kazuo 9, 51, 52, 74, 81, 82, 90, 91, 122
Sanadhya, Somitra Kumar 3
Sarkar, Palash 3
Sasaki, Yu 3, 6, 9, 30, 31, 51, 52, 74, 81, 82, 90, 91, 97, 105, 122, 123, 142
Sato, Hisayoshi 14, 28, 53
Scheibelhofer, Karl 71
Schläffer, Martin 3, 4, 6, 7, 9, 14, 16, 21, 28, 30, 31, 33, 38, 41, 45, 49, 51–53, 55, 60, 62, 66, 67, 73–75, 77, 79–82, 90, 94, 96–98, 104, 105, 112, 114, 116, 120, 121, 123, 142
Schneier, Bruce 4, 12, 13, 28, 60
Schwabe, Peter 62, 71
Seurin, Yannick 14, 16, 97–99
Shamir, Adi 20, 21, 26, 35
Shirai, Taizo 27
Shrimpton, Thomas 10, 12, 13, 60
Stam, Martijn 14
Steinberger, John P. 14
Steinfeld, Ron 2
Stern, Jacques 14, 55, 60, 61
Sung, Jaechul 38

16, 21, 28, 33, 41, 49, 55, 60, 73, 74, 80, 82, 94, 97, 98, 105
Tillich, Stefan 71
Tischhauser, Elmar 9, 74, 152, 153
Toz, Deniz 53

Van Assche, Gilles 4, 12, 14, 16, 62
van Oorschot, Paul C. 1, 11, 13, 18
Vandewalle, Joos 13
Vanstone, Scott A. 1, 11, 13
Varici, Kerem 53

Wagner, David 18, 19, 21, 29, 98, 112, 115, 120
Walker, Jesse 4, 13, 28
Wang, Huaxiong 7, 38
Wang, Lei 3, 9, 51, 52, 74, 81, 82, 90, 91, 122
Wang, Xiaoyun 3, 21, 27, 87
Watanabe, Dai 14, 28, 29, 53
Whiting, Doug 4, 13, 28
Wiener, Michael J. 18
Wilson, David A. 4
Winternitz, Robert S. 12
Wu, Hongjun 4, 14, 53
Wu, Shuang 123
Wu, Wenling 123

Yao, Andrew C. 3
Yao, Frances 3
Yin, Yiqun Lisa 3, 21, 27, 87
Yoshida, Hirotaka 29
Yu, Hongbo 3, 21, 27, 87
Yu, Xiuyuan 3, 21
Yuval, Gideon 17
List of Publications
International Journals
1. ChangKyun Kim, Martin Schläffer, and SangJae Moon. Differential Side
Channel Analysis Attacks on FPGA Implementations of ARIA. Electronics
and Telecommunications Research Institute (ETRI), 30(2):315–325, April
2008.
2. Svetla Nikova, Vincent Rijmen, and Martin Schläffer. Secure Hardware
Implementation of Nonlinear Functions in the Presence of Glitches. J.
Cryptology, 24(2):292–321, 2011.
15. Svetla Nikova, Vincent Rijmen, and Martin Schläffer. Secure Hardware
Implementation of Non-linear Functions in the Presence of Glitches. In
Pil Joong Lee and Jung Hee Cheon, editors, ICISC, volume 5461 of LNCS,
pages 218–234. Springer, 2008.
16. Svetla Nikova, Vincent Rijmen, and Martin Schläffer. Using Normal Bases
for Compact Hardware Implementations of the AES S-Box. In Rafail
Ostrovsky, Roberto De Prisco, and Ivan Visconti, editors, SCN, volume
5229 of LNCS, pages 236–245. Springer, 2008.
17. Martin Schläffer. Subspace Distinguisher for 5/8 Rounds of the ECHO-
256 Hash Function. In Alex Biryukov, Guang Gong, and Douglas R.
Stinson, editors, Selected Areas in Cryptography, volume 6544 of LNCS,
pages 369–387. Springer, 2010.
18. Martin Schläffer and Elisabeth Oswald. Searching for Differential Paths
in MD4. In Matthew J. B. Robshaw, editor, FSE, volume 4047 of LNCS,
pages 242–261. Springer, 2006.
Preprints
1. Praveen Gauravaram, Lars R. Knudsen, Krystian Matusiewicz, Florian
Mendel, Christian Rechberger, Martin Schläffer, and Søren S. Thomsen.
Grøstl – a SHA-3 candidate. Submission to NIST, December 2008.
Available online: http://groestl.info/.
2. Mario Lamberger, Florian Mendel, Christian Rechberger, Vincent Rijmen,
and Martin Schläffer. The Rebound Attack and Subspace Distinguishers:
Application to Whirlpool. Cryptology ePrint Archive, Report 2010/198,
2010. http://eprint.iacr.org/.
3. Florian Mendel and Martin Schläffer. Collisions and Preimages for Sarmal.
NIST hash function mailing list, December 2008. Available online:
http://ehash.iaik.tugraz.at/uploads/d/d1/Salt-collision.pdf.
Resolution of the Curricula Commission for Bachelor's, Master's and Diploma Studies of 10.11.2008
Approval by the Senate on 1.12.2008
STATUTORY DECLARATION
I declare that I have authored this thesis independently, that I have not used other than the declared
sources / resources, and that I have explicitly marked all material which has been quoted either
literally or by content from the used sources.
…………………………… ………………………………………………..
date (signature)