Carlet C. Boolean Functions for Cryptography and Coding Theory 2020

Boolean Functions for Cryptography and Coding Theory

Boolean functions are essential to systems for secure and reliable communication.
This comprehensive survey of Boolean functions for cryptography and coding covers
the whole domain and all important results, building on the author's influential
articles with additional topics and recent results. A useful resource for researchers
and graduate students, the book balances detailed discussions of properties and
parameters with examples of various types of cryptographic attacks that motivate
the consideration of these parameters. It provides all the necessary background
on mathematics, cryptography, and coding and an overview of recent applications,
such as side-channel attacks on smart cards and hardware, cloud computing through
fully homomorphic encryption, and local pseudorandom generators. The result is a
complete and accessible text on the state of the art in single- and multiple-output
Boolean functions that illustrates the interaction among mathematics, computer
science, and telecommunications.

Claude Carlet is Professor Emeritus of Mathematics at the University of
Paris 8, France, and a member of the Bergen University Department of Computer
Science. He has contributed to 16 books, and published more than 130 papers in
international journals and more than 70 papers in international proceedings. He has
been a member of 80 program committees of international conferences and served as
cochair for 10 of them. He has overseen the research group Codage-Cryptographie,
which gathers all French researchers in coding and cryptography, and is editor-in-
chief of the journal Cryptography and Communications. He has been an invited
plenary speaker at 20 international conferences and the invited speaker at 30 other
international conferences and workshops.
Boolean Functions for Cryptography
and Coding Theory

Claude Carlet
University of Bergen, Norway, and University of Paris 8, France
University Printing House, Cambridge CB2 8BS, United Kingdom

One Liberty Plaza, 20th Floor, New York, NY 10006, USA

477 Williamstown Road, Port Melbourne, VIC 3207, Australia

314-321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India

79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.

It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781108473804
DOI: 10.1017/9781108606806

© Claude Carlet 2020

This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.

First published 2020

Printed in the United Kingdom by TJ Books Limited, Padstow Cornwall

A catalogue record for this publication is available from the British Library.

Library of Congress Cataloging-in-Publication Data


Names: Carlet, Claude, author.
Title: Boolean functions for cryptography and coding theory / Claude Carlet.
Description: Cambridge ; New York, NY : Cambridge University Press, 2020. |
Includes bibliographical references and index.
Identifiers: LCCN 2020002605 (print) | LCCN 2020002606 (ebook) |
ISBN 9781108473804 (hardback) | ISBN 9781108606806 (epub)
Subjects: LCSH: Algebra, Boolean. | Cryptography. | Coding theory.
Classification: LCC QA10.3 .C37 2020 (print) | LCC QA10.3 (ebook) |
DDC 003/.5401511324–dc23
LC record available at https://lccn.loc.gov/2020002605
LC ebook record available at https://lccn.loc.gov/2020002606

ISBN 978-1-108-47380-4 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
Contents

Preface page ix
Acknowledgments x
Notation xii
1 Introduction to cryptography, codes, Boolean, and vectorial
functions 1
1.1 Cryptography 1
1.2 Error-correcting codes 4
1.3 Boolean functions 17
1.4 Vectorial functions 24
2 Generalities on Boolean and vectorial functions 27
2.1 A hierarchy of equivalence relations over Boolean and vectorial
functions 27
2.2 Representations of Boolean functions and vectorial functions 30
2.3 The Fourier–Hadamard transform and the Walsh transform 52
2.4 Fast computation of S-boxes 74
3 Boolean functions, vectorial functions, and cryptography 76
3.1 Cryptographic criteria (and related parameters) for Boolean functions 76
3.2 Cryptographic criteria for vectorial functions in stream and block
ciphers 112
3.3 Cryptographic criteria and parameters for vectorial functions
in stream ciphers 129
3.4 Cryptographic criteria and parameters for vectorial functions
in block ciphers 134
3.5 Search for functions achieving the desired features 142
3.6 Boolean and vectorial functions for diffusion, secret sharing,
and authentication 145
4 Boolean functions, vectorial functions, and error-correcting codes 151
4.1 Reed–Muller codes 151
4.2 Other codes related to Boolean functions 159


5 Functions with weights, Walsh spectra, and nonlinearities
easier to study 164
5.1 Affine functions and their combinations 164
5.2 Quadratic functions and their combinations 170
5.3 Cubic functions 180
5.4 Indicators of flats 181
5.5 Functions admitting (partial) covering sequences 182
5.6 Functions with low univariate degree and related functions 187
6 Bent functions and plateaued functions 189
6.1 Bent Boolean functions 190
6.2 Partially-bent and plateaued Boolean functions 255
6.3 Bent4 and partially-bent4 functions 266
6.4 Bent vectorial functions 268
6.5 Plateaued vectorial functions 274
7 Correlation immune and resilient functions 284
7.1 Correlation immune and resilient Boolean functions 284
7.2 Resilient vectorial Boolean functions 313
8 Functions satisfying SAC, PC, and EPC, or having good GAC 318
8.1 PC(l) criterion 318
8.2 PC(l) of order k and EPC(l) of order k criteria 319
8.3 Absolute indicator 320
9 Algebraic immune functions 321
9.1 Algebraic immune Boolean functions 321
9.2 Algebraic immune vectorial functions 344
10 Particular classes of Boolean functions 352
10.1 Symmetric functions 352
10.2 Rotation symmetric, idempotent, and other similar functions 360
10.3 Direct sums of monomials 362
10.4 Monotone functions 363
11 Highly nonlinear vectorial functions with low differential
uniformity 369
11.1 The covering radius bound; bent/perfect nonlinear functions 370
11.2 The Sidelnikov–Chabaud–Vaudenay bound 370
11.3 Almost perfect nonlinear and almost bent functions 371
11.4 The known infinite classes of AB functions 394
11.5 The known infinite classes of APN functions 399
11.6 Differentially uniform functions 412

12 Recent uses of Boolean and vectorial functions and related
problems 425
12.1 Physical attacks and related problems on functions and codes 425
12.2 Fully homomorphic encryption and related questions on Boolean
functions 453
12.3 Local pseudorandom generators and related criteria on Boolean
functions 467
12.4 The Gowers norm on pseudo-Boolean functions 469
13 Open questions 475
13.1 Questions of general cryptography dealing with functions 475
13.2 General questions on Boolean functions and vectorial functions 475
13.3 Bent functions and plateaued functions 476
13.4 Correlation immune and resilient functions 477
13.5 Algebraic immune functions 477
13.6 Highly nonlinear vectorial functions with low differential uniformity 478
13.7 Recent uses of Boolean and vectorial functions and related problems 478
14 Appendix: finite fields 480
14.1 Prime fields and fields with four, eight, and nine elements 480
14.2 General finite fields: construction, primitive element 483
14.3 Representation (additive and multiplicative); trace function 488
14.4 Permutations on a finite field 490
14.5 Equations over finite fields 494

References 498
Index 557
Preface

The present monograph is a merged, reorganized, significantly revised, and extensively
completed version of two chapters, entitled “Boolean Functions for Cryptography and Error
Correcting Codes” [236] and “Vectorial Boolean Functions for Cryptography” [237], which
appeared in 2010 as parts of the book Boolean Models and Methods in Mathematics,
Computer Science, and Engineering [394] (editors, Yves Crama and Peter Hammer). It is
meant for researchers but is accessible to anyone who knows basics in linear algebra and
general mathematics. All the other notions needed are introduced and studied (even finite
fields are, in the Appendix).
Since these chapters were written in 2009, about 1,500 papers have been published that
deal with this twofold topic (which is broad, as we see), and this version is updated with the
main references and their main results (with corrections in the rare cases where they were
needed). It also contains original results.
New notions on Boolean and vectorial functions and new ways of using them have also
emerged. A chapter devoted to these recent and/or insufficiently studied directions of research
has been included.
Within the limits of a book, we tried to be as complete as possible. Of course, we could not go
into as much detail as papers do, but we did our best to ensure a good trade-off between
completeness in scope and in depth. The choice of those papers that are referred to and of
those results that are developed may seem subjective; it has been difficult, given the large
number of papers. We tried, within the imposed length limit, to give the proof of a result
each time it was short and simple enough, and when it provided insight (we tried to avoid
giving overly technical proofs whose only value, important though it is, would have been
to convince the reader that the result is true). We would have liked to avoid, when presenting
arguments and observations, referring to results (and concepts) that come later in the text, but the
large number of results has made this necessary; otherwise, it would have been impossible
to gather in one place all the facts related to the same notion.
We have limited ourselves to Boolean and vectorial functions in characteristic 2, since
these fit better with applications in coding and cryptography, and since dealing with
p-ary and generalized functions would have reduced the description of the results on binary
functions.

Acknowledgments

The author wishes to thank Cambridge University Press for publishing this monograph,
and in particular Kaitlin Leach, Amy He, and Mark Fox for their kind help. He deeply
thanks Lilya Budaghyan, from the Selmer Center, University of Bergen, for her kind
support and her precious and numerous bits of information, in particular on almost perfect
nonlinear (APN) functions, which allowed me to improve several chapters, making them
more accurate, complete, and up-to-date, and Sihem Mesnager, from the University of Paris
8 and the Laboratoire Analyse, Géométrie et Applications (LAGA), for her careful reading
of the whole book during the time it was written, for her supporting advice, and for her
detailed additive proposals, which improved the completeness. I also thank very much Victor
Chen, Sylvain Guilley, Pierrick Méaux, Lauren De Meyer, Stjepan Picek, Emmanuel Prouff,
Sondre Rønjom, and Deng Tang, each of whom helped with completing and correcting a
part of a section of the book or even several. Many thanks also to the anonymous reviewers
invited by Cambridge University Press, whose comments have been helpful.
Research is a collective action and a too-long list of names should be cited to acknowledge
all the stimulating discussions, collaborations, and information that contributed to this
book. A few names are the 10 previously mentioned and Kanat Abdukhalikov, Benny
Applebaum, Thierry Berger, Marco Calderini, Xi Chen, Robert Coulter, Diana Davidova,
Ulrich Dempwolff, John Dillon, Cunsheng Ding, the late Hans Dobbertin, Yves Edel,
Keqin Feng, Caroline Fontaine, Rafael Fourquet, Philippe Gaborit, Faruk Göloglu, Guang
Gong, Aline Gouget, Cem Güneri, Tor Helleseth, Xiang-dong Hou, Nikolay Kaleyski,
William Kantor, Selçuk Kavut, Jenny Key, Alexander Kholosha, Andrew Klapper, Nicholas
Kolokotronis, Gohar Kyureghyan, Philippe Langevin, Gregor Leander, Alla Levina, Chunlei
Li, Nian Li, Konstantinos Limniotis, Mikhail Lobanov, Luca Mariot, Subhamoy Maitra, the
late James Massey, Gary McGuire, Wilfried Meidl, Willi Meier, Harald Niederreiter, Svetla
Nikova, Kaisa Nyberg, Ferruh Özbudak, Daniel Panario, Matthew Parker, Enes Pasalic,
George Petrides, Alexander Pott, Mathieu Rivain, Thomas Roche, François Rodier, Neil
Sloane, François-Xavier Standaert, Henning Stichtenoth, Yin Tan, Chunming Tang, Horacio
Tapia-Recillas, Faina Solov’eva, Pante Stănică, Yuriy Tarannikov, Cédric Tavernier, Alev
Topuzoğlu, Irene Villa, Arne Winterhof, Satoshi Yoshiara, Xiangyong Zeng, Fengrong
Zhang, and Victor Zinoviev, as well as the members of the National Institute for Research
in Computer Science and Automation (INRIA) team, whose CODES project (now called
SECRET) has been a nice research environment and has supported me during my thesis and
many years after, and the Bergen Selmer Center team, which does the same now, with a
spirit of kindness and generosity, for my great scientific benefit.


I also wish to acknowledge that gathering the bibliography has been considerably eased by
websites such as dblp: computer science bibliography (https://dblp.uni-trier.de), ResearchGate
(www.researchgate.net), and Google Scholar (https://scholar.google.fr/schhp?hl=fr&tab=Xs).
Last but not least, I am so grateful to my wife Madeleine and my family for their support,
patience, and understanding of what a researcher’s work is. This is even more true for the last
three years, during which the writing of this book, the reviewing of the numerous published
papers, and the copyediting took so much of my time. I dedicate my book to them, with a
special thought for my children and grandchildren, who will have to face the world we leave
them.
Notation

$|I|$    size of a set $I$,
$\lfloor u \rfloor$    integer part (floor) of a real number $u$,
$\lceil u \rceil$    ceiling of $u$ (the smallest integer larger than or equal to $u$),
$\phi^{-1}(u)$    preimage of $u$ by a function $\phi$,
$1_E$    indicator (or characteristic) function of a set $E$: $1_E(x) = 1$ if $x \in E$, and $1_E(x) = 0$ otherwise,
$\delta_a$    the Dirac (or Kronecker) symbol at $a$ (i.e. the indicator of $\{a\}$),
$\mathbb{F}_2$    the finite field with two elements 0, 1 (bits),
$\mathbb{F}_2^n$    the $n$-dimensional vector space over $\mathbb{F}_2$ (sometimes identified with $\mathbb{F}_{2^n}$),
$\mathcal{L}_{n,m}$    the vector space of linear $(n, m)$-functions,
$0_n$    zero vector in $\mathbb{F}_2^n$ or in $\mathbb{F}_q^n$, $n > 1$ (in other groups, we just write 0),
$1_n$    vector $(1, \dots, 1)$ in $\mathbb{F}_2^n$,
$+$    addition in characteristic 0 (e.g., in $\mathbb{R}$), and in $\mathbb{F}_2^n$ and $\mathbb{F}_{2^n}$ for $n > 1$,
$\sum_i$    multiple sum of $+$,
$\oplus$    addition in $\mathbb{F}_2$ (i.e., modulo 2); direct sum of two vector spaces,
$\bigoplus_i$    multiple sum of $\oplus$,
$\overline{x}$    $x + 1_n$, where $x \in \mathbb{F}_2^n$,
$a \cdot x$    inner product in $\mathbb{F}_2^n$,
$\ell_a(x)$, $t_a(x)$    $= a \cdot x$, resp. $x + a$, where “$\cdot$” is an inner product in $\mathbb{F}_2^n$,
$\mathbb{F}_2^I$    the vector space over $\mathbb{F}_2$ of all binary vectors whose indices range in $I$,
$\mathbb{F}_{2^n}$    the finite (Galois) field of order $2^n$, identified with $\mathbb{F}_2^n$ as a vector space,
$tr^n_m(x)$    $= x + x^{2^m} + x^{2^{2m}} + \cdots + x^{2^{n-m}}$, trace function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^m}$ ($m \mid n$),
$tr_n(x)$    $= tr^n_1(x) = \sum_{i=0}^{n-1} x^{2^i}$, the absolute trace function,
$\mathbb{F}_{2^n}^*$    $\mathbb{F}_{2^n} \setminus \{0\}$, where 0 denotes the zero element of $\mathbb{F}_{2^n}$,
$\alpha$    primitive element of $\mathbb{F}_{2^n}$,
$\otimes$    convolutional product of two functions over $\mathbb{F}_2^n$ (see page 60),
$f, g, h, \dots$    Boolean functions,
$\mathcal{BF}_n$    the $\mathbb{F}_2$-vector space of all $n$-variable Boolean functions $f : \mathbb{F}_2^n \to \mathbb{F}_2$,
$F, G, H, \dots$    vectorial functions,
$\mathcal{G}_F$    graph of a vectorial function: $\mathcal{G}_F = \{(x, F(x));\ x \in \mathbb{F}_2^n\}$,
$w_H(\cdot)$    Hamming weight (of a vector, of a function),
$d_H(\cdot, \cdot)$    Hamming distance (between two vectors, two functions),
$d(C)$    minimum (Hamming) distance of code $C$,


$supp(\cdot)$    the support (of a vector, of a function),
$x \preceq y$    “$x$ is covered by $y$” (i.e., $supp(x) \subseteq supp(y)$),
$x \vee y$    vector such that $supp(x \vee y) = supp(x) \cup supp(y)$,
$x \wedge y$    vector such that $supp(x \wedge y) = supp(x) \cap supp(y)$,
$e_i$    $i$th vector of the canonical basis of $\mathbb{F}_2^n$,
$x^I$, $x^u$    $\prod_{i \in I} x_i$, $I \subseteq \{1, \dots, n\}$; $\prod_{i=1}^{n} x_i^{u_i}$, $u \in \mathbb{F}_2^n$,
$f \mapsto f^\circ$    binary Möbius transform ($f^\circ : u \mapsto a_u$, coef. of $x^u$ in the ANF of $f$),
$\widehat{\varphi}$    Fourier–Hadamard transform of a real-valued function $\varphi$ over $\mathbb{F}_2^n$,
$f_\chi$    sign function of a Boolean function $f$, that is, $x \mapsto (-1)^{f(x)}$,
$W_f(\cdot)$    Walsh transform of a Boolean function $f$ (i.e., $\widehat{f_\chi}$),
$W_F(\cdot, \cdot)$    Walsh transform of a vectorial function $F$,
$supp(W_f)$    support of $W_f$: $\{u \in \mathbb{F}_2^n;\ W_f(u) \neq 0\}$,
$N_{W_f}$    cardinality of the support of $W_f$,
$\mathcal{F}(f)$    $\sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)}$ $(= W_f(0_n))$,
$nl(\cdot)$    nonlinearity of a Boolean or vectorial function,
$nl_r(\cdot)$    $r$-th order nonlinearity of a Boolean function,
$\ln$, $\log_2$    natural (Neperian) logarithm, base 2 logarithm,
$d_{alg}(f)$    the algebraic degree of $f$ (i.e., the degree of its ANF),
$d_{num}(f)$    the numerical degree of $f$ (i.e., the degree of its NNF),
$w_2(j)$    2-weight of integer $j$ (see page 45),
$(n, m, t)$-function    $t$-resilient $(n, m)$-function,
$AI(\cdot)$    algebraic immunity of a function,
$M_{f,d}$    matrix of the system of equations $\sum_{I \subseteq \{1, \dots, n\},\ |I| \le d} a_I u^I = 0$, $u \in supp(f)$,
$rk(M)$    the rank of a matrix $M$,
$FAC(\cdot)$    fast algebraic complexity of a function,
$FAI(\cdot)$    fast algebraic immunity of a function,
$D_a f$, $D_a F$    derivatives in the direction $a$: $x \mapsto f(x) \oplus f(x + a)$, $F(x) + F(x + a)$,
$\Delta$    the symmetric difference between two sets,
$\Delta_f(a)$    autocorrelation function $\Delta_f(a) = \sum_{x \in \mathbb{F}_2^n} (-1)^{D_a f(x)}$,
$\Delta_f$    absolute indicator of $f$: $\Delta_f = \max_{a \in \mathbb{F}_2^n \setminus \{0_n\}} |\Delta_f(a)|$,
$\mathcal{V}(f)$    sum-of-squares indicator of $f$: $\sum_{e \in \mathbb{F}_2^n} \mathcal{F}^2(D_e f)$,
$\mathcal{E}_f$    linear kernel of a Boolean function $f$,
$RM(r, n)$    Reed–Muller code of order $r$ and length $2^n$,
$\rho(r, n)$    covering radius of $RM(r, n)$,
$\beta_f$    the symplectic form associated to a quadratic function $f$,
$\widetilde{f}$    dual of a bent Boolean function (Definition 51, page 197),
$\mathcal{M}$    Maiorana–McFarland’s class,
$\mathcal{PS}$    partial spread class,
$L^*$    adjoint operator of a linear automorphism $L$,
$Im(F)$    the range (i.e., image set) $F(\mathbb{F}_2^n)$ of an $(n, m)$-function,
$An(f)$    the $\mathbb{F}_2$-vector space of annihilators of a Boolean function $f$,
$An_d(f)$    restriction of $An(f)$ to those functions of algebraic degree at most $d$,
$B_{k,l}(f)$    $= \{g \in \mathcal{BF}_n;\ d_{alg}(g) \le k \text{ and } d_{alg}(fg) \le l\}$,

$\overline{f}$    defined by $f(x) = \overline{f}(w_H(x))$, when $f$ is symmetric,
$\sigma_i(x)$    elementary symmetric Boolean fct., of ANF: $\bigoplus_{I \subseteq \{1, \dots, n\},\ |I| = i} x^I$,
$S_i(x)$    elementary symmetric pseudo-Boolean fct., of NNF: $\sum_{I \subseteq \{1, \dots, n\},\ |I| = i} x^I$,
$\delta_F$    differential uniformity of an $(n, m)$-function $F$,
$Nb_F$    imbalance of an $(n, m)$-function (see page 113),
$NB_F$    derivative imbalance of an $(n, m)$-function (see page 138),
$\mathbf{x}$    a sharing of $x$ (see page 436),
$\mathbf{F}$    a threshold implementation of function $F$ (see page 436),
$E_{n,k}$    $= \{x \in \mathbb{F}_2^n;\ w_H(x) = k\}$,
$w_H(f)_k$    Hamming weight of the restriction of function $f$ to $E_{n,k}$,
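
A few of the notations above can be made concrete on a small example. The sketch below is only an illustration: the 3-variable function $f(x) = x_1 x_2 \oplus x_3$ and all Python names are assumptions chosen here, and the Walsh transform is computed by brute force from the standard formula $W_f(u) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus u \cdot x}$.

```python
from itertools import product

n = 3
points = list(product((0, 1), repeat=n))  # all vectors of F_2^n


def f(x):
    """Toy Boolean function x1*x2 xor x3 (chosen for illustration only)."""
    return (x[0] & x[1]) ^ x[2]


def dot(a, x):
    """Usual inner product a.x in F_2^n."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2


def add(x, a):
    """Coordinatewise addition x + a in F_2^n."""
    return tuple(xi ^ ai for xi, ai in zip(x, a))


def hamming_weight(g):
    """w_H(g): number of points where g takes the value 1."""
    return sum(g(x) for x in points)


def walsh(g, u):
    """W_g(u): sum over x of (-1)^(g(x) xor u.x)."""
    return sum((-1) ** (g(x) ^ dot(u, x)) for x in points)


def autocorrelation(g, a):
    """Delta_g(a): sum over x of (-1)^(D_a g(x)), with D_a g(x) = g(x) xor g(x + a)."""
    return sum((-1) ** (g(x) ^ g(add(x, a))) for x in points)
```

For this toy function, `walsh(f, (0, 0, 0))` equals $\mathcal{F}(f)$ and vanishes because $f$ takes the value 1 on exactly half of the points.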
1 Introduction to cryptography, codes, Boolean, and vectorial functions

1.1 Cryptography
A fundamental objective of cryptography is to enable two persons to communicate over an
insecure channel (a public channel such as the internet) in such a way that any other person
is unable to recover their messages (constituting the plaintext) from what is sent in its place
over the channel (the ciphertext). The transformation of the plaintext into the ciphertext is
called encryption, or enciphering. It is ensured by a cryptosystem. Encryption–decryption is
the most ancient cryptographic activity (ciphers already existed in the fourth century BC) but
its nature has deeply changed with the invention of computers, because the cryptanalysis
(the activity of the third person, the eavesdropper, who aims at recovering the message, or
better, the secret data used by the algorithm – which is assumed to be public) can use their
power. Another important change will occur (see e.g., [70, 360, 832]), at least for public-key
cryptography (see the definition below), when quantum computers become operational.
The encryption algorithm takes as input the plaintext and an encryption key $K_E$, and
it outputs the ciphertext. The decryption (or deciphering) algorithm takes as input the
ciphertext and a private¹ decryption key $K_D$. It outputs the plaintext.

[Figure: the plaintext is encrypted with the key $K_E$ into the ciphertext, which is sent over the public channel and decrypted with the key $K_D$ back into the plaintext.]

To be considered robust, a cryptosystem should not be cryptanalyzed by an attack
needing fewer than $2^{80}$ elementary operations (which represent thousands of centuries of
computation with a modern computer) and fewer than billions of plaintext–ciphertext pairs. In
particular, an exhaustive search of the secret parameters of the cryptosystem (consisting in
trying every possible value of them until the data given to the attacker match the computed
data) should not be feasible in fewer than $2^{80}$ elementary operations. In fact, we most often
even want there to be no faster cryptanalysis than exhaustive search.
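
To give a feeling for the order of magnitude involved, the following sketch converts a number of elementary operations into centuries of computation. The rate of $10^{10}$ operations per second is an assumption made here for illustration only, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def centuries_to_perform(num_operations, ops_per_second=10**10):
    """Centuries needed to perform num_operations at an assumed constant rate."""
    seconds = num_operations / ops_per_second
    return seconds / SECONDS_PER_YEAR / 100


if __name__ == "__main__":
    # Exhaustive search bound discussed above, under the assumed rate.
    print(f"{centuries_to_perform(2**80):.3g} centuries")
```

Under this assumed rate, the result lies in the tens of thousands of centuries; a machine a thousand times faster would still need dozens of centuries.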

1 According to principles already stated in 1883 by A. Kerckhoffs [688], who cited a still more ancient
manuscript by R. du Carlet [207], only the key(s) need absolutely to be kept secret – the confidentiality should
not rely on the secrecy of the encryption method – and a cipher cannot be considered secure if it can be
decrypted by the designer himself without using the decryption key.


Note that the term cryptography is often used interchangeably for the two activities
of designing cryptosystems and of cryptanalyzing them, while the correct term when dealing
with both is cryptology.

1.1.1 Symmetric versus public-key cryptosystems


If the encryption key is supposed to be secret, then we speak of conventional cryptography
or of private-key cryptography. We also speak of symmetric cryptography since the same
key can then be used for $K_E$ and $K_D$. In practice, the principle of conventional cryptography
relies then on the sharing of a private key between the sender of a message (often called
Alice) and its receiver (often called Bob). Until the late 1970s, only symmetric ciphers
existed.
If the encryption key can be public, then we speak of public-key cryptography (or
asymmetric cryptography), which is preferable to conventional cryptography, since it makes
it possible to securely communicate without having previously shared keys in a secure
way: every person who wants to receive secret messages can keep secret a decryption
key and publish an encryption key; if $n$ persons want to secretly communicate pairwise
using a public-key cryptosystem, they need $n$ encryption keys and $n$ decryption keys, whereas
conventional cryptosystems will need $\binom{n}{2} = \frac{n(n-1)}{2}$ keys. Of course, it must be impossible
to deduce in reasonable time, even with huge computational power, the private decryption
key from the public encryption key. Such requirement is related to the problem of building
one-way functions, that is, functions such that computing the image of an element is fast
(i.e., is a problem of polynomial complexity), while the problem of computing the preimage
of an element has exponential complexity.
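
The key counts compared above can be checked numerically. This is a minimal sketch; the function names are hypothetical and only restate the two formulas of the paragraph.

```python
from math import comb


def public_key_count(n):
    """n encryption keys plus n decryption keys, one pair per participant."""
    return 2 * n


def symmetric_key_count(n):
    """One shared secret key per unordered pair of participants: n(n-1)/2."""
    return comb(n, 2)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, public_key_count(n), symmetric_key_count(n))
```

The public-key count grows linearly with the number of participants, while the conventional count grows quadratically.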
All known public-key cryptosystems, such as RSA, which uses operations in large
rings [846], allow a much lower data throughput; they also need keys of sizes 10 times
larger than symmetric ciphers for ensuring the same level of security. Some public-key
cryptosystems, such as those of McEliece and Niederreiter (based on codes) [846], are
faster, but have drawbacks, because the ciphertext and the plaintext have quite different
lengths, and the keys are still larger than for other public-key cryptosystems.² Private-key
cryptosystems are then still needed nowadays for ensuring the confidential transfer of
large data. In practice, they are widely used for confidentiality on the internet, in banking,
mobile communications, etc., and their study and design are still an active domain of
research. Thanks to public-key cryptosystems, the distribution of the necessary secret keys
for the symmetric cipher can be done without using a secure channel (the secret keys for
conventional cryptosystems are strings of a few hundred bits only and can then be
encrypted by public-key cryptosystems). The protagonists can then exchange safely, over a
public channel such as the internet, their common private encryption–decryption key, called
a session key. Protocols specially devoted to key exchange can also be used.
The change caused by the arrival of quantum computers will probably be much less
important for symmetric than for public-key cryptography. Most current symmetric ciphers
seem secure against attacks by quantum computers (Grover’s algorithm [576], which, given
a black box with $N$ possible inputs and some output, deduces with high probability from
the results of $O(\sqrt{N})$ evaluations the supposedly unique input,³ will probably have as an
impact the necessity to double the length of the keys).

2 Code-based, lattice-based, and other “postquantum” cryptosystems are, however, actively studied, mainly
because they would be alternatives to RSA and to the cryptosystems based on the discrete logarithm, in case
an efficient quantum computer could be built in the future, which would break them.
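
The arithmetic behind the doubling of key lengths can be sketched as follows. The code only evaluates $\sqrt{N}$ for $N = 2^k$ possible keys and ignores the constant hidden in the $O(\cdot)$ notation; the function name is hypothetical.

```python
import math


def grover_evaluations(key_bits):
    """Order of magnitude sqrt(N) of black-box evaluations, with N = 2**key_bits keys."""
    return math.isqrt(2 ** key_bits)


if __name__ == "__main__":
    for bits in (128, 256):
        evaluations = grover_evaluations(bits)
        print(bits, "-bit key: about 2^", evaluations.bit_length() - 1, "evaluations")
```

With a 128-bit key the search costs about $2^{64}$ evaluations, and a 256-bit key brings the cost back to about $2^{128}$.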

1.1.2 Block ciphers versus stream ciphers


The encryption in a symmetric cipher can be treated block by block in a so-called block
cipher (such as the Advanced Encryption Standard, AES [403, 404]). The binary plaintext
is then divided into blocks of the same size, several blocks being encrypted with the same
key (and a public value called the initial vector being changed more often). It can also be treated
in a stream cipher [463], through the addition, most often mod 2, of a keystream of the
same size as the plaintext, output by a pseudorandom generator (PRG) parameterized by a
secret key (the keystream can be produced symbol by symbol, or block by block when the
PRG uses a block cipher in a proper mode⁴). A quality of stream ciphers is to avoid error
propagation, which gives them an advantage in applications where errors may occur during
the transmission.
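
The keystream addition described above can be sketched in a few lines. The generator used here is Python's `random` module seeded with the key, which is an assumption made purely for illustration: it is not cryptographically secure and merely stands in for a real PRG. Since addition mod 2 is its own inverse, applying the same function twice with the same key returns the plaintext.

```python
import random


def toy_keystream(key, length):
    """Toy PRG, NOT cryptographically secure: placeholder for a real keystream generator."""
    rng = random.Random(key)
    return bytes(rng.getrandbits(8) for _ in range(length))


def xor_with_keystream(data, key):
    """Bitwise addition mod 2 of the keystream; used for encryption and decryption alike."""
    keystream = toy_keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


if __name__ == "__main__":
    ciphertext = xor_with_keystream(b"attack at dawn", key=1234)
    print(xor_with_keystream(ciphertext, key=1234))
```

Flipping one bit of the ciphertext flips only the corresponding bit of the decrypted plaintext, which illustrates the absence of error propagation.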
The ciphertext can be decrypted in the case of block ciphers by inverting the process and
in the case of stream ciphers by the same bitwise addition of the keystream, which gives back
the plaintext. Stream ciphers are also meant to be faster and to consume less electric power
(which makes them suited to cheap embedded devices). The triple constraint of being
lightweight and fast while ensuring security is a difficult challenge for stream ciphers, all
the more since they do not have the advantage of involving several rounds like block ciphers
(their security is dependent on the PRG only). And the situation is nowadays still more
difficult because modern block ciphers such as the AES are very fast. This difficulty has
been illustrated by the failure of all six stream ciphers submitted to the 2000–2003 NESSIE
project (New European Schemes for Signatures, Integrity and Encryption) [901], whose
purpose was to identify secure cryptographic primitives. NESSIE was then followed
by the eSTREAM contest [495], organized between 2004 and 2008 by the European
Union (EU) ECRYPT network.
As mentioned in [242], the price to pay for these three constraints described above is that
security proofs hardly exist for efficient stream ciphers as they do for block ciphers. This
is a drawback of stream ciphers, compared to block ciphers.⁵ The only practical possibility
for verifying the security of efficient stream ciphers (in particular, the unpredictability of the
keystream they generate) is to prove that they resist the known attacks. It is then advisable
to include some amount of randomness in them, so as to increase the probability of resisting
future attacks.⁶

3 Or equivalently finds with high probability a specific entry in an unsorted database of N entries.
4 Note, however, that stream ciphers are often supposed to be used on lighter devices than block ciphers
(typically not needing cryptoprocessors, for instance).
5 However, the security of block ciphers is actually proved under simplifying hypotheses, and it has been said by
Lars Knudsen that “what is provably secure is probably not.”
6 Some stream cipher proposals, such as the Toyocrypt, LILI-128 and SFINKS ciphers, learned this at their own
expense; see [387].

Proving the security of a cipher consists of reducing it to the intractability of a hard
problem (a problem that has been extensively addressed by the academic community, and
for which only algorithms of exponential or subexponential complexity could be found),
implying that any potential attack on it could be used for designing an efficient algorithm
(whose worst-case complexity would be polynomial in the size of its input) solving the
hard problem.
Note that provably secure stream ciphers do exist (some proposals are even uncondition-
ally secure, that is, are secure even if the attacker has unlimited computational power, but
limited storage or access); see for instance the proposals by Alexi–Chor–Goldreich–Schnorr
(whose security is reducible to the intractability of the RSA problem) or Blum–Blum–
Shub [98] (whose security is reducible to the intractability of the quadratic residue problem
modulo pq, where p and q are large primes), or the stream cipher QUAD [61] (based on the
iteration of a multivariate quadratic system over a finite field, and whose security is reducible
to the intractability of the so-called multivariate quadratic (MQ) polynomial problem). But
they are too slow and too heavy to be used in practice. Even in the case of QUAD, which
is the fastest, the encryption speed is lower than for the AES. And this is still worse when
security is ensured unconditionally. This is why the stream ciphers using Boolean functions
(see below) are still widely used and studied.

1.2 Error-correcting codes


The objective of error-detecting/-correcting codes in coding theory is to enable digital
communication over a noisy channel, in such a way that the errors of transmission can
be detected by the receiver and, in the case of error-correcting codes, corrected. General
references are [63, 780, 809]. Shannon’s paper [1033] is also prominent.
Without correction, when an error is detected, the information needs to be requested again
by the receiver and sent again by the sender (such procedure is called an Automatic Repeat
reQuest, ARQ). This is what happened with the first computers: working with binary words,
they could detect only one error (one bit) in the transmission of $(x_1, \dots, x_k)$, by adding a
parity bit $x_{k+1} = \bigoplus_{i=1}^{k} x_i$ (this transformed the word of length $k$ into a word of length $k + 1$
having even Hamming weight, i.e., an even number of nonzero coordinates, which was then
sent over the noisy channel; if an error occurred in the transmission, then, assuming that
only one could occur, this was detected by the fact that the received word had odd Hamming
weight).
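
The parity-bit mechanism just described can be sketched as follows; words are represented as Python lists of bits and the function names are hypothetical.

```python
def add_parity_bit(word):
    """Append x_{k+1} = x_1 xor ... xor x_k, so that the result has even Hamming weight."""
    return word + [sum(word) % 2]


def has_detected_error(received):
    """A received word of odd Hamming weight cannot be a codeword."""
    return sum(received) % 2 == 1


if __name__ == "__main__":
    sent = add_parity_bit([1, 0, 1, 1])
    received = sent.copy()
    received[2] ^= 1  # a single transmission error
    print(has_detected_error(sent), has_detected_error(received))
```

A second error in the same word restores even weight and goes unnoticed, which is why the scheme assumes that at most one error occurs.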
With correction, the ARQ is not necessary, but this requires in practice that fewer errors
have occurred than for detection (see below). Hybrid coding techniques exist then that make
a trade-off between the two approaches.
The aim of error detection/correction is achieved by using an encoding algorithm that
transforms the information (assumed to be a sequence over some alphabet A) before sending
it over the channel. In the case of block coding,⁷ the original sequence (the message) is
treated as a list of vectors (words) of the same length – say k – called source vectors, which
are encoded into codewords of a larger length – say⁸ n. If the alphabet with which the words
are built is the field $\mathbb{F}_2$ of order 2, we say that the code is binary. If the code is not binary,
then the symbols of the alphabet will have to be transformed into binary vectors before being
sent over a binary channel.
Thanks to the length extension, called redundancy, the codewords sent over the channel
form only a subset of all possible vectors of length n. The set C of all codewords is called the code
(for instance, in the case of the detecting codes using a parity bit as indicated above, the
code is made of all binary words of length n = k + 1 and of even Hamming weight; it is
called the parity code). The only information the receiver has, concerning the sent word, is
that it belongs to C.

⁷ We shall not address convolutional coding here.
⁸ When dealing with Boolean functions, the symbol n will be often devoted to their number of variables; the
length of the codes they will constitute will then not be n but $N = 2^n$. See Section 1.3.

[Figure: Message → Encoding → Codeword → Noisy channel → Decoding → Message]

1.2.1 Detecting and correcting capacities of a code


The decoding algorithm of an error-detecting code is able to recognize if a received vector
is a codeword. This makes it possible to detect errors of transmission if (see [585]), denoting
by d the minimum Hamming distance between codewords, i.e., the minimum number of
positions at which two distinct codewords differ (called the minimum distance of the code), no more
than d − 1 coordinates of the received vector differ from those of the sent codeword
(condition for having no risk that a codeword different from the sent one can be received
and then accepted). In the case of an error-correcting code, the decoding algorithm can
additionally correct the errors of transmission, if their number is smaller than or equal to
the so-called correction capacity of the code. This capacity equals $e = \lfloor \frac{d-1}{2} \rfloor$, where “$\lfloor\ \rfloor$”
denotes the integer part (and so, roughly, a code can detect twice as many errors as it can
correct), since the condition for having no risk that a vector corresponds, as received vector,
to more than one sent codeword with at most t errors of transmission in each case is that
2t < d. Indeed, in order to be always able (theoretically) to recover the correct codeword,
we need that, for every word y at distance at most t from a codeword x, there does not exist
another codeword $x'$ at distance at most t from y, and this is equivalent to saying that the
Hamming distance between any two different codewords is larger than or equal to 2t + 1:
• If there exist a vector y and two distinct codewords $x$ and $x'$ at Hamming distance at most t
from y, then we have d ≤ 2t by the triangle inequality on distances.
• Conversely, if there exist two codewords $x$ and $x'$ at Hamming distance δ ≤ 2t from
each other, then there exists a vector y such that $d_H(x, y) \le t$ and $d_H(x', y) \le t$ (let I
be the set of positions where $x$ and $x'$ coincide; take $y_i = x_i$ when i ∈ I and among the
δ others, take for instance $\lceil \frac{\delta}{2} \rceil$ coordinates of y equal to those of $x$ and the $\lfloor \frac{\delta}{2} \rfloor$ others
equal to those of $x'$).

In practice, determining d and then $e = \lfloor \frac{d-1}{2} \rfloor$ and showing that they are large is not
sufficient. We still need to have an efficient decoding algorithm to recover the sent codeword.
The naive method consisting in visiting all codewords and keeping the nearest one from the
received word is inefficient because the number $2^k$ of codewords is too large, in general.
6 Introduction to cryptography, codes, Boolean, and vectorial functions

Determining the nearest codeword from a received vector is called maximum likelihood
decoding.
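The naive nearest-codeword method can be sketched as follows for a code given as an explicit list of words. This is only an illustration under our own assumptions (the helper names and the choice of the binary repetition code of length 5 as example are ours); it visits all codewords and is therefore usable for tiny codes only.

```python
def hamming_distance(u, v):
    """Number of positions at which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def minimum_distance(code):
    """Minimum Hamming distance between two distinct codewords."""
    words = list(code)
    return min(
        hamming_distance(u, v)
        for i, u in enumerate(words)
        for v in words[i + 1:]
    )


def nearest_codeword(code, received):
    """Exhaustive search for a codeword closest to the received word."""
    return min(code, key=lambda c: hamming_distance(c, received))


repetition_code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
d = minimum_distance(repetition_code)
e = (d - 1) // 2                      # correction capacity

received = (1, 0, 1, 1, 0)            # 11111 sent, two errors of transmission
decoded = nearest_codeword(repetition_code, received)
```

Here two errors do not exceed the correction capacity, so the sent codeword is recovered.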
The correction capacity e is limited by the Hamming bound (or sphere-packing bound):
since all the balls $B(x, e) = \{y \in A^n;\ d_H(x, y) \le e\}$, of radius e and centered at
codewords, are pairwise disjoint, and since there are |C| of them, the size of their union
equals $|C| \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i$, where q is the size of the alphabet. This union is a subset of
$A^n$. This implies the following:
$$|C| \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i \le q^n.$$
The codes that achieve this bound with equality are called perfect codes.
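As a numerical illustration of the Hamming bound, the following sketch evaluates both sides of the inequality from the parameters n, |C|, d and q. The function names are ours; the example uses the parameters of the binary Hamming code of length 7 (16 codewords, minimum distance 3) and of the parity code of length 4.

```python
from math import comb


def ball_size(n, radius, q=2):
    """Number of vectors of length n at Hamming distance at most `radius` from a given one."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(radius + 1))


def satisfies_hamming_bound(n, size, d, q=2):
    """Check |C| * |B(x, e)| <= q^n with e = floor((d - 1) / 2)."""
    e = (d - 1) // 2
    return size * ball_size(n, e, q) <= q ** n


def is_perfect(n, size, d, q=2):
    """A code is perfect when the Hamming bound holds with equality."""
    e = (d - 1) // 2
    return size * ball_size(n, e, q) == q ** n


hamming_7_is_perfect = is_perfect(7, 16, 3)      # 16 * (1 + 7) = 2^7
parity_4_is_perfect = is_perfect(4, 8, 2)        # 8 * 1 < 2^4
```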

Puncturing, shortening, and extending codes


The punctured code of a code C is the set of vectors obtained by deleting the coordinate at
some fixed position i in each codeword of C; we shall call such transformation puncturing
at position i. This operation can be iterated, and we shall still speak of puncturing a code
when deleting the codeword coordinates at several positions.
The shortened code of a code C is the set of vectors obtained by keeping only those
codewords whose ith coordinate is null and deleting this ith coordinate.
The extended code of a code C over an additive group is the set of vectors, say,
$(c_0, c_1, \dots, c_n)$, where $(c_1, \dots, c_n) \in C$ and $c_0 = -(c_1 + \dots + c_n)$. Note that the extended
code of C equals the intersection of the code $\{(c_0, c_1, \dots, c_n) \in \mathbb{F}_q^{n+1};\ (c_1, \dots, c_n) \in C\}$ and
of the parity code $\{(c_0, c_1, \dots, c_n) \in \mathbb{F}_q^{n+1};\ \sum_{i=0}^{n} c_i = 0\}$.
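The three operations can be sketched in a few lines for binary codes stored as sets of tuples. The function names are ours and positions are indexed from 0, an assumption of this sketch; over $\mathbb{F}_2$ the added coordinate $-(c_1 + \dots + c_n)$ is simply the sum of the bits modulo 2.

```python
def puncture(code, i):
    """Delete coordinate i in every codeword."""
    return {c[:i] + c[i + 1:] for c in code}


def shorten(code, i):
    """Keep the codewords whose coordinate i is null, then delete that coordinate."""
    return {c[:i] + c[i + 1:] for c in code if c[i] == 0}


def extend_binary(code):
    """Prepend c_0 = c_1 + ... + c_n (mod 2) to every codeword."""
    return {(sum(c) % 2,) + c for c in code}


parity_code_3 = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}

punctured = puncture(parity_code_3, 0)
shortened = shorten(parity_code_3, 0)
extended = extend_binary({(0, 0), (0, 1), (1, 0), (1, 1)})
```

In this example, extending the full code of length 2 gives back the parity code of length 3.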

1.2.2 Parameters of a code


Sending words of length n over the channel instead of words of length k slows down the
transmission of information in the ratio of $\frac{k}{n}$. This ratio, called the transmission rate, must
be as high as possible, for a given correction capacity, to make fast communication possible.
As we see, the three important parameters of a code C are n, k, d (or equivalently n, |C|, d
since if q is the alphabet’s size, we have $|C| = q^k$), and the first aim⁹ of algebraic coding is to
find codes minimizing n, maximizing k, and maximizing d, for diverse ranges of parameters
corresponding to the needs of communication (see tables of best-known codes in [570]). It is
easily seen that k ≤ n − d + 1 (this inequality, valid for any code over any alphabet, is called
the Singleton bound) since erasing the coordinates of all codewords at d − 1 fixed positions
gives a set of $q^k$ distinct vectors of length n − d + 1, where q is the size of the alphabet, and
the number of all vectors of length n − d + 1 equals $q^{n-d+1}$. Codes achieving the Singleton
bound with equality are called maximum distance separable (MDS). In the case of binary
linear codes (see below), it can be shown by using the Pless identities (see, e.g., [348]) that
MDS codes have dimension at most 1 or at least n − 1 and, except for such codes, the bound
becomes then k ≤ n − d.

⁹ The second aim is to find decoding algorithms for the codes found.

Another important parameter is the covering radius, which is the smallest integer ρ such
that the spheres of (Hamming) radius ρ centered at the codewords cover the whole space. In
other words, it is the minimal integer ρ such that every vector of length n lies at Hamming
distance at most ρ from at least one codeword, that is, the maximum number of errors to
be corrected when maximum likelihood decoding (see page 6) is used. The book [375] is
devoted to its study.
The sphere-covering bound is the lower bound on the covering radius ρ, which expresses
that, by definition, the balls $B(x, \rho) = \{y \in A^n;\ d_H(x, y) \le \rho\}$, of radius ρ and centered at
codewords, cover the whole space $A^n$:
$$|C| \sum_{i=0}^{\rho} \binom{n}{i} (q - 1)^i \ge q^n.$$
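For a very small code, the covering radius can be computed directly from its definition and the sphere-covering bound checked numerically. The sketch below is an exhaustive search over all vectors of length n, with names of our own choosing, and is meant as an illustration only; the example is the binary repetition code of length 5.

```python
from itertools import product
from math import comb


def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))


def covering_radius(code, n, alphabet=(0, 1)):
    """Largest distance from a vector of the whole space to its nearest codeword."""
    return max(
        min(hamming_distance(y, c) for c in code)
        for y in product(alphabet, repeat=n)
    )


n = 5
repetition_code = [(0,) * n, (1,) * n]
rho = covering_radius(repetition_code, n)

# Left-hand side of the sphere-covering bound for q = 2.
covered = len(repetition_code) * sum(comb(n, i) for i in range(rho + 1))
```

For this code the bound is met with equality, which is consistent with the code being perfect.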

1.2.3 Linear codes


The general class of linear codes gives a simple and wide example of codes and how they
can be used in error correction.

Definition 1 A code is called a linear code if its alphabet is a finite field $\mathbb{F}_q$ (where q is a
power of a prime) and if it has the structure of an $\mathbb{F}_q$-linear subspace of $\mathbb{F}_q^n$, where n is its
length (see [809]).

A code that is not necessarily linear is called an unrestricted code. The minimum
distance of a linear code equals the minimum Hamming weight of all nonzero codewords,
since the Hamming distance between two vectors equals the Hamming weight of their
difference. We shall write that a linear code¹⁰ over $\mathbb{F}_q$ is an $[n, k, d]_q$-code (and if the value
of q is clear from the context, an [n, k, d]-code) if it has length n, dimension k, and minimum
distance d. The translates of a linear code are called its cosets and the elements of minimum
Hamming weight in these cosets are called coset leaders (there may exist several in some
cosets).

¹⁰ The square brackets around n, k, d specify that the code is linear, contrary to standard parentheses.

Generator matrix
Any linear code can be described by a generator matrix G, obtained by choosing a basis of
this vector space and writing its elements as the rows of this matrix. The code equals the set
of all the vectors of the form $u \times G$, where u ranges over $\mathbb{F}_q^k$ (and × is the matrix product)
and a possible encoding algorithm is therefore the mapping $u \in \mathbb{F}_q^k \mapsto u \times G \in \mathbb{F}_q^n$. When
the codeword corresponding to a given source vector u is obtained by inserting so-called
parity check coordinates in the source vector (whose coordinates are then called information
coordinates), the code is called systematic (it equals then the graph $\{(x, F(x));\ x \in \mathbb{F}_q^k\}$ of a
function, up to coordinate permutation). The corresponding generator matrix is then called a
systematic generator matrix and has the form $[I_k : M]$, where $I_k$ is the k × k identity matrix,
up to column permutation. It is easily seen that every linear code has such a generator matrix:
any generator matrix (of rank k) has k linearly independent columns, and if we place these
columns at the k first positions, we obtain $G = [A : M]$, where A is a nonsingular k × k
matrix; then $A^{-1} \times G = [I_k : A^{-1} \times M]$ is a systematic generator matrix of the permuted
code (since the multiplication by $A^{-1}$ transforms a basis of the permuted code into another
basis of the permuted code).
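The argument above can be turned into a small procedure for binary codes: row reduction plays the role of the multiplication by $A^{-1}$, and the pivot columns are moved to the first k positions. The following is a minimal sketch under our own conventions (matrices as lists of rows of bits; the name `systematic_form` and the returned column order are illustrative choices, not notation from the text).

```python
def systematic_form(G):
    """Return (G_sys, order) where G_sys = [I_k : M] generates the code
    whose coordinates have been permuted according to `order`."""
    rows = [row[:] for row in G]
    k, n = len(rows), len(rows[0])
    pivots = []
    r = 0
    for col in range(n):
        # Look for a row, at or below row r, with a 1 in this column.
        pivot_row = next((i for i in range(r, k) if rows[i][col]), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        # Clear the column in all the other rows (addition over F_2 is xor).
        for i in range(k):
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
        if r == k:
            break
    if r < k:
        raise ValueError("the generator matrix must have rank k")
    order = pivots + [c for c in range(n) if c not in pivots]
    return [[row[c] for c in order] for row in rows], order


G = [[1, 1, 0, 1],
     [1, 1, 1, 0]]
G_sys, order = systematic_form(G)
```

In this example the first two columns of G are equal, so a column permutation is indeed needed before an identity block can appear.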

Dual code and parity check matrix


The generator matrix is well suited for generating the codewords, but it is not for checking
if a received word of length n is a codeword or not. A characterization of the codewords is
obtained thanks to the generator matrix H of the dual code
$C^\perp = \{x \in \mathbb{F}_q^n;\ \forall y \in C,\ x \cdot y = \sum_{i=1}^{n} x_i y_i = 0\}$
(such a matrix is called a parity check matrix and “·” is called the usual
inner product, or scalar product, in $\mathbb{F}_q^n$): we have x ∈ C if and only if $x \times H^t$ is the null vector.
Consequently, the minimum distance of any linear code equals the minimum number of
$\mathbb{F}_q$-linearly dependent columns in one of its parity check matrices (any one). For instance,
the binary Hamming code of length $n = 2^m - 1$, which has by definition for parity check
matrix the $m \times (2^m - 1)$ binary matrix whose columns are all the nonzero vectors of
$\mathbb{F}_2^m$ in some order, has minimum distance 3. This code, which by definition is unique up
to equivalence, has played an important historical role since it is the first perfect code
found. It still plays a role since many computers use it to detect errors in their internal
communications. It is the basis on which BCH and Reed–Muller codes were built (see
pages 10 and 151). It depends on the choice of the order, but we say that two codes over
$\mathbb{F}_q$ are equivalent codes if they are equal, up to some permutation of the coordinates of
their codewords (and, for nonbinary codes, to the multiplication of each coordinate in each
codeword by a nonzero element of $\mathbb{F}_q$ depending only on the position of this coordinate).
Note that such codes have the same parameters.
The dual of the binary Hamming code is called the simplex code. A generator matrix of
this code being the parity check matrix of the Hamming code described above, and the rows
of this matrix representing then the coordinate functions in $\mathbb{F}_2^m$ (sometimes called dictator
functions), on which the order chosen for listing the values is given by the columns of the
matrix, the codewords of the simplex code are the lists of values taken on $\mathbb{F}_2^m \setminus \{0_m\}$ by all
linear functions.
Note that the dual of a linear code C permuted by some bijection over the indices equals
$C^\perp$ permuted by this same bijection, and that, if $G = [I_k : M]$ is a systematic generator
matrix of a linear code C, then $[-M^t : I_{n-k}]$ is a parity check matrix of C, where $M^t$ is the
transposed matrix of M.
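The passage from a systematic generator matrix to a parity check matrix can be checked on a small binary example. In the sketch below (function names ours), the matrix G is one possible systematic generator matrix of a binary Hamming code of length 7; this particular choice of M is an assumption made for illustration. Over $\mathbb{F}_2$ the sign in $-M^t$ can be dropped.

```python
def parity_check_from_systematic(G):
    """Build H = [-M^t : I_{n-k}] from G = [I_k : M]; over F_2, -M^t = M^t."""
    k, n = len(G), len(G[0])
    M = [row[k:] for row in G]
    H = []
    for j in range(n - k):
        transposed_column = [M[i][j] for i in range(k)]
        identity_row = [1 if c == j else 0 for c in range(n - k)]
        H.append(transposed_column + identity_row)
    return H


def syndrome(word, H):
    """Compute word x H^t over F_2."""
    return [sum(w * h for w, h in zip(word, row)) % 2 for row in H]


G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = parity_check_from_systematic(G)
columns_of_H = {tuple(row[j] for row in H) for j in range(7)}
```

Every row of G has a null syndrome, and the columns of H are the seven nonzero vectors of $\mathbb{F}_2^3$, as expected for a Hamming code.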
The linear codes that are supplementary with their duals (or equivalently that have
trivial intersection with their duals since the dimensions of a code and of its dual are
complementary to n) are called linear complementary dual (LCD) codes and will play an important
role in Subsection 12.1.5.

The advantages of linearity


Linearity considerably simplifies some main issues about codes. Firstly, the
minimum distance being equal to the minimum nonzero Hamming weight, computing it
(if it cannot be determined mathematically) needs only to visit $q^k - 1$ codewords instead of
$\frac{q^k (q^k - 1)}{2}$ pairs of codewords. Secondly, the knowledge of the code is provided by a k × n
generator matrix and needs then the description of k codewords instead of all $q^k$ codewords.
Thirdly, a general decoding algorithm is valid for every linear code. This algorithm is not
efficient in general, but it gives a framework for the efficient decoding algorithms that will
have to be found for each class of linear codes. The principle of this algorithm is as follows:
let y be the (known) received vector corresponding to the (unknown) sent codeword x. We
assume that there have been at most d − 1 errors of transmission, where d is the minimum
distance, if the code is used for error detection, and at most e errors of transmission, where
$e = \lfloor (d - 1)/2 \rfloor$ is the correcting capacity of the code, if the code is used for error correction.
The error detection is made by checking if the so-called syndrome $s = y \times H^t$ is the zero
vector. If it is not, then denoting by $\epsilon$ the so-called (unknown) error vector $\epsilon = y - x$,
correcting the errors of transmission is equivalent to determining $\epsilon$. This can be done by
visiting all vectors z of Hamming weight at most e in $\mathbb{F}_q^n$ and checking if $z \times H^t = s$
(indeed, by linearity of matrix multiplication, the syndrome of the error vector equals the
syndrome of the received vector, which is known). There exists a unique z of Hamming
weight at most e in $\mathbb{F}_q^n$ such that $z \times H^t = s$; this unique z equals $\epsilon$.
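The general decoding principle just described can be sketched for binary codes as follows. The function names are ours, the search is exhaustive over all error patterns of weight at most e (so it is an illustration, not an efficient decoder), and the parity check matrix H below is assumed to be that of a binary Hamming code of length 7, for which e = 1.

```python
from itertools import combinations

H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def syndrome(word, H):
    """Compute word x H^t over F_2."""
    return [sum(w * h for w, h in zip(word, row)) % 2 for row in H]


def decode(received, H, e):
    """Find z of weight at most e with the same syndrome as `received`, return received - z."""
    n = len(received)
    s = syndrome(received, H)
    for weight in range(e + 1):
        for support in combinations(range(n), weight):
            z = [1 if i in support else 0 for i in range(n)]
            if syndrome(z, H) == s:
                return [(y - zi) % 2 for y, zi in zip(received, z)]
    return None  # more than e errors occurred


sent = [1, 0, 0, 0, 1, 1, 0]          # a codeword: its syndrome is zero
received = sent.copy()
received[3] ^= 1                      # one error of transmission
corrected = decode(received, H, e=1)
```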

Concatenating codes
Given an $\mathbb{F}_q$-linear [n, k, d] code C (where n is the length, k is the dimension, and d
is the minimum distance), where $q = 2^e$, e ≥ 2, a binary $[n', e, d']$ code $C'$ and an
$\mathbb{F}_2$-isomorphism $\phi : \mathbb{F}_q \to C'$, the concatenated code $C''$ equals the $[nn', ke, d'' \ge dd']$
binary code $\{(\phi(c_1), \dots, \phi(c_n));\ (c_1, \dots, c_n) \in C\}$. Codes C and $C'$ are respectively called
outer code and inner code for this construction.

MDS linear codes


Let C be an [n, k, d] code over a field K, let H be its parity check matrix, and G its
generator matrix. Then n − k is the rank of H , and we have then d ≤ n − k + 1 since
n − k + 1 columns of H are always linearly dependent and therefore any set of indices of
size n − k + 1 contains the support of a nonzero codeword. This proves again the Singleton
bound: d ≤ n − k + 1.
Recall that C is called MDS if d = n − k + 1. The following are the properties of MDS
linear codes:
1. C is MDS if and only if each set of n − k columns of H has rank n − k.
2. If C is MDS, then C ⊥ is MDS.
3. C is MDS if and only if each set of k columns of G has rank k (and their positions
constitute then an information set; see page 161).

Other properties of linear codes


Puncturing, shortening, and extending codes preserve their linearity. Puncturing preserves
the MDS property (if n > k).
The following lemma will be useful when dealing with Reed–Muller codes in Chapter 4.

Lemma 1 Let C be a linear code of length n over $\mathbb{F}_q$ and $\hat{C}$ its extended code. We have
$\hat{C}^\perp = \{(y_0, \dots, y_n) \in \mathbb{F}_q^{n+1};\ (y_1 - y_0, \dots, y_n - y_0) \in C^\perp\}$.

Proof We have
$$\hat{C}^\perp = \Big\{(y_0, \dots, y_n) \in \mathbb{F}_q^{n+1};\ \forall (x_1, \dots, x_n) \in C,\ y_0 \Big(-\sum_{i=1}^{n} x_i\Big) + \sum_{i=1}^{n} x_i y_i = 0\Big\}$$
$$= \Big\{(y_0, \dots, y_n) \in \mathbb{F}_q^{n+1};\ \forall (x_1, \dots, x_n) \in C,\ \sum_{i=1}^{n} x_i (y_i - y_0) = 0\Big\}$$
$$= \{(y_0, \dots, y_n) \in \mathbb{F}_q^{n+1};\ (y_1 - y_0, \dots, y_n - y_0) \in C^\perp\}.$$

Uniformly packed codes: These codes will play a role with respect to almost perfect
nonlinear (APN) functions, on page 381.

Definition 2 [50] Let C be any binary code of length N, with minimum distance d = 2e + 1
and covering radius ρ. For any $x \in \mathbb{F}_2^N$, let us denote by $\zeta_j(x)$ the number of codewords
of C at distance j from x. The code C is called a uniformly packed code, if there exist real
numbers $h_0, h_1, \dots, h_\rho$ such that, for any $x \in \mathbb{F}_2^N$, the following equality holds:
$$\sum_{j=0}^{\rho} h_j \zeta_j(x) = 1.$$

As shown in [51], this is equivalent to saying that the covering radius of the code equals its
external distance (i.e., the number of different nonzero distances between the codewords of
its dual).

1.2.4 Cyclic codes


Two-error correcting Bose–Chaudhuri–Hocquenghem (BCH) codes
The binary Hamming code of length $n = 2^m - 1$ has dimension n − m and needs m parity
check bits for being able to correct 1 error. It happens that 2-error-correcting binary codes can
be built with 2m parity check bits. Let us denote by $W_1, \dots, W_n$ the nonzero binary vectors
of length m written as columns in some order. The parity check matrix of the Hamming code
of length $n = 2^m - 1$ is as follows:
$$H = [W_1, \dots, W_n].$$
To find a 2-error correcting code C of the same length, we consider the codes whose parity
check matrices $H'$ are the 2m × n matrices whose m first rows are those of H. These codes
being subcodes of the binary Hamming code, they are at least 1-error correcting. For each
such matrix $H'$, there exists a function F from $\mathbb{F}_2^m$ to itself such that:
$$H' = \begin{bmatrix} W_1 & W_2 & \dots & W_n \\ F(W_1) & F(W_2) & \dots & F(W_n) \end{bmatrix}.$$
Note that, when F is a permutation (i.e., is bijective), the code of generator matrix $H'$ is a
so-called double simplex code (and plays a central role in [136]); it is the direct sum of two
simplex codes: the standard one and its permutation by F.
Going back to general F, assume that two errors are made in the transmission of a
codeword of C, at indices i ≠ j. The syndrome of the received vector equals that of the
error vector, that is,
$$\begin{bmatrix} S_1 \\ S_2 \end{bmatrix} = \begin{bmatrix} W_i \\ F(W_i) \end{bmatrix} + \begin{bmatrix} W_j \\ F(W_j) \end{bmatrix},$$
with $S_1 \neq 0_m$ (where $0_m$ is the length m all-zero vector) since i ≠ j. We have then the
following:
$$\begin{cases} W_i + W_j = S_1 \neq 0_m \\ F(W_i) + F(W_j) = S_2. \end{cases}$$
The code is then 2-error correcting if and only if, for every $S_1, S_2 \in \mathbb{F}_2^m$ such that $S_1 \neq 0_m$,
this system of equations has either no solution (i, j) (which happens when $\begin{bmatrix} S_1 \\ S_2 \end{bmatrix}$ is not the
syndrome of an error vector of Hamming weight 2) or only two solutions (one solution if
we impose i < j).
Note that since $\{W_1, \dots, W_n\}$ equals $\mathbb{F}_2^m \setminus \{0_m\}$ and these vectors are all distinct, it is
equivalent to consider the system
$$\begin{cases} x + y = S_1 \neq 0_m \\ F(x) + F(y) = S_2, \end{cases}$$
where x and y range over $\mathbb{F}_2^m \setminus \{0_m\}$. This is where finite fields of orders larger than 2
played a historical role in coding theory (see Appendix, page 480, for a description of finite
fields): considering such functions F and such systems of equations is easier when we have
a structure of field (even though the equations do not involve multiplications). This allows us
indeed to take F(x) in a polynomial form, and the first polynomials to be tried are of course
monomials. The monomials $x$ and $x^2$, being linear functions, do not satisfy the condition
needed for the code to be 2-error correcting, but the next monomial $x^3$ does satisfy it (this
is easily seen since $x^3 + y^3 = (x + y)^3 + xy(x + y)$ implies that the system is equivalent to
$$\begin{cases} x + y = S_1 \neq 0 \\ xy = \dfrac{S_2 + S_1^3}{S_1}, \end{cases}$$
and such a system results in an equation of degree 2, which has at
most two solutions over a finite field).
The condition on F (or more precisely on its extension by taking F (0) = 0) is equivalent
to saying that it is an APN function. This notion plays a very important role in cryptography;
see Chapter 11, page 369.
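The condition on F can be tested exhaustively in a small field. The sketch below works in $\mathbb{F}_{2^4}$, represented by the integers 0 to 15 with multiplication modulo the polynomial $X^4 + X + 1$; this representation, the value m = 4 and the function names are assumptions made for illustration. It counts, for each pair $(S_1, S_2)$, the ordered pairs (x, y) of distinct nonzero elements solving the system.

```python
from collections import Counter


def gf_mul(a, b, m=4, modulus=0b10011):
    """Multiply two elements of F_{2^m}, with bits as coefficients of polynomials in X."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= modulus      # reduce modulo X^4 + X + 1
    return result


def cube(x):
    return gf_mul(x, gf_mul(x, x))


def square(x):
    return gf_mul(x, x)


def max_solutions(F, m=4):
    """Largest number of ordered pairs (x, y), x != y nonzero, giving the same (S1, S2)."""
    counts = Counter()
    for x in range(1, 2 ** m):
        for y in range(1, 2 ** m):
            if x != y:
                counts[(x ^ y, F(x) ^ F(y))] += 1
    return max(counts.values())


solutions_for_cube = max_solutions(cube)
solutions_for_square = max_solutions(square)
```

With $x^3$ no pair $(S_1, S_2)$ is reached by more than two ordered solutions, whereas the linear monomial $x^2$ fails this test.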
We need here the notion of primitive element; see page 487. Such an element α satisfies
$\mathbb{F}_{2^n} = \{0, 1, \alpha, \alpha^2, \dots, \alpha^{2^n - 2}\}$ and exists for every n.

Definition 3 Let α be a primitive element of $\mathbb{F}_{2^m}$. The binary 2-error correcting BCH code
of length $n = 2^m - 1$ is the [n, n − 2m, 5] code due to Bose, Chaudhuri, and Hocquenghem,
of the following parity check matrix:
$$H = \begin{bmatrix} \alpha & \alpha^2 & \dots & \alpha^n \\ \alpha^3 & \alpha^6 & \dots & \alpha^{3n} \end{bmatrix}.$$

Ordering the elements of $\mathbb{F}_{2^m}^*$ as $\alpha, \alpha^2, \dots, \alpha^{n-1}, \alpha^n = 1$ (we could have also chosen
$1, \alpha, \alpha^2, \dots, \alpha^{n-1}$) implies a property that does not seem so important at first glance but
which played a central role in the history of codes and still plays such a role nowadays: the
code is (globally) invariant under cyclic permutations of the codeword coordinates. This
property, when added to the linearity of the code, confers on it a structure of principal
ideal, with very nice theoretical and practical consequences.

General cyclic codes


A linear code C of length n is a cyclic code if it is (globally) invariant under cyclic shifts
of the codeword coordinates (see [809, page 188]). For this, it is enough that it is invariant
under one of the primitive cyclic shifts, for instance:
$$(c_0, \dots, c_{n-1}) \mapsto (c_{n-1}, c_0, \dots, c_{n-2}).$$
Cyclic codes have been extensively studied in coding theory, because of their strong
properties.

Representation of codewords
Each codeword $(c_0, \dots, c_{n-1})$ is represented by the polynomial $c_0 + c_1 X + \dots + c_{n-1} X^{n-1}$,
viewed as an element of the quotient algebra $A = \mathbb{F}_q[X]/(X^n - 1)$ (each element of this
algebra is an equivalence class modulo $X^n - 1$, which will always be represented by its
unique element of degree at most n − 1, equal to the common remainder in the division by $X^n - 1$
of the polynomials constituting the class). We shall call $c_0 + c_1 X + \dots + c_{n-1} X^{n-1}$ the
polynomial representation of codeword $(c_0, \dots, c_{n-1})$. Then it is easily shown that C is
cyclic if and only if it is an ideal of $\mathbb{F}_q[X]/(X^n - 1)$, that is, it satisfies $fC \subseteq C$ for every
nonzero $f \in A$ (C being assumed linear, it is a subgroup of A).

Generator polynomial
The algebra $\mathbb{F}_q[X]/(X^n - 1)$ is a principal ideal ring. It is easily shown that any (linear) cyclic
nontrivial¹¹ code has a unique monic element g(X) (i.e., whose leading coefficient equals 1)
having minimal degree, which generates the ideal and is called the generator polynomial of
the code. In fact, g(X) is a generator of the code in the strong sense that every polynomial
of degree at most n − 1 is a codeword if and only if it is a multiple of g(X) in $\mathbb{F}_q[X]$ (which
implies that it is a multiple of g(X) in $\mathbb{F}_q[X]/(X^n - 1)$). The code equals then the set of all
those polynomials that include the zeros of g(X) (in the splitting field of g(X)) among their
own zeros. It is also easily seen that g(X) is a divisor of $X^n - 1$.
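As a small illustration of the role of the generator polynomial, the sketch below builds the binary cyclic code of length 7 generated by $g(X) = 1 + X + X^3$ (a divisor of $X^7 - 1$ over $\mathbb{F}_2$) as the set of multiples $u(X) g(X)$ with deg u < k, and checks its invariance under the cyclic shift. Polynomials are stored as coefficient tuples, lowest degree first; this convention and the function names are ours.

```python
from itertools import product


def poly_mul_mod(a, b, n):
    """Multiply two binary polynomials modulo X^n - 1."""
    result = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    result[(i + j) % n] ^= 1
    return tuple(result)


def cyclic_code(g, n):
    """All multiples u(X) g(X) with deg u < k = n - deg g."""
    k = n - (len(g) - 1)
    return {poly_mul_mod(u, g, n) for u in product((0, 1), repeat=k)}


def shift(c):
    """Cyclic shift (c_0, ..., c_{n-1}) -> (c_{n-1}, c_0, ..., c_{n-2})."""
    return c[-1:] + c[:-1]


g = (1, 1, 0, 1)                      # 1 + X + X^3
code = cyclic_code(g, 7)
is_cyclic = all(shift(c) in code for c in code)
min_weight = min(sum(c) for c in code if any(c))
```

The code obtained has 16 codewords and minimum weight 3, the parameters of the binary Hamming code of length 7.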

Zeros of the code


In our framework, the length will have the form $n = q^m - 1$ (we call such length a primitive
length). In such a case, since g(X) divides $X^n - 1$, the zeros of g(X) all belong to $\mathbb{F}_{q^m}^*$. The
generator polynomial having all its coefficients in $\mathbb{F}_q$, its zeros are of the form $\{\alpha^i,\ i \in I\}$
(where α is a primitive element of $\mathbb{F}_{q^m}$), where $I \subseteq \mathbb{Z}/n\mathbb{Z}$ is a union of cyclotomic classes
of q modulo $n = q^m - 1$ (and vice versa). The set I is called the defining set of the code.
The elements $\alpha^i$, $i \in I$ are called the zeros of the cyclic code, which has dimension n − |I|.
The elements $\alpha^i$, $i \in \mathbb{Z}/n\mathbb{Z} \setminus I$ are called the nonzeros of the cyclic code. The generator
polynomial of $C^\perp$ is the reciprocal of the quotient of $X^n - 1$ by g(X), and its defining set
therefore equals $\{n - i;\ i \in \mathbb{Z}/n\mathbb{Z} \setminus I\}$.

¹¹ That is, it is different from $\{0_n\}$; in fact, we shall consider that the trivial cyclic code has also a generator
polynomial: $X^n - 1$ itself.

McEliece’s theorem [833] states that a binary cyclic code is exactly $2^l$-divisible (that is,
l is the maximum such that all codeword Hamming weights are divisible by $2^l$) if and only
if l is the smallest number such that l + 1 nonzeros of C (with repetitions allowed) have
product 1 (and recall that $\alpha^j = 1$ if and only if $2^m - 1$ divides j).

Generating all cyclic codes of some primitive length


Since a polynomial over $\mathbb{F}_q$ is the generator polynomial of a cyclic code of length n if
and only if it divides $X^n - 1$, we obtain all cyclic codes from all the divisors of $X^n - 1$
over $\mathbb{F}_q$. Any such divisor is the product of some irreducible factors of $X^n - 1$ over $\mathbb{F}_q$. These
irreducible factors are the polynomials of the form $\prod_{j \in C} (X - \alpha^j)$, where C is a cyclotomic
class of q modulo n. The number of cyclic codes of length n over $\mathbb{F}_q$ is then $2^r$, where
r is the number of these cyclotomic classes (including the trivial cyclic code $\{0_n\}$ and the
full one $\mathbb{F}_q^n$). The Hamming code has for generator polynomial the irreducible polynomial
corresponding to the cyclotomic class containing 1. Its dual, the simplex code, has then for
generator polynomial the polynomial corresponding to all cyclotomic classes except that
of n − 1.
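The cyclotomic classes of q modulo n, and hence the number $2^r$ of cyclic codes, are easy to enumerate. The sketch below is an illustration with a function name of our own; it assumes that n and q are coprime, so that multiplication by q permutes $\mathbb{Z}/n\mathbb{Z}$.

```python
def cyclotomic_classes(q, n):
    """Partition Z/nZ into the orbits of the multiplication by q (n and q coprime)."""
    remaining = set(range(n))
    classes = []
    while remaining:
        start = min(remaining)
        cls = []
        j = start
        while j not in cls:
            cls.append(j)
            j = (j * q) % n
        classes.append(cls)
        remaining -= set(cls)
    return classes


classes_7 = cyclotomic_classes(2, 7)
classes_15 = cyclotomic_classes(2, 15)
number_of_cyclic_codes_7 = 2 ** len(classes_7)
number_of_cyclic_codes_15 = 2 ** len(classes_15)
```

For q = 2 and n = 7 there are three classes, hence eight binary cyclic codes of length 7, the trivial and the full codes included.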

Nonprimitive length
If the length is not primitive, the zeros of $X^n - 1$ live in its splitting field $\mathbb{F}_{q^m}$ (where n
divides $q^m - 1$, and m is minimal). If n and q are coprime, the zeros of $X^n - 1$ are simple
since the derivative $nX^{n-1}$ of this polynomial does not vanish on them, and the same theory
applies by replacing $\mathbb{F}_{q^m}^*$ by the group of nth roots of unity in $\mathbb{F}_{q^m}$ and α by a primitive nth
root of unity.

BCH bound
A very efficient bound on the minimum distance of cyclic codes is the BCH bound [809,
page 201]: if I contains a “string” $\{l + 1, \dots, l + \delta - 1\}$ of length δ − 1 of consecutive¹²
elements of $\mathbb{Z}/n\mathbb{Z}$, then the cyclic code has minimum distance larger than or equal to δ
(which is then called the designed distance of the cyclic code). A proof of this bound (in the
framework of Boolean functions) is given in the proof of Theorem 23, page 337.

BCH codes
Let n be coprime with q and δ < n. The BCH codes of length n and designed distance
δ are the cyclic codes that have such a string of length δ − 1 in their zeros (and have then
minimum distance at least δ, according to the BCH bound) and maximal dimension (i.e.,
minimal number of zeros) with such a constraint.

Reed–Solomon codes
When n = q − 1, the cyclotomic classes of q modulo n are singletons and the set of zeros of
a cyclic code can then be any set of nonzero elements of the field (the generator polynomial
can be any divisor of $X^n - 1$); when it is constituted of consecutive powers of a primitive
element, this particular case of a BCH code is called a Reed–Solomon (RS) code. Such
codes are important because they achieve the Singleton bound with equality (i.e., they are
maximum distance separable, MDS). Indeed, an RS code of designed distance δ has δ − 1 zeros
and hence dimension k = n − (δ − 1); the BCH bound and the Singleton bound then give
δ ≤ d ≤ n − (n − (δ − 1)) + 1 = δ, and the Singleton bound is achieved with equality.

¹² Considering of course that 0 is the successor of n − 1 in $\mathbb{Z}/n\mathbb{Z}$.

Remark. There exists another equivalent definition of Reed–Solomon codes; see the
remark on page 45. RS codes are widely used in consumer electronics (CD, DVD, Blu-ray),
data transmission technologies (DSL, WiMAX), broadcast systems, computer applications,
and deep-space communications.

Extended Reed–Solomon codes


A cyclic code C of length n being given, recall that the extended code of C is the set of
vectors $(c_\infty, c_0, \dots, c_{n-1})$, where $c_\infty = -(c_0 + \dots + c_{n-1})$. It is a linear code of length n + 1
and of the same dimension as C. When C is a Reed–Solomon code whose defining set has
the form $\{1, 2, \dots, \delta - 1\}$, its extended code is also MDS, because when $(c_0, \dots, c_{n-1})$ is a
codeword of C of minimal Hamming weight δ, we have $c_\infty \neq 0$ (again according to the BCH
bound: if $c_\infty = 0$, then the polynomial $c_0 + c_1 X + \dots + c_{n-1} X^{n-1}$ has also $\alpha^0 = 1$ for zero
and has then Hamming weight at least δ + 1, thanks to the BCH bound applied with the string
$\{0, \dots, \delta - 1\}$). Hence, either $(c_0, \dots, c_{n-1})$ is a codeword of C of minimal Hamming weight
δ and then $(c_\infty, c_0, \dots, c_{n-1})$ has Hamming weight δ + 1 or $(c_0, \dots, c_{n-1})$ has Hamming
weight at least δ + 1 and $(c_\infty, c_0, \dots, c_{n-1})$ has a fortiori Hamming weight at least δ + 1.
Hence the minimum distance of the extended code is δ + 1 = (n + 1) − (n − δ + 1) + 1.
The extended code is MDS.

Cyclic codes and Boolean functions


Cyclic codes over $\mathbb{F}_2$ and of length $2^m - 1$ can be viewed as sets of m-variable Boolean
functions. Indeed, any codeword in such a cyclic code with defining set I can be represented in
the form $\sum_{i=1}^{l} tr_n(a_i x^{-u_i})$, $a_i \in \mathbb{F}_{2^m}$, where $u_1, \dots, u_l$ are representatives of the cyclotomic
classes lying outside I (see Relation (2.20) in Subsection 2.2.2, page 45).

1.2.5 The MacWilliams identity and the notion of dual distance


Linear codes
A nice relationship, due to F. J. MacWilliams [809, page 127], exists between the Hamming
weights in every binary linear code¹³ and those in its dual: let C be any binary linear code of
length n; consider the polynomial $W_C(X, Y) = \sum_{i=0}^{n} A_i X^{n-i} Y^i$, where $A_i$ is the number
of codewords of Hamming weight i. This polynomial is called the weight enumerator of C
and describes¹⁴ the weight distribution $(A_i)_{0 \le i \le n}$ of C. Then
$$W_C(X + Y, X - Y) = |C|\, W_{C^\perp}(X, Y). \tag{1.1}$$

¹³ It exists for every linear code over a finite field and even for more general codes, but we shall need it only for
binary codes.
¹⁴ $W_C$ is a homogeneous version of classical generating series for the weight distribution of C.

We give a sketch of proof¹⁵ of this MacWilliams’ identity: we observe first that
$W_C(X, Y) = \sum_{x \in C} \prod_{i=1}^{n} X^{1 - x_i} Y^{x_i}$; substituting X by X + Y and Y by X − Y, we deduce that
$$W_C(X + Y, X - Y) = \sum_{x \in C} \prod_{i=1}^{n} \big(X + (-1)^{x_i} Y\big).$$
We apply then the classical relation making it
possible to expand products of sums: for every $\lambda_1, \dots, \lambda_n, \mu_1, \dots, \mu_n$, we have
$$\prod_{i=1}^{n} (\lambda_i + \mu_i) = \sum_{b \in \mathbb{F}_2^n} \prod_{i=1}^{n} \big(\lambda_i^{1 - b_i} \mu_i^{b_i}\big)$$
(indeed, choosing $\lambda_i$ in the ith factor when $b_i = 0$ and $\mu_i$
when $b_i = 1$ provides, when b ranges over $\mathbb{F}_2^n$, all the possible terms in the expansion). This
gives here
$$W_C(X + Y, X - Y) = \sum_{x \in C} \sum_{b \in \mathbb{F}_2^n} \prod_{i=1}^{n} X^{1 - b_i} \big((-1)^{x_i} Y\big)^{b_i}.$$
We obtain then
$$W_C(X + Y, X - Y) = \sum_{b \in \mathbb{F}_2^n} X^{n - w_H(b)} Y^{w_H(b)} \sum_{x \in C} (-1)^{b \cdot x},$$
where “·” is the usual
inner product in $\mathbb{F}_2^n$, and we conclude by observing that, if $b \notin C^\perp$, then the linear form $b \cdot x$
over the vector space C is nonzero, and takes then value 0 on a hyperplane of C and value 1 on its
complement, that is, each value the same number of times, so that the inner sum is null (we will find
again this in Relation (2.38), page 58), while if $b \in C^\perp$, the inner sum equals |C|.
This proves Relation (1.1). Of course, we deduce that
$W_C(X, Y) = \frac{1}{|C^\perp|} W_{C^\perp}(X + Y, X - Y)$ and the same method
shows, as observed in [37], that for every coset a + C, we have
$$W_{a + C}(X, Y) = \frac{1}{|C^\perp|} \Big( 2\, W_{C^\perp \cap \{0_n, a\}^\perp}(X + Y, X - Y) - W_{C^\perp}(X + Y, X - Y) \Big).$$
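Relation (1.1) can be checked numerically on a small example. The sketch below (names ours) builds a binary code from a generator matrix, computes its dual by exhaustive search, and compares both sides of the identity at a grid of integer points; this is a sanity check on one example, not a proof. The matrix G is assumed to generate a binary Hamming code of length 7.

```python
from itertools import product


def span(generators, n):
    """All F_2-linear combinations of the given generators."""
    code = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        word = [0] * n
        for c, g in zip(coeffs, generators):
            if c:
                word = [a ^ b for a, b in zip(word, g)]
        code.add(tuple(word))
    return code


def dual(code, n):
    """All vectors orthogonal to every codeword (exhaustive search)."""
    return {
        y for y in product((0, 1), repeat=n)
        if all(sum(a * b for a, b in zip(x, y)) % 2 == 0 for x in code)
    }


def weight_enumerator(code, n):
    """Return W_C as a function (X, Y) -> sum over codewords of X^(n-w) Y^w."""
    def W(X, Y):
        return sum(X ** (n - sum(c)) * Y ** sum(c) for c in code)
    return W


n = 7
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
C = span(G, n)
C_dual = dual(C, n)
W_C, W_dual = weight_enumerator(C, n), weight_enumerator(C_dual, n)

identity_holds = all(
    W_C(X + Y, X - Y) == len(C) * W_dual(X, Y)
    for X in range(-3, 4)
    for Y in range(-3, 4)
)
```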
Remark. We have $|C| = \sum_{i=0}^{n} A_i = W_C(1, 1)$. The fact that the polynomial
$\frac{1}{W_C(1,1)} W_C(X + Y, X - Y)$ has nonnegative integer coefficients is very specific (among
all homogeneous polynomials P(X, Y) whose coefficients are nonnegative integers). As far
as we know, the characterization of all homogeneous polynomials P(X, Y) over $\mathbb{N}$ such that
$\frac{1}{P(1,1)} P(X + Y, X - Y)$ has nonnegative integer coefficients has never been investigated in
a paper.

Remark. The average Hamming weight of the codewords of a linear binary code
$C$ equals $(W_C)'_Y(1,1)$ (the value at $(1,1)$ of the partial derivative of $W_C(X,Y)$ with
respect to $Y$), divided by $|C|$. MacWilliams' identity writes
$W_C(X,Y) = \frac{1}{|C^\perp|}\,W_{C^\perp}(X+Y,X-Y)$. Differentiating with respect to $Y$ gives
$(W_C)'_Y(X,Y) = \frac{1}{|C^\perp|}(W_{C^\perp})'_X(X+Y,X-Y) - \frac{1}{|C^\perp|}(W_{C^\perp})'_Y(X+Y,X-Y)$
and thus
$(W_C)'_Y(1,1) = \frac{1}{|C^\perp|}(W_{C^\perp})'_X(2,0) - \frac{1}{|C^\perp|}(W_{C^\perp})'_Y(2,0) = \frac{n2^{n-1}}{|C^\perp|} - \frac{1}{|C^\perp|}(W_{C^\perp})'_Y(2,0)$,
and the average Hamming weight of
codewords equals $\frac{n}{2} - 2^{-n}(W_{C^\perp})'_Y(2,0)$, which depends on the number of words of
Hamming weight 1 in $C^\perp$ (see more in [809, page 131] on the moments of the weight
distribution of codes) and is bounded above by $\frac{n}{2}$. In fact, it is easily seen directly that the
average Hamming weight of codewords equals $\frac{n-r}{2}$, where $r$ is the number of positions
where all codewords are null, since if there is a codeword with 1 at position $i$, the average
value of codewords at position $i$ equals $\frac{1}{2}$.
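
The formula $\frac{n-r}{2}$ can be checked on a toy example. The sketch below is only an illustration under assumed data: the length-5 code and the helper names are hypothetical. It uses the fact that a word of Hamming weight 1 lies in $C^\perp$ exactly when all codewords of $C$ are null at the corresponding position, so $r$ is also the number of weight-1 words of $C^\perp$.

```python
from fractions import Fraction

def span(generators, n):
    """All codewords of the binary linear code spanned by the given rows."""
    code = {tuple([0] * n)}
    for g in generators:
        code |= {tuple(a ^ b for a, b in zip(x, g)) for x in code}
    return code

def null_positions(code, n):
    """Number r of positions where every codeword is 0."""
    return sum(all(x[i] == 0 for x in code) for i in range(n))

# Hypothetical example: a length-5 code whose last two positions are always 0.
G = [(1, 1, 0, 0, 0), (0, 1, 1, 0, 0)]
n = 5
C = span(G, n)

average = Fraction(sum(sum(x) for x in C), len(C))
r = null_positions(C, n)
print(average, Fraction(n - r, 2))
```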

Remark. Some authors call weight enumerator of $C$ the univariate polynomial
$A_C(Z) = \sum_{i=0}^n A_i Z^i$. MacWilliams' identity writes then
$(1+Z)^n A_C\left(\frac{1-Z}{1+Z}\right) = |C|\,A_{C^\perp}(Z)$, where
$n$ is the length of the binary code $C$.

15 The classical proof uses Fourier–Hadamard transform; since this transform will be addressed later in this
book, in Section 2.3, we give a proof more coding theory oriented.
16 Introduction to cryptography, codes, Boolean, and vectorial functions

The MacWilliams identity gives information on self-dual codes (i.e., codes equal to their
duals) through the Gleason theorem, which says that the weight enumerator of a self-dual
code is in the ring generated by $X^2 + Y^2$ and $XY - Y^2$ (see [809, page 602]).

Unrestricted codes
The principle of MacWilliams’ identity can also be applied to unrestricted codes. When C
is not linear, the weight distribution of C has no great relevance. The distance distribution
has more interest. We consider the distance enumerator of C:
$$D_C(X,Y) = \frac{1}{|C|}\sum_{i=0}^n B_i X^{n-i} Y^i,$$

where $B_i$ is the size of the set $\{(x,y)\in C^2;\ d_H(x,y) = i\}$. Note that, if $C$ is linear, then
$D_C = W_C$. Similarly as above, we see the following:
$$D_C(X,Y) = \frac{1}{|C|}\sum_{(x,y)\in C^2}\prod_{i=1}^n X^{1-(x_i\oplus y_i)}\,Y^{x_i\oplus y_i};$$

we deduce as follows:
$$D_C(X+Y,X-Y) = \frac{1}{|C|}\sum_{(x,y)\in C^2}\prod_{i=1}^n \big(X + (-1)^{x_i\oplus y_i}\,Y\big).$$

Expanding these products by the same method as above, we obtain the following:
$$D_C(X+Y,X-Y) = \frac{1}{|C|}\sum_{(x,y)\in C^2}\sum_{b\in\mathbb{F}_2^n}\prod_{i=1}^n X^{1-b_i}\big((-1)^{x_i\oplus y_i}\,Y\big)^{b_i};$$

that is,
$$D_C(X+Y,X-Y) = \frac{1}{|C|}\sum_{b\in\mathbb{F}_2^n} X^{n-w_H(b)}\,Y^{w_H(b)}\Big(\sum_{x\in C}(-1)^{b\cdot x}\Big)^2. \qquad (1.2)$$

Hence, $D_C(X+Y,X-Y)$ has nonnegative coefficients (but $D_C(X,Y)$ is not necessarily the
weight enumerator of a code; note, however, that it is one in the case of distance-invariant
codes, such as Kerdock codes; see Section 6.1.22).

Definition 4 The smallest nonzero exponent of $Y$ with nonzero coefficient in the polynomial
$D_C(X+Y,X-Y)$, that is, the number
$$\min\Big\{w_H(b);\ b \ne 0_n,\ \sum_{x\in C}(-1)^{b\cdot x} \ne 0\Big\},$$
often denoted by $d^\perp(C)$, is called the dual distance of $C$.
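
Definition 4 translates directly into a brute-force computation for short codes. The sketch below is a minimal illustration (the function name `dual_distance` and the two example codes are assumptions); it also shows, on one example, that a coset of a linear code, which is not itself linear, has the same dual distance as the code.

```python
from itertools import product

def dual_distance(code, n):
    """Smallest w_H(b), b != 0_n, such that the sum over x in code of (-1)^(b.x) is nonzero.

    Returns None when every nonzero b gives a zero sum (the code is the whole space
    repeated uniformly).
    """
    best = None
    for b in product([0, 1], repeat=n):
        weight = sum(b)
        if weight == 0:
            continue
        char_sum = sum((-1) ** (sum(bi & xi for bi, xi in zip(b, x)) % 2) for x in code)
        if char_sum != 0 and (best is None or weight < best):
            best = weight
    return best

# Even-weight code of length 3 (linear) and its coset of odd-weight words (not linear).
even_code = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}
odd_coset = {(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)}
print(dual_distance(even_code, 3), dual_distance(odd_coset, 3))
```

The exhaustive loop over all $b$ costs $2^n |C|$ operations, so this is only meant for small lengths.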

The dual distance of $C$ is strictly larger than an integer $t$ if and only if the restriction to
$C$ of any sum of at least one and at most $t$ coordinate functions in $\mathbb{F}_2^n$ is balanced (i.e., has
uniform distribution), that is, any of the punctured codes of length $t$ of $C$ equals the whole
vector space $\mathbb{F}_2^t$ and each vector in $\mathbb{F}_2^t$ is matched the same number of times.$^{16}$ Hence, as we
shall see again at page 88, the size of a code of dual distance $d$ is divisible by $2^{d-1}$; note that
for linear codes, this amounts to the Singleton bound applied to the dual.
This notion will play an important role with Boolean functions (see Definition 21, page
86; this is why we include Lemma 2 below) and with a recent kind of cryptanalysis that
plays an important role nowadays: side channel attacks (see Section 12.1, page 425).

Lemma 2 1. Any coset a + C of a binary unrestricted code has the same dual distance
as C. Any union of cosets of a linear code C has at least the same dual distance as C.
2. The dual distance of a punctured code is larger than or equal to the dual distance of the
original code (assuming that the latter has minimum distance at least 2).
3. The dual distance of the Cartesian product of two binary unrestricted codes equals the
minimum of their dual distances.
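4. Let $C_1$ and $C_2$ be binary unrestricted codes of the same length $n$, with dual distances
$d_1^\perp$ and $d_2^\perp$, and let
$C' = \{(c_1, c_1 + c_2);\ c_1 \in C_1,\ c_2 \in C_2\}$;

then the dual distance of $C'$ equals $d'^{\perp} = \min(d_1^\perp, 2\,d_2^\perp)$.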

The proof of this lemma is also an easy consequence of the properties of the Fourier–
Hadamard transform that we shall see in Section 2.3.

Remark. When $C$ is linear, $d^\perp$ equals the minimum distance of the dual code $C^\perp$. Hence,
since the minimum distance of a linear code over $\mathbb{F}_q$ equals the minimum nonzero number
of $\mathbb{F}_q$-linearly dependent columns in its parity check matrix, its dual distance equals the
minimum nonzero number of $\mathbb{F}_q$-linearly dependent columns in its generator matrix.

1.3 Boolean functions


We call Boolean functions (and sometimes we specify n-variable Boolean functions or
Boolean functions in dimension n) the (single-output) functions from the n-dimensional
vector space $\mathbb{F}_2^n$ over $\mathbb{F}_2$ to $\mathbb{F}_2$ itself. Their set is denoted by $\mathcal{BF}_n$. The number $n$ will be
called the number of variables, or of input bits. More generally,$^{17}$ we call n-variable
pseudo-Boolean functions the functions from $\mathbb{F}_2^n$ to $\mathbb{R}$.
Boolean functions will also be viewed in some cases as taking their input in the field $\mathbb{F}_{2^n}$.
Indeed, this field is an n-dimensional vector space over $\mathbb{F}_2$ and it can then be identified with
the vector space $\mathbb{F}_2^n$ through the choice of a basis.
Boolean functions play roles in both cryptographic and error-correcting coding activities
in information protection:

16 This is a consequence of the properties of the Fourier–Hadamard transform that we shall see in Section 2.3,
applied to the indicator of C; see Corollary 6, page 88, and Theorem 5.
17 When we will consider Boolean functions as particular pseudo-Boolean functions, by viewing their output
values 0 and 1 as elements of Z rather than F2 (for instance, when defining their numerical normal form in
Subsection 2.2.4 or their Fourier–Hadamard transform in Section 2.3), adding their values will be made in Z,
with notation +; otherwise, it will be made modulo 2, with notation ⊕.
Table 1.1 Number of n-variable Boolean functions.

n                     4               5               6            7            8
$|\mathcal{BF}_n|$    $2^{16}$        $2^{32}$        $2^{64}$     $2^{128}$    $2^{256}$
$\approx$             $6\cdot 10^4$   $4\cdot 10^9$   $10^{19}$    $10^{38}$    $10^{77}$

– Every binary unrestricted code of length $2^n$, for some positive integer $n$, can be
interpreted as a set of Boolean functions, since every n-variable Boolean function can
be represented by its truth table (an ordering of the set of binary vectors of length $n$
being first chosen) and thus associated with a binary word of length $2^n$, and vice versa;
important codes (Reed–Muller, Kerdock codes; see Sections 4.1 and 6.1.22) can be
defined this way as sets of Boolean functions.
– The role of Boolean functions in conventional cryptography is even more important:
cryptographic transformations can be designed by appropriate composition of Boolean
functions.18
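
The identification between n-variable Boolean functions and binary words of length $2^n$ mentioned in the first item can be sketched as follows. The ordering chosen here (lexicographic order of the inputs) and the example function are assumptions made for illustration; any fixed ordering of $\mathbb{F}_2^n$ would do.

```python
from itertools import product

def truth_table(f, n):
    """Binary word of length 2^n listing the values of f over F_2^n, inputs in lexicographic order."""
    return [f(*x) for x in product([0, 1], repeat=n)]

def f(x1, x2, x3):
    # Illustrative 3-variable Boolean function: x1 x2 XOR x3.
    return (x1 & x2) ^ x3

word = truth_table(f, 3)
print(word)  # a binary word of length 2^3 = 8
```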

In both frameworks, n is rarely large, in practice:


– The error-correcting codes derived from n-variable Boolean functions have length $2^n$;
so, taking $n = 10$ already gives codes of length 1024.
– For reason of efficiency, the Boolean functions used in stream ciphers had about 10
variables until algebraic attacks were invented in 2003, and the number of variables is
now most often limited to at most 20, except when the functions are particularly fast to
compute.

Despite their low numbers of variables, the Boolean functions used in cryptography and
satisfying the desired conditions (see Section 3.1 below) cannot be determined or studied
by an exhaustive computer investigation: the number $|\mathcal{BF}_n| = 2^{2^n}$ of n-variable Boolean
functions is too large when $n \ge 6$. We give in Table 1.1 the values of this number
for $n$ ranging between 4 and 8.
Assume that visiting an n-variable Boolean function, and determining whether it has the
desired properties, requires one nanosecond ($10^{-9}$ seconds); then it would take millions
of hours to visit all functions in six variables, and hundreds of billions of times the age of the
universe to visit all those in seven variables. The number of eight-variable Boolean functions
approximately equals the number of atoms in the whole universe! We see that trying to find
functions satisfying the desired conditions by simply picking functions at random is also
impossible for these values of $n$, since visiting a nonnegligible part of all Boolean functions
in seven or more variables is not feasible, even when parallelizing. The study of Boolean
functions for constructing or studying codes or ciphers is essentially mathematical. But
clever computer investigation is very useful to imagine or to test conjectures, and sometimes
to generate interesting functions.
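
These orders of magnitude are easy to reproduce. The short computation below is a sketch under stated assumptions: one nanosecond per function as in the text, and an age of the universe taken as about 13.8 billion years, a figure that is not given in the text.

```python
NS_PER_HOUR = 3600 * 10**9
# Assumption: age of the universe of about 13.8 billion years (365.25-day years).
AGE_OF_UNIVERSE_NS = int(13.8e9 * 365.25 * 24) * NS_PER_HOUR

def number_of_boolean_functions(n):
    """|BF_n| = 2^(2^n): one output bit for each of the 2^n inputs."""
    return 2 ** (2 ** n)

# Values of Table 1.1, printed in scientific notation.
for n in range(4, 9):
    print(n, f"{number_of_boolean_functions(n):.1e}")

hours_for_six = number_of_boolean_functions(6) // NS_PER_HOUR
universe_ages_for_seven = number_of_boolean_functions(7) // AGE_OF_UNIVERSE_NS
print(hours_for_six, universe_ages_for_seven)
```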

18 Boolean functions play also a role in hash functions, but we shall not develop this aspect, for lack of space,
and in the inner protection of some chips.
Figure 1.1 Vernam cipher (diagram: on the sender side, Plaintext ⊕ Key gives Ciphertext; on the receiver side, Ciphertext ⊕ Key gives Plaintext).

Remark. Boolean functions play an important role in computational complexity theory,
with the notion of NP-complete decision problem (where “NP” stands for nondeterministic
polynomial time), for which satisfiability problems (in particular, the 3-SAT problem) are
central. These problems are related to representations of Boolean functions by disjunctive
and conjunctive normal forms, which do not ensure uniqueness and are not much used in
cryptography and error-correcting coding. We refer the reader interested in satisfiability
problems and in the related complexity theory of Boolean functions to [31, 81, 1117].

A nice site under construction at the moment this book is written can be found at the URL
http://boolean.h.uib.no/mediawiki.

1.3.1 Boolean functions and stream ciphers


Stream ciphers are based on the so-called Vernam cipher (see Figure 1.1) in which the
plaintext (a binary string of some length) is bitwise added to a (binary) secret key of
the same length, in order to produce the ciphertext. The Vernam cipher is also called the
one-time pad because a new random secret key must be used for every encryption. Indeed,
the bitwise addition of two ciphertexts corresponding to the same key equals the addition of
the corresponding plaintexts, which gives much information on these plaintexts when they
encode, for instance, natural language (it is often enough to recover both plaintexts, even when
one of them is reversed; some secret services and spies learned this at their own expense).
The Vernam cipher, which is the only known cipher offering unconditional security
(see [1034]) if the key is truly random and if it is changed for every new encryption, was
used for the communication between the heads of the USA and the USSR during the cold
war (the keys being carried by diplomats) and by some secret services.
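
The reason why the key must never be reused can be seen on a small example. The following sketch works on bytes rather than single bits, and the two messages are of course hypothetical; it only illustrates that the sum of two ciphertexts produced with the same key equals the sum of the two plaintexts, whatever the key.

```python
import secrets

def xor_bytes(a, b):
    """Bitwise addition (XOR) of two byte strings of the same length."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

plaintext1 = b"ATTACK AT DAWN"
plaintext2 = b"RETREAT AT SIX"
key = secrets.token_bytes(len(plaintext1))  # random key, as long as the message

ciphertext1 = xor_bytes(plaintext1, key)
ciphertext2 = xor_bytes(plaintext2, key)    # same key used twice: the mistake

recovered = xor_bytes(ciphertext1, key)     # decryption is the same operation
leak = xor_bytes(ciphertext1, ciphertext2)  # the key cancels out
print(recovered, leak == xor_bytes(plaintext1, plaintext2))
```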
In practice (except in the very sensitive situations indicated above), since in the Vernam
cipher, the length of the private key must be equal to the length of the plaintext (which
is impractical), a so-called pseudorandom generator (PRG) is used for producing a
long pseudorandom sequence (the keystream, playing the role of the private key in the
Vernam cipher) from the short random secret key. Only the latter is actually shared.19
The unconditional security is then no longer ensured (this is the price to pay for making
the cipher lighter). If the keystream only depends on the key (and not on the plaintext), the

19 The PRG is supposed to be public since taking a part of the secret for describing it would reduce in practice
the length of the key.
Figure 1.2 LFSR (diagram: a shift register containing $s_{n-1}, \dots, s_{n-L+1}, s_{n-L}$; its contents are multiplied by $c_1, \dots, c_{L-1}, c_L$ and added to give $s_n$, which is fed back into the register).

cipher is called synchronous.20 Stream ciphers, because they operate on data units as small
as a bit or a few bits, are suitable for fast telecommunication applications. Having also a very
simple construction, they are easily implemented both in hardware and software. They need
to resist all known attacks (see in Section 3.1 those that are known so far). The so-called
attacker model for these attacks (that is, the description of the knowledge the attacker is
supposed to have) is as follows: some knowledge on the plaintext may be unavoidable and it
is then assumed that the attacker has access to a small part of it. Since the keystream equals
the XOR of the plaintext and the ciphertext, the attacker is then assumed to have access to a
part of the keystream, and he/she needs to reconstruct the whole sequence.
A first method for generating pseudorandom sequences from secret keys has used linear
feedback shift registers (LFSR) [550]. In such an LFSR (see Figure 1.2, where × means
multiplication), at every clock cycle, the bits $s_{n-1}, \dots, s_{n-L}$ contained in the flip-flops of
the LFSR move to the right. The rightmost bit is the current output (a keystream of length
$N$ will then be produced after $N$ clock cycles) and the leftmost flip-flop is fed with the linear
combination $\bigoplus_{i=1}^{L} c_i s_{n-i}$, where the $c_i$'s are bits. Thus, such an LFSR outputs a recurrent
sequence satisfying the relation

$$s_n = \bigoplus_{i=1}^{L} c_i s_{n-i}.$$
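
The recurrence above can be simulated in a few lines. This is a minimal sketch: the function name `lfsr` and the chosen example (length $L = 4$, recurrence $s_n = s_{n-1} \oplus s_{n-4}$, nonzero initialization) are illustrative assumptions. The sequence is stored as a list $s_0, s_1, \dots$ rather than as the content of a shifting register, which is equivalent for the output.

```python
def lfsr(coeffs, state, count):
    """First `count` terms of the sequence s_n = c_1 s_{n-1} XOR ... XOR c_L s_{n-L}.

    coeffs = [c_1, ..., c_L] are the feedback coefficients,
    state  = [s_0, ..., s_{L-1}] is the initialization.
    """
    L = len(coeffs)
    s = list(state)
    while len(s) < count:
        n = len(s)
        s.append(sum(coeffs[i - 1] & s[n - i] for i in range(1, L + 1)) % 2)
    return s[:count]

# Example with L = 4 and s_n = s_{n-1} XOR s_{n-4}.
seq = lfsr([1, 0, 0, 1], [1, 0, 0, 0], 30)
print(seq[:15])
print(seq[15:30])
```

In this example the two printed halves coincide: the sequence has period $15 = 2^4 - 1$, the largest possible value for $L = 4$.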

Such a sequence is always ultimately periodic$^{21}$ (if $c_L = 1$, then it is periodic; we shall
assume that $c_L = 1$ in the sequel, because otherwise the same sequence can be output by an
LFSR of a shorter length, except for its first bits, and this can be exploited in attacks) with
period at most $2^L - 1$. The generating series $s(X) = \sum_{i\ge 0} s_i X^i$ of the sequence can be
expressed in a nice way (see the chapter by Helleseth and Kumar in [959] and Section 10.2,
“LFSR sequences and maximal period sequences”, by Niederreiter in [890]): $s(X) = \frac{G(X)}{F(X)}$,
where $G(X) = \sum_{i=0}^{L-1} X^i \sum_{j=0}^{i} c_{i-j}\, s_j$ (with the convention $c_0 = 1$) is a polynomial of degree smaller than $L$ and
$F(X) = 1 + c_1 X + \cdots + c_L X^L$ is the feedback polynomial (an equivalent representation

20 There also exist self-synchronizing stream ciphers, in which each keystream bit depends on the n preceding
ciphertext bits, which makes possible resynchronizing after n bits if an error of transmission occurs between
Alice and Bob.
21 Conversely, every ultimately periodic sequence can be generated by an LFSR.
uses the characteristic polynomial, which is the reciprocal of the feedback polynomial). The
minimum length of the LFSR producing a sequence is called the linear complexity of the
sequence (and sometimes its linear span). It equals $L$ if and only if the polynomials $F$ and
$G$ above are coprime and is equal in general to $N - \deg(\gcd(X^N + 1, S(X)))$, where $N$ is
a period and $S(X)$ is the generating polynomial $S(X) = s_0 + s_1 X + \cdots + s_{N-1}X^{N-1}$. An
m-sequence (or maximum length sequence) is a sequence of period $2^L - 1$, where $L$ is its
linear complexity. When the linear complexity equals the length of the LFSR, this corresponds
to taking a primitive feedback
polynomial (see page 488). The sequence can then be represented in the form $s_i = tr_n(a\alpha^i)$,
where $\alpha$ is a primitive element of $\mathbb{F}_{2^n}$ (see page 487) and $tr_n$ is the trace function from $\mathbb{F}_{2^n}$
to $\mathbb{F}_2$ (see pages 42 and 489). The m-sequences have very strong properties; see the chapter
by Helleseth and Kumar in [959].
The initialization s0 , . . . , sL−1 of the LFSR and the values of the feedback coefficients ci
must be kept secret (they are then computed from the secret key); if the feedback coefficients
were public, the observation of L consecutive bits of the keystream would allow recovering
all the subsequent sequence.

Berlekamp–Massey attack
The use of LFSRs as pseudorandom generators is cryptographically weak because of an
attack found in the late 1960s called the Berlekamp–Massey (BM) algorithm [826]: let $L$
be the linear complexity of the sequence, assumed to be unknown to the attacker; if
he knows at least $2L$ consecutive bits of the sequence, the BM algorithm allows him to
recover the values of $L$ and of the feedback coefficients of an LFSR of length $L$ generating
the sequence, as well as the initialization of this LFSR. The BM algorithm has quadratic
complexity, that is, works in $O(L^2)$ elementary operations. Improvements of the algorithm
exist, which have lower complexity: the main idea$^{22}$ is to use the extended Euclidean (EE)
algorithm (or its variants). The way to use this algorithm is shown in the section “Linearly
recurrent sequences” (Section 12.3) of the book Modern Computer Algebra by J. von zur
Gathen and J. Gerhard [533] (Algorithm 12.9 in this book is essentially an EE algorithm).
The complexity of an EE algorithm being $O(M(L)\log(L))$, where $M(L)$ is the cost of the
multiplication between two polynomials of degree $L$, and this latter cost being quasilinear,
the complexity of finding the feedback polynomial of an LFSR is roughly $O(L\log(L))$.
The data complexity is still $2L$, but these $2L$ bits of the sequence do not need to be strictly
consecutive: having $k$ strings of $2L/k$ consecutive bits is enough, thanks to a matrix version
of the BM algorithm found by Coppersmith, coupled with an algorithm due to Beckermann
and Labahn, or with a simpler (and implemented) one due to Thomé; see more in [1085].
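
To make the attack concrete, here is a minimal sketch of the quadratic-time BM algorithm over $\mathbb{F}_2$, in its usual textbook form (it is not the fast EE-based variant mentioned above). The function names and the toy LFSR of length 4 are illustrative assumptions. From $2L = 8$ known keystream bits, the sketch recovers $L$ and the feedback coefficients, and then regenerates the whole sequence.

```python
def lfsr(coeffs, state, count):
    """First `count` terms of s_n = c_1 s_{n-1} XOR ... XOR c_L s_{n-L}."""
    L = len(coeffs)
    s = list(state)
    while len(s) < count:
        n = len(s)
        s.append(sum(coeffs[i - 1] & s[n - i] for i in range(1, L + 1)) % 2)
    return s[:count]

def berlekamp_massey(bits):
    """Return (L, [c_1, ..., c_L]) for a shortest LFSR generating `bits` over F_2."""
    n = len(bits)
    C = [1] + [0] * n   # current connection polynomial 1 + c_1 X + ... + c_L X^L
    B = [1] + [0] * n   # copy of C saved at the last length change
    L, m = 0, 1         # m = number of steps since the last length change
    for i in range(n):
        # Discrepancy between bits[i] and the prediction of the current LFSR.
        d = bits[i]
        for j in range(1, L + 1):
            d ^= C[j] & bits[i - j]
        if d:
            T = C[:]
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]          # C(X) <- C(X) + X^m B(X)
            if 2 * L <= i:                # the length must increase
                L, B, m = i + 1 - L, T, 0
        m += 1
    return L, C[1:L + 1]

# Toy target: the attacker sees 2L = 8 bits of an LFSR of (unknown) length 4.
secret_coeffs, secret_state = [1, 0, 0, 1], [1, 0, 0, 0]
keystream = lfsr(secret_coeffs, secret_state, 40)
observed = keystream[:8]

L, coeffs = berlekamp_massey(observed)
predicted = lfsr(coeffs, observed[:L], 40)
print(L, coeffs, predicted == keystream)
```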

The role of Boolean functions


Many keystream generators still use LFSRs, and to resist the Berlekamp–Massey attack,
either combine several LFSRs (and possibly some additional memory) as in the case of E0 ,
the keystream generator that is part of the Bluetooth standard, or use Boolean functions; see
[1006]. The first model that appeared in the literature for such use is the combiner model
(see Figure 1.3).

22 We thank Pierrick Gaudry for his kind explanations.


Figure 1.3 Combiner model (diagram: the outputs $x_1, x_2, \dots, x_n$ of LFSR 1, LFSR 2, ..., LFSR n are the inputs of $f$, whose output is $s_i$).

Notice that the feedback coefficients of the n LFSRs used in such a generator can be
public. The Boolean function is also public, in general, and the (short) secret key is necessary
only for the initialization of the n LFSRs (also depending on an initial vector, which, being
public, can be changed more often than the key): if we want to use for instance a 128-bit-long
secret key, this makes it possible to use n LFSRs of lengths $L_1, \dots, L_n$ such that
$L_1 + \cdots + L_n = 128$.
Such a system clearly outputs a periodic sequence whose period is at most the LCM of
the periods of the sequences output by the n LFSRs (assuming that $c_L = 1$ in each LFSR;
otherwise, the sequence is ultimately periodic and the period is shorter). So, this sequence
satisfies a linear recurrence and can therefore be produced by a single LFSR. However, as
we shall see, well-chosen Boolean functions allow the linear complexity of the sequence to
be much larger than the sum of the lengths of the n LFSRs. Nevertheless, choosing LFSRs
producing sequences of large periods, choosing these periods pairwise co-prime in order
to have the largest possible global period, and choosing f such that the linear complexity
is large enough too are not sufficient. As we shall see, the combining function should also
not leak information about the individual LFSRs and behave as differently as possible from
affine functions, in several different ways.
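
The following toy combiner illustrates the first part of this discussion. It is a sketch under explicit assumptions: three very short LFSRs (lengths 2, 3, 5, with periods 3, 7, 31) and the combining function $f(x_1,x_2,x_3) = x_1x_2 \oplus x_2x_3 \oplus x_3$, a classical textbook choice that is not taken from the text and that is precisely an example of a function leaking information about the individual LFSRs. The linear complexity is measured with the BM algorithm.

```python
def lfsr(coeffs, state, count):
    """First `count` terms of s_n = c_1 s_{n-1} XOR ... XOR c_L s_{n-L}."""
    L = len(coeffs)
    s = list(state)
    while len(s) < count:
        n = len(s)
        s.append(sum(coeffs[i - 1] & s[n - i] for i in range(1, L + 1)) % 2)
    return s[:count]

def linear_complexity(bits):
    """Length of a shortest LFSR generating `bits` (Berlekamp-Massey over F_2)."""
    n = len(bits)
    C, B = [1] + [0] * n, [1] + [0] * n
    L, m = 0, 1
    for i in range(n):
        d = bits[i]
        for j in range(1, L + 1):
            d ^= C[j] & bits[i - j]
        if d:
            T = C[:]
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            if 2 * L <= i:
                L, B, m = i + 1 - L, T, 0
        m += 1
    return L

def combiner(f, registers, count):
    """Keystream f(x_1, ..., x_n), where x_k is the output of the k-th LFSR."""
    streams = [lfsr(coeffs, state, count) for coeffs, state in registers]
    return [f(*xs) for xs in zip(*streams)]

def f(x1, x2, x3):
    # Illustrative combining function (an assumption): x1 x2 XOR x2 x3 XOR x3.
    return (x1 & x2) ^ (x2 & x3) ^ x3

# (feedback coefficients, initialization) of three LFSRs of lengths 2, 3 and 5.
registers = [([1, 1], [1, 0]),
             ([1, 0, 1], [1, 0, 0]),
             ([0, 1, 0, 0, 1], [1, 0, 0, 0, 0])]
period_bound = 3 * 7 * 31  # LCM of the three periods

keystream = combiner(f, registers, 2 * period_bound)
lc = linear_complexity(keystream)
print(lc, sum(len(c) for c, _ in registers))
```

In this toy setting the linear complexity obtained ($2\cdot 3 + 3\cdot 5 + 5 = 26$) exceeds the sum 10 of the lengths, while the keystream still repeats after 651 bits.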
The combiner model is only a model, useful for studying attacks and related criteria. In
practice, the systems are more complex (see for instance at URL www.ecrypt.eu.org/stream/
to see how the stream ciphers of the eSTREAM Project [495] are designed).
A more recent model is the filter model, which uses a single LFSR (of a longer length). A
filtered LFSR outputs f (x1 , . . . , xn ), where f is some n-variable Boolean function, called a
filtering function, and where x1 , . . . , xn are the bits contained in some flip-flops of the LFSR;
see Figure 1.4.
Such a system is equivalent to the combiner model using n copies of the LFSR. However,
the attacks, even when they apply to both systems, do not work similarly (a first obvious
difference is that the lengths of the LFSRs are different in the two models). Consequently,
the criteria that the involved Boolean functions must satisfy to allow resistance to these
attacks need to be studied for each model (we shall see that they are in practice not so
different, except for one criterion that will be necessary for the combiner model but not for
the filter model).
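
The equivalence between a filtered LFSR and a combiner using n copies of the same LFSR can be sketched as follows. All concrete choices here (the LFSR of length 5, the tap positions and the filtering function) are hypothetical and only serve the illustration.

```python
def lfsr(coeffs, state, count):
    """First `count` terms of s_n = c_1 s_{n-1} XOR ... XOR c_L s_{n-L}."""
    L = len(coeffs)
    s = list(state)
    while len(s) < count:
        n = len(s)
        s.append(sum(coeffs[i - 1] & s[n - i] for i in range(1, L + 1)) % 2)
    return s[:count]

def filtered_lfsr(f, coeffs, state, taps, count):
    """Output f(s_{i+t_1}, ..., s_{i+t_n}) for i = 0, ..., count - 1 (tap offsets smaller than L)."""
    s = lfsr(coeffs, state, count + len(coeffs))
    return [f(*(s[i + t] for t in taps)) for i in range(count)]

def f(x1, x2, x3):
    # Illustrative filtering function (an assumption): x1 x2 XOR x3.
    return (x1 & x2) ^ x3

coeffs, state, taps = [0, 1, 0, 0, 1], [1, 0, 0, 0, 0], (0, 2, 4)
out = filtered_lfsr(f, coeffs, state, taps, 62)

# Same keystream from the combiner model: one copy of the LFSR per tap,
# each copy being initialized with a shifted state.
s = lfsr(coeffs, state, 62 + len(coeffs))
copies = [lfsr(coeffs, s[t:t + len(coeffs)], 62) for t in taps]
combined = [f(*xs) for xs in zip(*copies)]
print(out == combined)
```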
Figure 1.4 Filter model (diagram: a single LFSR containing $s_{i+L-1}, \dots, s_{i+1}, s_i$; some of its bits $x_1, \dots, x_i, \dots, x_n$ are the inputs of $f(x_1, x_2, \dots, x_n)$, which gives the output).

Note that in both models, the PRG is made of a linear part (constituted by the LFSRs),
the linearity allowing speed, and a nonlinear part (made of the combiner/filter function)
providing confusion (see the meaning of this term in Section 3.1). Generalizations of the two
models have been proposed with the same structure “linear part, nonlinear part” [495, 901].
In practice, models will not be used as is; we shall add memory and/or a few combinatorial
stages and/or initialization registers; a high level of security is ensured by the fact that the
model, as is, is proved resistant to all known attacks, and the additional complexity will
make the work of the attacker still more difficult.
Other kinds of pseudorandom generators exist that are not built on the same principle.
A feedback shift register (FSR) has the same structure as an LFSR, but the leftmost flip-flop
is fed with $g(x_{i_1}, \dots, x_{i_n})$, where $n \le L$ and $x_{i_1}, \dots, x_{i_n}$ are bits contained in the
flip-flops of the FSR, and where $g$ is some n-variable Boolean function called the feedback
function (if $g$ is not affine, then we speak of NFSR, where N stands for nonlinear). The
linear complexity of the produced sequence can be near $2^L$ (see [640] for general FSRs
and [344] for FSRs with a quadratic feedback function; see the definition of “quadratic” at
page 36). Some finalists of the eSTREAM project [495] such as Grain and Trivium use
NFSRs. But the theory of NFSRs is not completely understood. The linear complexity
is difficult to study in general. Even the period is not easily determined, although some
special cases have been investigated [630, 702, 1045, 1046]. Nice results similar to those
on the m-sequences exist in the case of feedback with carry shift-registers (FCSRs); see
[30, 559, 560, 703].

1.3.2 Boolean functions and error-correcting codes


As explained above, every binary unrestricted code whose length equals $2^n$ for some positive
integer $n$ can be interpreted as a set of Boolean functions. A particular class of codes has
its very definition given by means of Boolean functions. This class is that of Reed–Muller
codes. We shall see in Chapter 2 that an integer lying between 0 and $n$ and called algebraic
degree can be associated to every Boolean function over $\mathbb{F}_2^n$. The Reed–Muller code of
order $k \in \{0, \dots, n\}$ is made of all Boolean functions over $\mathbb{F}_2^n$ whose algebraic degree is
bounded above by $k$; see Section 4.1. This code has length $2^n$ since each Boolean
function is identified with the list of its values over $\mathbb{F}_2^n$, in some order. It is linear and has nice
particularities, thanks to which Reed–Muller codes are still used nowadays, even if their
parameters are not very good, except for the first-order Reed–Muller code. The second-order
Reed–Muller code contains a nonlinear code, called the Kerdock code, which has minimum
distance almost the same as that of the first-order Reed–Muller code of the same length and
size roughly the square of its size. In fact, the parameters of the Kerdock code are so good
that they are provably optimal among all unrestricted codes; see Section 6.1.22.

1.4 Vectorial functions


The functions from $\mathbb{F}_2^n$ to $\mathbb{F}_2^m$ are called (n, m)-functions. Such a function $F$ being given,
the Boolean functions $f_1, \dots, f_m$ defined at every $x \in \mathbb{F}_2^n$ by $F(x) = (f_1(x), \dots, f_m(x))$
are called the coordinate functions of $F$. When the numbers $m$ and $n$ are not specified,
(n, m)-functions are called multioutput Boolean functions or vectorial Boolean functions. Those
vectorial functions whose role is to ensure confusion$^{23}$ in a cryptographic system are called
substitution boxes (S-boxes).
Note that (n, m)-functions can also be viewed as taking their input in $\mathbb{F}_{2^n}$ as we have
seen with Boolean functions, and if $m$ divides $n$, then we shall see that the output can then
be expressed as a polynomial function of the input. We shall be in particular interested in
power functions $F(x) = x^d$, $x \in \mathbb{F}_{2^n}$.

1.4.1 Vectorial functions and stream ciphers


In the pseudorandom generators of stream ciphers, (n, m)-functions can be used to combine
the outputs of n LFSRs or to filter the content of a single one, generating m bits at each clock
cycle instead of only one, which increases the speed of the cipher (but risks decreasing its
robustness). The attacks described about Boolean functions are obviously also efficient on
these kinds of ciphers. They are in fact often more efficient – see Section 3.3, page 129 –
since the attacker can combine in any way the m output bits of the function.

1.4.2 Vectorial functions and block ciphers


Vectorial functions play mainly a role with block ciphers. All known block ciphers are
iterative, that is, are the iterations of a transformation depending on a key over each block
of plaintext. The iterations are called rounds and the key used in an iteration is called a
round key. The round keys are computed from the secret key (called the master key) by a
key scheduling algorithm. The rounds consist of vectorial Boolean functions combined in
different ways and involve the round key.

Remark. Boolean functions also play an important role in block ciphers, each of which
admits as input a binary vector (x1 , . . . , xn ) (a block of plaintext) and outputs a binary
vector (y1 , . . . , ym ); the coordinates y1 , . . . , ym are the outputs of Boolean functions
(depending on the key) over (x1 , . . . , xn ); see Figure 1.5.
But the number n of variables of these Boolean functions being large (often more than
100), they are hardly analyzed precisely.

23 See Section 3.1 for the meaning of this term.


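Figure 1.5 Block cipher (diagram: the plaintext bits $x_1, \dots, x_n$ enter the encryption $E$, which depends on the key, and the ciphertext bits $y_1, \dots, y_m$ come out).

Figure 1.6 A DES round (diagram: blocks E, S and P and two additions, one of them involving the round key).

Figure 1.7 An AES round (diagram: sub-S-boxes $S_1, \dots, S_{16}$, followed by a linear permutation and by the addition of the round key).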

We give in Figures 1.6 and 1.7 a description of the rounds of the Data Encryption Standard
(DES) [88] and of the Advanced Encryption Standard (AES) [404].
The input to a DES round is a binary string of length 64, divided into two strings of
32 bits each (in the figure, they enter the round, from above, on the left and on the right);
confusion is achieved by the S-box, which is a nonlinear transformation of a binary string
of 48 bits$^{24}$ into a 32-bit-long one. So, 32 Boolean functions on 48 variables are involved.
But, in fact, this nonlinear transformation is the concatenation of eight sub-S-boxes, which
transform binary strings of six bits into 4-bit-long ones. Before entering the next round, the
two 32-bit-long halves of data are swapped. Such a Feistel cipher structure does not need the
involved vectorial functions (in particular the S-boxes) to be injective for the decryption to
be possible. Indeed, any function of the form $(x, y) \mapsto (y, x + \phi(y))$ is a permutation. The
number of output bits can be smaller than that of input bits, as in the DES; it can also be
larger, as in the CAST cipher [6], where the input dimension is eight and the output dimension
is 32. However, if the S-boxes are not balanced (that is, if their output is not uniform), this
represents a weakness against some attacks, and it obliges the designer to make the
structure more complex (for instance by including expansion boxes); see more in [957].
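
The fact that $(x, y) \mapsto (y, x + \phi(y))$ is a permutation whatever $\phi$ is can be checked exhaustively on a toy example. In the sketch below, the halves are 4 bits long and $\phi$ is a deliberately non-injective function invented for the illustration (it is not an S-box of any real cipher).

```python
def feistel_round(x, y, phi):
    """One Feistel round (x, y) -> (y, x XOR phi(y)) on two 4-bit halves."""
    return y, x ^ phi(y)

def inverse_feistel_round(u, v, phi):
    """Inverse round: from (u, v) = (y, x XOR phi(y)), recover (x, y) = (v XOR phi(u), u)."""
    return v ^ phi(u), u

def phi(y):
    # Hypothetical, deliberately non-injective toy function on 4-bit values.
    return (y * y + 3) % 4

pairs = [(x, y) for x in range(16) for y in range(16)]
images = {feistel_round(x, y, phi) for x, y in pairs}
invertible = all(inverse_feistel_round(*feistel_round(x, y, phi), phi) == (x, y)
                 for x, y in pairs)
print(len(images), invertible)
```

Decryption only needs to evaluate $\phi$ again, never to invert it, which is why no injectivity is required.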
In the (standard) AES round, the input is a 128-bit-long string, divided into 16 strings of
eight bits each; the S-box is the concatenation of 16 sub-S-boxes corresponding to 16 × 8
Boolean functions in eight variables. Such a substitution permutation network (SPN) needs
the vectorial functions (in particular the S-boxes) to be bijective, so that decryption is
possible. Then $n = m$. Another well-known example of such a cipher is PRESENT [100].
A third general structure for block ciphers is the ARX structure; see [708].

Remark. Klimov and Shamir [705] have identified a particular kind of vectorial functions
usable in stream and block ciphers (and in hash functions), called T-functions. These are
mappings $F$ from $\mathbb{F}_2^n$ to $\mathbb{F}_2^m$ such that each $i$th bit of $F(x)$ depends only on $x_1, \dots, x_i$.
For example, addition and multiplication in $\mathbb{Z}$, viewed in binary expansion, are T-functions;
logical operations (XOR and AND, that is, addition and multiplication in $\mathbb{F}_2$) are T-functions
too. Any composition of T-functions is a T-function as well. Their simplicity makes them
appealing for lightweight cryptography. But they may be too simple to provide enough
confusion; they have suffered attacks.
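
The defining property of T-functions can be tested exhaustively for small word sizes. The sketch below views $x$ as an n-bit integer whose least significant bit plays the role of $x_1$ (an assumed convention), and uses the equivalent formulation that $F(x) \bmod 2^{i}$ depends only on $x \bmod 2^{i}$ for every $i$. The map $x \mapsto x + (x^2 \vee 5) \bmod 2^n$ is used as an example of a composition of additions, multiplications and bitwise operations; the right shift is a simple counterexample.

```python
def is_t_function(F, n):
    """Check that the i lowest bits of F(x) depend only on the i lowest bits of x, for i = 1..n."""
    for i in range(1, n + 1):
        mask = (1 << i) - 1
        seen = {}
        for x in range(1 << n):
            low, out = x & mask, F(x) & mask
            if seen.setdefault(low, out) != out:
                return False
    return True

n = 8

def F1(x):
    # Composition of addition, multiplication and bitwise OR, reduced modulo 2^n.
    return (x + ((x * x) | 5)) % (1 << n)

def F2(x):
    # Right shift: the lowest output bit depends on the second input bit.
    return x >> 1

print(is_t_function(F1, n), is_t_function(F2, n))
```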

1.4.3 Vectorial functions and error-correcting codes


We shall see in Chapter 4 that interesting linear subcodes of the Reed–Muller codes and
other (possibly nonlinear) codes can be built from vectorial functions.

24 The E-box has expanded the 32-bit-long string into a 48-bit-long one.
2 Generalities on Boolean and vectorial functions

The set $\mathbb{F}_2^n$ of all binary vectors$^1$ of length $n$ will be viewed as an $\mathbb{F}_2$-vector space (with
null element $0_n$). This vector space will sometimes be also endowed with the structure
of the field $\mathbb{F}_{2^n}$ (denoted by $GF(2^n)$ by some authors), with null element 0; indeed, this
field being an n-dimensional vector space over $\mathbb{F}_2$, each of its elements can be identified
with the binary vector of length $n$ of its coordinates relative to a fixed basis. The set of
all Boolean functions $f : \mathbb{F}_2^n \to \mathbb{F}_2$ will be denoted by $\mathcal{BF}_n$. It is a vector space over
$\mathbb{F}_2$. The Hamming weight $w_H(x)$ of a binary vector $x \in \mathbb{F}_2^n$ being the number of its
nonzero coordinates (i.e., the size of $supp(x) = \{i \in \{1, \dots, n\};\ x_i \ne 0\}$, the support
of vector $x$), the Hamming weight $w_H(f)$ of a Boolean function $f$ on $\mathbb{F}_2^n$ is (also) the size
of $supp(f) = \{x \in \mathbb{F}_2^n;\ f(x) \ne 0\}$, the support of function $f$. Note that if we denote by $\Delta$
the symmetric difference between two sets, we have $supp(f \oplus g) = supp(f)\,\Delta\, supp(g)$.
The Hamming distance $d_H(f, g)$ between two functions $f$ and $g$ is the size of the set
$\{x \in \mathbb{F}_2^n;\ f(x) \ne g(x)\}$. Thus it equals $w_H(f \oplus g)$.
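
These definitions can be checked on a small example. In the sketch below, the two 3-variable functions and the helper names are illustrative assumptions; the code verifies that $supp(f \oplus g)$ is the symmetric difference of the supports and that $d_H(f, g) = w_H(f \oplus g)$.

```python
from itertools import product

def support(f, n):
    """Set of inputs x in F_2^n such that f(x) != 0."""
    return {x for x in product([0, 1], repeat=n) if f(*x) != 0}

def hamming_weight(f, n):
    """Size of the support of f."""
    return len(support(f, n))

def hamming_distance(f, g, n):
    """Number of inputs on which f and g differ."""
    return sum(f(*x) != g(*x) for x in product([0, 1], repeat=n))

def f(x1, x2, x3):
    return (x1 & x2) ^ x3

def g(x1, x2, x3):
    return x1 ^ x3

def f_xor_g(x1, x2, x3):
    return f(x1, x2, x3) ^ g(x1, x2, x3)

# The operator ^ on Python sets is the symmetric difference.
same_support = support(f_xor_g, 3) == support(f, 3) ^ support(g, 3)
print(same_support, hamming_distance(f, g, 3), hamming_weight(f_xor_g, 3))
```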
Note. Some additions of bits will be considered in $\mathbb{Z}$ (in characteristic 0) and denoted then
by +, and some will be computed in characteristic 2 and denoted by ⊕. These two different
notations will be necessary for $\mathbb{F}_2$ because some representations of Boolean functions will
live in characteristic 2 and some will live in characteristic 0. But the addition in the finite
field $\mathbb{F}_{2^n}$ will be denoted by +, as usual in mathematics, as well as the addition in $\mathbb{F}_2^n$ when
$n > 1$, since $\mathbb{F}_2^n$ will often be identified with $\mathbb{F}_{2^n}$, and because there will be no ambiguity.

2.1 A hierarchy of equivalence relations over Boolean and vectorial functions


Each notion that we shall study on Boolean or vectorial functions will be preserved by
some equivalence relations that we need to define. It is important to determine precisely,
for each notion, those equivalence relations that preserve it. Indeed, if we prove that some
function has some property, say P , preserved by a given equivalence relation, this implies
automatically that all functions in the equivalence class containing this function share the
same property P ; to classify the set of functions satisfying P , we need to determine all
equivalence classes of functions sharing P . This is often a difficult task. Even determining
the size of the union of these classes may be quite difficult. If classification is elusive, a
possible contribution to the domain is to provide constructions of functions satisfying P . To
claim that some construction of functions satisfying P provides new functions, one
must prove that at least one function obtained through this construction is inequivalent
(for every equivalence relation preserving P ) to all known functions satisfying P . This may
require a great deal of work.

1 Coders say “words.”

There are five main notions of equivalence among vectorial functions and four in the
subcase of Boolean functions (because the fifth notion is then equivalent to the fourth one).
We give the definitions for vectorial functions; the corresponding definitions for Boolean
functions are with m = 1 (then all the permutations composed with the functions on their
left can be taken equal to identity).

Remark. In the next definition and in the sequel, we present linear functions over $\mathbb{F}_2^n$ in
the form $L : (x_1, x_2, \dots, x_n) \mapsto (x_1, x_2, \dots, x_n) \times M$, with $(x_1, x_2, \dots, x_n)$ a row vector,
as is usual in information protection, rather than dealing with a column vector, as is
usual in mathematics. Applying transposition to the expressions translates one
representation into the other.

Definition 5 The main notions of equivalence on Boolean and vectorial functions are as
follows:
1. Two (n, m)-functions $F$ and $\tau \circ F \circ \sigma$, where $\sigma$ is a permutation of $\{1, \dots, n\}$, extended
to a permutation of $\mathbb{F}_2^n$ by
$$\sigma : (x_1, x_2, \dots, x_n) \in \mathbb{F}_2^n \mapsto (x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(n)}) \in \mathbb{F}_2^n$$
and $\tau$ is a permutation of $\{1, \dots, m\}$, similarly extended to a permutation of $\mathbb{F}_2^m$, are
called permutation equivalent.
2. Two (n, m)-functions $F$ and $L' \circ F \circ L$, where
$$L : (x_1, x_2, \dots, x_n) \in \mathbb{F}_2^n \mapsto (x_1, x_2, \dots, x_n) \times M \in \mathbb{F}_2^n$$
is an $\mathbb{F}_2$-linear automorphism of $\mathbb{F}_2^n$, $M$ being a nonsingular $n \times n$ matrix over $\mathbb{F}_2$, and
$L'$ is an $\mathbb{F}_2$-linear automorphism of $\mathbb{F}_2^m$, are called linearly equivalent.
3. Two (n, m)-functions $F$ and $L' \circ F \circ L$, where
$$L : (x_1, x_2, \dots, x_n) \in \mathbb{F}_2^n \mapsto (x_1, x_2, \dots, x_n) \times M + (a_1, a_2, \dots, a_n)$$
is an affine automorphism of $\mathbb{F}_2^n$ and $L'$ is an affine automorphism of $\mathbb{F}_2^m$, are called
affinely equivalent or affine equivalent [907].
4. Two (n, m)-functions $F$ and $L' \circ F \circ L + L''$, where $L$ is an affine automorphism of $\mathbb{F}_2^n$,
$L'$ is an affine automorphism of $\mathbb{F}_2^m$, and
$$L'' : (x_1, x_2, \dots, x_n) \in \mathbb{F}_2^n \mapsto (x_1, x_2, \dots, x_n) \times M'' + (a''_1, a''_2, \dots, a''_m) \in \mathbb{F}_2^m$$
is an affine (n, m)-function, $M''$ being an $n \times m$ binary matrix, are called extended affine (EA) equivalent.
5. Two (n, m)-functions $F$ and $G$ whose graphs $\mathcal{G}_F = \{(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^m;\ y = F(x)\}$ and
$\mathcal{G}_G = \{(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^m;\ y = G(x)\}$ are affinely equivalent (i.e., such that $L(\mathcal{G}_F) = \mathcal{G}_G$
for some affine automorphism $L$ on $\mathbb{F}_2^n \times \mathbb{F}_2^m$) are called Carlet–Charpin–Zinoviev (CCZ)
equivalent² (the notion is from [257] and the term from [163]).

2 This notion was rediscovered by L. Breveglieri, A. Cherubini, and M. Macchetti at Asiacrypt 2004.

A property or a parameter will be called a permutation invariant (resp. a linear invariant, an
affine invariant, an EA invariant, or a CCZ invariant) if it is preserved by permutation (resp.
linear, affine, extended affine, CCZ) equivalence.

In [432], an asymptotic estimate is given for the number of EA equivalence classes of
Boolean functions.
Note that if $F$ and $G$ are CCZ equivalent and if we write $L = (L_1, L_2)$, where $L$ is the
automorphism in Definition 5 (Item 5) with $L_1 : \mathbb{F}_2^n \times \mathbb{F}_2^m \to \mathbb{F}_2^n$ and $L_2 : \mathbb{F}_2^n \times \mathbb{F}_2^m \to \mathbb{F}_2^m$,
and if, for every $x \in \mathbb{F}_2^n$, we define $F_1(x) = L_1(x, F(x))$ and $F_2(x) = L_2(x, F(x))$,
then function $F_1$ is bijective because $G$ is a function, and $G = F_2 \circ F_1^{-1}$. Note also that,
given a function F , finding all functions CCZ equivalent to F consists in finding all affine
automorphisms L = (L1 , L2 ) such that F1 is bijective. Moreover, CCZ equivalent functions
corresponding to a same F , a same L1 , and different L2 are EA equivalent; see [163], which
shows that an (n, n)-function G is EA equivalent to a function F or to F −1 (if it exists) if
and only if there exists an affine permutation L = (L1 , L2 ) where L1 depends only on x or
y, and such that L(GF ) = GG .
CCZ equivalence can be translated in terms of codes; see the remark on page 379.

Proposition 1 For n and m ranging over N, each equivalence relation in Definition 5 is a
strict particular case of the next one.

Proof The only nonobvious facts are that EA equivalence implies CCZ equivalence and
that the converse is false. This can be seen as follows:
– If $\varphi_1$ and $\varphi_2$ are affine automorphisms of $\mathbb{F}_2^n$, $\mathbb{F}_2^m$, respectively, and if $G = \varphi_2 \circ F \circ \varphi_1$,
then defining $L_1(x, y) = \varphi_1^{-1}(x)$ and $L_2(x, y) = \varphi_2(y)$, we have that $L = (L_1, L_2)$ is an
affine automorphism of $\mathbb{F}_2^n \times \mathbb{F}_2^m$ that maps $\mathcal{G}_F$ onto $\mathcal{G}_G$, since $G(\varphi_1^{-1}(x)) = \varphi_2(F(x))$,
and $F$ and $G$ are then CCZ equivalent.³
– If φ(x) is an affine function from Fn2 to Fm2 and G(x) = F (x) + φ(x), then L(x, y) =
(x, y + φ(x)) is an affine automorphism that maps GF onto GG , and F and G are CCZ
equivalent.
– EA equivalence preserves algebraic degree (see Definition 6, page 35) when it is larger
than 1 and it is shown in [162, 163] that CCZ equivalence does not.
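The first two items of this proof can be checked numerically on a toy example. The sketch below is only an illustration under assumptions of its own: vectors of $\mathbb{F}_2^n$ are encoded as integers (bit $i-1$ holds $x_i$), functions are lookup tables, and the helper `affine_map` together with the particular tables and matrices are hypothetical choices.

```python
def affine_map(rows, const):
    """Return x -> x * M + const over F_2, M being given by its rows (bitmask integers)."""
    def phi(x):
        y, i = const, 0
        while x:
            if x & 1:
                y ^= rows[i]   # add row i of M when x_{i+1} = 1
            x >>= 1
            i += 1
        return y
    return phi

n, m = 3, 2
F = [0, 1, 3, 2, 1, 1, 0, 3]                 # an arbitrary (3, 2)-function
graph_F = {(x, F[x]) for x in range(2 ** n)}

# Second item: G = F + phi with phi affine, and L(x, y) = (x, y + phi(x)).
phi = affine_map([0b01, 0b11, 0b10], 0b01)
G = [F[x] ^ phi(x) for x in range(2 ** n)]
graph_G = {(x, G[x]) for x in range(2 ** n)}
image = {(x, y ^ phi(x)) for (x, y) in graph_F}

# First item: G2 = phi2 o F o phi1, and L(x, y) = (phi1^{-1}(x), phi2(y)).
phi1 = affine_map([0b011, 0b010, 0b100], 0b101)   # invertible linear part
phi2 = affine_map([0b11, 0b01], 0b10)             # invertible linear part
phi1_inv = {phi1(x): x for x in range(2 ** n)}
G2 = [phi2(F[phi1(x)]) for x in range(2 ** n)]
graph_G2 = {(x, G2[x]) for x in range(2 ** n)}
image2 = {(phi1_inv[x], phi2(y)) for (x, y) in graph_F}

print(image == graph_G, image2 == graph_G2)
```

In both cases the image of the graph of $F$ under the affine automorphism is again the graph of a function, which is what CCZ equivalence requires.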

Note that if $m = n$ and $(L_1, L_2)(x, y) = (y, x)$, then $F_2 \circ F_1^{-1}$ is equal to $F^{-1}$.

2.1.1 Relations between these equivalences


For lack of space, in this subsection, we shall refer to papers for the proofs.
It has been proved in [163] that CCZ equivalence between (n, n)-functions⁴ is strictly
more general than EA equivalence together with taking inverses of permutations, by
exhibiting functions that are CCZ equivalent to the function $F(x) = x^3$ on $\mathbb{F}_{2^n}$, but that
cannot be obtained from $F$ by any sequence of applications of EA equivalence and inverse
transformation; see also [771].

3 Conversely, if $F$ and $G$ are CCZ equivalent and $L_1(x, y)$ and $L_2(x, y)$ depend only on $x$ and $y$, respectively,
say $L_1(x, y) = \varphi_1^{-1}(x)$ and $L_2(x, y) = \varphi_2(y)$, then $\varphi_1$ and $\varphi_2$ are affine automorphisms of $\mathbb{F}_2^n$ and $\mathbb{F}_2^m$,
respectively, and $G = \varphi_2 \circ F \circ \varphi_1$.
4 For (n, m)-functions, see [149, 966].

However, CCZ equivalence coincides with EA equivalence when restricted to some
classes of functions (whose definitions will be, in some cases, given after):
1. Boolean (i.e., single-output) functions,5 as shown in [149] (on the contrary, CCZ
equivalence is shown to be strictly more general than EA equivalence in the case of
(n, m)-functions when n ≥ 5 and m is larger than or equal to the smallest divisor of n
different from 1, e.g., when n is even and m ≥ 2).
2. Bent functions (see page 269) as proved in [148, 150] and more generally, functions
having surjective derivatives (see page 38), as proved in [164].
3. Quadratic APN functions (see page 281), as shown in [1139] (extending [119]).
CCZ equivalence also coincides with EA equivalence (see page 281 as well):
• For n even, with plateaued APN functions, one of which is a power function.
• For a quadratic APN function and a power (APN) function (they are then EA equivalent
to one of the Gold functions).

And the CCZ equivalence between two power functions coincides with their EA equivalence
or with the EA equivalence between one function and the inverse of the other if it is bijective
(see Proposition 113, page 281).

Remark. It has been shown in [149] that the CCZ equivalence (i.e., the EA equivalence,
thanks to 1 above) between the indicators (i.e., characteristic functions) of the graphs of two
functions coincides with their CCZ equivalence.

Finding new EA inequivalent functions by using CCZ equivalence is not easy (this
could be done in particular cases; see pages 396, 404). If $(L_1, L_2)$ and $(L_1, L'_2)$ are linear
permutations of $\mathbb{F}_2^n \times \mathbb{F}_2^m$ and $F_1 = L_1(x, F(x))$ is a permutation of $\mathbb{F}_2^n$, then since the
functions $F'$ and $F''$ obtained by CCZ equivalence from $F$ by using $(L_1, L_2)$ and $(L_1, L'_2)$
are EA equivalent, finding EA inequivalent functions by using CCZ equivalence requires
finding new permutations $F_1$.

2.2 Representations of Boolean functions and vectorial functions


Among the classical representations of Boolean (resp. vectorial) functions, the most well
known is the truth table (resp. the lookup table, or LUT), equal to the list of all pairs of an
element of Fn2 and of the value of the function at this input (an ordering of Fn2 being chosen).

2.2.1 Algebraic normal form


The truth table is not much used for defining Boolean functions in the frameworks of
cryptography and coding theory, because the features of Boolean functions that play a role in
these two domains are not easily captured by such representation (except for the Hamming
weight). The most used representation in cryptography and coding is the algebraic normal
form (in brief the ANF).⁶

5 If one function is Boolean (and viewed as multioutput thanks to $\mathbb{F}_2 \subset \mathbb{F}_{2^m}$), this suffices.

Algebraic normal form of Boolean functions


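This is an n-variable polynomial representation over $\mathbb{F}_2$, of the form
$$f(x) = \bigoplus_{I \subseteq \{1,\dots,n\}} a_I \Big( \prod_{i \in I} x_i \Big) = \bigoplus_{I \subseteq \{1,\dots,n\}} a_I\, x^I \in \mathbb{F}_2[x_1, \dots, x_n]/(x_1^2 \oplus x_1, \dots, x_n^2 \oplus x_n). \tag{2.1}$$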

Every coordinate xi appears in this polynomial with exponents at most 1, because every bit
in F2 equals its own square.

Example Let us consider the function f whose truth table is as follows:

x1 x2 x3 x in hexa f (x)

0 0 0 0 0
0 0 1 1 1
0 1 0 2 0
0 1 1 3 0
1 0 0 4 0
1 0 1 5 1
1 1 0 6 0
1 1 1 7 1

It is the sum (modulo 2 or not, no matter) of the atomic functions f1 , f2 , f3 :

x1 x2 x3 x in hexa f1 (x) f2 (x) f3 (x)

0 0 0 0 0 0 0
0 0 1 1 1 0 0
0 1 0 2 0 0 0
0 1 1 3 0 0 0
1 0 0 4 0 0 0
1 0 1 5 0 1 0
1 1 0 6 0 0 0
1 1 1 7 0 0 1

6 It can have other names in circuit theory, such as Zhegalkin polynomial, modulo-2 sum-of-products,
Reed–Muller-canonical expansion, and positive polarity Reed–Muller form.

The function $f_1(x)$ takes value 1 if and only if $1 \oplus x_1 = 1 \oplus x_2 = x_3 = 1$, that is,
$(1 \oplus x_1)(1 \oplus x_2)\, x_3 = 1$. Thus the ANF of $f_1$ can be obtained by expanding the product
$(1 \oplus x_1)(1 \oplus x_2)\, x_3$. After similar observations on $f_2$ and $f_3$, we see that the ANF of $f$
equals $(1 \oplus x_1)(1 \oplus x_2)\, x_3 \oplus x_1 (1 \oplus x_2)\, x_3 \oplus x_1 x_2 x_3 = x_1 x_2 x_3 \oplus x_2 x_3 \oplus x_3$.
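The expansion carried out in this example can be mimicked mechanically. The following sketch is an illustration only: it assumes a representation chosen here (a polynomial is a Python set of monomials, a monomial is a frozenset of variable indices), and the function names are hypothetical.

```python
from itertools import product

ONE = {frozenset()}                      # the constant polynomial 1

def var(i):
    """The polynomial x_i."""
    return {frozenset([i])}

def poly_mul(p, q):
    """Product over F_2 modulo x_i^2 = x_i: x^I * x^J = x^(I union J)."""
    result = set()
    for a, b in product(p, q):
        result ^= {a | b}                # coefficients add modulo 2
    return result

def atomic(a):
    """ANF of the atomic function equal to 1 only at the point a = (a_1, ..., a_n)."""
    poly = ONE
    for i, bit in enumerate(a, start=1):
        factor = var(i) if bit else var(i) ^ ONE   # x_i, or 1 + x_i
        poly = poly_mul(poly, factor)
    return poly

def anf_from_support(points):
    """XOR of the atomic functions of all points where f takes value 1."""
    anf = set()
    for a in points:
        anf ^= atomic(a)
    return anf

# Support of the function f of the example: (x1, x2, x3) = 001, 101, 111.
anf = anf_from_support([(0, 0, 1), (1, 0, 1), (1, 1, 1)])
print(sorted(sorted(mono) for mono in anf))   # monomials x3, x2 x3, x1 x2 x3
```

This approach is exponential and only meant to mirror the hand computation; the fast Möbius transform described later in this section is the practical method.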

Another possible representation of this same ANF uses an indexation by means of vectors
of $\mathbb{F}_2^n$ instead of subsets of $\{1, \dots, n\}$; if, for any such vector $u$, we denote by $a_u$ what is
denoted by $a_{\mathrm{supp}(u)}$ in Relation (2.1) (where $\mathrm{supp}(u)$ denotes the support of $u$), we have the
following equivalent representation:
$$f(x) = \bigoplus_{u \in \mathbb{F}_2^n} a_u \Big( \prod_{j=1}^{n} x_j^{u_j} \Big).$$
The monomial $\prod_{j=1}^{n} x_j^{u_j}$ is often denoted⁷ by $x^u$. We have $x^u x^v = x^{u \vee v}$, where $\mathrm{supp}(u \vee v)$
$= \mathrm{supp}(u) \cup \mathrm{supp}(v)$.
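With vectors encoded as integers (an assumption of this sketch: bit $i-1$ of the integer holds the $i$-th coordinate), the monomial $x^u$ and the rule $x^u x^v = x^{u \vee v}$ become one-line bit operations. The names below are hypothetical.

```python
def monomial(u, x):
    """Value of x^u at the point x: 1 iff supp(u) is included in supp(x)."""
    return 1 if (u & x) == u else 0

def evaluate(coeffs, x):
    """Evaluate f(x) = XOR over u of a_u x^u, with coeffs = {u: a_u}."""
    value = 0
    for u, a_u in coeffs.items():
        value ^= a_u & monomial(u, x)
    return value

# ANF x3 + x2 x3 + x1 x2 x3 of the earlier example: masks 0b100, 0b110, 0b111.
coeffs = {0b100: 1, 0b110: 1, 0b111: 1}
truth_table = [evaluate(coeffs, x) for x in range(8)]
print(truth_table)
```

The bitwise OR of `u` and `v` plays the role of $u \vee v$ here.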

Existence and uniqueness of the ANF


By applying the method described in the example above, it is a simple matter to show the
existence of the ANF of any Boolean function: we have
$$f(x) = \bigoplus_{a \in \mathbb{F}_2^n} f(a)\, \delta_a(x) = \sum_{a \in \mathbb{F}_2^n} f(a)\, \delta_a(x) \tag{2.2}$$
where the function $\delta_a$ is the Dirac (or Kronecker) symbol at $a$ and equals $\delta_a(x) = \prod_{i=1}^{n} (x_i \oplus a_i \oplus 1)$.
Replacing in (2.2) each $\delta_a$ by this expression, expanding it, and
simplifying (mod 2) gives an expression (2.1) for $f$, which shows the existence of an
ANF of any Boolean function. This implies that the mapping from polynomials $P \in \mathbb{F}_2[x_1, \dots, x_n]/(x_1^2 \oplus x_1, \dots, x_n^2 \oplus x_n)$
to the corresponding functions $x \in \mathbb{F}_2^n \mapsto P(x)$ is
onto $\mathcal{BF}_n$. Since the size of $\mathcal{BF}_n$ equals the size of $\mathbb{F}_2[x_1, \dots, x_n]/(x_1^2 \oplus x_1, \dots, x_n^2 \oplus x_n)$,
this correspondence is one to one.⁸ But more can be said.

Relationship between a Boolean function and its ANF



The product $x^I = \prod_{i \in I} x_i$ is nonzero if and only if $x_i$ is nonzero (i.e., equals 1) for
every $i \in I$, that is, if $I$ is included in the support of $x$; hence, the Boolean function
$f(x) = \bigoplus_{I \subseteq \{1,\dots,n\}} a_I\, x^I$ takes the value
$$f(x) = \bigoplus_{I \subseteq \mathrm{supp}(x)} a_I, \tag{2.3}$$
where $\mathrm{supp}(x)$ denotes the support of $x$.

7 The reader should not confuse this notation with a univariate monomial.
8 Another argument is that this mapping is a linear mapping from a vector space over $\mathbb{F}_2$ of dimension $2^n$ to a
vector space of the same dimension.

If we use the notation $f(x) = \bigoplus_{u \in \mathbb{F}_2^n} a_u\, x^u$, we obtain the relation $f(x) = \bigoplus_{u \preceq x} a_u$,
where $u \preceq x$ means that $\mathrm{supp}(u) \subseteq \mathrm{supp}(x)$ (we say that $u$ is covered by $x$). A Boolean
function $f^\circ$ can be associated to the ANF of $f$: for every $x \in \mathbb{F}_2^n$, we set $f^\circ(x) = a_{\mathrm{supp}(x)}$,
that is, with the notation $f(x) = \bigoplus_{u \in \mathbb{F}_2^n} a_u\, x^u$: $f^\circ(u) = a_u$. Relation (2.3) shows that $f$ is
the image of $f^\circ$ by the so-called binary Möbius transform. The converse is also true:

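Theorem 1 Let $f$ be a Boolean function on $\mathbb{F}_2^n$ and let $\bigoplus_{I \subseteq \{1,\dots,n\}} a_I\, x^I$ be its ANF. We
have:
$$\forall I \subseteq \{1, \dots, n\}, \quad a_I = \bigoplus_{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I} f(x). \tag{2.4}$$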


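Proof Let us denote $\bigoplus_{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I} f(x)$ by $b_I$ and consider the function $g(x) = \bigoplus_{I \subseteq \{1,\dots,n\}} b_I\, x^I$. We have
$$g(x) = \bigoplus_{I \subseteq \mathrm{supp}(x)} b_I = \bigoplus_{I \subseteq \mathrm{supp}(x)} \Big( \bigoplus_{y \in \mathbb{F}_2^n;\ \mathrm{supp}(y) \subseteq I} f(y) \Big) = \bigoplus_{y \in \mathbb{F}_2^n} f(y) \Big( \bigoplus_{I \subseteq \{1,\dots,n\};\ \mathrm{supp}(y) \subseteq I \subseteq \mathrm{supp}(x)} 1 \Big).$$
The sum $\bigoplus_{I \subseteq \{1,\dots,n\};\ \mathrm{supp}(y) \subseteq I \subseteq \mathrm{supp}(x)} 1$ is null if $y \neq x$. Indeed, if $\mathrm{supp}(y) \not\subseteq \mathrm{supp}(x)$,
then the sum is empty and if $\mathrm{supp}(y) \subsetneq \mathrm{supp}(x)$, then the set $\{I \subseteq \{1, \dots, n\};\ \mathrm{supp}(y) \subseteq I \subseteq \mathrm{supp}(x)\}$
contains $2^{w_H(x) - w_H(y)}$ elements, which is an even number. Hence, $g = f$ and, by the uniqueness of the
ANF of $f$, $b_I = a_I$ for every $I$.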
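Relations (2.3) and (2.4) have the same shape, so a single routine computes the ANF from the truth table and back. Below is a deliberately naive sketch of Relation (2.4), quadratic in the table size; it assumes integer-encoded inputs (bit $i-1$ holds $x_i$) and uses a hypothetical function name.

```python
def mobius_naive(values, n):
    """out[I] = XOR of values[x] over all x whose support is included in I."""
    out = []
    for I in range(2 ** n):
        acc = 0
        for x in range(2 ** n):
            if (x & I) == x:        # supp(x) is a subset of I
                acc ^= values[x]
        out.append(acc)
    return out

# Truth table of f = x3 + x2 x3 + x1 x2 x3 (value 1 at inputs 4, 5 and 7).
tt = [0, 0, 0, 0, 1, 1, 0, 1]
anf = mobius_naive(tt, 3)           # Relation (2.4): coefficients a_I
back = mobius_naive(anf, 3)         # Relation (2.3): values f(x)
print(anf, back == tt)
```

Applying the routine twice returns the original table, reflecting that the binary Möbius transform is an involution.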

Algorithm (Fast binary Möbius transform)


There exists a simple divide-and-conquer butterfly algorithm to compute the ANF from the
truth table (or vice versa), called the fast Möbius transform. For every $u = (u_1, \dots, u_n) \in \mathbb{F}_2^n$,
the coefficient $a_u$ of $x^u$ in the ANF of $f$ equals
$$\bigoplus_{(x_1,\dots,x_{n-1}) \preceq (u_1,\dots,u_{n-1})} f(x_1, \dots, x_{n-1}, 0) \quad \text{if } u_n = 0, \text{ and}$$
$$\bigoplus_{(x_1,\dots,x_{n-1}) \preceq (u_1,\dots,u_{n-1})} \big[ f(x_1, \dots, x_{n-1}, 0) \oplus f(x_1, \dots, x_{n-1}, 1) \big] \quad \text{if } u_n = 1.$$

Hence if, in the truth table of f , the binary vectors are ordered in lexicographic order,
with the bit of higher weight on the right, the table of the ANF equals the concatenation
of the ANFs of the (n − 1)-variable functions f (x1 , . . . , xn−1 , 0) and f (x1 , . . . , xn−1 , 0) ⊕
f (x1 , . . . , xn−1 , 1). This gives the recursive algorithm below. Note that taking the lexico-
graphic order with the bit of higher weight on the left (i.e., the standard lexicographic
order) would work as well (as would any other order corresponding to a permutation of
{1, . . . , n}).

1. Write the truth table of $f$, in which the binary vectors of length $n$ are in lexicographic
order with the bit of higher weight on the right.
2. Let $f_0$ and $f_1$ be the restrictions of $f$ to $\mathbb{F}_2^{n-1} \times \{0\}$ and $\mathbb{F}_2^{n-1} \times \{1\}$, respectively;⁹ replace
the values of $f_1$ by those of $f_0 \oplus f_1$.
3. Apply recursively step 2, separately to the functions now obtained in the places of $f_0$
and $f_1$.
When the algorithm ends (i.e., arrives at functions in one variable each), the global table
gives the values of the ANF of $f$. The complexity of this algorithm is of $n\, 2^n$ XORs; it is
then in $O(N \log_2 N)$, where $N = 2^n$ is the size of its input $f$.

Algorithm 1: Computing the algebraic normal form.


Data: tt ← truth table, n ← number of variables
Result: anf ← algebraic normal form
for i = 0 to n − 1 do
    for j = 0 to 2^(n−1) − 1 do
        t[j] = tt[2 ∗ j];
        u[j] = tt[2 ∗ j] ⊕ tt[2 ∗ j + 1];
    end
    for k = 0 to 2^(n−1) − 1 do
        anf[k] = t[k];
        anf[2^(n−1) + k] = u[k];
    end
    tt = anf;    (each pass must read the output of the previous pass)
end
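For readers who want to run the transform, here is a hedged Python sketch of two variants: `algorithm_1` follows the pseudocode above (with the table copied back after each pass), and `fast_mobius` is the usual in-place butterfly corresponding to the recursive steps 1 to 3. Both names are hypothetical, and inputs are indexed with $x_1$ as the least significant bit, which is the ordering used in Table 2.1.

```python
def algorithm_1(tt, n):
    """Shuffle-style variant following the pseudocode of Algorithm 1."""
    tt = list(tt)
    half = 1 << (n - 1)
    for _ in range(n):
        t = [tt[2 * j] for j in range(half)]
        u = [tt[2 * j] ^ tt[2 * j + 1] for j in range(half)]
        tt = t + u                      # output of this pass feeds the next one
    return tt

def fast_mobius(table):
    """In-place butterfly: n passes of 2^(n-1) XORs each."""
    t = list(table)
    step = 1
    while step < len(t):
        for start in range(0, len(t), 2 * step):
            for j in range(start, start + step):
                t[j + step] ^= t[j]     # replace f1 by f0 XOR f1
        step *= 2
    return t

# f(x) = x2 + x1 x2 x3 + x1 x4, the function of Table 2.1 (rows 0 to f).
f = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]
anf = fast_mobius(f)
print([format(u, "x") for u, a in enumerate(anf) if a])   # indices of nonzero coefficients
print(algorithm_1(f, 4) == anf, fast_mobius(anf) == f)
```

The nonzero coefficients sit at indices 2, 7 and 9, that is, at the monomials $x_2$, $x_1 x_2 x_3$ and $x_1 x_4$.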

We give in Table 2.1 an example of the computation of the ANF from the truth table using
the algorithm of the fast binary Möbius transform, and of the computation of the truth table
from the ANF, using this same algorithm.

Remark. The algorithm does not work if the order on F2n is not a permuted lexicographic
order (for instance, an order by increasing weights of inputs).

ANF of the graph indicator of a vectorial function


Denoting by $1_{\mathcal{G}_F}(x, y)$ the indicator (i.e., the characteristic function) of the graph $\mathcal{G}_F = \{(x, F(x));\ x \in \mathbb{F}_2^n\}$
of an (n, m)-function $F$ (sometimes called its codebook), Relation
(2.4) applied to $1_{\mathcal{G}_F}$ gives that, for every $I \subseteq \{1, \dots, n\}$, $J \subseteq \{1, \dots, m\}$, the coefficient of
$x^I y^J$ in its ANF equals the following:
$$a_{I,J} = |\{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I \text{ and } \mathrm{supp}(F(x)) \subseteq J\}| \ [\bmod\ 2].$$
We have also the following:

9 The truth table of f0 (resp. f1 ) corresponds to the upper (resp. lower) half of the table of f .

Table 2.1 ANF of a function from its truth table and recalculation of the truth table from ANF
(for function f (x) = x2 ⊕ x1 x2 x3 ⊕ x1 x4 ; x = (x1 , x2 , x3 , x4 )).

x1 x2 x3 x4   x in hexa   f(x)             f°(x)             f(x)

 0  0  0  0       0         0    0  0  0     0     0  0  0     0
 1  0  0  0       1         0    0  0  0     0     0  0  0     0
 0  1  0  0       2         1    1  1  1     1     1  1  1     1
 1  1  0  0       3         1    1  1  1     0     0  0  0     1
 0  0  1  0       4         0    0  0  0     0     0  0  0     0
 1  0  1  0       5         0    0  0  0     0     0  0  0     0
 0  1  1  0       6         1    1  0  0     0     0  1  1     1
 1  1  1  0       7         0    0  1  1     1     1  1  1     0
 0  0  0  1       8         0    0  0  0     0     0  0  0     0
 1  0  0  1       9         1    1  1  1     1     1  1  1     1
 0  1  0  1       a         1    0  0  0     0     1  1  1     1
 1  1  0  1       b         0    1  1  0     0     0  0  1     0
 0  0  1  1       c         0    0  0  0     0     0  0  0     0
 1  0  1  1       d         1    1  0  0     0     0  1  1     1
 0  1  1  1       e         1    0  0  0     0     0  1  1     1
 1  1  1  1       f         1    1  0  0     0     1  1  0     1

Proposition 2 [253, 254] Let $F$ be any (n, m)-function, and let $f_1, \dots, f_m$ be its
coordinate functions. We have the following:
$$1_{\mathcal{G}_F}(x, y) = \prod_{j=1}^{m} (y_j \oplus f_j(x) \oplus 1) = \bigoplus_{J \subseteq \{1,\dots,m\}} y^J \prod_{j \in \{1,\dots,m\} \setminus J} (f_j(x) \oplus 1).$$

Indeed, for every $y, y' \in \mathbb{F}_2^m$, we have $y = y'$ if and only if $\prod_{j=1}^{m} (y_j \oplus y'_j \oplus 1) = 1$. This,
with $y' = F(x)$, proves the first assertion and the rest is straightforward.
Note that, if $F$ is a permutation ($m = n$), then $1_{\mathcal{G}_F}(x, y) = 1_{\mathcal{G}_{F^{-1}}}(y, x)$, where $F^{-1}$ is the
compositional inverse of $F$, and thus
$$\bigoplus_{J \subseteq \{1,\dots,n\}} y^J \prod_{j \in \{1,\dots,n\} \setminus J} (f_j(x) \oplus 1) = \bigoplus_{I \subseteq \{1,\dots,n\}} x^I \prod_{i \in \{1,\dots,n\} \setminus I} (f'_i(y) \oplus 1), \tag{2.5}$$
where the $f'_i$s are the coordinate functions of $F^{-1}$.
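Both expressions of Proposition 2 can be compared with the plain definition of the graph indicator on a small case. This is only a sanity-check sketch: the lookup table `F` is an arbitrary illustrative choice, vectors are integers with bit $j-1$ holding the $j$-th coordinate, and the function names are hypothetical.

```python
def bit(v, j):
    """Coordinate j (0-based) of the integer-encoded vector v."""
    return (v >> j) & 1

def indicator_product(F, m, x, y):
    """Product over j of (y_j + f_j(x) + 1)."""
    p = 1
    for j in range(m):
        p &= bit(y, j) ^ bit(F[x], j) ^ 1
    return p

def indicator_expanded(F, m, x, y):
    """XOR over J of y^J times the product over j not in J of (f_j(x) + 1)."""
    total = 0
    for J in range(2 ** m):
        term = 1 if (y & J) == J else 0          # y^J
        for j in range(m):
            if not bit(J, j):
                term &= bit(F[x], j) ^ 1
        total ^= term
    return total

n, m = 3, 2
F = [0, 1, 3, 2, 1, 1, 0, 3]                     # arbitrary (3, 2)-function
ok = all(indicator_product(F, m, x, y) == indicator_expanded(F, m, x, y) == int(F[x] == y)
         for x in range(2 ** n) for y in range(2 ** m))
print(ok)
```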

Algebraic degree of a Boolean function


Definition 6 The degree of the ANF shall be denoted by $d_{alg}(f)$ and is called the algebraic
degree of the function¹⁰: $d_{alg}(f) = \max\{|I|;\ a_I \neq 0\}$, where $|I|$ denotes the size of $I$ (with
the convention that the zero function has algebraic degree 0).

This makes sense thanks to the existence and uniqueness of the ANF.
10 Some authors also call it the nonlinear order of f , but this terminology is more or less obsolete.

Of course, given two n-variable Boolean functions $f$, $g$, we have $d_{alg}(f \oplus g) \le \max(d_{alg}(f), d_{alg}(g))$
and $d_{alg}(f g) \le d_{alg}(f) + d_{alg}(g)$.
Note that a Boolean function is affine if and only if it has algebraic degree at most 1.
We shall call quadratic functions the Boolean functions of algebraic degree at most 2 and
cubic functions those of algebraic degree at most 3. Note that this means for instance that an
affine function is a particular quadratic function (just as, by definition, a constant function is
a particular affine function). This may be a little confusing for the reader, but we are obliged
to adopt this terminology, since otherwise, we would have sentences like “all derivatives of
a Boolean function are affine if and only if the function is quadratic or affine,” “all second-
order derivatives are affine if and only if the function is cubic or quadratic or affine,” etc.
According to Relation (2.4), we have directly:

Proposition 3 The algebraic degree $d_{alg}(f)$ of any n-variable Boolean function $f$ equals
the maximum dimension of the subspaces $\{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I\}$ on which $f$ takes value 1
an odd number of times. In particular:
– $d_{alg}(f) = n$ if and only if $w_H(f)$ is odd,
– $d_{alg}(f) = n - 1$ if and only if (1) $w_H(f)$ is even and (2) there exists $i$ such that
$|\{a \in \mathrm{supp}(f);\ a_i = 0\}|$ is odd, or equivalently thanks to (1), $\sum_{a \in \mathrm{supp}(f)} a \neq 0$.

The index $i$ is indeed characterized by $\bigoplus_{a \in \mathrm{supp}(f)} a_i \neq 0$. The two latter properties above
will be seen under another viewpoint in Corollary 2, page 46.
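The two particular cases of Proposition 3 are easy to observe numerically. The sketch below assumes truth tables indexed by integers (bit $i-1$ holds $x_i$) and uses hypothetical helper names; the degree is read off the ANF computed by the butterfly transform.

```python
from functools import reduce

def mobius(table):
    """Binary Möbius transform (in-place butterfly on a copy)."""
    t = list(table)
    step = 1
    while step < len(t):
        for start in range(0, len(t), 2 * step):
            for j in range(start, start + step):
                t[j + step] ^= t[j]
        step *= 2
    return t

def algebraic_degree(tt):
    """Largest |I| with a_I nonzero; 0 for the zero function."""
    return max((bin(u).count("1") for u, a in enumerate(mobius(tt)) if a), default=0)

# n = 3: f = x3 + x2 x3 + x1 x2 x3 has odd weight, hence degree n = 3.
f3 = [0, 0, 0, 0, 1, 1, 0, 1]
# n = 4: g = x2 + x1 x2 x3 + x1 x4 has even weight and degree n - 1 = 3.
g = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]

xor_of_support = reduce(lambda a, b: a ^ b, (x for x, v in enumerate(g) if v), 0)
print(algebraic_degree(f3), sum(f3) % 2)
print(algebraic_degree(g), sum(g) % 2, xor_of_support)
```

For `g`, the XOR of the support vectors is nonzero exactly in the coordinate $x_4$, the variable missing from the degree-3 monomial $x_1 x_2 x_3$.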
The algebraic degree is an affine invariant (i.e., it is invariant under the action of the
general affine group; see Section 2.1): for every affine automorphism L : (x1 , x2 , . . . , xn ) ∈
Fn2 → (x1 , x2 , . . . , xn ) × M + (a1 , a2 , . . . , an ), where M is a nonsingular n × n matrix
over F2 , we have dalg (f ◦ L) = dalg (f ). Indeed, the composition by L clearly cannot
increase the algebraic degree, since the coordinates of L(x) have degree 1. Hence we have
dalg (f ◦L) ≤ dalg (f ) (in fact, for every affine homomorphism). And applying this inequality
to f ◦ L in the place of f and to L−1 in the place of L shows the inverse inequality.
Note in particular that, if $F$ is an (n, n)-permutation, then we have $d_{alg}(1_{\mathcal{G}_F}) = d_{alg}(1_{\mathcal{G}_{F^{-1}}})$:
these two indicators correspond to each other by swapping $x$ and $y$. For
functions of algebraic degree strictly larger than 1, the algebraic degree is an EA invariant
(but not a CCZ invariant; see [162, 163]).
The algebraic degree being an affine invariant, Proposition 3 implies that it also equals
the maximum dimension of all the affine subspaces of Fn2 on which f takes value 1 an odd
number of times. Equivalently:

Proposition 4 A Boolean function has algebraic degree at most d if and only if its
restriction to any (d + 1)-dimensional flat (i.e., affine subspace) has an even Hamming
weight.

This shows in particular that, given an n-variable Boolean function f and an affine
subspace A = a + E of Fn2 (where E is the vector space equal to the direction of A),
the restriction of f to A, viewed as a k-variable function where k is the dimension of A (by
identifying the elements of a + E with the vectors of Fk2 through the choice of a basis of E),
has algebraic degree at most dalg (f ).

It is shown in [955] that, for every nonzero n-variable Boolean function f , denoting by
g the binary Möbius transform of f , we have dalg (f ) + dalg (g) ≥ n. This same paper
deduces characterizations and constructions of the functions that are equal to their binary
Möbius transform, called coincident functions.

Remark. 1. Every atomic function has algebraic degree $n$, since its ANF equals $(x_1 \oplus \epsilon_1)(x_2 \oplus \epsilon_2) \cdots (x_n \oplus \epsilon_n)$,
where $\epsilon_i \in \mathbb{F}_2$. Thus, a Boolean function $f$ has algebraic degree $n$
if and only if, in its decomposition as a sum of atomic functions, the number of these
atomic functions is odd, that is, if and only if $w_H(f)$ is odd. This property will have an
important consequence on the Reed–Muller codes, and it will be also useful in Chapter 4.
2. If we know that the algebraic degree of an n-variable Boolean function f is bounded
above by d < n, then the whole function can be recovered from some of its restrictions
(i.e., a unique function corresponds to this partially defined Boolean function). Precisely,
according to the existence and uniqueness of the ANF, the knowledge of the restriction
f|E of the Boolean function f (of algebraic degree at most d) to a set E implies
the knowledge of the whole function if and only if the system of the equations
$f(x) = \bigoplus_{I \subseteq \{1,\dots,n\};\ |I| \le d} a_I\, x^I$, with indeterminates $a_I \in \mathbb{F}_2$, and where $x$ ranges over $E$ (this
makes $|E|$ equations), has a unique solution.¹¹
This happens with the set Ed of all words of Hamming weights smaller than or equal to
d (and then, by affine equivalence, it happens with every set E affinely equivalent to Ed ),
since Relation (2.4) gives the value of aI for |I | ≤ d and the others are null by hypothesis.
And since |Ed | = |{I ⊆ {1, . . . , n}; |I | ≤ d}|, any choice of f|Ed works.
Notice that Relation (2.3) makes it possible to express the value of $f(x)$ for every $x \in \mathbb{F}_2^n$
by means of the values taken by $f$ on $E$. For instance, for $E = E_d$, we have the
following (using the notation $a_u$ instead of $a_I$, see above, and still using that $d_{alg}(f) \le d$):
$$f(x) = \bigoplus_{u \preceq x} a_u = \bigoplus_{u \preceq x,\ u \in E_d} a_u = \bigoplus_{y \preceq x,\ y \in E_d} f(y)\, \big|\{u \in E_d;\ y \preceq u \preceq x\}\big| \ [\bmod\ 2]$$
$$= \bigoplus_{y \preceq x,\ y \in E_d} f(y) \left[ \left( \sum_{i=0}^{d - w_H(y)} \binom{w_H(x) - w_H(y)}{i} \right) [\bmod\ 2] \right].$$

These observations generalize to pseudo-Boolean (that is, real-valued) functions, if we
consider the numerical degree (see below) instead of the algebraic degree, cf. [1090].
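The recovery formula for $E = E_d$ can be tried out directly. The sketch below is an illustration under assumptions of its own: the test function `f` (of algebraic degree 2 in 4 variables) is an arbitrary choice, inputs are integers with bit $i-1$ holding $x_i$, and the names `recover` and `known` are hypothetical.

```python
from math import comb

def weight(v):
    """Hamming weight of an integer-encoded vector."""
    return bin(v).count("1")

def recover(x, d, known):
    """Value f(x) computed only from the values of f on inputs of weight at most d."""
    total = 0
    for y, fy in known.items():
        if fy and (y & x) == y:                       # y is covered by x
            k = weight(x) - weight(y)
            count = sum(comb(k, i) for i in range(d - weight(y) + 1))
            total ^= count & 1                        # only the parity matters
    return total

def f(x):
    """f = x1 x2 + x3 + x2 x4 + 1, algebraic degree 2."""
    x1, x2, x3, x4 = x & 1, (x >> 1) & 1, (x >> 2) & 1, (x >> 3) & 1
    return (x1 & x2) ^ x3 ^ (x2 & x4) ^ 1

n, d = 4, 2
known = {y: f(y) for y in range(2 ** n) if weight(y) <= d}   # restriction of f to E_d
recovered = [recover(x, d, known) for x in range(2 ** n)]
print(recovered == [f(x) for x in range(2 ** n)])
```

Only the 11 values of `f` on inputs of weight at most 2 are used, yet all 16 values are reproduced, as the degree bound guarantees.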

The simplest functions, from the viewpoint of the ANF, are those Boolean functions of
algebraic degree at most 1, that is, affine functions (the sums of linear and constant functions,
sometimes called parity functions; see, e.g., [914]):
f (x) = a0 ⊕ a1 x1 ⊕ · · · ⊕ an xn ; ai ∈ F2 .
Denoting by $a \cdot x$ the usual inner product $a \cdot x = a_1 x_1 \oplus \cdots \oplus a_n x_n$ in $\mathbb{F}_2^n$ (already
encountered in Section 1.2), or any other inner product (that is,¹² any symmetric bivariate
function such that, for every $a \neq 0$, the function $x \mapsto a \cdot x$ is a nonzero linear form¹³ on
$\mathbb{F}_2^n$), the general form of an n-variable affine function is $a \cdot x \oplus a_0 = \ell_a(x) \oplus a_0$ (with
$a \in \mathbb{F}_2^n$; $a_0 \in \mathbb{F}_2$), since the nondegeneracy of the bilinear form implies that the mapping
$a \mapsto \ell_a$ is injective and therefore bijective.

11 Note that taking $f(x) = 0$, $\forall x \in E$, leads to another problem: determining the so-called annihilators $f$ of the
indicator $1_E$ of $E$ (the characteristic function of $E$, defined by $1_E(x) = 1$ if $x \in E$ and $1_E(x) = 0$ otherwise).
This is the core analysis of Boolean functions from the viewpoint of algebraic attacks; see Section 3.1.
12 In nonzero characteristic, there is no possible notion of positivity.

Affine functions play an important role in coding (they are involved in the definition of the
Reed–Muller code of order 1; see Section 4.1) and in cryptography (the Boolean functions
used as “nonlinear functions” in cryptosystems must behave as differently as possible from
affine functions; see Section 3.1).

Algebraic degree and derivation


The derivation of Boolean functions must not be confused with the derivation of
polynomials:

Definition 7 Let f be an n-variable Boolean function and let a be any vector in Fn2 . We
call derivative14 in the direction a (or with the input difference a) of f the Boolean function
Da f (x) = f (x) ⊕ f (x + a).

For instance, the derivative of a function expressed in the form g(x1 , . . . , xn−1 ) ⊕
xn h(x1 , . . . , xn−1 ) in the direction (0, . . . , 0, 1) equals h(x1 , . . . , xn−1 ).

Proposition 5 Any derivative of any nonconstant Boolean function $f$ has an algebraic
degree strictly smaller than the algebraic degree of $f$, and there exists at least one derivative
of algebraic degree $d_{alg}(f) - 1$.

Proof The first assertion can be checked easily for each monomial $x^I$, where $I \neq \emptyset$:
we have
$$x^I \oplus (x + a)^I = \bigoplus_{J \subset I,\ J \neq I} \Big( \prod_{j \in I \setminus J} a_j \Big) x^J.$$
The second assertion is a direct
consequence, by affine invariance of the algebraic degree, of the fact observed just above for
direction (0, . . . , 0, 1).
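Proposition 5 can be observed on the function of Table 2.1. The following sketch assumes truth tables indexed by integers (so that $x + a$ becomes a bitwise XOR of indices); the helper names are hypothetical.

```python
def mobius(table):
    """Binary Möbius transform (butterfly on a copy of the table)."""
    t = list(table)
    step = 1
    while step < len(t):
        for start in range(0, len(t), 2 * step):
            for j in range(start, start + step):
                t[j + step] ^= t[j]
        step *= 2
    return t

def algebraic_degree(tt):
    return max((bin(u).count("1") for u, a in enumerate(mobius(tt)) if a), default=0)

def derivative(tt, a):
    """Truth table of D_a f(x) = f(x) + f(x + a)."""
    return [tt[x] ^ tt[x ^ a] for x in range(len(tt))]

# f(x) = x2 + x1 x2 x3 + x1 x4, of algebraic degree 3.
f = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]
degrees = {a: algebraic_degree(derivative(f, a)) for a in range(1, 16)}
print(algebraic_degree(f), degrees)
```

For instance, the direction $a = (1, 0, 0, 0)$ gives $D_a f = x_2 x_3 \oplus x_4$ of degree 2, while $a = (0, 0, 0, 1)$ gives $D_a f = x_1$ of degree 1.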

Note that this implies that a function is affine if and only if all its derivatives are constant
(this is more generally valid for every function defined over a vector space). And it is
quadratic if and only if all its derivatives are affine. For a general function, the sets of those
vectors a such that Da f is constant (resp. affine) are vector subspaces of Fn2 ; see page 99.
In [275], Boolean functions f whose restrictions to all affine hyperplanes have the same
algebraic degree equal to dalg (f ) and functions whose derivatives Da f (x), a ≠ 0n , have all
the same algebraic degree dalg (f ) − 1 are studied. Three classes of Boolean functions are
presented; the first class satisfies both conditions, the second class satisfies the first condition
but not the second, and the third class satisfies the second condition but not the first. This
same paper gives, for any fixed positive integer k and for all integers n, p, s such that
p ≥ k + 1, s ≥ k + 1, and n ≥ ps, a class Cn,p,s of n-variable Boolean functions whose
restrictions to all k-codimensional affine subspaces of Fn2 have the same algebraic degree as
the function.
Higher-order derivatives have been introduced by Lai [735].
13 That is, “·” is a nondegenerate bilinear form.
14 Some authors write “directional derivative”.

Definition 8 Let $f$ be an n-variable Boolean function and let $a_1, \dots, a_k$ be $k$ vectors in
$\mathbb{F}_2^n$. We call the k-th order derivative of $f$ in the directions $a_1, \dots, a_k$ the Boolean function
$D_{a_1} D_{a_2} \cdots D_{a_k} f(x)$.

It is easily seen by induction on $k$ that if $a_1, \dots, a_k$ are linearly independent, then
$D_{a_1} D_{a_2} \cdots D_{a_k} f(x) = \bigoplus_{a \in E} f(x + a)$, where $E$ is the $\mathbb{F}_2$-vector space spanned by
$a_1, \dots, a_k$, and otherwise $D_{a_1} D_{a_2} \cdots D_{a_k} f(x) = 0$.

Corollary 1 Any k-th order derivative of any Boolean function f of an algebraic degree
at least k has an algebraic degree at most dalg (f ) − k.
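The description of higher-order derivatives as sums over the spanned subspace, and the degree bound of Corollary 1, can be checked on a small case. As before this is a sketch with hypothetical helper names, using integer-encoded vectors and the function of Table 2.1.

```python
def mobius(table):
    t = list(table)
    step = 1
    while step < len(t):
        for start in range(0, len(t), 2 * step):
            for j in range(start, start + step):
                t[j + step] ^= t[j]
        step *= 2
    return t

def algebraic_degree(tt):
    return max((bin(u).count("1") for u, a in enumerate(mobius(tt)) if a), default=0)

def derivative(tt, a):
    return [tt[x] ^ tt[x ^ a] for x in range(len(tt))]

def kth_derivative(tt, directions):
    """Apply D_a successively for each direction a."""
    for a in directions:
        tt = derivative(tt, a)
    return tt

def span(vectors):
    """F_2-vector space spanned by the given integer-encoded vectors."""
    s = {0}
    for v in vectors:
        s |= {w ^ v for w in s}
    return s

def sum_over_span(tt, directions):
    """x -> XOR of f(x + a) for a in the span of the directions."""
    out = []
    for x in range(len(tt)):
        acc = 0
        for a in span(directions):
            acc ^= tt[x ^ a]
        out.append(acc)
    return out

f = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]    # degree 3
second = kth_derivative(f, [1, 2])                       # independent directions
dependent = kth_derivative(f, [1, 2, 3])                 # 3 = 1 XOR 2
print(second == sum_over_span(f, [1, 2]), algebraic_degree(second), any(dependent))
```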

The algebraic normal form of vectorial functions


The notion of the algebraic normal form of Boolean functions can easily be extended
to (n, m)-functions. Given such function $F$, each coordinate function of $F$ is uniquely
represented by its ANF, which is an element of $\mathbb{F}_2[x_1, \dots, x_n]/(x_1^2 \oplus x_1, \dots, x_n^2 \oplus x_n)$.
Function $F$ is then represented in a unique way as an element of $\mathbb{F}_2^m[x_1, \dots, x_n]/(x_1^2 \oplus x_1, \dots, x_n^2 \oplus x_n)$:
$$F(x) = \sum_{I \subseteq \{1,\dots,n\}} a_I \Big( \prod_{i \in I} x_i \Big) = \sum_{I \subseteq \{1,\dots,n\}} a_I\, x^I, \tag{2.6}$$
where $a_I$ belongs to $\mathbb{F}_2^m$ (maybe we should write
$$F(x) = \sum_{I \subseteq \{1,\dots,n\}} \Big( \prod_{i \in I} x_i \Big) a_I = \sum_{I \subseteq \{1,\dots,n\}} x^I a_I,$$
since $\prod_{i \in I} x_i$ is a scalar and $a_I$ is a vector). According to our convention on the notation for
additions, we used $\sum$ to denote the sum in $\mathbb{F}_2^m$, but recall that, coordinate by coordinate, this
sum is a $\bigoplus$.
This polynomial is called the algebraic normal form (ANF) of $F$. According to Relation
(2.3), we have $F(x) = \sum_{I \subseteq \mathrm{supp}(x)} a_I$, and according to Relation (2.4), we have
$a_I = \sum_{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I} F(x)$ (these sums being calculated in $\mathbb{F}_2^m$).

Remark. An (n, m)-function $F(x)$ being given by its ANF and an (m, r)-function $G(y)$
being given by the ANF of the indicator $1_{\mathcal{G}_G}(y, z)$ of its graph $\mathcal{G}_G = \{(y, G(y));\ y \in \mathbb{F}_2^m\}$,
the ANF of the indicator $1_{\mathcal{G}_{G \circ F}}(x, z)$ of the graph of the composite function $G \circ F$ equals
$1_{\mathcal{G}_G}(F(x), z)$, where we denote a function and its ANF the same way.
If we are given the ANF of $1_{\mathcal{G}_F}$ rather than that of $F(x)$, then as observed in [254],
$1_{\mathcal{G}_{G \circ F}}(x, z)$ can be obtained by the elimination of $y$ from the two equations $1_{\mathcal{G}_F}(x, y) = 1$
and $1_{\mathcal{G}_G}(y, z) = 1$. Since for every $x$, there is exactly one $y$ such that $1_{\mathcal{G}_F}(x, y) = 1$, then
$1_{\mathcal{G}_{G \circ F}}(x, z)$ equals $\sum_{y \in \mathbb{F}_2^m} 1_{\mathcal{G}_F}(x, y)\, 1_{\mathcal{G}_G}(y, z) = \bigoplus_{y \in \mathbb{F}_2^m} 1_{\mathcal{G}_F}(x, y)\, 1_{\mathcal{G}_G}(y, z)$. This formula
can be easily iterated (with more than two functions), and we shall see that it gives
information that is more exploitable than $1_{\mathcal{G}_{G \circ F}}(x, z) = 1_{\mathcal{G}_G}(F(x), z)$ because it deals with
a multiplication instead of a composition.

Algebraic degree of a vectorial function


The algebraic degree of an (n, m)-function is by definition the global degree of its ANF:
$d_{alg}(F) = \max\{|I|;\ I \subseteq \{1, \dots, n\},\ a_I \neq 0_m\}$. It therefore equals the maximal algebraic
degree of the coordinate functions of $F$. It also equals the maximal algebraic degree of the
component functions (in brief, components) of $F$, that is, the nonzero linear combinations of
the coordinate functions, i.e., the functions of the form $v \cdot F$, where $v \in \mathbb{F}_2^m \setminus \{0_m\}$ and “·” is
an inner product in $\mathbb{F}_2^m$. The algebraic degree of vectorial functions is an affine invariant (that
is, its value does not change when we compose $F$, on the right or on the left, by an affine
automorphism). For functions of algebraic degree strictly larger than 1, it is an EA invariant,
but it is not a CCZ invariant. In particular, the algebraic degrees of a permutation and its
compositional inverse are in general not equal. It is, however, observed in [106] that if an
(n, n)-permutation $F$ has algebraic degree $n - 1$ (the maximum for a permutation), then its
inverse has also algebraic degree $n - 1$. In fact, this is a direct consequence of Relation (2.5)
by considering the terms $x^I y^J$ where $|I| = |J| = n - 1$. Note that, according to Proposition
2 on the graphs of (n, m)-functions, writing $1_{\mathcal{G}_F}(x, y)$ in the form $\bigoplus_{J \subseteq \{1,\dots,m\}} \varphi_J(x)\, y^J$, we
have that $d_{alg}(F) = \max_{|J| = m-1} d_{alg}(\varphi_J(x))$ and
$$d_{alg}(1_{\mathcal{G}_F}) = \max_{J \subseteq \{1,\dots,m\}} \Big( d_{alg}\Big( \prod_{j \in \{1,\dots,m\} \setminus J} (f_j \oplus 1) \Big) + |J| \Big) \tag{2.7}$$
$$\ge \max(m,\ m - 1 + d_{alg}(F)). \tag{2.8}$$
If the algebraic degree of 1GF is low (i.e., close to m), then all the products of a few
coordinate functions of F have low algebraic degree.

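Proposition 2 and the relation $1_{\mathcal{G}_{G\circ F}}(x, z) = \bigoplus_{y\in\mathbb{F}_2^m} 1_{\mathcal{G}_F}(x, y)\, 1_{\mathcal{G}_G}(y, z)$ lead in [254] to
the bounds

$$d_{alg}(G \circ F) \leq d_{alg}(1_{\mathcal{G}_F}) + d_{alg}(G) - m \quad \text{and} \qquad (2.9)$$

$$d_{alg}(H \circ G \circ F) \leq d_{alg}(1_{\mathcal{G}_F}) + d_{alg}(1_{\mathcal{G}_G}) + d_{alg}(H) - m - r, \qquad (2.10)$$

for every (n, m)-function F, (m, r)-function G and (r, s)-function H.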
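Relations (2.8) and (2.9) can be observed numerically on small examples. The sketch below computes ANF coefficients with the binary Möbius transform and reads off algebraic degrees; the lookup tables `F` and `G`, the integer encoding of vectors, and the variable ordering of the graph indicator (x in the low bits, y in the high bits) are illustrative assumptions.

```python
def mobius(tt):
    """Binary Möbius transform: truth table (0/1 list of length 2^n) -> ANF coefficients."""
    a = list(tt)
    step = 1
    while step < len(a):
        for i in range(len(a)):
            if i & step:
                a[i] ^= a[i ^ step]
        step <<= 1
    return a


def degree(tt):
    """Algebraic degree: largest |I| with a_I != 0 (0 is returned for the zero function)."""
    return max((bin(i).count("1") for i, c in enumerate(mobius(tt)) if c), default=0)


def vectorial_degree(table, out_bits):
    """Maximal algebraic degree of the coordinate functions of a lookup table."""
    return max(
        degree([(table[x] >> j) & 1 for x in range(len(table))]) for j in range(out_bits)
    )


def graph_indicator_table(table, in_bits, out_bits):
    """Truth table of 1_GF in n + m variables; index = x + 2^n * y."""
    mask = (1 << in_bits) - 1
    return [
        int(table[idx & mask] == (idx >> in_bits))
        for idx in range(1 << (in_bits + out_bits))
    ]


# Hypothetical toy functions (same integer encoding of vectors as above).
N, M, R = 3, 2, 3
F = [3, 0, 2, 1, 1, 3, 0, 2]            # (3, 2)-function
G = [5, 1, 6, 2]                        # (2, 3)-function
GF = [G[F[x]] for x in range(1 << N)]

d_F = vectorial_degree(F, M)
d_G = vectorial_degree(G, R)
d_GF = vectorial_degree(GF, R)
d_ind = degree(graph_indicator_table(F, N, M))

bound_28 = d_ind >= max(M, M - 1 + d_F)     # Relation (2.8)
bound_29 = d_GF <= d_ind + d_G - M          # Relation (2.9)
print(d_F, d_G, d_GF, d_ind, bound_28, bound_29)
```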


If F is a permutation, then, as observed in [253, 254], $1_{\mathcal{G}_{G\circ F}}(x, z)$ is equal to
$\bigoplus_{y\in\mathbb{F}_2^m} 1_{\mathcal{G}_{F^{-1}}}(y, x)\, 1_{\mathcal{G}_G}(y, z)$, that is, according to Proposition 2, page 35, and Proposition 3,
page 36:

$$1_{\mathcal{G}_{G\circ F}}(x, z) = \bigoplus_{\substack{I \subseteq \{1,\dots,n\} \\ K \subseteq \{1,\dots,r\}}} x^I z^K \left( \bigoplus_{y\in\mathbb{F}_2^m} \Big( \prod_{i \in I^c} (f_i(y) \oplus 1) \prod_{k \in K^c} (g_k(y) \oplus 1) \Big) \right)$$

$$= \bigoplus_{\substack{I \subseteq \{1,\dots,n\},\ K \subseteq \{1,\dots,r\}; \\ d_{alg}\left( \prod_{i \in I^c} (f_i \oplus 1) \prod_{k \in K^c} (g_k \oplus 1) \right) = n}} x^I z^K, \qquad (2.11)$$

where $I^c = \{1,\dots,n\} \setminus I$, $K^c = \{1,\dots,r\} \setminus K$, the $f_i$ are the coordinate functions
of $F^{-1}$ and the $g_k$ are those of G. Then, still according to Proposition 2 and as proved in
[254], we have directly from (2.11) that

$$d_{alg}(G \circ F) = \max_{k \in \{1,\dots,r\}} \max\left\{ |I|;\ d_{alg}\Big( (g_k \oplus 1) \prod_{i \in I^c} (f_i \oplus 1) \Big) = n \right\}. \qquad (2.12)$$

Note that, according to Relation (2.5), page 35, and as observed by [106] (but in a more
complex way), for all integers k, l, the maximal algebraic degree of the product of
at most k coordinate functions$^{15}$ of F, that we shall denote by $d_{alg}^{(k)}(F)$, satisfies:
$d_{alg}^{(k)}(F) < n - l \iff d_{alg}^{(l)}(F^{-1}) < n - k$.
The case of functions over F2n is also studied in [254].
Another notion of degree is also relevant to cryptography (and is also affine invariant): the
minimum algebraic degree of all the component functions$^{16}$ of F, often called the minimum
degree:

$$d_{min}(F) = \min\{ d_{alg}(v \cdot F) :\ 0_m \neq v \in \mathbb{F}_2^m \} \leq d_{alg}(F).$$
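A small example shows why the minimum must range over all component functions and not only over the coordinate functions. The sketch below uses a hypothetical (3, 2)-function with coordinates $f_1 = x_1x_2 \oplus x_3$ and $f_2 = x_1x_2$ (both of degree 2), whose component $f_1 \oplus f_2 = x_3$ has degree 1; the integer encoding of vectors and the usual inner product are assumptions of the example.

```python
def mobius(tt):
    """Binary Möbius transform: truth table -> ANF coefficients."""
    a = list(tt)
    step = 1
    while step < len(a):
        for i in range(len(a)):
            if i & step:
                a[i] ^= a[i ^ step]
        step <<= 1
    return a


def degree(tt):
    """Algebraic degree of a Boolean function given by its truth table."""
    return max((bin(i).count("1") for i, c in enumerate(mobius(tt)) if c), default=0)


def bit(x, i):
    return (x >> i) & 1


n, m = 3, 2
# Hypothetical example: F = (f1, f2), f1 = x1 x2 + x3, f2 = x1 x2 (x1 is bit 0 of x).
F = [
    ((bit(x, 0) & bit(x, 1)) ^ bit(x, 2)) | ((bit(x, 0) & bit(x, 1)) << 1)
    for x in range(1 << n)
]


def component(v):
    """Truth table of the component function v . F (usual inner product)."""
    return [bin(v & F[x]).count("1") & 1 for x in range(1 << n)]


component_degrees = {v: degree(component(v)) for v in range(1, 1 << m)}
d_alg = max(component_degrees.values())
d_min = min(component_degrees.values())
print(component_degrees, d_alg, d_min)
```

Here both coordinate functions have degree 2, yet the minimum degree is 1, which illustrates footnote 16.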

2.2.2 Univariate and trace representations


A second kind of representation plays an important role in sequence theory, and is also used
for defining and studying Boolean functions. For instance, it allows us to define the S-box of
the AES and leads to the construction of the Kerdock codes (see Section 6.1.22). Recall that,
for every n, there exists a (unique up to isomorphism) field $\mathbb{F}_{2^n}$ (also denoted by $GF(2^n)$ in
some papers) of order $2^n$; see [775, 890]. For making this book self-contained, we recall in
Appendix (Chapter 14, page 480) the basics on finite fields, permutation polynomials and
equations over finite fields. The vector space $\mathbb{F}_2^n$ can be endowed with the structure of this
field $\mathbb{F}_{2^n}$ (by construction and because $\mathbb{F}_{2^n}$ has the structure of an n-dimensional $\mathbb{F}_2$-vector
space; if we choose an $\mathbb{F}_2$-basis $(\alpha_1, \dots, \alpha_n)$ of this vector space, then every element $x \in \mathbb{F}_2^n$
can be identified with $x_1\alpha_1 + \dots + x_n\alpha_n \in \mathbb{F}_{2^n}$). We shall still denote by x this element of
the field.

Univariate representation of (n,n)-functions


Every mapping from $\mathbb{F}_{2^n}$ into $\mathbb{F}_{2^n}$ (and hence any (n, n)-function$^{17}$) admits a (unique)
representation as a polynomial over $\mathbb{F}_{2^n}$ in one variable and of (univariate) degree at most
$2^n - 1$:

$$F(x) = \sum_{i=0}^{2^n-1} \delta_i x^i; \quad \delta_i \in \mathbb{F}_{2^n}. \qquad (2.13)$$

Indeed, the function mapping every such polynomial to the corresponding polynomial
function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$ is $\mathbb{F}_{2^n}$-linear and has trivial kernel since a nonzero polynomial
cannot have a number of distinct zeros larger than its degree. Since the dimensions of the
$\mathbb{F}_{2^n}$-vector space of such polynomials and of the $\mathbb{F}_{2^n}$-vector space of all (n, n)-functions both
equal $2^n$, this function is a bijection.
15 The algebraic degree of the product of k coordinate functions equals n if k = n and is strictly smaller if k < n,
as can be easily shown and is characteristic of permutations.
16 Not just the coordinate functions; the notion would then not be affine invariant.
17 Note that if m divides n, then any function from $\mathbb{F}_{2^n}$ into $\mathbb{F}_{2^m}$ is a function from $\mathbb{F}_{2^n}$ into $\mathbb{F}_{2^n}$; hence we also
cover such (n, m)-functions here. When m does not divide n, we can view the elements of $\mathbb{F}_2^m$ as elements of
$\mathbb{F}_2^m \times \{0_{n-m}\} \subset \mathbb{F}_2^n$ and represent them as elements of $\mathbb{F}_{2^n}$, but this is a little more artificial.

Definition 9 We call univariate representation of an (n, n)-function F the unique polynomial
$\sum_{i=0}^{2^n-1} \delta_i X^i$ satisfying (2.13).

We shall also sometimes write that F is in univariate form.

Remark. $\mathbb{F}_{2^n}$ is the set of solutions of equation $x^{2^n} + x = 0$. We can then better view the
univariate representation of (n, n)-functions as lying in the quotient ring $\mathbb{F}_{2^n}[X]/(X^{2^n} + X)$,
each element of this ring being then represented as the remainder in the division by
$X^{2^n} + X$.

Note that the univariate representation of any (n, n)-function can be obtained by the
Lagrange interpolation method or as follows: since every element x in $\mathbb{F}_{2^n}^*$ satisfies
$x^{2^n-1} = 1$, the function $x^{2^n-1} + 1$ equals the Dirac (or Kronecker) symbol (i.e., the indicator
of {0}), and the polynomial $\sum_{a\in\mathbb{F}_{2^n}} F(a)\big((X + a)^{2^n-1} + 1\big)$ is therefore the univariate representation of
F. Note in particular that the coefficient of $x^{2^n-1}$ in this univariate representation equals the
sum of all values F(a). A way of obtaining more directly the univariate representation is by
using the so-called Mattson–Solomon polynomial that we shall see at page 44.
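The interpolation formula $\sum_{a} F(a)\big((X + a)^{2^n-1} + 1\big)$ can be expanded mechanically for small n. The following sketch works in $\mathbb{F}_8$; the modulus $X^3 + X + 1$, the integer encoding of field elements, the helper names and the lookup table `S` are all assumptions made for the illustration.

```python
N, POLY = 3, 0b1011      # F_8 built as F_2[X]/(X^3 + X + 1): an assumed, illustrative choice
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} encoded as integers (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def gf_pow(a, e):
    """a^e by repeated multiplication, with the convention 0^0 = 1."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


def times_x_plus_a(p, a):
    """Multiply the polynomial p (p[i] = coefficient of X^i) by (X + a)."""
    q = [0] * (len(p) + 1)
    for i, c in enumerate(p):
        q[i + 1] ^= c
        q[i] ^= gf_mul(c, a)
    return q


def univariate(table):
    """Coefficients delta_0..delta_{2^N - 1} of sum_a F(a) ((X + a)^(2^N - 1) + 1)."""
    delta = [0] * Q
    for a in range(Q):
        p = [1]
        for _ in range(Q - 1):
            p = times_x_plus_a(p, a)
        p[0] ^= 1                       # (X + a)^(2^N - 1) + 1
        for i in range(Q):
            delta[i] ^= gf_mul(table[a], p[i])
    return delta


def evaluate(delta, x):
    """Evaluate the polynomial with coefficients delta at the field element x."""
    acc = 0
    for i, d in enumerate(delta):
        acc ^= gf_mul(d, gf_pow(x, i))
    return acc


cube = [gf_pow(x, 3) for x in range(Q)]     # lookup table of x -> x^3
delta_cube = univariate(cube)

S = [1, 5, 2, 7, 0, 3, 6, 4]                # hypothetical lookup table (a permutation of F_8)
delta_S = univariate(S)
print(delta_cube, delta_S)
```

On the table of $x \mapsto x^3$ the procedure returns the single monomial $X^3$, and for any table the leading coefficient equals the sum of all values, as noted in the text.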

Univariate representation of Boolean functions


Any Boolean function on $\mathbb{F}_{2^n}$ is a particular case of a vectorial function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$
(since $\mathbb{F}_2$ is a subfield of $\mathbb{F}_{2^n}$) and has then a (unique) univariate representation. Recall that
the mapping $x \mapsto x^2$ is a field automorphism called the Frobenius automorphism. The
polynomial $\sum_{i=0}^{2^n-1} \delta_i X^i$, $\delta_i \in \mathbb{F}_{2^n}$, is the univariate representation of a Boolean function if
and only if the functions $\big(\sum_{i=0}^{2^n-1} \delta_i x^i\big)^2$ and $\sum_{i=0}^{2^n-1} \delta_i x^i$ take the same value at every $x \in \mathbb{F}_{2^n}$,
that is, if and only if $\sum_{i=0}^{2^n-1} \delta_i^2 X^{2i} \equiv \sum_{i=0}^{2^n-1} \delta_i X^i\ [\mathrm{mod}\ X^{2^n} + X]$, that is, $\delta_0, \delta_{2^n-1} \in \mathbb{F}_2$
and, for every $i = 1, \dots, 2^n - 2$, $\delta_{2i} = \delta_i^2$, where the index 2i is taken mod $2^n - 1$.

Absolute trace representation of Boolean functions and vectorial functions

The absolute trace function on $\mathbb{F}_{2^n}$, $\mathrm{tr}_n(x) = x + x^2 + x^{2^2} + \dots + x^{2^{n-1}}$, is addressed at
page 489 (it is $\mathbb{F}_2$-linear, satisfies $(\mathrm{tr}_n(x))^2 = \mathrm{tr}_n(x^2) = \mathrm{tr}_n(x)$, and is valued in $\mathbb{F}_2$). The
function $(x, y) \mapsto \mathrm{tr}_n(x y)$ is an inner product in $\mathbb{F}_{2^n}$ (recall that this means it is symmetric
and, for every $y \neq 0$, the function $x \mapsto \mathrm{tr}_n(x y)$ is a nonzero linear form over $\mathbb{F}_{2^n}$). Every
Boolean function can be written in the form $f(x) = \mathrm{tr}_n(F(x))$, where F is a mapping from
$\mathbb{F}_{2^n}$ into $\mathbb{F}_{2^n}$ (an example of such mapping F is defined by $F(x) = \lambda f(x)$, where $\mathrm{tr}_n(\lambda) = 1$
and f(x) is in univariate representation). Thus, every n-variable Boolean function f can be
also represented in the form

$$f(x) = \mathrm{tr}_n\left( \sum_{i=0}^{2^n-1} \beta_i x^i \right), \qquad (2.14)$$

where $\beta_i \in \mathbb{F}_{2^n}$. Note that, thanks to the fact that $\mathrm{tr}_n$ is $\mathbb{F}_2$-linear and $\mathrm{tr}_n(x^2) = \mathrm{tr}_n(x)$
for every $x \in \mathbb{F}_{2^n}$, each term $\beta_i x^i$ in (2.14) can be replaced by its $2^j$-th power, for every
j and without changing the value of the expression. We can then transform (2.14) into
an expression $\mathrm{tr}_n\big(\sum_{i\in I} \gamma_i x^i\big)$ where I contains at most one element of each cyclotomic
class $\{ i \times 2^j\ \mathrm{mod}\ (2^n - 1);\ j \in \mathbb{N} \}$ of 2 modulo $2^n - 1$ (but this still does not make the
representation unique).
More generally, if m is a divisor of n, then any (n, m)-function F admits a univariate
polynomial representation in the form:

$$F(x) = \mathrm{tr}_m^n\left( \sum_{j=0}^{2^n-1} \delta_j x^j \right), \qquad (2.15)$$

where $\mathrm{tr}_m^n(x) = x + x^{2^m} + x^{2^{2m}} + x^{2^{3m}} + \dots + x^{2^{n-m}}$ is the trace function from $\mathbb{F}_{2^n}$ to
$\mathbb{F}_{2^m}$. Indeed, there exists a function G from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$ such that F equals $\mathrm{tr}_m^n \circ G$ (for
instance, $G(x) = \lambda F(x)$, where $\mathrm{tr}_m^n(\lambda) = 1$, since $\mathrm{tr}_m^n$ is an $\mathbb{F}_{2^m}$-linear form). But G is not
unique in this representation either.

Definition 10 We shall call the representation (2.14), resp. (2.15), an absolute trace
representation of Boolean function f (resp. of (n, m)-function F ).

Its use is convenient, with the drawback of nonuniqueness, which makes it more difficult
to determine when two functions are equal.

Subfield trace representation of Boolean functions

We come back to the univariate representation $\sum_{i=0}^{2^n-1} \delta_i X^i$. We have seen that for any
Boolean function, we have $\delta_0, \delta_{2^n-1} \in \mathbb{F}_2$, and for every $i = 1, \dots, 2^n - 2$, $\delta_{2i} = \delta_i^2$, where
the index 2i is taken modulo $2^n - 1$. Gathering all the elements of a same cyclotomic class
of 2 modulo $2^n - 1$ allows writing the univariate representation of f in the following form:

$$f(x) = \sum_{j \in \Gamma(n)} \mathrm{tr}_{n_j}(\beta_j x^j) + \beta_{2^n-1} x^{2^n-1}, \quad \text{with } \forall j \in \Gamma(n),\ \beta_j \in \mathbb{F}_{2^{n_j}},\ \ \beta_{2^n-1} \in \mathbb{F}_2, \qquad (2.16)$$

where $\Gamma(n)$ is a set of representatives of the cyclotomic classes of 2 modulo $2^n - 1$ (the most
usual choice of representative is the smallest element in the cyclotomic class, called the coset
leader of the class) and $n_j$ is the size of the cyclotomic class containing j. It is easily seen
that $n_j$ divides n and that $\beta_j \in \mathbb{F}_{2^{n_j}}$ because $\beta_j^{2^{n_j}} = \beta_j$. We also have that the j-th power
of every $x \in \mathbb{F}_{2^n}$ belongs to $\mathbb{F}_{2^{n_j}}$ because $j\, 2^{n_j} \equiv j\ [\mathrm{mod}\ 2^n - 1]$ implies $(x^j)^{2^{n_j}} = x^j$.
Hence, $\mathrm{tr}_{n_j}$ takes as an argument an element of $\mathbb{F}_{2^{n_j}}$, as it should. This representation allows
uniqueness.

Definition 11 We call (2.16) the subfield trace representation of function f .

We shall also sometimes write more simply that f is in trace form.
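The cyclotomic classes of 2 modulo $2^n - 1$, their coset leaders and their sizes $n_j$ are easy to enumerate. The sketch below is only an illustration (the function name `cyclotomic_classes` is a hypothetical helper); it lists the classes for n = 4, where each class size can be seen to divide n.

```python
def cyclotomic_classes(n):
    """Map each coset leader j to its cyclotomic class {j * 2^s mod (2^n - 1)}."""
    modulus = (1 << n) - 1
    seen, classes = set(), {}
    for j in range(modulus):
        if j in seen:
            continue
        cls, k = [], j
        while k not in cls:
            cls.append(k)
            k = (2 * k) % modulus
        seen.update(cls)
        classes[j] = cls        # j is the smallest element of its class (the coset leader)
    return classes


classes = cyclotomic_classes(4)
sizes = {leader: len(cls) for leader, cls in classes.items()}
print(classes)
print(sizes)
```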


Calculating the univariate and subfield trace representations of a Boolean function from its truth table
Denoting by $\alpha$ a primitive element of the field $\mathbb{F}_{2^n}$ (recall that this means that $\mathbb{F}_{2^n} =
\{0, 1, \alpha, \alpha^2, \dots, \alpha^{2^n-2}\}$), the Mattson–Solomon polynomial$^{18}$ of the vector $(f(1), f(\alpha),
f(\alpha^2), \dots, f(\alpha^{2^n-2}))$ is the polynomial [809, page 239]:

$$A(x) = \sum_{j=1}^{2^n-1} A_j\, x^{2^n-1-j} = \sum_{j=0}^{2^n-2} A_{2^n-1-j}\, x^j \qquad (2.17)$$

with the following:

$$A_j = \sum_{k=0}^{2^n-2} f(\alpha^k)\, \alpha^{kj}. \qquad (2.18)$$

Note that $A_j = a(\alpha^j)$, where $a(x) = \sum_{k=0}^{2^n-2} f(\alpha^k)\, x^k$.
We have, for every $0 \leq i \leq 2^n - 2$:

$$A(\alpha^i) = \sum_{j=1}^{2^n-1} A_j\, \alpha^{-ij} = \sum_{j=1}^{2^n-1} \sum_{k=0}^{2^n-2} f(\alpha^k)\, \alpha^{(k-i)j} = f(\alpha^i) \qquad (2.19)$$

(since, if $1 \leq k \neq i \leq 2^n - 2$, then $\sum_{j=1}^{2^n-1} \alpha^{(k-i)j} = \sum_{j=0}^{2^n-2} \alpha^{(k-i)j} = \frac{\alpha^{(k-i)(2^n-1)} + 1}{\alpha^{k-i} + 1} = 0$,
and if $k - i = 0$, then $\sum_{j=1}^{2^n-1} \alpha^{(k-i)j} = 1$). Note that, with the usual convention $0^0 = 1$,
we have $A(0) = A_{2^n-1}$. Hence, if $f(0) = A_{2^n-1} = \sum_{k=0}^{2^n-2} f(\alpha^k)$, that is, if f has
even Hamming weight (i.e., algebraic degree strictly less than n), the Mattson–Solomon
polynomial A(x) equals the univariate representation of f(x). Otherwise, we have $f(x) =
A(x) + 1 + x^{2^n-1}$, since $1 + x^{2^n-1}$ equals the Dirac (or Kronecker) function at 0 (i.e., takes
value 1 at 0 and 0 at every nonzero element of $\mathbb{F}_{2^n}$). This provides the following univariate
representation:

$$f(x) = f(0) + \sum_{j=1}^{2^n-2} A_j\, x^{2^n-1-j} + (w_H(f)\ [\mathrm{mod}\ 2])\, x^{2^n-1}$$

and the subfield trace representation:

$$f(x) = \sum_{j \in \Gamma(n)} \mathrm{tr}_{n_j}\big(A_{2^n-1-j}\, x^j\big) + (w_H(f)\ [\mathrm{mod}\ 2])\,(1 + x^{2^n-1}).$$

18 The Mattson–Solomon transform is a discrete Fourier transform (over $\mathbb{F}_{2^n}$); other discrete Fourier transforms
exist (e.g., over the complex field, as in [1110]).

Remark. For any Boolean function f, we have in (2.18) that $A_{2j} = A_j^2$ and this allows
us to gather the terms corresponding to a same cyclotomic class. This provides the
subfield trace representation of f. We can also, thanks to a change of the coefficients,
write

$$f(\alpha^i) = \sum_{j=1}^{2^n-2} \mathrm{tr}_n(a_j\, \alpha^{-ij}) \qquad (2.20)$$

and obtain the absolute trace representation of f. This shows what was asserted at the end
of Subsection 1.2.4.
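Relations (2.17) to (2.19) and the identity $A_{2j} = A_j^2$ can be verified on a small field. The following sketch works in $\mathbb{F}_8$ under stated assumptions: the modulus $X^3 + X + 1$, the primitive element $\alpha$ encoded as the integer 2, and the truth table `f` (indexed by field elements encoded as integers) are illustrative choices, not data from the text.

```python
N, POLY, ALPHA = 3, 0b1011, 2    # F_8 = F_2[X]/(X^3 + X + 1), alpha = X (assumed setup)
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def gf_pow(a, e):
    """a^e by repeated multiplication, with the convention 0^0 = 1."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


# Hypothetical Boolean function on F_8 (odd Hamming weight, so the correction term is needed).
f = [0, 1, 1, 0, 1, 0, 0, 0]

# Relation (2.18): A_j = sum_k f(alpha^k) alpha^(k j), for j = 1, ..., 2^N - 1.
A = {}
for j in range(1, Q):
    acc = 0
    for k in range(Q - 1):
        if f[gf_pow(ALPHA, k)]:
            acc ^= gf_pow(ALPHA, (k * j) % (Q - 1))
    A[j] = acc


def mattson_solomon(x):
    """Relation (2.17): A(x) = sum_{j=1}^{2^N - 1} A_j x^(2^N - 1 - j)."""
    acc = 0
    for j in range(1, Q):
        acc ^= gf_mul(A[j], gf_pow(x, Q - 1 - j))
    return acc


parity = sum(f) % 2              # w_H(f) mod 2


def univariate_value(x):
    """A(x) plus the Dirac correction (w_H(f) mod 2) (1 + x^(2^N - 1))."""
    return mattson_solomon(x) ^ (parity * (1 ^ gf_pow(x, Q - 1)))


nonzero_ok = all(
    mattson_solomon(gf_pow(ALPHA, i)) == f[gf_pow(ALPHA, i)] for i in range(Q - 1)
)
everywhere_ok = all(univariate_value(x) == f[x] for x in range(Q))
frobenius_ok = all(A[(2 * j) % (Q - 1)] == gf_mul(A[j], A[j]) for j in range(1, Q - 1))
print(A, nonzero_ok, everywhere_ok, frobenius_ok)
```

Because the chosen `f` has odd weight, `mattson_solomon(0)` differs from `f[0]` and only the corrected expression agrees with f at 0.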

Remark on RS codes. Relations (2.17), (2.18), and (2.19) are valid for every function f
from $\mathbb{F}_{2^n}^*$ to $\mathbb{F}_{2^n}$. In this framework, A(x), which according to Relation (2.17) is the polynomial
representation (see page 12) of codeword $(A_{2^n-1}, A_{2^n-2}, \dots, A_1)$, belongs to the Reed–Solomon
code (see page 13) over $\mathbb{F}_{2^n}$ of length $2^n - 1$ and zeros $\alpha^{2^n-\delta}, \alpha^{2^n-\delta+1}, \dots, \alpha^{2^n-2}$
(whose designed distance is $\delta$) if and only if a(x) has degree at most $2^n - 1 - \delta$ (according
to Relation (2.19)), and the codeword $(A_{2^n-1}, A_{2^n-2}, \dots, A_1)$ is an evaluation vector of this
polynomial over $\mathbb{F}_{2^n}^*$, according to Relation (2.18). The BCH bound in this case corresponds
to the fact that a nonzero polynomial of degree at most $2^n - 1 - \delta$ has at most $2^n - 1 - \delta$
zeros in $\mathbb{F}_{2^n}$ and therefore has at least $\delta$ nonzeros in $\mathbb{F}_{2^n}^*$. This generalizes to RS codes
over $\mathbb{F}_q$.

Calculating the ANF of a Boolean function or a vectorial function from its univariate representation

We express x in the form $\sum_{i=1}^n x_i \alpha_i$, where $(\alpha_1, \dots, \alpha_n)$ is a basis of the $\mathbb{F}_2$-vector space
$\mathbb{F}_{2^n}$. Recall that, for every $j \in \mathbb{Z}/(2^n - 1)\mathbb{Z}$, the binary expansion of j has the form $\sum_{s\in E} 2^s$,
where $E \subseteq \{0, 1, \dots, n - 1\}$. The size of E is often called the 2-weight of j and written
$w_2(j)$. We write more conveniently the binary expansion of j in the form: $\sum_{s=0}^{n-1} j_s 2^s$, $j_s \in
\{0, 1\}$. We have then the following:

$$F(x) = \sum_{j=0}^{2^n-1} \delta_j \left( \sum_{i=1}^n x_i \alpha_i \right)^j
= \sum_{j=0}^{2^n-1} \delta_j \left( \sum_{i=1}^n x_i \alpha_i \right)^{\sum_{s=0}^{n-1} j_s 2^s}
= \sum_{j=0}^{2^n-1} \delta_j \prod_{s=0}^{n-1} \left( \sum_{i=1}^n x_i \alpha_i^{2^s} \right)^{j_s}.$$

Expanding these last products and simplifying gives the ANF of F.

Proposition 6 Any Boolean function (resp. any (n, n)-function) whose univariate representation
equals (2.13) has algebraic degree $\max_{j=0,\dots,2^n-1;\ \delta_j \neq 0} w_2(j)$.

Proof According to the above equalities, the algebraic degree is bounded above by this
number, and it cannot be strictly smaller, because the dimension of the $\mathbb{F}_2$-vector space (resp.
the $\mathbb{F}_{2^n}$-vector space) of n-variable Boolean functions (resp. of (n, n)-functions) of algebraic
degree at most d equals $\sum_{i=0}^{d} \binom{n}{i}$, which is also the dimension of the vector space of those
polynomials $\sum_{j=0}^{2^n-1} \delta_j x^j$ such that $\delta_0, \delta_{2^n-1} \in \mathbb{F}_2$, $\delta_j \in \mathbb{F}_{2^n}$, $\delta_{2j} = \delta_j^2 \in \mathbb{F}_{2^n}$ for every $j =
1, \dots, 2^n - 2$ and $\max_{j=0,\dots,2^n-1;\ \delta_j \neq 0} w_2(j) \leq d$ (resp. of those polynomials $\sum_{j=0}^{2^n-1} \delta_j x^j$
such that $\delta_j \in \mathbb{F}_{2^n}$ for every $j = 0, \dots, 2^n - 1$ and $\max_{j=0,\dots,2^n-1;\ \delta_j \neq 0} w_2(j) \leq d$).

In particular, an (n, n)-function F is $\mathbb{F}_2$-linear (resp. affine) if and only if F(x) is a
linearized polynomial over $\mathbb{F}_{2^n}$: $F(x) = \sum_{j=0}^{n-1} \beta_j x^{2^j}$; $x, \beta_j \in \mathbb{F}_{2^n}$ (resp. a linearized
polynomial plus a constant).
We have also the following proposition:

Proposition 7 [209] Let a be any element of $\mathbb{F}_{2^n}$ and k any integer [mod $2^n - 1$]. If
$f(x) = \mathrm{tr}_n(a x^k)$ is not the null function, then it has algebraic degree $w_2(k)$.

Proof Let $n_k$ be again the size of the cyclotomic class containing k. Then the univariate
representation of f(x) equals

$$\left( a + a^{2^{n_k}} + a^{2^{2n_k}} + \dots + a^{2^{n-n_k}} \right) x^k + \left( a + a^{2^{n_k}} + a^{2^{2n_k}} + \dots + a^{2^{n-n_k}} \right)^2 x^{2k}
+ \dots + \left( a + a^{2^{n_k}} + a^{2^{2n_k}} + \dots + a^{2^{n-n_k}} \right)^{2^{n_k-1}} x^{2^{n_k-1} k}.$$

All the exponents of x have 2-weight $w_2(k)$ and their coefficients are nonzero if and only if
f is not null.
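Proposition 7 can be checked exhaustively on a small field by computing the truth table of $\mathrm{tr}_n(a x^k)$, its ANF, and its algebraic degree. The sketch below works in $\mathbb{F}_{16}$; the modulus $X^4 + X + 1$ and the identification of $\mathbb{F}_2^4$ with $\mathbb{F}_{16}$ through the polynomial basis (integer encoding) are assumptions of this illustration.

```python
N, POLY = 4, 0b10011     # F_16 built as F_2[X]/(X^4 + X + 1): an assumed, illustrative choice
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def gf_pow(a, e):
    """a^e by repeated multiplication, with the convention 0^0 = 1."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


def trace(x):
    """Absolute trace x + x^2 + x^4 + ... + x^(2^(N-1)); the result is 0 or 1."""
    t, y = 0, x
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t


def mobius(tt):
    """Binary Möbius transform: truth table -> ANF coefficients."""
    a = list(tt)
    step = 1
    while step < len(a):
        for i in range(len(a)):
            if i & step:
                a[i] ^= a[i ^ step]
        step <<= 1
    return a


def degree(tt):
    """Algebraic degree of a Boolean function given by its truth table."""
    return max((bin(i).count("1") for i, c in enumerate(mobius(tt)) if c), default=0)


def w2(k):
    """2-weight of k: number of ones in its binary expansion."""
    return bin(k).count("1")


def trace_monomial_table(a, k):
    """Truth table of x -> tr_N(a x^k), x ranging over the integers 0..2^N - 1."""
    return [trace(gf_mul(a, gf_pow(x, k))) for x in range(Q)]


all_match = True
for a in range(1, Q):
    for k in range(1, Q - 1):
        tt = trace_monomial_table(a, k)
        if any(tt) and degree(tt) != w2(k):
            all_match = False
print(all_match)
```

Null functions do occur in the loop (for instance a = 1, k = 5, where the coefficient $a + a^{4}$ vanishes), which is why the proposition excludes them.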

Remark. An alternative (more complex but enlightening) way of showing Proposition 7
is also given in [209] as follows: let $r = w_2(k)$; we consider the r-linear function $\phi$ over
the field $\mathbb{F}_{2^n}$ whose value at $(x_1, \dots, x_r) \in (\mathbb{F}_{2^n})^r$ equals the sum of the images by f of all
the $2^r$ possible linear combinations of the $x_j$s. Then $\phi(x_1, \dots, x_r)$ equals the sum, for all
bijective mappings $\sigma$ from $\{1, \dots, r\}$ onto E (where $k = \sum_{s\in E} 2^s$) of $\mathrm{tr}_n\big(a \prod_{j=1}^r x_j^{2^{\sigma(j)}}\big)$.
Proving that f has degree r is equivalent to proving that $\phi$ is not null, and it can be shown
that if $\phi$ is null, then f is null.

Remark. For calculating the univariate representation from the ANF, we can only propose
to calculate the truth table (resp. the LUT) by the fast Möbius transform and then to apply
the method of page 44. Note, however, that the coefficient of $\prod_{i=1}^n x_i$ in the ANF of F is
directly linked to the coefficient of $x^{2^n-1}$ in its univariate representation since these two
coefficients are equal to each other (up to the correspondence between $\mathbb{F}_2^n$ and $\mathbb{F}_{2^n}$) because
they are both equal to the sum of all values F(x).

To complete this subsection, we give a corollary of Proposition 6 (which for d = n − 2,
n − 1 gives back the two last properties in Proposition 3, page 36):

Corollary 2 A vectorial function $F : \mathbb{F}_{2^n} \to \mathbb{F}_{2^n}$ has an algebraic degree at most d if and
only if, for every nonnegative integer k of 2-weight at most n − d − 1, we have the following:

$$\sum_{x\in\mathbb{F}_{2^n}} x^k F(x) = 0.$$

The condition is necessary by applying to function $x^k F(x)$ the fact that, for every (n, n)-function
G of algebraic degree at most n − 1, we have $\sum_{x\in\mathbb{F}_{2^n}} G(x) = 0$, and since, for every
nonnegative integer i, we have $w_2(k + i) \leq w_2(k) + w_2(i)$. The condition is also sufficient
since, for every (n, n)-function G of algebraic degree n, we have $\sum_{x\in\mathbb{F}_{2^n}} G(x) \neq 0$, and
since, for every i of 2-weight strictly larger than d, there exists k of 2-weight at most n − d − 1
such that $w_2(k + i) = n$; if i is taken with the highest possible 2-weight in the univariate
representation of F, we can manage that $\sum_{x\in\mathbb{F}_{2^n}} x^{k+j} = 0$ for other $j \neq i$ such that $x^j$ has
nonzero coefficient in the univariate representation.
See more on the algebraic degree, in particular for composite functions, in ANF or
univariate representations, in [253, 254].
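The criterion of Corollary 2 translates directly into a test on power sums. The following sketch applies it in $\mathbb{F}_{16}$ to the power functions $x^3$ and $x^{14}$; the modulus $X^4 + X + 1$, the integer encoding of field elements, and the restriction of k to the range $0, \dots, 2^n - 1$ (assumed here to be sufficient, since exponents only matter modulo $2^n - 1$ on nonzero elements) are assumptions of the example.

```python
N, POLY = 4, 0b10011     # F_16 built as F_2[X]/(X^4 + X + 1): an assumed, illustrative choice
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def gf_pow(a, e):
    """a^e by repeated multiplication, with the convention 0^0 = 1."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


def w2(k):
    """2-weight of k."""
    return bin(k).count("1")


def power_sum(table, k):
    """Sum over x in F_{2^N} of x^k F(x)."""
    acc = 0
    for x in range(Q):
        acc ^= gf_mul(gf_pow(x, k), table[x])
    return acc


def degree_at_most(table, d):
    """Criterion of Corollary 2, with k restricted to 0..2^N - 1 (assumption of this sketch)."""
    return all(power_sum(table, k) == 0 for k in range(Q) if w2(k) <= N - d - 1)


cube = [gf_pow(x, 3) for x in range(Q)]         # x^3, exponent of 2-weight 2
inverse = [gf_pow(x, Q - 2) for x in range(Q)]  # x^14, exponent of 2-weight 3

print(degree_at_most(cube, 2), degree_at_most(cube, 1))
print(degree_at_most(inverse, 3), degree_at_most(inverse, 2))
```

The outcomes agree with Proposition 6: $x^3$ has algebraic degree 2 and $x^{14}$ has algebraic degree 3.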

2.2.3 Bivariate representation of functions with an even number of input bits


The bivariate representation of n-variable Boolean functions f and of (n, m)-functions F
where n is even and $m = \frac{n}{2}$ is as follows: we identify $\mathbb{F}_2^n$ with $\mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$ and we consider then
the input to F as an ordered pair (x, y) of elements of $\mathbb{F}_{2^m}$. There exists a unique bivariate
polynomial $\sum_{0 \leq i,j \leq 2^m-1} a_{i,j}\, x^i y^j$ over $\mathbb{F}_{2^m}$ such that the given function is the bivariate
polynomial function over $\mathbb{F}_{2^m}$ associated to it. Then the algebraic degree of the function
equals $\max_{(i,j)\ |\ a_{i,j} \neq 0} (w_2(i) + w_2(j))$, and in the case of a Boolean function, the bivariate
representation can be written in the form $f(x, y) = \mathrm{tr}_m(P(x, y))$, where P(x, y) is some
polynomial in two variables over $\mathbb{F}_{2^m}$. This latter absolute trace representation is not unique.
A unique representation uses relative traces; see [245, section 2.4.2].

Moving from bivariate to univariate representation and vice versa


Any bivariate Boolean or vectorial function F(x, y) over $\mathbb{F}_{2^{n/2}}$ and valued in $\mathbb{F}_{2^{n/2}}$ can be
represented as a function of $X \in \mathbb{F}_{2^n}$, which we can denote by F(X) by abuse of notation, by
posing $x = \mathrm{tr}_{n/2}^n(aX) = aX + (aX)^{2^{n/2}}$ and $y = \mathrm{tr}_{n/2}^n(bX) = bX + (bX)^{2^{n/2}}$ for some
$\mathbb{F}_{2^{n/2}}$-linearly independent elements $a, b \in \mathbb{F}_{2^n}$ (constituting a basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^{n/2}}$; choosing
another basis would result in a linearly equivalent function). The obtained expression can be
expressed by means of $\mathrm{tr}_n$ by using that, for every $\lambda \in \mathbb{F}_{2^{n/2}}$, we have $\mathrm{tr}_{n/2}(\lambda) = \mathrm{tr}_n(a\lambda)$
where $\mathrm{tr}_{n/2}^n(a) = a + a^{2^{n/2}} = 1$. Conversely, given a Boolean or vectorial function F(X)
over $\mathbb{F}_{2^n}$ valued in $\mathbb{F}_{2^{n/2}}$ in univariate representation and a basis (u, v) of $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^{n/2}}$, we
get its bivariate representation by decomposing X over this basis into X = ux + vy. The
obtained expression can be expressed by means of $\mathrm{tr}_{n/2}$ by using that, for every $u \in \mathbb{F}_{2^n}$,
we have $\mathrm{tr}_n(u) = \mathrm{tr}_{n/2}(\mathrm{tr}_{n/2}^n(u)) = \mathrm{tr}_{n/2}(u + u^{2^{n/2}})$.

2.2.4 Representation over the reals (numerical normal form)


This version over R (in fact, over Z, for Boolean and integer-valued functions over Fn2 ) of
the algebraic normal form has proved itself useful for characterizing several cryptographic
criteria [220, 292, 293] (see Chapters 6 and 7). When studied in these papers, it was already
known in other domains of Boolean functions (see e.g., [886, 905]), but rather informally
studied.

Definition 12 [292] We call numerical normal form (NNF) the representation of
pseudo-Boolean functions (i.e., real-valued functions over $\mathbb{F}_2^n$) in the quotient ring
$\mathbb{R}[x_1, \dots, x_n]/(x_1^2 - x_1, \dots, x_n^2 - x_n)$ (or $\mathbb{Z}[x_1, \dots, x_n]/(x_1^2 - x_1, \dots, x_n^2 - x_n)$ for
integer-valued functions).

The existence of this representation for every pseudo-Boolean function can be shown with
the same arguments as for the ANFs of Boolean functions (writing $1 - x_i$ instead of $1 \oplus x_i$).
In the case of a Boolean function, it can also be directly deduced from the existence of the
ANF, since, denoting $x^I = \prod_{i\in I} x_i$, we have the following:

$$f(x) = \bigoplus_{I \subseteq \{1,\dots,n\}} a_I x^I \iff (-1)^{f(x)} = \prod_{I \subseteq \{1,\dots,n\}} (-1)^{a_I x^I}
\iff 1 - 2 f(x) = \prod_{I \subseteq \{1,\dots,n\}} (1 - 2\, a_I x^I) \qquad (2.21)$$

and expanding (2.21) gives the NNF of f(x).


The uniqueness of the NNF of any pseudo-Boolean function is deduced from its existence
by the usual argument: the linear mapping from every element of the $2^n$-dimensional $\mathbb{R}$-vector
space $\mathbb{R}[x_1, \dots, x_n]/(x_1^2 - x_1, \dots, x_n^2 - x_n)$ to the corresponding pseudo-Boolean
function on $\mathbb{F}_2^n$ being surjective, it is therefore one to one (the $\mathbb{R}$-vector space of pseudo-Boolean
functions on $\mathbb{F}_2^n$ having also dimension $2^n$).

Remark. The NNF does not contain properly speaking more information on a Boolean
function than its ANF, since both are unique representations and contain then full informa-
tion on the function. But the NNF contains more exploitable information in the sense that
the coefficients of the ANF contain individually little information on the function, while we
shall see that those of the NNF contain more.

Definition 13 [292] We call the degree of the NNF of a Boolean or pseudo-Boolean
function f its numerical degree and denote it by $d_{num}(f)$.

Since the ANF of a Boolean function is the mod 2 version of its NNF, the numerical
degree is always bounded below by the algebraic degree.
It is shown in [905] that, if a Boolean function f has no ineffective variable (i.e., if it
actually depends on each of its variables), then the numerical degree of f is larger than or
equal to log2 n − O(log2 log2 n) (we shall give a proof of this bound – in fact, of a slightly
more precise and stronger bound – in Proposition 15, page 67).
The numerical degree is permutation invariant but is not affine invariant. Nevertheless, the
NNF leads to an affine invariant (see a proof of this fact in [293]) which is more discriminant
than the algebraic degree:

Definition 14 [293] Let f be a Boolean function on $\mathbb{F}_2^n$. We call the generalized degree
of f the sequence $(d_i)_{i \geq 1}$ defined as follows:

For every $i \geq 1$, $d_i$ is the smallest integer $d > d_{i-1}$ (if i > 1) such that, for every
multiindex I of a size strictly larger than d, the coefficient $\lambda_I$ of $x^I$ in the NNF of f is a
multiple of $2^i$.

Example The generalized degree of any nonzero affine function is the sequence of all
positive integers.

Similarly to the case of the ANF, a (pseudo-) Boolean function $f(x) = \sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$
takes the following value:

$$f(x) = \sum_{I \subseteq supp(x)} \lambda_I. \qquad (2.22)$$

But, contrary to what we observed for the ANF, the reverse formula is not identical to the
direct formula:

Proposition 8 [292] Let f be a pseudo-Boolean function on $\mathbb{F}_2^n$ and let its NNF be
$\sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$. Then:

$$\forall I \subseteq \{1, \dots, n\}, \quad \lambda_I = (-1)^{|I|} \sum_{x\in\mathbb{F}_2^n;\ supp(x) \subseteq I} (-1)^{w_H(x)} f(x). \qquad (2.23)$$

In other words, function f and its NNF are related through the Möbius transform and its
inverse (for which there exist algorithms similar to the fast binary Möbius transform).

Proof Let us denote the number $(-1)^{|I|} \sum_{x\in\mathbb{F}_2^n;\ supp(x) \subseteq I} (-1)^{w_H(x)} f(x)$ by $\mu_I$ and consider
the function $g(x) = \sum_{I \subseteq \{1,\dots,n\}} \mu_I x^I$. We have

$$g(x) = \sum_{I \subseteq supp(x)} \mu_I = \sum_{I \subseteq supp(x)} \left( (-1)^{|I|} \sum_{y\in\mathbb{F}_2^n;\ supp(y) \subseteq I} (-1)^{w_H(y)} f(y) \right)$$

and thus

$$g(x) = \sum_{y\in\mathbb{F}_2^n} (-1)^{w_H(y)} f(y) \left( \sum_{I \subseteq \{1,\dots,n\};\ supp(y) \subseteq I \subseteq supp(x)} (-1)^{|I|} \right).$$

The sum $\sum_{I \subseteq \{1,\dots,n\};\ supp(y) \subseteq I \subseteq supp(x)} (-1)^{|I|}$ is null if $supp(y) \not\subseteq supp(x)$. It is also null if
supp(y) is included in supp(x), but different. Indeed, denoting $|I| - w_H(y)$ by i, it equals
$\pm \sum_{i=0}^{w_H(x)-w_H(y)} \binom{w_H(x)-w_H(y)}{i} (-1)^i = \pm(1 - 1)^{w_H(x)-w_H(y)} = 0$. Hence, g = f, and, by
uniqueness of the NNF, we have $\mu_I = \lambda_I$ for every I.
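Relation (2.23) and its fast counterpart can be compared on a small example. The sketch below computes the NNF coefficients of the parity function $x_1 \oplus x_2 \oplus x_3$ in two ways: with a butterfly over the integers (the analogue of the fast binary Möbius transform, with subtraction in place of XOR) and directly from (2.23). Indexing the multiindex I and the vector x by integers (bit i set if and only if i + 1 belongs to the support) is an assumption of this illustration.

```python
def nnf_fast(values):
    """Integer Möbius transform: table of values f(x) -> NNF coefficients lambda_I."""
    lam = list(values)
    step = 1
    while step < len(lam):
        for i in range(len(lam)):
            if i & step:
                lam[i] -= lam[i ^ step]
        step <<= 1
    return lam


def nnf_direct(values):
    """Relation (2.23): lambda_I = (-1)^|I| * sum over supp(x) in I of (-1)^wH(x) f(x)."""
    size = len(values)
    lam = []
    for I in range(size):
        s = sum(
            (-1) ** bin(x).count("1") * values[x]
            for x in range(size)
            if (x & ~I) == 0          # supp(x) is included in I
        )
        lam.append((-1) ** bin(I).count("1") * s)
    return lam


def evaluate(lam):
    """Relation (2.22): f(x) = sum of lambda_I over I included in supp(x)."""
    size = len(lam)
    return [sum(lam[I] for I in range(size) if (I & ~x) == 0) for x in range(size)]


parity = [bin(x).count("1") & 1 for x in range(8)]   # f = x1 + x2 + x3 (mod 2)
lam = nnf_fast(parity)
anf = [c % 2 for c in lam]                           # reducing the NNF mod 2 gives the ANF
numerical_degree = max(bin(I).count("1") for I, c in enumerate(lam) if c)
print(lam, anf, numerical_degree)
```

For this function the numerical degree is 3 while the algebraic degree is 1, and the coefficients are the values $(-2)^{|I|-1}$ for nonempty I.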

Remark. According to Relation (2.4), page 33, the coefficient of x I in the ANF of a
Boolean function f is equal to zero if and only if supp(f )∩{x ∈ Fn2 ; supp(x) ⊆ I } has even
size. According to Relation (2.23), the coefficient of x I in the NNF of a Boolean function f
is equal to zero if and only if supp(f ) ∩ {x ∈ Fn2 ; supp(x) ⊆ I } ∩ {x ∈ Fn2 ; wH (x) even}
has same size as supp(f ) ∩ {x ∈ Fn2 ; supp(x) ⊆ I } ∩ {x ∈ Fn2 ; wH (x) odd}.
Remark. Denoting function $\bigoplus_{i=1}^n x_i$ by $\ell(x)$ and taking $I \neq \emptyset$, Relation (2.23) can be
interpreted as $\lambda_I = (-1)^{|I|} \sum_{x\in\mathbb{F}_2^n;\ supp(x) \subseteq I} \left( \frac{(-1)^{\ell(x)}}{2} - \frac{(-1)^{f(x)\oplus\ell(x)}}{2} \right)$, and, since I is not
empty, $\ell$ is linear and nonconstant over the vector space $E_I = \{x \in \mathbb{F}_2^n;\ supp(x) \subseteq I\}$, and
we have $\sum_{x\in\mathbb{F}_2^n;\ supp(x) \subseteq I} \frac{(-1)^{\ell(x)}}{2} = 0$. After replacing $(-1)^{f(x)\oplus\ell(x)}$ by $1 - 2(f \oplus \ell)(x)$,
this gives

$$\lambda_I = (-1)^{|I|} \left( w_H\big((f \oplus \ell)_{|E_I}\big) - 2^{|I|-1} \right),$$

where $(f \oplus \ell)_{|E_I}$ is the restriction of the Boolean function $f \oplus \ell$ to $E_I$. Applying this to
the function $f \oplus \ell$ instead of f, we can see that the coefficients in the NNF of $f \oplus \ell$ give
the Hamming weights of the restrictions of f to all vector subspaces of $\mathbb{F}_2^n$ of the form
$\{x \in \mathbb{F}_2^n;\ supp(x) \subseteq I\}$.

We have seen that the ANF $f(x) = \bigoplus_{I \subseteq \{1,\dots,n\}} a_I x^I$ of any Boolean function can
be deduced from its NNF $f(x) = \sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$ by reducing it modulo 2, and that,
conversely, the NNF can be deduced from the ANF. The formula is obtained by expanding
(2.21) (and has been first obtained in [292] by a slightly more complex way):

$$\lambda_I = \sum_{k=1}^{2^n} (-2)^{k-1} \sum_{\{I_1,\dots,I_k\}\ |\ I_1 \cup \dots \cup I_k = I} a_{I_1} \cdots a_{I_k}, \qquad (2.24)$$

where "$\{I_1, \dots, I_k\}\ |\ I_1 \cup \dots \cup I_k = I$" means that the multiindices $I_1, \dots, I_k$ are all distinct,
in indefinite order, and that their union equals I.
For instance, for the Boolean function $f(x) = \bigoplus_{i=1}^n x_i$, we have $\lambda_I = (-2)^{|I|-1}$ for every nonempty I. This,
applied to $f_i$ in the place of $x_i$, implies that, for every Boolean functions $f_1, \dots, f_k$, we have
the following:

$$\bigoplus_{i=1}^k f_i = \sum_{\emptyset \neq I \subseteq \{1,\dots,k\}} (-2)^{|I|-1} \prod_{i\in I} f_i. \qquad (2.25)$$

Applying then Relation (2.25) to each $J \subseteq \{1, \dots, k\}$ instead of $\{1, \dots, k\}$ provides the
system of the relations $\bigoplus_{i\in J} f_i = \sum_{\emptyset \neq I \subseteq J} (-2)^{|I|-1} \prod_{i\in I} f_i$ which can be inverted and
gives the expression of the product of the $f_i$'s by means of their linear combinations
over $\mathbb{R}$:

$$\prod_{i=1}^l f_i = \frac{1}{2^{l-1}} \sum_{\emptyset \neq J \subseteq \{1,\dots,l\}} (-1)^{|J|-1} \bigoplus_{i\in J} f_i. \qquad (2.26)$$

Indeed, $\sum_{J;\ I \subseteq J \subseteq \{1,\dots,l\}} (-1)^{|J|-1}$ equals $(-1)^{l-1}$ if $I = \{1, \dots, l\}$ and is null otherwise, and
this shows that the matrices of the two systems of relations are inverses of each other.
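Since Relations (2.25) and (2.26) are pointwise identities between the values of the $f_i$, they can be checked by substituting every possible tuple of bits. The following sketch does so for three functions; the helper names are hypothetical, and (2.26) is tested in the integer form "sum $= 2^{l-1} \times$ product" to avoid fractions.

```python
from itertools import combinations, product
from math import prod


def xor_all(bits):
    """XOR (sum mod 2) of a sequence of bits."""
    r = 0
    for b in bits:
        r ^= b
    return r


def rhs_225(bits):
    """Right-hand side of (2.25): sum over nonempty I of (-2)^(|I|-1) * product of the bits in I."""
    k = len(bits)
    return sum(
        (-2) ** (r - 1) * prod(bits[i] for i in I)
        for r in range(1, k + 1)
        for I in combinations(range(k), r)
    )


def signed_xor_sum(bits):
    """Sum over nonempty J of (-1)^(|J|-1) * XOR of the bits in J (numerator of (2.26))."""
    l = len(bits)
    return sum(
        (-1) ** (r - 1) * xor_all([bits[i] for i in J])
        for r in range(1, l + 1)
        for J in combinations(range(l), r)
    )


k = 3
relation_225 = all(xor_all(bits) == rhs_225(bits) for bits in product((0, 1), repeat=k))
relation_226 = all(
    signed_xor_sum(bits) == 2 ** (k - 1) * prod(bits) for bits in product((0, 1), repeat=k)
)
print(relation_225, relation_226)
```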
A polynomial $P(x) = \sum_{J \subseteq \{1,\dots,n\}} \lambda_J x^J$, with real coefficients, is the NNF of some
Boolean function if and only if we have $P^2(x) = P(x)$, for every $x \in \mathbb{F}_2^n$ (which is
equivalent to $P = P^2$ in $\mathbb{R}[x_1, \dots, x_n]/(x_1^2 - x_1, \dots, x_n^2 - x_n)$), or equivalently, denoting
supp(x) by I:

$$\forall I \subseteq \{1, \dots, n\}, \quad \left( \sum_{J \subseteq I} \lambda_J \right)^2 = \sum_{J \subseteq I} \lambda_J. \qquad (2.27)$$

Remark. Imagine that we want to generate a random Boolean function through its NNF
(this can be useful, since we will see below that the main cryptographic criteria, on Boolean
functions, can be characterized, in simple ways, through their NNFs). Assume that we have
already chosen the values $\lambda_J$ for every $J \subseteq I$ (where $I \subseteq \{1, \dots, n\}$ is some multiindex)
except for I itself. Let us denote the sum $\sum_{J \subseteq I\ |\ J \neq I} \lambda_J$ by $\mu$. Relation (2.27) gives
$(\lambda_I + \mu)^2 = \lambda_I + \mu$. This equation of degree 2 has two solutions. One solution corresponds
to the choice P(x) = 0 (where I = supp(x)) and the other one corresponds to the choice
P(x) = 1.

Thus, verifying that a polynomial $P(x) = \sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$ with real coefficients represents
a Boolean function can be done by checking $2^n$ relations. But it can also be done by
verifying a simple condition on P and checking a single equation.

Proposition 9 [293] Any polynomial $P \in \mathbb{R}[x_1, \dots, x_n]/(x_1^2 - x_1, \dots, x_n^2 - x_n)$ is the
NNF of an integer-valued function if and only if all of its coefficients are integers. Assuming
that this condition is satisfied, then P is the NNF of a Boolean function if and only if:
$\sum_{x\in\mathbb{F}_2^n} P^2(x) = \sum_{x\in\mathbb{F}_2^n} P(x)$.

Proof The first assertion is a direct consequence of Relations (2.22) and (2.23). If all the
coefficients of P are integers, then we have $P^2(x) \geq P(x)$ for every x; this implies that the
$2^n$ equalities (one for each x), expressing that the corresponding function is Boolean, can be
reduced to the single one $\sum_{x\in\mathbb{F}_2^n} P^2(x) = \sum_{x\in\mathbb{F}_2^n} P(x)$.

According to Relation (2.27), the translation of this characterization in terms of the
coefficients $\lambda_I$ of P(x) is as follows:

$$\sum_{I \subseteq \{1,\dots,n\}} 2^{n-|I|} \sum_{J, J' \subseteq \{1,\dots,n\};\ I = J \cup J'} \lambda_J \lambda_{J'} = \sum_{I \subseteq \{1,\dots,n\}} 2^{n-|I|} \lambda_I, \qquad (2.28)$$

since the number of those $x \in \mathbb{F}_2^n$ such that $I \subseteq supp(x)$, equals $2^{n-|I|}$.
More results related to the NNF can be found in [292] and [293].
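The test of Proposition 9 is short to implement: check that the coefficients are integers, evaluate P at every point through Relation (2.22), and compare the two sums. The sketch below is an illustration under the same integer indexing of multiindices as assumed earlier; the function names are hypothetical.

```python
def evaluate_nnf(lam):
    """Relation (2.22) computed with a butterfly: coefficients lambda_I -> values P(x)."""
    vals = list(lam)
    step = 1
    while step < len(vals):
        for i in range(len(vals)):
            if i & step:
                vals[i] += vals[i ^ step]
        step <<= 1
    return vals


def is_boolean_nnf(lam):
    """Proposition 9: integer coefficients and sum of P^2(x) equal to sum of P(x)."""
    if any(c != int(c) for c in lam):
        return False
    vals = evaluate_nnf(lam)
    return sum(v * v for v in vals) == sum(vals)


xor_nnf = [0, 1, 1, -2]     # x1 + x2 - 2 x1 x2, the NNF of x1 XOR x2
plain_sum = [0, 1, 1, 0]    # x1 + x2, which takes the value 2 at (1, 1)

print(is_boolean_nnf(xor_nnf), is_boolean_nnf(plain_sum))
```

The single equation suffices because, for integer values, $P^2(x) \geq P(x)$ at every point, so the two sums are equal only if equality holds pointwise.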

Case of vectorial functions


An extension of the NNF to (n, m)-functions is given in [484], but it seems simpler to
consider the NNF of the indicator $1_{\mathcal{G}_F}$ of the graph $\mathcal{G}_F = \{(x, F(x));\ x \in \mathbb{F}_2^n\}$. We obtain
a (unique) characterization of the following form:

$$\forall x \in \mathbb{F}_2^n,\ \forall y \in \mathbb{F}_2^m, \quad (y = F(x)) \iff \left( \sum_{\substack{I \subseteq \{1,\dots,n\} \\ J \subseteq \{1,\dots,m\}}} \lambda_{I,J}\, x^I y^J = 1 \right).$$

Note that, if we have the NNF of each coordinate function $f_j$ of F, for j = 1, ..., m, then
the NNF of $1_{\mathcal{G}_F}$ can be deduced from the following:

$$1_{\mathcal{G}_F}(x, y) = \prod_{j=1}^m \left( 1 - (f_j(x) - y_j)^2 \right)
= \prod_{j=1}^m \left( 1 - f_j(x) + y_j (2 f_j(x) - 1) \right)
= \sum_{J \subseteq \{1,\dots,m\}} \left( \prod_{j \in \{1,\dots,m\} \setminus J} (1 - f_j(x)) \prod_{j\in J} (2 f_j(x) - 1) \right) y^J.$$

Note that in the case of a Boolean function f (i.e., in the case of m = 1), we have then
$1_{\mathcal{G}_F}(x, y) = 1 - f(x) + (2 f(x) - 1)\, y$, for $x \in \mathbb{F}_2^n$ and $y \in \mathbb{F}_2$.

Remark. As we can see, some representations of Boolean functions (resp. of vectorial
functions) such as the ANF are such that any object having the form of an ANF is the ANF
of some function. Some others such as the NNF do not have this property. The Fourier–Hadamard
and Walsh transforms that we shall see below also provide representations of
Boolean and vectorial functions, which are of the latter kind. Some other representations
also exist; see, e.g., [484], where their relationships are studied as well as their behavior
with respect to composition, and their eigenanalysis in relation with graphs (see page 70), in
the case of representations by square matrices.

2.3 The Fourier–Hadamard transform and the Walsh transform


2.3.1 Fourier–Hadamard transform of pseudo-Boolean functions
Almost all the characteristics needed for Boolean functions in cryptography and for sets
of Boolean functions in coding can be expressed by means of the weights of two kinds of
related Boolean functions: $f \oplus \ell$ where $\ell$ is linear,$^{19}$ and $D_a f(x) = f(x) \oplus f(x + a)$
(the derivatives of f). In this framework, the Fourier–Hadamard transform is an efficient
tool: for a given Boolean function f, the Fourier–Hadamard transform of f provides the
knowledge of the weights of all the functions $f \oplus \ell$, where $\ell$ is a linear (or an affine) form,
and the weights of the derivatives $D_a f$ are also directly related to the Fourier–Hadamard
transform.

19 As far as we know, and as reported in [555, 1110], the weights of these functions have been originally
considered by S. Golomb [549] to define what he called invariants: given a positive integer t ≤ n, the tth
invariant defined by Golomb is the unordered set of values max(wH (f (x) ⊕ u · x), wH (f (x) ⊕ u · x ⊕ 1)),
where a ranges over Fn2 .

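Definition 15 The Fourier–Hadamard transform$^{20}$ is the $\mathbb{R}$-linear mapping that maps any
pseudo-Boolean function $\varphi$ on $\mathbb{F}_2^n$ to the function $\widehat{\varphi}$ defined on $\mathbb{F}_2^n$ by

$$\widehat{\varphi}(u) = \sum_{x\in\mathbb{F}_2^n} \varphi(x)\, (-1)^{u \cdot x}, \qquad (2.29)$$

where "·" is some chosen inner product in $\mathbb{F}_2^n$. We call the Fourier–Hadamard spectrum of
$\varphi$ the multiset of all the values $\widehat{\varphi}(u)$, where $u \in \mathbb{F}_2^n$ and Fourier–Hadamard support of $\varphi$ the
set of those u such that $\widehat{\varphi}(u) \neq 0$.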

Remark. The most used inner product in $\mathbb{F}_2^n$ is the usual inner product $u \cdot x = u_1 x_1 \oplus \cdots \oplus u_n x_n$.
If $\mathbb{F}_2^n$ is identified with the finite field $\mathbb{F}_{2^n}$, then $u \cdot x = \mathrm{tr}_n(ux)$; $u, x \in \mathbb{F}_{2^n}$, is
better used; and if $n$ is even, say $n = 2m$, and $\mathbb{F}_2^n$ is identified with $\mathbb{F}_{2^m}^2$, then it is
$(u_1, u_2) \cdot (x_1, x_2) = \mathrm{tr}_m(u_1 x_1 + u_2 x_2)$; $u_1, u_2, x_1, x_2 \in \mathbb{F}_{2^m}$. In all cases, the Walsh functions
$x \mapsto (-1)^{u \cdot x}$ constitute an orthogonal basis of the vector space $\mathbb{R}^{\mathbb{F}_2^n}$ over $\mathbb{R}$, according to
properties we shall see at page 58.

Recall that every linear form over $\mathbb{F}_2^n$ equals $\ell_u : x \mapsto u \cdot x$ for some unique $u$ in $\mathbb{F}_2^n$. If $\varphi$
is a Boolean function (viewed as an integer-valued function), then $\widehat{\varphi}(0_n)$ equals $w_H(\varphi)$ and,
for $u \neq 0_n$,

\[ \widehat{\varphi}(u) = \sum_{x \in \mathbb{F}_2^n} \varphi(x)\,(1 - 2\, u \cdot x) = w_H(\varphi) - 2\, w_H(\varphi\, \ell_u) = w_H(\varphi \oplus \ell_u) - w_H(\ell_u) = w_H(\varphi \oplus \ell_u) - 2^{n-1}. \]

This proves what we asserted above. And we shall show a
relation between $w_H(D_a f)$ and the Fourier–Hadamard transform.
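The identity above can be checked numerically. The following is a minimal sketch, not the book's code: it evaluates (2.29) directly for the usual inner product, and the test function $x_1 x_2 \oplus x_3$ as well as the helper names (`dot`, `fourier_hadamard`, `weight`) are assumptions chosen for illustration.

```python
from itertools import product

def dot(u, x):
    """Usual inner product on F_2^n, vectors given as tuples of 0/1."""
    return sum(ui & xi for ui, xi in zip(u, x)) % 2

def fourier_hadamard(phi, n):
    """Direct evaluation of (2.29); phi maps each n-bit tuple to a number."""
    points = list(product((0, 1), repeat=n))
    return {u: sum(phi[x] * (-1) ** dot(u, x) for x in points) for u in points}

def weight(g):
    """Hamming weight of a Boolean function stored as a dict of 0/1 values."""
    return sum(g.values())

n = 3
points = list(product((0, 1), repeat=n))
# Illustrative Boolean function (an assumption of this sketch): x1 x2 XOR x3.
f = {x: (x[0] & x[1]) ^ x[2] for x in points}
f_hat = fourier_hadamard(f, n)

zero = (0,) * n
checks = [f_hat[zero] == weight(f)]
for u in points:
    if u != zero:
        f_plus_lu = {x: f[x] ^ dot(u, x) for x in points}
        # Value of the transform at a nonzero u: weight of f XOR l_u, minus 2^(n-1).
        checks.append(f_hat[u] == weight(f_plus_lu) - 2 ** (n - 1))
```

The direct evaluation costs about $2^{2n}$ operations, which is why the fast algorithm described next is preferred in practice.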

Algorithm (Fast Fourier–Hadamard transform)


There exists a simple divide-and-conquer butterfly algorithm to compute $\widehat{\varphi}$, called the fast
Fourier–Hadamard transform (FFT). Let us give it in the case where “·” is the usual inner
product. For every $a = (a_1, \dots, a_{n-1}) \in \mathbb{F}_2^{n-1}$ and every $a_n \in \mathbb{F}_2$, the number $\widehat{\varphi}(a_1, \dots, a_n)$
equals

\[ \sum_{x = (x_1, \dots, x_{n-1}) \in \mathbb{F}_2^{n-1}} (-1)^{a \cdot x} \left[ \varphi(x_1, \dots, x_{n-1}, 0) + (-1)^{a_n} \varphi(x_1, \dots, x_{n-1}, 1) \right]. \]

Hence, if in the tables of values of the functions the vectors are ordered, for instance, in
lexicographic order with the bit of highest weight on the right, the table of $\widehat{\varphi}$ equals the
concatenation of those of the Fourier–Hadamard transforms of the $(n-1)$-variable functions
$\psi_0(x) = \varphi(x_1, \dots, x_{n-1}, 0) + \varphi(x_1, \dots, x_{n-1}, 1)$ and $\psi_1(x) = \varphi(x_1, \dots, x_{n-1}, 0) - \varphi(x_1, \dots, x_{n-1}, 1)$.
We deduce the following algorithm:
1. Write the table of the values of ϕ (its truth table if ϕ is Boolean), in which the
binary vectors of length n are in lexicographic order with the bit of highest weight on
the right.

20 We write “Fourier–Hadamard” because “Fourier” would be ambiguous (and for the reason that the matrix
involved in the transform is the Hadamard matrix [609]; see page 191); even “discrete Fourier” would be
ambiguous; see, e.g., [1110].
x1 x2 x3 | ϕ  | ± | Step 1  | ± | Step 2            | ± | Step 3: $\widehat{\varphi}$
0  0  0  | t0 | + | t0 + t1 | + | t0 + t1 + t2 + t3 | + | t0 + t1 + t2 + t3 + t4 + t5 + t6 + t7
0  0  1  | t1 | − | t0 − t1 | + | t0 − t1 + t2 − t3 | + | t0 − t1 + t2 − t3 + t4 − t5 + t6 − t7
0  1  0  | t2 | + | t2 + t3 | − | t0 + t1 − t2 − t3 | + | t0 + t1 − t2 − t3 + t4 + t5 − t6 − t7
0  1  1  | t3 | − | t2 − t3 | − | t0 − t1 − t2 + t3 | + | t0 − t1 − t2 + t3 + t4 − t5 − t6 + t7
1  0  0  | t4 | + | t4 + t5 | + | t4 + t5 + t6 + t7 | − | t0 + t1 + t2 + t3 − t4 − t5 − t6 − t7
1  0  1  | t5 | − | t4 − t5 | + | t4 − t5 + t6 − t7 | − | t0 − t1 + t2 − t3 − t4 + t5 − t6 + t7
1  1  0  | t6 | + | t6 + t7 | − | t4 + t5 − t6 − t7 | − | t0 + t1 − t2 − t3 − t4 − t5 + t6 + t7
1  1  1  | t7 | − | t6 − t7 | − | t4 − t5 − t6 + t7 | − | t0 − t1 − t2 + t3 − t4 + t5 + t6 − t7

Figure 2.1 Fast Fourier–Hadamard transform.

2. Let $\varphi_0$ be the restriction of $\varphi$ to $\mathbb{F}_2^{n-1} \times \{0\}$ and $\varphi_1$ the restriction of $\varphi$ to $\mathbb{F}_2^{n-1} \times \{1\}$²¹;
replace the values of $\varphi_0$ by those of $\varphi_0 + \varphi_1$ and those of $\varphi_1$ by those of $\varphi_0 - \varphi_1$.
3. Apply step 2 recursively, and separately, to the functions now obtained in the places of $\varphi_0$
and $\varphi_1$.
When the algorithm ends (after arriving at functions in one variable each), the global table
gives the values of $\widehat{\varphi}$. The complexity of this algorithm is $n\,2^n$ additions/subtractions; it
is then in $O(N \log_2 N)$, where $N = 2^n$ is the size of its input $\varphi$.
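Steps 1 to 3 can be sketched recursively as follows. This is an illustrative sketch rather than the book's own code: the function names are hypothetical, and the table is assumed to be indexed by $i = x_1 + 2x_2 + \cdots + 2^{n-1}x_n$, so that its two halves are the restrictions $\varphi_0$ and $\varphi_1$. A direct evaluation of (2.29) is included only for comparison.

```python
def fast_fourier_hadamard(table):
    """Fourier-Hadamard transform of a table of 2^n values (steps 1 to 3).

    Assumed convention: table[i] is phi(x) with i = x1 + 2*x2 + ... + 2^(n-1)*xn,
    so the lower half of the table is phi_0 and the upper half is phi_1.
    """
    size = len(table)
    if size == 1:
        return list(table)
    half = size // 2
    phi0, phi1 = table[:half], table[half:]
    sums = [a + b for a, b in zip(phi0, phi1)]    # phi_0 + phi_1
    diffs = [a - b for a, b in zip(phi0, phi1)]   # phi_0 - phi_1
    # Step 3: recurse separately on each half, then concatenate the two tables.
    return fast_fourier_hadamard(sums) + fast_fourier_hadamard(diffs)

def naive_fourier_hadamard(table):
    """Direct evaluation of (2.29) with the usual inner product, for comparison."""
    size = len(table)
    return [
        sum(table[x] * (-1) ** bin(u & x).count("1") for x in range(size))
        for u in range(size)
    ]

# Truth table of x1 XOR x2 (n = 2), viewed as an integer-valued function.
example = fast_fourier_hadamard([0, 1, 1, 0])
```

The recursion allocates new lists at each level for readability; an in-place butterfly avoids these allocations.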
As for the fast binary Möbius transform, taking the lexicographic order with the bit of
highest weight on the left (i.e., the standard lexicographic order) works as well because,
for every permutation $\sigma$ of $\{1, \dots, n\}$, we have $u \cdot x = \sigma(u) \cdot \sigma(x)$ for every $u, x$,
and this implies that

\[ \widehat{\varphi \circ \sigma}(u) = \sum_{x \in \mathbb{F}_2^n} \varphi(\sigma(x))\,(-1)^{u \cdot x} = \sum_{x \in \mathbb{F}_2^n} \varphi(x)\,(-1)^{u \cdot \sigma^{-1}(x)} = \sum_{x \in \mathbb{F}_2^n} \varphi(x)\,(-1)^{\sigma(u) \cdot x} = \widehat{\varphi} \circ \sigma(u), \]

and the final values are the same (but not the intermediate ones).

Remark. Here again, the algorithm may not work if the order on $\mathbb{F}_2^n$ is not a coordinatewise
permuted version of lexicographic order (for instance, if it is an order by increasing
Hamming weights of inputs).

Figure 2.1 illustrates how this algorithm works (with a display of the rows in a different
order, better adapted to apprehend the figure).

2.3.2 Fourier–Hadamard and Walsh transforms of Boolean functions


For a given Boolean function $f$, the Fourier–Hadamard transform can be applied to $f$ itself,
viewed as a function valued in $\{0, 1\} \subset \mathbb{Z}$ (we denote then by $\widehat{f}$ the corresponding Fourier–Hadamard
transform of $f$). Notice that $\widehat{f}(0_n)$ equals the Hamming weight of $f$. Thus, the
Hamming distance $d_H(f, g) = |\{x \in \mathbb{F}_2^n;\ f(x) \neq g(x)\}| = w_H(f \oplus g)$ between two
functions $f$ and $g$ equals $\widehat{f \oplus g}(0_n)$.
Note that, by linearity of the Fourier–Hadamard transform, Relations (2.25), page 50, and
(2.26) imply:

\[ \widehat{\bigoplus_{i=1}^{k} f_i} = \sum_{\emptyset \neq I \subseteq \{1, \dots, k\}} (-2)^{|I|-1}\, \widehat{\prod_{i \in I} f_i}, \tag{2.30} \]

21 The table of values of ϕ0 (resp. ϕ1 ) corresponds to the upper (resp. lower) half of the table of ϕ.
\[ \widehat{\prod_{i=1}^{l} f_i} = \frac{1}{2^{l-1}} \sum_{\emptyset \neq J \subseteq \{1, \dots, l\}} (-1)^{|J|-1}\, \widehat{\bigoplus_{i \in J} f_i}. \tag{2.31} \]

The Fourier–Hadamard transform can also be applied to the pseudo-Boolean function
$f_\chi(x) = (-1)^{f(x)}$ (often called the sign function²² of $f$) instead of $f$ itself.

Definition 16 We call the Walsh transform²³ of a Boolean function $f$ the Fourier–Hadamard
transform of the sign function $f_\chi$, and we denote it²⁴ by $W_f$:

\[ W_f(u) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus u \cdot x}. \]

We call the Walsh spectrum of $f$ the multiset of all the values $W_f(u)$, where $u \in \mathbb{F}_2^n$. We
call the extended Walsh spectrum²⁵ of $f$ the multiset of their absolute values, and the Walsh
support of $f$ the set of those $u$ such that $W_f(u) \neq 0$.

We give in Table 2.2 an example of the computation of the Walsh transform, when the
inner product chosen in $\mathbb{F}_2^n$ is the usual inner product, using the algorithm of the fast Fourier–Hadamard
transform.²⁶
Notice that $f_\chi$ being equal to $1 - 2f$, we have

\[ W_f = 2^n \delta_0 - 2\widehat{f}, \tag{2.32} \]

where $\delta_0$ denotes the Dirac (or Kronecker) symbol, i.e., the indicator of the singleton $\{0_n\}$,
defined by $\delta_0(u) = 1$ if $u$ is the null vector and $\delta_0(u) = 0$ otherwise; see Proposition 10 for
a proof of the relation $\widehat{1} = 2^n \delta_0$. Relations (2.30) and (2.31) give then the following:

\[ W_{\bigoplus_{i=1}^{k} f_i}(a) = 2^{n-1}\left(1 + (-1)^k\right)\delta_0(a) + \sum_{\emptyset \neq I \subseteq \{1, \dots, k\}} (-2)^{|I|-1}\, W_{\prod_{i \in I} f_i}(a), \tag{2.33} \]

and

\[ W_{\prod_{i=1}^{l} f_i}(a) = \left(2^n - 2^{n-l+1}\right)\delta_0(a) + \frac{1}{2^{l-1}} \sum_{\emptyset \neq J \subseteq \{1, \dots, l\}} (-1)^{|J|-1}\, W_{\bigoplus_{i \in J} f_i}(a), \tag{2.34} \]

since we have $1 - \sum_{\emptyset \neq I \subseteq \{1, \dots, k\}} (-2)^{|I|-1} = 1 - \frac{(1-2)^k - 1}{-2} = \frac{1 + (-1)^k}{2}$ and
$1 - \frac{1}{2^{l-1}} \sum_{\emptyset \neq J \subseteq \{1, \dots, l\}} (-1)^{|J|-1} = 1 + \frac{1}{2^{l-1}}\left((1-1)^l - 1\right) = 1 - \frac{1}{2^{l-1}}$.

22 The symbol χ is used here because the sign function is the image of f by the nontrivial character over F2
(usually denoted by χ ).
23 Some authors specify “Walsh–Hadamard transform” like in signal processing, but most do not, since the risk
of ambiguity is weaker than for the Fourier transform; note that a few authors use “Walsh” or
“Hadamard–Walsh” for what we call “Fourier–Hadamard”; we shall use the term of “Walsh” only when
dealing with the sign function.
24 This notation is now widely used; a few years ago, diverse notations were used.
25 “Extended” is in the sense of “extended by the addition of constant Boolean functions to f ,” since knowing
|Wf (u)| is equivalent to knowing the unordered pair {Wf (u), Wf ⊕1 (u)}, because Wf ⊕1 and Wf take opposite
values; we shall sometimes call the extended Walsh transform of f the function |Wf |.
26 The truth table of the function is first directly calculated. We could also have applied the fast binary Möbius
transform to obtain it; this has been done in Table 2.1 for the same function.

Algorithm 2: Computing the Walsh–Hadamard transform.

Data: tt ← truth table, n ← number of variables
Result: wt ← Walsh–Hadamard spectrum
for i = 0 to 2^n − 1 do
    wt[i] = (−1)^tt[i];
end
for i = 1 to n do
    for r = 0 to 2^n − 1 by 2^i do
        t1 = r;
        t2 = r + 2^(i−1);
        for j = 0 to 2^(i−1) − 1 do
            a = wt[t1];
            b = wt[t2];
            wt[t1] = a + b;
            wt[t2] = a − b;
            t1 = t1 + 1;
            t2 = t2 + 1;
        end
    end
end
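A possible Python transcription of Algorithm 2 is sketched below, assuming the truth table is a list of $2^n$ bits indexed by $x_1 + 2x_2 + \cdots + 2^{n-1}x_n$ (the "hexa" column of Table 2.2). The function name `walsh_transform` is an assumption of this sketch; the example function is the one of Table 2.2.

```python
def walsh_transform(tt):
    """In-place butterfly of Algorithm 2 on a truth table of length 2^n."""
    size = len(tt)
    n = size.bit_length() - 1
    if size != 2 ** n:
        raise ValueError("truth table length must be a power of two")
    wt = [(-1) ** bit for bit in tt]          # sign function of f
    for i in range(1, n + 1):
        half = 2 ** (i - 1)
        for r in range(0, size, 2 * half):
            for j in range(half):
                t1, t2 = r + j, r + j + half
                a, b = wt[t1], wt[t2]
                wt[t1], wt[t2] = a + b, a - b
    return wt

def f(x1, x2, x3, x4):
    """The function of Table 2.2: x1 x2 x3 XOR x1 x4 XOR x2."""
    return (x1 & x2 & x3) ^ (x1 & x4) ^ x2

# Row index = x1 + 2*x2 + 4*x3 + 8*x4, as in the "hexa" column of Table 2.2.
tt = [f(i % 2, (i // 2) % 2, (i // 4) % 2, (i // 8) % 2) for i in range(16)]
spectrum = walsh_transform(tt)
```

The loop processes the coordinates in the order $x_1, \dots, x_n$, whereas the intermediate columns of Table 2.2 process them in the reverse order; as noted earlier for coordinate permutations, only the intermediate values differ, not the final spectrum.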

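Table 2.2 Truth table and Walsh spectrum of f(x) = x1 x2 x3 ⊕ x1 x4 ⊕ x2.

x1 x2 x3 x4 | hexa | x1x2x3 | x1x4 | f(x) | fχ(x) | step 1 | step 2 | step 3 | Wf(x)
0  0  0  0  |  0   |   0    |  0   |  0   |   1   |    2   |    4   |    0   |   0
1  0  0  0  |  1   |   0    |  0   |  0   |   1   |    0   |    0   |    0   |   0
0  1  0  0  |  2   |   0    |  0   |  1   |  −1   |   −2   |   −4   |    8   |   8
1  1  0  0  |  3   |   0    |  0   |  1   |  −1   |    0   |    0   |    0   |   8
0  0  1  0  |  4   |   0    |  0   |  0   |   1   |    2   |    0   |    0   |   0
1  0  1  0  |  5   |   0    |  0   |  0   |   1   |    0   |    0   |    0   |   0
0  1  1  0  |  6   |   0    |  0   |  1   |  −1   |   −2   |    0   |    0   |   0
1  1  1  0  |  7   |   1    |  0   |  0   |   1   |    0   |    0   |    0   |   0
0  0  0  1  |  8   |   0    |  0   |  0   |   1   |    0   |    0   |    0   |   4
1  0  0  1  |  9   |   0    |  1   |  1   |  −1   |    2   |    4   |    4   |  −4
0  1  0  1  |  a   |   0    |  0   |  1   |  −1   |    0   |    0   |    0   |   4
1  1  0  1  |  b   |   0    |  1   |  0   |   1   |   −2   |    0   |    4   |  −4
0  0  1  1  |  c   |   0    |  0   |  0   |   1   |    0   |    0   |    0   |  −4
1  0  1  1  |  d   |   0    |  1   |  1   |  −1   |    2   |    0   |   −4   |   4
0  1  1  1  |  e   |   0    |  0   |  1   |  −1   |    0   |    0   |    0   |   4
1  1  1  1  |  f   |   1    |  1   |  1   |  −1   |    2   |   −4   |    4   |  −4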

Relation (2.33) has been originally obtained by induction and calculation in [204].
Relation (2.32) gives conversely $\widehat{f} = 2^{n-1}\delta_0 - \frac{W_f}{2}$ and in particular the following:

\[ w_H(f) = 2^{n-1} - \frac{W_f(0_n)}{2}. \tag{2.35} \]

The mapping $f \mapsto W_f(0_n)$ playing an important role, and being applied in the sequel to
various functions deduced from $f$, we shall also use the specific notation

\[ \mathcal{F}(f) = W_f(0_n) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)}. \tag{2.36} \]

Relation (2.35) applied to $f \oplus \ell_a$, where $\ell_a(x) = a \cdot x$, gives the following:

\[ d_H(f, \ell_a) = w_H(f \oplus \ell_a) = 2^{n-1} - \frac{W_f(a)}{2}. \tag{2.37} \]

Remark. The Walsh transform represents the correlation between Boolean functions and
affine functions and is related to attacks on stream ciphers using LFSRs. The best affine
approximations of $f(x)$ are the functions $a \cdot x \oplus \epsilon$, where $|W_f(a)|$ is maximal and $\epsilon$ equals
0 if $W_f(a) > 0$ (since $f(x) \oplus a \cdot x$ has then low Hamming weight), and 1 otherwise.
In [302, 704], the arithmetic Walsh transform of Boolean functions is studied, which is
based on modular arithmetic and is related to feedback with carry shift registers (FCSRs,
having the operation of retroaction made with carry).
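The selection rule for best affine approximations, combined with Relation (2.37), can be sketched as follows. The helper names (`hadamard`, `parity`, `best_affine_approximations`) are hypothetical, vectors are encoded as integers with $x_1$ as the least significant bit, and the example is the function of Table 2.2.

```python
def hadamard(values):
    """Butterfly transform of a list whose length is a power of two."""
    out = list(values)
    half = 1
    while half != len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                out[i], out[i + half] = out[i] + out[i + half], out[i] - out[i + half]
        half *= 2
    return out

def parity(v):
    return bin(v).count("1") % 2

def best_affine_approximations(tt):
    """Return (distance, [(a, eps), ...]) for the closest functions a.x XOR eps."""
    size = len(tt)
    w = hadamard([1 - 2 * bit for bit in tt])      # Walsh transform of f
    peak = max(abs(v) for v in w)
    # eps = 0 when W_f(a) is positive (equal to the peak), eps = 1 otherwise.
    best = [(a, 0 if w[a] == peak else 1) for a in range(size) if abs(w[a]) == peak]
    # By (2.37), the distance to the closest affine function is 2^(n-1) - peak/2.
    return size // 2 - peak // 2, best

# Truth table of x1 x2 x3 XOR x1 x4 XOR x2, indexed by x1 + 2*x2 + 4*x3 + 8*x4.
tt = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]
distance, approximations = best_affine_approximations(tt)
# Recount each reported distance directly from the truth table.
recount = [sum(tt[x] ^ parity(a & x) ^ eps for x in range(16)) for a, eps in approximations]
```

For this function the closest affine functions are $x_2$ and $x_1 \oplus x_2$, both at Hamming distance 4.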

The supports of the Walsh transforms of Boolean functions have been studied in [308],
among which we find all possible affine subspaces of Fn2 and the complements of singletons
(for n ≥ 10).
In [582] is proposed an algorithm, deduced from the formulae relating NNF and Walsh
transform that we shall see in Subsection 2.3.4, page 66, for computing the Walsh transform
(for a small set of points) from the ANF when the FFT is not efficient for computing it
from the truth table (because the number of variables is too large, which happens when n is
significantly larger than 30). For example, it is possible in certain cases to run their algorithm
for 50 to 100 variable functions having a few hundreds of terms in their ANF.
In [373] are given concise representations of Walsh transform by binary decision diagrams
(BDD) for functions with several hundred variables.

2.3.3 Properties of the Fourier–Hadamard and Walsh transforms of Boolean functions
The Fourier–Hadamard transform, as with other Fourier transforms, has very nice and useful
properties. The number of these properties and the richness of their mutual relationship
are impressive. All of these properties are very useful in practice for studying Boolean
functions. We shall often refer to the relations below, by applying them to the Fourier–
Hadamard transforms of pseudo-Boolean functions or to the Walsh transforms of Boolean
functions (which are a particular case). Almost all properties can be deduced from the next
two lemmas and proposition.

Lemma 3 Let $E$ be any vector space over $\mathbb{F}_2$ and $\ell$ any nonzero linear form on $E$. Then
$\sum_{x \in E} (-1)^{\ell(x)}$ is null.

Proof The linear form $\ell$ being nonzero, its support is an affine hyperplane of $E$ and has
$2^{\dim E - 1} = \frac{|E|}{2}$ elements.²⁷ Thus, $\sum_{x \in E} (-1)^{\ell(x)}$ being the sum of 1s and −1s in equal
numbers, it is null.²⁸

Proposition 10 Let $E$ be any vector subspace of $\mathbb{F}_2^n$. Denote by $1_E$ its indicator (i.e., the
Boolean function defined by $1_E(x) = 1$ if $x \in E$ and $1_E(x) = 0$ otherwise). Then:

\[ \widehat{1_E} = |E|\, 1_{E^\perp}, \tag{2.38} \]

where $E^\perp = \{x \in \mathbb{F}_2^n;\ \forall y \in E,\ x \cdot y = 0\}$ is the orthogonal space of $E$ with respect to the
inner product “·”.
In particular, for $E = \mathbb{F}_2^n$, we have $\widehat{1} = 2^n \delta_0$.

Proof For every $u \in \mathbb{F}_2^n$, we have $\widehat{1_E}(u) = \sum_{x \in E} (-1)^{u \cdot x}$. If the linear form $x \in E \mapsto u \cdot x$
is not null on $E$ (i.e., if $u \notin E^\perp$), then $\widehat{1_E}(u)$ is null, according to Lemma 3. And if $u \in E^\perp$,
then $\widehat{1_E}(u)$ clearly equals $|E|$.

This proposition leads to the very important Poisson formula below. To be able to state
this formula in its general form, we need the:

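Lemma 4 For every pseudo-Boolean function $\varphi$ on $\mathbb{F}_2^n$ and all elements $a$, $b$, and $u$
of $\mathbb{F}_2^n$, the value at $u$ of the Fourier–Hadamard transform of the function $x \mapsto (-1)^{a \cdot x} \varphi(x + b)$
equals $(-1)^{b \cdot (a+u)}\, \widehat{\varphi}(a + u)$.

Proof The value at $u$ of the Fourier–Hadamard transform of the function $x \mapsto (-1)^{a \cdot x} \varphi(x + b)$
equals $\sum_{x \in \mathbb{F}_2^n} (-1)^{(a+u) \cdot x} \varphi(x + b) = \sum_{x \in \mathbb{F}_2^n} (-1)^{(a+u) \cdot (x+b)} \varphi(x)$ and thus equals
$(-1)^{b \cdot (a+u)}\, \widehat{\varphi}(a + u)$.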

We deduce from Proposition 10 and Lemma 4 the Poisson summation formula, which has
been used to prove many cryptographic properties in [759], [797], [212] and later in [190,
191], and whose most general statement is:

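Corollary 3 For every pseudo-Boolean function $\varphi$ on $\mathbb{F}_2^n$, for every vector subspace $E$
of $\mathbb{F}_2^n$, and for all elements $a$ and $b$ of $\mathbb{F}_2^n$, we have:

\[ \sum_{u \in a+E} (-1)^{b \cdot u}\, \widehat{\varphi}(u) = |E|\, (-1)^{a \cdot b} \sum_{x \in b+E^\perp} (-1)^{a \cdot x} \varphi(x). \tag{2.39} \]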

27 Another way of seeing this is to choose $a \in E$ such that $\ell(a) = 1$ and observe that the mapping $x \mapsto x + a$ is
a bijection between $\ker \ell$ and its complement.
28 Alternatively, choosing again $a \in E$ such that $\ell(a) = 1$, we have
$\sum_{x \in E} (-1)^{\ell(x)} = \sum_{x \in E} (-1)^{\ell(x+a)} = (-1)^{\ell(a)} \sum_{x \in E} (-1)^{\ell(x)} = -\sum_{x \in E} (-1)^{\ell(x)}$.
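Proof For $a = b = 0_n$, the sum $\sum_{u \in E} \widehat{\varphi}(u)$ equals $\sum_{u \in E} \sum_{x \in \mathbb{F}_2^n} \varphi(x) (-1)^{u \cdot x} = \sum_{x \in \mathbb{F}_2^n} \varphi(x)\, \widehat{1_E}(x)$
by definition. Hence, according to Proposition 10:

\[ \sum_{u \in E} \widehat{\varphi}(u) = |E| \sum_{x \in E^\perp} \varphi(x). \tag{2.40} \]

We apply this equality to the function $x \mapsto (-1)^{a \cdot x} \varphi(x + b)$. Using Lemma 4, we deduce
$\sum_{u \in E} (-1)^{b \cdot (a+u)}\, \widehat{\varphi}(a + u) = |E| \sum_{x \in E^\perp} (-1)^{a \cdot x} \varphi(x + b)$, that is, (2.39).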

Relation (2.39) applied to $\varphi = f_\chi$, the sign function of $f$, gives the following:

\[ \sum_{u \in a+E} (-1)^{b \cdot u}\, W_f(u) = |E|\, (-1)^{a \cdot b} \sum_{x \in b+E^\perp} (-1)^{f(x) \oplus a \cdot x}. \tag{2.41} \]

Note that, according to this latter relation, for every Boolean function $f$, every vector
subspace $E$ of $\mathbb{F}_2^n$, and every $a, b \in \mathbb{F}_2^n$, we have $\left|\sum_{u \in a+E} (-1)^{b \cdot u}\, W_f(u)\right| \leq 2^n$ (with
equality if and only if $f(x) \oplus a \cdot x$ is constant on $b + E^\perp$).
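Relation (2.41) lends itself to a brute-force check on a small example. The sketch below is only an illustration: the subspace, the vectors $a$ and $b$, and the helper names are arbitrary assumptions, vectors are encoded as integers, and $f$ is the function of Table 2.2.

```python
N = 4
POINTS = range(2 ** N)

def dot(u, x):
    """Usual inner product on F_2^N for vectors encoded as integers."""
    return bin(u & x).count("1") % 2

def f(x):
    """x1 x2 x3 XOR x1 x4 XOR x2, with x1 the least significant bit of x."""
    x1, x2, x3, x4 = x % 2, (x // 2) % 2, (x // 4) % 2, (x // 8) % 2
    return (x1 & x2 & x3) ^ (x1 & x4) ^ x2

def walsh(u):
    return sum((-1) ** (f(x) ^ dot(u, x)) for x in POINTS)

def span(basis):
    """All F_2-linear combinations of the given vectors."""
    elems = {0}
    for v in basis:
        elems |= {e ^ v for e in elems}
    return sorted(elems)

E = span([0b0011, 0b0100])                     # an arbitrary 2-dimensional subspace
E_perp = [u for u in POINTS if all(dot(u, e) == 0 for e in E)]
a, b = 0b1001, 0b0110                          # arbitrary translation vectors

# Left-hand side of (2.41): sum over the coset a + E.
lhs = sum((-1) ** dot(b, a ^ e) * walsh(a ^ e) for e in E)
# Right-hand side of (2.41): sum over the coset b + E_perp.
rhs = len(E) * (-1) ** dot(a, b) * sum(
    (-1) ** (f(b ^ y) ^ dot(a, b ^ y)) for y in E_perp
)
```

Both sides are bounded in absolute value by $|E| \cdot |E^\perp| = 2^n$, in line with the remark above.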
Relation (2.39) with $a = 0_n$ and $E = \mathbb{F}_2^n$ gives the following:

Corollary 4 For every pseudo-Boolean function $\varphi$ on $\mathbb{F}_2^n$:

\[ \widehat{\widehat{\varphi}} = 2^n \varphi. \tag{2.42} \]

Thus, the Fourier–Hadamard transform is a permutation on the set of pseudo-Boolean
functions on $\mathbb{F}_2^n$ and is its own inverse, up to the division²⁹ by the constant
$2^n$. Relation (2.42) is called the inverse Fourier–Hadamard transform formula and
writes $\sum_{u \in \mathbb{F}_2^n} \widehat{\varphi}(u)\, (-1)^{u \cdot x} = 2^n \varphi(x)$. It means that, viewing $\varphi$ as a function of
$x_\chi = ((-1)^{x_1}, \dots, (-1)^{x_n})$, the number $2^{-n}\, \widehat{\varphi}(u)$ is the NNF coefficient indexed by $u$ of the
resulting function.³⁰ Applied to a sign function, Relation (2.42) is called the inverse Walsh
transform formula and writes the following:

\[ \sum_{u \in \mathbb{F}_2^n} W_f(u)\, (-1)^{u \cdot x} = 2^n (-1)^{f(x)}. \tag{2.43} \]
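As an illustration of (2.43), the truth table of a Boolean function can be recovered from its Walsh spectrum by applying the same butterfly once more and dividing by $2^n$. The sketch below assumes the spectrum printed in the last column of Table 2.2; the helper name `hadamard` is hypothetical.

```python
def hadamard(values):
    """Butterfly transform of a list whose length is a power of two."""
    out = list(values)
    half = 1
    while half != len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                out[i], out[i + half] = out[i] + out[i + half], out[i] - out[i + half]
        half *= 2
    return out

# Last column of Table 2.2, indexed by u1 + 2*u2 + 4*u3 + 8*u4.
spectrum = [0, 0, 8, 8, 0, 0, 0, 0, 4, -4, 4, -4, -4, 4, 4, -4]

# Relation (2.43): transforming the spectrum gives 2^n times the sign function.
signs = [v // len(spectrum) for v in hadamard(spectrum)]
truth_table = [(1 - s) // 2 for s in signs]
```

The recovered table coincides with the column f(x) of Table 2.2.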

Corollary 4 shows easily that a given property observed on the Fourier–Hadamard
transform of any pseudo-Boolean function $\varphi$ having some specificity is in fact a necessary
and sufficient condition for $\varphi$ having this specificity. For instance, according to Proposition
10, the Fourier–Hadamard transform of any constant function $\varphi$ takes the null value at every
nonzero vector. Since the Fourier–Hadamard transform of a function null at every nonzero
vector is constant, Corollary 4 implies that the Fourier–Hadamard transform is a bijection
between the set of constant functions and the set of those functions null at every nonzero
vector. Similarly, $\varphi$ is constant on $\mathbb{F}_2^n \setminus \{0_n\}$ if and only if $\widehat{\varphi}$ is constant on $\mathbb{F}_2^n \setminus \{0_n\}$.

29 In order to avoid this division, the Fourier–Hadamard transform is often normalized, that is, divided
by $\sqrt{2^n} = 2^{n/2}$, so that it becomes its own inverse. We do not use this normalized transform here because the
functions we consider are integer valued, and we want their Fourier–Hadamard transforms to be also integer
valued.
30 In [693, 779, 914], the authors call Fourier transform this representation of ϕ viewed as a polynomial in xχ .

A classical property of the Fourier transform is to be an isomorphism from the set of
functions endowed with the so-called convolutional product (denoted by $\otimes$) into this same
set, endowed with the usual product (denoted by $\times$). We recall the definition³¹ of the
convolutional product between two functions $\varphi$ and $\psi$:

\[ (\varphi \otimes \psi)(x) = \sum_{y \in \mathbb{F}_2^n} \varphi(y)\, \psi(x + y). \]

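Proposition 11 Let $\varphi$ and $\psi$ be any pseudo-Boolean functions on $\mathbb{F}_2^n$. We have the
following:

\[ \widehat{\varphi \otimes \psi} = \widehat{\varphi} \times \widehat{\psi}. \tag{2.44} \]

Consequently:

\[ \widehat{\varphi} \otimes \widehat{\psi} = 2^n\, \widehat{\varphi \times \psi}. \tag{2.45} \]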

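Proof We have

\[ \widehat{\varphi \otimes \psi}(u) = \sum_{x \in \mathbb{F}_2^n} (\varphi \otimes \psi)(x)\, (-1)^{u \cdot x} = \sum_{x \in \mathbb{F}_2^n} \sum_{y \in \mathbb{F}_2^n} \varphi(y)\, \psi(x + y)\, (-1)^{u \cdot y \oplus u \cdot (x+y)}. \]

Thus, by the change of variable $(x, y) \mapsto (x + y, y)$, we have the following:

\[ \widehat{\varphi \otimes \psi}(u) = \left( \sum_{y \in \mathbb{F}_2^n} \varphi(y)\, (-1)^{u \cdot y} \right) \left( \sum_{x \in \mathbb{F}_2^n} \psi(x)\, (-1)^{u \cdot x} \right) = \widehat{\varphi}(u)\, \widehat{\psi}(u). \]

This proves the first equality. Applying it to $\widehat{\varphi}$ and $\widehat{\psi}$ in the places of $\varphi$ and $\psi$, we obtain
$\widehat{\widehat{\varphi} \otimes \widehat{\psi}} = \widehat{\widehat{\varphi}} \times \widehat{\widehat{\psi}} = 2^{2n}\, \varphi \times \psi$, according to Corollary 4. Using again this same corollary, we deduce
Relation (2.45).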
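Relations (2.44) and (2.45) can be verified numerically on small tables. The following sketch uses two arbitrary integer-valued functions on $\mathbb{F}_2^3$; the helper names and the sample values are assumptions made for the illustration, and addition in $\mathbb{F}_2^n$ is implemented as XOR on integer-encoded vectors.

```python
def fourier_hadamard(values):
    """Butterfly transform of a list whose length is a power of two."""
    out = list(values)
    half = 1
    while half != len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                out[i], out[i + half] = out[i] + out[i + half], out[i] - out[i + half]
        half *= 2
    return out

def convolve(phi, psi):
    """Convolutional product over F_2^n: sum over y of phi(y) * psi(x + y)."""
    size = len(phi)
    return [sum(phi[y] * psi[x ^ y] for y in range(size)) for x in range(size)]

phi = [2, -1, 0, 3, 1, 1, -2, 4]       # arbitrary pseudo-Boolean functions, n = 3
psi = [1, 0, 5, -3, 2, 2, 0, -1]
phi_hat, psi_hat = fourier_hadamard(phi), fourier_hadamard(psi)

# Relation (2.44): the transform of the convolution is the pointwise product.
lhs_244 = fourier_hadamard(convolve(phi, psi))
rhs_244 = [s * t for s, t in zip(phi_hat, psi_hat)]

# Relation (2.45): convolving the transforms gives 2^n times the transform of the product.
lhs_245 = convolve(phi_hat, psi_hat)
rhs_245 = [len(phi) * v for v in fourier_hadamard([s * t for s, t in zip(phi, psi)])]
```

The direct convolution costs about $2^{2n}$ operations, whereas going through (2.44) and the inverse formula costs on the order of $n\,2^n$.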

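Relation (2.45) applied at $0_n$ gives a relation sometimes called Plancherel’s formula:

\[ \sum_{x \in \mathbb{F}_2^n} \widehat{\varphi}(x)\, \widehat{\psi}(x) = 2^n \sum_{x \in \mathbb{F}_2^n} \varphi(x)\, \psi(x). \tag{2.46} \]

Taking $\psi = \varphi$ in (2.46), we obtain Parseval’s relation:

Corollary 5 For every pseudo-Boolean function $\varphi$, we have the following:

\[ \sum_{u \in \mathbb{F}_2^n} \widehat{\varphi}^{\,2}(u) = 2^n \sum_{x \in \mathbb{F}_2^n} \varphi^2(x). \]

If $\varphi$ takes values ±1 only, this becomes the following:

\[ \sum_{u \in \mathbb{F}_2^n} \widehat{\varphi}^{\,2}(u) = 2^{2n}. \tag{2.47} \]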

31 Since the operations take place in Fn2 , we have a + in the formula, where for general groups we would have
a −.

This is why, when dealing with Boolean functions, we most often prefer using the Walsh
transform of $f$ instead of the Fourier–Hadamard transform of $f$. Parseval’s relation for
the Walsh transform writes the following:

\[ \sum_{u \in \mathbb{F}_2^n} W_f^2(u) = 2^{2n}. \tag{2.48} \]

According to the inverse Walsh transform formula and to the Parseval formula, we have
for every function $f$ that $\left( \sum_{u \in \mathbb{F}_2^n} W_f(u) \right)^2 = \left( 2^n (-1)^{f(0_n)} \right)^2 = 2^{2n} = \sum_{u \in \mathbb{F}_2^n} W_f^2(u)$, that is,
$\sum_{u \neq v} W_f(u)\, W_f(v) = 0$. Note that this proves (as observed in [312]) that it is impossible,
except when the function is affine, i.e., when the Walsh transform is null except at one point,
that all nonzero values of the Walsh transform have the same sign.
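As a worked check, take the function of Table 2.2 ($n = 4$), whose Walsh spectrum, as printed in the last column of the table, contains the value 8 twice, the value 4 four times, the value −4 four times, and the value 0 six times. Assuming these tabulated values:

\[ \sum_{u \in \mathbb{F}_2^4} W_f^2(u) = 2 \cdot 8^2 + 8 \cdot 4^2 = 128 + 128 = 256 = 2^{8}, \qquad \sum_{u \in \mathbb{F}_2^4} W_f(u) = 2 \cdot 8 + 4 \cdot 4 - 4 \cdot 4 = 16 = 2^4 (-1)^{f(0_4)}, \]

with $f(0_4) = 0$, so that $\sum_{u \neq v} W_f(u)\, W_f(v) = 16^2 - 256 = 0$. Consistently with the observation above, this nonaffine function has Walsh values of both signs.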
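Relation (2.45) applied at $a \neq 0_n$ gives

\[ (\widehat{\varphi} \otimes \widehat{\psi})(a) = 2^n\, \widehat{\varphi \times \psi}(a) = 2^n \sum_{x \in \mathbb{F}_2^n} \varphi(x)\, \psi(x)\, (-1)^{a \cdot x}. \tag{2.49} \]

If $\varphi$ takes values ±1 only and $\psi = \varphi$, this becomes the following:

\[ \sum_{u \in \mathbb{F}_2^n} \widehat{\varphi}(u)\, \widehat{\varphi}(u + a) = 0. \tag{2.50} \]

This provides the relation that some authors call the Titsworth relation:

\[ \sum_{u \in \mathbb{F}_2^n} W_f(u)\, W_f(u + a) = 0, \quad \forall a \neq 0_n. \tag{2.51} \]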

Note that in some cases (for instance, for designing correlation immune functions of low
Hamming weights; see Section 7.1.9, page 303) using the Fourier–Hadamard transform of a
Boolean function is more convenient.
When $\mathbb{F}_2^n$ is identified with $\mathbb{F}_{2^n}$, with inner product $u \cdot x = \mathrm{tr}_n(ux)$, Parseval’s relation is a
particular case (corresponding to $a = 1$) of the following more general relation:

\[ \sum_{u \in \mathbb{F}_{2^n}} W_f(u)\, W_f(au) = \sum_{u, x, y \in \mathbb{F}_{2^n}} (-1)^{f(y) \oplus f(x) \oplus \mathrm{tr}_n(uy + aux)} = 2^n \sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) \oplus f(ax)}. \]

Relation (2.44) applied with $\psi = \varphi = f_\chi$ implies the Wiener–Khintchine formula:

\[ \widehat{f_\chi \otimes f_\chi} = W_f^2, \tag{2.52} \]

which involves in fact the derivatives of the Boolean function, since for every $a \in \mathbb{F}_2^n$,
we have $(f_\chi \otimes f_\chi)(a) = \sum_{x \in \mathbb{F}_2^n} (-1)^{D_a f(x)} = \mathcal{F}(D_a f)$ (the notation $\mathcal{F}$ was defined at
Relation (2.36), page 57).

Definition 17 The function $a \mapsto \mathcal{F}(D_a f)$ is called the autocorrelation function of $f$ and
denoted by $\Delta_f$.

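Relation (2.52) means that $W_f^2$ is the Fourier–Hadamard transform of the autocorrelation
function of $f$:

\[ \forall u \in \mathbb{F}_2^n, \quad \widehat{\Delta_f}(u) = \sum_{a \in \mathbb{F}_2^n} \Delta_f(a)\, (-1)^{u \cdot a} = W_f^2(u). \tag{2.53} \]

Equivalently, by applying the inverse Fourier transform formula, we have

\[ \Delta_f(a) = 2^{-n} \sum_{u \in \mathbb{F}_2^n} W_f^2(u)\, (-1)^{u \cdot a}. \tag{2.54} \]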
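Relation (2.54) gives a way to obtain the whole autocorrelation function with two butterfly transforms instead of $2^{2n}$ evaluations. The sketch below compares both computations on the function of Table 2.2; the helper name `hadamard` and the variable names are assumptions of this illustration.

```python
def hadamard(values):
    """Butterfly transform of a list whose length is a power of two."""
    out = list(values)
    half = 1
    while half != len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                out[i], out[i + half] = out[i] + out[i + half], out[i] - out[i + half]
        half *= 2
    return out

# Truth table of x1 x2 x3 XOR x1 x4 XOR x2, indexed by x1 + 2*x2 + 4*x3 + 8*x4.
tt = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]
size = len(tt)

# Direct definition: Delta_f(a) is the sum over x of (-1)^(f(x) XOR f(x + a)).
autocorr_direct = [
    sum((-1) ** (tt[x] ^ tt[x ^ a]) for x in range(size)) for a in range(size)
]

# Relation (2.54): inverse transform of the squared Walsh spectrum.
walsh = hadamard([1 - 2 * bit for bit in tt])
autocorr_from_walsh = [v // size for v in hadamard([w * w for w in walsh])]
```

The value at $a = 0_n$ is always $2^n$, since the derivative in the null direction is the zero function.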

This property was first used (as far as we know) in the domain of cryptography in [211] to
study the so-called partially-bent functions (see Section 6.2). It leads also to a lower bound
on the numerical degree of Boolean functions by means of the Hamming weights of their
derivatives (in directions of Hamming weight 1), first given in [905], that we shall give (and
prove) as Relation (2.63), page 67.
Applied at vector $0_n$, Relation (2.53) gives

\[ \widehat{\Delta_f}(0_n) = \sum_{a \in \mathbb{F}_2^n} \mathcal{F}(D_a f) = \mathcal{F}^2(f). \tag{2.55} \]

Corollary 3 (the Poisson summation formula), page 58, and Relation (2.53) imply that, for
every vector subspace $E$ of $\mathbb{F}_2^n$ and all vectors $a$ and $b$ (cf. [191]):

\[ \sum_{u \in a+E} (-1)^{b \cdot u}\, W_f^2(u) = |E|\, (-1)^{a \cdot b} \sum_{e \in b+E^\perp} (-1)^{a \cdot e}\, \mathcal{F}(D_e f). \tag{2.56} \]

This leads to an interesting relation, first shown in [191] for Boolean functions (but similar
relations exist in other domains such as sequences and learning; see, e.g., [779]), and that,
because of its similarity with the Poisson summation formula, we shall call the second-order
Poisson summation formula:³²

Proposition 12 Let $E$ and $E'$ be supplementary subspaces³³ of $\mathbb{F}_2^n$ (i.e., be two subspaces
such that $E \cap E' = \{0_n\}$ and whose direct sum equals $\mathbb{F}_2^n$). For every $a \in E'$, let $h_a$ be the
restriction of $f$ to the coset $a + E$ ($h_a$ can be identified with a function on $\mathbb{F}_2^k$ where $k$ is the
dimension of $E$). Then

\[ \sum_{u \in E^\perp} W_f^2(u) = |E^\perp| \sum_{a \in E'} \mathcal{F}^2(h_a). \tag{2.57} \]

Proof Every element of $\mathbb{F}_2^n$ can be written in a unique way in the form $x + a$, where $x \in E$
and $a \in E'$.

32 This formula is sometimes more convenient to use than the Poisson summation formula. An example where it
helps proving more can be found in Section 10.4.
33 Some authors say “complementary,” but we prefer avoiding the confusion with complementary sets and use
“supplementary.”
For every $e \in E$, we have that $\mathcal{F}(D_e f) = \sum_{x \in E;\, a \in E'} (-1)^{f(x+a) \oplus f(x+e+a)} = \sum_{a \in E'} \mathcal{F}(D_e h_a)$.
We deduce from Relation (2.56), applied with $E^\perp$ instead of $E$, and with $a = b = 0_n$, that

\[ \sum_{u \in E^\perp} W_f^2(u) = |E^\perp| \sum_{e \in E} \mathcal{F}(D_e f) = |E^\perp| \sum_{e \in E} \sum_{a \in E'} \mathcal{F}(D_e h_a) = |E^\perp| \sum_{a \in E'} \left( \sum_{e \in E} \mathcal{F}(D_e h_a) \right). \]

Thus, according to Relation (2.55) applied with $E$ in the place of $\mathbb{F}_2^n$ (recall that $E$ can be
identified with $\mathbb{F}_2^k$ where $k$ is the dimension of $E$): $\sum_{u \in E^\perp} W_f^2(u) = |E^\perp| \sum_{a \in E'} \mathcal{F}^2(h_a)$.
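Relation (2.57) can be illustrated on the function of Table 2.2. In the sketch below, the choice $E = \langle e_1, e_2 \rangle$ and $E' = \langle e_3, e_4 \rangle$, the integer encoding of vectors, and the helper names are assumptions made for the example.

```python
N = 4
POINTS = range(2 ** N)

def dot(u, x):
    return bin(u & x).count("1") % 2

def f(x):
    """x1 x2 x3 XOR x1 x4 XOR x2, with x1 the least significant bit of x."""
    x1, x2, x3, x4 = x % 2, (x // 2) % 2, (x // 4) % 2, (x // 8) % 2
    return (x1 & x2 & x3) ^ (x1 & x4) ^ x2

def walsh(u):
    return sum((-1) ** (f(x) ^ dot(u, x)) for x in POINTS)

def span(basis):
    """All F_2-linear combinations of the given vectors."""
    elems = {0}
    for v in basis:
        elems |= {e ^ v for e in elems}
    return sorted(elems)

E = span([0b0001, 0b0010])          # vectors with x3 = x4 = 0
E_prime = span([0b0100, 0b1000])    # a supplementary subspace of E
E_perp = [u for u in POINTS if all(dot(u, e) == 0 for e in E)]

# Left-hand side of (2.57): squared Walsh values over the orthogonal of E.
lhs = sum(walsh(u) ** 2 for u in E_perp)
# Right-hand side: F(h_a) is the sum of (-1)^f over the coset a + E.
rhs = len(E_perp) * sum(
    sum((-1) ** f(a ^ x) for x in E) ** 2 for a in E_prime
)
```

Here the four restrictions $h_a$ have $\mathcal{F}(h_a)$ equal to 0, 2, 0, and −2, so both sides equal $4 \cdot 8 = 32$.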

Fourier–Hadamard transform and affine automorphisms


A last relation that must be mentioned shows what the composition with a linear isomorphism
implies on the Fourier transform of a pseudo-Boolean function:

Proposition 13 Let $\varphi$ be any pseudo-Boolean function on $\mathbb{F}_2^n$. Let $M$ be a nonsingular $n \times n$
binary matrix and $L$ the linear automorphism $L : (x_1, x_2, \dots, x_n) \mapsto (x_1, x_2, \dots, x_n) \times M$.
Let us denote by $M'$ the transpose of $M^{-1}$ and by $L'$ the linear automorphism
$L' : (x_1, x_2, \dots, x_n) \mapsto (x_1, x_2, \dots, x_n) \times M'$ (note that $L'$ is the adjoint operator of $L^{-1}$, that
is, it satisfies $u \cdot L^{-1}(x) = L'(u) \cdot x$ for every $x$ and $u$, where · is the usual inner product).
Then

\[ \widehat{\varphi \circ L} = \widehat{\varphi} \circ L'. \tag{2.58} \]

Proof By the change of variable $x \mapsto L^{-1}(x)$, we have that for every $u \in \mathbb{F}_2^n$,
$\widehat{\varphi \circ L}(u) = \sum_{x \in \mathbb{F}_2^n} \varphi(L(x))\, (-1)^{u \cdot x}$ equals $\sum_{x \in \mathbb{F}_2^n} \varphi(x)\, (-1)^{u \cdot L^{-1}(x)}$ and, by the definition of $L'$, equals
then $\sum_{x \in \mathbb{F}_2^n} \varphi(x)\, (-1)^{L'(u) \cdot x}$.

It is easily deduced from this Relation (2.58) and from Lemma 4, page 58, that the affine
equivalence of Boolean functions translates into the affine equivalence of their extended
Walsh transforms and in particular of their Walsh supports.
Given linear bijections $L_1$, $L_2$, a linear function $L_3$ and vectors $a$, $b$, $c$, the value of
$W_{(L_1 + a) \circ F \circ (L_2 + b) + L_3 + c}(u, v) = \pm \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot (L_1(F(L_2(x) + b)) + L_3(x)) \oplus u \cdot x}$ equals
$\pm W_F\left( (L_3 \circ L_2^{-1})^*(v) + (L_2^{-1})^*(u),\ L_1^*(v) \right)$, where $^*$ is the adjoint operator.

Relationship between algebraic degree and Walsh transform


The following bound was shown in [737] (see also [212, Lemma 3]):

Theorem 2 Let $f$ be an $n$-variable Boolean function ($n \geq 2$), and let $1 \leq k \leq n$.
Assume that the Walsh transform values of $f$ are all divisible by $2^k$ (i.e., according to
Relation (2.32), that its Fourier–Hadamard transform takes values divisible by $2^{k-1}$, or
equivalently, according to Relation (2.37), that all the Hamming distances between $f$ and
affine functions are divisible by $2^{k-1}$). Then $f$ has algebraic degree at most $n - k + 1$.

Proof Let us suppose that $f$ has algebraic degree $d > n - k + 1$ and consider a term $x^I$
of degree $d$ in its algebraic normal form. The Poisson summation formula (2.40) applied to
$\varphi = f_\chi$ and to the vector space $E = \{u \in \mathbb{F}_2^n;\ \forall i \in I,\ u_i = 0\}$ gives $\sum_{u \in E} W_f(u) = 2^{n-d} \sum_{x \in E^\perp} f_\chi(x)$.
The orthogonal $E^\perp$ of $E$ equals $\{u \in \mathbb{F}_2^n;\ \forall i \notin I,\ u_i = 0\} = \{u \in \mathbb{F}_2^n;\ \mathrm{supp}(u) \subseteq I\}$.
According to Relation (2.4), we have that $\sum_{x \in E^\perp} f(x)$ is not even and
therefore $\sum_{x \in E^\perp} f_\chi(x)$ is not divisible by 4. Hence, $\sum_{u \in E} W_f(u)$ is not divisible by $2^{n-d+2}$
and it is therefore not divisible by $2^k$, a contradiction.
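Theorem 2 can be illustrated numerically by comparing the algebraic degree of a function with the largest power of 2 dividing all its Walsh values. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the algebraic degree is obtained with the binary Möbius transform, and the example is the function of Table 2.2.

```python
def hadamard(values):
    """Butterfly transform of a list whose length is a power of two."""
    out = list(values)
    half = 1
    while half != len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                out[i], out[i + half] = out[i] + out[i + half], out[i] - out[i + half]
        half *= 2
    return out

def algebraic_degree(tt):
    """Degree of the ANF, computed with the binary Moebius transform (0 for the null function)."""
    anf = list(tt)
    size = len(anf)
    half = 1
    while half != size:
        for i in range(size):
            if i & half:
                anf[i] ^= anf[i ^ half]
        half *= 2
    return max((bin(i).count("1") for i in range(size) if anf[i]), default=0)

def walsh_divisibility(tt):
    """Largest k, capped at n, such that 2^k divides every Walsh value of f."""
    k = len(tt).bit_length() - 1
    for w in hadamard([1 - 2 * bit for bit in tt]):
        if w != 0:
            k = min(k, (abs(w) & -abs(w)).bit_length() - 1)
    return k

# Truth table of x1 x2 x3 XOR x1 x4 XOR x2, indexed by x1 + 2*x2 + 4*x3 + 8*x4.
tt = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1]
n = 4
degree = algebraic_degree(tt)
k = walsh_divisibility(tt)
slack = (n - k + 1) - degree     # Theorem 2 predicts that this is nonnegative
```

For this function the Walsh values are 0, ±4, and 8, so $k = 2$, and the bound $n - k + 1 = 3$ is attained by the algebraic degree.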

Remark. The result is of course also valid for those vectorial $(n, m)$-functions whose
Walsh transform values are divisible by $2^k$. It is shown in [204] that for any $(n, m)$-function
$F$ having such a divisibility property and for every $(m, r)$-function $G$, we have
$d_{alg}(G \circ F) \leq n - k + d_{alg}(G)$. This bound on the algebraic degree of composite functions
is a direct consequence of Relation (2.34), page 55: all component functions of $F$ having
Walsh transform values divisible by $2^k$, then for every $l = 1, \dots, k$, all products of $l$
coordinate functions of $F$ have a Walsh transform divisible by $2^{k-l+1}$, and Theorem 2
completes the proof. As shown in [254], it is also a direct consequence of Relation (2.9),
page 40.

Remark.
1. The converse of Theorem 2 is valid if $k = 1$ (since the Walsh transform values of all
Boolean functions are even by definition). It is also valid if $k = 2$, since the $n$-variable
Boolean functions of degrees at most $n - 1$ are those Boolean functions of even Hamming
weights, and $f(x) \oplus u \cdot x$ has degree at most $n - 1$ too for every $u$, since $n \geq 2$. It is finally
also valid for $k = n$, since the affine functions are characterized by the fact that their
Walsh transforms take values $\pm 2^n$ and 0 only (more precisely, their Walsh transforms
take value $\pm 2^n$ once, and all their other values are null). The converse is false for any
other value of $k$. Indeed, it is false for $k = n - 1$ ($n \geq 4$), since there exist quadratic
functions $f$ whose Walsh transforms take values $\pm 2^{n/2}$ for $n$ even, $\geq 4$, and $\pm 2^{(n+1)/2}$
for $n$ odd, $\geq 5$ (see Section 5.2, page 170). It is then an easy task to deduce that the
converse of Theorem 2 is also false for any value of $k$ such that $3 \leq k \leq n - 2$: we
choose a quadratic function $g$ in 4 variables, whose Walsh transform value at the null vector equals
$2^2$, that is, whose weight equals $2^3 - 2 = 6$, and we take $f(x) = g(x_1, x_2, x_3, x_4)\, x_5 \cdots x_l$
($5 \leq l \leq n$). Such a function has the algebraic degree $l - 2$ and, viewed as an $n$-variable function,
its weight equals $6 \cdot 2^{n-l}$; hence its Walsh transform value at $0_n$ equals
$2^n - 12 \cdot 2^{n-l} = 2^{n-l+2}\,(2^{l-2} - 3)$, where $2^{l-2} - 3$ is odd, and is therefore not divisible by $2^k$ with
$n - 2 \geq k = n - (l - 2) + 1 = n - l + 3 \geq 3$.
2. It is possible to characterize the functions whose Walsh transform values are all divisible
by $2^{n-1}$ (i.e., equal 0, $\pm 2^{n-1}$ and/or $\pm 2^n$): according to Theorem 2, they have algebraic
degree at most 2, and the characterization follows from the results of Section 5.2 on
quadratic functions (see the last remark of page 173); these functions are the sums
of an affine function and of the product of two affine functions (see, for instance, the
observation after Theorem 10, page 172). Determining those Boolean functions (in the
Reed–Muller code of order $n - k + 1$) whose Walsh transform is divisible by $2^k$ is
an open problem for $3 \leq k \leq n - 2$. The Poisson summation formula provides some
information; it shows by applying the proof of Theorem 2 to $W_f(a + u)$ that, for every
supplementary subspaces $E$ and $E'$ of $\mathbb{F}_2^n$ (i.e., such that $E \cap E' = \{0_n\}$ and whose direct
sum equals $\mathbb{F}_2^n$), where $E$ has dimension $d \geq n - k$, denoting for every $a \in E'$ by $h_a$
the restriction of $f$ to the coset $a + E$, the value of $\mathcal{F}(h_a)$ is divisible by $2^{k+d-n}$; and
we also have that the arithmetic mean (i.e., average) of $\mathcal{F}(h_a)$ when $a$ ranges over $E'$ is
divisible by $2^{k+d-n}$ (indeed, $\sum_{a \in E'} \mathcal{F}(h_a) = \mathcal{F}(f) = W_f(0_n)$ is divisible by $2^k$). The
second-order Poisson summation formula also provides complementary information on
such functions: the quadratic mean (i.e., the root mean square) of $\mathcal{F}(h_a)$ when $a$ ranges
over $E'$ is also divisible by $2^{k+d-n}$. Indeed, $\sum_{a \in E'} \mathcal{F}^2(h_a) = \frac{1}{|E^\perp|} \sum_{u \in E^\perp} W_f^2(u)$ is
divisible by $2^{2k+d-n}$; hence, the arithmetic mean of $\mathcal{F}^2(h_a)$ is divisible by $2^{2k+2d-2n}$.
Summarizing, we have that the integer sequence $2^{-(k+d-n)} \mathcal{F}(h_a)$, of length $2^{n-d}$, has
integer arithmetic and quadratic means. Note that, according to McEliece’s theorem (see
page 13), given a monomial Boolean function $f(x) = \mathrm{tr}_n(x^d)$ where $\gcd(d, 2^n - 1) = 1$,
the largest possible exponent of a power of 2 dividing each Walsh transform value of $f$
equals $\min\{w_2(t_0) + w_2(t_1);\ 1 \leq t_0, t_1 < 2^n - 1,\ t_0 + t_1 d \equiv 0\ [\mathrm{mod}\ 2^n - 1]\}$ (see the
definition of $w_2$ at page 45). See bounds in [674, 676].
3. It is possible to characterize the fact that a Boolean function has algebraic degree at most
$d$ by means of its Fourier–Hadamard or Walsh transforms: since, as seen in Proposition 4,
page 36, a Boolean function has an algebraic degree at most $d$ if and only if its restriction
to any $(d + 1)$-dimensional flat (i.e., affine subspace) has even Hamming weight, we can
apply Poisson summation formula (2.39). For instance, in terms of the Walsh transform,
$f$ has an algebraic degree at most $d$ if and only if, for every $(n - d - 1)$-dimensional
vector subspace $E$ of $\mathbb{F}_2^n$ and every $b \in \mathbb{F}_2^n$, the sum $\sum_{u \in E} (-1)^{b \cdot u}\, W_f(u)$ is divisible by
$2^{n-d+1}$. But this characterization is not simple.

Characterizing the Fourier–Hadamard transforms of pseudo-Boolean functions and the Walsh transforms of Boolean functions
According to the inverse Fourier–Hadamard transform formula (2.42), the Fourier–Hadamard
transforms of integer-valued functions (resp. the Walsh transforms of Boolean
functions) are those integer-valued functions over $\mathbb{F}_2^n$ whose Fourier–Hadamard transforms
take values divisible by $2^n$ (resp. take values $\pm 2^n$). Also, according to the inverse Walsh
transform formula (2.43), page 59, the Walsh transforms of Boolean functions are those
integer-valued functions $\psi$ over $\mathbb{F}_2^n$ such that $(\widehat{\psi})^2$ equals the constant function $2^{2n}$; they are
then those integer-valued functions $\psi$ such that $\widehat{\psi \otimes \psi} = 2^{2n}$ (according to Relation (2.44)
applied with $\varphi = \psi$), that is, $\psi \otimes \psi = 2^{2n} \delta_0$.

These characterizations need to check $2^n$ divisibilities by $2^n$ for the Fourier–Hadamard
transforms of integer-valued functions, and $2^n$ equalities for the Walsh transforms of
Boolean functions.
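The first characterization translates into a short test: an integer-valued table is the Walsh spectrum of some Boolean function exactly when its own Fourier–Hadamard transform takes only the values $\pm 2^n$. A minimal sketch, with hypothetical helper names, could be:

```python
def hadamard(values):
    """Butterfly transform of a list whose length is a power of two."""
    out = list(values)
    half = 1
    while half != len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                out[i], out[i + half] = out[i] + out[i + half], out[i] - out[i + half]
        half *= 2
    return out

def is_walsh_spectrum(psi):
    """True if the integer table psi (length 2^n) is the Walsh transform of a Boolean function."""
    size = len(psi)
    return all(abs(v) == size for v in hadamard(psi))

# Last column of Table 2.2: a genuine Walsh spectrum (n = 4).
table_spectrum = [0, 0, 8, 8, 0, 0, 0, 0, 4, -4, 4, -4, -4, 4, 4, -4]
# A table that satisfies Parseval's relation (16 * 4^2 = 2^8) but is not a Walsh spectrum.
flat_candidate = [4] * 16

genuine = is_walsh_spectrum(table_spectrum)
spurious = is_walsh_spectrum(flat_candidate)
```

The second table shows that Parseval's relation (2.48) alone is necessary but not sufficient: its transform is concentrated at the null vector instead of taking the values $\pm 2^n$ everywhere.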

Case of monomial (or power) Boolean functions: So-called monomial Boolean (univariate)
functions are those functions over $\mathbb{F}_{2^n}$ of the form $f(x) = \mathrm{tr}_n(a x^d)$ (recall from page 42
that when $\mathbb{F}_2^n$ is identified with $\mathbb{F}_{2^n}$, an inner product is then $(x, y) \mapsto \mathrm{tr}_n(x y)$). We shall
give at page 72 the known results and a conjecture on the Walsh spectrum of such functions.

2.3.4 Fourier–Hadamard (and Walsh) transform and numerical normal form


Since the main cryptographic criteria on Boolean functions will be characterized as
properties of their Fourier–Hadamard/Walsh transforms (see Section 3.1), it is useful to
clarify the relationship between these and the NNF representation. Note that there is
a similarity between the Fourier–Hadamard transform and the NNF of pseudo-Boolean
functions:
– The functions $(-1)^{u\cdot x}$, $u \in \mathbb{F}_2^n$, constitute an orthogonal basis of the space of pseudo-Boolean functions, and the Fourier–Hadamard transform can be seen as a classical decomposition over an orthogonal basis.
– The NNF is defined similarly with respect to the basis of monomials, which is nonorthogonal but allows as well simple calculation of the coefficients in this decomposition.

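Let us see now how each representation can be expressed by means of the other.
Let $\varphi(x)$ be any pseudo-Boolean function, and let $\sum_{I\subseteq\{1,\dots,n\}} \lambda_I\, x^I$ be its NNF. For every vector $x \in \mathbb{F}_2^n$, we have $\varphi(x) = \sum_{I\subseteq \mathrm{supp}(x)} \lambda_I$. Setting $1_n = (1,\dots,1)$, we have
$$\varphi(x+1_n) = \sum_{I\subseteq\{1,\dots,n\};\ \mathrm{supp}(x)\cap I=\emptyset} \lambda_I$$
(since the support of $x + 1_n$ equals $\{1,\dots,n\}\setminus \mathrm{supp}(x)$). Hence, $\varphi(x+1_n) = \sum_{I\subseteq\{1,\dots,n\}} \lambda_I\, 1_{E_I}(x)$, where $E_I$ is the $(n-|I|)$-dimensional vector subspace of $\mathbb{F}_2^n$ equal to $\{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x)\cap I = \emptyset\}$, whose orthogonal space equals $\{u \in \mathbb{F}_2^n;\ \mathrm{supp}(u) \subseteq I\}$. Applying Lemma 4 with $a = 0_n$ and $b = 1_n$, and Proposition 10, we deduce (as proved in [292]) the following:
$$\widehat{\varphi}(u) = (-1)^{w_H(u)} \sum_{I\subseteq\{1,\dots,n\};\ \mathrm{supp}(u)\subseteq I} 2^{n-|I|}\,\lambda_I. \tag{2.59}$$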

We deduce the following:
$$\lambda_I = 2^{-n}\,(-2)^{|I|} \sum_{u\in\mathbb{F}_2^n;\ I\subseteq \mathrm{supp}(u)} \widehat{\varphi}(u). \tag{2.60}$$

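Indeed, according to Relation (2.59), we have
$$2^{-n}(-2)^{|I|} \sum_{u\in\mathbb{F}_2^n;\ I\subseteq \mathrm{supp}(u)} \widehat{\varphi}(u) = 2^{-n}(-2)^{|I|} \sum_{J\subseteq\{1,\dots,n\}} \Bigg(\sum_{u\in\mathbb{F}_2^n;\ I\subseteq \mathrm{supp}(u)\subseteq J} (-1)^{w_H(u)}\Bigg)\, 2^{n-|J|}\,\lambda_J,$$
and the sum inside the parentheses equals 0 if $I \not\subseteq J$ and otherwise is also null if $J \neq I$, since it equals
$$(-1)^{|I|} \sum_{u\in\mathbb{F}_2^n;\ \mathrm{supp}(u)\subseteq J\setminus I} (-1)^{w_H(u)} = (-1)^{|I|}\,(1-1)^{|J\setminus I|}.$$
The only remaining term is $J = I$, for which the inner sum equals $(-1)^{|I|}$, and the right-hand side reduces to $2^{-n}(-2)^{|I|}(-1)^{|I|}\,2^{n-|I|}\lambda_I = \lambda_I$.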
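Relations (2.59) and (2.60) can be checked numerically on a small example. The following is a minimal sketch, not taken from the text: the pseudo-Boolean function `phi` is an arbitrary illustration, and the NNF coefficients are obtained by Möbius inversion over the integers, $\lambda_I = \sum_{\mathrm{supp}(x)\subseteq I} (-1)^{|I|-w_H(x)}\varphi(x)$, which is assumed here as the standard way of computing them.

```python
from itertools import product

n = 3
# Vectors of F_2^n; the same tuples also serve as indicator vectors of subsets I.
points = list(product((0, 1), repeat=n))

def subset(a, b):
    """True if supp(a) is included in supp(b)."""
    return all(ai <= bi for ai, bi in zip(a, b))

def weight(u):
    """Hamming weight of u (or size of the subset it indicates)."""
    return sum(u)

# Hypothetical example: phi(x) = 1 + 3*x1 - 2*x2*x3 + x1*x2*x3.
phi = {x: 1 + 3 * x[0] - 2 * x[1] * x[2] + x[0] * x[1] * x[2] for x in points}

# NNF coefficients lambda_I by Moebius inversion over the integers.
lam = {I: sum((-1) ** (weight(I) - weight(x)) * phi[x]
              for x in points if subset(x, I))
       for I in points}

# Fourier-Hadamard transform of phi.
hat = {u: sum(phi[x] * (-1) ** sum(ui * xi for ui, xi in zip(u, x))
              for x in points)
       for u in points}

# Relation (2.59): hat(u) = (-1)^{w_H(u)} * sum_{supp(u) in I} 2^{n-|I|} lambda_I.
rel_259 = all(
    hat[u] == (-1) ** weight(u)
    * sum(2 ** (n - weight(I)) * lam[I] for I in points if subset(u, I))
    for u in points)

# Relation (2.60), multiplied by 2^n to stay in the integers.
rel_260 = all(
    2 ** n * lam[I] == (-2) ** weight(I)
    * sum(hat[u] for u in points if subset(I, u))
    for I in points)

print(rel_259, rel_260)
```

Working with $2^n\lambda_I$ rather than $\lambda_I$ avoids floating-point division, so both checks are exact integer comparisons.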
Relation (2.60) has been proved in [292] in a slightly more complex way. Applied when $\varphi$ equals a Boolean function $f$ and using that $W_f(u) = 2^n\delta_0(u) - 2\widehat{f}(u)$, we get the following:
$$W_f(u) = (-1)^{w_H(u)+1} \sum_{I\subseteq\{1,\dots,n\};\ \mathrm{supp}(u)\subseteq I} 2^{n-|I|+1}\,\lambda_I \quad \text{if } u \neq 0_n, \tag{2.61}$$
$$W_f(0_n) = 2^n - \sum_{I\subseteq\{1,\dots,n\}} 2^{n-|I|+1}\,\lambda_I,$$
and
$$\lambda_I = 2^{-n}\,(-2)^{|I|-1} \sum_{u\in\mathbb{F}_2^n;\ I\subseteq \mathrm{supp}(u)} W_f(u) \quad \text{if } I \neq \emptyset, \tag{2.62}$$
$$\lambda_\emptyset = -2^{-(n+1)} \Bigg(\sum_{u\in\mathbb{F}_2^n} W_f(u) - 2^n\Bigg).$$

Remark. This provides a simpler proof of Theorem 2, page 63: according to Relations (2.61) and (2.62), the hypothesis of the theorem is equivalent to saying that, for every $I$ such that $|I| \geq n-k+1$, the coefficient $\lambda_I$ of $x^I$ in the NNF of $f$ is divisible by $2^{|I|+k-n-1}$, and this implies that for $|I| \geq n-k+2$, it is even. This gives also more information on those functions whose Walsh transform values are all divisible by $2^k$. For instance, if $|I| \geq n-k+3$, then since $\lambda_I$ is divisible by 4, using Relation (2.24), page 50, we have that $\sum_{\{I_1,I_2\}\,|\,I_1\cup I_2=I} a_{I_1}a_{I_2}$ is even, that is, $\bigoplus_{\{I_1,I_2\}\,|\,I_1\cup I_2=I} a_{I_1}a_{I_2} = 0$. This is exploited in [301] for bounding numbers of functions (see pages 243 and 311). Other similar (but more complex) properties of the coefficients $a_I$ can be obtained by considering the divisibility of $\lambda_I$ by powers of 2 larger than 4.

We deduce the following from Relations (2.59) through (2.62):

Proposition 14 Any pseudo-Boolean function $\varphi$ has a numerical degree at most $d$ if and only if $\widehat{\varphi}(u) = 0$ for every vector $u$ of Hamming weight strictly larger than $d$. Any Boolean function $f$ has a numerical degree at most $d$ if and only if $W_f(u) = 0$ for every such vector.

In other words, the numerical degree equals the maximal Hamming weight of those $u \in \mathbb{F}_2^n$ such that $W_f(u) \neq 0$.
This allows proving the fact mentioned at page 48 that, if a Boolean function f
has no ineffective variable, then the numerical degree of f is larger than or equal to
log2 n − O(log2 log2 n). In fact, we can prove a little more with the same method as in
the sketch of proof given in [905]:

Proposition 15 Let $f$ be any $n$-variable Boolean function. Denoting by $e_i$ the $i$th vector of the canonical basis of $\mathbb{F}_2^n$, the numerical degree of $f$ satisfies the following:
$$d_{num}(f) \geq 2^{-n} \sum_{i=1}^{n} w_H(D_{e_i} f). \tag{2.63}$$
If each variable $x_i$ is effective in $f(x)$, that is, if each derivative $D_{e_i} f$ is nonzero, then we have
$$d_{num}(f) \geq n\, 2^{-d_{alg}(f)+1}, \tag{2.64}$$
and a fortiori
$$n \leq d_{num}(f)\, 2^{d_{num}(f)-1}. \tag{2.65}$$
Consequently:
$$d_{num}(f) \geq 1 + \log_2 n - \log_2(1 + \log_2 n). \tag{2.66}$$

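Proof According to Relation (2.54), page 62, we have the following:
$$\sum_{i=1}^{n} \Delta_f(e_i) = 2^{-n} \sum_{u\in\mathbb{F}_2^n} W_f^2(u) \sum_{i=1}^{n} (-1)^{u\cdot e_i} = 2^{-n} \sum_{u\in\mathbb{F}_2^n} W_f^2(u)\,[n - 2w_H(u)],$$
and therefore, since $\Delta_f(e_i) = 2^n - 2w_H(D_{e_i} f)$:
$$\sum_{i=1}^{n} w_H(D_{e_i} f) = n2^{n-1} - 2^{-(n+1)} \sum_{u\in\mathbb{F}_2^n} W_f^2(u)\,[n - 2w_H(u)]. \tag{2.67}$$
Using Parseval's relation, we deduce that
$$\sum_{i=1}^{n} w_H(D_{e_i} f) = 2^{-n} \sum_{u\in\mathbb{F}_2^n} W_f^2(u)\, w_H(u) \leq 2^{-n}\, d_{num}(f) \sum_{u\in\mathbb{F}_2^n} W_f^2(u) = 2^n\, d_{num}(f).$$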

This proves Relation (2.63).
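The identity behind Relation (2.63), namely $\sum_i w_H(D_{e_i}f) = 2^{-n}\sum_u W_f^2(u)\,w_H(u)$, can be checked by brute force. The sketch below is only an illustration: the function `f` is a hypothetical example chosen for this purpose, and the numerical degree is read off the Walsh support as allowed by Proposition 14.

```python
from itertools import product

n = 4
points = list(product((0, 1), repeat=n))

def f(x):
    # Hypothetical example: f(x) = x1*x2*x3 XOR x4.
    return (x[0] & x[1] & x[2]) ^ x[3]

def walsh(u):
    """W_f(u) = sum over x of (-1)^(f(x) XOR u.x)."""
    return sum((-1) ** (f(x) ^ (sum(ui & xi for ui, xi in zip(u, x)) & 1))
               for x in points)

def derivative_weight(i):
    """Hamming weight of D_{e_i} f, i.e. the number of x with f(x) != f(x + e_i)."""
    count = 0
    for x in points:
        y = tuple(xj ^ 1 if j == i else xj for j, xj in enumerate(x))
        count += f(x) ^ f(y)
    return count

W = {u: walsh(u) for u in points}

# Left-hand side: 2^n times the total influence.
total = sum(derivative_weight(i) for i in range(n))
# Right-hand side of the identity, multiplied by 2^n to stay in the integers.
weighted = sum(W[u] ** 2 * sum(u) for u in points)
# Numerical degree as the maximal Hamming weight in the Walsh support (Proposition 14).
d_num = max(sum(u) for u in points if W[u] != 0)

print(total, weighted, d_num)
```

For this example the derivative in $x_4$ is the constant function 1, which dominates the total influence; the bound (2.63) then reads $2^{-n}\,\texttt{total} \le \texttt{d\_num}$.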


If each derivative $D_{e_i} f$ is nonzero, then $w_H(D_{e_i} f)$ is at least $2^{n-d_{alg}(D_{e_i} f)} \geq 2^{n-d_{alg}(f)+1} \geq 2^{n-d_{num}(f)+1}$, since the minimum nonzero Hamming weight of $n$-variable Boolean functions of algebraic degree at most $r$ equals $2^{n-r}$, as we shall see in Theorem 7, page 152. According to Relation (2.63), we have then Relation (2.64) and that $d_{num}(f)\, 2^{d_{num}(f)} \geq 2n$, and this proves Relation (2.65).
Relation (2.66) is directly deduced, since for $x \geq 1$, function $x\,2^x$ is increasing and we have $x\,2^x = y \Rightarrow x + \log_2 x = \log_2 y \Rightarrow x \leq \log_2 y$, and therefore $x\,2^x = y \Rightarrow x = \log_2 y - \log_2 x \geq \log_2 y - \log_2\log_2 y$.

The value of $2^{-n} w_H(D_{e_i} f)$ is called in [905, 914] the influence of variable $x_i$ and the sum of these values the total influence. See more in [652, 653].
Bound (2.66) is tight up to approximately the term in $\log_2\log_2 n$. Indeed, the so-called address function $f(x, y) = x_{\phi(y)}$, $x \in \mathbb{F}_2^{2^k}$, $y \in \mathbb{F}_2^k$, where $\phi(y) = 1 + \sum_{i=1}^{k} y_i\, 2^{i-1}$, has for NNF:
$$\sum_{u\in\mathbb{F}_2^k} x_{\phi(u)}\, \delta_u(y) = \sum_{u\in\mathbb{F}_2^k} x_{\phi(u)} \prod_{i=1}^{k} (1 - y_i - u_i + 2y_i u_i).$$
It has then $n = k + 2^k$ variables and numerical degree $1 + k$.
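The numerical degree of the address function can be confirmed for a small $k$ by computing its NNF directly. This is a minimal sketch under stated assumptions: the NNF coefficients are obtained by Möbius inversion over the integers (not the method used in the text), indices are 0-based in the code, and $k = 2$ is an arbitrary choice that keeps the brute-force loop small.

```python
from itertools import product

k = 2
m = 2 ** k          # number of data variables x_1..x_{2^k}
n = k + m           # total number of variables
points = list(product((0, 1), repeat=n))

def address(z):
    """z = (x_1..x_{2^k}, y_1..y_k); returns x_{phi(y)}, phi(y) = 1 + sum_i y_i 2^(i-1)."""
    x, y = z[:m], z[m:]
    index = sum(yi << i for i, yi in enumerate(y))   # equals phi(y) - 1 (0-based)
    return x[index]

def subset(a, b):
    """True if supp(a) is included in supp(b)."""
    return all(ai <= bi for ai, bi in zip(a, b))

# NNF coefficient of the monomial indexed by I (Moebius inversion over Z).
lam = {I: sum((-1) ** (sum(I) - sum(z)) * address(z)
              for z in points if subset(z, I))
       for I in points}

nonzero = [I for I in points if lam[I] != 0]
d_num = max(sum(I) for I in nonzero)
print(n, d_num, len(nonzero))
```

With $k = 2$ this gives $n = 6$ variables and numerical degree $1 + k = 3$, in line with the statement above; the number of nonzero NNF coefficients is $3^k$, since each factor $(1 - y_i - u_i + 2y_iu_i)$ equals either $1 - y_i$ or $y_i$.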

Remark. According to the calculations above, bound (2.63) is an equality if and only if the Walsh support of $f$ is included in the set of vectors of Hamming weight $d_{num}(f)$ (i.e., the Walsh transform of $f$ is homogeneous; an example is affine functions). Under this condition, Bound (2.65) is an equality if and only if, for every $i = 1, \dots, n$, we have that $d_{num}(D_{e_i} f) = d_{alg}(f) - 1$ and $D_{e_i} f$ is the indicator of an affine space (see Theorem 8). We do not know if such a function exists.

Remark. Functions of very low numerical degree $d \approx \log_2 n$ (such as the address function) have a Walsh support of a size at most $D = \sum_{i=0}^{d} \binom{n}{i} \approx \frac{n^{\log_2 n}}{(\log_2 n)!}$, according to Proposition 14. It is interesting to see that the Walsh support's size can be that small, while the function depends on all its variables and can be rather complex. In fact, this size is still smaller when $f$ is the address function above, since for every $a \in \mathbb{F}_2^{2^k}$ and $b \in \mathbb{F}_2^k$, we have then
$$W_f(a, b) = \sum_{y\in\mathbb{F}_2^k} (-1)^{b\cdot y} \sum_{x\in\mathbb{F}_2^{2^k}} (-1)^{x_{\phi(y)} \oplus a\cdot x}$$
and therefore $W_f(a, b) = 0$ if $a$ has Hamming weight different from 1, and the Walsh support of $f$ has size $2^{2k} < n^2$ (the size is exactly $2^{2k}$ since, if $a$ is the $j$th vector of the canonical basis of $\mathbb{F}_2^{2^k}$, then for every $b \in \mathbb{F}_2^k$ we have $W_f(a, b) = 2^{2^k} (-1)^{b\cdot y} \neq 0$, where $y$ is the unique element such that $\phi(y) = j$). The Walsh support is the union of $2^k$ cosets of the $k$-dimensional linear subspace $\{0_{2^k}\} \times \mathbb{F}_2^k$ of $\mathbb{F}_2^{2^k} \times \mathbb{F}_2^k$.
The address function is a particular case of a general class of functions called Maiorana–
McFarland, which we shall see at page 165, and which can provide more cases of small
Walsh supports (see after Proposition 53, page 166).
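The size and shape of the Walsh support of the address function can be checked by exhaustive computation for a small parameter. The sketch below assumes $k = 2$ (so $n = 6$) purely to keep the enumeration short; the helper names are illustrative.

```python
from itertools import product

k = 2
m = 2 ** k
n = k + m
points = list(product((0, 1), repeat=n))

def address(z):
    """z = (x_1..x_{2^k}, y_1..y_k); returns x_{phi(y)} (0-based index in the code)."""
    x, y = z[:m], z[m:]
    return x[sum(yi << i for i, yi in enumerate(y))]

def dot(u, z):
    """Usual inner product in F_2^n."""
    return sum(ui & zi for ui, zi in zip(u, z)) & 1

# Walsh transform at every (a, b), stored under the concatenated vector u = (a, b).
W = {u: sum((-1) ** (address(z) ^ dot(u, z)) for z in points) for u in points}
support = [u for u in points if W[u] != 0]

size = len(support)
a_weights = {sum(u[:m]) for u in support}          # Hamming weights of the part a
magnitudes = {abs(W[u]) for u in support}          # absolute Walsh values on the support
max_weight = max(sum(u) for u in support)          # numerical degree by Proposition 14
print(size, a_weights, magnitudes, max_weight)
```

For $k = 2$ the support has $2^{2k} = 16$ elements, all with $a$ of Hamming weight 1 and $|W_f(a,b)| = 2^{2^k} = 16$; the maximal weight $1 + k = 3$ agrees with the numerical degree.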

Determining, for all n, the exact minimum numerical degree of n-variable Boolean
functions depending on all their variables is open.
Of course, if a function does not depend on all its n variables, we still have the bound
dnum (f ) ≥ 1 + log2 m − log2 (1 + log2 m), where m is the number of effective variables in
f (x).

Remark. The NNF presents the interest of being a polynomial representation, but it can also be viewed as the transform that maps any pseudo-Boolean function $f(x) = \sum_{I\subseteq\{1,\dots,n\}} \lambda_I\, x^I$ to the pseudo-Boolean function $g$ defined by $g(x) = \lambda_{\mathrm{supp}(x)}$. Let us denote this mapping by $\Phi$. Three other transforms have also been used for studying Boolean functions:
– The mapping $\Phi^{-1}$ (the formulae relating this mapping and the Walsh transform are slightly simpler than for $\Phi$; see [985])
– A mapping defined by a formula similar to Relation (2.23), but in which supp(x) ⊆ I is
replaced by I ⊆ supp(x); see [579]
– The inverse of this latter mapping

Remark. An interesting question is, given a Boolean function f , what is the minimum
numerical degree of all the Boolean functions affine equivalent to f (that is, thanks to the
fact that the affine equivalence of functions implies the linear equivalence of their Walsh
supports (see page 63), what is the minimum for all the sets S that are linearly equivalent
to the Walsh support of f , of the maximum Hamming weight of the elements of S)? Note
that there exist functions such that this minimum is strictly larger than the algebraic degree
(this is the case of bent functions, for instance – see Definition 19, page 80 – since for
these functions the minimum is n and the algebraic degree equals n/2). Note also that if
we replace “minimum” with “maximum,” the number is n for every nonconstant Boolean
function, since for any element of Fn2 , there exists a linear permutation that maps this element
to the all-1 vector.

2.3.5 The size of the support of the Fourier–Hadamard transform and Cayley graphs

In graph theory, an undirected graph is an ordered pair (V , E), where V is a set of points
called vertices or nodes, and E is a set of pairs of vertices (that we shall assume distinct)
called edges (more generally, in the case of hypergraphs, edges are subsets of more than two
nodes). The degree of a vertex equals the number of edges it is in. Let f be a Boolean
function, and let Gf be the Cayley graph associated to f : the vertices of this graph
are the elements of Fn2 and there is an edge between two vertices u and v if and only if
the vector $u + v$ belongs to the support of $f$. Then (see [68]), the values $\widehat{f}(a)$, $a \in \mathbb{F}_2^n$, of the Fourier–Hadamard transform of $f$ are the eigenvalues of the graph $G_f$ (that is, by definition, the eigenvalues of the adjacency matrix $(M_{u,v})_{u,v\in\mathbb{F}_2^n}$ of $G_f$, whose term $M_{u,v}$ equals 1 if $u + v$ belongs to the support of $f$, and equals 0 otherwise). Their product equals then the determinant of the adjacency matrix. Indeed, the matrix is $2^n \times 2^n$, and we have the $2^n$ linearly independent eigenvectors $((-1)^{a\cdot v})_{v\in\mathbb{F}_2^n}$, each one corresponding to an eigenvalue, since for every $a \in \mathbb{F}_2^n$, we have
$$\sum_{v\in\mathbb{F}_2^n;\ u+v\in \mathrm{supp}(f)} (-1)^{a\cdot v} = \sum_{x\in \mathrm{supp}(f)} (-1)^{a\cdot(u+x)} = \widehat{f}(a)\,(-1)^{a\cdot u}, \quad \forall u \in \mathbb{F}_2^n.$$
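The eigenvector property above can be verified directly on a small Cayley graph. The following sketch is an illustration only: vectors of $\mathbb{F}_2^n$ are encoded as integer bitmasks (an implementation choice, not the book's notation), and the function `f` is a hypothetical example with $f(0) = 0$ so that the graph has no loops.

```python
n = 3
N = 1 << n   # number of vertices, 2^n

def f(x):
    # Hypothetical example: f(x) = x1*x2 XOR x3, with x encoded as a bitmask.
    x1, x2, x3 = x & 1, (x >> 1) & 1, (x >> 2) & 1
    return (x1 & x2) ^ x3

def dot(a, x):
    """Inner product of two bitmask-encoded vectors of F_2^n."""
    return bin(a & x).count("1") & 1

# Adjacency matrix of the Cayley graph: M[u][v] = 1 iff u + v is in supp(f).
M = [[f(u ^ v) for v in range(N)] for u in range(N)]

# Fourier-Hadamard transform of f (f viewed as integer-valued).
fhat = [sum(f(x) * (-1) ** dot(a, x) for x in range(N)) for a in range(N)]

def is_eigenpair(a):
    """Check that M * chi_a == fhat[a] * chi_a, where chi_a(v) = (-1)^(a.v)."""
    chi = [(-1) ** dot(a, v) for v in range(N)]
    image = [sum(M[u][v] * chi[v] for v in range(N)) for u in range(N)]
    return image == [fhat[a] * c for c in chi]

all_eigen = all(is_eigenpair(a) for a in range(N))
support_size = sum(1 for a in range(N) if fhat[a] != 0)
print(all_eigen, fhat, support_size)
```

Since the $2^n$ characters are linearly independent, the number of nonzero values of $\widehat{f}$ is the rank of the adjacency matrix, which is the quantity used in the argument that follows.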
As a consequence, the size $N_{\widehat{f}}$ of the support $\{a \in \mathbb{F}_2^n;\ \widehat{f}(a) \neq 0\}$ of the Fourier–Hadamard transform of any $n$-variable Boolean function $f$ is larger than or equal to the size $N_{\widehat{g}}$ of the support of the Fourier–Hadamard transform of any restriction $g$ of $f$, obtained by keeping constant some of its input bits. Indeed, the adjacency matrix $M_g$ of the Cayley graph $G_g$ is a submatrix of the adjacency matrix $M_f$ of the Cayley graph $G_f$; the number $N_{\widehat{g}}$ equals the rank of $M_g$, and is then smaller than or equal to the rank $N_{\widehat{f}}$ of $M_f$.
This property can be generalized to any pseudo-Boolean function $\varphi$, with a simpler proof using the Poisson summation formula (2.39): let $I$ be any subset of $\{1, \dots, n\}$; let $E$ be the vector subspace of $\mathbb{F}_2^n$ equal to $\{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \in I\}$; we have $E^\perp = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \in \{1, \dots, n\} \setminus I\}$ and the sum of $E$ and of $E^\perp$ is direct; then, for every $a \in E^\perp$ and every $b \in E$, the equality
$$\sum_{u\in a+E} (-1)^{b\cdot u}\, \widehat{\varphi}(u) = |E|\, (-1)^{a\cdot b}\, \widehat{\psi}(a),$$
where $\psi$ is the restriction of $\varphi$ to $b + E^\perp$, implies that, if $N_{\widehat{\varphi}} = k$, that is, if $\widehat{\varphi}(u)$ is nonzero for exactly $k$ vectors $u \in \mathbb{F}_2^n$, then clearly $\widehat{\psi}(a)$ is nonzero for at most $k$ vectors $a \in E^\perp$.

Coming back to the case where $\varphi$ is a Boolean function, say $\varphi = f$, where $f$ has algebraic degree $d$, choosing for $I$ a multiindex of size $d$ such that $x^I$ is part of the ANF of $f$, then the restriction $\psi = g$ has odd weight and its Fourier–Hadamard transform takes therefore nonzero values only. We deduce (as proved in [68]) that
$$N_{\widehat{f}} \geq 2^d.$$
Notice that $N_{\widehat{f}}$ equals $2^d$ if and only if at most one element (that is, exactly one) satisfying $\widehat{f}(u) \neq 0$ exists in each coset of $E$, that is, in each set obtained by keeping constant the coordinates $x_i$ such that $i \in I$.
The number $N_{\widehat{\varphi}}$ is also bounded above by $\sum_{i=0}^{D} \binom{n}{i}$, where $D$ is the numerical degree of $\varphi$. This is a direct consequence of Proposition 14.
The graph viewpoint gives insight on those Boolean functions whose Fourier–Hadamard
spectra have at most three values, as can be seen in [68]. Bent functions (see Chapter 6)
are those Boolean functions whose Cayley graphs are strongly regular of a particular type
[68, 69]: those graphs such that, for all distinct vertices u, v, the number of those vertices
that are adjacent to both u and v is the same.

A hypergraph (see page 70) can also be related to the ANF of a Boolean function f .
A related (weak) upper bound on the nonlinearity of Boolean functions (see definition in
Section 3.1) has been pointed out in [1179].

2.3.6 The Walsh transform of vectorial functions


Assuming that an inner product in $\mathbb{F}_2^n$ and an inner product in $\mathbb{F}_2^m$ have been chosen, both denoted by “·”, we call the Walsh transform of an $(n, m)$-function $F$, and we denote by $W_F$, the function that maps any ordered pair $(u, v) \in \mathbb{F}_2^n \times \mathbb{F}_2^m$ to the value at $u$ of the Walsh transform of the Boolean function $v \cdot F$:
$$W_F(u, v) = \sum_{x\in\mathbb{F}_2^n} (-1)^{v\cdot F(x) \oplus u\cdot x}; \quad u \in \mathbb{F}_2^n,\ v \in \mathbb{F}_2^m.$$
We call the Walsh spectrum of $F$ the multiset of all the values $W_F(u, v)$, where $u \in \mathbb{F}_2^n$, $v \in \mathbb{F}_2^m$. We call the extended Walsh spectrum of $F$ the multiset of their absolute values, and Walsh support of $F$ the set of those $(u, v)$ such that $W_F(u, v) \neq 0$.

Remark. If we denote by $G_F$ the graph $\{(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^m;\ y = F(x)\}$ of $F$, and by $1_{G_F}$ its indicator (taking value 1 on $G_F$ and 0 outside), then we have $W_F(u, v) = \widehat{1_{G_F}}(u, v)$. The Walsh transform of any vectorial function is the Fourier–Hadamard transform of the indicator of its graph.
The autocorrelation function $(a, v) \mapsto \sum_{x\in\mathbb{F}_2^n} (-1)^{v\cdot(F(x)+F(x+a))}$ is directly connected to the Fourier transform of the function implementing the difference table $D_F(a, b) = |\{x;\ F(x) + F(x + a) = b\}|$, since we have
$$\sum_{x\in\mathbb{F}_2^n} (-1)^{v\cdot(F(x)+F(x+a))} = \sum_{b\in\mathbb{F}_2^m} D_F(a, b)\,(-1)^{v\cdot b},$$
and $D_F(a, b)$ is then recovered from the autocorrelation function by the inverse Fourier transform formula for Boolean functions.

The inverse Walsh transform formula (2.43) for vectorial functions writes the following:
$$\sum_{u\in\mathbb{F}_2^n} W_F(u, v)\,(-1)^{u\cdot x} = 2^n\,(-1)^{v\cdot F(x)}. \tag{2.68}$$

There is a simple way of expressing the value of the Walsh transform of the composition of
two vectorial functions by means of those of the functions:

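Proposition 16 If we write the values of the function $W_F$ in a $2^m \times 2^n$ matrix $(M_{v,u})_{v\in\mathbb{F}_2^m,\, u\in\mathbb{F}_2^n}$ where $M_{v,u} = W_F(u, v)$, then the matrix similarly corresponding to the composite function $F \circ H$, where $H$ is an $(r, n)$-function, equals $2^{-n}\, M \times N$, where $N$ is defined similarly with respect to $H$.

Proof For every $w \in \mathbb{F}_2^r$ and every $v \in \mathbb{F}_2^m$, we have
$$\sum_{u\in\mathbb{F}_2^n} W_F(u, v)\, W_H(w, u) = \sum_{u\in\mathbb{F}_2^n;\ x\in\mathbb{F}_2^r;\ y\in\mathbb{F}_2^n} (-1)^{v\cdot F(y) \oplus u\cdot(y+H(x)) \oplus w\cdot x}$$
$$= 2^n \sum_{x\in\mathbb{F}_2^r;\ y\in\mathbb{F}_2^n;\ y=H(x)} (-1)^{v\cdot F(y) \oplus w\cdot x} = 2^n\, W_{F\circ H}(w, v),$$
since $\sum_{u\in\mathbb{F}_2^n} (-1)^{u\cdot(y+H(x))}$ equals $2^n$ if $y = H(x)$, and is null otherwise.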
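Proposition 16 lends itself to a direct numerical check. In the sketch below, vectors are encoded as integer bitmasks and the two lookup tables `F` and `H` are arbitrary hypothetical examples; only the matrix identity itself comes from the proposition.

```python
def dot(a, b):
    """Inner product of two bitmask-encoded binary vectors."""
    return bin(a & b).count("1") & 1

def walsh_matrix(table, n_in, n_out):
    """Matrix with rows indexed by the output mask v and columns by the input mask u,
    whose (v, u) entry is the Walsh transform value at (u, v)."""
    return [[sum((-1) ** (dot(v, table[x]) ^ dot(u, x)) for x in range(1 << n_in))
             for u in range(1 << n_in)]
            for v in range(1 << n_out)]

r, n, m = 2, 3, 2
H = [5, 0, 3, 6]                    # hypothetical (r, n)-function, values in F_2^3
F = [0, 1, 3, 2, 1, 1, 0, 3]        # hypothetical (n, m)-function, values in F_2^2
FH = [F[H[x]] for x in range(1 << r)]   # composite function F o H

M = walsh_matrix(F, n, m)           # 2^m x 2^n matrix for F
N = walsh_matrix(H, r, n)           # 2^n x 2^r matrix for H

# Plain integer matrix product M x N.
product_MN = [[sum(M[v][u] * N[u][w] for u in range(1 << n))
               for w in range(1 << r)]
              for v in range(1 << m)]

# 2^n times the matrix associated with F o H.
expected = [[(1 << n) * entry for entry in row] for row in walsh_matrix(FH, r, m)]

print(product_MN == expected)
```

Comparing $M \times N$ with $2^n$ times the matrix of $F \circ H$, rather than dividing by $2^n$, keeps the check in exact integer arithmetic.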

Remark. Because of Proposition 16, it could seem more convenient to exchange the
positions of u and v in WF (u, v). But we shall not do so because the common use is to
respect the order (input, output).

Remark. We have $W_F(u, v) = \sum_{b\in\mathbb{F}_2^m} \widehat{\varphi_b}(u)\,(-1)^{v\cdot b}$, where $\widehat{\varphi_b}$ is the Fourier–Hadamard transform of the indicator function $\varphi_b$ of the preimage $F^{-1}(b) = \{x \in \mathbb{F}_2^n;\ F(x) = b\}$.

In [201], it is shown that the possibility of building a function CCZ equivalent to a given $(n, m)$-function $F$ depends on the structure of the set of zeros of its Walsh transform. Given an affine permutation $A = L + (a, b)$ of $\mathbb{F}_2^n \times \mathbb{F}_2^m$ (where $L = (L_1, L_2)$ is a linear permutation and $(a, b)$ a point in $\mathbb{F}_2^n \times \mathbb{F}_2^m$), the image by $A$ of the graph $G_F$ of $F$ is the graph of a function if and only if the image of $\mathbb{F}_2^n \times \{0_m\}$ by the adjoint operator $L^*$ of $L$ is included in the set $W_F^{-1}(0) \cup \{(0_n, 0_m)\}$. This is immediate: a necessary and sufficient condition is that $L_1(x, F(x))$ be a permutation and according to Proposition 35, page 112, this is equivalent to
$$\forall u \neq 0_n, \quad \sum_{x\in\mathbb{F}_2^n} (-1)^{(u,0_m)\cdot L(x,F(x))} = \sum_{x\in\mathbb{F}_2^n} (-1)^{L^*(u,0_m)\cdot(x,F(x))} = 0.$$
A transformation called twisting allows then to move to another EA equivalence class within the same CCZ equivalence class: the output of $F$ is viewed in the form $(T_y(x), U_x(y)) \in \mathbb{F}_2^t \times \mathbb{F}_2^{m-t}$, where $t \leq \min(n, m)$, $x \in \mathbb{F}_2^t$, $y \in \mathbb{F}_2^{n-t}$ and where $T_y$ is assumed to be a permutation for every $y$. Then the $t$-twisting of $F$ is the function $(T_y^{-1}(x), U_{T_y^{-1}(x)}(y))$, whose graph is obtained from that of $F$ by swapping, in each vector of the graph, the subvector of indices $1, \dots, t$ and the subvector of indices $n + 1, \dots, n + t$. It is shown in [201] that every CCZ equivalent function to $F$ can be obtained from $F$ in three steps: applying EA equivalence, then twisting, then applying EA equivalence again. The number of EA equivalence classes in the CCZ equivalence class of $F$ is bounded above by the number of $n$-dimensional vector spaces in $W_F^{-1}(0) \cup \{(0_n, 0_m)\}$ and below by this same number divided by the order of the automorphism group of function $F$ (i.e., the group of those affine automorphisms that preserve the graph of $F$).

The case of power functions

When $\mathbb{F}_2^n$ is identified with $\mathbb{F}_{2^n}$, we have seen at page 24 that power functions are those functions of the form $F(x) = x^d$. They usually have a lower implementation cost in hardware. Such $F$ is a permutation of $\mathbb{F}_{2^n}$ if and only if $d$ is coprime with $2^n - 1$. An inner product being for instance $(x, y) \mapsto tr_n(xy)$, for every $(u, v) \in (\mathbb{F}_{2^n}^*)^2$ and every $d$, we have (by the change of variable $x \mapsto \frac{x}{u}$) that $W_F(u, v) = W_F\left(1, \frac{v}{u^d}\right)$, and if $x^d$ is a permutation of $\mathbb{F}_{2^n}$, then we have by the change of variable $x \mapsto \frac{x}{v^{1/d}}$ that $W_F(u, v) = W_F\left(\frac{u}{v^{1/d}}, 1\right)$.
It has been conjectured in 1976 by Helleseth$^{34}$ in [592] that, for every $n \geq 2$ and every value of $d$ coprime with $2^n - 1$, there exists $a \in \mathbb{F}_{2^n}^*$ such that $W_F(a, 1) = 0$. This conjecture is still open. It has been checked for $n \leq 25$ by Langevin and proved for $d = 2^n - 2$ (inverse function) in [733] (see more at page 215). We have the following propositions:

34 The conjecture is stated for every characteristic p such that d ≡ 1 [mod p − 1].

Proposition 17 [39] For every power permutation $F(x) = x^d$, there exists $a \in \mathbb{F}_{2^n}^*$ such that $W_F(a, 1) \equiv 0 \pmod 3$.

Proposition 18 [673] If $\gcd(d, 2^n - 1) = 1$ and if the set $\{W_f(a);\ a \in \mathbb{F}_{2^n}^*\}$ has three distinct values exactly, then one of these values is 0.

Proposition 19 [673] If $\gcd(d, 2^n - 1) = 1$ and $n$ is a power of 2, then the set $\{W_f(a);\ a \in \mathbb{F}_{2^n}^*\}$ cannot have three distinct values exactly.

See more in [675]. Some results related to Gauss sums are also given in [38] and we have
the following proposition:

Proposition 20 [174] For every $n$ equal to a power of 2 and every nonlinear power permutation $F(x) = x^d$ over $\mathbb{F}_{2^n}$, there exists $a \in \mathbb{F}_{2^n}^*$ such that $W_F(a, 1)$ is not divisible by $2^{\frac{n}{2}+1}$.

It is observed in [38] that if $n$ is even and $F(x) = x^d$ is constant over $\mathbb{F}_{2^{n/2}}^*$ and not over $\mathbb{F}_{2^n}^*$, then there exists $a \in \mathbb{F}_{2^n}^*$ such that $W_F(a, 1) = -2^{\frac{n}{2}}$. And using McEliece's theorem, it is proved that:

Proposition 21 [194, 196] Let $l$ and $n$ be two positive integers. The Walsh values of a power function $F(x) = x^d$ over $\mathbb{F}_{2^n}$ are all divisible by $2^l$ if and only if, for all $u \in \mathbb{Z}/(2^n - 1)\mathbb{Z}$, $w_2(ud) \leq w_2(u) + n - l$.

See also [746]. These results are complementary to Theorem 2, page 63 (recall that the algebraic degree of $x^d$ equals $w_2(d)$). The latter will have consequences for the characterization of almost bent functions; see page 382.

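Relation between the Walsh transform and NNF of the graph indicator
We know that $W_F(u, v) = \widehat{1_{G_F}}(u, v)$. Relation (2.59), page 66, and Relation (2.60) show then that if the NNF of $1_{G_F}(x, y)$ equals $\sum_{I\subseteq\{1,\dots,n\},\ J\subseteq\{1,\dots,m\}} \lambda_{I,J}\, x^I y^J$, then
$$W_F(u, v) = (-1)^{w_H(u)+w_H(v)} \sum_{\substack{I\subseteq\{1,\dots,n\},\ J\subseteq\{1,\dots,m\} \\ \mathrm{supp}(u)\subseteq I,\ \mathrm{supp}(v)\subseteq J}} 2^{n+m-|I|-|J|}\,\lambda_{I,J},$$
$$\lambda_{I,J} = 2^{-(n+m)}\,(-2)^{|I|+|J|} \sum_{\substack{u\in\mathbb{F}_2^n,\ v\in\mathbb{F}_2^m \\ I\subseteq \mathrm{supp}(u),\ J\subseteq \mathrm{supp}(v)}} W_F(u, v).$$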

2.3.7 The multidimensional Walsh transform

K. Nyberg defines in [911] a polynomial representation, called the multidimensional Walsh transform; let us define
$$W(F)(z_1, \dots, z_m) = \sum_{x\in\mathbb{F}_2^n} \prod_{j=1}^{m} z_j^{f_j(x)} \in \mathbb{Z}[z_1, \dots, z_m]/(z_1^2 - 1, \dots, z_m^2 - 1),$$
where $f_1, \dots, f_m$ are the coordinate functions of $F$.

The multidimensional Walsh transform maps every linear $(n, m)$-function $L$ to the polynomial $W(F + L)(z_1, \dots, z_m)$. This is a representation with uniqueness of $F$, since, for every $L$, the knowledge of $W(F + L)$ is equivalent to that of the evaluation of $W(F + L)$ at $(\xi_1, \dots, \xi_m)$ for every choice of $\xi_j$, $j = 1, \dots, m$, in the set $\{-1, 1\}$ of roots of the polynomial $z_j^2 - 1$. For such a choice, let us define the vector $v \in \mathbb{F}_2^m$ by $v_j = 1$ if $\xi_j = -1$ and $v_j = 0$ otherwise. For every $j = 1, \dots, m$, let us denote by $a_j$ the vector of $\mathbb{F}_2^n$ such that the $j$th coordinate of $L(x)$ equals $a_j \cdot x$. We denote then by $u$ the vector $\sum_{j=1}^{m} v_j a_j \in \mathbb{F}_2^n$. Then this evaluation equals $\sum_{x\in\mathbb{F}_2^n} (-1)^{v\cdot F(x) \oplus u\cdot x}$. We see that the correspondence between the multidimensional Walsh transform and the Walsh transform is the correspondence between a multivariate polynomial of $\mathbb{Z}[z_1, \dots, z_m]/(z_1^2 - 1, \dots, z_m^2 - 1)$ and its evaluation over $\{(z_1, \dots, z_m) \in \mathbb{Z}^m \,/\, z_1^2 - 1 = \cdots = z_m^2 - 1 = 0\} = \{-1, 1\}^m$.

Consequently, the multidimensional Walsh transform satisfies a relation equivalent to Parseval's relation (see [911]).

2.4 Fast computation of S-boxes

We shall see in Chapter 11 that substitution boxes are almost always expressed in univariate polynomial form $\sum_{j=0}^{2^n-1} b_j x^j$ (where $x, b_j \in \mathbb{F}_{2^n}$), because the structure of field is needed to generate them, although the multiplication plays no role in the criteria they must satisfy (only the addition playing a role). In such polynomial expression, the additions and scalar multiplications, being linear mappings, are fast to compute. Those multiplications whose two operands include variables (that we shall exceptionally represent explicitly by $\times$) are more complex to process quickly. Methods exist for multiplication processing (see more in [320]):
• The most efficient in terms of timing is complete tabulation, by reading the content of a table in ROM containing all the precomputed results. The size of the table is $n2^{2n}$ bits, and the timing is around five cycles.
• The most efficient in terms of memory is direct processing. The timing complexity is of order $O(n^{\log_2 3})$ with large constants, thanks to Karatsuba's method repeated recursively until getting low-cost multiplications:
$m = \lceil n/2 \rceil$; $(a_h X^m + a_l) \times (b_h X^m + b_l) = c_h X^{2m} + c_{hl} X^m + c_l$,
where $a_h, a_l, b_h, b_l, c_h, c_{hl}, c_l$ are polynomials of degree $\leq m$,
$c_h = a_h \times b_h$, $c_l = a_l \times b_l$,
$c_{hl} = (a_h + a_l) \times (b_h + b_l) - c_h - c_l$.

• A compromise is the so-called log-alog method, which assumes that the functions
$\log : x \in \mathbb{F}_{2^n} \mapsto i = \log_\alpha(x)$ and $\mathrm{alog} : i \mapsto x = \alpha^i$
have been tabulated in ROM for some primitive element $\alpha$ of the field. The processing of $a \times b$ then simply consists in processing:
$c = \mathrm{alog}[(\log[a] + \log[b]) \bmod (2^n - 1)]$.
Its memory complexity is $n2^{n+1}$ bits, and its timing complexity is constant.
• Another compromise is obtained with the tower field approach. For $n = 2m$ even, the elements of $\mathbb{F}_{2^n}$ are viewed as elements of $\mathbb{F}_{2^m}[X]/(X^2 + X + \beta)$, where $X^2 + X + \beta$ is a degree-2 polynomial irreducible over $\mathbb{F}_{2^m}$. The field isomorphism mapping an element $a \in \mathbb{F}_{2^n}$ into the pair $(a_h, a_l) \in \mathbb{F}_{2^m}^2$ is denoted by $L$. The multiplication $a \times b$ is then executed as follows:
$(a_h, a_l) \leftarrow L(a)$; $(b_h, b_l) \leftarrow L(b)$; $c_l \leftarrow a_h \times b_h \times \beta + a_l \times b_l$;
$c_h \leftarrow a_h \times (b_h + b_l) + a_l \times b_h$; $c \leftarrow L^{-1}(c_h, c_l)$.
This is recursively applied if $n$ is a power of 2.
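Two of the multiplication methods listed above, Karatsuba's direct processing and the log-alog tables, can be compared against a schoolbook product. The sketch below makes several assumptions that are not in the text: the field is $\mathbb{F}_{2^8}$ defined by the polynomial $X^8 + X^4 + X^3 + X + 1$ (the one used in AES), the primitive element is $\alpha = X + 1$ (0x03), field elements are stored as integer bitmasks, and a zero operand is handled separately in the log-alog routine because $\log$ is undefined at 0.

```python
MOD = 0x11B      # assumed modulus: X^8 + X^4 + X^3 + X + 1
N_BITS = 8

def clmul(a, b):
    """Schoolbook carry-less product of two binary polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def karatsuba(a, b, n):
    """Carry-less product of two binary polynomials of degree < n."""
    if n <= 2:
        return clmul(a, b)
    m = (n + 1) // 2                     # m = ceil(n / 2)
    mask = (1 << m) - 1
    ah, al = a >> m, a & mask
    bh, bl = b >> m, b & mask
    ch = karatsuba(ah, bh, n - m)
    cl = karatsuba(al, bl, m)
    # In characteristic 2, subtraction is XOR.
    chl = karatsuba(ah ^ al, bh ^ bl, m) ^ ch ^ cl
    return (ch << (2 * m)) ^ (chl << m) ^ cl

def reduce(c):
    """Reduce a polynomial of degree <= 2*N_BITS - 2 modulo MOD."""
    for i in range(2 * N_BITS - 2, N_BITS - 1, -1):
        if (c >> i) & 1:
            c ^= MOD << (i - N_BITS)
    return c

# log / alog tables for the primitive element alpha = 0x03.
ALOG = [0] * 255
LOG = [0] * 256
x = 1
for i in range(255):
    ALOG[i] = x
    LOG[x] = i
    x = reduce(clmul(x, 0x03))

def mul_log(a, b):
    """Field product through the log-alog tables (zero handled separately)."""
    if a == 0 or b == 0:
        return 0
    return ALOG[(LOG[a] + LOG[b]) % 255]

def mul_karatsuba(a, b):
    """Field product through Karatsuba followed by modular reduction."""
    return reduce(karatsuba(a, b, N_BITS))

print(hex(mul_log(0x57, 0x83)), hex(mul_karatsuba(0x57, 0x83)))
```

For a field this small Karatsuba brings no real benefit; the recursion is shown only to mirror the formulas for $c_h$, $c_l$ and $c_{hl}$ given above.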

Methods also exist for evaluating whole polynomials (e.g., the cyclotomic method and the
Knuth–Eve method) that we shall present in Section 12.1.2 because they play a role with
respect to countermeasures against side-channel attacks.
3 Boolean functions, vectorial functions, and cryptography

The design of conventional cryptographic systems relies on two fundamental principles


introduced by Shannon [1034]: confusion and diffusion. Confusion aims at concealing any
algebraic structure in the system. It is closely related to the complexity1 of the involved (so-
called nonlinear) functions. Diffusion consists in spreading out the influence of any minor
modification of the input data or of the key over all outputs. These two principles were
stated more than half a century ago. Since then, many attacks have been found against the
diverse known cryptosystems, and the relevance of these two principles has always been
confirmed. In this chapter, we describe the main attacks on symmetric cryptosystems and
the related criteria on Boolean and vectorial functions. Two books exist on Boolean and
vectorial functions for cryptography [401, 1125], which partly cover the state of the art.
Several sections of the Handbook of Finite Fields [890] are also devoted to this same subject
(in a reduced format). In the subsequent chapters, we shall develop as completely as possible
the study of each criterion.

3.1 Cryptographic criteria (and related parameters) for Boolean functions


The known attacks on stream ciphers lead to criteria [842, 844, 970, 1041] that the imple-
mented cryptographic functions must satisfy to resist attacks [388, 391, 826, 843, 1042].
More precisely, the resistance of the cryptosystems to the known attacks can be quantified
through some fundamental characteristics (some, more related to confusion, and some,
more related to diffusion) of the Boolean functions used in them; and the design of these
cryptographic functions needs to consider various characteristics simultaneously. Some of
these characteristics are affine invariants. Of course, all characteristics cannot be optimum
at the same time, and trade-offs must be considered (see below).

3.1.1 Balancedness
Cryptographic Boolean functions must be balanced (their output must be uniformly, i.e.,
equally, distributed over {0, 1}) for avoiding statistical dependence between the plaintext
and the ciphertext. Indeed, we wish that it is not possible to distinguish the pair of a random
plaintext and of the corresponding ciphertext from a random pair. Notice that f is balanced
if and only if Wf (0n ) = F (f ) = 0.

1 That is, the cryptographic complexity, which is different from circuit complexity, for instance.


3.1.2 Algebraic degree


Cryptographic functions must have high algebraic degrees (see Definition 6, page 35). Indeed, all cryptosystems using Boolean functions for confusion (combining or filtering functions in stream ciphers, functions involved in the S-boxes of block ciphers, etc.) can be attacked if the functions have low algebraic degrees. For instance, in the case of combining functions (see Figure 1.3, page 22), if $n$ LFSRs having lengths $L_1, \dots, L_n$ are combined by the function $f(x) = \bigoplus_{I\subseteq\{1,\dots,n\}} a_I \prod_{i\in I} x_i$, then the sequence produced by $f$ has linear complexity
$$\mathcal{L} \leq \sum_{I\subseteq\{1,\dots,n\}} a_I \prod_{i\in I} L_i$$
(and $\mathcal{L}$ equals this number under the sufficient condition that the sequences output by the LFSRs are m-sequences of pairwise coprime periods), see [1007, 1183]. In the case of the filter model (see Figure 1.4, page 23), we have a less precise result [1006]: if $L$ is the length of the LFSR and if the feedback polynomial is primitive, then the linear complexity of the sequence satisfies:
$$\mathcal{L} \leq \sum_{i=0}^{d_{alg}(f)} \binom{L}{i}.$$
Moreover, if $L$ is a prime, then $\mathcal{L} \geq \binom{L}{d_{alg}(f)}$, and the fraction of functions $f$ of a given algebraic degree that output a sequence of linear complexity equal to $\sum_{i=0}^{d_{alg}(f)} \binom{L}{i}$ is at least $e^{-1/L}$. In both models, the algebraic degree of $f$ has to be high so that $\mathcal{L}$ can have high value (the number of those nonzero coefficients $a_I$, in the ANF of $f$, such that $I$ has large size, can also play a role, but clearly a less important one).
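The upper bound for the filter model can be observed experimentally with the Berlekamp–Massey algorithm. The sketch below is illustrative only: the LFSR of length 5 with recurrence $s_{t+5} = s_{t+2} \oplus s_t$ (characteristic polynomial $X^5 + X^2 + 1$, assumed primitive), the initial state, and the quadratic filter function are all hypothetical choices, and the Berlekamp–Massey routine is the usual textbook version over $\mathbb{F}_2$.

```python
from math import comb

def berlekamp_massey(s):
    """Linear complexity of the binary sequence s (list of 0/1)."""
    n = len(s)
    c = [0] * n
    b = [0] * n
    c[0] = b[0] = 1
    L, m = 0, -1
    for N in range(n):
        # Discrepancy between s[N] and the prediction of the current LFSR.
        d = s[N]
        for i in range(1, L + 1):
            d ^= c[i] & s[N - i]
        if d:
            t = c[:]
            for i in range(n - N + m):
                c[N - m + i] ^= b[i]
            if 2 * L <= N:
                L = N + 1 - L
                m = N
                b = t
    return L

def lfsr_states(init, steps):
    """Successive states (s_t, ..., s_{t+4}) for the recurrence s_{t+5} = s_{t+2} XOR s_t."""
    state = list(init)
    for _ in range(steps):
        yield tuple(state)
        state = state[1:] + [state[2] ^ state[0]]

def filter_function(st):
    # Hypothetical filter of algebraic degree 2 applied to the LFSR state.
    return (st[0] & st[1]) ^ st[3]

LENGTH = 5
DEGREE = 2
states = list(lfsr_states((1, 0, 0, 0, 0), 62))   # two periods of 31 bits
lfsr_sequence = [st[0] for st in states]
filtered_sequence = [filter_function(st) for st in states]

lc_lfsr = berlekamp_massey(lfsr_sequence)
lc_filtered = berlekamp_massey(filtered_sequence)
bound = sum(comb(LENGTH, i) for i in range(DEGREE + 1))
print(lc_lfsr, lc_filtered, bound)
```

Two full periods of the sequence are generated so that Berlekamp–Massey sees at least twice as many bits as the largest possible linear complexity.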
When $n$ tends to infinity, random Boolean functions have almost surely algebraic degrees at least $n - 1$ (the number $2^{\sum_{i=0}^{n-2} \binom{n}{i}} = 2^{2^n-n-1}$ of Boolean functions of algebraic degree at most $n - 2$ is negligible with respect to the number $2^{2^n}$ of all Boolean functions). But we shall see that the functions of algebraic degree $n - 1$ or $n$ do not allow achieving some other characteristics such as resiliency.
We have seen in Section 2.2 that the algebraic degree is an affine invariant.

3.1.3 Nonlinearity and higher-order nonlinearity


In order to provide confusion, cryptographic functions must lie at large Hamming distance
from all affine functions. Let us explain why.

Correlations with linear functions and attacks


We shall say that there is a nonzero correlation between a Boolean function $f$ and a linear function $\ell$ if $d_H(f, \ell)$ is different from $2^{n-1}$ (precisely, the correlation between $f$ and $\ell_a(x) = a \cdot x$, where $a \in \mathbb{F}_2^n$, equals $\sum_{x\in\mathbb{F}_2^n} (-1)^{f(x)\oplus \ell_a(x)}$, that is $W_f(a)$). Because of Parseval's relation (2.48), page 61, and of Relation (2.37), page 57, any Boolean function has nonzero correlation with at least one linear function. But all correlations should be small (in magnitude). Indeed, a large positive correlation between a Boolean function $f$ involved in a cryptosystem and a linear function $\ell$ means that $d_H(f, \ell)$ is small, and $f$ is then efficiently approximated by $\ell$; a large negative one means that it is approximated by $\ell \oplus 1$.
The existence of such affine approximations of $f$ allows in various situations (block ciphers, stream ciphers) the building of attacks on this system.
In the case of stream ciphers, these attacks are the so-called fast correlation attacks [203, 369, 517, 645, 646, 647, 843]: let $\ell$ be a linear approximation of $f$ (or of $f \oplus 1$, but then we shall study $f \oplus 1$), whose distance to $f$ is smaller than $2^{n-1}$; denoting by $\mathrm{Prob}[E]$ the probability of an event $E$, we have:
$$p = \mathrm{Prob}[f(x_1, \dots, x_n) \neq \ell(x_1, \dots, x_n)] = \frac{d_H(f, \ell)}{2^n} = \frac{1}{2} - \epsilon,$$
where $\epsilon > 0$. The pseudorandom sequence $s$ corresponds then to the transmission with errors of the sequence $\sigma$ that would be produced by the same model with the same LFSRs, but with $\ell$ instead of $f$. Attacking the cipher can be done by correcting the errors as in the transmission of the sequence $\sigma$ over a noisy channel. Assume that we have $N$ bits $s_u, \dots, s_{u+N-1}$ of the pseudorandom sequence $s$, then $\mathrm{Prob}[s_i \neq \sigma_i] \approx p$. The set of possible sequences $\sigma_u, \dots, \sigma_{u+N-1}$ is a vector space, that is, a linear code of length $N$ and dimension at most $L$, where $L$ is the size of the linear part of the PRG (the length of the LFSR in the case of the filter generator). We then use a decoding algorithm to recover $\sigma_u, \dots, \sigma_{u+N-1}$ from $s_u, \dots, s_{u+N-1}$, and since $\ell$ is linear, the linear complexity of the sequence $\sigma$ is small and we obtain, for instance by the Berlekamp–Massey (BM) algorithm, the initialization of the LFSR. We can then compute the whole sequence $s$.
There are several ways of performing the decoding. The method presented in [843]
and improved by [369] is as follows. We call a parity check polynomial any polynomial
$a(x) = 1 + \sum_{j=1}^{r} a_j x^j$ ($a_r \neq 0$) that is a multiple of the feedback polynomial of an
LFSR generating the sequence $\sigma_i$. Denoting by $\sigma(x)$ the generating function $\sum_{i \geq 0} \sigma_i x^i$, the
product $a(x)\,\sigma(x)$ is a polynomial of degree less than r. We use for the decoding a set
of parity check polynomials satisfying three conditions: their degrees are bounded by some
integer m, the number of nonzero coefficients $a_j$ in each of them is at most some number
$t \geq 3$ (i.e., each polynomial has Hamming weight at most t + 1), and for every j = 1, ..., m,
at most one polynomial has nonzero coefficient $a_j$. Each parity check polynomial
$a(x) = 1 + \sum_{j=1}^{r} a_j x^j$ gives a linear relation
$\sigma_i = \sum_{j=1}^{r} a_j \sigma_{i-j} = \sum_{j=1,\dots,r;\ a_j \neq 0} \sigma_{i-j}$ for every
$i \geq m$, and the relations corresponding to different polynomials involve different indices
i − j. If we replace the (unknown) $\sigma_i$'s by the $s_i$'s, then some of these relations become false,
but it is possible by using the method of Gallager [524] to compute a sequence $z_i$ such that
$\mathrm{Prob}\,(z_i = \sigma_i) > 1 - p$. Then it can be proved that iterating this process converges to the
sequence σ (with a speed that depends on m, t, and p). The number of bits N needed to be
known in the keystream, the offline time complexity P, and the online time complexity T
are as follows (see [203]):
$$N = 2^{\frac{L}{t-1}} \left(\frac{1}{2\epsilon}\right)^{\frac{2(t-2)}{t-1}}, \qquad P = \frac{N^{t-2}}{(t-2)!}, \qquad T = 2^{\frac{L}{t-1}} \left(\frac{1}{2\epsilon}\right)^{\frac{2t(t-2)}{t-1}},$$
where L is the length of the LFSR and $\epsilon$ is the bias of the nonlinearity with respect to $2^{n-1}$,
that is, $\epsilon = \frac{2^{n-1} - nl(f)}{2^n}$, where nl(f) is defined below. Note that the number of variables of
the function does not play an explicit role.
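To get a feel for these orders of magnitude, the quantities N, P, and T quoted from [203] can be evaluated numerically. The following is only a sketch: the function name and the choice to return base-2 logarithms (to avoid huge integers) are assumptions made here for illustration, not part of the original text.

```python
import math


def fast_correlation_costs(lfsr_length, t, eps):
    """Return (log2 N, log2 P, log2 T) for the formulas quoted from [203].

    lfsr_length is L, t is the bound on the number of nonzero coefficients a_j,
    and eps is the bias (2^(n-1) - nl(f)) / 2^n. Hypothetical helper.
    """
    if t < 3 or not 0 < eps <= 0.5:
        raise ValueError("expected t >= 3 and 0 < eps <= 1/2")
    log_inv = math.log2(1.0 / (2.0 * eps))
    # N = 2^(L/(t-1)) * (1/(2 eps))^(2(t-2)/(t-1))
    log_n = lfsr_length / (t - 1) + (2 * (t - 2) / (t - 1)) * log_inv
    # P = N^(t-2) / (t-2)!
    log_p = (t - 2) * log_n - math.log2(math.factorial(t - 2))
    # T = 2^(L/(t-1)) * (1/(2 eps))^(2t(t-2)/(t-1))
    log_t = lfsr_length / (t - 1) + (2 * t * (t - 2) / (t - 1)) * log_inv
    return log_n, log_p, log_t


# Illustrative parameters only: L = 40, t = 5, eps = 1/4.
log_n, log_p, log_t = fast_correlation_costs(40, 5, 0.25)
```

A smaller bias (that is, a larger nonlinearity) increases all three costs, which is the quantitative reason for the criterion introduced next.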
In the case of block ciphers, we shall see in Section 3.4 that the Boolean functions
involved in their S-boxes must also lie at large Hamming distances to affine functions, to
allow resistance to the linear attacks [829].

The corresponding parameter and criterion for Boolean functions


Definition 18 The nonlinearity of a Boolean function f is the minimum Hamming distance
between f and affine functions. We shall denote it by nl(f ).

The larger is the nonlinearity, the larger is p in the fast correlation attack and the less
efficient is the attack. Hence, from the designer point of view, the nonlinearity must be large
(in a sense that will be clarified below), and we shall see that this condition happens to
be necessary against other attacks as well. A high nonlinearity is surely one of the most
important cryptographic criteria.
By definition, the nonlinearity of any Boolean function is bounded above by its Hamming
weight. The set of those Boolean functions that achieve this bound with equality (i.e., of
all possible coset leaders of the first-order Reed–Muller code) is unknown. Some functions
belong obviously to it: n-variable Boolean functions of Hamming weight at most $2^{n-2}$ (since
nonzero affine functions have at least twice the weight, and according to the triangular
inequality); bent functions (see Definition 19 below) of Hamming weight $2^{n-1} - 2^{\frac{n}{2}-1}$;
and more generally, plateaued functions (see Definition 63, page 258) of amplitude $2^r$ and
Hamming weight $2^{n-1} - 2^{r-1}$. But the set is not completely determined. Note that each
coset of the first-order Reed–Muller code contains at least one element of this set. In [1138],
the Boolean functions of nonlinearities $2^{n-2}$ and $2^{n-2} + 1$ are studied.
The nonlinearity is an EA invariant, since $d_H(f \circ L \oplus \ell', \ell) = d_H(f, (\ell \oplus \ell') \circ L^{-1})$ for
all functions f, $\ell$, and $\ell'$ and for every affine automorphism L, and since $(\ell \oplus \ell') \circ L^{-1}$
ranges over the whole set of affine functions when $\ell$ does.
The nonlinearity can be computed through the Walsh transform: let $\ell_a(x) = a_1 x_1 \oplus \cdots \oplus
a_n x_n = a \cdot x$ be any linear function; according to Relation (2.37), we have $d_H(f, \ell_a) =
2^{n-1} - \frac{1}{2} W_f(a)$, and we deduce $d_H(f, \ell_a \oplus 1) = 2^{n-1} + \frac{1}{2} W_f(a)$; the nonlinearity of f is
therefore equal to
$$nl(f) = 2^{n-1} - \frac{1}{2} \max_{a \in \mathbb{F}_2^n} |W_f(a)|. \qquad (3.1)$$
Hence a function has high nonlinearity if and only if all of its Walsh values have low
magnitudes. The value $\max_{a \in \mathbb{F}_2^n} |W_f(a)|$ is called the “linearity of f” by some authors and
its “spectral amplitude” by some others.
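Relation (3.1) translates directly into a short computation. The sketch below assumes that f is given by its truth table as a list of 0/1 values indexed by the integer x whose i-th bit is $x_{i+1}$, and it uses the usual in-place butterfly for the fast Walsh–Hadamard transform; the function names are illustrative only.

```python
def walsh_spectrum(truth_table):
    """Return [W_f(a)] for all a, with W_f(a) = sum_x (-1)^(f(x) + a.x)."""
    spectrum = [1 - 2 * bit for bit in truth_table]  # (-1)^f(x)
    half = 1
    while half < len(spectrum):
        # One butterfly layer per variable: O(n 2^n) operations overall.
        for start in range(0, len(spectrum), 2 * half):
            for i in range(start, start + half):
                u, v = spectrum[i], spectrum[i + half]
                spectrum[i], spectrum[i + half] = u + v, u - v
        half *= 2
    return spectrum


def nonlinearity(truth_table):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2, as in Relation (3.1)."""
    peak = max(abs(value) for value in walsh_spectrum(truth_table))
    return len(truth_table) // 2 - peak // 2


# f(x1, x2) = x1 x2, listed for x = 0, 1, 2, 3.
and_table = [0, 0, 0, 1]
# Majority function of three bits.
majority_table = [0, 0, 0, 1, 0, 1, 1, 1]
# An affine function, x1 + x2, has nonlinearity 0.
xor_table = [0, 1, 1, 0]
```

For instance, `nonlinearity(majority_table)` evaluates the distance of the 3-variable majority function to the closest affine function.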

Upper and lower bounds, bent functions



Parseval’s relation $\sum_{a \in \mathbb{F}_2^n} W_f^2(a) = 2^{2n}$ implies that the arithmetic mean of $W_f^2(a)$ equals
$2^n$. The maximum of $W_f^2(a)$ being larger than or equal to its arithmetic mean, we deduce
that $\max_{a \in \mathbb{F}_2^n} |W_f(a)| \geq 2^{\frac{n}{2}}$. This implies the following:
Theorem 3 For every n-variable Boolean function f, we have
$$nl(f) \leq 2^{n-1} - 2^{\frac{n}{2}-1}. \qquad (3.2)$$

This bound, valid for every Boolean function and tight for every even n as we shall see,
is called the covering radius bound². It can be improved (i.e., lowered) when we restrict
ourselves to some subclasses of functions: resilient and correlation immune functions
(see Chapter 7); functions $tr(ax^d)$ such that $a \in \mathbb{F}_{2^n}^*$ and $\gcd(d, 2^n - 1) = 1$, since
their nonlinearity equals that of the vectorial function $x^d$ and is then bounded above by
$2^{n-1} - 2^{\frac{n-1}{2}}$, according to Theorem 6, page 118. A Boolean function will be considered
as highly nonlinear if its nonlinearity lies near³ the upper bound in its class. Note that,
for general Boolean functions, there is no direct correlation between the nonlinearity and
the algebraic degree: highly nonlinear n-variable functions can have an algebraic degree
as low as 2 (see Section 5.2) and as large as n (but then the nonlinearity cannot be
optimal; see Theorem 13, page 200), and functions with low nonlinearity (e.g., functions
of Hamming weight at most $2^{n-2}$, whose nonlinearity equals the Hamming weight since
the minimum distance of the Reed–Muller code of order 1 equals $2^{n-1}$ and because of the
triangular inequality on Hamming distance) can have algebraic degree between 2 and n
as well.
Olejár and Stanek [917] have shown that, when n tends to infinity, random Boolean
functions on $\mathbb{F}_2^n$ have almost surely nonlinearity larger than $2^{n-1} - \sqrt{n}\, 2^{\frac{n-1}{2}}$ (this is easy to
prove by counting, or more precisely by bounding from above, the number of functions
whose nonlinearities are lower than or equal to a given number; see, e.g., [224, 229], and
using the so-called Shannon effect; see page 103). Rodier [1000] later showed a more
precise and stronger result: asymptotically, almost all Boolean functions have nonlinearity
between $2^{n-1} - 2^{\frac{n}{2}-1} \sqrt{n} \left( \sqrt{2 \ln 2} + \frac{4 \ln n}{n} \right)$ and $2^{n-1} - 2^{\frac{n}{2}-1} \sqrt{n} \left( \sqrt{2 \ln 2} - \frac{5 \ln n}{n} \right)$ and
therefore located in the neighborhood of $2^{n-1} - 2^{\frac{n}{2}-1} \sqrt{2n \ln 2}$, where ln denotes the natural
(i.e., Neperian) logarithm.
The probability $\mathrm{Prob}\,(\max_w |W_F(w)| \geq y)$ is equal to 1 when $y = 2^{n/2}$; it decreases
slowly when y increases, decreases then suddenly to the neighborhood of 0 when y is
approaching $2^{n/2} \sqrt{2n \ln 2}$, then it decreases slowly to 0 when y increases to $2^n$. Further
details can be given around $2^{n/2} \sqrt{2n \ln 2}$. The article [784] provides sharper results by
a different method. For n > 164, we have $\mathrm{Prob}\,(\max_w |W_F(w)| \geq 2^{n/2} \sqrt{2n \ln 2}) \leq
(1 + o(1)) / \sqrt{n \pi \ln 2}$.
Equality occurs in (3.2) if and only if $|W_f(a)| = 2^{\frac{n}{2}}$ for every vector a, since the
maximum of $W_f^2(a)$ equals the arithmetic mean if and only if $W_f^2(a)$ is constant.

Definition 19 An n-variable Boolean function is called bent if its nonlinearity equals
$2^{n-1} - 2^{\frac{n}{2}-1}$, or equivalently, $W_f(a) = \pm 2^{\frac{n}{2}}$ for every $a \in \mathbb{F}_2^n$.
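As a small sanity check of this definition, one can test whether every Walsh value of a function has magnitude $2^{n/2}$. The sketch below uses the quadratic function $x_1 x_2 \oplus x_3 x_4$ as an example; the helper names and the truth-table ordering (bit i of the index x is $x_{i+1}$) are assumptions made for illustration.

```python
def truth_table(function, n):
    """Tabulate function(x1, ..., xn) for x = 0, ..., 2^n - 1 (bit i of x is x_{i+1})."""
    return [function(*[(x >> i) & 1 for i in range(n)]) for x in range(1 << n)]


def walsh_spectrum(table):
    """Fast Walsh-Hadamard transform of the sign function (-1)^f."""
    spectrum = [1 - 2 * bit for bit in table]
    half = 1
    while half < len(spectrum):
        for start in range(0, len(spectrum), 2 * half):
            for i in range(start, start + half):
                u, v = spectrum[i], spectrum[i + half]
                spectrum[i], spectrum[i + half] = u + v, u - v
        half *= 2
    return spectrum


def is_bent(table):
    """True when W_f(a)^2 = 2^n for every a, i.e. |W_f(a)| = 2^(n/2)."""
    return all(value * value == len(table) for value in walsh_spectrum(table))


quadratic = truth_table(lambda x1, x2, x3, x4: (x1 & x2) ^ (x3 & x4), 4)
monomial = truth_table(lambda x1, x2, x3, x4: x1 & x2 & x3 & x4, 4)
```

Here `is_bent(quadratic)` holds, and the Hamming weight `sum(quadratic)` equals $2^{n-1} - 2^{n/2-1} = 6$, so the function is not balanced.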

2 The covering radius of the Reed–Muller code of order 1 equals by definition the maximum nonlinearity of
Boolean functions; see Section 4.1.
3 The meaning of “near” depends on the framework; see [650].
Table 3.1 Best-known nonlinearities nl of Boolean functions in small odd dimension [815].

n                                                 5    7    9     11     13      15
$2^{n-1} - 2^{\frac{n-1}{2}}$                    12   56  240    992  4,032  16,256
nl                                               12   56  242    996  4,040  16,276
$2 \lfloor 2^{n-2} - 2^{\frac{n}{2}-2} \rfloor$  12   58  244  1,000  4,050  16,292

Such functions exist only for even values of n, because $2^{n-1} - 2^{\frac{n}{2}-1}$ must be an integer
(in fact, they exist for every even n). Chapter 6 is devoted to them.
For n odd, Inequality (3.2) cannot be tight. The maximum nonlinearity of n-variable
Boolean functions, that is, the covering radius of RM(1, n), lies then between $2^{n-1} - 2^{\frac{n-1}{2}}$
(which can always be achieved, e.g., by quadratic functions; see Section 5.2) and
$2 \lfloor 2^{n-2} - 2^{\frac{n}{2}-2} \rfloor$ [617]. It has been shown in [597, 894] that it equals $2^{n-1} - 2^{\frac{n-1}{2}}$ when n = 1, 3, 5, 7,
and in [936, 937], by Patterson and Wiedemann⁴ (with rotation symmetric functions, see
Definitions 59 and 60 at page 248), that it is strictly larger than $2^{n-1} - 2^{\frac{n-1}{2}}$ if n ≥ 15 (a
review on what was known in 1999 on the best nonlinearities of functions on odd numbers
of variables is given in [515]; see also [133, 747, 815]). This value $2^{n-1} - 2^{\frac{n-1}{2}}$ is called
the quadratic bound because, as we already mentioned, such nonlinearity can be achieved
by quadratic functions. It is also called the bent concatenation bound since it can also be
achieved by the concatenation $x_n f(x_1, \dots, x_{n-1}) \oplus (x_n \oplus 1) g(x_1, \dots, x_{n-1})$ of two bent
functions f, g in n − 1 variables. It has been later proved by Kavut et al. in [684, 686] (see
also [816], where balanced functions are obtained), thanks to rotation symmetric functions
as well, that the best nonlinearity of Boolean functions in odd numbers of variables is
strictly larger than the quadratic bound for any n > 7. See Table 3.1 for the best-known
nonlinearities for n odd between 5 and 15 compared to the quadratic (or bent concatenation)
lower bound and to the upper bound.
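The bent concatenation bound can be observed on a small case. The following sketch concatenates two bent functions in 4 variables into a 5-variable function and computes its nonlinearity through the Walsh transform; the two quadratic functions chosen and all helper names are assumptions made for illustration.

```python
def truth_table(function, n):
    """Tabulate function(x1, ..., xn) for x = 0, ..., 2^n - 1 (bit i of x is x_{i+1})."""
    return [function(*[(x >> i) & 1 for i in range(n)]) for x in range(1 << n)]


def nonlinearity(table):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2, via the fast Walsh-Hadamard transform."""
    spectrum = [1 - 2 * bit for bit in table]
    half = 1
    while half < len(spectrum):
        for start in range(0, len(spectrum), 2 * half):
            for i in range(start, start + half):
                u, v = spectrum[i], spectrum[i + half]
                spectrum[i], spectrum[i + half] = u + v, u - v
        half *= 2
    return len(table) // 2 - max(abs(value) for value in spectrum) // 2


# Two bent functions in n - 1 = 4 variables.
f = truth_table(lambda x1, x2, x3, x4: (x1 & x2) ^ (x3 & x4), 4)
g = truth_table(lambda x1, x2, x3, x4: (x1 & x3) ^ (x2 & x4), 4)

# x5 f + (x5 + 1) g: x5 is the most significant index bit,
# so the half with x5 = 0 is g and the half with x5 = 1 is f.
concatenation = g + f

# Quadratic bound 2^(n-1) - 2^((n-1)/2) for n = 5.
quadratic_bound = 2 ** 4 - 2 ** 2
```

The nonlinearity of `concatenation` equals `quadratic_bound`, the n = 5 entry of the first row of Table 3.1.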
Bent functions being not balanced (since we have seen that f is bent if and only if $|W_f(a)|$
equals $2^{\frac{n}{2}}$ for every vector a, and then $W_f(0_n) \neq 0$), and having too low algebraic degree (as
we shall see with Theorem 13, page 200), they are improper for use in cryptosystems. For
this reason, even when they exist (for n even), it is also necessary to study those functions
that have large nonlinearities, say between $2^{n-1} - 2^{\frac{n-1}{2}}$ and $2^{n-1} - 2^{\frac{n}{2}-1}$, but are not bent,
among which some balanced functions exist. The maximum nonlinearity of balanced
functions is unknown for any n ≥ 8. See Table 3.2 for the best-known nonlinearities for
n between 4 and 15 compared to the upper bound $bnd = 2 \lfloor 2^{n-2} - 2^{\frac{n}{2}-2} \rfloor$. Note that the
15-variable function can be made 1-resilient (see Definition 21, page 86).
As first observed in [1169, 1173], relations exist between the nonlinearity and the
derivatives of Boolean functions. We give here simpler proofs of these facts. Applying
Relation (2.56) to $E = \{0_n, e\}^{\perp}$, where $e \neq 0_n$, and to $b = 0_n$ and all $a \in \mathbb{F}_2^n$, and using that
$\max_{u \in a+E} W_f^2(u) \geq \frac{1}{|E|} \sum_{u \in a+E} W_f^2(u)$, we have the following proposition:

4 It has been later proved (see [466, 696, 820, 1016, 1027]) that balanced functions with nonlinearity strictly
larger than $2^{n-1} - 2^{\frac{n-1}{2}}$, and with algebraic degree n − 1, or satisfying PC(1) (see Definition 24, page 97),
exist for every odd n ≥ 15.
Table 3.2 Best-known nonlinearities nl of balanced Boolean functions in small dimension
[685, 815, 816, 1016].

n      4   5   6   7    8    9   10     11     12     13     14      15
nl     4  12  26  56  116  240  492    992  2,010  4,036  8,120  16,272
bnd    6  12  28  58  120  244  496  1,000  2,016  4,050  8,128  16,292

Proposition 22 For every n ≥ 1 and every n-variable Boolean function f, we have
$$nl(f) \leq 2^{n-1} - \frac{1}{2} \sqrt{2^n + \max_{e \neq 0_n} |F(D_e f)|}.$$

This directly proves an important property of bent functions that we shall revisit in Chapter
6: f is bent if and only if all its derivatives $D_e f$, $e \neq 0_n$, are balanced.
The obvious relation $w_H(f) \geq \frac{1}{2} w_H(D_e f) = \frac{1}{2} \left( 2^{n-1} - \frac{1}{2} F(D_e f) \right)$, valid for every
$e \in \mathbb{F}_2^n$, leads, when applied to the functions $f(x) \oplus a \cdot x \oplus \epsilon$, where $a \in \mathbb{F}_2^n$ and $\epsilon \in \mathbb{F}_2$, to
the inequality $d_H(f, a \cdot x \oplus \epsilon) \geq \frac{1}{2} \left( 2^{n-1} - \frac{1}{2} (-1)^{a \cdot e} F(D_e f) \right) \geq \frac{1}{2} \left( 2^{n-1} - \frac{1}{2} |F(D_e f)| \right)$.
Hence, taking the maximum of this last expression when e ranges over $\mathbb{F}_2^n$, we deduce the
lower bound:

Proposition 23 For every positive integer n and every n-variable function f, we have
$$nl(f) \geq 2^{n-2} - \frac{1}{4} \min_{e \in \mathbb{F}_2^n,\ e \neq 0_n} |F(D_e f)|. \qquad (3.3)$$
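The two derivative-based bounds (the upper bound of Proposition 22 and the lower bound (3.3)) can be compared with the exact nonlinearity on small examples. This is a sketch under the assumption that $F(D_e f) = \sum_x (-1)^{f(x) \oplus f(x+e)}$ and that f is given as a 0/1 truth table; the function names are illustrative.

```python
import math


def nonlinearity(table):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2, via the fast Walsh-Hadamard transform."""
    spectrum = [1 - 2 * bit for bit in table]
    half = 1
    while half < len(spectrum):
        for start in range(0, len(spectrum), 2 * half):
            for i in range(start, start + half):
                u, v = spectrum[i], spectrum[i + half]
                spectrum[i], spectrum[i + half] = u + v, u - v
        half *= 2
    return len(table) // 2 - max(abs(value) for value in spectrum) // 2


def derivative_sums(table):
    """Return |F(D_e f)| for every nonzero e, where D_e f(x) = f(x) + f(x + e)."""
    size = len(table)
    return [abs(sum(1 - 2 * (table[x] ^ table[x ^ e]) for x in range(size)))
            for e in range(1, size)]


def derivative_bounds(table):
    """Return (lower, upper): the bound (3.3) and the bound of Proposition 22."""
    size = len(table)  # 2^n
    sums = derivative_sums(table)
    lower = size / 4 - min(sums) / 4
    upper = size / 2 - math.sqrt(size + max(sums)) / 2
    return lower, upper


# Majority of three bits, and the bent function x1 x2 + x3 x4.
majority = [0, 0, 0, 1, 0, 1, 1, 1]
bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
        for x in range(16)]
```

For the bent example every derivative sum is 0, so the upper bound coincides with the covering radius bound, as expected from the remark on balanced derivatives.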

Another lower bound on the nonlinearity is given at the end of the remark located after
Theorem 7, page 152, and a further one is given in [248, subsection 4.2]:

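Proposition 24 Let f be any n-variable Boolean function. Let $S = supp(f) = \{x \in \mathbb{F}_2^n;\ f(x) = 1\}$
be the support of f and let
$$M_f = \frac{\max_{z \in \mathbb{F}_2^n \setminus \{0_n\}} \left| \{(x, y) \in S^2;\ x + y = z\} \right| + \min_{z \in \mathbb{F}_2^n \setminus \{0_n\}} \left| \{(x, y) \in S^2;\ x + y = z\} \right|}{2}$$
and let
$$E_f = \frac{\max_{z \in \mathbb{F}_2^n \setminus \{0_n\}} \left| \{(x, y) \in S^2;\ x + y = z\} \right| - \min_{z \in \mathbb{F}_2^n \setminus \{0_n\}} \left| \{(x, y) \in S^2;\ x + y = z\} \right|}{2}.$$
Then:
$$nl(f) \geq 2^{n-1} - \max \left( \left| 2^{n-1} - |S| \right|,\ \sqrt{|S| - M_f + (2^n - 1) E_f} \right). \qquad (3.4)$$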

Any bent function achieves (3.4) with equality and it would be interesting to determine
all functions f such that (3.4) is an equality.
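The bound (3.4) only needs the support of f, so it is easy to evaluate for small n. The sketch below reads the second argument of the maximum as the square root of $|S| - M_f + (2^n - 1) E_f$ (this reading of the formula is an assumption made here), and compares the bound with the exact nonlinearity; helper names are illustrative.

```python
import math


def nonlinearity(table):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2, via the fast Walsh-Hadamard transform."""
    spectrum = [1 - 2 * bit for bit in table]
    half = 1
    while half < len(spectrum):
        for start in range(0, len(spectrum), 2 * half):
            for i in range(start, start + half):
                u, v = spectrum[i], spectrum[i + half]
                spectrum[i], spectrum[i + half] = u + v, u - v
        half *= 2
    return len(table) // 2 - max(abs(value) for value in spectrum) // 2


def pair_counts(table):
    """For each nonzero z, count the pairs (x, y) in S^2 with x + y = z."""
    support = [x for x in range(len(table)) if table[x]]
    counts = [0] * len(table)
    for x in support:
        for y in support:
            counts[x ^ y] += 1
    return counts[1:]


def support_bound(table):
    """Right-hand side of (3.4), with the square-root reading stated above."""
    size, weight = len(table), sum(table)
    counts = pair_counts(table)
    m_f = (max(counts) + min(counts)) / 2
    e_f = (max(counts) - min(counts)) / 2
    return size / 2 - max(abs(size / 2 - weight),
                          math.sqrt(weight - m_f + (size - 1) * e_f))


bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
        for x in range(16)]
majority = [0, 0, 0, 1, 0, 1, 1, 1]
```

On the bent example the bound is met with equality, while on the majority function it is strict.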
Nonlinearity and codes


The nonlinearity of a Boolean function f equals the minimum distance of the linear code
RM(1, n) ∪ (f ⊕ RM(1, n)). See more in Chapter 4. More generally, the minimum distance
of an unrestricted code defined as the union of cosets f ⊕ RM(1, n) of the Reed–Muller code
of order 1, where f ranges over a set $\mathcal{F}$, equals the minimum nonlinearity of the functions
f ⊕ g, where f and g are distinct and range over $\mathcal{F}$, since $d_H(f \oplus h, g \oplus h') = d_H(f \oplus g,
h \oplus h')$ and $h \oplus h'$ ranges over RM(1, n) when h, h' do. This observation allows constructing
some optimal nonlinear codes such as Kerdock codes (see Section 6.1.22).

Higher-order nonlinearity
Changing one or a few bits in the output (in the truth table) of a low degree Boolean function
gives a function with high degree and does not fundamentally modify the robustness of the
system using it (explicit attacks using approximations by low-degree functions exist for
block ciphers but not for all stream ciphers, however; see, e.g., [707]). A relevant parameter
is the nonlinearity profile:

Definition 20 Let n and r ≤ n be positive integers. Let f be an n-variable Boolean function.
We call the r-th order nonlinearity (and if r is not specified, the higher-order
nonlinearity) of f, and we denote by $nl_r(f)$, its Hamming distance to the Reed–Muller code
of order r. The nonlinearity profile of f is the sequence of its r-th order nonlinearities, for
all values of r < n.

Several papers have shown the role played by this EA-invariant parameter against some
cryptanalyses (but contrary to the first-order nonlinearity, it must have low value for allowing
attacks) and studied it from an algorithmic viewpoint [387, 544, 638, 707, 831, 882]. It
is related to the minimal distance to functions depending on a subset of variables (which
plays a role with respect to the correlation attack, see below in Subsection 3.1.7, and is not
EA invariant) since a function depending on k variables has algebraic degree at most k.
Hence the r-th order nonlinearity is a lower bound for the distance to functions depending
on at most r variables. The former is much more difficult to study than the latter. The best
possible rth-order nonlinearity of Boolean functions equals the covering radius of the r-th
order Reed–Muller code; see Subsection 4.1.6, page 157.
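For very small n, the r-th order nonlinearity can be obtained by exhaustive search over the Reed–Muller code of order r. The sketch below is a naive brute force (it is not one of the algorithms cited above): truth tables are stored as integers whose bit x is f(x), and the codewords are enumerated in Gray-code order. All names are illustrative.

```python
from itertools import combinations


def reed_muller_basis(n, r):
    """Truth tables (as integers) of all monomials of degree at most r."""
    basis = []
    for degree in range(r + 1):
        for variables in combinations(range(n), degree):
            mask = sum(1 << v for v in variables)
            table = 0
            for x in range(1 << n):
                if x & mask == mask:  # the monomial equals 1 at x
                    table |= 1 << x
            basis.append(table)
    return basis


def rth_order_nonlinearity(f_table, n, r):
    """Minimum Hamming distance from f to RM(r, n), by exhaustive search."""
    basis = reed_muller_basis(n, r)
    codeword = 0
    best = bin(f_table).count("1")  # distance to the null function
    for step in range(1, 1 << len(basis)):
        # Gray-code enumeration: flip one basis element per step.
        index = (step & -step).bit_length() - 1
        codeword ^= basis[index]
        best = min(best, bin(f_table ^ codeword).count("1"))
    return best


# x1 x2 x3 viewed as a 4-variable function: equal to 1 at x = 7 and x = 15.
cubic = (1 << 7) | (1 << 15)
# x1 x2 + x3 x4 as an integer truth table.
quadratic = sum((((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))) << x
                for x in range(16))
```

The search visits $2^{\sum_{i \leq r} \binom{n}{i}}$ codewords, which is why this approach is hopeless beyond a handful of variables.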

Upper and lower bounds and asymptotic behavior An upper bound on $nl_r(f)$ is given
in [309] for r ≥ 2, which we shall address in Section 4.1 (see page 158). Asymptotically, it
gives
$$nl_r(f) \leq 2^{n-1} - \frac{\sqrt{15}}{2} \cdot (1 + \sqrt{2})^{r-2} \cdot 2^{\frac{n}{2}} + O(n^{r-2}).$$
An asymptotic lower bound, given in [229], is as follows: let $c \in \mathbb{R}$, c > 0; for every r ≥ 0,
the density of the set of functions such that
$$nl_r(f) > 2^{n-1} - c \sqrt{\sum_{i=0}^{r} \binom{n}{i}}\ 2^{\frac{n-1}{2}}$$
(i.e., the probability for a function to satisfy this inequality) is larger than
$1 - 2^{(1 - c^2 \log_2 e) \sum_{i=0}^{r} \binom{n}{i}}$ and, if $c^2 \log_2 e > 1$, it tends to 1 when n tends to ∞. This is
easily proved: the number of functions of algebraic degree at most r equals $2^{\sum_{i=0}^{r} \binom{n}{i}}$.
For every such function h, the number of Boolean functions f whose Hamming
distance to h is bounded above by some number D equals $\sum_{0 \leq i \leq D} \binom{2^n}{i}$. Hence, the
number of Boolean functions f such that $d_H(f, h) \leq 2^{n-1} - c \sqrt{\sum_{i=0}^{r} \binom{n}{i}}\ 2^{\frac{n-1}{2}}$
equals
$$\sum_{0 \leq i \leq 2^{n-1} - c \sqrt{\sum_{i=0}^{r} \binom{n}{i}}\ 2^{\frac{n-1}{2}}} \binom{2^n}{i}.$$
We know from [14] that, for every N, we have
$\sum_{0 \leq i \leq \lambda N} \binom{N}{i} < 2^N e^{-2N(1/2 - \lambda)^2}$. We deduce that the number of Boolean functions f such
that $d_H(f, h) \leq 2^{n-1} - c \sqrt{\sum_{i=0}^{r} \binom{n}{i}}\ 2^{\frac{n-1}{2}}$ is bounded above by $2^{2^n - c^2 \left( \sum_{i=0}^{r} \binom{n}{i} \right) \log_2 e}$. Thus,
the number of those Boolean functions that have r-th order nonlinearity smaller than or
equal to $2^{n-1} - c \sqrt{\sum_{i=0}^{r} \binom{n}{i}}\ 2^{\frac{n-1}{2}}$ is smaller than $2^{2^n + (1 - c^2 \log_2 e) \sum_{i=0}^{r} \binom{n}{i}}$. The rest of the
proof is straightforward.
A more precise and more recent result is given by K.-U. Schmidt in [1021], which
generalizes the result on r = 1 by Rodier [1000] recalled at page 80, and a result from
[435], which dealt with r = 2: for every r ≥ 1, the ratio
$\frac{2^{n-1} - nl_r(f)}{\sqrt{2^{n-1} \binom{n}{r} \ln 2}}$ tends to 1 almost
surely when n tends to infinity (see more details in [1021]).
Unfortunately, this does not help obtaining explicit functions with nonweak r-th order
nonlinearity.

Remark. We shall see in Section 4.1 that the minimum Hamming weight of nonzero
n-variable Boolean functions of algebraic degree at most r (i.e., the minimum distance
of the Reed–Muller code RM(r, n)) is equal to $2^{n-r}$ for every r ≤ n. Hence, applying this
property to r + 1 instead of r, we have $nl_r(f) \geq 2^{n-r-1}$ for every function f of algebraic
degree exactly r + 1 ≤ n. Moreover, we shall also see that the minimum weight n-variable
Boolean functions of algebraic degree r + 1 are the characteristic functions of (n − r − 1)-dimensional
flats. Such functions have r-th order nonlinearity $2^{n-r-1}$ since the null function
is the closest function of algebraic degree at most r to such a function.

Computing the r-th order nonlinearity of a given function with an algebraic degree strictly
larger than r is a hard task for r > 1 (for the first order, we have seen that much is known
in theory and algorithmically thanks to the Walsh transform, which can be computed by the
algorithm of the fast Fourier–Hadamard transform; but for r > 1, very little is known).
Even the second-order nonlinearity is known only for a few peculiar functions and for
functions in small numbers of variables. Some simple but useful facts are shown in [232].
A nice algorithm due to G. Kabatiansky and C. Tavernier and improved and implemented
by Fourquet and Tavernier [518] works well for r = 2 and n ≤ 11 (in some cases,
n ≤ 13) only. It can be applied for higher orders, but it is then efficient only for very
small numbers of variables. Proving lower bounds on the r-th order nonlinearity of functions
(and therefore proving their good behavior with respect to this criterion) is also a quite
difficult task. Until 2008, there had been only one attempt, by Iwata and Kurosawa [638],
to construct functions with r-th order nonlinearity bounded from below. But the obtained
value, $2^{n-r-3}(r + 5)$, of the lower bound was small. Also, lower bounds on the r-th order
nonlinearity by means of the algebraic immunity of Boolean functions have been derived
(see Chapter 9), but they are small too. In [232], a method is introduced for efficiently
bounding from below the nonlinearity profile of a given function when lower bounds exist
for the (r − 1)-th order nonlinearities of the derivatives of f:

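Theorem 4 Let $f \in \mathcal{BF}_n$ and let 0 < r < n be an integer. We have:
$$nl_r(f) \geq \frac{1}{2} \max_{a \in \mathbb{F}_2^n} nl_{r-1}(D_a f), \quad \text{and}$$
$$nl_r(f) \geq 2^{n-1} - \frac{1}{2} \sqrt{2^{2n} - 2 \sum_{a \in \mathbb{F}_2^n} nl_{r-1}(D_a f)}.$$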

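The first bound is easily deduced from the inequality $w_H(f) \geq \frac{1}{2} w_H(D_a f)$ applied
to f ⊕ h, $d_{alg}(h) \leq r$, and the second one comes from the equalities
$$nl_r(f) = 2^{n-1} - \frac{1}{2} \max_{h \in \mathcal{BF}_n;\ d_{alg}(h) \leq r} \left| \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus h(x)} \right|$$
and
$$\left( \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus h(x)} \right)^2 = \sum_{a \in \mathbb{F}_2^n} \sum_{x \in \mathbb{F}_2^n} (-1)^{D_a f(x) \oplus D_a h(x)} = 2^{2n} - 2 \sum_{a \in \mathbb{F}_2^n} d_H(D_a f, D_a h).$$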
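Both bounds of Theorem 4 can be evaluated by brute force for a small example and compared with the exact value. The sketch below does this for the 4-variable function $x_1 x_2 x_3$ and r = 2, using a naive exhaustive search over the Reed–Muller code (an assumption of this illustration, not the method of [232]); truth tables are integers whose bit x is f(x).

```python
import math
from itertools import combinations


def reed_muller_basis(n, r):
    """Truth tables (as integers) of all monomials of degree at most r."""
    basis = []
    for degree in range(r + 1):
        for variables in combinations(range(n), degree):
            mask = sum(1 << v for v in variables)
            basis.append(sum(1 << x for x in range(1 << n) if x & mask == mask))
    return basis


def rth_order_nonlinearity(f_table, n, r):
    """Minimum distance from f to RM(r, n), by Gray-code exhaustive search."""
    basis = reed_muller_basis(n, r)
    codeword, best = 0, bin(f_table).count("1")
    for step in range(1, 1 << len(basis)):
        codeword ^= basis[(step & -step).bit_length() - 1]
        best = min(best, bin(f_table ^ codeword).count("1"))
    return best


def derivative(f_table, n, a):
    """Truth table of D_a f(x) = f(x) + f(x + a)."""
    return sum((((f_table >> x) ^ (f_table >> (x ^ a))) & 1) << x
               for x in range(1 << n))


def theorem4_bounds(f_table, n, r):
    """Return the two lower bounds on nl_r(f) stated in Theorem 4."""
    values = [rth_order_nonlinearity(derivative(f_table, n, a), n, r - 1)
              for a in range(1 << n)]
    first = max(values) / 2
    second = 2 ** (n - 1) - 0.5 * math.sqrt(2 ** (2 * n) - 2 * sum(values))
    return first, second


cubic = (1 << 7) | (1 << 15)  # x1 x2 x3 on 4 variables
first_bound, second_bound = theorem4_bounds(cubic, 4, 2)
exact = rth_order_nonlinearity(cubic, 4, 2)
```

In this particular example both bounds coincide with the exact second-order nonlinearity; in general they are only lower bounds.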

These bounds ease the determination of efficient lower bounds on the second-order
nonlinearities of functions in some infinite classes, by reducing the problem to calculations
and summations of first-order nonlinearities (often tricky, but feasible). This has been done
in a series of papers (see, e.g., in [714] the references and the table comparing the obtained
second-order nonlinearities; see also [538]) that we shall not all cite. Such lower bounds
were given as examples (about power functions, including the Welch function) in [232],
but also bounds for the whole nonlinearity profile of the multiplicative inverse function
$tr_n(x^{2^n - 2})$: the r-th order nonlinearity of this function is approximately bounded below
by $2^{n-1} - 2^{(1 - 2^{-r}) n}$ and therefore asymptotically equivalent to $2^{n-1}$, for every fixed r. Note
that the extension of the Weil bound that we shall see in Section 5.6 is efficient for bounding
below the r-th order nonlinearity of the inverse function only for r = 1. Indeed, already
for r = 2, the univariate degree of a quadratic function in trace representation form can be
bounded above by $2^{\lfloor n/2 \rfloor} + 1$ only, and this gives a bound in $2^n$ on the maximum absolute value
of the Walsh transform and therefore no information on the nonlinearity. In [240], the author
similarly studied the (simplest) Dillon bent function $(x, y) \mapsto x y^{2^{n/2} - 2}$, $x, y \in \mathbb{F}_{2^{n/2}}$ (with
an improvement in [1066]) and a univariate function. In [607], the authors asymptotically
studied, for p an odd prime, the Boolean function taking value 0 over the binary expansions
of the quadratic residues modulo p.
The relative positions of the two bounds of Theorem 4 with respect to each other have
been studied in [872], where it is shown that for r = 2, there exist functions for which the
first bound is stronger, and others where it is weaker.

3.1.4 Correlation immunity and resiliency


We have seen that the Boolean functions used in stream ciphers must be balanced. In both
models of pseudorandom generators, there is a stronger condition related to balancedness
to satisfy.

In the combiner model


Any combining function f (x) must stay balanced when some number of coordinates xi of x
are kept constant.

Definition 21 Let n be a positive integer and t ≤ n a nonnegative integer. An n-variable
Boolean function f is called a t-th order correlation immune function if its output
probability distribution is unaltered when at most t (or, equivalently, exactly t) of its input
bits are kept constant. It is called a t-resilient function⁵ if it is balanced and t-th order
correlation immune, that is, if any of its restrictions obtained by fixing at most t (or exactly t)
of its input coordinates $x_i$ is balanced.
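This definition can be checked directly for small n by fixing every choice of exactly t input bits and comparing the proportion of ones in each restriction with that of f. The sketch below assumes truth tables given as 0/1 lists indexed by x (bit i of x is $x_{i+1}$); function names are illustrative.

```python
from itertools import combinations, product


def is_correlation_immune(table, n, t):
    """Check t-th order correlation immunity by fixing exactly t input bits."""
    weight = sum(table)
    for positions in combinations(range(n), t):
        for values in product((0, 1), repeat=t):
            ones = sum(table[x] for x in range(1 << n)
                       if all((x >> p) & 1 == v for p, v in zip(positions, values)))
            # Same proportion of ones: ones / 2^(n-t) == weight / 2^n.
            if ones * (1 << t) != weight:
                return False
    return True


def is_resilient(table, n, t):
    """t-resilient means balanced and t-th order correlation immune."""
    return 2 * sum(table) == len(table) and is_correlation_immune(table, n, t)


# x1 + x2 + x3 (parity of three bits).
parity = [bin(x).count("1") & 1 for x in range(8)]
# x1 + x2 x3: balanced, but fixing x1 leaves the unbalanced function x2 x3.
mixed = [(x & 1) ^ (((x >> 1) & 1) & ((x >> 2) & 1)) for x in range(8)]
```

The parity function is 2-resilient, whereas the second function is balanced (0-resilient) but not even first-order correlation immune.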

Note that, by definition, 0-th order correlation immunity is an empty condition and
0-resiliency means balancedness.
Nota Bene. When we say that a function f is t-th order correlation immune (t-resilient if
it is balanced), we do not mean that t is the maximum value of k such that f is k-th order
correlation immune. We will call this maximum value the correlation immunity order of f
(resp. its resiliency order if it is balanced).
The notion of correlation immune function has been introduced by Siegenthaler in [1041].
It has been observed later in [181] that the notion existed already in combinatorics and
statistics. Indeed, saying that a function f is t-th order correlation immune is equivalent to
saying that the array (i.e., matrix) whose rows are the vectors of the support of f is a simple
binary orthogonal array⁶ of strength t.

Definition 22 [988] An array (a matrix) over an alphabet A is an orthogonal array of
strength t if, when we select any t columns in it, each vector of $A^t$ appears the same number
λ of times as a row in the array restricted to these columns. This orthogonal array is called
simple if no two rows are equal. It is often called a t-(|A|, n, λ) orthogonal array, where
n is the number of columns in the array (in the case of correlation immune functions, the
number of variables, with |A| = 2).

5 The term of resiliency was introduced in [370], in relationship with another cryptographic problem.
6 This also relates then correlation immune functions to mutually orthogonal latin squares and threshold
secret-sharing schemes.
Orthogonal arrays play a role in statistics, for the organization of experiments. Each
row corresponds to the organization of an experiment and the n columns correspond to
parameters. It is necessary to organize the experiments so that any combination of some
number k of parameters will appear in the same number of experiments. This is achieved if
all possible $|A|^n$ experiments are made, but this is not a solution since the number of rows
needs to be minimized (exactly as in the case of countermeasures to side-channel attacks; see
Subsection 12.1.1, page 431). There exist bounds: the number of rows in a binary orthogonal
array of strength k is larger than or equal to $\sum_{i=0}^{\lfloor k/2 \rfloor} \binom{n}{i}$ (Rao [988]) and to $2^n \left( 1 - \frac{n}{2(k+1)} \right)$
(Friedman [520]). There exists a monograph on orthogonal arrays [591].
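The orthogonal array property and the two bounds can be illustrated on a tiny binary array. The sketch below checks the strength of an array given as a list of rows and evaluates the Rao and Friedman lower bounds as stated above; it assumes a binary alphabet, and the function names are illustrative.

```python
from collections import Counter
from itertools import combinations
from math import comb


def has_strength(rows, k):
    """True if every k-column restriction contains each binary k-tuple equally often."""
    n = len(rows[0])
    for columns in combinations(range(n), k):
        counts = Counter(tuple(row[c] for c in columns) for row in rows)
        if len(counts) != 2 ** k or len(set(counts.values())) != 1:
            return False
    return True


def rao_bound(n, k):
    """Lower bound sum_{i=0}^{floor(k/2)} C(n, i) on the number of rows."""
    return sum(comb(n, i) for i in range(k // 2 + 1))


def friedman_bound(n, k):
    """Lower bound 2^n (1 - n / (2 (k + 1))) on the number of rows."""
    return 2 ** n * (1 - n / (2 * (k + 1)))


# Support of x1 + x2 + x3 + 1: the even-weight vectors of length 3.
even_weight_rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

This array has strength 2 with only 4 rows, which here matches both bounds for n = 3 and k = 2.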
Correlation immunity is a criterion for the resistance to an attack on the combiner model
due to Siegenthaler, called correlation attack [1042]: if f is not t-th order correlation
immune, then there exists a correlation between the output of the function and (at most)
t coordinates of its input; if t is small, a divide-and-conquer attack uses this weakness for
attacking a system using f as a combining function; in the original attack by Siegenthaler,
all the possible initializations of the t LFSRs corresponding to these coordinates are tested
(in other words, an exhaustive search of the initializations of these specific LFSRs is
done); when we arrive to the correct initialization of these LFSRs, we observe a correlation
(before that, the correlation is negligible, as for random pairs of sequences); now that the
initializations of the t LFSRs are known, those of the remaining LFSRs can be found with
an independent exhaustive search (or by applying again the Siegenthaler attack if possible).

An additional condition It is shown in [187, 203] that, to make the correlation attack
on the combiner model with a t-resilient combining function as inefficient as possible, the
coefficient Wf (u) of the function has to be small for every vector u of Hamming weight
higher than but close to t. This condition is satisfied under the sufficient condition that the
function is highly nonlinear (i.e., has high nonlinearity). Hence we see that nonlinearity
plays a role with respect to this attack as well.

Characterization of correlation immunity and resiliency by the Walsh transform


Resiliency and correlation immunity have been nicely characterized by means of the
Fourier–Hadamard and Walsh transforms of f , first by S. Golomb in [549] (which is not
widely known) and later by Xiao and Massey in [1128]. We propose to call this the Golomb–
Xiao–Massey characterization:

Theorem 5 [549] Any n-variable Boolean function f is t-th order correlation immune if
and only if, for all $u \in \mathbb{F}_2^n$ such that $1 \leq w_H(u) \leq t$, we have $W_f(u) = 0$, i.e., $\hat{f}(u) = 0$.
And f is t-resilient if and only if $W_f(u) = 0$ for all $u \in \mathbb{F}_2^n$ such that $w_H(u) \leq t$.

Proof Let us prove the first assertion. The second is a direct consequence. By applying
the Poisson summation formula (2.39), page 58, to $\varphi = f_\chi$, $a = 0_n$ and $E_I = \{x \in
\mathbb{F}_2^n;\ x_i = 0,\ \forall i \notin I\}$, b ranging over $\mathbb{F}_2^n$, we obtain, since $E_I^{\perp} = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \in I\}$,
that f is t-th order correlation immune if and only if, for every I of size t, the value of
the sum $\sum_{u \in E_I} (-1)^{b \cdot u} W_f(u)$ is independent of b. If, for every nonzero u of weight at
most t, we have $W_f(u) = 0$ (that is, $\hat{f}(u) = 0$ according to Relation (2.32)), then the
sum $\sum_{u \in E_I} (-1)^{b \cdot u} W_f(u)$ is independent of b. Conversely, if this latter property is satisfied
for every I of size t, then since $\sum_{u \in E_I} (-1)^{b \cdot u} W_f(u)$ is the Fourier–Hadamard transform
of the function equal to $W_f(u)$ if $u \in E_I$ and to 0 otherwise, by the inverse Fourier–Hadamard
transform formula (2.42), we have $W_f(u) = 0$ for every nonzero u of weight at
most t.
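The Golomb–Xiao–Massey characterization gives a direct way to read the correlation immunity and resiliency orders off the Walsh spectrum. The following sketch assumes truth tables as 0/1 lists indexed by x (bit i of x is $x_{i+1}$); the function names, and the convention of returning None for an unbalanced function, are choices made for this illustration.

```python
def walsh_spectrum(table):
    """Fast Walsh-Hadamard transform of the sign function (-1)^f."""
    spectrum = [1 - 2 * bit for bit in table]
    half = 1
    while half < len(spectrum):
        for start in range(0, len(spectrum), 2 * half):
            for i in range(start, start + half):
                u, v = spectrum[i], spectrum[i + half]
                spectrum[i], spectrum[i + half] = u + v, u - v
        half *= 2
    return spectrum


def correlation_immunity_order(table):
    """Largest t such that W_f(u) = 0 whenever 1 <= w_H(u) <= t."""
    n = len(table).bit_length() - 1
    spectrum = walsh_spectrum(table)
    order = 0
    for t in range(1, n + 1):
        if any(spectrum[u] != 0 for u in range(1, len(table))
               if bin(u).count("1") == t):
            break
        order = t
    return order


def resiliency_order(table):
    """Correlation immunity order of a balanced function, or None if unbalanced."""
    if 2 * sum(table) != len(table):
        return None
    return correlation_immunity_order(table)


parity = [bin(x).count("1") & 1 for x in range(8)]  # x1 + x2 + x3
mixed = [(x & 1) ^ (((x >> 1) & 1) & ((x >> 2) & 1)) for x in range(8)]  # x1 + x2 x3
```

For the parity function the only nonzero Walsh value sits at the all-one vector, which gives resiliency order 2.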

Remark. For f balanced, there is another proof: we apply the second-order Poisson
formula (2.57) to $E = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \in I\}$, where I is any set of indices of size t; the
sum of E and $E^{\perp} = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \notin I\}$ is direct and equals $\mathbb{F}_2^n$; hence we can take
$E' = E^{\perp}$ and we get $\sum_{u \in E^{\perp}} W_f^2(u) = |E^{\perp}| \sum_{a \in E^{\perp}} F^2(h_a)$, where $h_a$ is the restriction
of f to a + E, that is, the restriction obtained by fixing the coordinates of x whose indices
belong to I to the corresponding coordinates of a. The number $F(h_a)$ is null if and only if
$h_a$ is balanced and clearly, all the numbers $F(h_a)$, $a \in E^{\perp}$, are null if and only if all the
numbers $W_f(u)$, $u \in E^{\perp}$, are null. Since this is valid for every multiindex I of size t, this
completes the proof.

Another characterization of correlation immune and resilient functions exists, by the
discrete Fourier transform $j \in \{0, \dots, 2^n - 1\} \mapsto \sum_{k=0}^{2^n - 1} (-1)^{f(k)} \xi^{-kj}$, where j and k
are identified respectively with their binary expansions and $\xi = e^{\frac{2\pi\sqrt{-1}}{2^n}}$; see [1110].
Theorem 5 directly implies the following corollary:

Corollary 6 Let f be any n-variable Boolean function and t ≤ n. Then f is t-th order
correlation immune if and only if its support, viewed as an unrestricted code, has dual
distance at least t + 1.

Proof Let C denote the support of f. The dual distance of C equals (by Definition 4, page
16) the number $\min\{w_H(u);\ u \neq 0_n,\ \sum_{x \in C} (-1)^{u \cdot x} = \hat{f}(u) \neq 0\}$.

See more in [422, 423] (see also in [828] a generalization of this result to arrays over
finite fields and other related nice results).
Hence, since the Hamming weight of a t-th order correlation immune function is by
definition divisible by $2^t$, the size of a code of dual distance d is divisible by $2^{d-1}$, as we
saw at page 17.

Automorphism group Contrary to the algebraic degree, to the nonlinearity and to
balancedness, the correlation immunity and resiliency orders are not affine invariants (they
are permutation invariants), except for the null order (and for the order n, but the set of
n-th order correlation immune functions is the set of constant functions and the set of
n-resilient functions is empty, because of Parseval’s relation (2.47), page 60). They are both
invariant under any translation $x \mapsto x + b$, according to Lemma 4 and Theorem 5. The
automorphism group of the set of t-resilient functions (that is, the group of all permutations
σ of $\mathbb{F}_2^n$ that preserve resiliency) and the orbits under its action have been studied
in [622].
The whole Chapter 7 is devoted to correlation immune and resilient functions.
Remark. An interesting question is, given a Boolean function (resp. a balanced Boolean
function) f , what is the best possible correlation immunity (resp. resiliency) order of the
Boolean functions affine equivalent to f ? Of course, the highest possible power of 2 dividing
wH (f ) plays a role, but the reply is not straightforward.

In the filter model


The divide-and-conquer method valid for the combiner model does not apply to the filter
model, since there is only one LFSR in this model. The condition of high-order resiliency is
then not needed. But a stronger condition than balancedness is also necessary in this model,
in order to avoid so-called distinguishing attacks. These attacks are able to distinguish
the pseudorandom sequence, say $(s_i)_{i \in \mathbb{N}}$, from a random sequence. A way of doing so is
to observe that the distribution of the sequences $(s_{i+\gamma_1}, \dots, s_{i+\gamma_n})$ is not uniform, where
$\gamma_1, \dots, \gamma_n$ are (for instance) the positions where the input bits to the filtering function are
chosen [20]. Golić [545] has observed that if the feedback polynomial of the LFSR is
primitive and if the filtering function has the form $g(x_1, \dots, x_{n-1}) \oplus x_n$ (up to a permutation
of variables), then the property of uniformity is satisfied whatever the tap positions are
(where the input bits to the filter function are taken). Canteaut [189] has proved that this
condition on the function is also necessary for having uniformity. For choosing a filtering
function, we can choose a function g satisfying the cryptographic criteria listed in the present
section, and use f defined by means of g in one of the two ways above. But better can be
done (see Subsection 9.1.6, page 343). More is said in [567] on the requirements for the
filter function.

3.1.5 Algebraic immunity and fast algebraic immunity


A new kind of attack, called algebraic attacks, was introduced in 2003 (see [388, 391, 497])
and has significantly changed the situation with Boolean functions in stream ciphers. These
attacks recover the secret key, or at least the initialization of the system, by solving a system
of multivariate algebraic equations.

Shannon’s criterion
The idea that the key bits in a cryptosystem can be characterized as the solutions of a
system of multivariate equations translating the specifications of the cryptosystem comes
from C. Shannon [1034]. Until the invention of algebraic attacks, this bright observation
led more to a design criterion (i.e., the system should not be solvable in reasonable time
with current means) than to an actual attack. Indeed, in practice, for cryptosystems that are
robust against the usual attacks (e.g., for stream ciphers resisting the Berlekamp–Massey
attack), this system is too complex to be solved (its equations being highly nonlinear and the
number of unknowns being too large for a nonlinear system of equations). However, in the
case of stream ciphers, we can get a very overdefined system (i.e., a system with a number of
linearly independent equations much larger than the number of unknowns). Let us consider
the combiner or the filter model, or any model with a linear part (the n LFSRs in the case of
the combiner model, the single LFSR in the case of the filter model) of size N filtered by an
n-variable Boolean function f . There exists a linear permutation $L : \mathbb{F}_2^N \to \mathbb{F}_2^N$ updating
the current state of the linear part into its next state⁷, and a linear function $L' : \mathbb{F}_2^N \to \mathbb{F}_2^n$
mapping the linear part to the n bits selected as input to f . Denoting by $(u_1, \dots, u_N)$ the
initialization of the linear part, the current state of the linear part at the ith clock cycle equals
$L^i(u_1, \dots, u_N)$. Denoting by $(s_i)_{i \ge 0}$ the pseudorandom sequence output by the generator,
we have for every i ≥ 0,
    $s_i = f(L' \circ L^i(u_1, \dots, u_N))$.   (3.5)
These equations all have the same degree $d_{alg}(f)$. The number of those that are exploitable
by the attacker equals the number of bits $s_i$ known by him/her, and can then be much larger
than the number of unknowns (but, of course, the larger the number of equations, the weaker
the attack). The system of these equations can then be greatly overdefined if necessary⁸. This
makes solving the system with Gröbner bases less complex (see [497]), and
even allows linearizing the system⁹ (i.e., obtaining a system of linear equations by replacing
every monomial of degree larger than 1 by a new unknown). The linear system obtained
after linearization has, however, too many unknowns: this number is roughly $\sum_{j=0}^{d_{alg}(f)} \binom{N}{j}$.
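
As an illustration of Relation (3.5), here is a minimal Python sketch of a toy filter generator. The register length N = 4, the feedback taps, the tap positions feeding f, and the filter function itself are hypothetical choices made only for this example. The function `lfsr_step` plays the role of the linear update and the selection of `positions` plays the role of the linear map feeding the n inputs of f. The helper `num_monomials` counts the monomials of degree at most d in N variables, i.e., roughly the number of unknowns after linearization.

```python
from math import comb

def lfsr_step(state, taps):
    """Linear update: shift the register and append the XOR of the tapped bits."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return state[1:] + [feedback]

def keystream(f, init, taps, positions, length):
    """Return s_0, ..., s_{length-1}, each s_i being f applied to selected state bits."""
    state = list(init)
    out = []
    for _ in range(length):
        out.append(f([state[p] for p in positions]))  # select the n inputs of f
        state = lfsr_step(state, taps)                # clock the linear part once
    return out

def num_monomials(N, d):
    """Number of monomials of degree at most d in N variables."""
    return sum(comb(N, j) for j in range(d + 1))

# Toy parameters (assumptions): recurrence u_{t+4} = u_t + u_{t+1}, i.e. x^4 + x + 1.
taps = [0, 1]
init = [1, 0, 0, 0]
f = lambda x: (x[0] & x[1]) ^ x[2]   # hypothetical filter of the form g(x1, x2) + x3
s = keystream(f, init, taps, positions=[0, 1, 3], length=30)
```

Each keystream bit gives one equation of degree $d_{alg}(f)$ in the unknown initial state; with this toy filter of degree 2, `num_monomials(4, 2)` counts the 11 monomials that linearization would treat as unknowns.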

Courtois’ and Meier’s improvement for stream ciphers


Courtois and Meier have had a simple but efficient idea. Assume that there exist functions
g ≠ 0 and h of low algebraic degrees (say, of algebraic degree at most d) such that fg = h
(where fg denotes the Hadamard product of f and g, whose support is the intersection of
the supports of f and g). For every i ≥ 0, Relation (3.5) implies
    $s_i\, g(L' \circ L^i(u_1, \dots, u_N)) = h(L' \circ L^i(u_1, \dots, u_N))$.   (3.6)
This equation in $u_1, \dots, u_N$ has degree at most d, since L and L' are linear, and the system
of equations obtained after linearization has then at most $\sum_{j=0}^{d} \binom{N}{j}$ unknowns and may
be solved by Gaussian elimination (if d is small enough) in $O\left(\left(\sum_{i=0}^{d} \binom{N}{i}\right)^{\omega}\right)$ operations,
where ω ≈ 3 is the exponent of the Gaussian reduction¹⁰. The attack needs about $\sum_{i=0}^{d} \binom{N}{i}$
bits of the keystream.


Low-degree relations have been shown to exist for several well-known constructions of
stream ciphers, which were immune to all previously known attacks. This was the case,
for instance, with functions whose ANFs had only few nonzero coefficients. Such functions
had been used as combining/filter functions for reasons of efficiency in the design of some
stream ciphers11 (e.g., LILI-128 and Toyocrypt stream ciphers; see the references in [842]).
7 In the filter model, the matrix of L is simply a companion matrix; in the combiner model, it is a slightly more
complex matrix having companion matrices around its diagonal and zeros elsewhere.
8 The probability that N random equations in N variables have rank N equals roughly 1/2 since the
determinant of this system lives in F2 .
9 The known algorithms are, starting from the simplest one, linearization, XL, Buchberger, F4, and F5 (by
Faugère); they have different complexities and do not need the same numbers of linearly independent
equations.
10 It can be taken equal to log2 7 ≈ 2.8 and the coefficient in the O can be taken equal to 7, according to Strassen
[1052]; a still better exponent is due to Coppersmith and Winograd, but the multiplicative constant is then
inefficiently high for our framework.
11 The designers of these stream ciphers had forgotten, at their own expense, the basic rule of choosing, for
cryptosystems, primitives behaving as randomly as possible.

Krause and Armknecht [28] extended algebraic attacks to combiners with memory. They
studied the algebraic equations satisfied by such combiners and proved an upper bound on
their possible degree by means of the input and memory sizes. Courtois [389] generalized
their results to multioutput functions.

Algebraic immunity
As observed in [391], if we know the existence of a nonzero low algebraic degree multiple
h of f , then the support of h being included in that of f , we have (f ⊕ 1)h = 0, and taking
g = h, we have the desired relation fg = h. But the existence of such multiple h of f is
only a sufficient condition for having relation fg = h. A necessary and sufficient condition
has been found in [842]:

Proposition 25 Let f be any n-variable Boolean function. The existence of functions g ≠ 0
and h, both of algebraic degree at most d, such that fg = h, is equivalent to the existence
of a function g ≠ 0 of algebraic degree at most d such that fg = 0 or (f ⊕ 1)g = 0.

Proof Equality fg = h implies $f^2 g = fh$, that is (since $f^2 = f$), f (g ⊕ h) = 0, which
gives the desired equality of the form fg = 0 (with g ≠ 0) if g ≠ h, by replacing g ⊕ h by
g; and if g = h, then fg = h is equivalent to (f ⊕ 1)g = 0. This proves the implication
from top to bottom. The converse is straightforward.

Note that Proposition 25 implies that the existence of a low algebraic degree nonzero
multiple of f or of f ⊕ 1 is a necessary and sufficient condition for the existence of low
algebraic degree g ≠ 0 and h such that fg = h (since being a multiple of f , resp. of f ⊕ 1,
is equivalent to having null product with f ⊕ 1, resp. with f ).

Definition 23 [842] Let f be any n-variable Boolean function. An n-variable Boolean
function g such that fg = 0 is called an annihilator of f .
The minimum algebraic degree of nonzero annihilators of f or f ⊕ 1, i.e., the minimum
algebraic degree of nonzero multiples of f ⊕ 1 or f , or equivalently, the minimal value d
such that there exist g ≠ 0 and h, both of algebraic degree at most d, such that fg = h, is
called the algebraic immunity of f and is denoted by AI(f ).

This notion has been generalized to functions over general finite fields in [52], with an
upper bound on it.

Remark. The set of all annihilators of function f is equal to the ideal of all the multiples
of f ⊕ 1.

Remark. Algebraic immunity also plays a role in computational complexity; see [770],
where a stronger notion is studied (for symmetric functions).

All of Chapter 9 is devoted to algebraic immunity.


92 Boolean functions, vectorial functions, and cryptography

Let g be a generic n-variable Boolean function of algebraic degree at most d. Let the ANF
of g equal $\sum_{I \subseteq \{1,\dots,n\};\, |I| \le d} a_I x^I$, where the coefficients $a_I$ can be any elements of F2 . Then
g is an annihilator of f if and only if f (x) = 1 implies g(x) = 0, that is, if and only if the
coefficients $a_I$ satisfy the system of homogeneous linear equations $\sum_{I \subseteq \{1,\dots,n\};\, |I| \le d} a_I u^I = 0$,
where u ranges over the support of f . In this system, we have $\sum_{i=0}^{d} \binom{n}{i}$ variables
(the coefficients of the monomials of degrees at most d) and $w_H(f)$ many equations.¹² We
shall denote by $M_{f,d}$ the matrix of this system.
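
The linear system above can be sketched in Python as follows. This is only a small illustrative implementation under stated assumptions: a function is given by its truth table `tt` (a list of 2^n bits indexed by the integer whose binary digits are the input bits), monomials are encoded as bitmasks, and the names `monomials`, `rank_f2`, `has_annihilator`, and `algebraic_immunity` are hypothetical helpers, not notation from the text. A nonzero annihilator of degree at most d exists exactly when the rank of the matrix of the system is smaller than its number of columns.

```python
from itertools import combinations

def monomials(n, d):
    """Bitmasks of all monomials x^I with |I| <= d."""
    return [sum(1 << i for i in I)
            for k in range(d + 1) for I in combinations(range(n), k)]

def rank_f2(rows):
    """Rank over F_2 of a matrix whose rows are Python ints (bit vectors)."""
    pivots = {}                      # leading bit position -> reduced row
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]         # cancel the leading bit and keep reducing
    return len(pivots)

def has_annihilator(tt, n, d):
    """True if f has a nonzero annihilator of algebraic degree at most d."""
    mons = monomials(n, d)
    rows = []
    for u in range(1 << n):
        if tt[u]:                    # one equation per point u of the support of f
            row = 0
            for j, m in enumerate(mons):
                if (u & m) == m:     # the monomial x^I takes value 1 at u
                    row |= 1 << j
            rows.append(row)
    return rank_f2(rows) < len(mons)

def algebraic_immunity(tt, n):
    """Least d such that f or f + 1 has a nonzero annihilator of degree <= d."""
    comp = [b ^ 1 for b in tt]
    for d in range(n + 1):
        if has_annihilator(tt, n, d) or has_annihilator(comp, n, d):
            return d

maj3 = [1 if bin(x).count("1") >= 2 else 0 for x in range(8)]
ai_maj3 = algebraic_immunity(maj3, 3)
```

For the 3-variable majority function this sketch should return 2: neither the function nor its complement has a nonzero affine annihilator.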
Algebraic immunity is an affine invariant but not an EA invariant. More precisely, its
automorphism group (that is, the group of all permutations σ of $\mathbb{F}_2^n$ such that AI(f ◦ σ) =
AI(f ) for every Boolean function f ) equals the general affine group (as for Reed–Muller
codes). Indeed, denoting by An(f ) the F2 -vector space of annihilators of f , we have
An(f ◦ σ) = An(f ) ◦ σ.
A strength of algebraic attacks comes from the fact that the algebraic degrees of g and h
can always be made lower than or equal to the Courtois–Meier bound $\lceil n/2 \rceil$:

Proposition 26 [391] The algebraic immunity of any n-variable Boolean function is
bounded above¹³ by $\lceil n/2 \rceil$ and by $d_{alg}(f)$.

Proof The number of monomials of algebraic degree at most $\lceil n/2 \rceil$ is strictly larger than
$2^{n-1}$. The disjoint union of the family of these monomials and of the family of the products
of f by these monomials has then size strictly larger than $2^n$, which is the dimension of
the F2 -vector space $\mathcal{BF}_n$. The functions in this disjoint union are then necessarily F2 -
linearly dependent. Given a linear combination equal to function 0 and having not all-zero
coefficients, let us gather separately the part dealing with the first family and the part dealing
with the second. This gives two functions h and g, both of degree at most $\lceil n/2 \rceil$, such that
h = fg and (g, h) ≠ (0, 0), i.e., g ≠ 0. This proves the first upper bound. The second
comes from the fact that f and f ⊕ 1 are annihilators of each other.

Remark. For n odd, according to Proposition 26 and since $2\sum_{i=0}^{\frac{n-1}{2}} \binom{n}{i} = 2^n$, we
have $AI(f) = \frac{n+1}{2}$ if and only if the family
$\{x^I f,\ |I| \le \frac{n-1}{2}\} \cup \{x^J (f \oplus 1),\ |J| \le \frac{n-1}{2}\}$
is a basis of the F2 -vector space $\mathcal{BF}_n$. Note that this leads to new codes: for
every $k \le \frac{n-1}{2}$, the code $C_{f,k}$ of length $2^n$ and dimension $2\sum_{i=0}^{k} \binom{n}{i}$ generated by
$\{x^I f,\ |I| \le k\} \cup \{x^J (f \oplus 1),\ |J| \le k\}$. For $k = \frac{n-1}{2}$, it equals the whole space $\mathcal{BF}_n$.
For k = 1, it equals the direct sum of the first-order Reed–Muller code punctured at
the positions in the support of f and of the first-order Reed–Muller code punctured at
the positions in the cosupport, and has minimum distance $\frac{1}{2}\, nl(f)$, since $w_H(fg)$ equals
$2^{n-1}$ if g = 1 (because f is balanced according to Relation (9.5), page 330) and
$w_H(fg) = \frac{w_H(f) + w_H(g) - w_H(f \oplus g)}{2} = \frac{2^n - w_H(f \oplus g)}{2} = \frac{w_H(f \oplus g \oplus 1)}{2}$ if g is affine nonconstant,

12 Those corresponding to u of small weights may be used to simplify those corresponding to u of larger weights
as shown in [27].
13 Consequently, it is bounded above by $\lceil k/2 \rceil$ if, up to affine equivalence, it depends only on k variables, and by
$\lceil k/2 \rceil + 1$ if it has a linear kernel (see below) of dimension n − k, since it is then equivalent, according to
Proposition 28, to a function in k variables plus an affine function.

and we have the same for $w_H((f \oplus 1)h)$. Note that the known functions f such that
$AI(f) = \frac{n+1}{2}$ have diverse nonlinearities.

Algebraic immunity of random functions


Random functions behave well with respect to algebraic immunity:¹⁴ it has been proved
in [437] (see a slightly more complete proof in [307] and its extension to vectorial
functions) that, for all a < 1, when n tends to infinity, AI(f ) is almost surely larger than
$\frac{n}{2} - \sqrt{\frac{n}{2} \ln\left(\frac{n}{2a \ln 2}\right)}$.

Consequences of the invention of algebraic attack on the design of stream ciphers


A difference of 1 in the algebraic immunity of a function f , used as combiner or filter in a
stream cipher, makes a big difference in the efficiency of algebraic attacks. The designer
needs then to choose f with optimal or near-optimal algebraic immunity. Let then an
n-variable function f , with algebraic immunity $\lceil n/2 \rceil$, be used, for instance, as filter on
an LFSR of length N ≥ 2k, where k is the length of the key (otherwise, it is known
that the system is not robust against an attack called time-memory-data trade-off attack).
Then the complexity of an algebraic attack using one annihilator of degree $\lceil n/2 \rceil$ is roughly
$7\left(\binom{N}{0} + \cdots + \binom{N}{\lceil n/2 \rceil}\right)^{\log_2 7} \approx 7\left(\binom{N}{0} + \cdots + \binom{N}{\lceil n/2 \rceil}\right)^{2.8}$ (see [391]). Let us choose k = 128
(which is usual) and N = 256; then it is for n ≥ 13 that the complexity of algebraic
attack is at least $2^{80}$ (which is considered nowadays as just enough); and it is larger than
the complexity of an exhaustive search, that is, $2^{128}$, for n ≥ 15. If the attacker knows
several linearly independent annihilators of degree $\lceil n/2 \rceil$, then the number of variables must
be increased. In practice, the number of variables will have to be near 20 (but this poses
then a problem of efficiency of the stream cipher). This has quite changed the situation
with Boolean functions at the beginning of this century, since before algebraic attacks, the
Boolean functions used had rarely more than 10 variables.
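
The estimate above can be evaluated numerically. The following Python sketch assumes only the rough cost formula quoted in the text, 7 times the number of monomials of degree at most ⌈n/2⌉ in N variables raised to the power log2 7; the function name `log2_algebraic_attack_cost` is a hypothetical helper introduced for this example.

```python
from math import comb, log2

def log2_algebraic_attack_cost(N, n):
    """Base-2 logarithm of 7 * (sum_{i <= ceil(n/2)} C(N, i)) ** log2(7)."""
    d = (n + 1) // 2                                  # ceil(n/2)
    unknowns = sum(comb(N, i) for i in range(d + 1))  # monomials of degree <= d
    return log2(7) * (1 + log2(unknowns))

# N = 256 corresponds to the choice k = 128 made in the text.
for n in (13, 14, 15):
    print(n, round(log2_algebraic_attack_cost(256, n), 1))
```

Printing the logarithm rather than the cost itself keeps the numbers readable and makes the comparison with the thresholds 80 and 128 immediate.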

Fast algebraic attack


A high value of AI(f ) is not even a sufficient property for resistance to algebraic attacks,
because other algebraic attacks were invented later. The fast algebraic attack (FAA)
is an improvement on the standard algebraic attack. It can work even if the algebraic
immunity of the function is large,¹⁵ provided that there exist n-variable Boolean functions g,
nonzero and of low algebraic degree, and h of reasonable algebraic degree (i.e., of algebraic
degree possibly larger than n/2 but significantly smaller than n) such that fg = h; see
[388]. This attack is based on the observation that it is possible to obtain a low-degree
equation from several ones of the form (3.6) by eliminating the large-degree terms in the
right-hand sides of these equations, and that such elimination may be made offline by
the attacker (that is, before the values $s_i$ are known to him/her) and therefore
benefit from a much longer computation time. The efficiency of the precomputation and

14 No result is known on the behavior of random functions against fast algebraic attacks.
15 Fast algebraic attack has worked on the eSTREAM [495] proposal SFINKS [390], while the cipher was
designed to withstand algebraic attack.

substitution steps has been improved by Hawkes and Rose [590] for the filter model
(allowing a complexity of
$O\left(\sum_{i=0}^{d_{alg}(h)} \binom{N}{i} \log_2^3 \sum_{i=0}^{d_{alg}(h)} \binom{N}{i} + \sum_{i=0}^{d_{alg}(h)} \binom{N}{i}\, N \log_2^2 N\right)$
operations, needing $2\sum_{i=0}^{d_{alg}(g)} \binom{N}{i}$ bits of stream for the former, and an online complexity
of $O\left(\left(\sum_{i=0}^{d_{alg}(g)} \binom{N}{i}\right)^3 + 2\sum_{i=0}^{d_{alg}(g)} \binom{N}{i} \sum_{i=0}^{d_{alg}(h)} \binom{N}{i} \log_2 \sum_{i=0}^{d_{alg}(h)} \binom{N}{i}\right)$ operations)
and by Armknecht [25] for the combiner model, also when they are made more complex
by the introduction of memory. Fast algebraic attacks need more data than standard ones
(since several values $s_i$ need to be known to obtain one equation), but may also be faster.
Armknecht and Ars [26] introduced a variant of the FAA that reduced the data complexity
(but not the time complexity).

On the existence of g and h Given nonnegative integers d and e such that e + d ≥ n, the
number of monomials of degrees at most e and the number of monomials of degrees at most
d have a sum strictly larger than $2^n$, and there exist¹⁶ then g ≠ 0 of algebraic degree at most
e and h of algebraic degree at most d such that fg = h. An n-variable Boolean function
f is then optimal with respect to fast algebraic attacks if there do not exist two functions
g ≠ 0 and h such that fg = h, $d_{alg}(g) < \lceil n/2 \rceil$ and $d_{alg}(g) + d_{alg}(h) < n$. Since fg = h
implies $fh = f^2 g = fg = h$, we see that h is then an annihilator of f ⊕ 1, and if h ≠ 0,
its algebraic degree is then at least equal to the algebraic immunity of f .
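
The existence argument above is again a linear algebra statement, which the following Python sketch makes explicit for small n. It is an illustration under stated assumptions: the function is given by its truth table `tt` (indexed by the integer encoding the input), and `masks_up_to`, `rank_f2`, and `has_relation` are hypothetical helper names. The unknowns are the ANF coefficients of g (degree at most e) and of h (degree at most d); each input x contributes the equation f(x)g(x) ⊕ h(x) = 0, and any nonzero solution has g ≠ 0, since g = 0 forces h = 0.

```python
from itertools import combinations

def masks_up_to(n, d):
    """Bitmasks of all monomials of degree at most d in n variables."""
    return [sum(1 << i for i in I)
            for k in range(d + 1) for I in combinations(range(n), k)]

def rank_f2(rows):
    """Rank over F_2 of a matrix whose rows are Python ints."""
    pivots = {}
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]
    return len(pivots)

def has_relation(tt, n, e, d):
    """True if some g != 0 (degree <= e) and h (degree <= d) satisfy f g = h."""
    g_mons, h_mons = masks_up_to(n, e), masks_up_to(n, d)
    rows = []
    for x in range(1 << n):
        row = 0
        for j, m in enumerate(g_mons):
            if tt[x] and (x & m) == m:      # coefficient of g, weighted by f(x)
                row |= 1 << j
        for j, m in enumerate(h_mons):
            if (x & m) == m:                # coefficient of h
                row |= 1 << (len(g_mons) + j)
        rows.append(row)
    return rank_f2(rows) < len(g_mons) + len(h_mons)

maj3 = [1 if bin(x).count("1") >= 2 else 0 for x in range(8)]
always = has_relation(maj3, 3, 1, 2)        # e + d >= n: a relation must exist
```

When e + d ≥ n the number of unknowns exceeds the number 2^n of equations, so the function returns True whatever f is; for e + d < n the answer depends on f.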

Complexity of the attack and related parameters on Boolean functions The complexity
of FAA is roughly of the order (see [590])
    $O\left(\min\left\{ N^{\max[d_{alg}(g) + d_{alg}(fg),\ 3 d_{alg}(g)]},\ g \ne 0 \right\}\right)$.

It can be seen that FAA with g = 1 is less efficient than the Rønjom–Helleseth attack (see
below) and that FAA with $d_{alg}(g) \ge AI(f)$ is in fact the algebraic attack. This has led in
[324] to studying the so-called fast algebraic complexity:
    $FAC(f) = \min\left\{ \max\left[ d_{alg}(g) + d_{alg}(fg),\ 3 d_{alg}(g) \right];\ 1 \le d_{alg}(g) < AI(f) \right\}$,
whose value is invariant by changing f into f ⊕ 1, and is bounded above by n and below by
the so-called fast algebraic immunity:
    $FAI(f) = \min\left\{ 2AI(f),\ \min\left\{ d_{alg}(g) + d_{alg}(fg);\ 1 \le d_{alg}(g) < AI(f) \right\} \right\}$,
which had been informally introduced in a preliminary version of the paper [791] and used
in [324, 870, 1106]. Note that FAI is also invariant by changing f into f ⊕ 1, and is easier
to study. If this latter parameter is close to n, then FAC is too, and the function provides then
a good resistance to FAA.

Remark. Since, for the resistance against FAA, there must not exist g ≠ 0 such that
$d_{alg}(g)$ is small and $d_{alg}(fg)$ is reasonably large, then if $d_{alg}(f)$ is not large, f does not
resist FAA. Because of the Siegenthaler bound (see Proposition 117, page 285) and of the

16 We do not require here that fg ≠ 0; if such a requirement is imposed, the result is no longer true, as observed
by Gong [553].

fact that functions in the combiner model must be correlation immune, the combiner model
cannot be used nowadays without extra protections.

Other algebraic attacks


Algebraic attack on the augmented function Considering now f as a function in N
variables, to simplify description, this attack due to [509] works with the vectorial function
F (x), whose output equals the vector $(f(x), f(L(x)), \dots, f(L^{m-1}(x)))$, where L is the
(linear) update function of the linear part of the generator. This attack can be more efficient
than the standard algebraic attack. But the efficiency of the attack not only depends on
the function f ; it also depends on the update function (and naturally also on the choice
of m), since for two different update functions L and L', the vectorial functions F (x) and
$F'(x) = (f(x), f(L'(x)), \dots, f(L'^{\,m-1}(x)))$ are not linearly equivalent; they are not even
CCZ equivalent in general. The resistance to this attack is then more a matter of the pair
(f , L) than of the single function f .

The Rønjom–Helleseth attack This attack, introduced in [1003] and improved in [556,
600, 1001, 1002, 1004], also adapts the idea of algebraic attacks due to Shannon, but in a
different way. An LFSR with a primitive retroaction polynomial (or equivalently a primitive
characteristic polynomial) generates a sequence of the form $u_i = tr_N(\lambda \alpha^i)$, where α is
a primitive element of $\mathbb{F}_{2^N}$. Essentially, an LFSR generates the field $\mathbb{F}_{2^N}^*$ and a classical
filter generator keystream sequence is formed by applying a Boolean function in n variables
to n of the N bits of the coefficient vector of the element $u_i$. Rønjom and Helleseth then
observe that the coefficients in front of a particular monomial in the sequence of multivariate
equations expressing the keystream bits form a so-called coefficient sequence that inherits
highly structural finite field properties from the LFSR. In particular, from this observation
they gain fine-grained control over the linear dependencies in the multivariate equation
system, which enables very efficient reductions. They take advantage of this by proposing
an attack whose computational complexity is about $\sum_{i=0}^{d} \binom{N}{i}$ operations, where d is
the algebraic degree of the filter function and N is the size of the LFSR (rather than
$O\left(\left(\sum_{i=0}^{AI(f)} \binom{N}{i}\right)^{\omega}\right)$ in the case of standard algebraic attack, where AI(f ) is the algebraic
immunity of the filter function and ω ≈ 3 is the exponent of the Gaussian reduction). It needs
about $\sum_{i=0}^{d} \binom{N}{i}$ consecutive bits of the keystream output by the pseudorandom generator
rather than $\sum_{i=0}^{AI(f)} \binom{N}{i}$. Since d is supposed to be close to the number n of variables of
the filter function, the number $\sum_{i=0}^{d} \binom{N}{i}$ is comparable to $\binom{N}{n}$. Since AI(f ) is supposed
to be close to $\lceil n/2 \rceil$, we can see that denoting by C the complexity of the Courtois–Meier
attack and by C' the amount of data it needs, the complexity of the Rønjom–Helleseth attack
roughly equals $C^{2/3}$ and the amount of data it needs is roughly $C'^{\,2}$. From the viewpoint of
complexity, it is more efficient, and from the viewpoint of data it is less efficient.
It was later observed (see [556, 600, 1001]) that the multivariate representation essentially
hides away more of the underlying finite field structure stemming from the LFSR, and that
it follows straightforwardly from a univariate representation that the equation systems are
cyclic Vandermonde type. In particular, in the univariate representation one has even more
complete control over the dependencies of each coefficient and more freedom in comparison

to the multivariate case. Here the keystream sequence is simply viewed as $a_i = P(u_i)$, where
P is a univariate polynomial over $\mathbb{F}_{2^N}$. Then [556] introduced a parameter on sequences,
called spectral immunity, an analogue to the algebraic immunity, but related to the approach
of the Rønjom–Helleseth attack and to its improvements (in particular, the so-called selective
discrete Fourier transform (DFT) attack, which multiplies the portion of known keystream
by a sequence of smaller linear complexity, and which possibly results in a more efficient
attack than FAA, or is able to work when the number of known consecutive bits of the
keystream is too small for FAA). The spectral immunity SI(s) of a binary sequence s
is the lowest linear complexity of a nonzero binary annihilator of s (i.e., a binary sequence a
satisfying as = 0). In terms of univariate polynomials, the spectral immunity is equal to the
minimal weight of a multiple of P or P + 1 over $\mathbb{F}_{2^N}$, thus linking security directly to
the minimum distance of the associated algebraic codes defined by the univariate filter
polynomials. Moving to a univariate representation over finite fields seems to be a more
natural representation for this type of generator. For instance, it has been an open question
in [188] whether the irregular equation systems resulting from an annihilator attack on the
filter generator have full rank. As observed in [1001], from the univariate representation, this
directly translates to a question about the singularity of generalized Vandermonde matrices
over finite fields, which has already been solved by Shparlinski [1038] (most of such
matrices have full rank). It has been shown that univariate cryptanalysis becomes particularly
effective in practice in comparison to multivariate attacks when the LFSR is defined over
larger fields (i.e., word-based stream ciphers); see, for instance, [1001]. Although filter
generators are usually building blocks in more complex designs, the technique has been
used to practically break several ciphers, including a large part of the Welch–Gong family of
generators and the recent Keccak/Farfalle-based pseudorandom function Kravatte [342]. It
is an open problem how this change of representation can be used to also improve algebraic
attacks on ciphers such as SNOW-3G, which use word-based LFSRs as components in more
complex designs.

3.1.6 Variants to these criteria in relationship with guess and determine attacks
The guess and determine attacks make hypotheses on the values of some bits or some linear
combinations of bits in the data processed by the stream cipher. Given the complexity, say
C, of the attack when the hypothesis is satisfied, the global complexity of the attack is
obtained by dividing C by the probability that the hypothesis is satisfied. There is then
a trade-off to be found between this probability and C. In such a framework, the input
to the Boolean function at one moment in the process belongs, in the simplest case, to
an affine subspace of $\mathbb{F}_2^n$ (which may be a different one at each moment). For a given
Boolean function f to be used as combiner or filter function, all the criteria introduced
in the previous subsections then also need to be studied for the restrictions of f to such
affine spaces. It is difficult to say in general which affine spaces exactly are concerned
and, as in the case of attacks on the augmented function, such study is hardly viewed as
a study of the single Boolean function, except in particular cases. It depends on the whole
cryptosystem. We shall see in Section 12.2 another case where functions need to be studied
on subsets of $\mathbb{F}_2^n$ (which are no longer affine spaces but sets of vectors of fixed Hamming
weights).

3.1.7 Avalanche criteria, nonexistence of nonzero linear structure, correlation with subsets of indices

Strict avalanche criterion, propagation criterion, and global avalanche criteria
The strict avalanche criterion (SAC) has been introduced by Webster and Tavares [1116]
and this concept was generalized into the propagation criterion (P C) by Preneel et al. [970]
(see also [969]). The SAC and its generalizations are based on the properties of the
derivatives of Boolean functions. These properties describe the behavior of a function
whenever some coordinates of the input are complemented. Thus, they are related to the
property of diffusion of the cryptosystems using the function. They concern more the
Boolean functions involved in block ciphers.

Definition 24 Let f be a Boolean function on $\mathbb{F}_2^n$ and E a subset of $\mathbb{F}_2^n$. Function f satisfies
the PC with respect to E if, for all a ∈ E, the derivative $D_a f(x) = f(x) \oplus f(a + x)$ is
balanced. It satisfies PC(l) if it satisfies PC with respect to the set of all nonzero vectors of
weight at most l. In other words, f satisfies PC(l) if the autocorrelation coefficient $\mathcal{F}(D_a f)$
is null for every $a \in \mathbb{F}_2^n$ such that $1 \le w_H(a) \le l$. Criterion SAC corresponds to PC(1).
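
A direct way to test this definition on small examples is sketched below in Python. The truth-table representation (a list `tt` of 2^n bits indexed by the integer encoding the input) and the helper names `autocorrelation`, `satisfies_pc`, and `satisfies_sac` are assumptions made for the illustration.

```python
def autocorrelation(tt, n, a):
    """Sum over x of (-1)^(f(x) + f(x + a)), i.e. the coefficient attached to D_a f."""
    return sum(1 - 2 * (tt[x] ^ tt[x ^ a]) for x in range(1 << n))

def satisfies_pc(tt, n, l):
    """True if D_a f is balanced for every a of Hamming weight between 1 and l."""
    return all(autocorrelation(tt, n, a) == 0
               for a in range(1, 1 << n)
               if bin(a).count("1") <= l)

def satisfies_sac(tt, n):
    """The strict avalanche criterion is PC(1)."""
    return satisfies_pc(tt, n, 1)

# Example: the quadratic function x1 x2 + x3 x4 in 4 variables.
quad4 = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
         for x in range(16)]
pc_all = satisfies_pc(quad4, 4, 4)
```

For this quadratic example every derivative in a nonzero direction is a nonconstant affine function, hence balanced, so the check succeeds for every l up to 4; a linear function, whose derivatives are constant, fails already for l = 1.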

Some cryptographic applications require Boolean functions that still satisfy PC(l) when a
certain number k of coordinates of the input x are kept constant (whatever these coordinates
are and whatever are the constant values chosen for them). We say that such functions
satisfy the PC(l) of order k. This notion, introduced in [970], is a generalization of the
strict avalanche criterion of order k, SAC(k) (which is equivalent to PC(1) of order k),
introduced in [516]. Obviously, if a function f satisfies PC(l) of order k ≤ n − l, then it
satisfies PC(l) of order k' for any k' ≤ k.
There exists another notion, which is similar to PC(l) of order k, but stronger [968, 970]
(see also [219]): a Boolean function satisfies the extended propagation criterion EPC(l) of
order k if every derivative $D_a f$, with $a \ne 0_n$ of weight at most l, is k-resilient.
These parameters are not affine invariants.
A weakened version of the PC criterion has been studied in [721].

Global avalanche criteria: sum-of-squares and absolute indicators The second
moment of the autocorrelation coefficients

    $V(f) = \sum_{b \in \mathbb{F}_2^n} \mathcal{F}^2(D_b f)$   (3.7)

has been introduced by Zhang and Zheng [1167] for measuring the global avalanche
criterion (GAC), and is also called the sum-of-squares indicator. The absolute indicator
$\Delta_f = \max_{b \in \mathbb{F}_2^n,\, b \ne 0_n} |\mathcal{F}(D_b f)|$ is the other global avalanche criterion. Functions with
high absolute indicator are weak against cube attacks [465]. Both indicators are clearly affine
invariants. In order to achieve good diffusion, cryptographic functions should have low
sum-of-squares indicators and absolute indicators. Obviously, we have $V(f) \ge 2^{2n}$, since
$\mathcal{F}^2(D_0 f) = 2^{2n}$. Note that every lower bound of the form V (f ) ≥ V straightforwardly
implies that the absolute indicator is bounded below by $\sqrt{\frac{V - 2^{2n}}{2^n - 1}}$. The functions achieving
$V(f) = 2^{2n}$ are those functions whose derivatives $D_b f(x)$, $b \ne 0_n$, are all balanced. We
shall see in Chapter 6 that these are the bent functions, which are unbalanced. In [1180] and
references therein are studied the balanced functions with minimal sum-of-squares indicator
$2^{2n} + 2^{n+3}$.
If f has a k-dimensional linear kernel $\{e \in \mathbb{F}_2^n;\ D_e f = cst\}$ (see the next subsection),
then
    $V(f) \ge 2^{2n+k}$   (3.8)
(with equality if and only if f is partially-bent; see page 256).
Note that, according to Relation (2.55), page 62, applied to $D_b f$ for every b, we have
    $V(f) = \sum_{a,b \in \mathbb{F}_2^n} \mathcal{F}(D_a D_b f)$,   (3.9)
where $D_a D_b f(x) = f(x) \oplus f(x + a) \oplus f(x + b) \oplus f(x + a + b)$ is the second-order
derivative of f .
Note also that, according to Relation (2.45), page 60 (expressing the convolutional
product of Fourier–Hadamard transforms), applied to $\varphi(b) = \psi(b) = \mathcal{F}(D_b f)$, and using
that, according to Relation (2.53), the Fourier–Hadamard transform of ϕ equals $W_f^2$, we have
for any n-variable Boolean function f
    $\forall a \in \mathbb{F}_2^n,\quad \sum_{b \in \mathbb{F}_2^n} W_f^2(b)\, W_f^2(a + b) = 2^n \sum_{b \in \mathbb{F}_2^n} \mathcal{F}^2(D_b f)\, (-1)^{b \cdot a}$,
and thus, for $a = 0_n$:
    $\sum_{b \in \mathbb{F}_2^n} W_f^4(b) = 2^n V(f)$,   (3.10)
as observed in [191].
We have: $\sum_{b \in \mathbb{F}_2^n} W_f^4(b) \le \left(\sum_{b \in \mathbb{F}_2^n} W_f^2(b)\right) \max_{b \in \mathbb{F}_2^n} W_f^2(b) = 2^{2n} \max_{b \in \mathbb{F}_2^n} W_f^2(b)$ (according to
Parseval’s relation (2.47), page 60), and we deduce, using Relation (3.10) and inequality
$V(f) \ge 2^{2n}$: $\max_{b \in \mathbb{F}_2^n} W_f^2(b) \ge \frac{V(f)}{2^n} \ge \sqrt{V(f)}$; thus, according to Relation (3.1), page 79,
relating the nonlinearity to the Walsh transform, we have (as first shown in [1169, 1173]):

Proposition 27 For every n-variable Boolean function f , we have
    $nl(f) \le 2^{n-1} - 2^{-\frac{n}{2}-1} \sqrt{V(f)} \le 2^{n-1} - \frac{1}{2} \sqrt[4]{V(f)}$,
with equality on the left-hand side if and only if f is plateaued (see Definition 63, page 258),
in which case $V(f) = 2^n \lambda^2$, where λ is the amplitude.

Denoting by $N_{W_f}$ the cardinality of the support $\{a \in \mathbb{F}_2^n;\ W_f(a) \ne 0\}$ of the Walsh
transform of f , Relation (3.10) also implies the following relation, first observed in [1173]:
$V(f) \times N_{W_f} \ge 2^{3n}$. Indeed, using for instance the Cauchy–Schwarz inequality, we see that
$\left(\sum_{a \in \mathbb{F}_2^n} W_f^2(a)\right)^2 \le \sum_{a \in \mathbb{F}_2^n} W_f^4(a) \times N_{W_f}$, and we have $\sum_{a \in \mathbb{F}_2^n} W_f^2(a) = 2^{2n}$, according
to Parseval’s relation.
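
Relation (3.10), Proposition 27, and the inequality on the size of the Walsh support can be checked numerically on small functions. The Python sketch below is illustrative only: functions are given by truth tables, the Walsh transform is computed by the standard fast butterfly, and `walsh_transform`, `sum_of_squares`, and `check_relations` are hypothetical helper names. It assumes n ≥ 1 so that all Walsh values are even.

```python
from math import sqrt

def walsh_transform(tt):
    """Walsh transform of f: W_f(a) = sum over x of (-1)^(f(x) + a.x)."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def sum_of_squares(tt):
    """V(f): sum over b of the squared autocorrelation coefficient of D_b f."""
    size = len(tt)
    return sum(sum(1 - 2 * (tt[x] ^ tt[x ^ b]) for x in range(size)) ** 2
               for b in range(size))

def check_relations(tt, n):
    """Check (3.10), both bounds of Proposition 27, and V(f) * N_Wf >= 2^(3n)."""
    w = walsh_transform(tt)
    v = sum_of_squares(tt)
    nl = 2 ** (n - 1) - max(abs(c) for c in w) // 2
    support_size = sum(1 for c in w if c != 0)
    eps = 1e-9                                   # tolerance for floating point
    return (sum(c ** 4 for c in w) == 2 ** n * v
            and nl <= 2 ** (n - 1) - 2 ** (-n / 2 - 1) * sqrt(v) + eps
            and nl <= 2 ** (n - 1) - 0.5 * v ** 0.25 + eps
            and v * support_size >= 2 ** (3 * n))

quad4 = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
         for x in range(16)]
ok = check_relations(quad4, 4)
```

The quadratic example x1 x2 ⊕ x3 x4 is bent, so it reaches the minimum value 2^(2n) = 256 of the sum-of-squares indicator and both bounds of Proposition 27 are tight for it.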

According to the observations made above and below Proposition 27, the functions that
satisfy $nl(f) = 2^{n-1} - 2^{-\frac{n}{2}-1} \sqrt{V(f)}$ (resp. $V(f) \times N_{W_f} = 2^{3n}$) are the functions whose
Walsh transforms take one nonzero absolute value (i.e., are plateaued), and the functions
satisfying $nl(f) = 2^{n-1} - \frac{1}{2} \sqrt[4]{V(f)}$ are the bent functions.
Constructions of balanced Boolean functions with low absolute indicators and high
nonlinearities have been studied in [813, 1072].

Remark. Zhang and Zheng conjectured that the absolute indicator of any balanced
Boolean function of algebraic degree at least 3 is lower-bounded by $2^{\lfloor \frac{n+1}{2} \rfloor}$, but
counterexamples were found by many people (Maitra-Sarkar, Burnett et al., Gangopadhyay-Keskar-
Maitra, Kavut).

Remark. A related but different parameter is $\max_{a \in \mathbb{F}_2^n,\, a \ne 0_n} \Delta_f(a)$ (recall that
$\Delta_f(a) = \sum_{x \in \mathbb{F}_2^n} (-1)^{D_a f(x)}$ is the autocorrelation function), without absolute value. It has appeared
recently in the framework of side-channel attacks (see Section 12.1).

Nonexistence of nonzero linear structure


The set of linear structures of a Boolean function plays a role in its study, particularly when
the function is a quadratic function (see Section 5.2).

Definition 25 The linear kernel of an n-variable Boolean function f is the set of those
vectors e such that $D_e f$ is a constant function. It is denoted by $E_f$. Any element of the linear
kernel is called a linear structure of f (see footnote 17).

More generally, a linear structure e of a vectorial function F is such that $D_e F$ equals
a constant function. Since for every n-variable Boolean function f (more generally, any
vectorial function) and any $a, b \in \mathbb{F}_2^n$, we have $D_a f(x) \oplus D_b f(x) = f(x + a) \oplus f(x + b) =
D_{a+b} f(x + a)$, the linear kernel of any Boolean function is an $\mathbb{F}_2$-subspace of $\mathbb{F}_2^n$. Moreover,
the restriction of f to its linear kernel is affine since its derivatives are all constant. More
generally, for every r ≤ n, the set of those $e \in \mathbb{F}_2^n$ such that $D_e f$ has algebraic degree at
most r is a vector space, and the restriction of f to this vector space has algebraic degree at
most r + 1.
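The linear kernel of a small function can be computed directly from the definition. The sketch below is an illustration under assumptions of this example only: functions are given as truth tables indexed by the integer encoding of x (bit i of x is the coordinate x_{i+1}), and the names `derivative`, `linear_kernel`, and `is_subspace` are hypothetical helpers.

```python
def derivative(tt, n, e):
    """Truth table of D_e f(x) = f(x) + f(x + e)."""
    return [tt[x] ^ tt[x ^ e] for x in range(2 ** n)]

def linear_kernel(tt, n):
    """All vectors e (as integers) such that D_e f is a constant function."""
    return [e for e in range(2 ** n) if len(set(derivative(tt, n, e))) == 1]

def is_subspace(vectors):
    """Check closure under addition in F_2^n (XOR of the integer encodings)."""
    s = set(vectors)
    return 0 in s and all(a ^ b in s for a in s for b in s)

# f3 = x1x2 + x3 on 3 variables, f4 = x1x2 + x3 + x4 on 4 variables.
f3 = [(x & 1) * (x >> 1 & 1) ^ (x >> 2 & 1) for x in range(8)]
f4 = [(x & 1) * (x >> 1 & 1) ^ (x >> 2 & 1) ^ (x >> 3 & 1) for x in range(16)]

print(linear_kernel(f3, 3), is_subspace(linear_kernel(f3, 3)))
print(linear_kernel(f4, 4), is_subspace(linear_kernel(f4, 4)))
```

In both examples the kernel is spanned by the coordinates that occur only linearly, in agreement with the subspace property stated above.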
Nonlinear cryptographic functions used in block ciphers should have no nonzero linear
structure (see [496]). The existence of nonzero linear structures, for the functions imple-
mented in stream ciphers, is a potential risk and is better avoided.

Proposition 28 Any n-variable Boolean function $f(x_1, \dots, x_n)$ has a nonzero linear
structure if and only if it is linearly equivalent to a function of the form

$$g(x_1, \dots, x_{n-1}) \oplus \epsilon\, x_n, \qquad (3.11)$$

where $\epsilon \in \mathbb{F}_2$. More generally, the linear kernel of f has dimension at least k if and only if
f is linearly equivalent to a function of the form

$$g(x_1, \dots, x_{n-k}) \oplus \epsilon_{n-k+1} x_{n-k+1} \oplus \cdots \oplus \epsilon_n x_n, \qquad (3.12)$$

where $\epsilon_{n-k+1}, \dots, \epsilon_n \in \mathbb{F}_2$.

17 We also call linear structure a pair $(a, b) \in \mathbb{F}_2^n \times \mathbb{F}_2$ such that $D_a f$ equals the constant function b.

Proof The conditions are clearly sufficient. Conversely, let f have a nonzero linear
structure e; then, by composing f on the right by a linear automorphism L on $\mathbb{F}_2^n$ such that
$L(0, \dots, 0, 1) = e$, we have $D_{(0,\dots,0,1)}(f \circ L)(x) = f \circ L(x) \oplus f \circ L(x + (0, \dots, 0, 1)) =
f \circ L(x) \oplus f(L(x) + e) = D_e f(L(x))$. And it is easily seen that, $D_{(0,\dots,0,1)}(f \circ L)$ being
then constant, $f \circ L$ has the form $g(x_1, \dots, x_{n-1}) \oplus \epsilon\, x_n$. The case of dimension k is
similar.

Note that, according to Proposition 28, if f admits a nonzero linear structure, then
since nonlinearity is an EA invariant, nl(f) equals the nonlinearity of the function given by (3.11),
viewed as an n-variable function, which equals 2 nl(g), where g is now viewed as an
(n − 1)-variable function. Hence, according to the covering radius bound (3.2), page 80, applied
to this (n − 1)-variable function, nl(f) is bounded above by the bent concatenation
bound $2^{n-1} - 2^{\frac{n-1}{2}}$. This implies that the functions achieving strictly larger nonlinearities
(obtained by Patterson and Wiedemann and by Kavut et al.; see Section 3.1.3) cannot have
any nonzero linear structure.
Similarly, if k is the dimension of the linear kernel of f, we have that $nl(f) \le 2^{n-1} - 2^{\frac{n+k-2}{2}}$
as seen in [190], since $nl(f) = 2^k\, nl(g)$, where g is the (n − k)-variable function
given in (3.12) and according to the covering radius bound applied on g with n − k in the
place of n.
Another characterization of linear structures is by the Walsh transform [486, 736] (see
also [193]). In the next proposition, we separate the case where the linear structure e is such
that De f is the null function and the case where it is function 1.

Proposition 29 Let f be any n-variable Boolean function. The derivative $D_e f$ equals the
null function (resp. function 1) if and only if the support $\mathrm{supp}(W_f) = \{u \in \mathbb{F}_2^n;\ W_f(u) \neq 0\}$
of $W_f$ is included in $\{0_n, e\}^\perp$ (resp. in its complement).

Proof Relation (2.56), page 62, with $b = 0_n$ and $E = \{0_n, e\}^\perp$, gives the equality

$$\sum_{u \in a+E} W_f^2(u) = 2^{n-1} \left( 2^n + (-1)^{a \cdot e}\, \mathcal{F}(D_e f) \right). \qquad (3.13)$$

If $D_e f$ is null, then let us fix a such that a · e = 1 and if $D_e f = 1$, then let us fix it such
that a · e = 0. Then $W_f(u)$ is null for every u ∈ a + E, according to Relation (3.13). This
proves the implication from top to bottom. The converse is straightforward.
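Proposition 29 can be verified exhaustively for very small n. The following Python sketch is an illustration only; the names `dot`, `walsh`, and `check_proposition_29`, and the encoding of a function as the list of its values indexed by the integer x, are assumptions of this example.

```python
def dot(u, x):
    """Inner product u.x in F_2 of two vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def walsh(tt, n):
    """Walsh transform W_f(u) = sum over x of (-1)^(f(x) + u.x)."""
    return [sum((-1) ** (tt[x] ^ dot(u, x)) for x in range(2 ** n))
            for u in range(2 ** n)]

def check_proposition_29(n):
    """Test both equivalences of Proposition 29 for every n-variable function."""
    for mask in range(2 ** 2 ** n):
        tt = [mask >> x & 1 for x in range(2 ** n)]
        support = [u for u, w in enumerate(walsh(tt, n)) if w != 0]
        for e in range(1, 2 ** n):
            values = {tt[x] ^ tt[x ^ e] for x in range(2 ** n)}
            in_orthogonal = all(dot(u, e) == 0 for u in support)
            in_complement = all(dot(u, e) == 1 for u in support)
            if (values == {0}) != in_orthogonal:
                return False
            if (values == {1}) != in_complement:
                return False
    return True

print(check_proposition_29(2), check_proposition_29(3))
```

The exhaustive loop is only practical for n up to 3 or 4, since the number of functions grows doubly exponentially.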

Notice that, if De f is the constant function 1 for some e ∈ Fn2 , then f is balanced
(indeed, the relation f (x + e) = f (x) ⊕ 1 implies that f takes the values 0 and 1 equally
often). Thus, a nonbalanced function f has no nonzero linear structure if and only if there
is no nonzero vector e such that De f is null. According to Proposition 29, we deduce the
following corollary:

Corollary 7 Any nonbalanced function f has no nonzero linear structure if and only if the
support of its Walsh transform has rank n.

A similar characterization exists for balanced functions by replacing the function f (x) by
a nonbalanced function f (x) ⊕ b · x. It is deduced in [354] (see more in [1082]) that resilient
functions of high orders must have linear structures.

Distance to linear structures The dimension of the linear kernel is an affine invariant.
Hence, so is the criterion of nonexistence of nonzero linear structure. But, contrary to the
criteria viewed before it, it is an all-or-nothing criterion. Meier and Staffelbach introduced
in [844] a related criterion, leading to a characteristic (that is, a criterion that can be satisfied
at levels quantified by numbers): a Boolean function on Fn2 being given, its distance to
linear structures is its distance to the set of all Boolean functions admitting nonzero linear
structures, among which we have all affine functions (hence, this distance is bounded above
by the nonlinearity) but also other functions, such as all nonbent quadratic functions.

Proposition 30 [844] The distance to linear structures of any n-variable Boolean function
f equals $2^{n-2} - \frac{1}{4} \max_{e \in \mathbb{F}_2^n \setminus \{0_n\}} |\mathcal{F}(D_e f)|$.

Proof Given e in $\mathbb{F}_2^n \setminus \{0_n\}$ and $\epsilon$ in $\mathbb{F}_2$, let $L_{e,\epsilon}$ be the set of those n-variable Boolean
functions g such that $D_e g = \epsilon$. Then a function g in $L_{e,\epsilon}$ lies at minimum Hamming distance
from f, among all elements of $L_{e,\epsilon}$, if and only if, for every $x \in \mathbb{F}_2^n$ such that $D_e f(x) = \epsilon$,
we have g(x) = f(x) (and g(x + e) = f(x + e)), and for every $x \in \mathbb{F}_2^n$ such that $D_e f(x) =
\epsilon \oplus 1$, we have g(x) = f(x) or g(x + e) = f(x + e) (and only one such equality can
then happen). The Hamming distance between f and g equals then $\frac{1}{2} |\{x \in \mathbb{F}_2^n;\ D_e f(x) =
\epsilon \oplus 1\}| = \frac{1}{2} \left( 2^{n-1} - \frac{(-1)^{\epsilon}}{2} \mathcal{F}(D_e f) \right)$. This completes the proof since the set of functions
admitting nonzero linear structures equals $\bigcup_{e \in \mathbb{F}_2^n \setminus \{0_n\},\, \epsilon \in \mathbb{F}_2} L_{e,\epsilon}$.

Note that Proposition 30 proves again Relation (3.3), page 82, and also proves, according
to Theorem 12, page 192, that the distance of f to linear structures equals $2^{n-2}$ if and only
if f is bent.
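The formula of Proposition 30 can be compared with an exhaustive search when n = 3. In the Python sketch below, which is only an illustration, a function is encoded as an integer whose bit x is f(x); this encoding and the helper names are assumptions of this example.

```python
def F_derivative(f, n, e):
    """F(D_e f) = sum over x of (-1)^(f(x) + f(x + e)); f is an integer truth table."""
    return sum((-1) ** ((f >> x ^ f >> (x ^ e)) & 1) for x in range(2 ** n))

def functions_with_linear_structure(n):
    """All n-variable functions having a nonzero e with D_e f constant."""
    return [g for g in range(2 ** 2 ** n)
            if any(abs(F_derivative(g, n, e)) == 2 ** n for e in range(1, 2 ** n))]

def distance_formula(f, n):
    """Distance to linear structures according to Proposition 30."""
    return 2 ** (n - 2) - max(abs(F_derivative(f, n, e)) for e in range(1, 2 ** n)) // 4

def distance_brute_force(f, with_ls):
    """Minimum Hamming distance from f to a precomputed list of functions."""
    return min(bin(f ^ g).count("1") for g in with_ls)

n = 3
with_ls = functions_with_linear_structure(n)
agree = all(distance_formula(f, n) == distance_brute_force(f, with_ls)
            for f in range(2 ** 2 ** n))
print(agree, distance_formula(0b10000000, n))  # the second function is x1x2x3
```

The integer division by 4 is exact here because F(D_e f) is always a multiple of 4 when n is at least 2.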

The maximum correlation with respect to a subset I of indices


This parameter has been introduced in [1155].

Definition 26 Let f be any n-variable Boolean function and I ⊆ {1, . . . , n}.
The maximum correlation with respect to I equals

$$C_f(I) = \max_{g \in \mathcal{BF}_{I,n}} \frac{\mathcal{F}(f \oplus g)}{2^n} = \max_{g \in \mathcal{BF}_{I,n}} \frac{|\mathcal{F}(f \oplus g)|}{2^n},$$

where $\mathcal{BF}_{I,n}$ is the set of n-variable Boolean functions depending
on $\{x_i, i \in I\}$ only.

According to Relation (2.35), page 57, the Hamming distance from f to $\mathcal{BF}_{I,n}$ is equal
to $2^{n-1}(1 - C_f(I))$. As we saw already, denoting the size of I by r, this distance is bounded
below by the r-th order nonlinearity of f (i.e., the minimum Hamming distance to functions
of algebraic degree at most r). It can be much larger.
The maximum correlation of any combining function with respect to any subset I of small
size should be small (i.e., its distance to $\mathcal{BF}_{I,n}$ should be large). It is straightforward to
prove, by decomposing the sum $\mathcal{F}(f \oplus g)$ and using that an unrestricted Boolean function
over $\mathbb{F}_2^I$ can take any binary value at any input $x \in \mathbb{F}_2^I$, that $C_f(I)$ equals $\sum_{j=1}^{2^{|I|}} \frac{|\mathcal{F}(h_j)|}{2^n}$,
where $h_1, \dots, h_{2^{|I|}}$ are the restrictions of f obtained by keeping constant the $x_i$'s for i ∈ I,
and to see that the distance from f to $\mathcal{BF}_{I,n}$ is achieved by the functions g taking value
0 (resp. 1) when the corresponding value of $\mathcal{F}(h_j)$ is positive (resp. negative), and that we
have $C_f(I) = 0$ if and only if all $h_j$'s are balanced (thus, f is m-resilient if and only if
$C_f(I) = 0$ for every set I of size at most m).
The Cauchy–Schwarz inequality gives $\left( \sum_{j=1}^{2^{|I|}} |\mathcal{F}(h_j)| \right)^2 \le 2^{|I|} \sum_{j=1}^{2^{|I|}} \mathcal{F}^2(h_j)$, and the
second-order Poisson formula (2.57), page 62, directly implies then the following inequality
observed in [187]:

$$C_f(I) \le 2^{-n} \left( \sum_{u \in \mathbb{F}_2^n;\ \mathrm{supp}(u) \subseteq I} W_f^2(u) \right)^{\frac{1}{2}} \le 2^{-n+\frac{|I|}{2}} \max_{u \in \mathbb{F}_2^n} |W_f(u)| = 2^{-n+\frac{|I|}{2}} \left( 2^n - 2\, nl(f) \right) \qquad (3.14)$$

or equivalently

$$d_H\left( f, \mathcal{BF}_{I,n} \right) \ge 2^{n-1} - \frac{1}{2} \left( \sum_{u \in \mathbb{F}_2^n;\ \mathrm{supp}(u) \subseteq I} W_f^2(u) \right)^{\frac{1}{2}} \ge 2^{n-1} - 2^{\frac{|I|}{2}-1} \max_{u \in \mathbb{F}_2^n} |W_f(u)| = 2^{n-1} - 2^{n+\frac{|I|}{2}-1} + 2^{\frac{|I|}{2}}\, nl(f).$$

This latter inequality shows that, contrary to the case of approximation by functions of
algebraic degree at most r, for avoiding close approximations of f by functions of $\mathcal{BF}_{I,n}$
when I has small size, it is sufficient that the first-order nonlinearity of f be large.
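The expression of the maximum correlation through the restrictions of f, and the two upper bounds of (3.14), can be illustrated on small examples. The Python sketch below is not taken from the cited references; the function names, the 0-based indexing of the coordinates in I, and the truth-table encoding are assumptions of this example.

```python
from itertools import product
from math import sqrt

def walsh(tt, n):
    """Walsh transform W_f(u) = sum over x of (-1)^(f(x) + u.x)."""
    return [sum((-1) ** (tt[x] ^ (bin(u & x).count("1") & 1)) for x in range(2 ** n))
            for u in range(2 ** n)]

def max_correlation(tt, n, I):
    """C_f(I) as the sum of |F(h_j)| / 2^n over the restrictions fixing x_i, i in I."""
    mask = sum(1 << i for i in I)
    sums = {}
    for x in range(2 ** n):
        sums[x & mask] = sums.get(x & mask, 0) + (-1) ** tt[x]
    return sum(abs(s) for s in sums.values()) / 2 ** n

def max_correlation_brute_force(tt, n, I):
    """C_f(I) from Definition 26, maximizing over all g depending on x_i, i in I."""
    mask = sum(1 << i for i in I)
    keys = sorted({x & mask for x in range(2 ** n)})
    best = 0
    for values in product((0, 1), repeat=len(keys)):
        g = dict(zip(keys, values))
        best = max(best, sum((-1) ** (tt[x] ^ g[x & mask]) for x in range(2 ** n)))
    return best / 2 ** n

def bounds_3_14(tt, n, I):
    """The two successive upper bounds on C_f(I) given by (3.14)."""
    mask = sum(1 << i for i in I)
    W = walsh(tt, n)
    first = sqrt(sum(W[u] ** 2 for u in range(2 ** n) if (u & ~mask) == 0)) / 2 ** n
    second = 2 ** (-n + len(I) / 2) * max(abs(w) for w in W)
    return first, second

bent = [(x & 1) * (x >> 1 & 1) ^ (x >> 2 & 1) * (x >> 3 & 1) for x in range(16)]
xor3 = [bin(x).count("1") & 1 for x in range(8)]   # x1 + x2 + x3, 2-resilient

print(max_correlation(bent, 4, [0, 1]), bounds_3_14(bent, 4, [0, 1]))
print(max_correlation(xor3, 3, [0, 1]))
```

For the 2-resilient function x1 ⊕ x2 ⊕ x3 the maximum correlation with respect to any two coordinates vanishes, as stated in the text.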
Parameter $\max_{I \subseteq \{1,\dots,n\},\, |I| \le k} C_f(I)$ is permutation invariant. A related (but different)
affine invariant parameter, also related to the distance to linear structures, is the minimum
Hamming distance to those Boolean functions g whose linear kernel $\{e \in \mathbb{F}_2^n;\ D_e g = 0\}$ has
dimension at least n − k. Indeed, the linear kernels of functions in $\mathcal{BF}_{I,n}$ contain $\mathbb{F}_2^{\{1,\dots,n\} \setminus I}$.
The results on the maximum correlation above generalize to this criterion [187].
Results in the domain of Boolean functions for circuit design and learning express that, if
the total influence $2^{-n} \sum_{i=1}^{n} w_H(D_{e_i} f)$ of an n-variable Boolean function f is low, then the
sum $\sum_{u \in \mathbb{F}_2^n;\ w_H(u) \ge k} W_f^2(u)$ is small for large k (and the function is “essentially determined
by few coordinates”); see [519, 914]. This is related to Relation (2.67), page 68.

3.1.8 Complexity parameters


Among the criteria viewed above, the main cryptographic complexity parameters (related to
Shannon’s notion of confusion) are the algebraic degree, the nonlinearity and higher-order
nonlinearity, the algebraic immunity, and the fast algebraic immunity. Other complexity
parameters exist. Note that, as pointed out by Meier and Staffelbach in [843], they are
supposed to be affine invariants, because the composition by affine automorphisms should
not modify the complexity. And indeed, the attacks on cryptosystems using Boolean
functions (stream or block ciphers) often work with similar complexities when using two
affinely equivalent functions (maybe not exactly the same complexity, because diffusion
plays also a role and may be different with both functions).

Algebraic thickness This parameter has been evoked in [844] and later studied in [222,
224, 229].

Definition 27 Let f be any n-variable Boolean function. The minimum number of terms
in the algebraic normal forms of all functions affinely equivalent to f is called the algebraic
thickness of f . We shall denote it by AT (f ).

As far as we know, this parameter is not directly related to an attack. Note, however, that
if a function has very low algebraic thickness, then it has low algebraic immunity, since, for
every set $\mathcal{I}$ of nonempty multiindices of {1, . . . , n}, an annihilator of the Boolean function
of ANF $\sum_{I \in \mathcal{I}} x^I$ equals $\prod_{I \in \mathcal{I}} (x_{i_I} \oplus 1)$, where, for every $I \in \mathcal{I}$, $i_I$ is an index chosen in I
(any one). We deduce that AI(f) ≤ AT(f) for every Boolean function.
In the case of affine functions, and more generally of the indicators of flats (in particular,
of function $\delta_0(x) = \prod_{i=1}^{n} (x_i \oplus 1) = \sum_{I \subseteq \{1,\dots,n\}} x^I$, which has all monomials in its ANF),
AT(f) equals 1.
In the case of quadratic functions, thanks to the existence of the Dickson form of these
functions that we shall see in Theorem 10, page 172, AT(f) equals at most $\lceil \frac{n+1}{2} \rceil$, which is
also rather small.
Boolean functions of algebraic degree not close to n − 1 have also moderate algebraic
thickness, since $AT(f) \le \sum_{i=0}^{d_{alg}(f)} \binom{n}{i}$.
But it has been shown that, asymptotically, almost all Boolean functions f (in the
sense of probability theory) have large algebraic thickness. This property is related to the
fact that the number $2^{2^n}$ of n-variable Boolean functions is strongly increasing when n
grows, which allows proving in some cases the existence of functions possessing some
complexity features, without being always able to exhibit any such function. This is possible
by bounding above the number of functions that do not possess these features and showing
that it is negligible when compared to $2^{2^n}$. This phenomenon on Boolean functions, which
is also valid with codes, is the so-called Shannon effect (this term has been introduced in
[807]): Shannon in [1035] could prove this way the existence of Boolean functions with
high circuit complexity.
Concerning algebraic thickness, it has been first proved in [222] that for every number
λ < 1/2, the density in $\mathcal{BF}_n$ of the subset $\{f \in \mathcal{BF}_n \mid AT(f) \ge \lambda\, 2^n\}$ is larger than
$1 - e^{-2^{n+1} (1/2-\lambda)^2 + (n^2+n) \log_2(e)}$ and tends to 1 when n tends to infinity. A more precise
bound was proved shortly later:



Proposition 31 [224] Let c be any strictly positive real number. The density in $\mathcal{BF}_n$ of
the subset $\left\{ f \in \mathcal{BF}_n \mid AT(f) \ge 2^{n-1} - c\, n\, 2^{\frac{n-1}{2}} \right\}$ is larger than $1 - 2^{n^2+n} e^{-c^2 n^2}$ and,
if $c^2 \log_2 e > 1$, then this density tends to 1 when n tends to infinity. For every n ≥ 3, a
Boolean function f such that $AT(f) \ge 2^{n-1} - n\, 2^{\frac{n-1}{2}}$ exists.

Proof Let k be any positive integer. The number of n-variable Boolean functions whose
ANF have at most k monomials equals $1 + \binom{2^n}{1} + \cdots + \binom{2^n}{k}$. The number of affine
automorphisms on $\mathbb{F}_2^n$ equals $(2^n - 1)(2^n - 2)(2^n - 4) \cdots (2^n - 2^{n-1})\, 2^n < 2^{n^2+n}$. Thus,
the number of Boolean functions f such that AT(f) ≤ k is smaller than $N(n, k) =
\left( 1 + \binom{2^n}{1} + \cdots + \binom{2^n}{k} \right) 2^{n^2+n}$. We have seen already at page 83 that, for every N, we
have $\sum_{0 \le i \le \lambda N} \binom{N}{i} < 2^N e^{-2N(1/2-\lambda)^2}$. Hence, applying this with $N = 2^n$ and $\lambda =
1/2 - c\, n\, 2^{-(n+1)/2}$, we deduce that the density of the set $\{f \in \mathcal{BF}_n \mid AT(f) \ge 2^{n-1} -
c\, n\, 2^{(n-1)/2}\}$ is larger than

$$1 - \frac{N\left(n,\, 2^{n-1} - c\, n\, 2^{(n-1)/2}\right)}{2^{2^n}} > 1 - 2^{n^2+n} e^{-c^2 n^2} = 1 - 2^{n^2+n-c^2 n^2 \log_2 e},$$

and tends to 1, if $c^2 \log_2 e > 1$. The last sentence is easy to check.

Proposition 31 implies that, for every λ < 1/2, there exists m such that, for every n ≥ m,
a Boolean function f such that $AT(f) \ge \lambda\, 2^n$ exists. But, unless λ is small, m is greater
than 3. We can take m = 9 for λ = 1/4 and m = 12 for λ = 3/8.
Hence, almost all n-variable Boolean functions have algebraic thickness larger than half
the whole number $2^n$ of monomials (see more in [956]). It may seem surprising that taking
the minimum number of terms in the ANFs of all functions affinely equivalent to f does not
affect significantly the number of terms in the ANF of a random function. This is due to the
small number of affine automorphisms compared to the number of Boolean functions.
The lower bound of Proposition 31 is accompanied by an upper bound:

Proposition 32 [222] For every $f \in \mathcal{BF}_n$, we have $AT(f) \le \frac{2}{3}\, 2^n$.

Proof The proof is by induction on n. The assertion is clearly valid for n = 1. Let n be
any integer larger than 1 and assume that the assertion is valid for n − 1. Let f be any
Boolean function in $\mathcal{BF}_n$ and let $f_0$ and $f_1$ be the Boolean functions on $\mathbb{F}_2^{n-1}$ such that
$f(x_1, \dots, x_n) = f_0(x_1, \dots, x_{n-1}) \oplus x_n f_1(x_1, \dots, x_{n-1})$. We shall denote by |f| the number
of terms in the ANF of f. We have $|f| = |f_0| + |f_1|$. By hypothesis, there exists an affine
automorphism A of $\mathbb{F}_2^{n-1}$ such that $|f_1 \circ A| \le \frac{2}{3}\, 2^{n-1}$. Thus, we can assume without loss
of generality that $|f_1| \le \frac{2}{3}\, 2^{n-1}$. Assume that $|f| = |f_0| + |f_1|$ is larger than $\frac{2}{3}\, 2^n$. Let r
be the number of terms that are in both ANFs of $f_0$ and $f_1$. We have $|f_0| + |f_1| - r \le 2^{n-1}$,
since $2^{n-1}$ is the total number of monomials in n − 1 variables. Thus r is larger than or equal
to $\frac{2}{3}\, 2^n - 2^{n-1} = \frac{1}{3}\, 2^{n-1}$. Changing $x_n$ into $x_n \oplus 1$ in the ANF of f keeps $f_1$ unchanged
and replaces $f_0$ by $f_0 \oplus f_1$. We have $|f_0 \oplus f_1| + |f_1| = (|f_0| + |f_1| - r) - r + |f_1| \le
2^{n-1} - \frac{1}{3}\, 2^{n-1} + \frac{2}{3}\, 2^{n-1} = \frac{2}{3}\, 2^n$.
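For very small n, the algebraic thickness can be computed by exhausting all affine automorphisms, which gives a direct check of the upper bound of Proposition 32. The Python sketch below is only an illustration: the helper names and the truth-table encoding are assumptions of this example, and the exhaustive search is not meant as a practical algorithm.

```python
from itertools import product

def anf_terms(tt, n):
    """Number of monomials in the ANF, via the binary Moebius transform."""
    a = list(tt)
    for i in range(n):
        for x in range(2 ** n):
            if x >> i & 1:
                a[x] ^= a[x ^ (1 << i)]
    return sum(a)

def affine_automorphisms(n):
    """Yield every affine permutation of F_2^n as a list A with A[x] = L(x) + b."""
    size = 2 ** n
    for cols in product(range(1, size), repeat=n):
        image = [0] * size
        for x in range(size):
            for i in range(n):
                if x >> i & 1:
                    image[x] ^= cols[i]
        if len(set(image)) == size:        # the linear map is invertible
            for b in range(size):
                yield [y ^ b for y in image]

def algebraic_thickness(tt, n):
    """Minimum number of ANF terms over all functions f o A, A an affine automorphism."""
    return min(anf_terms([tt[A[x]] for x in range(2 ** n)], n)
               for A in affine_automorphisms(n))

delta0 = [1 if x == 0 else 0 for x in range(8)]   # indicator of the zero vector, n = 3
print(anf_terms(delta0, 3), algebraic_thickness(delta0, 3))

# Largest algebraic thickness among all 2-variable functions, compared with (2/3) 2^n.
worst = max(algebraic_thickness([m >> x & 1 for x in range(4)], 2) for m in range(16))
print(worst, worst <= 2 / 3 * 2 ** 2)
```

The indicator of the zero vector has all 8 monomials in its ANF but algebraic thickness 1, since a translation turns it into the single monomial x1x2x3.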

Given Propositions 31 and 32, we can consider that a function f has large thickness
if AT(f) equals $\lambda\, 2^n$, where λ is near 1/2. Note that the algebraic degrees of such
functions cannot be substantially smaller than $\frac{n}{2}$, since we have seen already that
$AT(f) \le \sum_{i=0}^{d_{alg}(f)} \binom{n}{i}$. There exist functions with low algebraic thicknesses and
with highest possible nonlinearity (e.g., quadratic bent functions). There also exist
functions with high algebraic thicknesses and low nonlinearities, since there exist
functions with high algebraic thicknesses and low Hamming weights: take $\lambda <
\lambda' < 1/2$; the number of functions of Hamming weights smaller than or equal
to $2^n \lambda'$ equals $\sum_{i=0}^{2^n \lambda'} \binom{2^n}{i} \ge \frac{2^{2^n H_2(\lambda')}}{\sqrt{2^{n+3} \lambda' (1-\lambda')}}$ (cf. [809, page 310]), where $H_2(x) =
-x \log_2(x) - (1 - x) \log_2(1 - x)$ is the entropy function. We have seen above
that the number of functions f such that $AT(f) \le 2^n \lambda$ is smaller than or
equal to

$$\left( 1 + \binom{2^n}{1} + \cdots + \binom{2^n}{k} \right) 2^{n^2+n} \le 2^{2^n H_2(\lambda) + n^2 + n}$$

(with $k = \lfloor 2^n \lambda \rfloor$); thus, the latter is asymptotically smaller than the former and there exist functions of weights
smaller than or equal to $2^n \lambda'$ satisfying $AT(f) > 2^n \lambda$.
There also exist functions with algebraic degree at least n − 1, nonlinearity at least
$2^{n-1} - 2^{\frac{n-1}{2}} \sqrt{n}$ and algebraic thickness at least $\lambda\, 2^n$, with λ < 1/2 as close to 1/2 as we
wish, since the probabilities that f has algebraic degree at most n − 2, resp. nonlinearity
at most $2^{n-1} - 2^{\frac{n}{2}-1} \sqrt{n} \left( \sqrt{2 \ln 2} + \frac{4 \ln n}{n} \right)$, resp. algebraic thickness at most $\lambda\, 2^n$, tend all
three to 0 (see Section 3.1).

Nonnormality Hans Dobbertin has introduced in [466] the following notion: for
any n even, an n-variable Boolean function is a normal function (resp. a weakly
normal function) if it is constant (resp. affine) on at least one $\frac{n}{2}$-dimensional
flat. He used this notion for constructing balanced functions with high nonlinearities
(see more at page 296). The notion has been generalized and extended
(see, e.g., [222, 224]):

Definition 28 Let n and k ≤ n be positive integers. An n-variable Boolean function f is
called a k-normal function (resp. a k-weakly normal function) if there exists a k-dimensional
flat on which f is constant (resp. affine). For n even, $\frac{n}{2}$-normal functions are simply called
normal.

The notion of normality has been later related to an attack on stream ciphers [881]. The
related parameter is studied in [276] as well as two other parameters that complete the
information it gives. The notion of k-nonnormal function is a particular case of that of
affine disperser of dimension k and is also related to that of affine extractor, a stronger
notion needed for the extraction of randomness from few independent sources; see more
precise definitions and constructions in [110, 1036]. It is also related to a similar notion
coming from computational number theory: that of k-wise independent random variables;
see [12].
The complexity criterion we are interested in is k-nonnormality with small k. Even
if almost all Boolean functions satisfy it, as we shall see, it is not satisfied by simple
ones:

– Every quadratic Boolean function f on $\mathbb{F}_2^n$ is $\frac{n}{2}$-normal if n is even and $\frac{n+1}{2}$-weakly
normal if n is odd, according to the properties of quadratic functions that we shall see in
Section 5.2.
– Every symmetric Boolean function (i.e., every function whose output is invariant under
permutation of its input bits, and depends then only on the Hamming weight of the
input; see Section 10.1) is $\lfloor \frac{n}{2} \rfloor$-normal and $\lceil \frac{n}{2} \rceil$-weakly-normal since its restriction to the
$\lceil \frac{n}{2} \rceil$-dimensional flat
$$\left\{ (x_1, \dots, x_n) \in \mathbb{F}_2^n \mid x_{i+\lfloor \frac{n}{2} \rfloor} = x_i \oplus 1,\ \forall i \le \left\lfloor \frac{n}{2} \right\rfloor \right\}$$
is constant if n is even and affine if n is odd. Indeed, if n is even, all the elements of
this flat have same Hamming weight $\frac{n}{2}$ and f(x) takes therefore a constant value; if n is
odd, we have $f(x) = f(x_1, \dots, x_{n-1}, 0) \oplus x_n [f(x_1, \dots, x_{n-1}, 0) \oplus f(x_1, \dots, x_{n-1}, 1)]$,
where the functions $f(x_1, \dots, x_{n-1}, 0)$ and $f(x_1, \dots, x_{n-1}, 1)$ are constant on this flat.
– Every Boolean function on $\mathbb{F}_2^n$ with n ≤ 7 is $\lfloor \frac{n}{2} \rfloor$-normal, as can be checked by computer
investigation.

There is a mutual upper bound on k and on the nonlinearity of the function:

Proposition 33 Let f be a k-weakly normal Boolean function on $\mathbb{F}_2^n$. Then

$$nl(f) \le 2^{n-1} - 2^{k-1},$$

or equivalently $k \le \log_2 [2^{n-1} - nl(f)] + 1$.

Proof Applying the Poisson summation formula (2.39), page 58, to the sign function $f_\chi$,
we see that if $f \oplus a \cdot x$ is constant on the flat $b \oplus E^\perp$, then the mean of $(-1)^{b \cdot u} W_f(u)$
when u ranges over $a \oplus E$ equals $\pm |E^\perp|$. And the maximum absolute value of a sequence
of numbers is larger than or equal to the absolute value of its arithmetic mean. Hence
$\max_{u \in \mathbb{F}_2^n} |W_f(u)| \ge |E^\perp| = 2^k$, and Relation (3.1) gives the bound.
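For small n, k-normality and k-weak normality can be tested by enumerating all k-dimensional flats, which illustrates Proposition 33. The Python sketch below is a brute-force illustration; the helper names and the encoding of vectors as integers are assumptions of this example.

```python
from itertools import combinations

def dot(u, x):
    """Inner product u.x in F_2 of two vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def span(vectors):
    """Linear span over F_2 of a collection of vectors."""
    s = {0}
    for v in vectors:
        s |= {x ^ v for x in s}
    return frozenset(s)

def flats(n, k):
    """Yield every k-dimensional flat of F_2^n as a frozenset of points."""
    subspaces = {span(c) for c in combinations(range(1, 2 ** n), k)}
    for E in subspaces:
        if len(E) != 2 ** k:               # the chosen vectors were dependent
            continue
        covered = set()
        for a in range(2 ** n):
            if a not in covered:
                flat = frozenset(a ^ e for e in E)
                covered |= flat
                yield flat

def is_normal(tt, n, k):
    """True if f is constant on at least one k-dimensional flat."""
    return any(len({tt[x] for x in A}) == 1 for A in flats(n, k))

def is_weakly_normal(tt, n, k):
    """True if f coincides with an affine function on at least one k-dimensional flat."""
    return any(len({tt[x] ^ dot(u, x) for x in A}) == 1
               for A in flats(n, k) for u in range(2 ** n))

def nonlinearity(tt, n):
    """Minimum Hamming distance from f to the affine functions."""
    return min(sum(tt[x] ^ dot(u, x) ^ c for x in range(2 ** n))
               for u in range(2 ** n) for c in (0, 1))

bent = [(x & 1) * (x >> 1 & 1) ^ (x >> 2 & 1) * (x >> 3 & 1) for x in range(16)]
print(nonlinearity(bent, 4), is_normal(bent, 4, 2), is_weakly_normal(bent, 4, 3))
```

The bent function x1x2 ⊕ x3x4 has nonlinearity 6, which exceeds 2^3 − 2^2 = 4, so by Proposition 33 it cannot be 3-weakly normal; the search confirms this, while it is 2-normal.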

Hence, k-normality with large k implies low nonlinearity. Notice that, since every
Boolean function has nonlinearity bounded above by $2^{n-1} - 2^{\frac{n}{2}-1}$, Proposition 33 gives
no information if $k \le \frac{n}{2}$. But the high nonlinearity $2^{n-1} - 2^{\frac{n}{2}-1}$ of bent functions implies
that they cannot be $(\frac{n}{2} + 1)$-weakly normal.

Remark. A more general result due to Zheng et al., proved in a complex way in [1179],
can be proved similarly: let A be any k-dimensional flat (k ≤ n). Let f be a Boolean function
on $\mathbb{F}_2^n$ and $f'$ its restriction to A. Denote by $nl(f')$ the nonlinearity of $f'$ (i.e., the minimum
Hamming distance between $f'$ and any affine function on A). Then we have (see footnote 18):
$$nl(f) - nl(f') \le 2^{n-1} - 2^{k-1}.$$
Indeed, according to the Poisson summation formula applied to $f_\chi$ with $A = b \oplus E^\perp$, we
have $\max_{u \in \mathbb{F}_2^k} |W_{f'}(u)| \le \max_{v \in \mathbb{F}_2^n} |W_f(v)|$, which completes the proof.
In fact, a little more can be said, as seen in [191]. Recall that, given two subspaces E of
dimension k and $E'$ of $\mathbb{F}_2^n$ such that $E \cap E' = \{0_n\}$ and whose direct sum equals $\mathbb{F}_2^n$, and
denoting for every $a \in E'$ by $h_a$ the restriction of f to the coset a + E, the second-order
Poisson formula (2.57) in Proposition 12 (page 62) implies
$$\max_{u \in \mathbb{F}_2^n} W_f^2(u) \ge \sum_{a \in E'} \mathcal{F}^2(h_a)$$
(indeed, the maximum of $W_f^2(u)$ is larger than or equal to its mean). Hence we have
$\max_{u \in \mathbb{F}_2^n} W_f^2(u) \ge \mathcal{F}^2(h_a)$ for every a. Applying this property to $f \oplus \ell$, where $\ell$ is any linear
function, and using Relation (3.1), page 79, between the nonlinearity and the maximum
absolute value of the Walsh transform, we deduce
$$\forall a \in E', \quad nl(f) \le 2^{n-1} - 2^{k-1} + nl(h_a). \qquad (3.15)$$
The approaches by the first and the second Poisson formulae lead to two different necessary
conditions for the case of equality in (3.15); see [224], where the case of equality is studied.
The proof above shows that, if equality occurs in the inequality $nl(f) \le 2^{n-1} - 2^{k-1}$ for a
given function f that coincides with an affine function $\ell$ on a k-dimensional flat, then $f \oplus \ell$
is balanced on every other coset of this flat.

18 Note that in Proposition 33, we have $nl(f') = 0$.

As a consequence of Proposition 33, the maximum possible nonlinearity of quadratic
functions (i.e., the covering radius of the Reed–Muller code RM(1, n) in the Reed–Muller
code RM(2, n)) is bounded above by $2^{n-1} - 2^{\frac{n}{2}-1}$ if n is even, which tells nothing, and by
$2^{n-1} - 2^{\frac{n-1}{2}}$ if n is odd (these values are in fact the exact ones).
For every α > 1, when n tends to infinity, random Boolean functions are almost surely
$[\alpha \log_2 n]$-nonnormal:

Proposition 34 [222] Let c be larger than 1. Let $(k_n)_{n \in \mathbb{N}^*}$ be a sequence of positive
integers such that $c \log_2 n \le k_n \le n$. The density in $\mathcal{BF}_n$ of the set of all Boolean functions
on $\mathbb{F}_2^n$ that are not $k_n$-weakly normal is larger than $1 - 2^{n(k_n+1) - 2^{k_n}}$. This density tends to
1 when n tends to infinity. Therefore, there exists a positive integer N such that, for every
n ≥ N, $k_n$-nonnormal functions exist. For $k_n = \lfloor \frac{n}{2} \rfloor$ we can take N = 12.

Proof Let $\lambda_n$ be the number of $k_n$-dimensional flats in $\mathbb{F}_2^n$. Fix such a flat A. Let $\mu_n$ be the
number of Boolean functions whose restrictions to A are affine (clearly, this number does
not depend on the choice of A). The number of $k_n$-weakly normal functions on $\mathbb{F}_2^n$ is smaller
than or equal to $\lambda_n \mu_n$.
The number of $k_n$-dimensional vector subspaces of $\mathbb{F}_2^n$ equals (cf., e.g., [809]):
$$\begin{bmatrix} n \\ k_n \end{bmatrix} = \frac{(2^n - 1)(2^n - 2)(2^n - 2^2) \cdots (2^n - 2^{k_n-1})}{(2^{k_n} - 1)(2^{k_n} - 2)(2^{k_n} - 2^2) \cdots (2^{k_n} - 2^{k_n-1})}$$
and the number of $k_n$-dimensional flats in $\mathbb{F}_2^n$ is: $\lambda_n = 2^{n-k_n} \begin{bmatrix} n \\ k_n \end{bmatrix}$.
We choose now as particular $k_n$-dimensional flat the set $\mathbb{F}_2^{k_n} \times \{0_{n-k_n}\}$. The restriction
to $\mathbb{F}_2^{k_n} \times \{0_{n-k_n}\}$ of a Boolean function on $\mathbb{F}_2^n$ is affine if and only if the algebraic normal
form of the function contains no monomial of degree at least 2 involving the coordinates
$x_1, \dots, x_{k_n}$ only. The number of such functions is $\mu_n = 2^{\nu_n}$, where $\nu_n = 2^n - 2^{k_n} +
1 + k_n$. The number of $k_n$-weakly normal functions on $\mathbb{F}_2^n$ is then smaller than or equal to
$2^{n-k_n} \begin{bmatrix} n \\ k_n \end{bmatrix} 2^{\nu_n}$. The number of Boolean functions on $\mathbb{F}_2^n$ being equal to $2^{2^n}$, the density
of the subset $A_n$ in $\mathcal{BF}_n$ of all Boolean functions on $\mathbb{F}_2^n$ that are not $k_n$-weakly normal is
larger than or equal to $1 - 2^{n-k_n} \begin{bmatrix} n \\ k_n \end{bmatrix} 2^{\nu_n - 2^n}$.
We have $\begin{bmatrix} n \\ k_n \end{bmatrix} < 2^{n k_n - k_n^2 + k_n}$, since every factor in the numerator of $\begin{bmatrix} n \\ k_n \end{bmatrix}$ is smaller
than $2^n$ and every factor in its denominator is larger than or equal to $2^{k_n-1}$. Thus, the density
of $A_n$ is larger than or equal to

$$1 - 2^{n(k_n+1) + k_n + 1 - k_n^2 - 2^{k_n}} > 1 - 2^{n(k_n+1) - 2^{k_n}}.$$

The exponent $n(k_n + 1) - 2^{k_n}$ is smaller than or equal to $2^{k_n/c} (k_n + 1) - 2^{k_n}$ and thus
tends to −∞ when n tends to +∞. The last sentence of the proposition can be checked by
computation (the sequences $1 - 2^{n-k_n} \begin{bmatrix} n \\ k_n \end{bmatrix} 2^{\nu_n - 2^n}$, n even, and n odd are increasing and
positive respectively for n ≥ 12 and n ≥ 13).
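The numerical claim at the end of the proof can be reproduced with exact arithmetic. The Python sketch below evaluates the lower bound $1 - 2^{n-k_n} \begin{bmatrix} n \\ k_n \end{bmatrix} 2^{\nu_n - 2^n}$ for $k_n = \lfloor n/2 \rfloor$; the function names are assumptions of this example, and the exponent $\nu_n - 2^n = -(2^{k_n} - 1 - k_n)$ is used to keep the integers small.

```python
from fractions import Fraction

def gaussian_binomial(n, k):
    """Number of k-dimensional subspaces of F_2^n."""
    num = den = 1
    for i in range(k):
        num *= 2 ** n - 2 ** i
        den *= 2 ** k - 2 ** i
    return num // den

def density_lower_bound(n, k):
    """Exact value of 1 - 2^(n-k) [n choose k]_2 2^(nu_n - 2^n), nu_n = 2^n - 2^k + 1 + k."""
    flats = 2 ** (n - k) * gaussian_binomial(n, k)
    return 1 - Fraction(flats, 2 ** (2 ** k - 1 - k))

for n in range(8, 15):
    bound = density_lower_bound(n, n // 2)
    print(n, n // 2, "positive" if bound > 0 else "not positive")
```

With this choice of k_n the bound is not positive for n = 10 and n = 11 and becomes positive from n = 12 on, in agreement with the value N = 12 stated in Proposition 34.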

Remark. 1. The result of Proposition 34 is easy to prove but pretty astonishing: the size
of a $k_n$-dimensional flat is close to n.
2. Proposition 34 also remains essentially valid (except for the number “12”) if, in the
definition of k-weakly normal functions, we replace “there exists a k-dimensional flat
on which the function is affine” with “there exists a k-dimensional flat such that the
restriction of the function to this flat has degree ≤ l,” where l is some fixed positive
integer: the value of $\nu_n$ has then to be changed into $2^n - 2^{k_n} + 1 + \binom{k_n}{1} + \cdots + \binom{k_n}{l}$.

The deterministic function with asymptotically lowest-known normality, due to Shaltiel
[1036], has normality $2^{\log^{0.9} n}$. Other constructions are given in [47].
The behavior of normality for fixed algebraic degree functions is also interesting to
determine. X.-D. Hou has shown in [616] that, for any odd n ≤ 13, the maximum
nonlinearity of all cubic functions is the same as for quadratic functions: $2^{n-1} - 2^{\frac{n-1}{2}}$. So
we can wonder whether cubic Boolean functions behave for generic n as quadratic functions
with respect to maximum nonlinearity or to normality. For nonlinearity, this is an open
problem. But for normality, $k_n$-nonnormal Boolean functions of algebraic degree 3 exist,
where $k_n$ is negligible with respect to n (this confirms the feeling that cubic functions behave
merely as general functions, considering their Hamming weights; see Section 5.3, page 180).
Indeed, it has been shown in [222] that for every $\lambda > \frac{1}{2}$ and any sequence $(k_n)_{n \in \mathbb{N}^*}$ of
positive integers such that $n^\lambda \le k_n \le n$, the density of the set of all Boolean functions of
algebraic degree at most 3 on $\mathbb{F}_2^n$ that are not $k_n$-weakly normal in the set of all Boolean
functions of algebraic degree at most 3 is larger than or equal to $1 - 2^{n(k_n+1) - k_n^2 - \binom{k_n}{2} - \binom{k_n}{3}}$.
This density tends to 1 when n tends to infinity.



As proved later in [377] (and recalled in [111]), for any constant d, a random algebraic
degree d Boolean function has normality (n1/(d−1) ).

Remark.
1. All the results above are essentially valid if we restrict ourselves to balanced functions.
Indeed, the number of balanced functions on $\mathbb{F}_2^n$ equals
$$\binom{2^n}{2^{n-1}} = \frac{(2^n)!}{((2^{n-1})!)^2} \sim \frac{\sqrt{2\pi\, 2^n}\, (2^n)^{2^n} e^{-2^n}}{\left( \sqrt{2\pi\, 2^{n-1}}\, (2^{n-1})^{2^{n-1}} e^{-2^{n-1}} \right)^2} = \frac{2^{2^n + \frac{1}{2}}}{\sqrt{\pi\, 2^n}},$$
according to Stirling's formula, and all our
arguments can be used, replacing the number of functions, $2^{2^n}$, by $\binom{2^n}{2^{n-1}}$.
2. We can also deal with the distance to linear structures. Since the existence of a linear
structure for a function f is equivalent to the existence of a Boolean function g on $\mathbb{F}_2^{n-1}$
and of a linear function l on $\mathbb{F}_2$ such that $f(x_1, \dots, x_n)$ is affinely equivalent to the
function $g(x_1, \dots, x_{n-1}) \oplus l(x_n)$, the number of functions admitting linear structures is
smaller than or equal to $2^{2^{n-1}}$, times the number of affine automorphisms, times 2. Thus, it
is smaller than $2^{2^{n-1} + n^2 + n + 1}$. Moreover, let ρ be a positive number smaller than 1/2. The
number of Boolean functions on $\mathbb{F}_2^n$ that lie at distance smaller than or equal to $\rho\, 2^n$ from
this set is smaller than or equal to $2^{2^{n-1} + n^2 + n + 1} \sum_{i=0}^{\rho\, 2^n} \binom{2^n}{i} \le 2^{2^{n-1} + n^2 + n + 1 + 2^n H_2(\rho)}$.
Thus, this number is negligible with respect to $2^{2^n}$ if $H_2(\rho) < 1/2$ and, asymptotically,
almost all functions lie then at distance greater than $\rho\, 2^n$ from the set of all Boolean
functions admitting linear structures.

We have seen that a low algebraic degree of Boolean functions does not imply their
normality. Conversely, k-normality does not imply low algebraic degree: take a function
of high algebraic degree on $\mathbb{F}_2^{n-1}$ (considered as a subspace of $\mathbb{F}_2^n$) and complete it by 0 to
obtain a function on $\mathbb{F}_2^n$.
There exist functions f with low algebraic thicknesses (e.g., functions of algebraic
degree 3) which are k-nonnormal with small k; and there exist functions with high algebraic
thicknesses that are k-normal with large k: take a function g on $\mathbb{F}_2^{n-1}$ with high AT(g)
and complete it by 0 to obtain a function f on $\mathbb{F}_2^n$; it is a simple matter to check that
AT(f) ≥ AT(g). In [111, 377] (and references therein), the authors studied the relationship
between algebraic thickness and nonnormality. The most interesting is that almost all
functions have high algebraic degrees, nonlinearities, and algebraic thicknesses and are
non-k-normal with small k.

Spectral complexity The size of the support of the Walsh transform of an n-variable
function f, that is, $2^n$ minus the number of its zeros, is called the spectral complexity
of f. We shall denote it by SC(f). This criterion has been studied in [968, 1008].
Since, according to the inverse Walsh transform formula (2.43), page 59, the Walsh
transform values $W_f(u)$ provide the decomposition of the sign function of f over the
basis of the so-called Walsh functions $(-1)^{u \cdot x}$, and since these functions are realized by
simple circuits, the spectral complexity is related to the circuit complexity of Boolean
functions.

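Note that, for every n-variable Boolean function f, an easy lower bound can be derived
from the Cauchy–Schwarz inequality:

$$SC(f) \ge \frac{\left( \sum_{u \in \mathbb{F}_2^n} W_f^2(u) \right)^2}{\sum_{u \in \mathbb{F}_2^n} W_f^4(u)} = \frac{(2^{2n})^2}{2^n \sum_{(x,y,z,t) \in (\mathbb{F}_2^n)^4;\ x+y+z+t=0_n} (-1)^{f(x) \oplus f(y) \oplus f(z) \oplus f(t)}} = \frac{2^{3n}}{\sum_{(x,y,z) \in (\mathbb{F}_2^n)^3} (-1)^{f(x) \oplus f(y) \oplus f(z) \oplus f(x+y+z)}}.$$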

The average spectral complexity of n-variable Boolean functions, equal to
$2^n - 2^{-2^n} \sum_{f \in \mathcal{BF}_n} |\{u \in \mathbb{F}_2^n;\ W_f(u) = 0\}|$, is also easily determined: for every $f \in \mathcal{BF}_n$
and $u \in \mathbb{F}_2^n$, we have $W_f(u) = 0$ if and only if function $f(x) \oplus u \cdot x$ is balanced. We have
then $|\{f \in \mathcal{BF}_n;\ W_f(u) = 0\}| = \binom{2^n}{2^{n-1}}$ for every u. Hence, the average number of zeros of
the Walsh transform equals $\frac{2^n \binom{2^n}{2^{n-1}}}{2^{2^n}} \sim \sqrt{\frac{2}{\pi}}\, 2^{\frac{n}{2}}$ and the average spectral complexity equals
$2^n - \frac{2^n \binom{2^n}{2^{n-1}}}{2^{2^n}}$.
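Both the Cauchy–Schwarz lower bound on SC(f) and the average number of zeros of the Walsh transform can be checked exhaustively for small n. The Python sketch below is only an illustration; the helper names and the truth-table encoding are assumptions of this example.

```python
from fractions import Fraction
from math import comb

def walsh(tt, n):
    """Walsh transform W_f(u) = sum over x of (-1)^(f(x) + u.x)."""
    return [sum((-1) ** (tt[x] ^ (bin(u & x).count("1") & 1)) for x in range(2 ** n))
            for u in range(2 ** n)]

def spectral_complexity(tt, n):
    """Size of the support of the Walsh transform."""
    return sum(1 for w in walsh(tt, n) if w != 0)

def cauchy_schwarz_bound(tt, n):
    """(sum of W^2)^2 / (sum of W^4), using Parseval's relation for the numerator."""
    return Fraction(2 ** (4 * n), sum(w ** 4 for w in walsh(tt, n)))

def average_zeros(n):
    """Average number of zeros of the Walsh transform over all n-variable functions."""
    total = 0
    for mask in range(2 ** 2 ** n):
        tt = [mask >> x & 1 for x in range(2 ** n)]
        total += sum(1 for w in walsh(tt, n) if w == 0)
    return Fraction(total, 2 ** 2 ** n)

bent = [(x & 1) * (x >> 1 & 1) ^ (x >> 2 & 1) * (x >> 3 & 1) for x in range(16)]
cubic = [1 if x == 7 else 0 for x in range(8)]

print(spectral_complexity(bent, 4), cauchy_schwarz_bound(bent, 4))
print(spectral_complexity(cubic, 3), cauchy_schwarz_bound(cubic, 3))
print(average_zeros(3), Fraction(2 ** 3 * comb(2 ** 3, 2 ** 2), 2 ** 2 ** 3))
```

The bound is tight for the bent example, whose Walsh transform has constant absolute value, and strict for the cubic monomial.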
Ryazanov in [1008] shows the more precise result that the random variable equal
to $\left( \frac{\pi}{2^{n+3}} \right)^{1/2}$ times the number of zeros of the Walsh transform tends in distribution to
the constant function $\frac{1}{2}$ over {0, 1}. The proof is too long for being included here. He
also studies the number of zeros of the Walsh transform of functions of even Hamming
weights and shows then that the same random variable converges to 1 (in particular,
functions of even Hamming weights have on average twice more zeros than
general Boolean functions; this can be simply proved with the same method as previously
described).
The evaluation can also be done for random (n, m)-functions. When F ranges over the set
of (n, m)-functions and v ranges over $\mathbb{F}_2^m\setminus\{0_m\}$, the component function $v\cdot F$ ranges $2^m-1$
times over the set of n-variable Boolean functions. Since for $v=0_m$, we have $W_F(u,v)=0$
for every $u\ne 0_n$ and $W_F(0_n,0_m)=2^n$, we deduce that the average number of zeros of the
Walsh transform of (n, m)-functions equals $2^n-1+(2^m-1)\,\frac{2^n\binom{2^n}{2^{n-1}}}{2^{2^n}}$.
And when restricting ourselves to (n, n)-permutations, we know that when $v\ne 0_n$,
the component function $v\cdot F$ ranges uniformly over the set of balanced functions when
F ranges over the set of permutations. Distinguishing the cases “$u=v=0_n$,” “$u=0_n$, $v\ne 0_n$,”
“$u\ne 0_n$, $v=0_n$,” and “$u\ne 0_n$, $v\ne 0_n$,” we obtain an average of
$2(2^n-1)+\frac{(2^n-1)^2\binom{2^{n-1}}{2^{n-2}}^2}{\binom{2^n}{2^{n-1}}}$, since $|\{f\in\mathcal{BF}_n,\ f\ \text{balanced};\ W_f(u)=0\}|$ equals $\binom{2^{n-1}}{2^{n-2}}^2$
for every $u\ne 0_n$, because $f(x)$ and $f(x)\oplus u\cdot x$ need to be both balanced, that is, we need
$w_H(f(x)(u\cdot x)) = w_H(f(x)(u\cdot x\oplus 1)) = 2^{n-2}$, where $f(x)(u\cdot x)$ is the product of $f(x)$ and
$u\cdot x$.
As in the case of Boolean functions above, by the Cauchy–Schwarz inequality, the spectral
complexity $SC(F) = |\{(u,v)\in\mathbb{F}_2^n\times\mathbb{F}_2^m;\ W_F(u,v)\ne 0\}|$ of an (n, m)-function F
satisfies

\[
SC(F) \;\ge\; \frac{\left(\sum_{u\in\mathbb{F}_2^n,\,v\in\mathbb{F}_2^m} W_F^2(u,v)\right)^2}{\sum_{u\in\mathbb{F}_2^n,\,v\in\mathbb{F}_2^m} W_F^4(u,v)}
\;=\; \frac{2^{3n+m}}{|\{(x,y,z,t)\in(\mathbb{F}_2^n)^4;\ x+y+z+t=0_n,\ F(x)+F(y)+F(z)+F(t)=0_m\}|}
\;=\; \frac{2^{3n+m}}{|\{(x,y,z)\in(\mathbb{F}_2^n)^3;\ F(x)+F(y)+F(z)+F(x+y+z)=0_m\}|}.
\]
In the case of an APN (n, n)-function (see Definition 41, page 137), this gives

\[
SC(F) \;\ge\; \frac{2^{4n}}{3\cdot 2^{2n}-2^{n+1}} \;\approx\; \frac{2^{2n}}{3}.
\]

A similar method, treating the case $v=0_m$ apart, gives $SC(F)\ge 1+(2^n-1)\,2^{n-1}\approx 2^{2n-1}$.
Nonhomomorphicity For every even integer k such that $4\le k\le 2^n$, the k-th order
nonhomomorphicity [1171] of a Boolean function f equals the number of k-tuples $(u_1,\dots,u_k)$
of vectors of $\mathbb{F}_2^n$ such that $u_1+\cdots+u_k=0_n$ and $f(u_1)\oplus\cdots\oplus f(u_k)=0$. It is a simple
matter to show that it equals $2^{(k-1)n-1}+2^{-n-1}\sum_{u\in\mathbb{F}_2^n} W_f^k(u)$. This parameter should be
small (but no related attack exists on stream ciphers). It is maximum and equals $2^{(k-1)n}$ if
and only if the function is affine. It is minimum and equals $2^{(k-1)n-1}+2^{\frac{nk}{2}-1}$ if and only if
the function is bent.
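The Walsh expression of the nonhomomorphicity can be compared with a direct count on toy examples. The sketch below assumes truth tables indexed by integers and uses k = 4 on two variables; the function names and the two sample functions are illustrative, not taken from [1171].

```python
from itertools import product

def dot(a, b):
    """Inner product over F_2 of two integers read as bit vectors."""
    return bin(a & b).count("1") & 1

def walsh(f, n):
    """Walsh transform of the truth table f."""
    return [sum((-1) ** (f[x] ^ dot(u, x)) for x in range(1 << n))
            for u in range(1 << n)]

def nonhomomorphicity_bruteforce(f, n, k):
    """Count k-tuples with u_1 + ... + u_k = 0 and f(u_1) + ... + f(u_k) = 0."""
    count = 0
    for us in product(range(1 << n), repeat=k - 1):
        last = 0                      # u_k is forced by the zero-sum condition
        value = 0
        for u in us:
            last ^= u
            value ^= f[u]
        value ^= f[last]
        count += (value == 0)
    return count

def nonhomomorphicity_walsh(f, n, k):
    """2^((k-1)n-1) + 2^(-n-1) * sum_u W_f(u)^k (the division is exact)."""
    return 2 ** ((k - 1) * n - 1) + sum(w ** k for w in walsh(f, n)) // 2 ** (n + 1)

bent = [0, 0, 0, 1]      # x1 x2, a bent function on 2 variables
affine = [0, 1, 0, 1]    # x1
for f in (bent, affine):
    print(nonhomomorphicity_bruteforce(f, 2, 4), nonhomomorphicity_walsh(f, 2, 4))
```

The two sample functions reach the minimum and maximum values stated above for n = 2 and k = 4.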

Conclusion of this section


As we can see, there are numerous cryptographic criteria for Boolean functions to be used
in stream ciphers. The ones that must be necessarily satisfied are balancedness, a high
algebraic degree, a high nonlinearity, a high algebraic immunity, and a good resistance to fast
algebraic attacks. It is difficult but not impossible to find functions satisfying good trade-offs
between all these criteria (see Chapter 9). Achieving additionally resiliency of a sufficient
order, which is necessary for the combiner model, is impossible because of the Siegenthaler
bound.19 Hence, the filter model is more appropriate.
We saw that, asymptotically, almost all Boolean functions (in the sense of probability
theory) have high algebraic degree, high nonlinearity, and high algebraic immunity. They
have also high algebraic thickness and low normality. The related following randomness
criteria for n-variable Boolean functions seem then appropriate:
• Algebraic degree close to n − 1 (since the number of functions of algebraic degree at
most n − 2 is negligible compared to $2^{2^n}$)
• Nonlinearity lying within the interval
$\left[\,2^{n-1}-2^{\frac{n}{2}-1}\sqrt{n}\left(\sqrt{2\ln 2}+\frac{4\ln n}{n}\right);\ 2^{n-1}-2^{\frac{n}{2}-1}\sqrt{n}\left(\sqrt{2\ln 2}-\frac{5\ln n}{n}\right)\right]$
(according to Rodier’s results; see Subsection 3.1.3)

19 But to render f 1-resilient by composing it with a linear automorphism – which preserves the other features –
we just need that there exist n linearly independent vectors at which the Walsh transform vanishes.

• Algebraic immunity at distance at most ln n from n/2 (according to Didier’s results; see
Subsection 3.1.5)
• Algebraic thickness equal to $\lambda 2^n$ with λ near 1/2

Of course, these criteria make sense asymptotically.

3.2 Cryptographic criteria for vectorial functions in stream and block ciphers
Vectorial functions can be used (in the place of Boolean functions) as combiners or filters in
stream ciphers (they then allow the PRG to generate several bits at each clock cycle, which
increases the speed of the cipher) or as S-boxes in block ciphers. These two situations are
very different, but some criteria of resistance to attacks are the same. We study them in this
section. We shall study in the two next sections those criteria and parameters that are specific
to each use.

3.2.1 Balancedness of vectorial functions


Recall that an (n, m)-function is called balanced if its output is uniformly
distributed (with m ≤ n), that is, if it takes every value of $\mathbb{F}_2^m$ the same number $2^{n-m}$
of times. By definition, F is then balanced if every Boolean function $\varphi_b = 1_{\{b\}}\circ F$ has
Hamming weight $2^{n-m}$. A vectorial function used as a combiner or as a filter needs to
be balanced: the attacker can form any combination of its output bits, and no such
combination should give statistical information allowing one to distinguish when a pair of texts
is a pair (plaintext, ciphertext).
It is also preferable for S-boxes in block ciphers to be balanced. In every SPN (see Subsection 1.4.2),
the S-boxes need to be permutations (with m = n) and are then balanced. In Feistel ciphers,
we have seen that the S-boxes do not need to be balanced, but it has been shown, for
instance in [957], that when they are unbalanced, an attack may be possible; to withstand
it, the designer needs to make the encryption algorithm more complex, for instance with expansion
boxes. Hence, balanced S-boxes are preferred.

Characterization through the component functions


The balanced S-boxes (and among them, the permutations) can be nicely characterized by
the balancedness of their component functions:

Proposition 35 [775] An (n, m)-function F is balanced if and only if its component
functions $v\cdot F$, $v\ne 0_m$, are all balanced, that is, if and only if, for every nonzero $v\in\mathbb{F}_2^m$,
we have $W_F(0_n,v)=0$.

Proof The relation

\[
\sum_{v\in\mathbb{F}_2^m} (-1)^{v\cdot(F(x)+b)} = \begin{cases} 2^m & \text{if } F(x)=b\\ 0 & \text{otherwise}\end{cases} \;=\; 2^m\varphi_b(x), \tag{3.16}
\]

is valid for every (n, m)-function F, every $x\in\mathbb{F}_2^n$, and every $b\in\mathbb{F}_2^m$, since the function
$v\mapsto v\cdot(F(x)+b)$, being linear, is either balanced or null. Thus, we have

\[
\sum_{x\in\mathbb{F}_2^n;\,v\in\mathbb{F}_2^m} (-1)^{v\cdot(F(x)+b)} = 2^m\,|F^{-1}(b)| = 2^m\,w_H(\varphi_b), \tag{3.17}
\]

where $w_H$ denotes the Hamming weight. Hence, the Fourier–Hadamard transform of the
function $v\mapsto\sum_{x\in\mathbb{F}_2^n}(-1)^{v\cdot F(x)}$ equals the function $b\mapsto 2^m|F^{-1}(b)|$. We know that a
pseudo-Boolean function has constant Fourier–Hadamard transform if and only if it is null
at every nonzero vector. We deduce that F is balanced if and only if the function
$v\mapsto\sum_{x\in\mathbb{F}_2^n}(-1)^{v\cdot F(x)}$ is null on $\mathbb{F}_2^m\setminus\{0_m\}$.
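The characterization of Proposition 35 is easy to test exhaustively on tiny parameters. The following sketch assumes that an (n, m)-function is given as a lookup table of integers in $[0, 2^m)$; the helper names are hypothetical.

```python
from itertools import product

def dot(a, b):
    """Inner product over F_2 of two integers read as bit vectors."""
    return bin(a & b).count("1") & 1

def walsh(F, n, u, v):
    """W_F(u, v) = sum_x (-1)^(v.F(x) XOR u.x) for a lookup table F."""
    return sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n))

def is_balanced(F, n, m):
    """Direct definition: every output value is taken 2^(n-m) times."""
    counts = [0] * (1 << m)
    for y in F:
        counts[y] += 1
    return all(c == 1 << (n - m) for c in counts)

def components_balanced(F, n, m):
    """Criterion of Proposition 35: W_F(0_n, v) = 0 for every nonzero v."""
    return all(walsh(F, n, 0, v) == 0 for v in range(1, 1 << m))

# Exhaustive comparison over all 256 (2, 2)-functions.
agree = all(is_balanced(F, 2, 2) == components_balanced(F, 2, 2)
            for F in product(range(4), repeat=4))
print(agree)
```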
2


Equivalently, F is balanced if and only if $\sum_{a\in\mathbb{F}_2^n}\mathcal{F}(D_a(v\cdot F))=0$ for every $v\ne 0_m$
(according to Wiener–Khintchine’s formula (2.53), page 62). Note that, for m = n, F is
a permutation if and only if $\sum_{v\in\mathbb{F}_2^n}\mathcal{F}(D_a(v\cdot F))=0$ for every $a\ne 0_n$ (since
$\sum_{v\in\mathbb{F}_2^m}\mathcal{F}(v\cdot G)=2^m|G^{-1}(0_m)|$ for every (n, m)-function G).
If F is balanced, then the $f_i$ ($1\le i\le m$) being balanced, we have $d_{alg}(F)\le n-1$. Much
more can be said, in particular for permutations: F is a permutation if and only if the product
of strictly less than n coordinate functions of F has even Hamming weight, that is, algebraic
degree strictly less than n, and the product of all n coordinate functions has algebraic
degree n. The condition is clearly necessary, and it is easily seen that it is sufficient (since
“$|F^{-1}(a)|$ is odd for every $a\in\mathbb{F}_2^n$” implies F bijective). Note that the relation between this
characterization and Proposition 35 is given by Relations (2.25) and (2.26).
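There is a nice property of the Walsh transform of permutations:

\[
\forall v\ne w,\quad \sum_{u\in\mathbb{F}_2^n} W_F(u,v)\,W_F(u,w)=0. \tag{3.18}
\]

Indeed, we have

\[
\sum_{u\in\mathbb{F}_2^n} W_F(u,v)\,W_F(u,w) = \sum_{u,x,y\in\mathbb{F}_2^n} (-1)^{u\cdot(x+y)\oplus v\cdot F(x)\oplus w\cdot F(y)} = 2^n\sum_{x\in\mathbb{F}_2^n} (-1)^{(v+w)\cdot F(x)},
\]

which is null for $v\ne w$ since the component function $(v+w)\cdot F$ is then balanced. Note that for v = w,
the sum in (3.18) equals $2^{2n}$ (this is Parseval’s
relation on the Boolean function $v\cdot F$). Of course, Relation (3.18) can also be applied to
$F^{-1}$ and since

\[
W_{F^{-1}}(u,v) = W_F(v,u),
\]

we obtain

\[
\forall v\ne w,\quad \sum_{u\in\mathbb{F}_2^n} W_F(v,u)\,W_F(w,u)=0.
\]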

Imbalance of an (n,m)-function
A natural way of quantifying the fact that some (n, m)-function F is unbalanced is by the
variance of the random variable $b\mapsto|F^{-1}(b)|$, where $|F^{-1}(b)|$ denotes the size of the
preimage of b by F. In [267], the variance is multiplied by $2^m$ to give the following integer-valued
parameter,20 that we shall call the imbalance of F:

\[
Nb_F = \sum_{b\in\mathbb{F}_2^m}\left(|F^{-1}(b)|-2^{n-m}\right)^2 = \sum_{b\in\mathbb{F}_2^m}|F^{-1}(b)|^2-2^{2n-m}. \tag{3.19}
\]

It has the following properties:

• $Nb_F\ge 0$, for every vectorial function F, and $Nb_F=0$ if and only if F is balanced.
• $Nb_F$ is invariant under composition of F by permutations (on the right and on the left);
in particular, it is affine invariant.
• $Nb_F=|\{(x,y)\in(\mathbb{F}_2^n)^2;\ F(x)=F(y)\}|-2^{2n-m}\le 2^{2n}-2^{2n-m}$ and $Nb_F=2^{2n}-2^{2n-m}$
if and only if F is constant.
• $Nb_F=\sum_{a\in\mathbb{F}_2^n}|(D_aF)^{-1}(0_m)|-2^{2n-m}$.
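Parameter $Nb_F$ can be expressed by means of the Walsh transform. We have

\[
\sum_{v\in\mathbb{F}_2^m} W_F^2(0_n,v) = \sum_{x,y\in\mathbb{F}_2^n}\left(\sum_{v\in\mathbb{F}_2^m}(-1)^{v\cdot(F(x)+F(y))}\right)
= 2^m\,|\{(x,y)\in(\mathbb{F}_2^n)^2\ |\ F(x)=F(y)\}| = 2^m\left(Nb_F+2^{2n-m}\right).
\]

Hence:

\[
Nb_F = 2^{-m}\sum_{v\in\mathbb{F}_2^m,\ v\ne 0_m} W_F^2(0_n,v). \tag{3.20}
\]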
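The three expressions of the imbalance (preimage sizes, zeros of the derivatives, and Walsh values at $u=0_n$) can be compared on a small lookup table. This is a minimal sketch assuming the same integer encoding of vectors as bit strings; the table G is an arbitrary illustrative example.

```python
def dot(a, b):
    """Inner product over F_2 of two integers read as bit vectors."""
    return bin(a & b).count("1") & 1

def imbalance_preimages(F, n, m):
    """Relation (3.19): sum_b |F^{-1}(b)|^2 - 2^(2n-m)."""
    counts = [0] * (1 << m)
    for y in F:
        counts[y] += 1
    return sum(c * c for c in counts) - 2 ** (2 * n - m)

def imbalance_derivatives(F, n, m):
    """sum_a |(D_a F)^{-1}(0_m)| - 2^(2n-m), with D_a F(x) = F(x) + F(x + a)."""
    zeros = sum(1 for a in range(1 << n) for x in range(1 << n)
                if F[x] == F[x ^ a])
    return zeros - 2 ** (2 * n - m)

def imbalance_walsh(F, n, m):
    """Relation (3.20): 2^(-m) * sum over nonzero v of W_F(0_n, v)^2."""
    total = 0
    for v in range(1, 1 << m):
        w = sum((-1) ** dot(v, F[x]) for x in range(1 << n))
        total += w * w
    return total // 2 ** m

G = [0, 0, 0, 1, 1, 2, 2, 3]   # hypothetical unbalanced (3, 2)-function
print(imbalance_preimages(G, 3, 2), imbalance_derivatives(G, 3, 2),
      imbalance_walsh(G, 3, 2))
```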

3.2.2 Algebraic degree of vectorial functions


The algebraic degree of vectorial functions has been defined at page 39. The output of the
function used in a stream cipher being also the output of the PRG, the output bits can be
combined and used in a Berlekamp–Massey attack. The algebraic degree is then an important
parameter.
In block ciphers, the algebraic degree is a security parameter against structural attacks,
such as integral [709], higher-order differential, cube [465], or, recently, attacks based on the
division property21 [1086] (see also the two first sections of [106] and the references therein).
In particular, the higher-order differential attack [706, 735] (see also [204]) exploits the fact
that the algebraic degree of the S-box F is low, or more generally that there exists a low-dimensional
vector subspace V of $\mathbb{F}_2^n$ such that the function $D_VF(x)=\sum_{v\in V}F(x+v)$
(i.e., $D_{a_1}\cdots D_{a_k}F(x)$ where $\{a_1,\dots,a_k\}$ is a basis of V) is constant. A probabilistic version
of this attack [638] allows the derivative not to be constant, and the S-box must then have
high higher-order nonlinearity (a notion defined for Boolean functions in Definition 20,

20 The framework of [267] is functions from Abelian groups to Abelian groups; we stick here to Boolean
functions.
21 A very elementary notion, from a viewpoint of Boolean functions, whose properties given in diverse papers
are in fact well-known properties of Reed–Muller codes.

page 83; for vectorial functions, see page 349 in Subsection 9.2.4). Stricto sensu, the higher-
order differential attack has been proved efficient for quadratic functions only. But since
cryptographers like to have some security margin, even cubic functions may be viewed as
weak (unless, as usual in cryptography, some precautions are taken with the global cipher).
Quadratic S-boxes, if used, need care. It is observed in [204, 108, 106] (see page 64 and
below) that the algebraic degree of the function resulting from the first rounds of the cipher
may increase less than expected.
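The property exploited by the higher-order differential attack can be observed directly: for a function of algebraic degree d, the derivative $D_VF$ is constant as soon as dim V ≥ d. The sketch below assumes a lookup-table representation and uses a hypothetical quadratic (4, 4)-function and a cubic Boolean function chosen only for illustration.

```python
def derivative_over_subspace(F, n, basis):
    """Table of D_V F(x) = XOR of F(x + v) over v in V, V spanned by `basis`
    (the basis vectors are assumed linearly independent)."""
    span = [0]
    for a in basis:
        span += [s ^ a for s in span]
    table = []
    for x in range(1 << n):
        acc = 0
        for v in span:
            acc ^= F[x ^ v]
        table.append(acc)
    return table

n = 4
quadratic = []
for x in range(1 << n):
    x1, x2, x3, x4 = [(x >> i) & 1 for i in range(n)]
    # Four coordinate functions of algebraic degree 2 (illustrative choice).
    y = (x1 & x2 ^ x3) | (x2 & x3 ^ x4) << 1 | (x1 & x4) << 2 | (x3 & x4 ^ x1) << 3
    quadratic.append(y)

# A cubic Boolean function x1 x2 x3, viewed as a (4, 1)-function.
cubic = [(x & 1) & (x >> 1 & 1) & (x >> 2 & 1) for x in range(1 << n)]

print(set(derivative_over_subspace(quadratic, n, [1, 2])))      # one value only
print(set(derivative_over_subspace(quadratic, n, [1, 2, 4])))   # identically zero
print(set(derivative_over_subspace(cubic, n, [1, 2])))          # not constant
```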
The algebraic degree of the computational inverse of a permutation also plays a role
in the algebraic degree of the iterated rounds implementing it. This is shown in [106]
by proving that $d_{alg}(G\circ F)\le n-\left\lfloor\frac{n-1-d_{alg}(G)}{d_{alg}(F^{-1})}\right\rfloor$ for every (n, n)-permutation F and
every (n, r)-function G. We do not recall the proof given in [106] for this bound, since
as seen in [254] we have directly from Relation (2.12), page 41, the slightly stronger bound
$d_{alg}(G\circ F)\le n-\left\lceil\frac{n-d_{alg}(G)}{d_{alg}(F^{-1})}\right\rceil$, implied by
$n=d_{alg}\big((g_k\oplus 1)\prod_{i\in I^c}(f_i\oplus 1)\big)\le d_{alg}(G)+(n-|I|)\,d_{alg}(F^{-1})$.
And $d_{alg}(G\circ F)$ is bounded above by $\max\{t;\ d_{G,F^{-1}}(n-t)=n\}$, where
$d_{G,F^{-1}}(n-t)$ equals the maximal numerical degree of the linear combinations in $\mathcal{BF}_n$ of at
most one coordinate function of G and at most n − t coordinate functions of $F^{-1}$ (or more
precisely of the parts of the NNFs of these functions that are not divisible by $2^{n-t}$). Indeed,
in the framework of Relation (2.12) again, we have
$n=d_{alg}\big((g_k\oplus 1)\prod_{i\in I^c}(f_i\oplus 1)\big)\le d_{num}\big((g_k\oplus 1)\prod_{i\in I^c}(f_i\oplus 1)\big)\le d_{G,F^{-1}}(n-|I|)$,
the latter inequality being due to Relation
(2.26), page 50.

Remark. It is an open problem to know whether those high algebraic degree functions that
are CCZ equivalent to low algebraic degree functions could be attacked by a modification
of the higher-order differential attack. Thus, it is not clear whether the designer should also
avoid functions CCZ equivalent to quadratic functions.

3.2.3 Nonlinearity of vectorial functions


In stream ciphers, since the output bits can be combined by the attacker, the nonlinearity of
all component functions must be large, and the minimum of these nonlinearities, called the
nonlinearity of the vectorial function, is then a parameter related to the resistance to the fast
correlation attack [843]. But nonlinear combinations of the output bits can also be used by
the attacker, and this will lead in Subsection 3.3.2 to the introduction of a parameter more
adapted to this framework.
In block ciphers, the linear attack, introduced by Matsui [829], is based on an idea
from [1084]. It may have been unknown by the National Security Agency (NSA) at the
time it was introduced; this could explain why it works better22 than the differential attack
on the DES. It seems that it was known or partly known from the USSR. It is, with the
differential attack that we shall describe at page 134, one of the two most powerful general-
purpose cryptographic attacks known to date. Its most common version is an attack on
the reduced cipher, that is, the cipher obtained from the original one by removing its

22 The differential attack needs $2^{47}$ pairs (plaintext, ciphertext) while the linear attack needs “only” $2^{43}$ pairs.
[Figure 3.1 Last round attacks. The plaintext Y(0) is processed by r rounds F with round keys $k_1, k_2, \dots, k_{r-1}, k_r$, giving Y(r − 1) after round r − 1 and the ciphertext Y(r). The bias in the distribution of (Y(0), Y(r − 1)) is compared with the values $Y(r-1)=F^{-1}(Y(r),k_r)$ computed for all values of $k_r$; the comparison gives the value of $k_r$.]

last round23 (or more generally an attack on a round whose inputs and outputs can be
computed from the plaintext and ciphertext and a number of key bits hopefully “small”).
We describe the principle of the attack in the case it is applied to the reduced cipher. In
Figure 3.1, Y (r − 1) denotes the output of the reduced cipher corresponding to a plaintext
Y (0), and Y (r) denotes the ciphertext. Assume that it is possible to distinguish the outputs
of the reduced cipher from random outputs, by observing some statistical bias in their value
distribution. The existence of such a distinguisher allows recovering the key used in the last
round, either by an exhaustive search, which is efficient if this key is shorter than the master
key, or by using specificities of the cipher allowing replacing the exhaustive search by, for
instance, solving algebraic equations.
We describe now the attack in the case of exhaustive search, which is simpler to describe.
The attacker, who knows a number of pairs (plaintext, ciphertext) of the (complete) cipher,
visits all possible last round keys. For each try, he/she applies to all the ciphertexts in
these pairs the inverse of what is the last round when the key corresponds to the try (this is
possible since all except the key is supposed known to him/her; if not, say, if some parameter
is unknown, the attacker will have to try all possibilities). He/she obtains in the case of the
correct key guess the output of the reduced cipher and has then a number of pairs (plaintext,
ciphertext) of the reduced cipher, on which he/she can observe the statistical bias. In all
the other cases (incorrect guesses), the obtained pairs (plaintext, ciphertext) correspond to a

23 The output of the reduced cipher is unknown if the last round key is unknown, but it is convenient to name this
reduced cipher for describing the attack.

cipher equal to the original cipher with an additional round whose round key is random, and
the pairs are then assumed random, with no observable bias. Such assumption is verified in
practice. The number of pairs (m, c) that are known to the attacker needs then to be large
enough to distinguish the bias (the smaller the bias, the larger the number of known pairs
needed).
For distinguishing pairs (plaintext, ciphertext) of the reduced cipher, the linear attack
uses triples (α, β, γ ) of binary strings such that, a block m of plaintext and a key k being
randomly chosen, the bit α · m ⊕ β · c ⊕ γ · k, where “·” denotes the usual inner product
(between two strings of the same length) and c denotes the (reduced) ciphertext related
to m, has a probability different from 1/2 of being null. The more distant from 1/2 the
probability, the more efficient the attack. Note that when searching for triples (α, β, γ ),
both m and k are supposed ranging uniformly over their definition spaces (indeed, the
plaintext can be any binary string of a given length, and the round key can be as well
any string of a given length), while during the attack, m still ranges uniformly, but k
is fixed.
The related criterion on any S-box F used in the cipher for allowing resistance to the
attack is that the component functions $v\cdot F$, $v\ne 0_m$, be at Hamming distance to affine
Boolean functions $u\cdot x\oplus\epsilon$ as close to $2^{n-1}$ as possible. In other words, the nonlinearities of
all these component functions must be as large as possible. The generalization to vectorial
functions of the notion of nonlinearity, introduced by Nyberg [907] and studied by Chabaud
and Vaudenay [341], is then as follows:

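Definition 29 The nonlinearity of an (n, m)-function F is the minimum nonlinearity of its
component functions:

\[
nl(F) = 2^{n-1}-\frac{1}{2}\max_{\substack{v\in\mathbb{F}_2^m\setminus\{0_m\}\\ u\in\mathbb{F}_2^n}} |W_F(u,v)|;\qquad W_F(u,v)=\sum_{x\in\mathbb{F}_2^n}(-1)^{v\cdot F(x)\oplus u\cdot x}. \tag{3.21}
\]

Note that “$\max_{v\in\mathbb{F}_2^m\setminus\{0_m\};\ u\in\mathbb{F}_2^n}$” can be replaced by “$\max_{(u,v)\in\mathbb{F}_2^n\times\mathbb{F}_2^m;\ (u,v)\ne(0_n,0_m)}$”, since we have
$\sum_{x\in\mathbb{F}_2^n}(-1)^{u\cdot x}=0$ for every nonzero u.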
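Relation (3.21) translates directly into a brute-force computation for small S-boxes. The sketch below is a straightforward (non-optimized) evaluation of the definition; the 4-bit table used as an example is the S-box of the PRESENT block cipher, which is an outside illustration and is not discussed in this section.

```python
def dot(a, b):
    """Inner product over F_2 of two integers read as bit vectors."""
    return bin(a & b).count("1") & 1

def walsh(F, n, u, v):
    """W_F(u, v) = sum_x (-1)^(v.F(x) XOR u.x) for a lookup table F."""
    return sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n))

def nonlinearity(F, n, m):
    """nl(F) = 2^(n-1) - (1/2) max |W_F(u, v)| over v != 0_m and all u."""
    max_abs = max(abs(walsh(F, n, u, v))
                  for v in range(1, 1 << m) for u in range(1 << n))
    return 2 ** (n - 1) - max_abs // 2

# Illustrative 4-bit permutation (PRESENT S-box), an assumption of this example.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
IDENTITY = list(range(16))

print(nonlinearity(SBOX, 4, 4), nonlinearity(IDENTITY, 4, 4))
```

A fast Walsh–Hadamard transform would replace the inner sum for larger n; the direct sum is kept here because it mirrors the definition.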
Nonlinearity is an EA invariant (see Definition 5, page 28), that is, it does not change when
we compose the function by affine automorphisms nor when we add an affine function to it
(this implies for instance that if A is a surjective affine function from $\mathbb{F}_2^r$ into $\mathbb{F}_2^n$, then
$nl(F\circ A)=2^{r-n}\,nl(F)$, since by affine invariance, we can assume without loss of generality that
A is a projection and the equality is then easily shown). Nonlinearity is more strongly a CCZ
invariant. Indeed, in Relation (3.21), $W_F(u,v)$ equals the Fourier–Hadamard transform of
(the indicator of) the graph $\{(x,F(x)),\ x\in\mathbb{F}_2^n\}$ of F, and $\max_{v\in\mathbb{F}_2^m\setminus\{0_m\},\ u\in\mathbb{F}_2^n}|W_F(u,v)|$ is then invariant under
affine transformation of this graph.
S. Dib has shown in [436] that for 0 < β < 1/4 and m ≤ n, when n tends to
infinity, the nonlinearity of almost all (n, m)-functions (in terms of probability) is bounded
above by $2^{n-1}-2^{\frac{n-1}{2}}\sqrt{(n+m)\log 2\,(1-\beta)}$, and that for β > 0, when n + m tends
to infinity, the nonlinearity of almost all (n, m)-functions is bounded below by
$2^{n-1}-2^{\frac{n-1}{2}}\sqrt{(n+m)\log 2\,(1+\beta)}$.
The covering radius bound $2^{n-1}-2^{\frac{n}{2}-1}$ (see page 80) on the nonlinearity of any
n-variable Boolean function is obviously valid for (n, m)-functions. Naturally, this has led
researchers to extend the notion of bentness to vectorial functions:

Definition 30 Given two integers n and m (with n necessarily even), an (n, m)-function F
is called bent if its nonlinearity nl(F) achieves the optimum $2^{n-1}-2^{n/2-1}$.

We shall see with Proposition 104, page 269, that bent (n, m)-functions do not exist if
m > n/2. This has led to asking whether better upper bounds than the covering radius bound
could be proved in this case. Such a bound has been found by Chabaud and Vaudenay in [341].
In fact, a bound on sequences due to Sidelnikov [1040] is equivalent for power functions to
the bound obtained by Chabaud and Vaudenay, and its proof is valid for all functions. This
is why the bound is now called the Sidelnikov–Chabaud–Vaudenay (SCV) bound:

Theorem 6 Let n and m be any positive integers such that m ≥ n − 1. Let F be any
(n, m)-function. Then:

\[
nl(F)\le 2^{n-1}-\frac{1}{2}\sqrt{3\times 2^n-2-2\,\frac{(2^n-1)(2^{n-1}-1)}{2^m-1}}.
\]

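Proof Recall that $nl(F)=2^{n-1}-\frac{1}{2}\max_{v\in\mathbb{F}_2^m\setminus\{0_m\};\ u\in\mathbb{F}_2^n}|W_F(u,v)|$. We have

\[
\max_{\substack{v\in\mathbb{F}_2^m\setminus\{0_m\}\\ u\in\mathbb{F}_2^n}} W_F^2(u,v)\;\ge\;\frac{\sum_{v\in\mathbb{F}_2^m\setminus\{0_m\},\ u\in\mathbb{F}_2^n} W_F^4(u,v)}{\sum_{v\in\mathbb{F}_2^m\setminus\{0_m\},\ u\in\mathbb{F}_2^n} W_F^2(u,v)}. \tag{3.22}
\]

Parseval’s relation states that, for every $v\in\mathbb{F}_2^m$:

\[
\sum_{u\in\mathbb{F}_2^n} W_F^2(u,v)=2^{2n}. \tag{3.23}
\]

Using that any character sum $\sum_{x\in E}(-1)^{\ell(x)}$ associated to a linear function ℓ over any
$\mathbb{F}_2$-vector space E is nonzero if and only if ℓ is null on E, we have

\begin{align*}
\sum_{v\in\mathbb{F}_2^m,\ u\in\mathbb{F}_2^n} W_F^4(u,v)
&= \sum_{x,y,z,t\in\mathbb{F}_2^n}\left[\sum_{v\in\mathbb{F}_2^m}(-1)^{v\cdot(F(x)+F(y)+F(z)+F(t))}\right]\left[\sum_{u\in\mathbb{F}_2^n}(-1)^{u\cdot(x+y+z+t)}\right]\\
&= 2^{n+m}\left|\left\{(x,y,z,t)\in\mathbb{F}_2^{4n};\ \begin{array}{l} x+y+z+t=0_n\\ F(x)+F(y)+F(z)+F(t)=0_m\end{array}\right\}\right|\\
&= 2^{n+m}\,|\{(x,y,z)\in\mathbb{F}_2^{3n};\ F(x)+F(y)+F(z)+F(x+y+z)=0_m\}| \tag{3.24}\\
&\ge 2^{n+m}\,|\{(x,y,z)\in\mathbb{F}_2^{3n};\ x=y \text{ or } x=z \text{ or } y=z\}|. \tag{3.25}
\end{align*}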

Clearly, $|\{(x,y,z);\ x=y \text{ or } x=z \text{ or } y=z\}|$ equals

\[
3\cdot|\{(x,x,y);\ x,y\in\mathbb{F}_2^n\}|-2\cdot|\{(x,x,x);\ x\in\mathbb{F}_2^n\}| = 3\cdot 2^{2n}-2\cdot 2^n.
\]

Hence, according to Relation (3.22),

\[
\max_{\substack{v\in\mathbb{F}_2^m\setminus\{0_m\}\\ u\in\mathbb{F}_2^n}} W_F^2(u,v)\;\ge\;\frac{2^{n+m}(3\cdot 2^{2n}-2\cdot 2^n)-2^{4n}}{(2^m-1)\,2^{2n}}
= 3\times 2^n-2-2\,\frac{(2^n-1)(2^{n-1}-1)}{2^m-1}
\]

and this gives the desired bound, according to Relation (3.21), page 117.

The condition m ≥ n − 1 is assumed in Theorem 6 to make nonnegative the expression


located under the square root. Note that for m = n − 1, this bound coincides with the
covering radius bound. For m ≥ n, it strictly improves upon it. For m > n, the square root
in it cannot be an integer (see [341]). Hence, the SCV bound can be tight only if n = m with
n odd, in which case it states
\[
nl(F)\le 2^{n-1}-2^{\frac{n-1}{2}}. \tag{3.26}
\]

We shall see that, under this condition, it is actually tight.

Definition 31 [341] The (n, n)-functions F that achieve (3.26) with equality are called
almost bent (AB).

Remark. The term of almost bent is a little misleading. It gives the feeling that these
functions are not optimal. But they are, according to Theorem 6. Proposition 104, page 269,
will give the values of n and m such that bent (n, m)-functions exist.

According to Inequality (3.22), page 118, the AB functions are those (n, n)-functions such
that, for every $u,v\in\mathbb{F}_2^n$, $v\ne 0_n$, the sum $\sum_{x\in\mathbb{F}_2^n}(-1)^{v\cdot F(x)\oplus u\cdot x}=W_F(u,v)$ equals 0 or
$\pm 2^{\frac{n+1}{2}}$ (indeed, the maximum of a sequence of nonnegative and not all null integers equals
the ratio of the sum of their squares over the sum of their values if and only if these integers
take one nonzero value exactly). We shall see at page 262 that this is equivalent to saying
that all component functions are near-bent. Note that this condition does not depend on the
choice of the inner product.
We shall see that AB functions exist for every odd n ≥ 3. Function $F(x)=x^3$, $x\in\mathbb{F}_{2^n}$,
is the simplest one. Chapter 11 is devoted to them.
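The AB property of the cube function can be observed numerically for small odd n. The sketch below assumes that $\mathbb{F}_{2^n}$ is represented by polynomials over $\mathbb{F}_2$ modulo an irreducible polynomial ($x^3+x+1$ for n = 3 and $x^5+x^2+1$ for n = 5, both chosen here for illustration) and uses the usual inner product on coordinate vectors, which is legitimate since the condition does not depend on the inner product.

```python
def dot(a, b):
    """Inner product over F_2 of two integers read as bit vectors."""
    return bin(a & b).count("1") & 1

def gf_mul(a, b, n, poly):
    """Multiply a and b in F_2[x] / (poly), with deg(poly) = n."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:          # reduce as soon as the degree reaches n
            a ^= poly
    return r

def cube_table(n, poly):
    """Lookup table of x -> x^3 over F_{2^n}."""
    return [gf_mul(gf_mul(x, x, n, poly), x, n, poly) for x in range(1 << n)]

def walsh_abs_values(F, n):
    """Set of |W_F(u, v)| over all u and all nonzero v."""
    return {abs(sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n)))
            for v in range(1, 1 << n) for u in range(1 << n)}

for n, poly in ((3, 0b1011), (5, 0b100101)):
    values = walsh_abs_values(cube_table(n, poly), n)
    nl = 2 ** (n - 1) - max(values) // 2
    print(n, sorted(values), nl, 2 ** (n - 1) - 2 ** ((n - 1) // 2))
```

For both values of n the absolute Walsh values are 0 and $2^{(n+1)/2}$ only, and the computed nonlinearity coincides with the right-hand side of (3.26).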

Bounds on nonlinearity by means of imbalance


We follow [239] in this subsection. A bound is given on the nonlinearity of (n, m)-functions,
by means of their imbalance (see the definition at page 114):

Proposition 36 Let F be any (n, m)-function. The nonlinearity of F satisfies

\[
nl(F)\le 2^{n-1}-\frac{1}{2}\sqrt{\frac{2^m}{2^m-1}\,Nb_F}.
\]

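Proof We have, using Relation (3.20), page 114:

\[
\max_{\substack{v\in\mathbb{F}_2^m,\ v\ne 0_m\\ u\in\mathbb{F}_2^n}} W_F^2(u,v)\;\ge\;\max_{v\in\mathbb{F}_2^m,\ v\ne 0_m} W_F^2(0_n,v)\;\ge\;\frac{1}{2^m-1}\sum_{v\in\mathbb{F}_2^m,\ v\ne 0_m} W_F^2(0_n,v)
=\frac{2^m}{2^m-1}\,Nb_F.
\]

Relation (3.21), page 117, completes the proof.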

This bound shows that, to have a chance of having a high nonlinearity, a function must
not differ too much from a balanced function.
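Proposition 36 can be verified exhaustively for tiny parameters. The sketch below assumes lookup tables for (n, m)-functions and compares squares, so that everything stays in integer arithmetic; all names are hypothetical.

```python
from itertools import product

def dot(a, b):
    """Inner product over F_2 of two integers read as bit vectors."""
    return bin(a & b).count("1") & 1

def max_abs_walsh(F, n, m):
    """max |W_F(u, v)| over all u and all nonzero v."""
    return max(abs(sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n)))
               for v in range(1, 1 << m) for u in range(1 << n))

def imbalance(F, n, m):
    """Nb_F = sum_b |F^{-1}(b)|^2 - 2^(2n-m)."""
    counts = [0] * (1 << m)
    for y in F:
        counts[y] += 1
    return sum(c * c for c in counts) - 2 ** (2 * n - m)

def satisfies_prop36(F, n, m):
    # nl(F) <= 2^(n-1) - (1/2) sqrt(2^m Nb_F / (2^m - 1)) is equivalent to
    # (2^m - 1) * max|W_F|^2 >= 2^m * Nb_F, which avoids square roots.
    return (2 ** m - 1) * max_abs_walsh(F, n, m) ** 2 >= 2 ** m * imbalance(F, n, m)

all_hold = all(satisfies_prop36(F, 2, 2) for F in product(range(4), repeat=4))
print(all_hold)
```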
The bound of Proposition 36 is tight (it is achieved with equality, for instance, by bent
functions, since both inequalities above are equalities in that case). Moreover, it can be
applied to F + L (which has the same nonlinearity as F) for every linear (n, m)-function L.
Note that we have in general $Nb_{F+L}\ne Nb_F$. Proposition 36 implies, denoting by $\mathcal{L}_{n,m}$ the
set of linear (n, m)-functions,

\[
nl(F)\le 2^{n-1}-\frac{1}{2}\sqrt{\frac{2^m}{2^m-1}\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}}, \tag{3.27}
\]

which is obviously tight too.

Remark. We have $v\cdot L(x)=L^*(v)\cdot x$ where $L^*$ is the adjoint operator of L.
Hence $\max_{v\in\mathbb{F}_2^m,\ v\ne 0_m;\ u\in\mathbb{F}_2^n} W_F^2(u,v)=\max_{v\in\mathbb{F}_2^m,\ v\ne 0_m;\ L\in\mathcal{L}_{n,m}} W_{F+L}^2(0_n,v)$.

Relation (3.27) raises the question of determining the mean of NbF +L :

Proposition 37 [239] Let F be any (n, m)-function. The mean of the random variable
$L\in\mathcal{L}_{n,m}\mapsto Nb_{F+L}$ equals $2^n-2^{n-m}$. We have $\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}\ge 2^n-2^{n-m}$, with
equality if and only if F is bent.

Proof For every $L\in\mathcal{L}_{n,m}$, we have

\begin{align*}
Nb_{F+L} &= \sum_{a\in\mathbb{F}_2^n}|(D_a(F+L))^{-1}(0_m)|-2^{2n-m}\\
&= \sum_{a\in\mathbb{F}_2^n}|(D_aF)^{-1}(L(a))|-2^{2n-m}. \tag{3.28}
\end{align*}

The size of $\mathcal{L}_{n,m}$ equals $2^{mn}$. Given any nonzero element a of $\mathbb{F}_2^n$ and any element b of
$\mathbb{F}_2^m$, the number of linear functions L such that L(a) = b equals $2^{m(n-1)}$. We have then the
following, distinguishing the case $a=0_n$ from the others:

\begin{align*}
\sum_{L\in\mathcal{L}_{n,m}}\ \sum_{a\in\mathbb{F}_2^n}|(D_aF)^{-1}(L(a))| &= 2^{mn}\,2^n+2^{m(n-1)}\sum_{\substack{a\in\mathbb{F}_2^n\\ a\ne 0_n}}\ \sum_{b\in\mathbb{F}_2^m}|(D_aF)^{-1}(b)|\\
&= 2^{(m+1)n}+2^{m(n-1)}(2^n-1)\,2^n.
\end{align*}

The mean $\frac{1}{|\mathcal{L}_{n,m}|}\sum_{L\in\mathcal{L}_{n,m}}\sum_{a\in\mathbb{F}_2^n}|(D_aF)^{-1}(L(a))|$ equals $2^n+2^{n-m}(2^n-1)=2^{2n-m}+2^n-2^{n-m}$.
This proves the first assertion. The second is then straightforward, and the case
of equality is when the function $(a,b)\in(\mathbb{F}_2^n\setminus\{0_n\})\times\mathbb{F}_2^m\mapsto|(D_aF)^{-1}(b)|$ is constant. We
shall see in Section 6.4 that this is characteristic of bent functions.
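The mean value of Proposition 37 can be reproduced by enumerating all $2^{mn}$ linear functions when n and m are tiny. This is a minimal sketch: a linear function is assumed to be encoded by the images of the n canonical basis vectors, and the sample functions are arbitrary illustrations.

```python
from fractions import Fraction
from itertools import product

def imbalance(F, n, m):
    """Nb_F = sum_b |F^{-1}(b)|^2 - 2^(2n-m)."""
    counts = [0] * (1 << m)
    for y in F:
        counts[y] += 1
    return sum(c * c for c in counts) - 2 ** (2 * n - m)

def linear_functions(n, m):
    """Yield the lookup tables of all linear (n, m)-functions."""
    for images in product(range(1 << m), repeat=n):
        table = []
        for x in range(1 << n):
            y = 0
            for i in range(n):
                if (x >> i) & 1:
                    y ^= images[i]
            table.append(y)
        yield table

def mean_and_max_imbalance(F, n, m):
    """Mean and maximum of Nb_{F+L} when L ranges over the linear functions."""
    values = [imbalance([a ^ b for a, b in zip(F, L)], n, m)
              for L in linear_functions(n, m)]
    return Fraction(sum(values), len(values)), max(values)

G = [0, 0, 0, 1, 1, 2, 2, 3]          # hypothetical (3, 2)-function
print(mean_and_max_imbalance(G, 3, 2))          # mean should be 2^3 - 2^1
print(mean_and_max_imbalance([0, 0, 0, 1], 2, 1))  # bent case: mean equals max
```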

Remark. The definition of nonlinearity given in Definition 29, page 117, is related to
Matsui’s linear attack [829], but the term of nonlinearity can also evoke the behavior of
the functions F + L, where L is any linear (n, m)-function, which could lead to other
“nonlinearity” notions. We see with Proposition 37 that bent functions, which are related to
the classical notion of nonlinearity, are also related to the imbalance of functions F +L.

Proposition 37 and Relation (3.27) give the covering radius bound, and show that the
constancy of the function $L\in\mathcal{L}_{n,m}\mapsto Nb_{F+L}$ is characteristic of bent functions.
The fact that the average value of $Nb_{F+L}$ is the same for all (n, m)-functions is not
surprising: Relation (3.20) applied to the function F + L gives

\[
Nb_{F+L}=2^{-m}\sum_{v\in\mathbb{F}_2^m,\ v\ne 0_m}\left(\sum_{x\in\mathbb{F}_2^n}(-1)^{v\cdot F(x)\oplus L^*(v)\cdot x}\right)^2,
\]

where $L^*$ is the adjoint operator of L. Summing up this equality when L ranges over $\mathcal{L}_{n,m}$
allows, for every $v\ne 0_m$, the vector $L^*(v)$ to cover uniformly $\mathbb{F}_2^n$, and Parseval’s relation
leads then to the mean.

Remark. The number $\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}$ is, after nl(F), a second parameter quantifying
the nonaffineness of F (in a different way from nl(F) but in a coherent one, according to
Relation (3.27)). We shall see that it is also closely related to a third parameter $NB_F$ that
we shall introduce at page 138. Some easily proved properties of $\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}$ include
the following:
– If F is affine, that is, if $F+L_0$ is constant for some linear function $L_0$, then we know
that $\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}=Nb_{F+L_0}=2^{2n}-2^{2n-m}$ is maximal.
– If, on the opposite side, F is bent, then, for every L, we have $Nb_{F+L}=2^n-2^{n-m}$ and
$\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}=2^n-2^{n-m}$ is minimal (according to Proposition 37); we can say
that, for every L, the function F + L is “almost balanced,” which is the best that can be
achieved for every linear function L.
– $F\mapsto\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}$ is EA-invariant since Nb is affine invariant.

For m = n = 5, $\max_{L\in\mathcal{L}_{n,m}} Nb_{F+L}=52<2\,(2^n-1)=62$ for every AB function.

Other bounds
Bounds have been obtained in relation with codes [267]:
\[
nl(F)<2^{n-1}-\frac{m}{2}\times\frac{2^{n-1}}{2^{n-1}-1};\qquad m<2^n-2 \tag{3.29}
\]
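and using the sphere packing bound:

\[
\sum_{i=0}^{\left\lfloor\frac{nl(F)-1}{2}\right\rfloor}\binom{2^n}{i}\le 2^{2^n-n-m-1} \tag{3.30}
\]

and the Griesmer bound:

\[
\sum_{i=0}^{m+n}\left\lceil\frac{nl(F)}{2^i}\right\rceil\le 2^n. \tag{3.31}
\]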
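Conditions (3.30) and (3.31) are simple to evaluate for given n, m, and a candidate value d of the nonlinearity. The sketch below only tests whether a candidate value is compatible with the two inequalities as stated; the function names are hypothetical and the case n = m = 5 is an illustrative choice.

```python
from math import comb

def satisfies_sphere_packing(d, n, m):
    """Inequality (3.30) with nl(F) replaced by the candidate value d >= 1."""
    t = (d - 1) // 2
    return sum(comb(2 ** n, i) for i in range(t + 1)) <= 2 ** (2 ** n - n - m - 1)

def satisfies_griesmer(d, n, m):
    """Inequality (3.31) with nl(F) replaced by d; -(-d // q) is ceil(d / q)."""
    return sum(-(-d // 2 ** i) for i in range(n + m + 1)) <= 2 ** n

def largest_compatible_value(n, m):
    """Largest d in [1, 2^(n-1)] compatible with both inequalities, or 0."""
    best = 0
    for d in range(1, 2 ** (n - 1) + 1):
        if satisfies_sphere_packing(d, n, m) and satisfies_griesmer(d, n, m):
            best = d
    return best

print(largest_compatible_value(5, 5))
```

For n = m = 5 the largest compatible value agrees with the nonlinearity $2^{4}-2^{2}=12$ of AB functions given by (3.26).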
A construction using concatenated codes (see page 9) is given in [53], which allows
approaching these bounds. Precisely, a (2e − 1, (k − 2)e)-function F is obtained for every
e ≥ 2, k ≥ 3, such that $nl(F)=2^{e-2}(2^e-k+1)$.
A lower bound on the nonlinearity of vectorial functions is given in [234] and upper
bounds in [1133] by means of parameter NbF of page 114, under particular conditions, in
some cases. A table of the best-known nonlinearities is given in [53].
Another notion of nonlinearity of vectorial functions, sometimes denoted by $nl_v$, has been
introduced in [266] and studied further in [788]: the minimum Hamming distance of these
functions to affine vectorial functions.

Higher-order nonlinearity
This notion (see Definition 20, page 83) can be extended to vectorial functions by taking the
minimum r-th order nonlinearity of component functions: $nl_r(F)=\min_{v\ne 0_m} nl_r(v\cdot F)$. We
can more generally consider F composed by functions of higher degrees:

Definition 32 For every (n, m)-function F, for every positive integers s ≤ m and t ≤
n + m, and every nonnegative integer r ≤ n, we define

\[
nl_{s,r}(F)=\min\{nl_r(f\circ F);\ f\in\mathcal{BF}_m,\ d_{alg}(f)\le s,\ f\ne cst\},
\]

and

\[
NL_t(F)=\min\{w_H(h(x,F(x)));\ h\in\mathcal{BF}_{n+m},\ d_{alg}(h)\le t,\ h\ne cst\}.
\]

Definition 32 excludes f = cst and h = cst for obvious reasons.

Clearly, for every function F and all integers t ≤ t′, s ≤ s′ and r ≤ r′, we have $NL_t(F)\ge NL_{t'}(F)$
and $nl_{s,r}(F)\ge nl_{s',r'}(F)$. Note also that we have $NL_1(F)=nl_{1,1}(F)=nl(F)$.
As recalled in [233, section 3], which is devoted to these notions, T. Shimoyama and T.
Kaneko have exhibited in [1037] several quadratic functions h and pairs (f , g) of quadratic
functions showing that the nonlinearities $NL_2$ and $nl_{2,2}$ of some sub-S-boxes of the DES are

null (and therefore that the global S-box of each round of the DES has the same property).
They deduced a “higher-order nonlinear” attack (an attack using the principle of the linear
attack by Matsui but with non-linear approximations), which needs 26% less data than
Matsui’s attack. This improvement is not very significant, practically, but the notions of
$NL_t$ and $nl_{s,r}$ may be related to potentially more powerful attacks. Note that we have
$NL_{\max(s,r)}(F)\le nl_{s,r}(F)$ by taking $h(x,y)=g(x)\oplus f(y)$ (since $f\ne cst$ implies then
$h\ne cst$) and the inequality can be strict if s > 1 or r > 1 since it may happen that a function
h(x, y) of low algebraic degree and such that wH (h(x, F (x))) is small exists while no such
function exists with separated variables x and y. This is the case, for instance, of the S-box
of the AES for s = 1 and r = 2 (see below).

Proposition 38 [233] For all positive integers n, m, r ≤ n, and s ≤ m and every
(n, m)-function F, we have $NL_s(F) \le 2^{n-s}$ and $nl_{s,r}(F) \le 2^{n-s}$. These inequalities are
strict if F is not balanced (that is, if its output is not uniformly distributed over $\mathbb{F}_2^m$).

Indeed, there necessarily exists an (m − s)-dimensional affine subspace A of $\mathbb{F}_2^m$ (whose
indicator $1_A$ has algebraic degree s) such that $|F^{-1}(A)| \le 2^{n-s}$, and we can take
$f(y) = h(x, y) = 1_A(y)$. See [233] for the rest of the proof.
The bound $nl_{s,r}(F) \le 2^{n-s}$ is asymptotically almost tight (in a sense that will be made
precise in Proposition 40, page 124) for permutations when r ≤ s ≤ .227 n.

Existence of permutations with higher-order nonlinearities bounded from below
The case of permutations is more interesting and useful than that of general functions when
dealing with higher-order nonlinearity, but it is more delicate.

Proposition 39 Let n and s be positive integers and let r be a nonnegative integer. Let D
be the greatest integer such that
$$\sum_{t=0}^{D} \binom{2^n}{t} \le \frac{\binom{2^n}{2^{n-s}}}{2^{\sum_{i=0}^{s}\binom{n}{i} + \sum_{i=0}^{r}\binom{n}{i}}}.$$

There exist (n, n)-permutations F whose higher-order nonlinearity $nl_{s,r}(F)$ is strictly larger
than D.
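
For small parameters, the threshold D of Proposition 39 can be evaluated with exact integer arithmetic. The following is only an illustrative sketch: the function name `prop39_threshold` is hypothetical, and the inequality is rewritten in integer form (both sides multiplied by the power of 2 in the denominator) to avoid rounding.

```python
from math import comb


def prop39_threshold(n: int, s: int, r: int) -> int:
    """Greatest D satisfying the inequality of Proposition 39 (-1 if none does)."""
    size = 2 ** n
    # Exponent of 2 in the denominator: sum_{i<=s} C(n,i) + sum_{i<=r} C(n,i).
    exponent = (sum(comb(n, i) for i in range(s + 1))
                + sum(comb(n, i) for i in range(r + 1)))
    rhs_numerator = comb(size, 2 ** (n - s))
    partial_sum, best = 0, -1
    for t in range(size + 1):
        partial_sum += comb(size, t)
        # Integer form of: partial_sum <= rhs_numerator / 2**exponent.
        if (partial_sum << exponent) > rhs_numerator:
            break
        best = t
    return best


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(n, prop39_threshold(n, 1, 1))
```

For n = 4 and s = r = 1, the right-hand side is 12870/1024, so only D = 0 satisfies the inequality; the guaranteed lower bound becomes meaningful only for larger n.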

Proof We recall the proof from [233]. Given a number D, a permutation F of $\mathbb{F}_2^n$ and two
n-variable Boolean functions f and g, let us consider the support $E = supp((f \circ F) \oplus g)$,
that is, $E = (F^{-1}(supp(f)))\ \Delta\ supp(g)$, where Δ is the symmetric difference operator.
Then $F^{-1}$ maps supp(f) onto supp(g) Δ E (since the equality $1_E = f \circ F \oplus g$ implies
$f \circ F = g \oplus 1_E$) and $\mathbb{F}_2^n \setminus supp(f)$ onto $(\mathbb{F}_2^n \setminus supp(g))\ \Delta\ E$. If we have $d_H(f \circ F, g) \le D$,
then E has size at most D. For all integers $i \in [0, 2^n]$ and r, let us denote by $A_{r,i}$ the number
of codewords of Hamming weight i in the Reed–Muller code of order r. If i is the size of
supp(f) (with $0 < i < 2^n$, since $f \neq cst$), then for every set E such that $|supp(g)\ \Delta\ E| =
|supp(f)| = i$ and $|(\mathbb{F}_2^n \setminus supp(g))\ \Delta\ E| = |\mathbb{F}_2^n \setminus supp(f)| = 2^n - i$, the number of
permutations whose restriction to supp(f) is a one-to-one function onto supp(g) Δ E and
whose restriction to $\mathbb{F}_2^n \setminus supp(f)$ is a one-to-one function onto $(\mathbb{F}_2^n \setminus supp(g))\ \Delta\ E$ equals
$i!\,(2^n - i)!$. We deduce that the number of permutations F such that $nl_{s,r}(F) \le D$ is
bounded above by
$$\sum_{t=0}^{D} \binom{2^n}{t} \sum_{i=1}^{2^n-1} \sum_{j=0}^{2^n} A_{s,i}\, A_{r,j}\; i!\,(2^n - i)!.$$

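Since the nonconstant codewords of the Reed–Muller code of order s have Hamming
weights between $2^{n-s}$ and $2^n - 2^{n-s}$, we deduce that the probability $P_{s,r,D}$ that a permutation
F chosen at random (with uniform probability) satisfies $nl_{s,r}(F) \le D$ is bounded
above by
$$\sum_{t=0}^{D}\binom{2^n}{t}\sum_{j=0}^{2^n} A_{r,j} \sum_{2^{n-s}\le i\le 2^n-2^{n-s}} A_{s,i}\,\frac{i!\,(2^n-i)!}{2^n!}
= \sum_{t=0}^{D}\binom{2^n}{t}\sum_{j=0}^{2^n} A_{r,j} \sum_{2^{n-s}\le i\le 2^n-2^{n-s}} \frac{A_{s,i}}{\binom{2^n}{i}}
< \frac{\sum_{t=0}^{D}\binom{2^n}{t}\ 2^{\sum_{i=0}^{s}\binom{n}{i}+\sum_{i=0}^{r}\binom{n}{i}}}{\binom{2^n}{2^{n-s}}}. \qquad (3.32)$$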

We deduce that, under the hypothesis of Proposition 39, we have Ps,r,D < 1, and there exist
permutations F from Fn2 to itself, whose higher-order nonlinearity nls,r (F ) is strictly larger
than D. This completes the proof.

This lemma is translated into a table for small values of n in [233]. Let us see now what
happens when n tends to ∞. Let H2 (x) = −x log2 (x) − (1 − x) log2 (1 − x) be the binary
entropy function.

Proposition 40 Let $s_n/n$ tend to a limit ρ such that $1 - H_2(\rho) > \rho$ (which is approximately
equivalent to ρ ≤ .227) when n tends to ∞. If $r_n \le \mu n$ for every n, where $1 - H_2(\mu) > \rho$
(e.g., if $r_n/s_n$ tends to a limit strictly smaller than 1), then for every ρ' > ρ, almost all
permutations F of $\mathbb{F}_2^n$ satisfy $nl_{s_n,r_n}(F) \ge 2^{(1-\rho')n}$.
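
The constant .227 in Proposition 40 is the solution of $1 - H_2(\rho) = \rho$ in (0, 1/2). As a quick numerical illustration (a sketch only; the function names are hypothetical), it can be located by bisection, since $1 - H_2(\rho) - \rho$ is decreasing on that interval:

```python
from math import log2


def h2(x: float) -> float:
    """Binary entropy function H_2(x), with H_2(0) = H_2(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def entropy_threshold(tol: float = 1e-12) -> float:
    """Root of 1 - H_2(rho) - rho on (0, 1/2), found by bisection."""
    lo, hi = 1e-9, 0.5  # the expression is positive at lo and negative at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1 - h2(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(round(entropy_threshold(), 4))
```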

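Proof We recall the proof from [233]. We know (see, e.g., [809, page 310]) that, for
every integer n and every λ ∈ [0, 1/2], we have $\sum_{i \le \lambda n}\binom{n}{i} \le 2^{nH_2(\lambda)}$. According to the
Stirling formula, we have also, when i and j tend to ∞: $i! \sim i^i e^{-i}\sqrt{2\pi i}$ and
$\binom{i+j}{i} \sim \left(\frac{i+j}{i}\right)^i \left(\frac{i+j}{j}\right)^j \sqrt{\frac{i+j}{2\pi\, i j}}$. For $i + j = 2^n$ and $i = 2^{n-s_n}$, this gives
$$\binom{2^n}{2^{n-s_n}} \sim \frac{(2^{s_n})^{2^{n-s_n}}}{\sqrt{2\pi}\,(1-2^{-s_n})^{2^n-2^{n-s_n}}}\sqrt{\frac{2^{s_n}}{2^n-2^{n-s_n}}}
= \frac{2^{s_n 2^{n-s_n}}}{\sqrt{2\pi}\ 2^{(2^n-2^{n-s_n})\ln(1-2^{-s_n})\log_2 e}}\sqrt{\frac{2^{s_n}}{2^n-2^{n-s_n}}}.$$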

We deduce then from Inequality (3.32), page 124:
$$\log_2 P_{s_n,r_n,D_n} = O\!\left(2^n\left[H_2\!\left(\frac{D_n}{2^n}\right) + 2^{-n(1-H_2(s_n/n))} + 2^{-n(1-H_2(r_n/n))} - 2^{-s_n+\log_2(s_n)} - 2^{-s_n}(1-2^{-s_n})\log_2 e\right]\right)$$
(we omit $-\frac{s_n}{2^{n+1}} + \frac{n}{2^{n+1}}\log_2(1-2^{-s_n})$ inside the brackets above; it is negligible).
If $\lim \frac{s_n}{n} = \rho$ where $1 - H_2(\rho) > \rho$, then there exists ρ' > ρ such that $1 - H_2(\rho') > \rho'$ and
such that asymptotically we have $s_n \le \rho' n$; hence $2^{-n(1-H_2(s_n/n))}$ is negligible with respect
to $2^{-s_n}$. And if $r_n \le \mu n$, where $1 - H_2(\mu) > \rho$, then we have $2^{-n(1-H_2(r_n/n))} = o(2^{-s_n})$,
and for $D_n = 2^{(1-\rho')n}$, where ρ' is any number strictly larger than ρ, we have
$H_2\!\left(\frac{D_n}{2^n}\right) = H_2(2^{-\rho' n}) = \rho' n\, 2^{-\rho' n} - (1 - 2^{-\rho' n})\log_2(1 - 2^{-\rho' n}) = o(2^{-\rho n}) = o(2^{-s_n})$. We obtain
that, asymptotically, $nl_{s_n,r_n}(F) > 2^{(1-\rho')n}$ for every ρ' > ρ.

The inverse S-box


For $F_{inv}(x) = x^{2^n-2}$ and $f_{inv}(x) = tr_n(F_{inv}(x))$, we have $nl_r(F_{inv}) = nl_r(f_{inv})$, as for
any power permutation. Recall that, for r = 1, this parameter equals $2^{n-1} - 2^{n/2}$ when n
is even and is close to this number when n is odd, and that for r > 1, it is approximately
bounded below by $2^{n-1} - 2^{(1-2^{-r})n}$ (see more in [232]). We have $NL_2(F_{inv}) = 0$, since we
have $w_H(h(x, F_{inv}(x))) = 0$ for the bilinear function $h(x, y) = tr_n(axy)$, where a is any
nonzero element of null trace and xy denotes the product of x and y in $\mathbb{F}_{2^n}$. Indeed, we have
$x\,F_{inv}(x) = 1$ for every nonzero x. As observed in [392], we have also $w_H(h(x, F_{inv}(x))) = 0$
for the bilinear functions $h(x, y) = tr_n(a(x + x^2 y))$ and $h(x, y) = tr_n(a(y + y^2 x))$, where
a is now any nonzero element, and for the quadratic functions $h(x, y) = tr_n(a(x^3 + x^4 y))$
and $h(x, y) = tr_n(a(y^3 + y^4 x))$. These properties are the core properties used in the tentative
algebraic attack on the AES by Courtois and Pieprzyk.
It is proved in [233] that, for every ordered pair (s, r) of strictly positive integers, we have
• $nl_{s,r}(F_{inv}) = 0$ if r + s ≥ n;
• $nl_{s,r}(F_{inv}) > 0$ if r + s < n;

and that, in particular, for every ordered pair (s, r) of positive integers such that r + s = n − 1,
we have $nl_{s,r}(F_{inv}) = 2$. The other values are unknown when r + s < n, except for small
values of n.

3.2.4 Algebraic immunities of vectorial functions


Algebraic attacks can be performed on stream ciphers and on block ciphers; this is why
we address the algebraic immunities of vectorial functions in the present section. But there
are several definitions, and the relevant ones are not the same in both frameworks. Algebraic
attacks can be applied to those stream ciphers that, for increasing the speed, use as combiners
or filters vectorial (n, m)-functions F instead of single-output Boolean functions. Figures 3.2
and 3.3 below display how vectorial functions can be used in the pseudorandom generators
of stream ciphers to speed up the ciphers.
[Figure 3.2: combiner model. The outputs x1, ..., xn of n LFSRs (LFSR 1, ..., LFSR n) are the inputs of F, which outputs s_{i,1}, ..., s_{i,m}.]

[Figure 3.3: filter model. Bits x1, ..., xn taken from the register state (s_{i+L−1}, ..., s_{i+1}, s_i) are the inputs of F(x1, x2, ..., xn), which outputs s_{i,1}, ..., s_{i,m}.]

The output bits of F can be combined in any way, that is, by applying any m-variable
Boolean function h, and the algebraic attack can be performed on the combiner or filter
model using the resulting Boolean function h ∘ F. The minimum algebraic immunity of
all these functions clearly equals the minimum algebraic immunity of the indicators of the
preimages $F^{-1}(z)$ for $z \in \mathbb{F}_2^m$. This will lead to Definition 34.
Algebraic attacks also exist on block ciphers (see [392]), exploiting the existence of
multivariate equations involving the input x to the S-box and its output y. In the case of
the AES, whose S-box is the power function $x \in \mathbb{F}_{2^8} \mapsto x^{2^8-2} \in \mathbb{F}_{2^8}$, an example
of such an equation is $x^2 y = x$, where $x, y \in \mathbb{F}_{2^8}$. The main parameter playing a role
in the complexity of algebraic attacks, to be studied for a given S-box F in a cipher, is
the lowest algebraic degree d of Boolean relations between inputs and outputs to F. If
these are viewed in $\mathbb{F}_2^n$ and $\mathbb{F}_2^m$, the simplest relations to be considered are of the form
$\sum_{I \subseteq \{1,\dots,n\},\ J \subseteq \{1,\dots,m\}} a_{I,J}\, x^I (F(x))^J = 0$; $a_{I,J} \in \mathbb{F}_2$. Another parameter is the number
of linearly independent relations of degree d. Since, for an (n, m)-function, the number of
unknowns $a_{I,J}$ in the equations above equals $\sum_{i=0}^{d}\binom{n+m}{i}$ and the number of equations is
$2^n$, the number of linearly independent relations of degree d is at least $\sum_{i=0}^{d}\binom{n+m}{i} - 2^n$.
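
The counting argument above can be compared with the actual number of relations for a concrete S-box. The sketch below (hypothetical names; $\mathbb{F}_{2^8}$ is assumed to be represented modulo the AES polynomial) computes, by Gaussian elimination over $\mathbb{F}_2$, the dimension of the space of polynomials of degree at most d in the n + m input and output bits that vanish on the graph of the inverse function, next to the lower bound $\sum_{i=0}^{d}\binom{n+m}{i} - 2^n$.

```python
from itertools import combinations
from math import comb

N = 8
MODULUS = 0x11B  # x^8 + x^4 + x^3 + x + 1 (assumed field representation)


def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> N:
            a ^= MODULUS
        b >>= 1
    return result


def gf_pow(a: int, e: int) -> int:
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result


def count_relations(table, n: int, m: int, d: int) -> int:
    """Number of linearly independent relations of degree <= d on the graph of table."""
    monomials = [sum(1 << i for i in c)
                 for k in range(d + 1) for c in combinations(range(n + m), k)]
    pivots = {}
    for x in range(2 ** n):
        point = x | (table[x] << n)  # (x, y) with y = F(x) stored in the high bits
        # Row of the evaluation matrix: bit j is the value of monomial j at the point.
        row = sum(1 << j for j, mono in enumerate(monomials) if point & mono == mono)
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]
    return len(monomials) - len(pivots)  # unknowns minus rank


def counting_bound(n: int, m: int, d: int) -> int:
    return sum(comb(n + m, i) for i in range(d + 1)) - 2 ** n


INV = [gf_pow(x, 2 ** N - 2) for x in range(2 ** N)]

if __name__ == "__main__":
    print(counting_bound(N, N, 2), count_relations(INV, N, N, 2))
```

For n = m = 8 and d = 2, the counting bound is negative (137 unknowns against 256 equations), so it guarantees nothing, whereas the computed dimension is positive: the quadratic relations of the inverse S-box come from its algebraic structure, not from counting.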

But the actual efficiency of algebraic attacks on block ciphers is difficult to study. The
global number of variables in the large system of equations expressing the whole cipher, that
is, the number of data bits and key bits in all the rounds of the cipher, is much larger than for
stream ciphers, and the resulting systems of equations are not as overdefined as for stream
ciphers; nobody is able to predict correctly the complexity of solving such polynomial
systems. The AES allows bilinear relations between the input and the output bits of the
S-boxes, and this may represent a threat, if an idea is found that would reduce the number
of unknowns without increasing too much the degrees of the equations. In [392], the authors
wrote that “it is not completely unreasonable to believe, that the structure of Rijndael and
Serpent could allow attacks with complexity growing slowly with the number of rounds,”
and the authors added, “In this paper, it seems that we have found such an attack,” but it is
widely believed today that such an attack is not efficient on these two cryptosystems.
Several notions of algebraic immunity of vectorial functions have been studied in [29, 32].
We first need to recall the definition of annihilator and give the definition of the algebraic
immunity of a set:

Definition 33 We call annihilator of a subset E of $\mathbb{F}_2^n$ any n-variable Boolean function
vanishing on E. We call algebraic immunity of E, and we denote by AI(E), the minimum
algebraic degree of all the nonzero annihilators of E.

Note that the algebraic immunity of a Boolean function f equals by definition
$\min(AI(f^{-1}(0)), AI(f^{-1}(1)))$.
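
Definition 33 translates directly into linear algebra over $\mathbb{F}_2$: a nonzero annihilator of degree at most d exists if and only if the matrix of the monomials of degree at most d evaluated on E has rank smaller than the number of monomials. The following sketch (hypothetical function names; points are encoded as integers whose bits are the coordinates) uses this to compute AI(E) and the algebraic immunity of a small Boolean function.

```python
from itertools import combinations


def has_annihilator(points, nvars: int, d: int) -> bool:
    """True if a nonzero function of algebraic degree <= d vanishes on all points."""
    monomials = [sum(1 << i for i in c)
                 for k in range(d + 1) for c in combinations(range(nvars), k)]
    pivots = {}
    for p in points:
        # Row of the evaluation matrix: bit j is the value of monomial j at point p.
        row = sum(1 << j for j, mono in enumerate(monomials) if p & mono == mono)
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]
    return len(pivots) < len(monomials)  # rank < number of unknown ANF coefficients


def algebraic_immunity_of_set(points, nvars: int):
    """AI(E), or None when E is the whole space (no nonzero annihilator exists)."""
    for d in range(nvars + 1):
        if has_annihilator(points, nvars, d):
            return d
    return None


def algebraic_immunity(f, nvars: int) -> int:
    """min(AI(f^{-1}(0)), AI(f^{-1}(1))) for a Boolean function f on integers."""
    zeros = [x for x in range(2 ** nvars) if f(x) == 0]
    ones = [x for x in range(2 ** nvars) if f(x) == 1]
    degrees = (algebraic_immunity_of_set(zeros, nvars),
               algebraic_immunity_of_set(ones, nvars))
    return min(d for d in degrees if d is not None)


if __name__ == "__main__":
    majority3 = lambda x: int(bin(x).count("1") >= 2)
    print(algebraic_immunity(majority3, 3))
```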
The first generalization of algebraic immunity to S-boxes is its direct extension:

Definition 34 The basic algebraic immunity of an (n, m)-function F is defined as follows:
$$AI(F) = \min\{AI(F^{-1}(z));\ z \in \mathbb{F}_2^m\}.$$

Note that AI(F) also equals the minimum algebraic immunity of all the indicators $\varphi_z$ of the
preimages $F^{-1}(z)$ since, the algebraic immunity being a nondecreasing function over sets,
we have $AI(\mathbb{F}_2^n \setminus F^{-1}(z)) \ge AI(F^{-1}(z'))$ for every distinct $z, z' \in \mathbb{F}_2^m$.
This version of algebraic immunity is relevant to stream ciphers. A second notion of
algebraic immunity of S-boxes, more relevant to S-boxes in block ciphers, has been called
the graph algebraic immunity and is defined as follows:

Definition 35 The graph algebraic immunity of an (n, m)-function F is the algebraic
immunity of the graph $\{(x, F(x));\ x \in \mathbb{F}_2^n\}$ of the S-box, and is denoted by $AI_{gr}(F)$.

Two other notions studied in [32] are essentially different expressions for the same AI (F )
and AIgr (F ).
A third notion seems also natural:

Definition 36 The component algebraic immunity of an (n, m)-function F is defined as
follows:
$$AI_{comp}(F) = \min\{AI(v \cdot F);\ v \in \mathbb{F}_2^m \setminus \{0_m\}\}.$$

Properties and relative bounds It has been observed in [29] that, for any (n, m)-
function F, we have $AI(F) \le AI_{gr}(F) \le AI(F) + m$. The left-hand side inequality
is straightforward (by restricting an annihilator of the graph to a value of y such that the
annihilator does not vanish) and is shown tight in [235], and the right-hand side inequality
comes from the fact that, since there exist z and a nonzero annihilator g(x) of $F^{-1}(z)$ of
algebraic degree AI(F), the function $g(x)\prod_{j=1}^{m}(y_j \oplus z_j \oplus 1)$ is an annihilator of algebraic
degree AI(F) + m of the graph of F.
It has also been observed in [29] that, denoting by d the smallest integer such
that $\sum_{i=0}^{d}\binom{n}{i} > 2^{n-m}$, we have AI(F) ≤ d (indeed, there is at least one z such that
$|F^{-1}(z)| \le 2^{n-m}$; the annihilators of $F^{-1}(z)$ are the solutions of $|F^{-1}(z)|$ linear equations
in $\sum_{i=0}^{d}\binom{n}{i}$ unknowns, which are the coefficients in the ANF of an unknown annihilator
of algebraic degree at most d, and the number of equations being strictly smaller than the
number of unknowns, the system must have nontrivial solutions). It has been proved in [500]
that this bound is tight. Note that it shows that, for AI(F) to have a chance of being large,
m needs to be small enough: we know (see [809, page 310]) that $\sum_{i=0}^{d}\binom{n}{i} \ge \frac{2^{nH_2(d/n)}}{\sqrt{8d(1-d/n)}}$,
where $H_2(x) = -x\log_2(x) - (1-x)\log_2(1-x)$; for AI(F) being possibly larger than
a number k, we must have $\sum_{i=0}^{k}\binom{n}{i} \le 2^{n-m}$, and therefore $\frac{2^{nH_2(k/n)}}{\sqrt{8k(1-k/n)}} \le 2^{n-m}$, that is,
$m \le n\,(1 - H_2(k/n)) + \frac{1}{2}(3 + \log_2(k(1-k/n)))$. It also implies that AI(F) ≤ n − m; see
more in [235].
Finally, it has also been proved in [29] that, denoting by D the smallest integer such that
$\sum_{i=0}^{D}\binom{n+m}{i} > 2^n$, we have $AI_{gr}(F) \le D$ (the proof is similar, considering annihilators
in n + m variables of the graph), but it is not known whether this bound is tight (it is
shown in [29] that it is tight for n ≤ 14). This implies that $AI_{gr}(F) \le n$; see more
in [235].
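
The two bounds d and D are simple to tabulate. The sketch below (function names are hypothetical) returns the smallest degree at which the number of monomials exceeds the number of points to annihilate, for the basic and for the graph algebraic immunity.

```python
from math import comb


def smallest_degree(nvars: int, threshold: int) -> int:
    """Smallest d such that sum_{i<=d} C(nvars, i) > threshold."""
    total = 0
    for d in range(nvars + 1):
        total += comb(nvars, d)
        if total > threshold:
            return d
    raise ValueError("threshold is at least 2**nvars")


def basic_ai_bound(n: int, m: int) -> int:
    """Upper bound d on AI(F) for an (n, m)-function with m <= n."""
    return smallest_degree(n, 2 ** (n - m))


def graph_ai_bound(n: int, m: int) -> int:
    """Upper bound D on AI_gr(F) for an (n, m)-function."""
    return smallest_degree(n + m, 2 ** n)


if __name__ == "__main__":
    for n, m in ((6, 4), (8, 4), (8, 8)):
        print((n, m), graph_ai_bound(n, m))
```

For instance, for the (6, 4) dimensions of a DES S-box the bounds are d = 1 and D = 3, and for n = m = 8 the graph bound is D = 3.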
Since the algebraic immunity of any Boolean function is bounded above by its algebraic
degree, the component algebraic immunity of any vectorial function is bounded above by its
minimum degree and therefore by its algebraic degree:

AIcomp (F ) ≤ dalg (F ).

We have also

AIcomp (F ) ≥ AI (F )

since $AI_{comp}(F)$ equals $AI(F^{-1}(H))$ for some affine hyperplane H of $\mathbb{F}_2^m$ (because
$AI_{comp}(F)$ equals the algebraic immunity of the Boolean function v · F for some $v \neq 0_m$),
and since AI is a nondecreasing function over sets. We have

AIcomp (F ) ≥ AIgr (F ) − 1

since:
– If g is a nonzero annihilator of v · F, $v \neq 0_m$, then the product h(x, y) = g(x) (v · y) is
a nonzero annihilator of the graph of F.
– If g is a nonzero annihilator of v · F ⊕ 1, then h(x, y) = g(x) (v · y) ⊕ g(x) is a
nonzero annihilator of the graph of F.

More bounds on these three parameters are given in [235].

Remark. As in the case of Boolean functions (see Subsection 3.1.6, page 96), the variants
of these parameters (and of the ones to come in the next sections) in relationship with guess
and determine attacks should be studied as well.

3.3 Cryptographic criteria and parameters for vectorial functions in stream ciphers
3.3.1 Correlation immunity and resiliency of vectorial functions
The notion of resilient Boolean function, when extended to vectorial functions, is relevant in
cryptology to quantum cryptographic key distribution (see [58]) and to stream ciphers with
multioutput combiners or filters.
Recall that an (n, m)-function is called balanced if the distribution of F (x) when x ranges
over Fn2 is uniform over Fm2.

Definition 37 Let n and m be two positive integers. Let t be an integer such that 0 ≤ t ≤ n.
An (n, m)-function F (x) is called t-th order correlation immune if its output distribution
does not change when at most t coordinates xi of x are kept constant. It is called t-resilient
if it is balanced and t-th order correlation immune, that is if it stays balanced when at most
t coordinates xi of x are kept constant.

This notion is related to another notion that also plays a role in cryptography:
an (n, m)-function F is called a multipermutation (see [1095]) if any two ordered pairs
(x, F(x)) and (x', F(x')) such that x, x' ∈ $\mathbb{F}_2^n$ are distinct differ in at least m + 1 distinct
positions (that is, collide in at most n − 1 positions); such an (n, m)-function then ensures
perfect diffusion; an (n, m)-function is a multipermutation if and only if the indicator of its
graph $\{(x, F(x));\ x \in \mathbb{F}_2^n\}$ is an n-th order correlation immune Boolean function (see [179]).
Since S-boxes must be balanced, we shall focus on resilient functions, but most of the
results below can also be stated for correlation immune functions.
We call an (n, m)-function that is t-resilient an (n, m, t)-function. Clearly, if such a
function exists, then m ≤ n − t, since balanced (n, m)-functions can exist only if m ≤ n.
This bound is weak (it is tight if and only if m = 1 or t = 1). It is shown in [370] (see
also [79]) that, if an (n, m, t)-function exists, then $m \le n - \log_2\left[\sum_{i=0}^{t/2}\binom{n}{i}\right]$ if t is even and
$m \le n - \log_2\left[\binom{n-1}{(t-1)/2} + \sum_{i=0}^{(t-1)/2}\binom{n}{i}\right]$ if t is odd. This can be deduced from the bound
on orthogonal arrays due to Rao [988]; see page 87. But, as shown in [79] (see also [760]),
potentially better bounds can be deduced from the linear programming bound due to Delsarte
[421]: if an (n, m, t)-function exists, then $t \le \left\lfloor\frac{2^{m-1}n}{2^m-1}\right\rfloor - 1$ and $t \le 2\left\lfloor\frac{2^{m-2}(n+1)}{2^m-1}\right\rfloor - 1$.
Note that composing a t-resilient (n, m)-function with a permutation on $\mathbb{F}_2^m$ does not change
its resiliency order (this obvious result was first observed in [1168]). Also, the t-resiliency of
S-boxes can be expressed by means of the t-resiliency and t-th order correlation immunity
of Boolean functions:

Proposition 41 Let n and m be two positive integers and 0 ≤ t ≤ n. Let F be an
(n, m)-function. Then F is t-resilient if and only if one of the following conditions is
satisfied:

1. For every nonzero vector $v \in \mathbb{F}_2^m$, the Boolean function v · F(x) is t-resilient, that is,
$W_F(u, v) = 0$, for every $u \in \mathbb{F}_2^n$ such that $w_H(u) \le t$.
2. For every balanced m-variable Boolean function g, the n-variable Boolean function g ∘ F
is t-resilient, that is, $\sum_{x \in \mathbb{F}_2^n} (-1)^{g(F(x)) \oplus u \cdot x} = 0$, for every $u \in \mathbb{F}_2^n$ such that $w_H(u) \le t$.
3. For every vector $b \in \mathbb{F}_2^m$, the Boolean function $\varphi_b = \delta_{\{b\}} \circ F$ is t-th order correlation
immune and has Hamming weight $2^{n-m}$.

Proof We prove that the t-resiliency of F implies Condition 2, which implies Condition 1,
which implies Condition 3, which implies that F is t-resilient.
– If F is t-resilient, then, for every balanced m-variable Boolean function g, the function
g ◦ F is t-resilient, by definition; hence Condition 2 is satisfied.
– Condition 2 clearly implies Condition 1, since the function g(x) = v · x is balanced for
every nonzero vector v.
– If Condition 1 is satisfied, then Relation (3.16), page 112, implies that, for every nonzero
vector $u \in \mathbb{F}_2^n$ such that $w_H(u) \le t$ and for every $b \in \mathbb{F}_2^m$, we have
$\widehat{\varphi_b}(u) = 2^{-m}\sum_{x \in \mathbb{F}_2^n,\ v \in \mathbb{F}_2^m} (-1)^{v \cdot (F(x)+b) \oplus u \cdot x} = 0$, and $\varphi_b$ is t-th order correlation immune for
every b. Also, according to Proposition 35, page 112, Condition 1 implies that F is
balanced, i.e., $\varphi_b$ has Hamming weight $2^{n-m}$, for every b. These two conditions obviously
imply, by definition, that F is t-resilient.
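
Condition 1 of Proposition 41 gives a direct, if naive, way of computing the resiliency order of a small S-box from its Walsh transform. The sketch below is illustrative only (hypothetical names; inputs and outputs are encoded as integers whose bits are the coordinates).

```python
def dot(a: int, b: int) -> int:
    """Inner product in F_2^k of two bit vectors encoded as integers."""
    return bin(a & b).count("1") & 1


def walsh(F, n: int, u: int, v: int) -> int:
    """W_F(u, v) = sum over x of (-1)^(v.F(x) + u.x)."""
    return sum((-1) ** (dot(v, F(x)) ^ dot(u, x)) for x in range(2 ** n))


def resiliency_order(F, n: int, m: int) -> int:
    """Largest t with W_F(u, v) = 0 for all v != 0 and w_H(u) <= t; -1 if F is unbalanced."""
    best = -1
    for t in range(n + 1):
        weight_t = [u for u in range(2 ** n) if bin(u).count("1") == t]
        if any(walsh(F, n, u, v) != 0 for u in weight_t for v in range(1, 2 ** m)):
            break
        best = t
    return best


def example(x: int) -> int:
    """(x1, x2, x3) -> (x1 + x2, x2 + x3), a (3, 2)-function."""
    b0, b1, b2 = x & 1, (x >> 1) & 1, (x >> 2) & 1
    return (b0 ^ b1) | ((b1 ^ b2) << 1)


if __name__ == "__main__":
    print(resiliency_order(example, 3, 2))
```

The example function is a (3, 2, 1)-function: every nonzero combination of its outputs is the sum of two input bits, so it meets the bound m ≤ n − t with t = 1.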

Consequently, the t-resiliency of vectorial functions is invariant under the same transforma-
tions as for Boolean functions.

3.3.2 Unrestricted nonlinearity of vectorial functions


The classical notions of nonlinearity of vectorial functions (Definition 29, page 117) and
higher-order nonlinearity (Definition 32, page 122), have been introduced in the framework
of block ciphers: due to the iterative structure of these ciphers, the knowledge of a function
f such that nl(f ◦ F ) or nlr (f ◦ F ) is low does not necessarily lead to an attack, unless the
algebraic degree of f is low, and r is low too in the latter case. This is why, in Definition 32,
the algebraic degree of f is also specified.
On the contrary, the structure of pseudorandom generators in stream ciphers is not
iterative, and all of the m output bits of the (n, m)-function used as combiner or filter can
be combined by a linear or nonlinear (but nonconstant) m-variable Boolean function f to
perform (fast) correlation attacks. Consequently, a second generalization to (n, m)-functions
of the notion of nonlinearity has been introduced (in [318], directly related to the Zhang–
Chan attack [1156]).

Definition 38 Let F be an (n, m)-function. The unrestricted nonlinearity of F, denoted by
unl(F), is the minimum Hamming distance between all nonconstant affine functions and all
Boolean functions g ∘ F, where g is a nonconstant Boolean function in m variables.

If unl(F ) is small, then one of the linear or nonlinear (nonconstant) combinations of the
output bits of F has high correlation to a nonconstant affine function of the input, and a
(fast) correlation attack is feasible.

Remark.
1. In Definition 38, the considered affine functions are nonconstant, because the minimum
distance between all Boolean functions g ∘ F (g nonconstant) and all constant functions
equals $\min_{b \in \mathbb{F}_2^m} |F^{-1}(b)|$ (each number $|F^{-1}(b)|$ is indeed equal to the distance between
the null function and g ∘ F, where g equals the indicator of the singleton {b}); it is
therefore an indicator of the balancedness of F. It is bounded above by $2^{n-m}$ (and it
equals $2^{n-m}$ if and only if F is balanced).
2. We can replace “nonconstant affine functions” with “nonzero linear functions” in the
statement of Definition 38 (replacing g with g ⊕ 1, if necessary).
3. Thanks to the fact that the affine functions considered in Definition 38 are nonconstant,
we can relax the condition that g is nonconstant: the distance between a constant function
and a nonconstant affine function equals $2^{n-1}$, and unl(F) is clearly always smaller than
$2^{n-1}$.

The unrestricted nonlinearity of any (n, m)-function F is obviously unchanged when F
is right-composed with an affine invertible mapping. Moreover, if A is a surjective linear (or
affine) function from $\mathbb{F}_2^p$ (where p is some positive integer) into $\mathbb{F}_2^n$, then it is easily shown
that $unl(F \circ A) = 2^{p-n}\, unl(F)$. Also, for every (m, p)-function φ, we have unl(φ ∘ F) ≥
unl(F) (indeed, the set $\{g \circ \phi,\ g \in \mathcal{BF}_p\}$, where $\mathcal{BF}_p$ is the set of p-variable Boolean
functions, is included in $\mathcal{BF}_m$), and if φ is a permutation on $\mathbb{F}_2^m$, then we have unl(φ ∘ F) =
unl(F) (by applying the inequality above to $\phi^{-1} \circ F$).
A further generalization of the Zhang–Chan attack, called the generalized correlation
attack, has been introduced in [299]: considering implicit equations that are linear in the
input variable x and of any degree in the output variable z = F(x), the following probability
is considered, for any nonconstant function g and all functions $w_i : \mathbb{F}_2^m \to \mathbb{F}_2$:

$$\mathrm{Prob}\,[g(z) + w_1(z)\,x_1 + w_2(z)\,x_2 + \cdots + w_n(z)\,x_n = 0], \qquad (3.33)$$

where z = F(x), and where x uniformly ranges over $\mathbb{F}_2^n$.
The knowledge of such an approximation g with a probability significantly higher than 1/2
leads to an attack, because z = F(x), corresponding to the output keystream, is known, and
therefore g(z) and $w_i(z)$ are known for all i = 1, . . . , n.
This led to a new notion of generalized nonlinearity:

Definition 39 Let $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$. The generalized Hadamard transform $\hat{F} : (\mathbb{F}_2^{2^m})^{n+1} \to \mathbb{R}$
is defined as follows:
$$\hat{F}(g(\cdot), w_1(\cdot), \dots, w_n(\cdot)) = \sum_{x \in \mathbb{F}_2^n} (-1)^{g(F(x)) + w_1(F(x))\,x_1 + \cdots + w_n(F(x))\,x_n},$$
where the input is in $\mathcal{BF}_m^{\,n+1}$.
Let W be the set of all n-tuple functions $w(\cdot) = (w_1(\cdot), \dots, w_n(\cdot)) \in \mathcal{BF}_m^{\,n}$, where
$w(z) \neq 0_n$ for all $z \in \mathbb{F}_2^m$.
The generalized nonlinearity is defined as follows:
$$gnl(F) = \min\left\{\min_{0 \neq u \in \mathbb{F}_2^m}\left(w_H(u \cdot F),\ 2^n - w_H(u \cdot F)\right),\ nl_{gen}F\right\},$$
where
$$nl_{gen}F = 2^{n-1} - \frac{1}{2}\max_{g \in \mathcal{BF}_m,\ w \in W} \hat{F}(g(\cdot), w_1(\cdot), \dots, w_n(\cdot)). \qquad (3.34)$$

The generalized nonlinearity can be much smaller than the other nonlinearity measures and
provides linear approximations with better bias for (fast) correlation attacks.

Relations to the Walsh transforms and lower bounds


The unrestricted nonlinearity of F can be related to the values of the Fourier–Hadamard
transforms of the functions $\varphi_b = 1_{\{b\}} \circ F$ (see page 112), and a lower bound (observed in
[1156]) depending on nl(F) can be directly deduced:

Proposition 42 For every (n, m)-function, we have
$$unl(F) = 2^{n-1} - \frac{1}{2}\max_{u \in \mathbb{F}_2^n \setminus \{0_n\}} \sum_{b \in \mathbb{F}_2^m} |\widehat{\varphi_b}(u)| \ \ge\ 2^{n-1} - 2^{m/2}\left(2^{n-1} - nl(F)\right). \qquad (3.35)$$

This bound does not give an idea of the best possible unrestricted nonlinearities: even if
nl(F) is close to the nonlinearity of bent functions $2^{n-1} - 2^{n/2-1}$, it implies that unl(F) is
approximately larger than $2^{n-1} - 2^{\frac{n+m}{2}-1}$, whereas there exist balanced (n, n/2)-functions F
such that $unl(F) = 2^{n-1} - 2^{n/2}$ (see below).
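
The equality in Proposition 42 can be checked against Definition 38 on toy examples. The sketch below (hypothetical names; F is given as a lookup table, and bit vectors are encoded as integers) computes unl(F) both by exhaustive search over all nonconstant g and all nonconstant affine functions, and through the Fourier–Hadamard transforms of the indicators $\varphi_b$.

```python
from itertools import product


def dot(a: int, b: int) -> int:
    return bin(a & b).count("1") & 1


def unl_bruteforce(table, n: int, m: int) -> int:
    """Definition 38: minimum distance between g o F and nonconstant affine functions."""
    best = 2 ** n
    for g in product((0, 1), repeat=2 ** m):
        if len(set(g)) == 1:
            continue  # g must be nonconstant
        g_of_f = [g[table[x]] for x in range(2 ** n)]
        for u in range(1, 2 ** n):
            d = sum(g_of_f[x] ^ dot(u, x) for x in range(2 ** n))
            best = min(best, d, 2 ** n - d)  # distance to u.x and to u.x + 1
    return best


def unl_formula(table, n: int, m: int) -> int:
    """Equality of Proposition 42, using phi_b_hat(u) = sum_{x in F^{-1}(b)} (-1)^(u.x)."""
    best = 0
    for u in range(1, 2 ** n):
        phi_hat = [0] * (2 ** m)
        for x in range(2 ** n):
            phi_hat[table[x]] += (-1) ** dot(u, x)
        best = max(best, sum(abs(c) for c in phi_hat))
    return 2 ** (n - 1) - best // 2


if __name__ == "__main__":
    table = [0, 1, 3, 2, 1, 1, 0, 3]  # an arbitrary (3, 2)-function
    print(unl_bruteforce(table, 3, 2), unl_formula(table, 3, 2))
```

The exhaustive search is doubly exponential in m, so it is only usable for very small m; the transform-based formula is the practical route.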

Proposition 43 [299] Let $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$ and let w(·) denote the n-tuple of m-bit Boolean
functions $(w_1(\cdot), \dots, w_n(\cdot))$. Then
$$nl_{gen}F = 2^{n-1} - \frac{1}{2}\sum_{z \in \mathbb{F}_2^m}\ \max_{w(z) \in \mathbb{F}_2^n \setminus \{0_n\}} |\widehat{\varphi_z}(w(z))|
= 2^{n-1} - \frac{1}{2^{m+1}}\sum_{z \in \mathbb{F}_2^m}\ \max_{0 \neq w(z) \in \mathbb{F}_2^n}\left|\sum_{v \in \mathbb{F}_2^m} (-1)^{v \cdot z}\, W_F(w(z), v)\right|,$$
where $W_F$ denotes the Walsh transform. Hence
$$gnl(F) \ge 2^{n-1} - (2^m - 1)\left(2^{n-1} - nl(F)\right).$$

Upper bounds
If F is balanced, the minimum distance between the component functions v · F and the affine
functions cannot be achieved by constant affine functions, because v · F, which is balanced,
has distance $2^{n-1}$ to constant functions. Hence:

Proposition 44 (covering radius bound) For every balanced S-box F:
$$unl(F) \le nl(F) \le 2^{n-1} - 2^{n/2-1}. \qquad (3.36)$$

Another upper bound:
$$unl(F) \le 2^{n-1} - \frac{1}{2}\left(\frac{2^{2m}-2^m}{2^n-1} + \sqrt{\frac{2^{2n}-2^{2n-m}}{2^n-1} + \left(\frac{2^{2m}-2^m}{2^n-1} - 1\right)^2} - 1\right)$$
has been obtained in [318]. It improves upon (i.e., is lower than) the covering radius bound
only for m ≥ n/2 + 1, and the question of knowing whether it is possible to improve upon
the covering radius bound for m ≤ n/2 is open. In any case, this improvement will not be
dramatic, at least for m = n/2, since it is shown (by using Relation (3.35)) in this same paper
that the balanced function $F(x, y) = \begin{cases} x/y & \text{if } y \neq 0 \\ x & \text{if } y = 0 \end{cases}$ satisfies $unl(F) = 2^{n-1} - 2^{n/2}$ (see
other examples of S-boxes in [698], whose unrestricted nonlinearities seem low, however). It
is quite astonishing that an S-box with such high unrestricted nonlinearity exists; but it can
be shown that this balanced function does not contribute to a good resistance to algebraic
attacks and has null generalized nonlinearity (see below).
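
The smallest instance of this function, n = 4 and m = 2 with x, y ∈ $\mathbb{F}_4$, can be examined by direct computation. The sketch below is illustrative: it assumes $\mathbb{F}_4$ is represented modulo $x^2 + x + 1$, encodes the input (x, y) as a 4-bit integer with x in the low bits, and evaluates unl(F) through Relation (3.35), nl(F) through the Walsh transform, and $nl_{gen}F$ through the first expression of Proposition 43. All names are hypothetical.

```python
N, M = 4, 2
INV4 = {1: 1, 2: 3, 3: 2}  # inverses in F_4 = F_2[x]/(x^2 + x + 1)


def gf4_mul(a: int, b: int) -> int:
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:
        r ^= 0b111  # reduce with x^2 = x + 1
    return r


def F(z: int) -> int:
    """F(x, y) = x/y if y != 0 and x otherwise, with z = x | (y << 2)."""
    x, y = z & 3, z >> 2
    return gf4_mul(x, INV4[y]) if y else x


def dot(a: int, b: int) -> int:
    return bin(a & b).count("1") & 1


def phi_hat(u: int):
    """List of phi_b_hat(u) = sum_{z in F^{-1}(b)} (-1)^(u.z), for b in F_2^M."""
    out = [0] * (2 ** M)
    for z in range(2 ** N):
        out[F(z)] += (-1) ** dot(u, z)
    return out


PHI = [phi_hat(u) for u in range(2 ** N)]


def walsh(u: int, v: int) -> int:
    return sum((-1) ** dot(v, b) * PHI[u][b] for b in range(2 ** M))


unl = 2 ** (N - 1) - max(sum(abs(c) for c in PHI[u]) for u in range(1, 2 ** N)) // 2
nl = 2 ** (N - 1) - max(abs(walsh(u, v))
                        for u in range(2 ** N) for v in range(1, 2 ** M)) // 2
nl_gen = 2 ** (N - 1) - sum(max(abs(PHI[u][b]) for u in range(1, 2 ** N))
                            for b in range(2 ** M)) // 2

if __name__ == "__main__":
    print(unl, nl, nl_gen)
```

In this instance unl(F) equals $2^{n-1} - 2^{n/2} = 4$ while $nl_{gen}F = 0$, in line with the statements of the text.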

Proposition 45 Let $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$. Then the following inequality holds:
$$nl_{gen}F \le 2^{n-1} - \frac{1}{4}\sum_{z \in \mathbb{F}_2^m}\sqrt{\frac{2^{n+2}\,|F^{-1}(z)| - 4\,|F^{-1}(z)|^2}{2^n - 1}}.$$
Furthermore, if F is balanced, then we have
$$gnl(F) \le 2^{n-1} - 2^{n-1}\sqrt{\frac{2^m-1}{2^n-1}}.$$

It is proved in [300] that the balanced function $F(x, y) = \begin{cases} x/y & \text{if } y \neq 0 \\ x & \text{if } y = 0 \end{cases}$ has null
generalized nonlinearity. Hence, a vectorial function may have very high unrestricted
nonlinearity and have zero generalized nonlinearity. Some functions with good generalized
nonlinearity are given in [300]:

1. $F(x) = tr_n^m(x^k)$, where $k = 2^i + 1$, gcd(i, n) = 1, is a Gold exponent.
2. $F(x) = tr_n^m(x^k)$, where $k = 2^{2i} - 2^i + 1$ is a Kasami exponent, 3i ≡ 1 [mod n],

where m divides n and n is odd, and where $tr_n^m$ is the trace function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^m}$, have
generalized nonlinearity satisfying $gnl(F) \ge 2^{n-1} - 2^{(n-1)/2+m-1}$.
Power functions and sums of power functions have, for the designer of a cryptosystem
using them, the advantage of being more easily computable than general functions (which
makes it possible to use them with more variables while maintaining good efficiency). Power
functions have the peculiarity that, denoting the set $\{x^d;\ x \in \mathbb{F}_{2^n}^*\}$ by U, two functions
$tr_n(ax^d)$ and $tr_n(bx^d)$ such that a/b ∈ U are linearly equivalent. It is not clear whether this
is more an advantage for the designer or for the attacker of a system using such a function.

3.4 Cryptographic criteria and parameters for vectorial functions in block ciphers
We have seen in Subsection 3.2.3 a first example of the role played by S-boxes in the
robustness of the block ciphers in which they are involved, and of how the main attacks
on block ciphers result in design criteria for the S-boxes they implement. We shall see now
a second example, whose importance is comparable.

3.4.1 Differential uniformity


The differential attack, introduced by Biham and Shamir [82] (but which was already known
by the NSA and kept secret), predates the linear attack. It assumes the existence of
ordered pairs (α, β), α ≠ 0, of binary strings of the same length as the blocks (which
are binary strings too), such that, a block m of plaintext being randomly chosen and
c and c' being the ciphertexts related to m and m + α, the bitwise difference c + c'
(recall that + denotes the bitwise addition/difference in $\mathbb{F}_2^n$) has a larger probability to
equal β than if c and c' were randomly chosen binary strings. Such an ordered pair
(α, β) corresponding to a bias in the output distribution is called a differential and can
be exploited in differential attacks; the larger the probability of the differential, the more
efficient the attack. As for the linear attack, there are several ways to mount such differential
cryptanalysis. The most common (and most efficient) is to use differentials for the reduced
cipher (see Figure 3.1, page 116). The existence of a differential allows one to distinguish,
in a last round attack, the reduced cipher output from a random permutation. The existence
of such a distinguisher allows recovering the key used in the last round, by an exhaustive
search, which is efficient if this key is shorter than the master key, or by using specificities
of the cipher that allow replacing the exhaustive search by, for instance, solving algebraic
equations.
Here also, we describe the attack in the case of exhaustive search, which is simpler to
describe. Similar to what we have seen on page 116, the attacker, who knows a number
of pairs (plaintext, ciphertext) corresponding to the original cipher and of the form (m, c)
and (m + α, c'), where (α, β) is a differential for the reduced cipher, visits all possible last
round keys. For each such candidate last round key, he/she inverts the last round
and obtains, in the case of the correct key guess, the output of the reduced cipher; the attacker
then observes the statistical bias of the differential. In all the other cases (incorrect guesses),
the obtained binary string is considered as random, with no observable bias. The number of
pairs (m, c) and (m + α, c') known to him/her then needs to be large enough to
distinguish the bias. This number depends on how much larger the probability of the differential
is than for a random pair. In the case of DES, the number was $2^{47}$ (which is huge and made the
attack impractical).
The existence of differential attacks leads to a criterion on (n, m)-functions F, when
used as S-boxes in the round functions of the cipher, which corresponds to minimizing
the possibilities for the attacker to find differentials whose probability is large. Since
the differentials cannot be determined by direct computer investigation and must then be
approximately evaluated by “chaining” differentials inside each round, the criterion is that
the output of the derivatives $D_aF(x) = F(x) + F(x + a)$; $x, a \in \mathbb{F}_2^n$, $a \neq 0_n$, be as uniformly
distributed as possible. This leads to the following parameter.

Definition 40 [906, 907, 912] Let n, m, δ be positive integers. An (n, m)-function F is
called differentially δ-uniform if, for every nonzero $a \in \mathbb{F}_2^n$ and every $b \in \mathbb{F}_2^m$, the equation
$F(x) + F(x + a) = b$ has at most δ solutions. The minimum of those values δ having such
property, that is, the maximum number of solutions of such equations, is denoted by $\delta_F$ and
called the differential uniformity of F.

The differential uniformity $\delta_F$ is necessarily even since the solutions of equation
$D_aF(x) = b$ go by pairs: if x is a solution of $F(x) + F(x + a) = b$, then x + a is also
a solution. The lower $\delta_F$ is, the better the contribution of the S-box to the resistance to
the differential attack, as shown in [908, 912]. The differential uniformity $\delta_F$ of any (n, m)-function
F is bounded below by $2^{n-m}$ (as observed by Nyberg) since, $D_aF$ being an (n, m)-function,
at least one element of $\mathbb{F}_2^m$ has at least $2^{n-m}$ preimages by $D_aF$. The differential
uniformity equals $2^{n-m}$ if and only if every derivative $D_aF$, $a \neq 0_n$, is balanced. We say
then that F is perfect nonlinear, and we shall see in Chapter 6 that this is equivalent to saying
that F is bent. According to a result from Nyberg that we shall see in Proposition 104, page
269, (n, m)-functions have differential uniformity strictly larger than $2^{n-m}$ when n is odd or
m > n/2.
The differential uniformity of an S-box being determined, its differential spectrum also
affects the security of the corresponding cipher. The differential spectrum is the multiset of
the values:

$$\delta_F(a, b) = |\{x \in \mathbb{F}_2^n;\ D_aF(x) = F(x) + F(x + a) = b\}| = (1_{G_F} \otimes 1_{G_F})(a, b), \qquad (3.37)$$

(where $1_{G_F}$ is the graph indicator of F; see page 35) and the difference distribution table
(DDT) is the table that displays them (note that, given a permutation F, all these data are
the same for F and $F^{-1}$, up to exchanging a and b, since $1_{G_F}(x, y) = 1_{G_{F^{-1}}}(y, x)$).
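As a small illustration of the DDT and of the remark on $F^{-1}$, here is a minimal Python sketch. The lookup table is the PRESENT S-box as recalled from the cipher specification (an assumption of this sketch, not data taken from the present text), and the names `ddt`, `differential_uniformity` and `inverse` are ours.

```python
# PRESENT S-box, a (4, 4)-permutation; table quoted from the cipher
# specification as an assumption of this sketch.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def ddt(F, n, m):
    """table[a][b] = |{x : F(x) + F(x + a) = b}|, where + is bitwise XOR."""
    table = [[0] * (1 << m) for _ in range(1 << n)]
    for a in range(1 << n):
        for x in range(1 << n):
            table[a][F[x] ^ F[x ^ a]] += 1
    return table


def differential_uniformity(F, n, m):
    """Largest DDT entry over all nonzero input differences a."""
    return max(max(row) for row in ddt(F, n, m)[1:])


def inverse(F):
    """Lookup table of the compositional inverse of a permutation."""
    inv = [0] * len(F)
    for x, y in enumerate(F):
        inv[y] = x
    return inv


T = ddt(SBOX, 4, 4)
T_INV = ddt(inverse(SBOX), 4, 4)
DELTA = differential_uniformity(SBOX, 4, 4)
print("differential uniformity:", DELTA)
```

Every row of the table sums to $2^n$, every entry is even, and the table of the inverse permutation is the transpose of the table of the permutation.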

For every $u \in \mathbb{F}_2^n$ and $v \in \mathbb{F}_2^m$, we have
$$\sum_{a \in \mathbb{F}_2^n,\, b \in \mathbb{F}_2^m} \delta_F(a, b)(-1)^{u \cdot a \oplus v \cdot b} = \sum_{a, x \in \mathbb{F}_2^n} (-1)^{u \cdot a \oplus v \cdot D_aF(x)} = \sum_{a \in \mathbb{F}_2^n} \Delta_{v \cdot F}(a)(-1)^{u \cdot a},$$
by the change of variable $b = D_aF(x)$, and since $v \cdot D_aF = D_a(v \cdot F)$ (with $\Delta_{v \cdot F}$ the autocorrelation function of $v \cdot F$), and the Wiener–Khintchine formula (2.53),
page 62 (or Property (2.44), page 60, applied to expression (3.37)), shows that the Fourier
transform of function $\delta_F$ equals $W_F^2$.
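The relation between the Fourier transform of $\delta_F$ and $W_F^2$ can be checked numerically on a toy example. The sketch below assumes the convention $W_F(u, v) = \sum_x (-1)^{v \cdot F(x) \oplus u \cdot x}$; the (3, 2)-function `F` is an arbitrary table chosen for illustration, and all function names are ours.

```python
def dot(x, y):
    """Inner product over F_2 of the bit vectors encoded by integers x, y."""
    return bin(x & y).count("1") & 1


def walsh(F, n, u, v):
    """W_F(u, v) = sum over x of (-1)^(v.F(x) + u.x)."""
    return sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n))


def delta(F, n, a, b):
    """Number of solutions x of F(x) + F(x + a) = b."""
    return sum(1 for x in range(1 << n) if F[x] ^ F[x ^ a] == b)


def fourier_of_delta(F, n, m, u, v):
    """Fourier transform of (a, b) -> delta_F(a, b) evaluated at (u, v)."""
    return sum(delta(F, n, a, b) * (-1) ** (dot(u, a) ^ dot(v, b))
               for a in range(1 << n) for b in range(1 << m))


# Hypothetical (3, 2)-function given by its lookup table.
F = [0, 1, 3, 2, 1, 1, 0, 3]
N, M = 3, 2
CHECK = all(fourier_of_delta(F, N, M, u, v) == walsh(F, N, u, v) ** 2
            for u in range(1 << N) for v in range(1 << M))
print("Fourier transform of delta_F equals W_F^2:", CHECK)
```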
Differential uniformity is in fact a notion on the graph $G_F = \{(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^m;\ y = F(x)\}$
of the function: it is the maximum number of solutions $(X, Y) \in G_F^2$ of the
equation $X + Y = (a, b)$ when $(a, b) \in (\mathbb{F}_2^n \setminus \{0_n\}) \times \mathbb{F}_2^m$. For this reason, differential
uniformity is a CCZ invariant (see Definition 5, page 28). The necessary and sufficient
condition, recalled from [201] and that we reported at page 72, ensuring that the image
by an affine permutation $A = L + (a, b)$ of the graph $G_F$ of F is the graph of a
function, is equivalent to the fact that the image of $\{0_n\} \times \mathbb{F}_2^m$ by $L^{-1}$ is included in the
set $\delta_F^{-1}(0) \cup \{(0_n, 0_m)\}$.
It is observed in [910, 1060] that, because of the truncated differential attack [706], the
differential uniformity of the (so-called chopped) functions obtained by withdrawing a few
coordinate functions should also be considered and can be low for some vectorial functions
having good differential uniformity.
Note that a function with good nonlinearity does not necessarily have good
differential uniformity: take an (n, m)-function F and consider the (n + 1, m)-function
F′ such that $F'(x, x_{n+1}) = F(x)$ for every $x \in \mathbb{F}_2^n$, $x_{n+1} \in \mathbb{F}_2$; the nonlinearity of F′
is twice that of F and can then be rather good, while the differential uniformity of F′
equals $2^{n+1}$ (the derivative of F′ in the direction $(0_n, 1)$ is null) and is then bad. The converse is not true either: take any (n, m)-function F
and consider the (n, m + 1)-function F′ obtained by adding a null coordinate function;
the nonlinearity of F′ is null while the differential uniformity equals that of F and can then
be good.
The asymptotic behavior of $\delta_F$ for general (n, n)-functions F has been studied in [1098],
after being studied in [643] for power functions over $\mathbb{F}_{2^n}$:

Proposition 46 [1098] For any d > 4 with d ≡ 0, 3 [mod 4], the limit when n tends to infinity
of the ratio
$$\frac{\left|\left\{F \in \mathbb{F}_{2^n}[x];\ \deg(F) = d \text{ and } \delta_F = \begin{cases} d - 1 & \text{for } d \text{ odd} \\ d - 2 & \text{for } d \text{ even} \end{cases}\right\}\right|}{|\{F \in \mathbb{F}_{2^n}[x];\ \deg(F) = d\}|},$$
where deg(F) denotes the polynomial degree, equals 1.

For more general (n, m)-functions, see [405, 589, 913]; the average differential uniformity
of (n, m)-functions is much larger than $2^{n-m}$.

Almost perfect nonlinear functions The smaller the differential uniformity, the better the
contribution to the resistance against differential cryptanalysis. When m ≥ n, the smallest
possible value of δF (which is always even) is 2, and differentially 2-uniform functions can
exist only when m ≥ n (indeed, we need m ≥ n − 1, and m = n − 1 is impossible except
if n ≤ 2 since differentially 2-uniform (n, n − 1)-functions are perfect nonlinear, and we
would then need to have n − 1 ≤ n/2, as we shall see in Proposition 104, page 269). We
use the term APN function only when m = n. Note that the notion of APN function and
the differential property of the multiplicative inverse function had been investigated starting
from 1968 by V. Bashev and B. Egorov in the USSR.

Definition 41 [71, 908, 912] An (n, n)-function F is called almost perfect nonlinear (APN)
if it is differentially 2-uniform, that is, if for every $a \in \mathbb{F}_2^n \setminus \{0_n\}$ and every $b \in \mathbb{F}_2^n$, the
equation $F(x) + F(x + a) = b$ has 0 or 2 solutions (i.e., $|\{D_aF(x),\ x \in \mathbb{F}_2^n\}| = 2^{n-1}$).
Equivalently, for distinct elements x, y, z, t of $\mathbb{F}_2^n$, the equality $x + y + z + t = 0_n$ implies
$F(x) + F(y) + F(z) + F(t) \neq 0_n$, that is, the restriction of F to any 2-dimensional flat (i.e.,
affine plane) of $\mathbb{F}_2^n$ is nonaffine.

We have already encountered APN functions when proving the SCV bound, and the
equivalence between these three properties is easily seen: Inequality (3.25), page 118, is
an equality if and only if $F(x) + F(y) + F(z) + F(x + y + z) = 0_n$ can be achieved only
when x = y or x = z or y = z, and this is equivalent to any of the following properties:
– The restriction of F to any two-dimensional flat (i.e., affine plane) of $\mathbb{F}_2^n$ is non-affine,
that is, does not sum up to $0_n$ (indeed, the set {x, y, z, x + y + z} is a flat and it is
two-dimensional if and only if x ≠ y and x ≠ z and y ≠ z; and $F(x) + F(y) + F(z) + F(x + y + z) = 0_n$
is equivalent to saying that the restriction of F to this flat is affine,
since we know that a function F is affine on a flat A if and only if, for every x, y, z in A
we have F(x + y + z) = F(x) + F(y) + F(z)).
– For every distinct nonzero (that is, $\mathbb{F}_2$-linearly independent) vectors a and a′, the second-order
derivative $D_aD_{a'}F(x) = F(x) + F(x + a) + F(x + a') + F(x + a + a')$ takes
only nonzero values.
– The equality $F(x) + F(x + a) = F(y) + F(y + a)$ (obtained from $F(x) + F(y) + F(z) + F(x + y + z) = 0_n$
by denoting x + z by a) can be achieved only for $a = 0_n$ or
x = y or x = y + a.
– For every $a \in \mathbb{F}_2^n \setminus \{0_n\}$ and every $b \in \mathbb{F}_2^n$, the equation $D_aF(x) = F(x) + F(x + a) = b$
has at most two solutions (that is, zero or two solutions, since if it has one solution x,
then it has x + a for a second solution).
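The characterizations listed above can be compared on a concrete function. The following sketch tests three of them on the cube function $x \mapsto x^3$ over $\mathbb{F}_{2^5}$, which is a classical APN (Gold) function; the field is represented with the polynomial $x^5 + x^2 + 1$, and this choice as well as all function names are assumptions of the sketch.

```python
from itertools import combinations

N = 5
MOD = 0b100101  # x^5 + x^2 + 1, irreducible over F_2


def gf_mul(a, b):
    """Multiply two elements of F_{2^5} = F_2[x] / (x^5 + x^2 + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


CUBE = [gf_mul(x, gf_mul(x, x)) for x in range(1 << N)]
IDENTITY = list(range(1 << N))  # a linear function, hence not APN


def is_apn_by_counting(F):
    """Each equation F(x) + F(x + a) = b with a != 0 has 0 or 2 solutions."""
    for a in range(1, len(F)):
        counts = {}
        for x in range(len(F)):
            d = F[x] ^ F[x ^ a]
            counts[d] = counts.get(d, 0) + 1
        if any(c != 2 for c in counts.values()):
            return False
    return True


def is_apn_by_flats(F):
    """F does not sum to zero on any 2-dimensional flat {x, y, z, x+y+z}."""
    return all(F[x] ^ F[y] ^ F[z] ^ F[x ^ y ^ z] != 0
               for x, y, z in combinations(range(len(F)), 3))


def is_apn_by_second_derivatives(F):
    """D_a D_a' F takes only nonzero values for distinct nonzero a, a'."""
    return all(F[x] ^ F[x ^ a] ^ F[x ^ b] ^ F[x ^ a ^ b] != 0
               for a, b in combinations(range(1, len(F)), 2)
               for x in range(len(F)))


TESTS = (is_apn_by_counting, is_apn_by_flats, is_apn_by_second_derivatives)
print([t(CUBE) for t in TESTS], [t(IDENTITY) for t in TESTS])
```

The three tests agree on both inputs, as the equivalence predicts: all succeed on the cube function and all fail on the identity.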

Remark. As in the case of AB functions, the term "almost perfect nonlinear" suggests
that these functions are almost optimal, while they are in fact optimal.

Chapter 11 covers the whole topic of APN functions.

Related nonlinearity parameters


– We have seen at page 121 the nonlinearity parameter alternative to the classical
nonlinearity, equal to the maximum imbalance of the sums of F and linear functions:
$\max_{L \in \mathcal{L}_{n,m}} Nb_{F+L}$.
If m = n and F is APN, then according to the properties seen at page 114,
$$Nb_{F+L} = \sum_{a \in \mathbb{F}_2^n} |(D_aF)^{-1}(L(a))| - 2^n = \sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} |(D_aF)^{-1}(L(a))|$$
is bounded above by $2(2^n - 1)$, for every L, which implies that $\max_{L \in \mathcal{L}_{n,m}} Nb_{F+L}$ lies in the interval
$]2^n - 1;\ 2(2^n - 1)]$ (since we know from Proposition 37, page 120, that it is larger than or equal to
$2^n - 2^{n-m}$, and we know that it cannot equal $2^n - 2^{n-m}$ since F would be bent). Moreover,
when n is even, the maximum $2(2^n - 1)$ is achieved by all APN power functions; indeed,
Dobbertin proved (and we shall see in Proposition 165, page 385) that for any APN power
function F, there are $\frac{2^n - 1}{3}$ elements of $\mathbb{F}_{2^n}^*$ having three preimages each by F, and all the
other elements of $\mathbb{F}_{2^n}^*$ have no preimage (see, e.g., [237]), which implies, using (3.19), that
$Nb_F = 1 + 9 \cdot \frac{2^n - 1}{3} - 2^n = 2(2^n - 1)$.
APN functions in five variables have been classified under EA equivalence and CCZ
equivalence in [134]. When F is the inverse function, $\max_{L \in \mathcal{L}_{n,m}} Nb_{F+L}$ equals 56. There
is no other APN and non-AB function for n = 5.
For m = n = 6 and m = n = 8, the functions CCZ equivalent to $x^3$, found in [162, 163],
match the maximum $2(2^n - 1)$, as do the APN power functions. We do not know if some
APN functions can have a smaller value of $\max_{L \in \mathcal{L}_{n,m}} Nb_{F+L}$ for n even. And it is not
clear to us whether $\max_{L \in \mathcal{L}_{n,m}} Nb_{F+L}$ can take diverse values when F is AB for n odd and
whether it is CCZ invariant.
– The bentness/perfect nonlinearity of a function being characterized by the balancedness of
its derivatives, the following nonlinearity indicator has been introduced in [267]:

$$NB_F = \sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} Nb_{D_aF} = \sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} \sum_{b \in \mathbb{F}_2^m} |(D_aF)^{-1}(b)|^2 - (2^n - 1)2^{2n-m}. \qquad (3.38)$$

This indicator is directly related to Nyberg’s and Chabaud–Vaudenay’s results and proofs; it
allows clarifying some properties found by them (see, e.g., Relation (3.40) below) and saying
a bit more. We shall call it the derivative imbalance of F. It has the following properties, as
mentioned in [239, 267]:
• $NB_F \geq 0$, for every function F, and $NB_F = 0$ if and only if F is bent/perfect nonlinear.

• NB is CCZ invariant since $NB_F$ equals $\sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} |\{(x, y) \in (\mathbb{F}_2^n)^2 \,/\, D_aF(x) = D_aF(y)\}| - (2^n - 1)2^{2n-m}$
and equals therefore
$$\left|\left\{(x, x', y, y') \in (\mathbb{F}_2^n)^4 \,/\, \begin{array}{l} x + x' = y + y' \neq 0_n \\ F(x) + F(x') = F(y) + F(y') \end{array}\right\}\right| - (2^n - 1)2^{2n-m}$$
$$= \left|\left\{(X, X', Y, Y') \in G_F^4 \,/\, X + X' = Y + Y' \neq 0_{n+m}\right\}\right| - (2^n - 1)2^{2n-m},$$
where $G_F = \{(x, F(x)) \in \mathbb{F}_2^n \times \mathbb{F}_2^m\}$ is the graph of F.

• $NB_F \geq (2^n - 1)(2^{n+1} - 2^{2n-m})$ (this inequality comes from the Cauchy–Schwarz
inequality $\sum_{b \in \mathbb{F}_2^m} |F^{-1}(b)|^2 \geq \frac{\left(\sum_{b \in \mathbb{F}_2^m} |F^{-1}(b)|\right)^2}{|Im(F)|} = \frac{2^{2n}}{|Im(F)|}$ applied to $D_aF$ and from
$|Im(D_aF)| \leq 2^{n-1}$; see an improvement in [522, proposition 3]); note that this proves
again that (n, n)-functions cannot be perfect nonlinear; there is equality if and only if, for
every $a \neq 0_n$, $|Im(D_aF)|$ equals $2^{n-1}$ and $|(D_aF)^{-1}(b)|$ is constant for $b \in Im(D_aF)$.
For n = m, there is then equality if and only if F is APN;
• $NB_F = \sum_{a, a' \in \mathbb{F}_2^n \text{ linearly indept}} |(D_aD_{a'}F)^{-1}(0_m)| - (2^n - 1)(2^{2n-m} - 2^{n+1})$.
• $NB_F \leq (2^n - 1)(2^{2n} - 2^{2n-m})$, for every function $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$ (see a refinement in
[522, proposition 4]) and $NB_F = (2^n - 1)(2^{2n} - 2^{2n-m})$ if and only if F is affine.
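The extreme cases in the properties above can be observed on toy examples. The sketch below computes $NB_F$ directly from definition (3.38) for three small lookup tables: the cube function on $\mathbb{F}_{2^3}$ (APN; the table assumes the field is built with $x^3 + x + 1$), the identity (affine) and the Boolean function $x_1x_2$ (bent). The tables and function names are illustrative choices of ours, and the code assumes $m \leq 2n$.

```python
def derivative_imbalance(F, n, m):
    """NB_F = sum_{a != 0} sum_b |(D_a F)^{-1}(b)|^2 - (2^n - 1) 2^(2n - m)."""
    total = 0
    for a in range(1, 1 << n):
        counts = [0] * (1 << m)
        for x in range(1 << n):
            counts[F[x] ^ F[x ^ a]] += 1
        total += sum(c * c for c in counts)
    return total - ((1 << n) - 1) * (1 << (2 * n - m))


def lower_bound(n, m):
    """(2^n - 1)(2^(n+1) - 2^(2n-m)), reached for n = m by APN functions."""
    return ((1 << n) - 1) * ((1 << (n + 1)) - (1 << (2 * n - m)))


def upper_bound(n, m):
    """(2^n - 1)(2^(2n) - 2^(2n-m)), reached exactly by affine functions."""
    return ((1 << n) - 1) * ((1 << (2 * n)) - (1 << (2 * n - m)))


CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # x -> x^3 on F_8 (APN)
IDENTITY = list(range(8))         # affine (3, 3)-function
BENT = [0, 0, 0, 1]               # truth table of x1 x2, a bent (2, 1)-function

NB_CUBE = derivative_imbalance(CUBE, 3, 3)
NB_ID = derivative_imbalance(IDENTITY, 3, 3)
NB_BENT = derivative_imbalance(BENT, 2, 1)
print(NB_CUBE, NB_ID, NB_BENT)
```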

Remark. A parameter (see footnote 24) has been introduced afterward in [922, 925] and studied further (see footnote 25)
in [921, 923, 924], without comparing it to $NB_F$. We give here its definition for (n, m)-functions
but, as $NB_F$, it can be defined for any function from an Abelian group to an
Abelian group: the ambiguity A(F) equals
$$\sum_{i \geq 0} \binom{i}{2}\, |\{(a, b) \in (\mathbb{F}_2^n \setminus \{0_n\}) \times \mathbb{F}_2^m;\ |(D_aF)^{-1}(b)| = i\}|.$$
A(F) is the same as $NB_F$, up to a constant and to the multiplication by $\frac{1}{2}$:
$$A(F) = \frac{1}{2} \sum_{(a,b) \in (\mathbb{F}_2^n \setminus \{0_n\}) \times \mathbb{F}_2^m} |(D_aF)^{-1}(b)|^2 - \frac{1}{2} \sum_{(a,b) \in (\mathbb{F}_2^n \setminus \{0_n\}) \times \mathbb{F}_2^m} |(D_aF)^{-1}(b)|$$
$$= \frac{1}{2}\left(NB_F + (2^n - 1)2^{2n-m}\right) - \frac{(2^n - 1)2^n}{2}$$
$$= \frac{1}{2} NB_F + (2^n - 1)(2^{2n-m-1} - 2^{n-1}).$$
In [522], the necessary work of unification of the results on $NB_F$ and on ambiguity is made.
The known bounds on $NB_F$ and those on ambiguity are compared, and all the results are
translated from one definition to the other. More results are also given.

Parameter $NB_F$ can be expressed by means of the Walsh transform. Thanks to Relation
(3.20), page 114, we have
$$NB_F = 2^{-m} \sum_{a \in \mathbb{F}_2^n,\, a \neq 0_n} \ \sum_{v \in \mathbb{F}_2^m,\, v \neq 0_m} W_{D_aF}^2(0_n, v). \qquad (3.39)$$

Chabaud–Vaudenay’s calculations recalled in the proof of Theorem 6, more precisely at
page 118, show that
$$\sum_{u \in \mathbb{F}_2^n} \ \sum_{v \in \mathbb{F}_2^m,\, v \neq 0_m} W_F^4(u, v) = 2^{n+m}\, |\{(x, y, a) \in (\mathbb{F}_2^n)^3 \,/\, F(x) + F(x + a) = F(y) + F(y + a)\}| - 2^{4n}$$
$$= 2^{3n+m} + 2^{n+m} \sum_{a \in \mathbb{F}_2^n,\, a \neq 0_n} |\{(x, y) \in (\mathbb{F}_2^n)^2 \,/\, D_aF(x) = D_aF(y)\}| - 2^{4n}$$
$$= 2^{3n+m} - 2^{4n} + 2^{n+m} \sum_{a \in \mathbb{F}_2^n,\, a \neq 0_n} \ \sum_{b \in \mathbb{F}_2^m} |(D_aF)^{-1}(b)|^2$$
$$= 2^{3n}(2^m - 1) + 2^{n+m} NB_F. \qquad (3.40)$$

24 A second parameter called deficiency is also introduced and studied in the same papers:
$D(F) = |\{(a, b) \in (\mathbb{F}_2^n \setminus \{0_n\}) \times \mathbb{F}_2^m;\ |(D_aF)^{-1}(b)| = 0\}|$. It plays a less important role.
25 In particular for functions from Z/nZ (resp. from the additive/multiplicative group of a finite field) to itself,
and for some specific functions over finite fields, including all permutation polynomials over finite fields up to
degree 6 and reversed Dickson polynomials (which we shall see in more detail at page 389).

The Sidelnikov–Chabaud–Vaudenay bound can then be specified as follows:

$$nl(F) \leq 2^{n-1} - \frac{1}{2}\sqrt{2^n + \frac{2^{m-n}}{2^m - 1}\, NB_F}. \qquad (3.41)$$

(This obviously implies the covering radius bound since $NB_F \geq 0$, and the SCV bound,
because of the inequality $NB_F \geq (2^n - 1)(2^{n+1} - 2^{2n-m})$ recalled above.)
We can immediately see that the bound in (3.41) is tight for m ≤ n/2, n even (since the
covering radius bound is then tight) and for m = n, n odd (since the Sidelnikov–Chabaud–
Vaudenay bound is then tight). In fact, it is tight for all values of n and m: the proof in
[267] shows that it is an equality for a given F if and only if F is plateaued with single
amplitude (see Definition 67, page 274). It would be interesting to determine for which
triples (n, m, NBF ), or equivalently for which triples (n, m, nl(F )), the bound is tight (which
would be determined if we know all possible amplitudes for plateaued (n, m)-functions with
single amplitude).
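Relation (3.40) and bound (3.41) can be checked by brute force on a small example. The sketch below draws a pseudorandom (4, 3)-function (seed and sizes are arbitrary choices of ours), assumes the convention $W_F(u, v) = \sum_x (-1)^{v \cdot F(x) \oplus u \cdot x}$ and $nl(F) = 2^{n-1} - \frac{1}{2}\max_{u,\, v \neq 0_m} |W_F(u, v)|$, and uses helper names that are not from the text.

```python
import math
import random


def dot(x, y):
    """Inner product over F_2 of the bit vectors encoded by integers x, y."""
    return bin(x & y).count("1") & 1


def walsh(F, n, u, v):
    """W_F(u, v) = sum over x of (-1)^(v.F(x) + u.x)."""
    return sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n))


def derivative_imbalance(F, n, m):
    """NB_F computed from definition (3.38)."""
    total = 0
    for a in range(1, 1 << n):
        counts = [0] * (1 << m)
        for x in range(1 << n):
            counts[F[x] ^ F[x ^ a]] += 1
        total += sum(c * c for c in counts)
    return total - ((1 << n) - 1) * (1 << (2 * n - m))


def check(F, n, m):
    """Return (does (3.40) hold?, nl(F), right-hand side of (3.41))."""
    spectrum = [walsh(F, n, u, v)
                for u in range(1 << n) for v in range(1, 1 << m)]
    nb = derivative_imbalance(F, n, m)
    fourth_moment = sum(w ** 4 for w in spectrum)
    expected = (1 << (3 * n)) * ((1 << m) - 1) + (1 << (n + m)) * nb
    nl = (1 << (n - 1)) - max(abs(w) for w in spectrum) // 2
    bound = 2 ** (n - 1) - 0.5 * math.sqrt(
        2 ** n + 2.0 ** (m - n) * nb / (2 ** m - 1))
    return fourth_moment == expected, nl, bound


rng = random.Random(1)
N, M = 4, 3
F = [rng.randrange(1 << M) for _ in range(1 << N)]
IDENTITY_HOLDS, NL, BOUND = check(F, N, M)
print(IDENTITY_HOLDS, NL, BOUND)
```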
Two other bounds on the nonlinearity involving the imbalance are given in [1133].
We have seen with Proposition 37, page 120, that the mean of the random variable L →
NbF +L is the same for every function. We shall see now that its variance equals NBF , up
to a multiplicative factor.

Proposition 47 [239, 267] Let F be any (n, m)-function. The variance of the random
variable $L \in \mathcal{L}_{n,m} \mapsto Nb_{F+L}$ equals $2^{-m} NB_F$.

Proof Let us denote by $V_F$ the variance of the random variable $L \in \mathcal{L}_{n,m} \mapsto Nb_{F+L}$,
equal to that of the random variable $L \mapsto \sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} |(D_aF)^{-1}(L(a))|$ according to
Relation (3.28), page 120, whose mean equals $2^{2n-m} - 2^{n-m}$, according to Proposition
37. Hence $V_F$ equals

$$\frac{1}{|\mathcal{L}_{n,m}|} \sum_{L \in \mathcal{L}_{n,m}} \ \sum_{a, a' \in \mathbb{F}_2^n \setminus \{0_n\}} |(D_aF)^{-1}(L(a))|\, |(D_{a'}F)^{-1}(L(a'))| - \left(2^{2n-m} - 2^{n-m}\right)^2.$$

Let us distinguish the case where $a = a' \neq 0_n$ and the case where a and a′ are linearly
independent. We have seen that, when a is a fixed nonzero vector, the number of linear
functions L such that L(a) = b equals $2^{m(n-1)} = 2^{-m}|\mathcal{L}_{n,m}|$, for every vector b; similarly,
when a, a′ are fixed linearly independent vectors, the number of linear functions L such that
L(a) = b and L(a′) = b′ equals $2^{m(n-2)} = 2^{-2m}|\mathcal{L}_{n,m}|$, for all vectors b, b′. We obtain
$$V_F = 2^{-m} \sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} \ \sum_{b \in \mathbb{F}_2^m} |(D_aF)^{-1}(b)|^2 + 2^{-2m} \mu_F - \left(2^{2n-m} - 2^{n-m}\right)^2,$$
where
$$\mu_F = \sum_{\substack{a, a' \in \mathbb{F}_2^n \setminus \{0_n\} \\ a \neq a'}} \ \sum_{b, b' \in \mathbb{F}_2^m} |(D_aF)^{-1}(b)|\, |(D_{a'}F)^{-1}(b')|
= \sum_{\substack{a, a' \in \mathbb{F}_2^n \setminus \{0_n\} \\ a \neq a'}} \left(\sum_{b \in \mathbb{F}_2^m} |(D_aF)^{-1}(b)|\right) \left(\sum_{b \in \mathbb{F}_2^m} |(D_{a'}F)^{-1}(b)|\right)
= (2^n - 1)(2^n - 2)\, 2^{2n}.$$

Then by the definition of $NB_F$, $V_F = 2^{-m}(NB_F + (2^n - 1)2^{2n-m}) + 2^{4n-2m} - 3 \cdot 2^{3n-2m} + 2 \cdot 2^{2n-2m} - (2^{4n-2m} - 2 \cdot 2^{3n-2m} + 2^{2n-2m}) = 2^{-m} NB_F$.
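Proposition 47 can be verified exhaustively for small parameters by enumerating all $2^{mn}$ linear (n, m)-functions. The sketch below does this for an arbitrary (3, 2)-function; it takes $Nb_G = \sum_b |G^{-1}(b)|^2 - 2^{2n-m}$ as the imbalance, and the table `F` and all names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def imbalance(G, n, m):
    """Nb_G = sum_b |G^{-1}(b)|^2 - 2^(2n - m)."""
    counts = [0] * (1 << m)
    for x in range(1 << n):
        counts[G[x]] += 1
    return sum(c * c for c in counts) - (1 << (2 * n - m))


def derivative_imbalance(F, n, m):
    """NB_F = sum over nonzero a of Nb_{D_a F}."""
    return sum(imbalance([F[x] ^ F[x ^ a] for x in range(1 << n)], n, m)
               for a in range(1, 1 << n))


def linear_functions(n, m):
    """Yield the lookup tables of all linear functions F_2^n -> F_2^m."""
    for images in product(range(1 << m), repeat=n):  # images of the basis
        table = []
        for x in range(1 << n):
            y = 0
            for i in range(n):
                if (x >> i) & 1:
                    y ^= images[i]
            table.append(y)
        yield table


def mean_and_variance(F, n, m):
    """Exact mean and variance of L -> Nb_{F+L} over all linear L."""
    values = [imbalance([F[x] ^ L[x] for x in range(1 << n)], n, m)
              for L in linear_functions(n, m)]
    mean = Fraction(sum(values), len(values))
    second_moment = Fraction(sum(v * v for v in values), len(values))
    return mean, second_moment - mean * mean


# Hypothetical (3, 2)-function given by its lookup table.
F = [0, 1, 3, 2, 1, 1, 0, 3]
N, M = 3, 2
MEAN, VARIANCE = mean_and_variance(F, N, M)
NB = derivative_imbalance(F, N, M)
print(MEAN, VARIANCE, Fraction(NB, 1 << M))
```

Exact rational arithmetic is used so that the comparison with $2^{-m} NB_F$ does not depend on rounding.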

Remark. In [239], it is shown that the mean of NbF +L when L ranges over the subset of
balanced linear (n, m)-functions is the highest when F is balanced, but that its value is then
not much larger than the mean in Proposition 37.

A recent stronger criterion for permutations


Boomerang attacks [1101] (and their variants, called sandwich attacks) are a possible
alternative to differential attacks when differentials (see Subsection 3.4.1, page 134)
having sufficiently large probability are not known. The parameter that quantifies the
contribution of an (n, n)-permutation F to the resistance to these attacks (the smaller the
parameter, the better the resistance) (see [371]) is the so-called boomerang uniformity
(see [107]):

$$\max_{(a,b) \in (\mathbb{F}_2^n \setminus \{0_n\})^2} |\{x \in \mathbb{F}_2^n;\ F(F^{-1}(x) + a) + F(F^{-1}(x + b) + a) = b\}| \qquad (3.42)$$
$$= \max_{(a,b) \in (\mathbb{F}_2^n \setminus \{0_n\})^2} |\{y \in \mathbb{F}_2^n;\ F^{-1}(F(y) + b) + F^{-1}(F(y + a) + b) = a\}|$$
$$= \max_{(a,b) \in (\mathbb{F}_2^n \setminus \{0_n\})^2} |\{(x, y) \in (\mathbb{F}_2^n)^2;\ F(x + a) + F(y + a) = b \text{ and } F(x) + F(y) = b\}|$$

(the first equality being shown by using that $F(F^{-1}(x) + a) + F(F^{-1}(x + b) + a) = b$ is
equivalent to $F^{-1}(x + b) + a = F^{-1}(F(F^{-1}(x) + a) + b)$ and setting $y = F^{-1}(x) + a$).
It is easily shown that the boomerang uniformity is affine invariant, and as we can see, it is
also invariant when changing F into $F^{-1}$, but it is not EA invariant (see [107]) and therefore
not CCZ invariant. We have that, denoting $y = F^{-1}(x) + a$ and $z = F^{-1}(x + b) + a$, the
boomerang uniformity equals $\max_{(a,b) \in (\mathbb{F}_2^n \setminus \{0_n\})^2} |\{(y, z) \in (\mathbb{F}_2^n)^2;\ F(y) + F(z) = F(y + a) + F(z + a) = b\}|$.
The necessary condition $F(y) + F(z) = F(y + a) + F(z + a)$ being
equivalent to $D_aF(y) = D_aF(z)$, if F is APN then, since this latter equality implies y = z
or y = z + a and since y = z is impossible because $b \neq 0_n$, the boomerang uniformity equals
2. But APN permutations are known only for n odd and n = 6. For general permutations,
we can see by considering the particular case z = y + a that the boomerang uniformity is
larger than or equal to the differential uniformity $\delta_F$ (see Definition 40, page 135). In [107],
it is shown that the boomerang uniformity of the multiplicative inverse (n, n)-function for n
even equals 6 if 4 divides n and 4 otherwise. Its value is characterized when n = 4 for all
differentially 4-uniform permutations (showing that it is at least 6). It is shown that if F is
differentially 4-uniform and quadratic, then its boomerang uniformity is at most 12.
Quadratic permutations whose Boomerang Connectivity Table (BCT) is optimal (in the
sense that the maximal value in the BCT equals the lowest known differential uniformity)
have been derived in [875]. Moreover, boomerang uniformities of some specific permuta-
tions (mainly the ones with low differential uniformity) as well as a characterization by
means of the Walsh transform of those functions F from $\mathbb{F}_{2^n}$ to itself with boomerang
uniformity $\delta_F$ have been considered in [762].
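To make the definition concrete, here is a minimal brute-force sketch of the boomerang uniformity, using the second expression in (3.42). It is applied to the cube function over $\mathbb{F}_{2^3}$ (an APN permutation) and to the multiplicative inverse over $\mathbb{F}_{2^4}$; the field polynomials $x^3 + x + 1$ and $x^4 + x + 1$ and all function names are assumptions of the sketch.

```python
def gf_mul(a, b, n, mod):
    """Multiply in F_{2^n} = F_2[x] / (mod), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= mod
    return r


def power_table(exp, n, mod):
    """Lookup table of x -> x^exp over F_{2^n} (with exp >= 1)."""
    table = []
    for x in range(1 << n):
        y = 1
        for _ in range(exp):
            y = gf_mul(y, x, n, mod)
        table.append(y)
    return table


def differential_uniformity(F):
    best = 0
    for a in range(1, len(F)):
        counts = [0] * len(F)
        for x in range(len(F)):
            counts[F[x] ^ F[x ^ a]] += 1
        best = max(best, max(counts))
    return best


def boomerang_uniformity(F):
    """max over nonzero a, b of |{x : F^-1(F(x)+b) + F^-1(F(x+a)+b) = a}|."""
    size = len(F)
    inv = [0] * size
    for x, y in enumerate(F):
        inv[y] = x
    best = 0
    for a in range(1, size):
        for b in range(1, size):
            count = sum(1 for x in range(size)
                        if inv[F[x] ^ b] ^ inv[F[x ^ a] ^ b] == a)
            best = max(best, count)
    return best


CUBE8 = power_table(3, 3, 0b1011)     # x^3 over F_8, an APN permutation
INV16 = power_table(14, 4, 0b10011)   # x^14 = x^(-1) over F_16 (0 -> 0)
BU_CUBE = boomerang_uniformity(CUBE8)
DU_INV = differential_uniformity(INV16)
BU_INV = boomerang_uniformity(INV16)
print(BU_CUBE, DU_INV, BU_INV)
```

The cost is of order $2^{3n}$ table lookups, which is fine for S-box sizes but not meant for large n.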

3.4.2 Other features also related to attacks


Univariate degree
The interpolation attack [639] is efficient when the degree of the univariate polynomial
representation of the S-box over $\mathbb{F}_{2^n}$ is low or when the distance of the S-box to the set
of low univariate degree functions is small. A vectorial function should then not have low-
degree univariate representation nor be approximated by such a function.

Attacks without related criteria on Boolean functions


The slide attack [89], when it can be mounted, has a complexity independent of the number
of rounds in the block cipher, contrary to the attacks previously described. It analyzes the
weaknesses of the key schedule (the most common case of weakness being when round
keys repeat in a cyclic way) to break the cipher. The slide attack is efficient when the cipher
can be decomposed into multiple rounds of an identical F function vulnerable to a known
plaintext attack.

3.5 Search for functions achieving the desired features


3.5.1 The difficulty of designing good S-boxes
Substitution boxes in block ciphers need to satisfy many criteria:
• The S-boxes for SPN networks must be bijective. The S-boxes for Feistel cryptosystems
should preferably be surjective and in fact balanced; see [957, 995].
• The S-boxes should preferably be APN or differentially 4-uniform, or at least differentially
6-uniform.
• They should preferably have high nonlinearity, say near $2^{n-1} - 2^{n/2}$.
• They should preferably not have too low an algebraic degree; degree 2 is often too small because of the
higher-order differential attack [706, 735].
• For reasons of efficiency (see page 401), in software, n should preferably be even, and n/2 too . . . that
is, n should preferably be a power of 2. In hardware, n can be any number. But general-purpose
cryptosystems must be implementable in both hardware and software. Then n = 4, 8 are
preferred (n = 4 for lightweight ciphers).
• The S-box should be easy to protect against physical (side-channel and fault injection)
attacks; see Section 12.1.1, page 431. Hence, the number of nonlinear multiplications in
$\mathbb{F}_{2^n}$ to compute the output (when the S-box is expressed over this field) should be small.
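Several of the criteria in this list are easy to evaluate by exhaustive computation for small S-boxes. The following sketch checks bijectivity, nonlinearity and algebraic degree; the lookup table is the PRESENT S-box as recalled from the cipher specification (an assumption of this sketch), and the function names are ours.

```python
# PRESENT S-box; table quoted from the cipher specification as an
# assumption of this sketch.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
N = M = 4


def dot(x, y):
    """Inner product over F_2 of the bit vectors encoded by integers x, y."""
    return bin(x & y).count("1") & 1


def is_bijective(F):
    return sorted(F) == list(range(len(F)))


def nonlinearity(F, n, m):
    """2^(n-1) - (1/2) max |W_F(u, v)| over all u and all nonzero v."""
    worst = max(
        abs(sum((-1) ** (dot(v, F[x]) ^ dot(u, x)) for x in range(1 << n)))
        for u in range(1 << n) for v in range(1, 1 << m))
    return (1 << (n - 1)) - worst // 2


def algebraic_degree(F, n, m):
    """Largest degree of the ANF of a coordinate function of F."""
    degree = 0
    for j in range(m):
        anf = [(y >> j) & 1 for y in F]
        for i in range(n):  # binary Moebius transform, one variable at a time
            for x in range(1 << n):
                if x & (1 << i):
                    anf[x] ^= anf[x ^ (1 << i)]
        degree = max([degree] + [bin(x).count("1")
                                 for x in range(1 << n) if anf[x]])
    return degree


REPORT = {
    "bijective": is_bijective(SBOX),
    "nonlinearity": nonlinearity(SBOX, N, M),
    "algebraic_degree": algebraic_degree(SBOX, N, M),
}
print(REPORT)
```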

Examples of S-boxes used in practice:


• (4, 4)-S-boxes: Serpent, PRESENT, CLEFIA, NOEKEON, LED, RECTANGLE
• (6, 4)-S-boxes: DES
• (8, 8)-S-boxes (inverse function): AES, CLEFIA, CAMELLIA
• (9, 9) and (7, 7)-S-boxes, combined (AB functions Gold $x^5$ and Kasami $x^{13} \sim x^{81}$):
MISTY, KASUMI
• (8, 32)-S-boxes: CAST

Other examples:
• Key-dependent S-boxes: CAST, Twofish
• Pseudorandomly generated (4, 4)-S-boxes: KHAZAD
• Round function based on $x^3$ in $\mathbb{F}_{2^{37}}$ or $\mathbb{F}_{2^{33}}$ according to the versions: KN
• Mixing operations from different groups: IDEA, CAST, RC6

3.5.2 Constructions versus computer investigations of Boolean and vectorial functions
We shall give in Chapters 5 through 11 constructions of Boolean and vectorial functions
satisfying the criteria we have seen in the present chapter. We shall study how these
constructions can allow obtaining functions providing good trade-offs between several
criteria. Such constructions provide in general infinite classes of functions (in any numbers
of variables ranging over some infinite sets). These functions are rather well structured,
compared with random functions satisfying the same criteria. This is a quality (it simplifies
the study of criteria) but also a drawback (the structure may be usable by attackers).
It is then also useful to search by computer investigation for functions, in numbers of
variables small enough for search to be feasible, meeting one or several criteria, and if
possible to classify as in [124] these functions under proper notions of equivalence (which
needs mathematical tools as well). Of course, such searches are also useful to guess infinite
classes and constructions. They often show that the functions built by algebraic constructions
have peculiarities.

General classification
The classification of Boolean functions dates back to the 1950s [549] and 1960s [588].
It has been carried out in [66] under affine equivalence for all Boolean functions up to five
variables (with 48 equivalence classes), for all six-variable Boolean functions in [812] (see
also Fuller’s thesis, “Analysis of affine equivalent Boolean functions for cryptography,”
Queensland University of Technology, 2003), for those seven-variable Boolean functions of
algebraic degree at most 3 modulo those of degree 1 in [613, 102], for those eight-variable
Boolean functions of algebraic degree 4 modulo those of degree 3 in [739]; and under CCZ,
EA, affine and permutation equivalences for (4, 4)-functions in [756, 183, 1009, 1158]. Note
that Burnside’s lemma [171] states that, if G is a group of permutations acting on a set X,
then the number of orbits induced on X is given by $\frac{1}{|G|} \sum_{\sigma \in G} |\{x \in X;\ \sigma(x) = x\}|$.
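As a toy illustration of Burnside's lemma (not one of the classifications cited above), the sketch below counts the classes of three-variable Boolean functions under permutations of the input variables, once with the lemma and once by explicit orbit enumeration. The group, the action and all names are choices made for this example.

```python
from itertools import permutations, product

N = 3
POINTS = list(product((0, 1), repeat=N))             # the 8 input vectors
INDEX = {x: i for i, x in enumerate(POINTS)}
GROUP = list(permutations(range(N)))                 # the 6 variable permutations
FUNCTIONS = list(product((0, 1), repeat=len(POINTS)))  # the 256 truth tables


def act(sigma, f):
    """Truth table of x -> f(x with coordinates permuted by sigma)."""
    return tuple(f[INDEX[tuple(x[sigma[i]] for i in range(N))]]
                 for x in POINTS)


def orbits_by_burnside():
    """(1 / |G|) * sum over sigma in G of the number of f fixed by sigma."""
    fixed = sum(1 for s in GROUP for f in FUNCTIONS if act(s, f) == f)
    return fixed // len(GROUP)


def orbits_by_enumeration():
    """Count orbits directly by marking every function already reached."""
    seen, count = set(), 0
    for f in FUNCTIONS:
        if f not in seen:
            count += 1
            seen.update(act(s, f) for s in GROUP)
    return count


BURNSIDE = orbits_by_burnside()
DIRECT = orbits_by_enumeration()
print(BURNSIDE, DIRECT)
```

The hand computation agrees: the identity fixes $2^8$ functions, each of the three transpositions fixes $2^6$ and each of the two 3-cycles fixes $2^4$, giving (256 + 192 + 32)/6 = 80 classes.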

More targeted computer investigations


Many papers report computer searches of specific functions (made after mathematical work).
A few first examples are [1005] for six-variable bent Boolean functions, [741] for eight-
variable bent Boolean functions, [134] for APN (5, 5)-functions (with a classification), and
[135] for APN (n, n)-functions with n = 6, 7, 8.
The survey of the recent literature shows that many results using heuristics
(providing specific instances of Boolean functions, not general ones that algebraic con-
structions can give, but allowing the creation of many different solutions satisfying certain
properties) are now obtained with evolutionary algorithms and to a lesser extent with other
methods used for diverse kinds of searches. For instance, [694] implements hill-climbing
algorithms, which are a different type of heuristic than evolutionary algorithms (even if
evolutionary algorithms can work as hill-climbing algorithms). Other examples are [1051],
which uses satisfiability (SAT) solvers, and [74], which uses similarly a satisfiability modulo
theory tool.
Usually, there is no guarantee that the solutions are not equivalent to each other, and a
hard part of the work (when it is done) is to check inequivalence.
A list of recent papers making computer investigations can be found in [952].
As far as we know, the first application of genetic algorithms (GA) to the evolution of
cryptographically suitable Boolean functions has been done in [884], where the aim was
to reach high nonlinearity. The authors worked up to 16 variables and concluded that GA
combined with hill climbing is much faster than random search.
In [945, 946], several types of evolutionary algorithms to find correlation immune
Boolean functions with minimal Hamming weight are used, and [951] is the first attempt
(as far as we are aware) to mathematically show why finding balanced Boolean functions
with high nonlinearity is hard for evolutionary algorithms.
In [947], the authors evolved secondary constructions of bent Boolean functions (i.e., of
bent functions from bent functions), the goal being to reach many dimensions; there is no
further analysis of whether such obtained constructions are valid for an infinite number of
dimensions or whether they are new, up to equivalence.
These results show that techniques with heuristics can compete with algebraic con-
structions of Boolean functions when the numbers of inputs are not too big (for larger n,
it becomes a computationally intensive process to examine a large number of functions
generated by heuristics).
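To give an idea of what such heuristic searches look like, here is a deliberately naive hill-climbing sketch that flips one truth-table bit at a time and keeps the flip whenever the nonlinearity does not decrease. It is not the method of any of the cited papers; the acceptance rule, the parameters and the names are assumptions of this sketch, and balancedness is not enforced.

```python
import random


def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of the sign vector (-1)^f(x)."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def nonlinearity(tt):
    """2^(n-1) - (1/2) max |W_f(u)| for the Boolean function with table tt."""
    return len(tt) // 2 - max(abs(v) for v in walsh_spectrum(tt)) // 2


def hill_climb(n, steps, seed):
    """Return (truth table, initial nonlinearity, final nonlinearity)."""
    rng = random.Random(seed)
    tt = [rng.randint(0, 1) for _ in range(1 << n)]
    start = best = nonlinearity(tt)
    for _ in range(steps):
        x = rng.randrange(1 << n)
        tt[x] ^= 1                 # tentative single-bit flip
        candidate = nonlinearity(tt)
        if candidate >= best:
            best = candidate       # keep the flip (plateau moves allowed)
        else:
            tt[x] ^= 1             # undo the flip
    return tt, start, best


TT, START, BEST = hill_climb(n=6, steps=300, seed=0)
print("nonlinearity before and after:", START, BEST)
```

For n = 6 the result can never exceed the covering radius bound $2^{n-1} - 2^{n/2-1} = 28$, which is reached only by bent functions.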
For vectorial Boolean functions, the situation is less positive. In [883], genetic algorithms
to evolve S-boxes with high nonlinearity and low autocorrelation value are used. The
selection of the appropriate genetic algorithm parameters is discussed. In [372], the authors
use simulated annealing and hill-climbing algorithms to evolve bijective S-boxes of sizes
up to 8 × 8 with high nonlinearity values. In [170], a heuristic method to generate MARS-
like S-boxes is used, generating a number of S-boxes of appropriate size that satisfy all
the requirements placed on the MARS S-box and even managing to find S-boxes with
improved nonlinearity values. Bent (n, m)-functions are obtained in [948] with evolutionary
computation. Picek et al. use several types of evolutionary algorithms to find differentially
6-uniform (n, n − 2)-functions but are not able to report success for any previously unknown
size [949]. In [682], functions with particular symmetries are searched.
The results for vectorial Boolean functions obtained with heuristics cannot really compete
with algebraic constructions even when considering the nonlinearity property. While
algebraic constructions reach nonlinearity of 112 for 8 × 8 S-box size, the best result
for heuristics is currently 104. Optimal values of nonlinearity and differential uniformity
have been obtained with heuristics only recently for sizes larger than 4 × 4 (see [950],
where proper cellular automata rules are found and used to construct S-boxes). The biggest
advantage of using heuristics in the design of S-boxes lies in the fact that such techniques
can account for properties, such as resistance against side-channel attacks, that algebraic
constructions cannot (see, e.g., [297]). Finally, if we consider not only cryptographic
properties of S-boxes but also their implementation cost (such as area and power), then
the heuristics could have an advantage over algebraic constructions. As an example of such
a direction, Picek et al. use evolutionary algorithms to construct S-boxes that are either area
or power efficient [953].

3.6 Boolean and vectorial functions for diffusion, secret sharing, and authentication
Designing diffusion layers for block ciphers is related to codes and to Boolean vectorial
functions. It is addressed in Subsection 4.2.3, page 161. The motivation for secret sharing is
cryptographic and that is why we cover it in this chapter, but it could have also been covered
in the next one.

3.6.1 Secret sharing, access structures, and minimal codes


In [1030], Shamir introduced a simple and elegant way to (probabilistically) split a secret
$a \in \mathbb{F}_q$ into a number n of shares so that no set of shares with cardinality (strictly) less than m
gives any information on a, where m is some positive integer smaller than or equal to n, and
at least m shares allow reconstructing (deterministically) the secret. Such a scheme is called
an (n, m) threshold secret sharing scheme. Blakley in [90] presented independently an idea
for realizing the same; we shall not describe his slightly less efficient scheme. Shamir’s
scheme associates the secret a with a polynomial $P_a(X)$ over $\mathbb{F}_q$ defined as
$P_a(X) = a + \sum_{i=1}^{m-1} u_i X^i$, where the $u_i$ denote random coefficients. Then, n ≥ m distinct nonzero
elements $\alpha_0, \ldots, \alpha_{n-1}$ are publicly chosen in $\mathbb{F}_q^*$ and the polynomial $P_a(X)$ is evaluated in
the $\alpha_i$ to construct a so-called n-sharing $(a_0, a_1, \ldots, a_{n-1})$ of a such that $a_i = P_a(\alpha_i)$ for
every $i \in [0, \ldots, n - 1]$. To reconstruct a from at least m shares $(\alpha_i, a_i)$; $i \in I$, Lagrange’s
polynomial interpolation is first applied to reconstruct
$P_a(X) = \sum_{i \in I} a_i \prod_{k \in I,\, k \neq i} \frac{X - \alpha_k}{\alpha_i - \alpha_k}$.
Then the polynomial is evaluated in 0. This allows a dealer to distribute the shares to n
players so that at least m of them are able to reconstruct the secret, while fewer have no
information on it. We have
$$a = \sum_{i \in I} a_i \cdot \beta_i, \qquad (3.43)$$
where the constants $\beta_i$ are defined as follows: $\beta_i = \prod_{k \in I,\, k \neq i} \frac{\alpha_k}{\alpha_k - \alpha_i}$. The constants can be
precomputed once for all and be public.
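The sharing and reconstruction steps, including Relation (3.43), can be sketched as follows. For simplicity the sketch works over a prime field $\mathbb{F}_p$ with p = 8191 rather than over a field of characteristic 2; this choice, the public points 1, ..., n, the use of Python's non-cryptographic `random` module and all names are assumptions of the example, which is not meant as a secure implementation.

```python
import random
from itertools import combinations

P = 2 ** 13 - 1  # 8191, a prime standing in for the field size q (assumption)


def make_shares(secret, m, n, rng):
    """Split `secret` into n shares (alpha_i, P_a(alpha_i)), threshold m."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(m - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % P
        return acc

    # Public evaluation points: distinct and nonzero.
    return [(alpha, evaluate(alpha)) for alpha in range(1, n + 1)]


def reconstruct(shares):
    """a = sum_i a_i * beta_i with beta_i = prod_{k != i} alpha_k / (alpha_k - alpha_i)."""
    secret = 0
    for i, (alpha_i, a_i) in enumerate(shares):
        beta = 1
        for k, (alpha_k, _) in enumerate(shares):
            if k != i:
                inverse = pow((alpha_k - alpha_i) % P, -1, P)
                beta = beta * alpha_k % P * inverse % P
        secret = (secret + a_i * beta) % P
    return secret


rng = random.Random(0)
SECRET = 1234
SHARES = make_shares(SECRET, m=3, n=5, rng=rng)
RECOVERED = [reconstruct(list(subset)) for subset in combinations(SHARES, 3)]
print(RECOVERED)
```

Any three of the five shares give back the secret. The modular inverse `pow(x, -1, P)` requires Python 3.8 or later.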
Shamir’s scheme is related to a problem in distributed storage systems, the exact repair
problem, described in [583]: a file (cut into blocks) to be stored is interpreted as a degree d
polynomial F over a field F, each block being a coefficient of the polynomial; to distribute
the file over n nodes, n elements α1 , . . . , αn of F are chosen, and F (αi ) is sent to node
i; if a node fails, we may recover it by polynomial interpolation from the information on
any m other nodes. It is possible to organize the distribution by breaking symbols into
subsymbols belonging to subfields, so that to repair a failed node, one needs only a part of the
information from other nodes (more than m nodes, but with a smaller amount of information
needed globally). A lower bound is given in [583] on the amount of information needed
(the repair bandwidth), and empirical constructions are proposed that make it possible to
approach it.
Secret sharing schemes play also a central role in multiparty computation protocols, first
introduced in [1136], in which n participants (also called players) are supposed to compute
the image of a given function by making computations on the shares of the input provided
by a secret sharing scheme, each player having one share. Such protocol is supposed to
enable the coalition of players to securely evaluate the function, while some of the players
are corrupted by an adversary. The protocol is called t-private if any t players cannot get
from the protocol execution more information than their own shares; this is possible for
any function when the number of players is at least 2t + 1. This happens to be closely
related to the problematics of masking functions (S-boxes) and of probing security that
we shall see in Chapter 12, and is also connected with threshold implementation (see the
same chapter).
Shamir’s scheme is a linear secret sharing scheme in the sense that the set
$\{(a, a_0, a_1, \dots, a_{n-1}) \in \mathbb{F}_q^{n+1}\}$ of those vectors of all possible $a \in \mathbb{F}_q$ concatenated with
all possible sharings of a is a vector subspace of $\mathbb{F}_q^{n+1}$ (a linear code) and a is a linear
function of the vector of its shares. As observed by Massey²⁶ in [827], given any linear
$[n+1, k, d]_q$-code with (in the framework of the present book) $q = 2^n$ (and assuming in
practice that $d \geq 2$ and that the corresponding dual code has a minimum distance $d^\perp \geq 2$,
even if this is not specified by Massey), one can define a (linear) n-sharing over $\mathbb{F}_q$. Indeed,
let G denote a generator matrix of the code; we assume that its first column is nonzero (this

26 We shall borrow much from his paper in the present subsection.



can be ensured by permuting the codeword coordinates if necessary, thanks to the fact that
d ⊥ ≥ 2). Then the sharing (a0 , a1 , . . . , an−1 ) of a is built from a k-tuple (r1 , . . . , rk ) such
that a equals the (usual) inner product between (r1 , . . . , rk ) and the first column of G, and
chosen with a uniform probability under this constraint, and the sharing (a0 , a1 , . . . , an−1 )
is defined by (a, a0 , a1 , . . . , an−1 ) = (r1 , . . . , rk ) × G. For simplicity, up to a permutation
of codeword coordinates again, and to a change of generator matrix, we can assume that
the first column of G equals the first vector (1, 0, . . . , 0) of the canonical basis of $\mathbb{F}_q^k$ (we
can even assume that G is in systematic form G = [Ik | M], where Ik is the k-dimensional
identity matrix over Fq ), and we have then r1 = a (and the other ri are random). The
reconstruction of a from its sharing (a0 , . . . , an−1 ) is obtained by choosing a row of a parity
check matrix whose first coordinate is nonzero (which exists because we have d ≥ 2;
otherwise the vector (1, 0, . . . , 0) would belong to the code; note that it is not the only
nonzero one since d ⊥ ≥ 2) and writing that the (usual) inner product between this row and
(a, a0 , a1 , . . . , an−1 ) equals 0. The next proposition is from [991].

Proposition 48 Given a linear [n + 1, k, d]q -code used for secret sharing as described
above, with d, d ⊥ ≥ 2, the knowledge of any d ⊥ − 2 shares gives no information on the
secret, and n − d + 2 shares allow reconstructing the secret.

Indeed, those vectors of length $d^\perp - 1$ whose first term equals a generic secret and the
other ones are the shares of this secret at $d^\perp - 2$ positions cover uniformly the whole vector
space $\mathbb{F}_q^{d^\perp - 1}$, by definition of the dual distance (see the observations and footnote after
Definition 4, page 16), and the code of length n − d + 1 built the same way with n − d + 2
shares instead of d ⊥ − 2 has straightforwardly minimum distance at least 2 and has dual
distance at least 2 as well (since the dual distance of this punctured code is larger than or
equal to the dual distance of the original code; see Lemma 2, page 17).
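The code-based construction described above can be sketched on a toy example. The following Python fragment is an illustration only: it uses the binary [7, 4, 3] Hamming code (so q = 2 rather than q = 2^n), whose dual has minimum distance 4, and the names M, encode, share, H, and reconstruct are hypothetical.

```python
import random

# G = [I_4 | M] generates the binary [7,4,3] Hamming code; its dual is the
# [7,3,4] simplex code, so d = 3 and the dual distance is 4 (toy parameters).
M = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
K, N = 4, 7

def encode(r):
    """Codeword r x G over F_2, with G in systematic form [I_4 | M]."""
    parity = [sum(r[i] * M[i][j] for i in range(K)) % 2 for j in range(N - K)]
    return list(r) + parity

def share(secret, rng=random):
    """r_1 = secret, the other r_i are random; the shares are coordinates 1..6."""
    r = [secret] + [rng.randrange(2) for _ in range(K - 1)]
    return encode(r)[1:]

# Rows of the parity-check matrix H = [M^T | I_3].
H = [[M[i][j] for i in range(K)] + [int(j == t) for t in range(N - K)]
     for j in range(N - K)]

def reconstruct(shares):
    """Use a parity-check row h with h[0] = 1: h . (a, shares) = 0 gives a."""
    h = next(row for row in H if row[0] == 1)
    return sum(h[i + 1] * shares[i] for i in range(N - 1)) % 2

print(reconstruct(share(1)), reconstruct(share(0)))
```

In this toy instance the chosen parity-check row has weight 4, so three specific shares already determine the secret, while Proposition 48 only guarantees that any n − d + 2 = 5 shares do and that any d⊥ − 2 = 2 shares reveal nothing.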
Some of such known secret sharing schemes use Boolean or vectorial functions, as
initiated in [269] and developed in other papers; see [456, 867, 874] and the references
therein. And as already observed in 1996 in [461, page 148 and figure 7.1], correlation
immune and resilient Boolean functions (see Definition 21, page 86) being related to the
dual distance of codes by Corollary 6, page 88, they can, in accordance with Proposition 48,
be employed for secret sharing.
But the determination of the so-called qualified coalitions of players, which are able
to (uniquely) reconstruct the secret, is more difficult to do in general than for Shamir’s
construction (which is equivalent to using as code a possibly punctured Reed–Solomon code,
and is simple because such code being MDS, any set of k positions is an information set [see
pages 9 and 161], and any k positions of a codeword determine then the full codeword
uniquely). As also observed in [461] (see theorem 2.5), MDS codes lead then to so-called
threshold secret sharing schemes (in which qualified sets are exactly those of sufficient
sizes), and conversely. For general codes, the set of all qualified coalitions satisfies the
monotone property. An important notion is then the access structure of a secret sharing
scheme, that is, the class of minimal qualified coalitions (for which, if any share is removed,
the remaining shares give no information about the secret).

Let us recall how the access structure of a code can be determined.²⁷ Recall that we say
that a vector u over a finite field $\mathbb{F}_q$ covers a vector v, and we write $v \preceq u$, if supp(v) ⊆
supp(u). A nonzero codeword u of a code C is called a minimal codeword of C if it covers no
codeword of C different from au, with a ∈ Fq (i.e., no Fq -linearly independent codeword)
[103, 104]. Minimum weight codewords are minimal, but the converse is in general not
true, except for MDS codes, in which the minimal codewords are the codewords of weight
n − k + 1.
As observed by Massey, no two Fq -linearly independent minimal codewords of a linear
code can have the same support, since otherwise any linear combination would be a
codeword that both of the former codewords would cover. This means that each support of a
minimal codeword corresponds uniquely to this minimal codeword, up to linear dependency.
In fact, as shown in [33], a set I of indices is the support of a minimal codeword if and only
if a parity check matrix H restricted to the columns indexed by I has rank |I | − 1. The fact
that it has rank less than |I | is equivalent to the existence of a codeword whose support is
included in I, and the fact that it has rank |I| − 1, say that the columns indexed in I′ ⊂ I with
|I′| = |I| − 1 are linearly independent, is then equivalent to the fact that this codeword has
support I and is minimal, since otherwise we could find by linear combination a codeword
whose support would be I′, a contradiction. This is a condition on I that does not require
knowing the minimal codeword of support I. This also proves that any minimal codeword has
Hamming weight at most n − k + 1.
Every codeword is a linear combination of minimal codewords, since if a codeword u is
not minimal, it covers a minimal codeword v, and there exists a linear combination u + cv,
c ∈ Fq , which has Hamming weight strictly smaller than wH (u); the process can continue
(with u + cv) a finite number of steps and, when it ends, it provides a linear decomposition
of u over minimal codewords it covers. Hence, for every nonzero position in a codeword,
this codeword covers a minimal codeword that is nonzero at this position.
As shown by Massey [827] (and recalled in [33]), the access structure of the secret sharing
scheme corresponding to a linear code is specified by those minimal codewords in the dual
code, whose first component is nonzero (the set of shares corresponding then to the locations
where this minimal codeword is nonzero, except the first). Indeed, as we saw, the secret is
a linear combination of the shares and the vector of the coefficients of the resulting null
linear combination belongs to the dual. Note that this property also proves that codewords
of Hamming weight at most 2d − 1 in a binary [n, k, d] code are minimal. More generally,
it is shown in [33] for every $[n, k, d]_q$ code that the codewords of Hamming weight at most
$\frac{qd}{q-1} - 1$ are all minimal, since given such codeword u, and supposing the existence of
a nonzero codeword $v \preceq u$ linearly independent of u, we have
$$\sum_{c \in \mathbb{F}_q^*} w_H(u - cv) = (q-1)\,w_H(u) - w_H(v) \leq (q-1)\,w_H(u) - d$$
and the average of $w_H(u - cv)$ when $c \in \mathbb{F}_q^*$
is then at most $w_H(u) - \frac{d}{q-1} \leq \frac{qd}{q-1} - 1 - \frac{d}{q-1} = d - 1$; one of these codewords has then
Hamming weight at most this value, a contradiction since none can be the zero codeword by
hypothesis.
The minimal codewords have been determined in [33] for the (not necessarily binary)
Hamming codes and for the binary Reed–Muller codes of order at most 2 (all nonzero

27 In practice, this is often a very hard task.



codewords in RM(1, n) are minimal except the all-1 codeword, and all codewords in
RM(2, n) are minimal except those of Hamming weight $2^{n-1} + 2^{n-1-h}$ for h = 0, 1, 2
and for some of those of Hamming weight $2^{n-1}$; the proof, too technical to be included
here, is based on the facts that $2d = 2^{n-1}$ and that any nonminimal codeword in a binary
code equals the sum of two codewords with disjoint supports).
The codes whose nonzero codewords are all minimal are particularly interesting (this
makes the code easily decodable and simplifies the access structures of the secret sharing
scheme; it plays also a role in multiparty computation; see e.g., [339]). A code having such
property is called a minimal code. We have the following proposition:

Proposition 49 [33] Let C be a linear code over $\mathbb{F}_q$. If $\frac{w_{\min}}{w_{\max}} > \frac{q-1}{q}$, where $w_{\min}$ and $w_{\max}$
denote respectively the minimum and maximum nonzero weights in C, then C is minimal.

We do not give the proof. The hypothesis of Proposition 49 seems very strong as soon
as q is large, but many examples of codes satisfying it exist and no example of a nonbinary
minimal code not satisfying it is known in characteristic 2. Infinite families of minimal
binary linear codes (related to Boolean functions) not meeting this condition have been
recently found in [345, 456].
A recent necessary and sufficient condition for linear codes to be minimal is:

Proposition 50 [456, 602] A linear code C over $\mathbb{F}_q$ is minimal if and only if, for each pair
of $\mathbb{F}_q$-linearly independent codewords u and v in C, we have

$$\sum_{c \in \mathbb{F}_q^*} w_H(u + cv) \neq (q-1)\,w_H(u) - w_H(v). \qquad (3.44)$$

Hence, the minimality of C is completely determined by the weights of its codewords, and
it is more easily handled if the number of these weights is small. Minimal codes derived
from finite geometry (hyperovals; see page 219) are given in [862] in relation with bent
vectorial functions.
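For binary codes, minimality can be checked by brute force in two ways: directly from the covering definition, or through the weight criterion, which for q = 2 amounts to requiring $w_H(u + v) \neq w_H(u) - w_H(v)$ for all distinct nonzero codewords u, v. The Python sketch below is only an illustration on two toy codes chosen here (the [7, 3] simplex code and the [7, 4] Hamming code); the function names are hypothetical.

```python
from itertools import product

def span(gen_rows):
    """All codewords of the binary code generated by gen_rows (tuples of bits)."""
    n = len(gen_rows[0])
    return {tuple(sum(c * row[i] for c, row in zip(coeffs, gen_rows)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(gen_rows))}

def covers(u, v):
    """True if supp(v) is included in supp(u)."""
    return all(ui or not vi for ui, vi in zip(u, v))

def is_minimal_by_definition(code):
    nonzero = [w for w in code if any(w)]
    return not any(u != v and covers(u, v) for u in nonzero for v in nonzero)

def is_minimal_by_weights(code):
    """Binary weight criterion: w(u + v) != w(u) - w(v) for distinct nonzero u, v."""
    nonzero = [w for w in code if any(w)]
    for u in nonzero:
        for v in nonzero:
            if u != v:
                s = [a ^ b for a, b in zip(u, v)]
                if sum(s) == sum(u) - sum(v):
                    return False
    return True

simplex = span([(1, 0, 0, 1, 1, 0, 1), (0, 1, 0, 1, 0, 1, 1), (0, 0, 1, 0, 1, 1, 1)])
hamming = span([(1, 0, 0, 0, 1, 1, 0), (0, 1, 0, 0, 1, 0, 1),
                (0, 0, 1, 0, 0, 1, 1), (0, 0, 0, 1, 1, 1, 1)])
print(is_minimal_by_definition(simplex), is_minimal_by_weights(simplex))
print(is_minimal_by_definition(hamming), is_minimal_by_weights(hamming))
```

The simplex code has a single nonzero weight and is minimal; the Hamming code contains the all-1 word, which covers every other codeword, so it is not.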
A code is called a two-weight code if its nonzero elements have two possible weights
only. Examples related to Boolean functions will be seen with Proposition 68, page 195. It
is shown in [602] that if C is a two-weight linear code with length N and weights $w_1$ and $w_2$,
such that $0 < w_1 < w_2 < N$ and $j w_1 \neq (j - 1) w_2$ for every integer j such that 2 ≤ j ≤ q,
then C is minimal.
Binary three weight minimal codes are also investigated in [456].

3.6.2 Authentication schemes


The framework is as follows: Alice wishes to transmit to Bob a message m (a vector over
the field F2n ) in the form (m, t), where t is a tag corresponding to m and depending on a
secret key k shared between Alice and Bob, in order that Bob can verify the validity of the
signature and nobody other than Alice and Bob can forge a valid message.
A systematic authentication scheme is a tuple (M, T , K, {Ek : k ∈ K}), where Ek :
M → T is the encoding rule related to k. To transmit an information m ∈ M to Bob,

Alice calculates t = Ek (m) and sends the tuple (m, t) to Bob over the public channel. Bob
verifies that the relation t = Ek (m) is satisfied. There exist two kinds of attacks: the attacker
can try to forge (m, t) from scratch, hoping that it is accepted by Bob – this is called the
“impersonation attack” – or he can observe a valid tuple (m, t) and try to modify it – this
is called the “substitution attack.” The maximal success probabilities of these attacks are
denoted by $P_I$ and $P_S$, respectively:
$$P_I = \max_{m \in M,\ t \in T} \frac{|\{k \in K;\ E_k(m) = t\}|}{|K|};$$
$$P_S = \max_{m \neq m' \in M,\ t, t' \in T} \frac{|\{k \in K;\ E_k(m) = t \text{ and } E_k(m') = t'\}|}{|\{k \in K;\ E_k(m) = t\}|}.$$

An example
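Let $\mathbb{F}_{2^h}$ be a subfield of $\mathbb{F}_{2^n}$ and F a vectorial function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$. We define the
following scheme from [268]: $M = \mathbb{F}_{2^n} \times \mathbb{F}_{2^n}$, $T = \mathbb{F}_{2^h}$, $K = \mathbb{F}_{2^n} \times \mathbb{F}_{2^h}$, $E = \{E_k : k \in K\}$,
where for every $k = (k_1, k_2) \in K$ and $m = (a, b) \in M$, we have $E_k(m) = tr_h^n(aF(k_1) + bk_1) + k_2$,
where $tr_h^n$ is the trace function from $\mathbb{F}_{2^n}$ into $\mathbb{F}_{2^h}$: $tr_h^n(a) = \sum_{j=0}^{n/h - 1} a^{2^{hj}}$.
Function $k \mapsto E_k$ is a bijection from K to E and
$$P_I = \frac{1}{2^h}, \qquad P_S \leq \frac{1}{2^h} + \left(1 - \frac{1}{2^h}\right)\left(1 - \frac{nl(F)}{2^{n-1}}\right),$$
where nl(F) denotes the nonlinearity of F and $|M| = 2^{2n}$, $|T| = 2^h$, $|K| = |E| = 2^{n+h}$.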
Other examples can be found, for instance, in [268, 303, 460].
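The two success probabilities defined above can be computed exhaustively for a very small scheme. The Python sketch below is not the trace-based scheme of [268]: it uses a hypothetical toy rule $E_k(m) = k_1 m + k_2 \bmod p$ with p = 5, chosen only so that the definitions of $P_I$ and $P_S$ can be evaluated by brute force.

```python
from fractions import Fraction
from itertools import product

def attack_probabilities(messages, tags, keys, E):
    """Brute-force P_I and P_S from their definitions."""
    p_i = max(Fraction(sum(E(k, m) == t for k in keys), len(keys))
              for m in messages for t in tags)
    p_s = Fraction(0)
    for m, t in product(messages, tags):
        compatible = [k for k in keys if E(k, m) == t]   # keys consistent with (m, t)
        if not compatible:
            continue
        for m2, t2 in product(messages, tags):
            if m2 != m:
                hits = sum(E(k, m2) == t2 for k in compatible)
                p_s = max(p_s, Fraction(hits, len(compatible)))
    return p_i, p_s

p = 5                                        # toy parameter (assumption)
messages, tags = range(p), range(p)
keys = list(product(range(p), repeat=2))     # k = (k1, k2)
E = lambda k, m: (k[0] * m + k[1]) % p       # hypothetical encoding rule

print(attack_probabilities(messages, tags, keys, E))
```

For this toy rule both probabilities equal 1/p, since every pair (m, t) is compatible with exactly p keys and two distinct observations determine the key.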
4 Boolean functions, vectorial functions, and error-correcting codes

Nota bene: Symbol n being traditionally used to denote the number of variables of Boolean
functions, what was denoted by m in Section 1.2 is changed in this chapter into n. The codes
have then length $2^n$, and not n as is usual in coding.

4.1 Reed–Muller codes


Reed–Muller codes were introduced by David Muller in [891], and their decoding
algorithm was given by Irving Reed in [989]. They have played an important role
in the history of error-correcting codes. For instance, they were used in the 1960s and
early 1970s for the transmission of the first photographs of Mars by the Mariner series of
spacecraft. A Reed–Muller code of length 32, dimension 6, and minimum distance 16 was
used (precisely, the first-order Reed–Muller code RM(1, 5)). Each codeword corresponded
to a level of darkness; this made 64 different levels, and up to $\lfloor (16-1)/2 \rfloor = 7$ errors could be
corrected in the transmission of each codeword. Reed–Muller codes were also used in the
third generation (3G) of mobile phones (developed in the late 1990s for release in the early
2000s), in the so-called Transport Format Combination Indicator (TFCI, part of the initial
“handshake” between the mobile device and the base station, designed to inform the receiver
of what type of communication will come next), for which it is extremely important to get
information correct. The same code as for Mariner spacecrafts was first used and it was later
replaced by a punctured subcode of the second-order Reed–Muller code RM(2, 5), which
had dimension 10 and minimum distance 12.
Reed–Muller codes still play an important role thanks to their specific properties (see,
e.g., [1, 457]) and their roles with respect to new problematics (such as Locally Correctable
Codes [778]), despite the fact that their parameters are not good,1 except for the lowest
and largest orders. They also constitute a useful framework for the study of Boolean
functions.

Definition 42 For every nonnegative integer r and every positive integer n ≥ r, the
Reed–Muller code RM(r, n) of order r and length $2^n$ is the binary linear code of all words of
length $2^n$ corresponding to the evaluations over $\mathbb{F}_2^n$ (on which some order has been chosen)
of all n-variable Boolean functions of algebraic degree at most r.

1 In the late 1970s, for transmitting color photographs of Mars, the Voyager spacecrafts used the extended
binary Golay code and later Reed–Solomon codes.


In other words, codewords are the last columns in the truth tables of these functions.
By abuse of language, we shall say that RM(r, n) is the F2 -vector space of all n-variable
Boolean functions of algebraic degree at most r.
For r = 0, RM(0, n) equals the pair of constant functions.
For r = 1, RM(1, n) equals the vector space of all affine functions. Note that we have
seen in Section 1.2 that the codewords of the simplex code are the lists of values taken on
$\mathbb{F}_2^n \setminus \{0_n\}$ by all linear functions. Hence RM(1, n) is the $\mathbb{F}_2$-vector space generated by the
extended simplex code and the constant function 1.
For r = n, RM(n, n) equals the whole space of n-variable Boolean functions, since every
n-variable Boolean function has an ANF and then algebraic degree at most n.
The dimension of RM(r, n) equals $1 + n + \binom{n}{2} + \cdots + \binom{n}{r}$ since this is the number of
monomials of degrees at most r, which constitute a basis of RM(r, n).

Remark. Let $G = \mathbb{F}_2[\mathbb{F}_2^n]$ be the so-called group algebra of $\mathbb{F}_2^n$ over $\mathbb{F}_2$, consisting of the
formal sums $\sum_{g \in \mathbb{F}_2^n} a_g X^g$, where $a_g \in \mathbb{F}_2$. The algebra G has only one maximal ideal, called
its radical:
$$R = \Big\{ \sum_{g \in \mathbb{F}_2^n} x_g X^g;\ \sum_{g \in \mathbb{F}_2^n} x_g = 0 \Big\},$$
whose elements correspond to the words of even Hamming weight. The ideals $R^j$, j ≥ 1,
generated by the products of j elements of R, provide the decreasing sequence
$$G \supset R \supset \cdots \supset R^n = \{0_{2^n}, 1_{2^n}\},$$
with $R^i R^j = R^{i+j}$. Berman [67] observed that, for any r, $RM(r, n) = R^{n-r}$.

RM(r, n), being a linear code, can be described by a generator matrix G. For instance, a
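generator matrix of the Reed–Muller code RM(1, 4) can be as follows:
$$G = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 1
\end{bmatrix}.$$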
The first row corresponds to the monomial of degree 0 (the constant function 1) and the
other rows correspond to the monomials of degree 1 (the coordinate functions x1 , . . . , x4 ),
when ordering the words of length 4 by increasing Hamming weights (we could choose
other orderings; we have seen that this would lead to so-called equivalent codes, as shown
at page 8).
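Such a generator matrix can be produced mechanically. The following Python sketch (function names are hypothetical) lists the points of $\mathbb{F}_2^n$ by increasing Hamming weight, with supports taken in lexicographic order inside each weight class, which is an assumption that happens to reproduce the column order of the matrix above, and evaluates every monomial of degree at most r on them.

```python
from itertools import combinations
from math import comb

def points_by_weight(n):
    """Vectors of F_2^n ordered by increasing Hamming weight (supports in lex order)."""
    return [tuple(1 if i in supp else 0 for i in range(n))
            for w in range(n + 1) for supp in combinations(range(n), w)]

def rm_generator_matrix(r, n):
    """One row per monomial of degree <= r, evaluated on all points of F_2^n."""
    pts = points_by_weight(n)
    return [[int(all(x[i] for i in mono)) for x in pts]
            for deg in range(r + 1) for mono in combinations(range(n), deg)]

def rm_dimension(r, n):
    """1 + n + C(n,2) + ... + C(n,r)."""
    return sum(comb(n, i) for i in range(r + 1))

for row in rm_generator_matrix(1, 4):
    print(*row)
print(rm_dimension(2, 5), len(rm_generator_matrix(2, 5)))
```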

4.1.1 Minimum distance and minimum weight codewords


Theorem 7 The minimum distance of RM(r, n) equals $2^{n-r}$.

This was historically proved by double induction over r and n (see [809, page 375]), but
there exists a simpler proof.

Proof Code RM(r, n) being linear, its minimum distance d equals the minimum nonzero
Hamming weight of codewords. Let us first prove that $d \geq 2^{n-r}$. Since $2^{n-r}$ is a decreasing
function of r, it is sufficient to show the bound for functions of algebraic degree r.
Let $\prod_{i \in I} x_i$ be a monomial of degree r in the ANF of a Boolean function f of algebraic
degree r; consider the $2^{n-r}$ restrictions of f obtained by keeping fixed the n − r coordinates
of x whose indices lie outside I. Each of these restrictions, viewed as a function on $\mathbb{F}_2^r$, has
an ANF of degree r because, when fixing these n − r coordinates, the monomial $\prod_{i \in I} x_i$
is unchanged and all the monomials different from $\prod_{i \in I} x_i$ in the ANF of f give either
0 or monomials of degrees strictly less than r. Thus, any such restriction has an odd (and
hence a nonzero) Hamming weight (see Section 2.2). The Hamming weight of f being
equal to the sum of the Hamming weights of its restrictions, f has Hamming weight
at least $2^{n-r}$.
To complete the proof, we just need to exhibit a codeword of Hamming weight $2^{n-r}$. The
simplest example is the Boolean function $f(x) = \prod_{i=1}^{r} x_i$, that is, the indicator of the affine
space $\{(1, \dots, 1)\} \times \mathbb{F}_2^{n-r}$.
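Theorem 7 can be checked exhaustively for small parameters. The Python sketch below (hypothetical function names) represents each codeword as a $2^n$-bit integer truth table, spans the monomials of degree at most r, and returns the minimum nonzero weight; it is only practical for small n.

```python
from itertools import combinations

def rm_basis(r, n):
    """Truth tables (as 2^n-bit integers) of the monomials of degree <= r."""
    basis = []
    for deg in range(r + 1):
        for mono in combinations(range(n), deg):
            mask = sum(1 << i for i in mono)
            tt = 0
            for x in range(1 << n):
                if x & mask == mask:      # the monomial equals 1 at point x
                    tt |= 1 << x
            basis.append(tt)
    return basis

def min_distance(r, n):
    """Minimum nonzero Hamming weight of RM(r, n), by exhaustive enumeration."""
    basis = rm_basis(r, n)
    best = 1 << n
    for c in range(1, 1 << len(basis)):   # every nonzero coefficient vector
        word = 0
        for j, b in enumerate(basis):
            if (c >> j) & 1:
                word ^= b
        best = min(best, bin(word).count("1"))
    return best

for r in range(5):
    print(r, min_distance(r, 4), 2 ** (4 - r))
```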

Remark.

1. The proof of Theorem 7 shows in fact that, if a monomial $\prod_{i \in I} x_i$ has coefficient
1 in the ANF of f, and if every other monomial $\prod_{i \in J} x_i$ such that $I \subset J$ has
coefficient 0 (i.e., if I is maximal), then the function has Hamming weight at least
$2^{n-|I|}$. Applying this observation to the Möbius transform $f^\circ$ of f – whose definition
has been given after Relation (2.3), page 32 – shows that, if there exists a vector $x \in \mathbb{F}_2^n$
such that f(x) = 1 and f(y) = 0 for every vector $y \neq x$ whose support contains
supp(x) (i.e., if x is maximal in the support of f), then the ANF of f has at least
$2^{n - w_H(x)}$ terms; this has been first observed in [1179]. Indeed, the Möbius transform of
$f^\circ$ is f.
2. The d-dimensional subspace $E = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \notin I\}$, in the proof of
Theorem 7, is a maximal odd weighting subspace: the restriction of f to E has odd
Hamming weight (i.e., has algebraic degree equal to the dimension d when viewed as a
d-variable function), and the restriction of f to any of its proper superspaces has even
Hamming weight (i.e., the restriction of f to any coset of E has odd Hamming weight).
Similarly as above, it can be proved, as in [1179], that any Boolean function admitting a
d-dimensional maximal odd weighting subspace E has Hamming weight at least $2^{n-d}$,
and if d ≥ 2, applying this observation to $f \oplus \ell$, where $\ell$ is affine, we have that f has
nonlinearity at least $2^{n-d}$. Indeed, the restriction of f to a d-dimensional affine space has
algebraic degree d if and only if the restriction of $f \oplus \ell$ does. See more in [191], where
the proofs are given in terms of group rings/algebras (see page 152; see [283, 323] for
other examples where these are used).

Notice that all nonconstant affine functions have Hamming weight $2^{n-1}$, their supports
being affine hyperplanes. Thus, nonconstant affine functions are the codewords of minimum
Hamming weight in RM(1, n). More generally, the codewords of minimum Hamming
weight in RM(r, n) have been characterized (see, e.g., [809]). We give below another proof,
more Boolean function oriented, of this characterization.

Theorem 8 The Boolean functions of algebraic degree r and of Hamming weight $2^{n-r}$
are the indicators of (n − r)-dimensional flats (i.e., the functions whose supports are
(n − r)-dimensional affine subspaces of $\mathbb{F}_2^n$).

Proof The indicators of (n − r)-dimensional flats have clearly Hamming weight $2^{n-r}$ and
they have algebraic degree r, since they are affinely equivalent to the function $\prod_{i=1}^{r} x_i$,
because two affine subspaces of $\mathbb{F}_2^n$ of the same dimension are affinely equivalent (and recall
from page 36 that the algebraic degree is an affine invariant). Conversely, let f be a function
of algebraic degree r and of Hamming weight $2^{n-r}$. Let $\prod_{i \in I} x_i$ be a monomial of degree
r in the ANF of f and let $I^c = \{1, \dots, n\} \setminus I$. For every vector $\alpha \in \mathbb{F}_2^{I^c}$, let us denote
by $f_\alpha$ the restriction of f to the flat $\{x \in \mathbb{F}_2^n;\ \forall j \in I^c,\ x_j = \alpha_j\}$, viewed as a function
over $\mathbb{F}_2^I$. According to the proof of Theorem 7, and since f has Hamming weight $2^{n-r}$,
each function $f_\alpha$ is the indicator $\delta_{a_\alpha}$ of a singleton $\{a_\alpha\}$ of $\mathbb{F}_2^I$. Moreover, the mapping
$\alpha \in \mathbb{F}_2^{I^c} \mapsto a_\alpha \in \mathbb{F}_2^I$ is affine, i.e., for every $\alpha, \beta, \gamma \in \mathbb{F}_2^{I^c}$, we have $a_{\alpha+\beta+\gamma} = a_\alpha + a_\beta + a_\gamma$.
Indeed, the r-variable function $f_\alpha \oplus f_\beta \oplus f_\gamma \oplus f_{\alpha+\beta+\gamma}$ being the restriction to $\mathbb{F}_2^I$ of
$D_{\alpha+\beta} D_{\alpha+\gamma} f(x + \alpha)$, where $D_{\alpha+\beta} D_{\alpha+\gamma} f$ is a second-order derivative of f, it has algebraic
degree at most r − 2, according to Corollary 1, page 39. And it is easily seen by using that
$\delta_a(x) = \prod_{i=1}^{n} (x_i \oplus a_i \oplus 1)$ or by using Relation (2.3), page 32, that, for every $i \in \{1, \dots, n\}$,
the coefficient of the degree r − 1 monomial $\prod_{j \neq i} x_j$ in $(f_\alpha \oplus f_\beta \oplus f_\gamma \oplus f_{\alpha+\beta+\gamma})(x)$ (which
is null) equals the ith coordinate of $a_\alpha + a_\beta + a_\gamma + a_{\alpha+\beta+\gamma}$. This completes the proof since,
denoting by $x_I$ (resp. $x_{I^c}$) the restriction of x to I (resp. to $I^c$), the support of f equals the set
$\{x \in \mathbb{F}_2^n;\ x_I = a_{x_{I^c}}\}$ and the equality $x_I = a_{x_{I^c}}$ is equivalent to r linearly independent
linear equations.

See more in [34], from a design viewpoint. The minimum weight codewords of RM(r, n)
generate the code over F2 ; see [420].

4.1.2 Dual
The dual of a Reed–Muller code is a Reed–Muller code:

Theorem 9 For every positive n and every nonnegative r < n, the dual

$$RM(r, n)^\perp = \Big\{ f \in \mathcal{BF}_n;\ \forall g \in RM(r, n),\ f \cdot g = \sum_{x \in \mathbb{F}_2^n} f(x)\, g(x) = 0 \Big\}$$

equals RM(n − r − 1, n).

Proof We have seen in Section 2.2 that the n-variable Boolean functions of even Hamming
weights are the elements of RM(n − 1, n) (which equals then the parity code of length $2^n$).
Thus, $RM(r, n)^\perp$ is the set of those functions f such that, for every function g of algebraic
degree at most r, the product function fg (whose value at any $x \in \mathbb{F}_2^n$ equals f(x)g(x)) has
algebraic degree at most n − 1. This is clearly equivalent to the fact that f has algebraic
degree at most n − r − 1.
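The duality can be verified numerically for small n. In the Python sketch below (hypothetical names), orthogonality is checked on the monomial bases: the product of two monomial truth tables is their bitwise AND, and the inner product is its weight modulo 2. Together with the fact that the two dimensions add up to $2^n$, this gives the equality of Theorem 9.

```python
from itertools import combinations

def monomial_truth_tables(max_deg, n):
    """Truth tables (2^n-bit integers) of all monomials of degree <= max_deg."""
    tables = []
    for deg in range(max_deg + 1):
        for mono in combinations(range(n), deg):
            mask = sum(1 << i for i in mono)
            tables.append(sum(1 << x for x in range(1 << n) if x & mask == mask))
    return tables

def check_duality(r, n):
    """True if RM(n-r-1, n) is orthogonal to RM(r, n) and dimensions sum to 2^n."""
    a = monomial_truth_tables(r, n)
    b = monomial_truth_tables(n - r - 1, n)
    orthogonal = all(bin(f & g).count("1") % 2 == 0 for f in a for g in b)
    return orthogonal and len(a) + len(b) == 2 ** n

print(all(check_duality(r, n) for n in range(1, 7) for r in range(n)))
```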

Note that, since RM(1, n) is the F2 -vector space generated by the extended simplex code
and the constant function 1, its dual RM(n − 2, n) is the intersection of the dual of the
extended simplex code and the parity code. It also equals the extended Hamming code,
according to Lemma 1, page 9, applied to RM(1, n).

Characterization in the field F2n


If the vector space $\mathbb{F}_2^n$ is identified with the field $\mathbb{F}_{2^n}$, then the family of those functions
$tr_n(ax^j)$ such that $a \in \mathbb{F}_{2^n} \setminus \{0\}$ and $w_2(j) \leq n - r - 1$ generates RM(n − r − 1, n)
(according to what we have seen on the trace representation of Boolean functions). We have
then that a Boolean function f belongs to RM(r, n) if and only if, for every nonzero j such
that $w_2(j) \leq n - r - 1$, we have $\sum_{x \in \mathbb{F}_{2^n}} f(x)\, tr_n(ax^j) = tr_n\big(a \sum_{x \in \mathbb{F}_{2^n}} f(x)\, x^j\big) = 0$ for
every $a \in \mathbb{F}_{2^n}$, as expressed in the following corollary:

Corollary 8 For every positive n and every nonnegative r < n, a Boolean function f over
$\mathbb{F}_{2^n}$ belongs to RM(r, n) if and only if, for every nonzero j such that $w_2(j) \leq n - r - 1$, we
have $\sum_{x \in \mathbb{F}_{2^n}} f(x)\, x^j = 0$.

4.1.3 Automorphism group


The Reed–Muller codes are invariant under the action of the general affine group (i.e., the
group of affine permutations over $\mathbb{F}_2^n$). More precisely, it is a simple matter to show the
following:

Proposition 51 For any 1 ≤ r ≤ n − 1, the automorphism group of RM(r, n) (that is, the
group of all permutations σ of $\mathbb{F}_2^n$ such that f ∘ σ ∈ RM(r, n) for every f ∈ RM(r, n))
equals the general affine group.

The sets RM(r, n) or RM(r, n)/RM(r′, n) have been classified under this action for some
values of r, of r′ < r and of n; see [130, 611, 613, 615, 812, 1053, 1057].

4.1.4 Cyclicity of the punctured code R*(r, n)

Let us identify $\mathbb{F}_2^n$ with the finite field $\mathbb{F}_{2^n}$. The punctured code R*(r, n) obtained from
RM(r, n) by erasing in each codeword f the coordinate at zero input, and ordering
the resulting vector as $(f(1), f(\alpha), f(\alpha^2), \dots, f(\alpha^{2^n-2}))$, where α is a primitive element
of $\mathbb{F}_{2^n}$, is a cyclic code. Indeed, the cyclic shift
$$(f(1), f(\alpha), f(\alpha^2), \dots, f(\alpha^{2^n-2})) \mapsto (f(\alpha^{2^n-2}), f(1), f(\alpha), \dots, f(\alpha^{2^n-3}))$$
is equivalent to changing function f(x) into $f(x/\alpha)$,
and such transformation on Boolean functions does not change the algebraic degree since
it is linear bijective. For any r < n, the Reed–Muller code RM(r, n) is then an extended
cyclic code [809, page 383].
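The cyclicity can be observed on a small case. The Python sketch below makes several assumptions that are not in the text: n = 4, the field $\mathbb{F}_{2^4}$ is built as $\mathbb{F}_2[x]/(x^4 + x + 1)$ with α = x (a primitive element for this modulus), and a field element is identified with the vector of its coordinates in the polynomial basis. All function names are hypothetical.

```python
from itertools import combinations

def alpha_powers(n, modulus):
    """1, alpha, ..., alpha^(2^n - 2) for alpha = x in F_2[x]/(modulus), as n-bit ints."""
    powers, a = [], 1
    for _ in range((1 << n) - 1):
        powers.append(a)
        a <<= 1                      # multiply by alpha = x
        if (a >> n) & 1:
            a ^= modulus             # reduce modulo the field polynomial
    return powers

def punctured_rm(r, n, modulus):
    """R*(r, n): monomials of degree <= r evaluated at 1, alpha, alpha^2, ..."""
    pts = alpha_powers(n, modulus)
    code = {tuple([0] * len(pts))}
    for deg in range(r + 1):
        for mono in combinations(range(n), deg):
            mask = sum(1 << i for i in mono)
            b = tuple(int(p & mask == mask) for p in pts)
            code |= {tuple(x ^ y for x, y in zip(w, b)) for w in code}
    return code

def is_cyclic(code):
    """True if the code is closed under the cyclic shift of coordinates."""
    return all(w[-1:] + w[:-1] in code for w in code)

MODULUS = 0b10011                    # x^4 + x + 1 (assumed choice)
print([is_cyclic(punctured_rm(r, 4, MODULUS)) for r in range(4)])
```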

Proposition 52 For every r < n, the zeros of the punctured Reed–Muller code R*(r, n) of
order r and length $2^n - 1$ are the elements $\alpha^i$ such that $1 \leq i \leq 2^n - 2$ and such that the
2-weight of i is at most n − r − 1.

Proof We have seen that any Boolean function f of algebraic degree at most r has
a univariate polynomial representation of the form $\sum_{0 \leq j \leq 2^n - 2,\ w_2(j) \leq r} f_j x^j$. The codeword
$(f(1), f(\alpha), f(\alpha^2), \dots, f(\alpha^{2^n-2}))$ of the cyclic code R*(r, n) is represented by the
polynomial $\sum_{0 \leq l \leq 2^n - 2} f(\alpha^l) X^l$ (see Section 1.2), whose value at $\alpha^i$ equals
$$\sum_{0 \leq l \leq 2^n - 2} f(\alpha^l)\, \alpha^{li} = \sum_{\substack{0 \leq j \leq 2^n - 2 \\ w_2(j) \leq r}} f_j \left( \sum_{0 \leq l \leq 2^n - 2} \alpha^{l(i+j)} \right).$$
The sum $\sum_{0 \leq l \leq 2^n - 2} \alpha^{l(i+j)}$ equals 0 when $w_2(i) \leq n - r - 1$ and $w_2(j) \leq r$ since $i + j \geq i \geq 1$
cannot equal $2^n - 1$ since $w_2(i + j) \leq w_2(i) + w_2(j)$, and then, $\alpha^{i+j}$ cannot equal
1, and then $\sum_{0 \leq l \leq 2^n - 2} \alpha^{l(i+j)} = \frac{1 + \alpha^{(2^n - 1)(i+j)}}{1 + \alpha^{i+j}} = 0$. Hence, the $\alpha^i$ such that $1 \leq i \leq 2^n - 2$
and $w_2(i) \leq n - r - 1$ are zeros of the code. Since their number equals the codimension of
the code, they are the only zeros of the code.

4.1.5 The problem of determining the weight distributions of Reed–Muller codes
What are in RM(r, n) the possible Hamming distances between codewords, or equivalently
the possible Hamming weights (or better, the weight distribution)? The answer, useful for
improving the efficiency of the decoding algorithms, for evaluating their complexities, and
for many other issues, is known for every n if r ≤ 2: see Section 5.2. For r ≥ n − 3, it
can also be deduced from the MacWilliams’ identity (1.1), which theoretically allows us to
deduce the weight distribution of RM(n−r −1, n) from the weight distribution of RM(r, n).
Practically, it is necessary to be able to explicitly expand the factors $(X + Y)^{2^n - i}(X - Y)^i$
and to simplify the obtained expression for $W_C(X + Y, X - Y)$; this is possible up to some
value of n (around 35) by running a computer.
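The expansion mentioned above can be carried out coefficient by coefficient. As a sketch under the usual binary form of the MacWilliams identity, $W_{C^\perp}(X, Y) = \frac{1}{|C|} W_C(X + Y, X - Y)$, the coefficient of $X^{N-j} Y^j$ in $(X + Y)^{N-i}(X - Y)^i$ is the Krawtchouk value $\sum_s (-1)^s \binom{i}{s}\binom{N-i}{j-s}$. The Python fragment below (hypothetical names) applies this to RM(1, 4), whose weight distribution (two constants and 30 nonconstant affine functions of weight 8) follows from the facts recalled earlier in this section, to obtain the weight distribution of its dual RM(2, 4).

```python
from math import comb

def krawtchouk(j, i, N):
    """Coefficient of X^(N-j) Y^j in (X + Y)^(N-i) (X - Y)^i."""
    return sum((-1) ** s * comb(i, s) * comb(N - i, j - s) for s in range(j + 1))

def dual_weight_distribution(A):
    """Given A[i] = number of codewords of weight i in a binary linear code of
    length N = len(A) - 1, return the weight distribution of the dual code."""
    N, size = len(A) - 1, sum(A)
    B = []
    for j in range(N + 1):
        total = sum(A[i] * krawtchouk(j, i, N) for i in range(N + 1))
        B.append(total // size)      # exact division for a linear code
    return B

A = [0] * 17                         # RM(1, 4): length 16
A[0], A[8], A[16] = 1, 30, 1
B = dual_weight_distribution(A)      # weight distribution of RM(2, 4)
print({w: b for w, b in enumerate(B) if b})
```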
The cases 3 ≤ r ≤ n − 4 remain unsolved (except for small values of n, see [66], and
for n = 2r, because the code is then self-dual, see [809, 959]). Asymptotically tight bounds
exist [679].
McEliece’s theorem [833] or Ax’s theorem [41] (see also the Stickelberger theorem, e.g.,
in [740, 746]) shows that the Hamming weights (and thus the distances) in RM(r, n) are
all divisible by $2^{\lceil n/r \rceil - 1} = 2^{\lfloor (n-1)/r \rfloor}$, where $\lceil u \rceil$ denotes the ceiling (the smallest integer larger
than or equal to u) and $\lfloor u \rfloor$ denotes the integer part. For instance, it is shown in [677] (see
also [623]) that if $d_{alg}(g) \leq d_{alg}(f)$, then $d_H(f, g) \equiv w_H(f) \pmod{2^{\lceil (n - d_{alg}(g))/d_{alg}(f) \rceil}}$, and this
proves the McEliece’s divisibility property by taking g = f.
McEliece’s divisibility bound is tight and can also be shown by using the properties of
the NNF; it is deduced in [292] from the fact that, if s is the number of monomials of degree
$d_{alg}(f) > 0$ in the ANF of f, then the coefficient $\lambda_u$ of $x^u$ in its NNF is a multiple of
$2^{\lfloor (w_H(u) - 1)/d_{alg}(f) \rfloor}$ if $w_H(u) > 0$ and of $2^{\lfloor (w_H(u) - s - 1)/(d_{alg}(f) - 1) \rfloor}$ if $w_H(u) > s$ and $d_{alg}(f) > 1$. Moreover, it
is also shown in [292] that if $s < \frac{n}{d_{alg}(f)}$, then the Hamming weight of f is a multiple of
$2^{\lceil (n - s)/(d_{alg}(f) - 1) \rceil - 1}$ (larger than what gives McEliece’s theorem).
Further properties of Hamming weights are given in [185] within the coset f ⊕ RM(1, n).
Kasami and Tokura [670] have shown that, for r ≥ 2, the only Hamming weights in
RM(r, n) occurring in the range $[2^{n-r}; 2^{n-r+1}[$ are of the form $2^{n-r+1} - 2^i$ for some i; and
they have completely characterized the codewords: the corresponding functions are affinely
equivalent either to $x_1 \cdots x_{r-2}(x_{r-1}x_r \oplus x_{r+1}x_{r+2} \oplus \cdots \oplus x_{r+2l-3}x_{r+2l-2})$, $2 \leq 2l \leq n - r + 2$,
or to $x_1 \cdots x_{r-l}(x_{r-l+1} \cdots x_r \oplus x_{r+1} \cdots x_{r+l})$, $3 \leq l \leq \min(r, n - r)$. The
functions whose Hamming weights are strictly less than 2.5 times the minimum distance
$2^{n-r}$ have later been studied in [671].
It is shown in [210] (and reported in Section 5.3 below, page 180) that for every Boolean
function f on $\mathbb{F}_2^n$, there exists an integer m and a Boolean function g of algebraic degree at
most 3 on $\mathbb{F}_2^{n+2m}$ whose Walsh transform satisfies: $W_g(0_{n+2m}) = 2^m W_f(0_n)$. Hence, the
Hamming weight of f is related in a simple way to the Hamming weight of a cubic function
(in a number of variables that can be exponentially larger). This shows that the distances in
RM(3, n) can be very diverse, contrary to those in RM(2, n). See also [65].

4.1.6 Covering radius


The covering radius of RM(r, n), which we shall denote by ρ(r, n), equals by definition (see
Section 1.2) the maximum, when f ranges over BF n , of the minimum Hamming distance
between f and all n-variable Boolean functions of algebraic degree at most r (i.e., of the
distance between f and RM(r, n); this distance is called the r-th order nonlinearity of f ,
and more simply its nonlinearity when r = 1; see Section 3.1).
n
• We have ρ(1, n) = 2n−1 − 2 2 −1 when n ≥ 2 is even (see Chapter 6). When n is odd,
as we already saw in Section 3.1, ρ(1, n) is unknown, except for n ≤ 7, in which case it
n−1 n−1
equals 2n−1 − 2 2 [894]. For n ≥ 9 odd, ρ(1, n) lies strictly between 2n−1 − 2 2 and
n
22n−2 − 2 2 −2  [617,
 684, 686, 936,  937].
• We have limn→∞ 2n/2 − ρ(1,n) = 1 (this fact, conjectured by Patterson and
2n/2−1
Wiedemann in 1983 [936], has been proved by Schmidt [1022] in 2019, who also proved
the same limit when restricting to balanced functions).
• ρ(2, n) is known for n ≤ 7 (see [1102]). In [715], the authors calculated the second-order
nonlinearity of all Boolean functions in the infinite class² of those cubic functions
whose degree 3 part, up to affine equivalence, has the form $\sum_{i=1}^{s} x_i q_i(x)$, s ≤ n, where
s is minimal and the $q_i$ are quadratic on separate sets of variables, and where each $q_i$
does not depend on $x_1, \ldots, x_i$. This is done by translating in a systematic way what is
known on the best affine approximations of quadratic functions, and deducing formulae
allowing a direct computation of the second-order nonlinearity of the cubic functions
above, without needing the Walsh transform. This provides a lower bound on ρ(2, n)

2 These functions are closely related to Maiorana–McFarland’s (MM) functions; see page 165. In the case of the
so-called separable functions, they are MM (up to quadratic functions).

Table 4.1 Lower and upper bounds on the covering radii of Reed–Muller codes for small n.

r\n   1   2   3   4   5    6          7          8         9

1     0   1   2   6   12   28         56         120       242–244 [686]
2         0   1   2   6    18 [1020]  40 [1102]  84–100    171–220
                                                            [610]
3             0   1   2    8          20–23      43–67     111–167
4                 0   1    2          8          22–31     58–98
5                     0    1          2          10        23–41
6                          0          1          2         10
7                                     0          1         2
8                                                0         1
9                                                          0

(more precisely, on the covering radius of RM(2, n) in RM(3, n)). This lower bound is
compared with the upper bound from [309] that we shall recall as Relation (4.1) below;
for n ≤ 20, the lower and upper bounds are not that far from each other, and the lower
bound also performs well asymptotically. These results are extended to more general
Maiorana–McFarland functions in [714], with a focus on functions f (x, y) = x · φ(y),
where φ is perfect nonlinear, showing that some of these functions have best quadratic
approximation achieved by affine functions and that the lower bound of [715] on ρ(2, n)
can be improved. See also [1109].
• ρ(n, n), ρ(n − 1, n) and ρ(n − 2, n) equal respectively 0, 1, and 2.
• ρ(n − 3, n), n ≥ 3, has been determined in [837]: it equals n + 1 if n is odd and n + 2 if
n is even.
• More results can be found in [610, 612, 614, 616].

We summarize what is known for small numbers of variables in Table 4.1.


General lower and upper bounds and more results are given in [375, 378, 379]. A
first lower bound is simply the translation of the sphere covering bound:
\[ 2^{1+n+\binom{n}{2}+\cdots+\binom{n}{r}} \sum_{i=0}^{\rho(r,n)} \binom{2^n}{i} \ge 2^{2^n}, \]
and two other lower bounds are due to [378]:
\[ \rho(r, n) \ge \begin{cases} 2^{n-r-3}(r+4), & r \text{ even} \\ 2^{n-r-3}(r+5), & r \text{ odd} \end{cases} \qquad \text{for } r \le n-3, \]
and $\rho(r, n) \ge 2^{n-r}$ for 2 ≤ r ≤ n − 3 and n ≥ 6. The best-known upper bound, from [309], is as follows:
– A bound is first obtained for r = 2:
\[ \rho(2, n) \le \left\lfloor 2^{n-1} - \frac{\sqrt{15}}{2} \cdot 2^{\frac{n}{2}} \cdot \left( 1 - \frac{122929}{21 \cdot 2^n} - \frac{155582504573}{4410 \cdot 2^{2n}} \right) \right\rfloor. \tag{4.1} \]
– This bound is generalized to every r by using the inequality ρ(r, n) ≤ ρ(r − 1, n − 1) +
ρ(r, n − 1), which is easily proved.
– This implies that, asymptotically, ρ(r, n) is bounded above by
\[ 2^{n-1} - \frac{\sqrt{15}}{2} \cdot (1 + \sqrt{2})^{r-2} \cdot 2^{\frac{n}{2}} + O(n^{r-2}). \]
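The bounds just listed can be checked numerically against the exact entries of Table 4.1. The sketch below is only an illustration: the dictionary `RHO` copies the exact (non-interval) table entries by hand, and `lower_bound_378` and `upper_bound_41` are hypothetical helper names encoding the lower bounds of [378] and Relation (4.1) as stated above.

```python
from math import floor, sqrt

# Exact entries of Table 4.1, keyed by (r, n); interval entries are omitted.
RHO = {
    (1, 1): 0, (1, 2): 1, (1, 3): 2, (1, 4): 6, (1, 5): 12, (1, 6): 28,
    (1, 7): 56, (1, 8): 120,
    (2, 2): 0, (2, 3): 1, (2, 4): 2, (2, 5): 6, (2, 6): 18, (2, 7): 40,
    (3, 3): 0, (3, 4): 1, (3, 5): 2, (3, 6): 8,
    (4, 4): 0, (4, 5): 1, (4, 6): 2, (4, 7): 8,
    (5, 5): 0, (5, 6): 1, (5, 7): 2, (5, 8): 10,
    (6, 6): 0, (6, 7): 1, (6, 8): 2, (6, 9): 10,
    (7, 7): 0, (7, 8): 1, (7, 9): 2,
    (8, 8): 0, (8, 9): 1,
    (9, 9): 0,
}


def lower_bound_378(r, n):
    """Best of the two lower bounds quoted from [378] (0 if neither applies)."""
    bounds = [0]
    if r <= n - 3:
        bounds.append(2 ** (n - r - 3) * (r + 4 if r % 2 == 0 else r + 5))
    if 2 <= r <= n - 3 and n >= 6:
        bounds.append(2 ** (n - r))
    return max(bounds)


def upper_bound_41(n):
    """Right-hand side of Relation (4.1), evaluated in floating point."""
    correction = 1 - 122929 / (21 * 2 ** n) - 155582504573 / (4410 * 2 ** (2 * n))
    return floor(2 ** (n - 1) - sqrt(15) / 2 * 2 ** (n / 2) * correction)


lower_ok = all(RHO[r, n] >= lower_bound_378(r, n) for (r, n) in RHO)
recursion_ok = all(
    RHO[r, n] <= RHO[r - 1, n - 1] + RHO[r, n - 1]
    for (r, n) in RHO
    if (r - 1, n - 1) in RHO and (r, n - 1) in RHO
)
print(lower_ok, recursion_ok, upper_bound_41(20))
```

For small n the correction factor in (4.1) is negative, so the bound only becomes informative for larger n (n = 20 is used here).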

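The principle of the proof of (4.1) is to use that, for any two n-variable Boolean functions f
and g, we have $\sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus g(x)} = 2^n - 2\, d_H(f, g)$, which shows
\[ \rho(2, n) = 2^{n-1} - \frac{1}{2} \min_{f \in BF_n} \max_{g \in RM(2,n)} \left| \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus g(x)} \right| \]
and to use that:
\[ \max_{g \in RM(2,n)} \left| \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus g(x)} \right| \ge \sqrt{ \frac{ \sum_{g \in RM(2,n)} \left( \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus g(x)} \right)^{2k+2} }{ \sum_{g \in RM(2,n)} \left( \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus g(x)} \right)^{2k} } }. \]

We have
\[ \sum_{g \in RM(2,n)} \left( \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus g(x)} \right)^{2k} = \sum_{x_1, \ldots, x_{2k} \in \mathbb{F}_2^n} (-1)^{\sum_{i=1}^{2k} f(x_i)} \left( \sum_{g \in RM(2,n)} (-1)^{\sum_{i=1}^{2k} g(x_i)} \right), \]
and the mapping $g \in RM(2, n) \mapsto \sum_{i=1}^{2k} g(x_i)$ being an $\mathbb{F}_2$-linear form over RM(2, n),
the sum $\sum_{g \in RM(2,n)} (-1)^{\sum_{i=1}^{2k} g(x_i)}$ equals the size of RM(2, n) when $\sum_{i=1}^{2k} g(x_i)$ is the null
function, and otherwise, this sum equals 0. We refer to [309] for the rest of the proof, which
is more technical.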
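The two ingredients of this proof principle can be tried out numerically for a tiny parameter. The following sketch is an illustration only, for n = 3, where RM(2, 3) has 128 codewords; truth tables are stored as 8-bit integers and all names are illustrative.

```python
from math import sqrt

n = 3
N = 1 << n


def monomial_tt(mask):
    """Truth table (bit x of the result is the value at x) of prod_{i in mask} x_i."""
    tt = 0
    for x in range(N):
        if (x & mask) == mask:
            tt |= 1 << x
    return tt


# RM(2, n) is spanned by the monomials of degree at most 2.
basis = [monomial_tt(m) for m in range(N) if bin(m).count("1") <= 2]
codewords = {0}
for b in basis:
    codewords |= {c ^ b for c in codewords}


def char_sum(f, g):
    """sum_x (-1)^(f(x)+g(x)), computed as 2^n - 2 d_H(f, g)."""
    return N - 2 * bin(f ^ g).count("1")


def check(f, k):
    """Return (max_g |S_g|, sqrt(sum_g S_g^(2k+2) / sum_g S_g^(2k)))."""
    sums = [char_sum(f, g) for g in codewords]
    lhs = max(abs(s) for s in sums)
    rhs = sqrt(sum(s ** (2 * k + 2) for s in sums) / sum(s ** (2 * k) for s in sums))
    return lhs, rhs


results = [check(f, k) for f in range(1 << N) for k in (1, 2, 3)]
rho_2_3 = 2 ** (n - 1) - min(lhs for lhs, _ in results) // 2
print(len(codewords), rho_2_3)
```

The last line recovers ρ(2, 3) from the min–max expression, which can be compared with the corresponding entry of Table 4.1.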
We have seen at page 84 that the suitably normalised r-th order nonlinearity of a random
Boolean function converges strongly for all r ≥ 1 as shown in [1021], but no limit on ρ(r, n)
similar to the one recalled above for ρ(1, n) is known yet.

Remark. The so-called Gowers norm (whose definition involves k-th order derivatives
of Boolean functions) is related to the covering radius of Reed–Muller codes. We devote
Section 12.4 to it.

A notion on cosets of the first-order Reed–Muller code called orphan or urcoset is related
to the notion of plateauedness of Boolean functions; see page 262.

4.2 Other codes related to Boolean functions


4.2.1 Linear codes
There are mainly two principles for constructing linear codes (which are binary³) from
Boolean functions and vectorial functions (surveys can be found in [454] and [455]):
• Codes from Boolean functions: Let f be an n-variable Boolean function. Recall that we
denote its support by supp(f ). We choose an order on it and assume that it has rank
n. We define the linear code Csupp(f ) , whose codewords are the lists of values of the
restrictions to supp(f ) of the linear functions v · x, where v ∈ Fn2 and “·” is an inner

3 There also exist constructions of nonbinary codes from so-called p-ary functions, that is, Boolean-like
functions in characteristic p.

product in Fn2 . In other words, Csupp(f ) equals the code of all linear functions punctured
at all the positions that are not in supp(f ). Any linear code whose generator matrix G
has its columns all different4 can be obtained by this construction, introduced in the early
1970’s and called nowadays the defining-set construction. Indeed, the codewords of such
a code are obtained as $(v \times G)_{v \in \mathbb{F}_2^k}$. The support of f having rank n, the parameters of this
code are $[w_H(f), n, d]$, where d needs to be determined for each function f . A generator
matrix is made of the elements of supp(f ) put in columns. When f is a bent function in
n ≥ 4 variables (n even), the code has two weights (this property is characteristic) and
d is their minimum (see [1120] and other papers by Wolfmann written in French, whose
results have been rediscovered in [453] among many other results); we recall why in
Chapter 6 at page 195 and give more characterizations. More generally, we can consider
the code obtained from any Reed–Muller code by puncturing it at all positions outside
supp(f ). Note that Fn2 can be identified with F2n , and the inner product can then be
v · x = trn (vx).
Cyclic codes are also related to algebraic immunity; see page 326.
• Codes from vectorial functions:
– Given inner products in $\mathbb{F}_2^n$ and $\mathbb{F}_2^m$ (which we shall both denote by “·” for simplicity),
the subcodes $C_F$ and $\widetilde{C}_F$ of RM(r, n), where r ≥ 2 is the algebraic degree of F ,
whose codewords are the Boolean functions $v \cdot F(x) \oplus u \cdot x$, respectively $v \cdot F(x) \oplus
u \cdot x \oplus \varepsilon$, where u ranges over $\mathbb{F}_2^n$, v over $\mathbb{F}_2^m$, and ε over $\mathbb{F}_2$, can be associated
to each vectorial function $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$ having no affine component (i.e., having
strictly positive nonlinearity). More precisely, the codewords are the lists of values of
these functions, some order being chosen on $\mathbb{F}_2^n$. The Hamming weight of codeword
$v \cdot F(x) \oplus u \cdot x$ (resp. $v \cdot F(x) \oplus u \cdot x \oplus \varepsilon$) equals $2^{n-1} - \frac{1}{2} W_F(u, v)$ (resp. $2^{n-1} -
\frac{(-1)^{\varepsilon}}{2} W_F(u, v)$). Code $\widetilde{C}_F$ equals the union of the cosets $v \cdot F + RM(1, n)$, where
v ranges over $\mathbb{F}_2^m$. The parameters of $\widetilde{C}_F$ are $[2^n, n + m + 1, d]$, where d is the
nonlinearity of F ; see more in [257, 1099, 269, 1147, 53]. A generator matrix of
$C_F$ is $\begin{bmatrix} \cdots & x & \cdots \\ \cdots & F(x) & \cdots \end{bmatrix}$, and a generator matrix of $\widetilde{C}_F$ is $\begin{bmatrix} \cdots & 1 & \cdots \\ \cdots & x & \cdots \\ \cdots & F(x) & \cdots \end{bmatrix}$,
where x and F (x) are column vectors and x ranges over $\mathbb{F}_2^n$. Conversely, let C be
a linear $[2^n, k, d]$ binary code such that k > n + 1 and including the Reed–Muller
code RM(1, n) as a subcode. Let $(b_1, \ldots, b_k)$ be a basis of C completing a basis
$(b_1, \ldots, b_{n+1})$ of RM(1, n). The n-variable Boolean functions corresponding to the
vectors $b_{n+2}, \ldots, b_k$ are the coordinate functions of an (n, k − n − 1)-function whose
nonlinearity is d.
The CCZ equivalence between (n, m)-functions can be expressed in terms of these
codes (see the remark at page 379).
Often, we have m = n and $\mathbb{F}_2^n$ is identified with $\mathbb{F}_{2^n}$, the inner product being then
$u \cdot x = tr_n(ux)$.
– When m = n, the dual of the code $C_F^*$, equal to $C_F$ punctured at the zero position,
plays an important role with respect to APN functions F (defined in Definition 41,
page 137) such that $F(0_n) = 0_m$; see Proposition 160, page 378. The dual of $\widetilde{C}_F$
plays a similar role with respect to general APN functions (see the remark at page
379). When F is a power function, $C_F^*$ is cyclic; we find among such codes related
to APN functions in particular the dual of the historical 2-error-correcting BCH code
of length $2^n - 1$.

4 Such codes are sometimes called projective.

– Codes (which are constant-weight) are deduced in [862] from o-polynomials in
relation with vectorial bent functions (see Definition 30, page 118).
– The other notion of nonlinearity of vectorial functions nlv introduced at page 122 has
been studied in [788] in relation with codes.

A hybrid construction is proposed in [1071], and other constructions of cyclic codes from
vectorial (possibly APN) functions are given in [452, 464].
We have seen (in the remark at page 92) connections between algebraic immunity and
linear codes. Connections exist with cyclic codes; see page 326.
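The defining-set construction described above for Boolean functions can be illustrated on a small bent function. The sketch below assumes the usual inner product and takes the (illustrative) bent function $x_1x_2 \oplus x_3x_4$ in n = 4 variables; it lists the weights of the nonzero codewords of $C_{supp(f)}$.

```python
n = 4


def f(x):
    """Bent function x1 x2 + x3 x4 on F_2^4 (bit i of x encodes x_{i+1})."""
    return ((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))


def dot(v, x):
    """Usual inner product on F_2^n."""
    return bin(v & x).count("1") % 2


support = [x for x in range(1 << n) if f(x)]

# Codeword attached to v: the values of the linear function v.x on supp(f).
codewords = {v: tuple(dot(v, x) for x in support) for v in range(1 << n)}

nonzero_weights = sorted({sum(c) for v, c in codewords.items() if v != 0})
dimension_is_n = len(set(codewords.values())) == 1 << n
print(len(support), nonzero_weights, dimension_is_n)
```

Here the code has length $w_H(f) = 6$ and dimension 4, and its nonzero codewords take exactly two weights, in line with the two-weight property stated for bent functions.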

4.2.2 Unrestricted codes


Boolean functions play an important role with nonlinear codes, as we shall see in Section
6.1.22 about Kerdock codes.
Vectorial functions also play a role. Given any (n, m)-function F , we can consider the
code GF = {(x, F (x)); x ∈ Fn2 } (the graph of F viewed as a code). When F is linear, GF is
a linear code, but it happens that nonlinear functions F provide better parameters, as in the
case of Kerdock and Preparata codes.
Codes of the form GF are systematic: the set of n first indices has the property that every
possible n-tuple occurs in exactly one codeword within the coordinates of indices 1, . . . , n.
We call {1, . . . , n} an information set of C. Conversely, if a subset I of {1, . . . , N}, where
N is the length of C, is an information set, then, up to permutation of the coordinates, C has
the form GF . It is easily shown that all linear codes have such a property: the generator matrix
having rank k, it has k linearly independent columns, and placing them in the k first positions,
we can multiply the resulting permuted generator matrix on the left by the inverse of the
invertible square matrix made of its first k columns; this provides a systematic permuted
generator matrix.
Such codes play a role in relation with countermeasures to side channel attacks, see
Section 12.1.1, page 431. They need then to be complementary information set codes (CIS)
(see the same page) in the sense that they admit two complementary information sets. This
is a necessary and sufficient condition so that F can be a permutation.

4.2.3 Codes and diffusion layers in block ciphers


The diffusion (see the definition at page 76) ensured by a mapping F can be studied by
analyzing the pairs (x − y, F (x) − F (y)). In practice, q will be a power of 2 and − will be
the same as +.
These pairs play also a role with respect to the differential attack.

Definition 43 Let q be a power of a prime. The differential branch number of a function $F : \mathbb{F}_q^n \to \mathbb{F}_q^m$
is defined as $\beta(F) = \min_{x, y \in \mathbb{F}_q^n,\, x \ne y} \{ d_H(x, y) + d_H(F(x), F(y)) \}$, the minimum
distance of code $G_F$. The differential branch number of a linear function $F : \mathbb{F}_q^n \to \mathbb{F}_q^m$ (or
of its matrix) is then defined as $\beta(F) = \min_{x \in \mathbb{F}_q^n,\, x \ne 0_n} \{ w_H(x) + w_H(F(x)) \}$.

β(F ) quantifies the level of diffusion induced by F when it is used as a diffusion layer in
a block cipher. When $q = 2^n$ and F diffuses the outputs of (n, n)-S-boxes, β(F ) indicates
the minimum number of active S-boxes.
Also, the larger β(F ), the more difficult the search for characteristics needed for
mounting differential attacks (see page 134). An r-round characteristic constitutes an (r + 1)-tuple
of difference patterns: $(X_0, X_1, \ldots, X_r)$. The probability of this characteristic
is the probability that an initial difference pattern $X_0$ propagates to difference patterns
$X_1, \ldots, X_r$ after 1, 2, . . . , r rounds.
If F is linear⁵, then $G_F$ is linear and the diffusion is studied by analyzing the pairs
(a, F (a)).
It is easily shown that the differential branch number of a linear permutation equals that
of its inverse and that, for every $F : \mathbb{F}_q^n \to \mathbb{F}_q^m$, we have β(F ) ≤ m + 1, with equality if and
only if the code $G_F = \{(x, F(x));\ x \in \mathbb{F}_q^n\}$ is MDS.
If GF is an MDS [N, k, d]-code such that N > k then any punctured code obtained, for
instance, by erasing the last coordinate of each codeword is an MDS [N − 1, k, d − 1] code.
For every prime power q, every N < q and every k ≤ N, we know that there exist MDS
codes over Fq of parameters [N , k, N − k + 1] (Reed–Solomon codes, for instance). This
allows us to build optimal diffusion layers.

Definition 44 The linear branch number of a linear function $F : \mathbb{F}_q^n \to \mathbb{F}_q^m$ is defined as
follows:
\[ \beta'(F) = \min_{\substack{a, b \in \mathbb{F}_q^n,\ (a,b) \ne (0_n, 0_n) \\ a \cdot x \oplus b \cdot F(x)\ \text{unbalanced}}} \{ w_H(a) + w_H(b) \}. \]

The linear branch number of a function $F : \mathbb{F}_q^n \to \mathbb{F}_q^m$ is the dual distance of the code
$\{(x, F(x)),\ x \in \mathbb{F}_q^n\}$. Then if F is linear and F′ is the linear mapping whose matrix is the
transpose of that of F , we have β′(F ) = β(F′).
In [795], the authors propose a nonlinear diffusion layer based on Kerdock codes.
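The two branch numbers of Definitions 43 and 44 can be computed by brute force for a small binary linear map. The sketch below restricts itself to q = 2 and to an arbitrarily chosen, hypothetical 4 × 4 matrix `ROWS`; it checks the relation between the linear branch number and the differential branch number of the transpose.

```python
def weight(x):
    """Hamming weight of the bit vector encoded by the integer x."""
    return bin(x).count("1")


def mat_vec(rows, x):
    """y = M x over F_2; rows[i] is the bitmask of row i (bit j = entry (i, j))."""
    y = 0
    for i, row in enumerate(rows):
        y |= (weight(row & x) % 2) << i
    return y


def transpose(rows, n):
    """Bitmask rows of the transposed matrix (n = number of columns of M)."""
    return [sum(((rows[i] >> j) & 1) << i for i in range(len(rows))) for j in range(n)]


def differential_branch(rows, n):
    """min over x != 0 of w_H(x) + w_H(M x)."""
    return min(weight(x) + weight(mat_vec(rows, x)) for x in range(1, 1 << n))


def linear_branch(rows, n):
    """min of w_H(a) + w_H(b) over (a, b) != 0 with a.x + b.F(x) unbalanced."""
    m = len(rows)
    best = None
    for a in range(1 << n):
        for b in range(1 << m):
            if a == 0 and b == 0:
                continue
            ones = sum((weight(a & x) + weight(b & mat_vec(rows, x))) % 2
                       for x in range(1 << n))
            if ones != 1 << (n - 1):               # unbalanced
                w = weight(a) + weight(b)
                best = w if best is None else min(best, w)
    return best


ROWS = [0b1110, 0b0111, 0b1011, 0b1100]   # hypothetical 4 x 4 matrix over F_2
N_VARS = 4
beta = differential_branch(ROWS, N_VARS)
beta_lin = linear_branch(ROWS, N_VARS)
beta_transpose = differential_branch(transpose(ROWS, N_VARS), len(ROWS))
print(beta, beta_lin, beta_transpose)
```

Over $\mathbb{F}_2$ the bound β(F ) ≤ m + 1 is rarely attained, which is one reason why practical diffusion layers work over larger fields.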

4.2.4 Codes and association schemes


Association schemes, which originated in statistics, have been used in coding theory and combinatorics
in the 1970s by Delsarte, McEliece, and others to obtain strong upper bounds on
the size of codes and other combinatorial objects, and to characterize those objects (such as
perfect codes) that meet these bounds. They have also been studied in relation to Boolean
functions. They are related to graphs (which we encountered at page 70). For more details,
the reader is referred to [411, 424].
5 Contrary to a substitution layer, a diffusion layer does not need to be nonlinear; for reasons of speed, it is then
better to choose it to be linear.

Definition 45 Let V be a finite set of vertices and $\{G_0, G_1, \ldots, G_d\}$ be binary relations on
V with $G_0 = \{(x, x) : x \in V\}$. Then the decomposition $(V; G_0, G_1, \ldots, G_d)$ (represented
in short as $(V, \{G_i\}_{0 \le i \le d})$) is called an association scheme of class d on V provided that
the following properties hold:
– $V \times V = G_0 \cup G_1 \cup \cdots \cup G_d$ and $G_i \cap G_j = \emptyset$ for $i \ne j$.
– ${}^t G_i = G_{i'}$ for some $i' \in \{0, 1, \ldots, d\}$, where ${}^t G_i = \{(x, y) : (y, x) \in G_i\}$. (If $i' = i$, then
we call $G_i$ symmetric.)
– For $i, j, k \in \{0, 1, \ldots, d\}$ and $x, y \in V$ with $(x, y) \in G_k$, the number $p_{ij}^k := \#\{z \in V :
(x, z) \in G_i, (z, y) \in G_j\}$ is a constant.

An association scheme is said to be symmetric if each $G_i$ is symmetric.

One of the well-known construction methods of association schemes is to use Schur rings.
Constructions of association schemes from bent functions (in odd characteristic) have been
considered in the literature (see, e.g., [964]). In [506], the authors studied Boolean functions
arising in some popular association schemes.

4.2.5 Codes and secret sharing


We have seen in Subsection 3.6.1, page 145, how codes play a role with respect to secret
sharing and that Boolean functions can play a role in this domain.
5 Functions with weights, Walsh spectra, and nonlinearities easier to study

In this chapter, we visit diverse types of Boolean and vectorial functions, whose study is
simpler than for general functions. We will encounter them again in almost all subsequent
chapters.

5.1 Affine functions and their combinations


Affine functions are weak cryptographically (see Sections 3.1 and 3.4), and many criteria
seen in Chapter 3 quantify the difference between cryptographic functions and affine
functions. However, good functions can be obtained by combining affine functions in
different ways. Before presenting them, we briefly address affine functions themselves.

Affine Boolean functions


The Hamming weights and the Walsh spectra of affine Boolean functions (i.e., of the
codewords of RM(1, n)) are peculiar.
The Hamming weight of any non-constant affine function is 2n−1 since this is the size of
any affine hyperplane. The Hamming weights of the two constant functions are of course 0
and 2n .
Recall from page 38 that, given any inner product “·”, any affine Boolean function can be
written in the form $\ell(x) = a \cdot x \oplus \varepsilon$, where $a \in \mathbb{F}_2^n$ and $\varepsilon \in \mathbb{F}_2$. The Walsh transform of such
a function takes null value at every vector $u \ne a$ and takes value $2^n (-1)^{\varepsilon}$ at a. The Walsh
support is then a singleton.
Conversely, every Boolean function whose Walsh support is a singleton is an affine
function, according to the inverse Walsh transform formula (2.43), page 59, and to Parseval’s
relation (2.47), page 60.
Of course, the nonlinearity of any affine Boolean function is null, and this is characteristic
of affine functions.
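The singleton Walsh support of an affine function is easy to observe numerically. The sketch below assumes the usual inner product, computes the Walsh transform with a standard fast Walsh–Hadamard butterfly, and uses an arbitrarily chosen (hypothetical) pair a, ε in n = 4 variables.

```python
def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x) of a truth table (list of bits)."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):                      # fast Walsh-Hadamard butterfly
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


n, a, eps = 4, 0b1010, 1                   # hypothetical affine function a.x + eps
ell = [(bin(a & x).count("1") + eps) % 2 for x in range(1 << n)]
spectrum = walsh(ell)
walsh_support = [u for u, value in enumerate(spectrum) if value != 0]
nonlinearity = 2 ** (n - 1) - max(abs(v) for v in spectrum) // 2
print(walsh_support, spectrum[a], sum(ell), nonlinearity)
```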

Affine vectorial functions


The component functions of affine (n, m)-functions are affine Boolean functions (this
property is characteristic of affine vectorial functions). If F (x) = L(x) + a where L is
a linear (n, m)-function and $a \in \mathbb{F}_2^m$, then, for every $(u, v) \in \mathbb{F}_2^n \times \mathbb{F}_2^m$,
\[ W_F(u, v) = \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot L(x) \oplus u \cdot x \oplus v \cdot a} = \sum_{x \in \mathbb{F}_2^n} (-1)^{(L^*(v) + u) \cdot x \oplus v \cdot a} \]
equals $2^n (-1)^{v \cdot a}$ if $u = L^*(v)$ and
is null otherwise, where $L^*$ is the adjoint operator of L, that is, where $v \cdot L(x) = L^*(v) \cdot x$
for every v and x (in the case where “·” is the usual inner product, the matrix of $L^*$ is simply
the transpose of that of L). Of course, the nonlinearity of any affine vectorial function is null,
but this is not characteristic of affine vectorial functions (it is characteristic of the fact that
at least one component function of F is affine).

5.1.1 Maiorana–McFarland functions


Since the Walsh transform of affine functions behaves so simply, it is natural to try building
more robust functions by using them as building blocks in constructions. A first way is
based on the additive structure of Fn2 as an F2 -vector space. This leads to considering those
functions whose restrictions to each coset a + E of some F2 -vector subspace E of Fn2 are
affine. Up to affine equivalence, we can take E = Fr2 × {0n−r }. Then the corresponding
functions are called Maiorana–McFarland (MM) functions, since originally, the idea of such
functions comes from Maiorana and McFarland [834], as reported in [441]. The general
class, obtained by considering all affinely equivalent functions to Maiorana–McFarland
functions, is called the completed Maiorana–McFarland class.

Maiorana–McFarland Boolean functions


They have been first investigated for building bent functions (see Section 6.1.15, page
209), and later been considered in [181] for constructing correlation immune and resilient
functions (see Subsection 7.1.8, page 291). Recall that every affine Boolean function has
the form $a \cdot x \oplus \varepsilon$. The idea of Maiorana–McFarland’s construction corresponds to making
a and ε vary. For convenience, instead of denoting the input to the global function by $x =
(x_1, \ldots, x_n)$, we denote it then by (x, y), where $x = (x_1, \ldots, x_r)$ and $y = (x_{r+1}, \ldots, x_n)$.

Definition 46 Let n and r be any positive integers such that r ≤ n. We call Maiorana–
McFarland’s function any n-variable Boolean function of the form:
\[ f(x, y) = x \cdot \phi(y) \oplus g(y); \quad x \in \mathbb{F}_2^r,\ y \in \mathbb{F}_2^{n-r}, \tag{5.1} \]
where φ is a function from $\mathbb{F}_2^{n-r}$ to $\mathbb{F}_2^r$ and g is an (n − r)-variable Boolean function. We
denote by $MM_r$ the corresponding class.

The size of this class roughly equals $2^{(r+1) 2^{n-r}}$.
An example already seen of a Maiorana–McFarland Boolean function is the address
function (see page 68), with $r \approx n - \log_2(n)$. Note that, for every r < n, we have $MM_{r+1} \subseteq
MM_r$ (this can be seen directly with Relation (5.1) or by the fact that the restriction of an
affine function to an affine subspace is affine) and that $MM_1 = BF_n$ (since every function
in one variable is necessarily affine).
The algebraic degree of f in (5.1) is at most n − r + 1 (and at most n − r if $\sum_{y \in \mathbb{F}_2^{n-r}} \phi(y) =
0_r$) since the algebraic degree of φ is at most n − r (and at most n − r − 1 if $\sum_{y \in \mathbb{F}_2^{n-r}} \phi(y) =
0_r$). We shall see in Section 5.2 that all quadratic functions belong to the completed $MM_{\lceil n/2 \rceil}$
class.

Remark. Maiorana–McFarland functions can be viewed as the concatenations of affine


functions. Indeed, let us order all the binary words of length n in lexicographic order, with
the bit of higher weight on the right-hand side. Then, the truth table of f is the concatenation
of the truth tables of its restrictions obtained by fixing the values of the n − r last bits of the
input and letting the r first input bits freely range over F2 . And f is an MMr function if and
only if all these restrictions are affine.

The calculation of the Hamming weight, Walsh spectrum, and nonlinearity is easier for
functions in $MM_r$, r ≥ 2, than for general Boolean functions, and in some cases these can be
completely determined. Note that since the input to f is written in the form (x, y), where
$x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^{n-r}$, the input to $W_f$ is better written (u, v), where $u \in \mathbb{F}_2^r$, $v \in \mathbb{F}_2^{n-r}$.

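Proposition 53 Let f be the function given by Relation (5.1). Then, assuming that the
inner product in $\mathbb{F}_2^r \times \mathbb{F}_2^{n-r}$ writes $(u, v) \cdot (x, y) = u \cdot x \oplus v \cdot y$ (where we use the same
notation “ ·” for denoting inner products in $\mathbb{F}_2^r$ and $\mathbb{F}_2^{n-r}$), we have
\[ W_f(u, v) = 2^r \sum_{y \in \phi^{-1}(u)} (-1)^{g(y) \oplus v \cdot y}; \quad u \in \mathbb{F}_2^r,\ v \in \mathbb{F}_2^{n-r}, \]
where $\phi^{-1}(u)$ denotes the preimage of u by φ. Hence
\[ w_H(f) = 2^{n-1} - 2^{r-1} \sum_{y \in \phi^{-1}(0_r)} (-1)^{g(y)} \]
and
\[ nl(f) = 2^{n-1} - 2^{r-1} \max_{u \in \mathbb{F}_2^r,\ v \in \mathbb{F}_2^{n-r}} \left| \sum_{y \in \phi^{-1}(u)} (-1)^{g(y) \oplus v \cdot y} \right|. \]

Proof We have
\[ W_f(u, v) = \sum_{x \in \mathbb{F}_2^r,\ y \in \mathbb{F}_2^{n-r}} (-1)^{f(x,y) \oplus u \cdot x \oplus v \cdot y} = \sum_{y \in \mathbb{F}_2^{n-r}} \left( (-1)^{g(y) \oplus v \cdot y} \sum_{x \in \mathbb{F}_2^r} (-1)^{(\phi(y) + u) \cdot x} \right) = 2^r \sum_{y \in \phi^{-1}(u)} (-1)^{g(y) \oplus v \cdot y}, \]
since $\sum_{x \in \mathbb{F}_2^r} (-1)^{(\phi(y) + u) \cdot x}$ is null when $\phi(y) \ne u$.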
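Proposition 53 can be compared with a direct computation on a toy example. The sketch below assumes the usual inner product, takes n = 4 and r = 2, and uses hypothetical lookup tables `PHI` and `G` for φ and g (φ is chosen to be a permutation, so that the resulting function is bent).

```python
n, r = 4, 2
PHI = [0b00, 0b01, 0b11, 0b10]   # hypothetical phi: F_2^(n-r) -> F_2^r, a permutation here
G = [0, 1, 1, 0]                  # hypothetical g: F_2^(n-r) -> F_2


def dot(a, b):
    """Usual inner product over F_2."""
    return bin(a & b).count("1") % 2


def f(x, y):
    """Maiorana-McFarland function f(x, y) = x.phi(y) + g(y), as in (5.1)."""
    return dot(x, PHI[y]) ^ G[y]


def walsh_direct(u, v):
    return sum((-1) ** (f(x, y) ^ dot(u, x) ^ dot(v, y))
               for x in range(1 << r) for y in range(1 << (n - r)))


def walsh_prop53(u, v):
    """2^r times the sum over the preimage of u by phi."""
    return (1 << r) * sum((-1) ** (G[y] ^ dot(v, y))
                          for y in range(1 << (n - r)) if PHI[y] == u)


pairs = [(u, v) for u in range(1 << r) for v in range(1 << (n - r))]
agree = all(walsh_direct(u, v) == walsh_prop53(u, v) for u, v in pairs)
hamming_weight = sum(f(x, y) for x in range(1 << r) for y in range(1 << (n - r)))
nonlinearity = 2 ** (n - 1) - max(abs(walsh_prop53(u, v)) for u, v in pairs) // 2
print(agree, hamming_weight, nonlinearity)
```

Since each preimage $\phi^{-1}(u)$ is a singleton here, every Walsh value has absolute value $2^r$, which gives the weight and nonlinearity printed above.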

Proposition 53 shows that the Walsh support of f is included in $Im(\phi) \times \mathbb{F}_2^{n-r}$. Note that
this Walsh support can be made very small (minimizing the size of Im(φ) and the value of
n − r), even while ensuring some properties such as the nonexistence of linear structure.
We shall see that the $MM_r$ class provides easy constructions of bent (or highly nonlinear)
functions, correlation immune functions, and resilient functions. It will then be important
to be able to say if a given Boolean function is in the completed $MM_r$ class or not. The
following proposition is an easy extension of an observation from [441]:

Proposition 54 An n-variable Boolean function f belongs to the completed MMr class


if and only if there exists an r-dimensional vector space E such that Da Db f is the null
function for every a, b ∈ E.

Proof The condition is clearly necessary. It is also sufficient since it means that each
restriction of f to a coset of E is affine.
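The criterion of Proposition 54 lends itself to an exhaustive test for small n. The following is a rough sketch (exponential in n, for illustration only); `in_completed_mm` is a hypothetical helper that searches all r-dimensional subspaces E, and the two test functions are arbitrary examples.

```python
from itertools import combinations


def second_derivative_null(f, n, a, b):
    """True if D_a D_b f is the null function."""
    return all((f(x) ^ f(x ^ a) ^ f(x ^ b) ^ f(x ^ a ^ b)) == 0 for x in range(1 << n))


def span(vectors):
    """F_2-linear span of the given vectors (encoded as integers)."""
    s = {0}
    for v in vectors:
        s |= {w ^ v for w in s}
    return s


def in_completed_mm(f, n, r):
    """Search for an r-dimensional E with D_a D_b f = 0 for all a, b in E."""
    for gens in combinations(range(1, 1 << n), r):
        E = span(gens)
        if len(E) == 1 << r and all(
            second_derivative_null(f, n, a, b) for a in E for b in E
        ):
            return True
    return False


def quad(x):
    """x1 x2 + x3 x4 in 4 variables."""
    return ((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))


def cubic(x):
    """x1 x2 x3 in 3 variables."""
    return int((x & 0b111) == 0b111)


print(in_completed_mm(quad, 4, 2), in_completed_mm(cubic, 3, 2), in_completed_mm(cubic, 3, 1))
```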

Note that for such function, E is in general not the linear kernel of f (see Definition 25,
page 99); it can be a superset of the linear kernel.

Maiorana–McFarland vectorial functions


It is easily seen that an r-variable vectorial function is linear if and only if all its component
functions are linear. Let n, r and m be positive integers such that r ≤ n. Let F be any
function of the form
\[ F : (x, y) \in \mathbb{F}_2^r \times \mathbb{F}_2^{n-r} \mapsto \psi(x, y) + G(y) \in \mathbb{F}_2^m, \tag{5.2} \]
where G is any function from $\mathbb{F}_2^{n-r}$ to $\mathbb{F}_2^m$ and $\psi : \mathbb{F}_2^r \times \mathbb{F}_2^{n-r} \to \mathbb{F}_2^m$ is such that, for every
$y \in \mathbb{F}_2^{n-r}$, the function $x \mapsto \psi(x, y)$ is linear. Then, for every $y \in \mathbb{F}_2^{n-r}$ and $w \in \mathbb{F}_2^m$, there
exists $\phi(y, w) \in \mathbb{F}_2^r$ such that $w \cdot \psi(x, y) = \phi(y, w) \cdot x$, and this property is characteristic
of the functions of the form (5.2). For every $(u, v, w) \in \mathbb{F}_2^r \times \mathbb{F}_2^{n-r} \times \mathbb{F}_2^m$, we have
\[ W_F((u, v), w) = \sum_{(x, y) \in \mathbb{F}_2^r \times \mathbb{F}_2^{n-r}} (-1)^{(\phi(y, w) + u) \cdot x \oplus w \cdot G(y) \oplus v \cdot y} = 2^r \sum_{y \in \mathbb{F}_2^{n-r};\ \phi(y, w) = u} (-1)^{w \cdot G(y) \oplus v \cdot y}. \]

Remark. If r divides n, then we can endow $\mathbb{F}_2^n$ with the structure of the field $\mathbb{F}_{2^n}$ and $\mathbb{F}_2^r$
with the structure of subfield $\mathbb{F}_{2^r}$ of $\mathbb{F}_{2^n}$. In particular, if $r = \frac{n}{2}$ (which will be well suited for
designing bent functions), we can identify $\mathbb{F}_2^n$ with $\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$ and we consider the functions
of the form
\[ F(x, y) = L(x\, \phi(y)) + G(y), \tag{5.3} \]
where the product $x\, \phi(y)$ is calculated in $\mathbb{F}_{2^{n/2}}$ and L is any linear or affine function from
$\mathbb{F}_{2^{n/2}}$ to $\mathbb{F}_2^m$, φ is any function from $\mathbb{F}_{2^{n/2}}$ to itself, and G is any $(\frac{n}{2}, m)$-function.

5.1.2 Niho and PS ap -like functions


When $\mathbb{F}_2^n$ is identified with $\mathbb{F}_{2^n}$, we can also use the multiplicative structure of $\mathbb{F}_{2^n}^*$ to
build Boolean functions from affine functions. Similarly to the case of Maiorana–McFarland
functions, in which we considered additive subgroups of $\mathbb{F}_2^n$ and their cosets, we can consider
multiplicative subgroups of $\mathbb{F}_{2^n}^*$ and their cosets. A natural choice¹ as a subgroup is the
multiplicative group of a subfield $\mathbb{F}_{2^m}$ of $\mathbb{F}_{2^n}$ (where m is a divisor of n). We can view $\mathbb{F}_{2^n}^*$
as the union of the cosets $\mu\, \mathbb{F}_{2^m}^*$ of $\mathbb{F}_{2^m}^*$, where μ ranges over a subset U of $\mathbb{F}_{2^n}^*$ containing
one representative of each coset of $\mathbb{F}_{2^m}^*$ and one only (U has then $\frac{2^n - 1}{2^m - 1}$ elements). Under
some condition, it is possible to take U equal to the multiplicative subgroup of $\mathbb{F}_{2^n}^*$ of order
$\frac{2^n - 1}{2^m - 1}$. This is possible when $2^m - 1$ and $\frac{2^n - 1}{2^m - 1}$ are coprime (which is always the case if n is
even and $m = \frac{n}{2}$, in which case the representation of the elements of $\mathbb{F}_{2^n}^*$ in the form μx,
$x \in \mathbb{F}_{2^m}^*$, is often called polar representation²), since there exist then relative integers i, j
such that $i(2^m - 1) + j\, \frac{2^n - 1}{2^m - 1} = 1$, and given a primitive element α of $\mathbb{F}_{2^n}$, we have then
$\alpha = (\alpha^{2^m - 1})^i\, (\alpha^{\frac{2^n - 1}{2^m - 1}})^j$ and $\alpha^{2^m - 1} \in U$, $\alpha^{\frac{2^n - 1}{2^m - 1}} \in \mathbb{F}_{2^m}^*$. It is observed in [804] that, if³
n = 2m, any (n, n)-function F (and therefore any n-variable Boolean function) can then be
uniquely represented by a polynomial in the form $F(\mu x) = \sum_{s=0}^{2^m - 2} \sum_{t=0}^{2^m} a_{s,t}\, \mu^t x^s$, where
$a_{s,t} \in \mathbb{F}_{2^n}$, $\mu \in U$, $x \in \mathbb{F}_{2^m}^*$ (with additionally the indication of the value of F (0)), and
that if $F(0) = \sum_{\mu \in U,\ x \in \mathbb{F}_{2^m}^*} F(\mu x)$, its algebraic degree equals the maximal 2-weight of
$s(2^m + 1)u + t(2^m - 1)v\ [\mathrm{mod}\ 2^n - 1]$ such that $a_{s,t} \ne 0$, where $(2^m + 1)u + (2^m - 1)v = 1$.
We consider then those n-variable Boolean functions whose restrictions to the cosets
$\mu\, \mathbb{F}_{2^m}^*$, where m divides n, coincide with affine functions:
\[ f(\mu x) = tr_m(x\, \phi(\mu)) + g(\mu); \quad \mu \in U,\ x \in \mathbb{F}_{2^m}^*, \tag{5.4} \]
where φ is a function from U to $\mathbb{F}_{2^m}$ and g is a Boolean function over U . And a value must
still be chosen for f (0). Note that if each restriction to $\mu \mathbb{F}_{2^m}$ has algebraic degree less than
m (in particular if $d_{alg}(f) < m$), then the univariate representation $f(z) = \sum_{i=0}^{2^n - 2} a_i z^i$ of f
satisfies “($i \ne 0$ and $a_i \ne 0$) ⇒ ($i\ [\mathrm{mod}\ 2^m - 1] \in I$)”, where $I = \{2^j;\ j = 0, \ldots, m - 1\}$:
this sufficient condition is indeed necessary since, assuming without loss of generality that
$a_0 = 0$, for every $\omega \in \mathbb{F}_{2^n}^*$, the function $x \in \mathbb{F}_{2^m} \mapsto f(\omega x) = \sum_{i=0}^{2^n - 2} a_i\, \omega^i\, x^{i\ [\mathrm{mod}\ 2^m - 1]}$
being linear, we have that $k \notin I \Rightarrow \sum_{0 \le i \le 2^n - 2,\ i \equiv k\ [\mathrm{mod}\ 2^m - 1]} a_i\, \omega^i = 0$, and by uniqueness of the univariate
representation of the functions of $\omega \in \mathbb{F}_{2^n}$, this completes the proof [311].
Recall that functions $tr_n$, $tr_m$, and $tr_m^n$ (see page 43) satisfy that, for every $a \in \mathbb{F}_{2^n}$, we
have $tr_n(a) = tr_m(tr_m^n(a))$, and that, for every $u \in \mathbb{F}_{2^n}$ and $x \in \mathbb{F}_{2^m}$, we have $tr_m^n(ux) =
x\, tr_m^n(u)$. Therefore, for $\mu \in U$, $x \in \mathbb{F}_{2^m}$, we have $tr_n(u \mu x) = tr_m(x\, tr_m^n(u \mu))$. We have
then, for every $u \in \mathbb{F}_{2^n}$:
\[ W_f(u) = (-1)^{f(0)} + \sum_{\mu \in U,\ x \in \mathbb{F}_{2^m}^*} (-1)^{tr_m(x\, [\phi(\mu) + tr_m^n(u \mu)]) \oplus g(\mu)} \]
\[ = (-1)^{f(0)} - \sum_{\mu \in U} (-1)^{g(\mu)} + 2^m \sum_{\mu \in U;\ \phi(\mu) + tr_m^n(u \mu) = 0} (-1)^{g(\mu)}, \tag{5.5} \]

1 But not the only one; investigations could be made on other subgroups, such as those of order $\frac{2^n - 1}{2^m - 1}$, where m
divides n (for which affinity would no more be the property on which the functions would be built).
2 A slightly different representation is the trace 0/trace 1 representation; see [547].
3 More general cases are studied there.
\[ w_H(f) = 2^{n-1} - \frac{1}{2} \left( (-1)^{f(0)} - \sum_{\mu \in U} (-1)^{g(\mu)} + 2^m \sum_{\mu \in \phi^{-1}(0)} (-1)^{g(\mu)} \right) \]
and
\[ nl(f) = 2^{n-1} - \frac{1}{2} \max_{u \in \mathbb{F}_{2^n}} \left| (-1)^{f(0)} - \sum_{\mu \in U} (-1)^{g(\mu)} + 2^m \sum_{\mu \in U;\ \phi(\mu) + tr_m^n(u \mu) = 0} (-1)^{g(\mu)} \right|. \]

A subcase is when function g is null in (5.4) (i.e., when the restrictions to the cosets μ F∗2m
coincide with linear functions), which leads (when n = 2m) to the so-called Niho Boolean
functions (the name comes from a theorem by Niho [902] dealing with power functions; see
a survey on their applications in [769]):

f (μ x) = trm (x φ(μ)); μ ∈ U , x ∈ F∗2m , (5.6)

among which are bent functions; see Subsection 6.1.15. Another subcase is (also when
n = 2m) when function φ is null in (5.4) (i.e., when the restrictions to the cosets μ F∗2m
coincide with constant functions), which leads to the so-called PS ap -like class of Boolean
functions:

f (μ x) = g(μ); μ ∈ U , x ∈ F∗2m , (5.7)

among which are also bent functions; see Subsection 6.1.15 as well. Niho power functions
with few Walsh values are studied in [769].
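Relation (5.5) can be tested on the smallest nontrivial case n = 4, m = 2. The sketch below is an illustration under stated assumptions: $\mathbb{F}_{16}$ is built from the polynomial $x^4 + x + 1$ (an assumed choice of primitive polynomial), elements are stored as 4-bit integers, and the tables `PHI` and `G` for φ and g, as well as the value `F_AT_ZERO` of f (0), are hypothetical.

```python
M, N = 2, 4                       # m and n = 2m
MOD = 0b10011                     # x^4 + x + 1 (assumed primitive polynomial for F_16)


def mul(a, b):
    """Multiplication in F_16 (carry-less product reduced modulo MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return r


def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r


def tr_n(a):      # absolute trace F_16 -> F_2
    return a ^ power(a, 2) ^ power(a, 4) ^ power(a, 8)


def tr_rel(a):    # relative trace tr_m^n: F_16 -> F_4, a + a^(2^m)
    return a ^ power(a, 4)


def tr_m(a):      # absolute trace F_4 -> F_2 (a must lie in the subfield F_4)
    return a ^ power(a, 2)


ALPHA = 0b0010
U = [power(ALPHA, 3 * i) for i in range(5)]         # subgroup of order (2^n - 1)/(2^m - 1) = 5
F4_STAR = [power(ALPHA, 5 * i) for i in range(3)]   # F_4^*, of order 2^m - 1 = 3

PHI = dict(zip(U, [0, 1, F4_STAR[1], F4_STAR[2], 1]))   # hypothetical phi: U -> F_4
G = dict(zip(U, [0, 1, 1, 0, 1]))                       # hypothetical g: U -> F_2
F_AT_ZERO = 0                                            # chosen value of f(0)

# Truth table of f via the polar representation z = mu * x, as in (5.4).
table = {0: F_AT_ZERO}
for mu in U:
    for x in F4_STAR:
        table[mul(mu, x)] = tr_m(mul(x, PHI[mu])) ^ G[mu]


def walsh_direct(u):
    return sum((-1) ** (table[z] ^ tr_n(mul(u, z))) for z in range(16))


def walsh_formula(u):
    """Right-hand side of (5.5)."""
    matching = [mu for mu in U if (PHI[mu] ^ tr_rel(mul(u, mu))) == 0]
    return ((-1) ** F_AT_ZERO - sum((-1) ** G[mu] for mu in U)
            + 2 ** M * sum((-1) ** G[mu] for mu in matching))


agree = all(walsh_direct(u) == walsh_formula(u) for u in range(16))
print(len(table), agree)
```

Setting all values of `G` to 0 gives a function of the form (5.6), and setting all values of `PHI` to 0 gives one of the form (5.7).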

Niho and PS ap -like classes in bivariate form


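The last sum in (5.5) is not always easily simplified further since it deals with U , which has
no additive structure in general. This can be circumvented when n = 2m (this case can be
generalized) by representing the elements of $\mathbb{F}_{2^n}$ by ordered pairs of elements of $\mathbb{F}_{2^m}$ (which
is possible since $\mathbb{F}_{2^n}$ is a plane over $\mathbb{F}_{2^m}$). It is then easily seen that the subset U introduced
at the beginning of the present subsection can be taken equal to $\{(0, 1)\} \cup \{(1, \lambda),\ \lambda \in \mathbb{F}_{2^m}\}$
and one of the cosets of $\mathbb{F}_{2^m}^*$ becomes then $\{(0, y),\ y \in \mathbb{F}_{2^m}^*\}$ and the others become the sets
$\{(x, \lambda x),\ x \in \mathbb{F}_{2^m}^*\}$ where $\lambda \in \mathbb{F}_{2^m}$. We have then
\[ f(x, y) = \begin{cases} tr_m\!\left( x\, \phi\!\left( \frac{y}{x} \right) \right) + g\!\left( \frac{y}{x} \right); & x \in \mathbb{F}_{2^m}^*,\ y \in \mathbb{F}_{2^m} \\ tr_m(a\, y) + \varepsilon; & x = 0,\ y \in \mathbb{F}_{2^m}^* \\ f(0, 0); & x = y = 0, \end{cases} \tag{5.8} \]
where $a \in \mathbb{F}_{2^m}$, $\varepsilon \in \mathbb{F}_2$, φ is a function from $\mathbb{F}_{2^m}$ to $\mathbb{F}_{2^m}$, g is a Boolean function over $\mathbb{F}_{2^m}$,
and where the products $x\, \phi(\frac{y}{x})$ and $a\, y$ are calculated in $\mathbb{F}_{2^m}$.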

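We have then, for every $u, v \in \mathbb{F}_{2^m}$, that $W_f(u, v)$ equals
\[ (-1)^{f(0)} + \sum_{y \in \mathbb{F}_{2^m}^*} (-1)^{tr_m(y\, (a + v)) + \varepsilon} + \sum_{x \in \mathbb{F}_{2^m}^*,\ y \in \mathbb{F}_{2^m}} (-1)^{tr_m(x\, [\phi(\frac{y}{x}) + u + v \frac{y}{x}]) + g(\frac{y}{x})} \]
\[ = (-1)^{f(0)} - (-1)^{\varepsilon} + \sum_{y \in \mathbb{F}_{2^m}} (-1)^{tr_m(y\, (a + v)) + \varepsilon} + \sum_{x \in \mathbb{F}_{2^m}^*,\ z \in \mathbb{F}_{2^m}} (-1)^{tr_m(x\, [\phi(z) + u + v z]) + g(z)} \]
\[ = (-1)^{f(0)} + (2^m \delta_a(v) - 1)(-1)^{\varepsilon} + 2^m \sum_{z \in \mathbb{F}_{2^m};\ \phi(z) + u + v z = 0} (-1)^{g(z)} - \sum_{z \in \mathbb{F}_{2^m}} (-1)^{g(z)}. \tag{5.9} \]

We have then
\[ w_H(f) = 2^{n-1} - \frac{1}{2} \left( (-1)^{f(0)} + (2^m \delta_0(a) - 1)(-1)^{\varepsilon} + 2^m \sum_{z \in \phi^{-1}(0)} (-1)^{g(z)} - \sum_{z \in \mathbb{F}_{2^m}} (-1)^{g(z)} \right) \]
and
\[ nl(f) = 2^{n-1} - \frac{1}{2} A, \]
where A equals
\[ \max_{u, v \in \mathbb{F}_{2^m}} \left| (-1)^{f(0)} + (2^m \delta_a(v) - 1)(-1)^{\varepsilon} + 2^m \sum_{z \in \mathbb{F}_{2^m};\ \phi(z) + u + v z = 0} (-1)^{g(z)} - \sum_{z \in \mathbb{F}_{2^m}} (-1)^{g(z)} \right|. \]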

5.2 Quadratic functions and their combinations


The next functions to be naturally considered after affine ones are quadratic ones. We shall
see that they offer a compromise between robustness and simplicity. They will play roles in
almost all domains addressed in the subsequent chapters.

5.2.1 Quadratic Boolean functions


The behavior of quadratic Boolean functions (i.e., of the codewords of RM(2, n)) is rather
simple (less, though, than that of affine functions). There are many results on their Walsh
transform, that we shall try to present completely, but without being able to give all proofs,
since this would take too much space.

Absolute value of the Walsh transform


Recall thatRelation (2.55), page 62, states that, for every Boolean function f , we have
F (f ) = b∈Fn F (Db f ), where F (f ) = x∈Fn (−1)f (x) . If f is quadratic, then Db f is
2
2 2
affine for every b ∈ Fn2 , and is therefore either balanced or constant. Since F (g) = 0 for
every balanced function g, we deduce
F 2 (f ) = 2n (−1)Db f (0n ) , (5.10)
b∈Ef
5.2 Quadratic functions and their combinations 171

where Ef is the linear kernel (i.e., the set of all b ∈ Fn2 such that Db f is constant; see Section
3.1). Since f is quadratic, Ef is also the kernel {x ∈ Fn2 ; ∀y ∈ Fn2 , βf (x, y) = 0} of the
symplectic4 form associated to f :
βf (x, y) = f (0n ) ⊕ f (x) ⊕ f (y) ⊕ f (x + y).
In other words, Ef is the radical of the quadratic form.
We have already seen after Definition 25, page 99, that the restriction of f to $E_f$ is affine;
the restriction to $E_f$ of the function $b \mapsto D_bf(0_n) = f(b) \oplus f(0_n)$ is therefore linear.
We deduce from (5.10) that $F^2(f)$ equals $2^n|E_f|$ if $f(b) \oplus f(0_n)$ is null on $E_f$ (i.e., if f
is constant on $E_f$) and is null otherwise. Note that in the former case, f is constant on any
coset $a + E_f$ of $E_f$, since f and $D_af$ are constant on $E_f$. According to Relation (2.35), page
57 (and since the linear kernel of $f(x) \oplus a\cdot x$ equals that of f), this proves the following
proposition, which shows in particular that the absolute value of the Walsh transform of
every quadratic Boolean function takes at most two values, 0 and one nonzero value (such
functions will be called plateaued in Section 6.2).

Proposition 55 [209] Let n be any positive integer. Any n-variable quadratic function f is
unbalanced if and only if its restriction to its linear kernel $E_f$ (i.e., the kernel of its associated
symplectic form) is constant, or equivalently, if every constant derivative of f is null. Then,
f is constant on any coset of $E_f$ and the Hamming weight of f equals $2^{n-1} \pm 2^{\frac{n+k}{2}-1}$, where
k is the dimension of $E_f$.
For every $a \in \mathbb{F}_2^n$ and every n-variable quadratic function f, $W_f(a)$ is nonzero if and
only if the restriction of $f(x) \oplus a\cdot x$ to $E_f$ is constant. Then, $W_f(a)$ equals $\pm 2^{\frac{n+k}{2}}$.

Note that Proposition 55 implies that f is balanced if and only if there exists b ∈ Fn2
such that the derivative Db f (x) = f (x) ⊕ f (x + b) equals the constant function 1. For
nonquadratic Boolean functions, this condition for f to be balanced is sufficient but not
necessary.
Note that, according to Parseval's relation, there exists a such that $W_f(a) \neq 0$.
Proposition 55 implies that $\frac{n+k}{2}$ is an integer (because the Hamming weight is an integer),
and then that the codimension of $E_f$ must be even. This codimension is the rank of $\beta_f$, also
called by abuse of language the rank of f. Note that, given two quadratic functions f and g,
we have $|\mathrm{rank}(f \oplus g) - \mathrm{rank}(f)| \le \mathrm{rank}(g)$ because the rank of matrices is subadditive:
$\mathrm{rank}(A + B) \le \mathrm{rank}(A) + \mathrm{rank}(B)$.
We also deduce

Corollary 9 Let n be any positive integer and f any n-variable quadratic function. The
nonlinearity of f equals $2^{n-1} - 2^{\frac{n+k}{2}-1} = 2^{n-1} - 2^{n-\frac{rk(f)}{2}-1}$, where k is the dimension of
the linear kernel of f and rk(f) is the rank of $\beta_f$.
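
Proposition 55 and Corollary 9 can be checked by exhaustive search on a small example. The following is a minimal sketch, assuming that vectors of $\mathbb{F}_2^n$ are encoded as Python integers (bit i-1 holding $x_i$) and that "·" is the usual inner product; the helper names and the example function $x_1x_2 \oplus x_2x_3 \oplus x_4$ on 5 variables are illustrative choices, not taken from the text.

```python
def parity(v):
    """Sum modulo 2 of the bits of v (used for the inner product a.x)."""
    return bin(v).count("1") & 1

def walsh(f, n):
    """Brute-force Walsh transform: W_f(a) = sum_x (-1)^(f(x) XOR a.x)."""
    return [sum((-1) ** (f(x) ^ parity(a & x)) for x in range(1 << n))
            for a in range(1 << n)]

def linear_kernel(f, n):
    """All b such that the derivative D_b f(x) = f(x) XOR f(x+b) is constant."""
    return [b for b in range(1 << n)
            if len({f(x) ^ f(x ^ b) for x in range(1 << n)}) == 1]

def example_f(x):
    """Hypothetical quadratic function x1x2 XOR x2x3 XOR x4 (x5 does not appear)."""
    x1, x2, x3, x4 = x & 1, (x >> 1) & 1, (x >> 2) & 1, (x >> 3) & 1
    return (x1 & x2) ^ (x2 & x3) ^ x4

n = 5
kernel = linear_kernel(example_f, n)
k = len(kernel).bit_length() - 1          # dim E_f, since |E_f| = 2^k
spectrum = walsh(example_f, n)
nonlinearity = 2 ** (n - 1) - max(abs(w) for w in spectrum) // 2
print(k, sorted(set(spectrum)), nonlinearity)
```

The exhaustive search costs $O(4^n)$ operations, so this is only meant for experiments with a few variables; for this example the kernel has dimension 3, and the nonzero Walsh values have absolute value $2^{(5+3)/2} = 16$.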

The Hamming weight of an n-variable quadratic Boolean function belongs then to the set
$\{2^{n-1}\} \cup \{2^{n-1} \pm 2^i;\ i = \lceil n/2\rceil - 1, \dots, n-1\}$ and can be any element of this set, since
it is easily seen that the dimension of the linear kernel in the case of function $x_1x_2 \oplus x_3x_4 \oplus \cdots \oplus x_{2r-1}x_{2r}$
equals n − 2r. The nonlinearity of an n-variable quadratic Boolean function
can be any element of the set $\{2^{n-1} - 2^i;\ i = \lceil n/2\rceil - 1, \dots, n-1\}$, and if f has Hamming
weight $2^{n-1} \pm 2^i$, then for every affine function $\ell$, the Hamming weight of the function $f \oplus \ell$
belongs to the set $\{2^{n-1} - 2^i,\ 2^{n-1},\ 2^{n-1} + 2^i\}$.

4 Bilinear, symmetric, and null for x = y; the associated matrix is called a symplectic matrix.
The method seen above is particularly simple,5 but it does not allow determining whether
the Hamming weight is 2n−1 − 2i or 2n−1 + 2i when the function is not balanced, nor
determining the sign of the Walsh transform. It may be much more difficult to calculate
this sign than the absolute value. Such calculation is sometimes necessary. This is the case
for instance when trying to determine the absolute value of the Walsh transform of a cubic
function by applying Relation (2.55), page 62, or when we calculate the size of the preimage
of an element $u \in \mathbb{F}_2^m$ by a quadratic function $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$, thanks to the formula
\[
|\{x \in \mathbb{F}_2^n;\ F(x) = u\}| = 2^{-m} \sum_{x\in\mathbb{F}_2^n,\ v\in\mathbb{F}_2^m} (-1)^{v\cdot(F(x)+u)}.
\]

Dickson form of a quadratic function


A first important step, predating the method above, was made by Dickson for
calculating explicitly the Hamming weight of quadratic functions, by showing as described
in [809, page 438] that any nonaffine quadratic Boolean function f over $\mathbb{F}_2^n$ is affinely
equivalent to
\[
x_1x_2 \oplus \cdots \oplus x_{2r-1}x_{2r} \oplus \ell, \tag{5.11}
\]
where 2r is the rank of the quadratic function and $\ell$ is an affine function (which can be
taken equal, up to affine equivalence, to 0, 1 or $x_{2r+1}$). This is easily shown: by hypothesis,
f has a monomial of degree 2 in its ANF, and we can assume without loss of generality
that this monomial is x1 x2 . The function has then the form x1 x2 ⊕ x1 f1 (x3 , . . . , xn ) ⊕
x2 f2 (x3 , . . . , xn ) ⊕ f3 (x3 , . . . , xn ), where f1 , f2 are affine functions and f3 is quadratic.
Then, f (x) = (x1 ⊕ f2 (x3 , . . . , xn ))(x2 ⊕ f1 (x3 , . . . , xn )) ⊕ f1 (x3 , . . . , xn )f2 (x3 , . . . , xn ) ⊕
f3 (x3 , . . . , xn ) is affinely equivalent to the function x1 x2 ⊕ f1 (x3 , . . . , xn )f2 (x3 , . . . , xn ) ⊕
f3 (x3 , . . . , xn ). Applying this method recursively shows

Theorem 10 Every quadratic nonaffine function is affinely equivalent to
\[
x_1x_2 \oplus \cdots \oplus x_{2r-1}x_{2r} \oplus x_{2r+1} \tag{5.12}
\]
(where $r \le \frac{n-1}{2}$) if it is balanced, to
\[
x_1x_2 \oplus \cdots \oplus x_{2r-1}x_{2r} \tag{5.13}
\]
(where $r \le \frac{n}{2}$) if it has Hamming weight smaller than $2^{n-1}$, and to
\[
x_1x_2 \oplus \cdots \oplus x_{2r-1}x_{2r} \oplus 1 \tag{5.14}
\]
(where $r \le \frac{n}{2}$) if it has Hamming weight larger than $2^{n-1}$.

5 Theoretically; in practice, calculating the dimension of the linear kernel is not always an easy task.

The unique expressions (5.12), (5.13), and (5.14) are called the Dickson form of the
quadratic function. They allow describing precisely the weight distribution of RM(2, n)
[809, page 441].

Walsh transform when the function is given by its ANF


We have seen how a quadratic Boolean function can be put in the form g(L(x)+b), where L
is a linear automorphism and g is in Dickson form. Thanks to Relation (2.58), page 63, and to
Lemma 4, page 58, it is then enough to be able to calculate $\sum_{x\in\mathbb{F}_2^n} (-1)^{x_1x_2\oplus\cdots\oplus x_{2r-1}x_{2r}\oplus u\cdot x}$,
for every $u \in \mathbb{F}_2^n$ and every $r \in \{1, \dots, \lfloor n/2\rfloor\}$. This sum equals
\[
\sum_{x\in\mathbb{F}_2^n} (-1)^{\bigoplus_{i=1}^{r} [(x_{2i-1}\oplus u_{2i})(x_{2i}\oplus u_{2i-1})\oplus u_{2i-1}u_{2i}]\ \oplus\ u_{2r+1}x_{2r+1}\oplus\cdots\oplus u_nx_n}
\]
and equals then $2^{n-r}(-1)^{\bigoplus_{i=1}^{r} u_{2i-1}u_{2i}}$ if $u_{2r+1} = \cdots = u_n = 0$ and 0 otherwise. Since
the dimension k of the kernel $E_f$ of the symplectic form $\beta_f(x, y)$ equals n − 2r, this shows
again that the Walsh transform $W_f(u)$ lies in $\{0, \pm 2^{(n+k)/2}\} = \{0, \pm 2^{n-r}\}$ for every u. But
we have also the sign of $W_f(u)$.
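
The closed form above for the character sum of a function in Dickson form can be compared with a direct summation. The sketch below is illustrative only: it assumes vectors are encoded as integers with bit i-1 holding $x_i$, and the function names are hypothetical.

```python
def parity(v):
    """Sum modulo 2 of the bits of v."""
    return bin(v).count("1") & 1

def dickson(x, r):
    """g(x) = x1x2 XOR x3x4 XOR ... XOR x_{2r-1}x_{2r}."""
    return sum(((x >> (2 * i)) & 1) & ((x >> (2 * i + 1)) & 1) for i in range(r)) & 1

def walsh_bruteforce(u, n, r):
    """sum over x in F_2^n of (-1)^(g(x) XOR u.x), by exhaustive summation."""
    return sum((-1) ** (dickson(x, r) ^ parity(u & x)) for x in range(1 << n))

def walsh_closed_form(u, n, r):
    """2^(n-r) (-1)^(u1u2 + ... + u_{2r-1}u_{2r}) if u_{2r+1} = ... = u_n = 0, else 0."""
    if u >> (2 * r):                 # some coordinate u_j with j > 2r is nonzero
        return 0
    return 2 ** (n - r) * (-1) ** dickson(u, r)

n = 5
for r in (1, 2):
    agree = all(walsh_bruteforce(u, n, r) == walsh_closed_form(u, n, r)
                for u in range(1 << n))
    print(r, agree)
```

The closed form needs only O(r) bit operations per point u, whereas the brute-force sum needs $2^n$ evaluations.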

Remark. Any quadratic function belongs to the completed Maiorana–McFarland class;
this can be easily seen from its Dickson form. Note, however, that given a quadratic function
in Maiorana–McFarland form $f(x, y) = x \cdot (L(y) + b) \oplus g(y)$, where $x \in \mathbb{F}_2^k$, $y \in \mathbb{F}_2^{n-k}$
and L is linear, the linear kernel of f is not $E = \mathbb{F}_2^k \times \{0_k\}$, in general, despite the fact
that $D_aD_{a'}f$ is null for $a, a' \in E$. Indeed, writing $a = (a_1, a_2)$, we have $D_af(x, y) =
x \cdot L(a_2) \oplus a_1 \cdot L(y + a_2) \oplus a_1 \cdot b \oplus D_{a_2}g(y)$, and $D_af$ is not necessarily
constant for $a \in E$.

Remark. According to Theorem 2, page 63, the functions whose Walsh transform values
are all divisible by 2n−1 are quadratic. According to Theorem 10, they are the sums of an
affine function and of the product of two affine functions. This proves one of the points that
we asserted at page 65.

More general approach on the Walsh transform


Calculating the Dickson form of a quadratic Boolean function in generic number n of
variables is most often impossible when the function is given by its trace representation.
As originally shown by Dillon and Dobbertin in [448, appendix A] for the case of functions
$\mathrm{tr}_n(x^{2^i+1})$ and generalized by Hou to all quadratic functions, there is a possibility of relating
all the values of the Walsh transform to one of them (which needs of course to be nonzero;
we know by the Parseval relation that such nonzero value necessarily exists). If the sign of
one of these nonzero Walsh values is known, then all will be deduced.
X. Hou in [625] calculates the product of two nonzero Walsh values, instead of calculating
the square of one value as we did in Proposition 55. This has the interest of providing the
sign of every value $W_f(u)$, knowing one of them. Hou works with quadratic functions
in trace form. This does not reduce theoretically the generality of his results since any
function admits a trace form. However, in practice, if a Boolean function is given by its
ANF, it is nonnegligible work to first determine its trace representation; and if, instead
of working with a particular function in a particular number of variables, we work with
all functions with an ANF of some form in arbitrary number of variables, it is most often
impossible. We shall then revisit Hou's result in a way that will not depend on a particular
representation of the functions. Subsequently, we shall see what this result gives in trace
representation.
Hou needs to assume that $W_f(0_n) \neq 0$ (i.e., that f is unbalanced). This does not reduce
the generality since, if f is balanced, we can apply the result to one of the unbalanced
functions $f(x) \oplus b \cdot x$. We assume then that $W_f(0_n) \neq 0$. This means according to
Proposition 55 that any constant derivative of f is null on $\mathbb{F}_2^n$, i.e., for every $x \in E_f$, we
have $f(x) = f(0_n)$.
We have
\[
W_f(0_n)W_f(a) = \sum_{x,y\in\mathbb{F}_2^n} (-1)^{f(x)\oplus f(y)\oplus a\cdot y} = \sum_{x,y\in\mathbb{F}_2^n} (-1)^{f(x+y)\oplus f(y)\oplus a\cdot y}.
\]
For every $x \in \mathbb{F}_2^n$, function $y \mapsto f(x+y) \oplus f(y) \oplus a\cdot y$ is affine. We are then in the same situation
as in Proposition 55, but with the advantage that we shall know the product of the signs of
$W_f(0_n)$ and $W_f(a)$ when $W_f(a)$ is nonzero. The sum $\sum_{y\in\mathbb{F}_2^n} (-1)^{f(x+y)\oplus f(y)\oplus a\cdot y}$ is
nonzero if and only if function $y \mapsto f(x + y) \oplus f(y) \oplus a \cdot y$ is constant over $\mathbb{F}_2^n$. The set
of those $x \in \mathbb{F}_2^n$ having such property either is empty or is a coset of the linear kernel $E_f$,
since $f(x + y) \oplus f(y) \oplus a \cdot y \oplus f(x' + y) \oplus f(y) \oplus a \cdot y = D_{x+x'}f(x + y)$. Moreover, the
constant values of $f(x + y) \oplus f(y) \oplus a \cdot y$ are the same for all those x that belong to this
coset, since $W_f(0_n)$ being nonzero, $D_xf$ is the zero function for every $x \in E_f$ (according
to Proposition 55). The next proposition is a version made as general as possible of the main
result from [625].

Proposition 56 Let n be any positive integer and f any unbalanced quadratic n-variable
function. Let $W_f$ be the Walsh transform associated to some inner product "·". Then, for
every $a \in \mathbb{F}_2^n$, the value of $W_f(a)$ is nonzero if and only if there exists x in $\mathbb{F}_2^n$ such that the
function $y \mapsto f(x + y) \oplus f(y) \oplus a \cdot y$ is constant on $\mathbb{F}_2^n$. The set of such x is then a coset
of $E_f$ and we have $W_f(0_n)W_f(a) = 2^{n+\dim E_f}(-1)^{f(x)\oplus f(0_n)}$.

The determination, for given a, of the set of those x such that a · y coincides with function
Dx f (y) or with function Dx f (y)⊕1 leads in Hou’s method to the resolution of an equation,
which is over F2n in his paper since f is taken in trace representation, and that we shall see
below. This determination is necessary for calculating Wf (a) explicitly and is the difficult
part of this method in practice.
We now introduce a slightly different viewpoint (which has never been addressed as is,
as far as we know). We start with the vector space $\{a \in \mathbb{F}_2^n;\ \exists x \in \mathbb{F}_2^n;\ \forall y \in \mathbb{F}_2^n,\ a \cdot y =
\beta_f(x, y)\}$, which we denote by $E'_f$, for reasons that will appear below. After identification
between $\mathbb{F}_2^n$ and the vector space of its linear forms6 through the correspondence $a \longleftrightarrow
(y \mapsto a \cdot y)$, we can view $E'_f$ as the image of $\mathbb{F}_2^n$ by the linear function $x \mapsto (y \mapsto \beta_f(x, y))$.
This linear function having kernel $E_f$, the dimension of $E'_f$ equals $n - \dim E_f$. Moreover,
if $a \in E'_f$ then $a \cdot y$ is null over $E_f$, and since the dimension of the vector space $E_f^{\perp} = \{a \in
\mathbb{F}_2^n;\ \forall y \in E_f,\ a \cdot y = 0\}$ is equal to $n - \dim E_f$ as well, we have
\[
E'_f = E_f^{\perp}.
\]
According to Proposition 55, for every $b \in \mathbb{F}_2^n$, $W_f(b)$ is nonzero if and only if the function
$x \mapsto f(x) \oplus f(0_n) \oplus b \cdot x$ is null on $E_f$, and the Walsh support of f equals then $b + E_f^{\perp} =
b + E'_f$. For every $a \in E'_f$, choosing $x \in \mathbb{F}_2^n$ such that $a \cdot y = \beta_f(x, y)$, we have
\[
W_f(a + b) = \sum_{y\in\mathbb{F}_2^n} (-1)^{f(y)\oplus(a+b)\cdot y} = \sum_{y\in\mathbb{F}_2^n} (-1)^{f(y)\oplus\beta_f(x,y)\oplus b\cdot y}
\]
\[
= \sum_{y\in\mathbb{F}_2^n} (-1)^{f(x+y)\oplus f(x)\oplus f(0_n)\oplus b\cdot y} = \sum_{y\in\mathbb{F}_2^n} (-1)^{f(y)\oplus f(x)\oplus f(0_n)\oplus b\cdot(x+y)}
\]
\[
= (-1)^{f(x)\oplus f(0_n)\oplus b\cdot x} \sum_{y\in\mathbb{F}_2^n} (-1)^{f(y)\oplus b\cdot y} = (-1)^{f(x)\oplus f(0_n)\oplus b\cdot x}\, W_f(b).
\]

6 This vector space is called in mathematics the dual space (here of $\mathbb{F}_2^n$), but we shall avoid using this
denomination, for obvious reasons.

Proposition 57 Let f be any quadratic n-variable Boolean function and let $W_f$ be its
Walsh transform associated to some inner product "·". Let $\beta_f(x, y) = f(x + y) \oplus f(x) \oplus
f(y) \oplus f(0_n)$ be the symplectic form associated to f. Let b be any element of $\mathbb{F}_2^n$ such that
$W_f(b) \neq 0$. Then, for every $a \in \mathbb{F}_2^n$, we have $W_f(a + b) \neq 0$ if and only if $a \in E_f^{\perp} = \{u \in
\mathbb{F}_2^n;\ u \cdot y = 0,\ \forall y \in E_f\}$, which is equivalent to saying that there exists $x \in \mathbb{F}_2^n$ such that the
functions $y \mapsto a \cdot y$ and $y \mapsto \beta_f(x, y)$ coincide over $\mathbb{F}_2^n$, and we have then
\[
W_f(a + b) = (-1)^{f(x)\oplus f(0_n)\oplus b\cdot x}\, W_f(b).
\]
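
The sign rule of Proposition 57 can be tested numerically. The sketch below is a brute-force illustration under stated assumptions: vectors are encoded as integers, "·" is the usual inner product, and the quadratic function $x_1x_2 \oplus x_2x_3 \oplus x_4x_5 \oplus x_1$ is a hypothetical example (it is balanced, so the reference point b is not $0_n$).

```python
def parity(v):
    """Sum modulo 2 of the bits of v."""
    return bin(v).count("1") & 1

def walsh_at(f, n, a):
    """W_f(a) = sum_y (-1)^(f(y) XOR a.y)."""
    return sum((-1) ** (f(y) ^ parity(a & y)) for y in range(1 << n))

def beta(f, x, y):
    """Symplectic form beta_f(x, y) = f(0) + f(x) + f(y) + f(x + y) over F_2."""
    return f(0) ^ f(x) ^ f(y) ^ f(x ^ y)

def linear_form_of(f, n, x):
    """The vector a with a.y = beta_f(x, y) for all y, read off on the canonical basis."""
    return sum(beta(f, x, 1 << i) << i for i in range(n))

def check_sign_rule(f, n):
    """Check W_f(a+b) = (-1)^(f(x)+f(0)+b.x) W_f(b) and that supp(W_f) = b + E_f^perp."""
    W = [walsh_at(f, n, c) for c in range(1 << n)]
    b = next(c for c in range(1 << n) if W[c] != 0)   # exists by Parseval's relation
    shifted = set()
    for x in range(1 << n):
        a = linear_form_of(f, n, x)
        sign = (-1) ** (f(x) ^ f(0) ^ parity(b & x))
        if W[a ^ b] != sign * W[b]:
            return False
        shifted.add(a ^ b)
    return shifted == {c for c in range(1 << n) if W[c] != 0}

def example_f(x):
    """Hypothetical quadratic function x1x2 XOR x2x3 XOR x4x5 XOR x1."""
    b = [(x >> i) & 1 for i in range(5)]
    return (b[0] & b[1]) ^ (b[1] & b[2]) ^ (b[3] & b[4]) ^ b[0]

print(check_sign_rule(example_f, 5))
```

Once one nonzero value $W_f(b)$ is known with its sign, every other value follows from evaluations of f alone, which is the point of the proposition.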

Quadratic functions in trace form


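We know (see Subsection 2.2.2, page 41) that any quadratic function f(x) over $\mathbb{F}_{2^n}$ can be
written in a unique way under the form
\[
f(x) = \mathrm{tr}_n\left( \sum_{k=1}^{\lfloor (n-1)/2\rfloor} a_k x^{2^k+1} \right) \oplus q(x); \quad a_k \in \mathbb{F}_{2^n}, \tag{5.15}
\]
where
\[
\begin{cases}
\text{if } n \text{ is even, } q(x) = \mathrm{tr}_{n/2}\left(a_{n/2}\, x^{2^{n/2}+1}\right) + \ell(x); & a_{n/2} \in \mathbb{F}_{2^{n/2}},\ \ell \text{ affine},\\
\text{if } n \text{ is odd, } q(x) = \ell(x); & \ell \text{ affine}.
\end{cases} \tag{5.16}
\]
We have then, using that, for every $u \in \mathbb{F}_{2^n}$ and $j \in \mathbb{N}$, we can replace $\mathrm{tr}_n(u)$ by $\mathrm{tr}_n(u^{2^j})$
and $u^{2^n}$ by u:
\[
\beta_f(x, y) = \mathrm{tr}_n\left( y \sum_{k=1}^{\lfloor (n-1)/2\rfloor} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) \right) + \beta_q(x, y),
\]
where $\beta_q(x, y) = \mathrm{tr}_{n/2}\left(a_{n/2}(x^{2^{n/2}}y + xy^{2^{n/2}})\right) = \mathrm{tr}_{n/2}\left(a_{n/2}\,\mathrm{tr}^n_{n/2}(x^{2^{n/2}}y)\right) = \mathrm{tr}_n\left(a_{n/2}\,y\,x^{2^{n/2}}\right)$
for n even and $\beta_q(x, y) = 0$ for n odd. We have then
\[
E_f = \begin{cases}
\left\{ x \in \mathbb{F}_{2^n};\ \sum_{k=1}^{\frac{n}{2}-1} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) + a_{n/2}\, x^{2^{n/2}} = 0 \right\}, & \text{for } n \text{ even},\\[4pt]
\left\{ x \in \mathbb{F}_{2^n};\ \sum_{k=1}^{\frac{n-1}{2}} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) = 0 \right\}, & \text{for } n \text{ odd}.
\end{cases}
\]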

Hou has observed a useful property for evaluating the size of Ef , and therefore the
nonlinearity, of any quadratic Boolean function in trace form:

Proposition 58 [625] Let f be any quadratic n-variable function in the form (5.15), with
q = 0. Denoting by K the maximal value of k such that $a_k \neq 0$, $|E_f|$ equals the degree of
the following polynomial:
\[
\gcd\left( \left( \sum_{k=1}^{\lfloor (n-1)/2\rfloor} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) \right)^{2^K},\ x^{2^n} + x \right).
\]

Indeed, $x^{2^n} + x$ splits completely over $\mathbb{F}_{2^n}$ and $|E_f|$ equals the number of solutions in $\mathbb{F}_{2^n}$
of the equation $\sum_{k=1}^{\lfloor (n-1)/2\rfloor} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) = 0$, which has no repeated root, because
its derivative (as a polynomial) has no common zero with the equation.
Let us see now how the method we introduced works in univariate representation.
According to Proposition 57, and taking for inner product a · x = trn (ax), we have the
following:

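Proposition 59 Let f(x) be any quadratic function. Let
\[
\mathrm{tr}_n\left( \sum_{k=1}^{\lfloor (n-1)/2\rfloor} a_k x^{2^k+1} \right) + q(x)
\]
be its trace form, where q(x) is defined in Relation (5.16). Let
\[
P_f(x) = \sum_{k=1}^{\lfloor (n-1)/2\rfloor} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) \quad \text{if } n \text{ is odd}, \tag{5.17}
\]
and
\[
P_f(x) = \sum_{k=1}^{\lfloor (n-1)/2\rfloor} \left( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \right) + a_{n/2}\, x^{2^{n/2}} \quad \text{if } n \text{ is even}. \tag{5.18}
\]
Let b be any element of $\mathbb{F}_{2^n}$ such that $W_f(b)$ is nonzero. For every $a \in \mathbb{F}_{2^n}$, $W_f(a + b)$ is
nonzero if and only if there exists $x \in \mathbb{F}_{2^n}$ such that $a = P_f(x)$ and we have then
\[
W_f(a + b) = (-1)^{f(x)\oplus f(0)\oplus b\cdot x}\, W_f(b).
\]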

Remark. The observation that Ef⊥ is at the same time equal to {a ∈ F2n ; ∃x ∈ F2n ; ∀y ∈
F2n , a · y = βf (x, y)} and to {a ∈ F2n ; ∀y ∈ Ef , a · y = 0} gives, when applied to function
f in (5.15), that Pf (a) = 0 if and only if there exists x ∈ F2n such that a = Pf (x),
where Pf is defined by (5.17) (resp. (5.18)). This gives a parameterized form of the set of
solutions.

Particular classes of quadratic functions

Particular quadratic Boolean functions were
successfully investigated in the 1970s. For some of them, the explicit Walsh transform
could be given in a rather simple statement. This began with Kerdock [689] when he
constructed the so-called Kerdock codes (but the question of the sign was not posed, because
his code is a union of cosets of the first-order Reed–Muller code, and two complementary
functions f and $f \oplus 1$ have opposite Walsh transforms). Then Carlitz showed in [332] the
following equalities on so-called cubic sums $\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(wx^3+ux)}$, $w \neq 0$ (this name
being a reference to the polynomial degree of the functions, not to their algebraic degree,
which is 2):
• Let n be an odd integer and $u \in \mathbb{F}_{2^n}$. For $\mathrm{tr}_n(u) = 1$, we denote by $\gamma$ any element
in $\mathbb{F}_{2^n}$ such that $u = \gamma^4 + \gamma + 1$. We have
\[
\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x^3+ux)} =
\begin{cases}
(-1)^{\mathrm{tr}_n(\gamma^3+\gamma)} \left(\frac{2}{n}\right) 2^{\frac{n+1}{2}} & \text{when } \mathrm{tr}_n(u) = 1,\\
0 & \text{when } \mathrm{tr}_n(u) = 0,
\end{cases}
\]
where $\left(\frac{2}{n}\right)$ denotes the Jacobi symbol that equals $(-1)^{\frac{n^2-1}{8}}$ when n is odd. If we know the
sign of the Walsh transform at 1, this can be deduced from Propositions 55 and 59, after
observing that the linear kernel of function $\mathrm{tr}_n(x^3)$ equals $\{x \in \mathbb{F}_{2^n};\ x^2 + x^{2^{n-1}} = 0\} =
\{0\} \cup \{x \in \mathbb{F}_{2^n};\ x^3 = 1\}$, which equals $\mathbb{F}_2$ since n is odd. The additional information we
have thanks to Carlitz is the sign of the Walsh transform at 1.
The value of $\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(wx^3+ux)}$ equals $\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n\left(x^3 + \frac{u}{w^{1/3}}x\right)}$ (by the
change of variable $x \mapsto \frac{x}{w^{1/3}}$; note that since n is odd, function $x^3$ is a permutation
of $\mathbb{F}_{2^n}$; we denote the inverse function by $x^{\frac{1}{3}}$; the value of $\frac{1}{3}$ can be found in [731, 907]).
• Let n be an even integer. Then we have two cases according to whether w is a cube or
not:
– If $w \neq 0$ is a cube, say $w = v^3$, then for $\mathrm{tr}^n_2(uv^{-1}) = 0$, we denote by $\gamma_0$ any
element in $\mathbb{F}_{2^n}$ such that $\gamma_0^4 + \gamma_0 = u^2v^{-2}$. We have
\[
\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(wx^3+ux)} =
\begin{cases}
(-1)^{\frac{n}{2}+1+\mathrm{tr}_n(\gamma_0^3)}\ 2^{\frac{n}{2}+1} & \text{when } \mathrm{tr}^n_2(uv^{-1}) = 0,\\
0 & \text{when } \mathrm{tr}^n_2(uv^{-1}) \neq 0.
\end{cases}
\]
If we know the sign of the Walsh transform at 0, this is deduced from Propositions
55 and 59 after observing that the linear kernel of function $\mathrm{tr}_n(wx^3)$ equals $\{x \in
\mathbb{F}_{2^n};\ wx^2 + (wx)^{2^{n-1}} = 0\} = \{0\} \cup \{x \in \mathbb{F}_{2^n};\ wx^3 = 1\} = v^{-1}\mathbb{F}_4$.
– If w is not a cube, then let $\gamma_1$ be the unique element in $\mathbb{F}_{2^n}$ such that $w^2\gamma_1^4 + w\gamma_1 =
u^2$. Such $\gamma_1$ exists and is unique because the linear function $\gamma \mapsto w^2\gamma^4 + w\gamma$ has a
trivial kernel (the linear kernel of function $\mathrm{tr}_n(wx^3)$ equals $\{0\}$ since w is not a cube)
and is then bijective. Then we have
\[
\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(wx^3+ux)} = (-1)^{\frac{n}{2}+\mathrm{tr}_n(w\gamma_1^3)}\ 2^{\frac{n}{2}}.
\]

Coulter [384, 385] and Dillon–Dobbertin [448] generalized Carlitz's results to exponents
of the form $2^k + 1$ instead of 3. Their results can be deduced from Proposition 59 as well. To
illustrate how, let us assume that n is odd and gcd(k, n) = 1. Then $x^{2^k+1}$ is a permutation.
We can then reduce ourselves to the sums $\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x^{2^k+1}+ux)}$. The linear kernel of
quadratic function $\mathrm{tr}_n(x^{2^k+1})$ has equation $x^{2^k} + x^{2^{-k}} = 0$, that is, $x^{2^{2k}} + x = 0$, which has
for solutions in $\mathbb{F}_{2^n}$ the elements of $\mathbb{F}_{2^{\gcd(2k,n)}} = \mathbb{F}_2$, and $\mathrm{tr}_n(x^{2^k+1} + ux)$ is then balanced
if and only if $\mathrm{tr}_n(u) = 0$. We assume then $\mathrm{tr}_n(u) = 1$, and there exists a such that
$u = 1 + a^{2^k} + a^{2^{-k}}$. Then since $\mathrm{tr}_n((x + a)^{2^k+1}) = \mathrm{tr}_n(x^{2^k+1} + (a^{2^k} + a^{2^{-k}})\,x + a^{2^k+1})$,
the change of variable $x \mapsto x + a$ in the sum $\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x^{2^k+1}+x)}$ gives
$\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x^{2^k+1}+ux)} = (-1)^{\mathrm{tr}_n(a^{2^k+1}+a)} \sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x^{2^k+1}+x)}$.
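
The balancedness criterion for these sums can be checked over a small field. The following sketch works in $\mathbb{F}_{2^5}$; the representation of the field by the irreducible polynomial $x^5 + x^2 + 1$, the integer encoding of field elements, and all function names are assumptions made for the illustration. It checks only that the sum vanishes exactly when $\mathrm{tr}_n(u) = 0$ and has absolute value $2^{(n+1)/2}$ otherwise; it does not address the sign.

```python
N = 5
MOD = 0b100101            # x^5 + x^2 + 1, irreducible over F_2 (assumed field representation)

def gf_mul(a, b):
    """Product in F_{2^N}; elements are integers whose bits are polynomial coefficients."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:        # degree reached N: reduce modulo MOD
            a ^= MOD
    return r

def gf_pow(a, e):
    """a^e in F_{2^N} by square-and-multiply."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(a):
    """Absolute trace tr_n(a) = a + a^2 + ... + a^(2^(N-1)); the result is 0 or 1."""
    t, s = 0, a
    for _ in range(N):
        t ^= s
        s = gf_mul(s, s)
    return t

def gold_sum(k, u):
    """sum over x in F_{2^N} of (-1)^tr(x^(2^k+1) + u x)."""
    e = 2 ** k + 1
    return sum((-1) ** trace(gf_pow(x, e) ^ gf_mul(u, x)) for x in range(1 << N))

for k in (1, 2):          # gcd(k, 5) = 1 in both cases
    values = {(trace(u), abs(gold_sum(k, u))) for u in range(1 << N)}
    print(k, sorted(values))
```

Any other irreducible polynomial of degree 5 would give an isomorphic field and the same multiset of sums.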
As we wrote, much work has been done on the Walsh transform of quadratic functions in
univariate form. We state further results without sketching their proofs.
For n odd, the quadratic functions of nonlinearity $2^{n-1} - 2^{\frac{n-1}{2}}$ (called semi-bent functions
or near-bent functions; their extended Walsh spectra only contain values 0 and $2^{\frac{n+1}{2}}$; see
Section 6.2) of the form $\mathrm{tr}_n\left(\sum_{i=1}^{(n-1)/2} c_i x^{2^i+1}\right)$ have been studied by Khoo et al. [697, 699].
The study of such functions is simplified when all coefficients $c_i$ belong to $\mathbb{F}_2$ since the
linearized polynomial $\sum_{i=1}^{(n-1)/2} \left(c_i x^{2^i} + (c_i x)^{2^{n-i}}\right) = \sum_{i=1}^{(n-1)/2} c_i \left(x^{2^i} + x^{2^{n-i}}\right)$ is then a
2-polynomial over $\mathbb{F}_2$ (see page 490), and its study can be done through its 2-associate
polynomial $c(x) = \sum_{i=1}^{(n-1)/2} c_i (x^i + x^{n-i})$, more precisely, its gcd with $x^n + 1$ (e.g., near-bentness
is equivalent to $\gcd(c(x), x^n + 1) = x + 1$), and the factorization of $x^n + 1$ (see
[697, 699], and see more in [19, 510, 672, 840, 841]). If n and 2n + 1 are primes, the
function is near-bent for all non-all-zero $c_i$. This study has been generalized to n even by
Charpin et al. in [355] ($\gcd(c(x), x^n + 1) = x + 1$ is then replaced by $\gcd(c(x), x^n + 1) =
x^2 + 1$) and nonquadratic bent functions have been deduced by concatenation of such near-bent
functions. Further functions of this kind have been given and studied in [629, 672, 699,
840, 841].
The sign of the values of the Walsh transform of AB Gold and Kasami functions (see
pages 206 and 230) is studied in [734]. The former are quadratic (the latter are not, but they
are related to quadratic functions). In [548], the result of [734] is generalized: for every AB
power function $x^d$ over $\mathbb{F}_{2^n}$ whose restriction to any subfield of $\mathbb{F}_{2^n}$ is also AB, the value
$\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x^d+x)}$ equals $2^{\frac{n+1}{2}}$ if $n \equiv \pm 1$ [mod 8] and $-2^{\frac{n+1}{2}}$ if $n \equiv \pm 3$ [mod 8]. In
[383], the authors studied the Walsh transform values of the functions $\mathrm{tr}_n(x^{2^a+1} + x^{2^b+1})$,
gcd(b − a, n) = gcd(b + a, n) = 1.
X. Hou in [625] has been able to address whole subclasses of quadratic functions (and
even more since he could view such functions over every field extension of $\mathbb{F}_{2^n}$). With the
method of calculating $W_f(0)W_f(a)$, he determined the Walsh transform of any quadratic
function whose trace form involves exponents of the form $2^k + 1$, where k has fixed
2-valuation.
X. Zhang et al. in [1165] use that, given a linear function $L(x) = \sum_{k=0}^{n-1} a_k x^{2^k} \in \mathbb{F}_{2^n}[x]$,
and denoting $L^*(x) = \sum_{k=0}^{n-1} a_k^{2^{n-k}} x^{2^{n-k}}$, we have $\mathrm{tr}_n(xL(y)) = \mathrm{tr}_n(y\,L^*(x))$ for every $x, y \in
\mathbb{F}_{2^n}$. For every linear permutation L and linear function $L'$, we have $\mathrm{tr}_n\left(x\,(L^* \circ L' \circ L(x))\right) =
\mathrm{tr}_n\left(L(x)\,(L' \circ L(x))\right)$, and then
\[
\sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(x(L^*\circ L'\circ L(x)))} = \sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(L(x)\,(L'\circ L(x)))} = \sum_{x\in\mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(xL'(x))}.
\]
The functions $f(x) = \mathrm{tr}_n\left(x\,(L^* \circ L' \circ L(x))\right)$ and $g(x) = \mathrm{tr}_n\left(x\,L'(x)\right)$ satisfy then
F(f) = F(g). This provides in fact an equivalence relation between quadratic functions,
which preserves the mapping $f \mapsto F(f)$.

5.2.2 Concatenations of quadratic functions


Concatenated quadratic functions (instead of affine functions) generalize the Maiorana–
McFarland construction. These functions are a little harder to study than Maiorana–
McFarland’s functions, but they are more numerous and they avoid the property of null
second-order derivatives seen in Proposition 54, page 167, which may be a cryptographic
weakness. There are at least two classes:
• A first class [221] is built on the Dickson form of quadratic functions:
\[
f_{\psi,\phi,g}(x, y) = \bigoplus_{i=1}^{t} x_{2i-1}x_{2i}\,\psi_i(y) \oplus x \cdot \phi(y) \oplus g(y), \tag{5.19}
\]
with $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$, where n = r + s, $t = \lfloor r/2\rfloor$, and where $\psi : \mathbb{F}_2^s \to \mathbb{F}_2^t$, $\phi : \mathbb{F}_2^s \to \mathbb{F}_2^r$
and $g : \mathbb{F}_2^s \to \mathbb{F}_2$ can be chosen arbitrarily.
The size of this class roughly equals $2^{(t+r+1)2^s}$. The Walsh transform is easily
deduced from the observation that, for every quadratic Boolean function of the form
$f(x) = \bigoplus_{i=1}^{t} u_i x_{2i-1}x_{2i} \oplus \bigoplus_{j=1}^{2t} v_j x_j \oplus c$, where $u_i, v_j, c \in \mathbb{F}_2$, $x \in \mathbb{F}_2^{2t}$, and for
every element a of $\mathbb{F}_2^{2t}$, if there exists i = 1, ..., t such that $u_i = 0$ and $v_{2i-1} \neq
a_{2i-1}$ or $v_{2i} \neq a_{2i}$, then we have $W_f(a) = 0$, and otherwise, $W_f(a)$ is equal to
$2^{2t-w_H(u)}(-1)^{\bigoplus_{i=1}^{t} (v_{2i-1}\oplus a_{2i-1})(v_{2i}\oplus a_{2i})\oplus c}$, where $u = (u_1, \dots, u_t)$. This implies that, for
every function f of the form (5.19), for every $a \in \mathbb{F}_2^r$ and every $b \in \mathbb{F}_2^s$, we have
\[
W_{f_{\psi,\phi,g}}(a, b) = \sum_{y\in E_a} 2^{r-w_H(\psi(y))} (-1)^{\bigoplus_{i=1}^{t} (\phi_{2i-1}(y)\oplus a_{2i-1})(\phi_{2i}(y)\oplus a_{2i})\oplus g(y)\oplus y\cdot b},
\]
where $E_a$ is the superset of $\phi^{-1}(a)$ equal if r is even to
\[
\left\{ y \in \mathbb{F}_2^s \ /\ \forall i \le t,\ \psi_i(y) = 0 \Rightarrow (\phi_{2i-1}(y) = a_{2i-1} \text{ and } \phi_{2i}(y) = a_{2i}) \right\},
\]
and if r is odd to
\[
\left\{ y \in \mathbb{F}_2^s \ /\ \begin{array}{l} \forall i \le t,\ \psi_i(y) = 0 \Rightarrow (\phi_{2i-1}(y) = a_{2i-1} \text{ and } \phi_{2i}(y) = a_{2i}) \\ \phi_r(y) = a_r \end{array} \right\}.
\]

• A second class [317] has for elements the concatenations of quadratic functions of rank
at most 2, of the form
\[
f_{\phi_1,\phi_2,\phi_3,g}(x, y) = (x \cdot \phi_1(y))\,(x \cdot \phi_2(y)) \oplus x \cdot \phi_3(y) \oplus g(y), \tag{5.20}
\]
with $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$, where $\phi_1$, $\phi_2$ and $\phi_3$ are three functions from $\mathbb{F}_2^s$ into $\mathbb{F}_2^r$ and g
is any Boolean function on $\mathbb{F}_2^s$. The size of this class roughly equals $2^{(3r+1)2^s}$ (the exact
number, which is unknown, is smaller since a function can be represented in this form in
several ways) and is larger than for the first class.
The Walsh transform is deduced from the fact that, for every positive integer r and
every Boolean function f on $\mathbb{F}_2^r$ of the form $(u \cdot x)(v \cdot x) \oplus w \cdot x$; $u, v, w \in \mathbb{F}_2^r$:
– If u and v are $\mathbb{F}_2$-linearly independent (i.e., $u \neq 0_r$, $v \neq 0_r$ and $u \neq v$), then f
is balanced if and only if w is outside the vector space $\langle u, v\rangle = \{0_r, u, v, u + v\}$
spanned by u and v, and otherwise, if $w \in \{0_r, u, v\}$, then $\sum_{x\in\mathbb{F}_2^r} (-1)^{f(x)}$ equals
$2^{r-1}$, and if w = u + v, then it equals $-2^{r-1}$.
– If u and v are $\mathbb{F}_2$-linearly dependent, then if we have $w = 0_r$ and ($u = 0_r$ or $v = 0_r$),
or if we have u = v = w, then $\sum_{x\in\mathbb{F}_2^r} (-1)^{f(x)}$ equals $2^r$; otherwise, $\sum_{x\in\mathbb{F}_2^r} (-1)^{f(x)}$
is null.

We deduce that for any function $f_{\phi_1,\phi_2,\phi_3,g}$ of the form (5.20) with $\phi_2(y) \neq 0_r$ for
every $y \in \mathbb{F}_2^s$, denoting by E the set of all $y \in \mathbb{F}_2^s$ such that the vectors $\phi_1(y)$ and
$\phi_2(y)$ are $\mathbb{F}_2$-linearly independent, for every $a \in \mathbb{F}_2^r$ and every $b \in \mathbb{F}_2^s$, $W_{f_{\phi_1,\phi_2,\phi_3,g}}(a, b)$
equals
\[
2^{r-1} \sum_{\substack{y\in E;\\ \phi_3(y)+a\in\{0_r,\phi_1(y),\phi_2(y)\}}} (-1)^{g(y)\oplus b\cdot y}\ -\ 2^{r-1} \sum_{\substack{y\in E;\\ \phi_3(y)+a=\phi_1(y)+\phi_2(y)}} (-1)^{g(y)\oplus b\cdot y}\ +\ 2^{r} \sum_{\substack{y\in\mathbb{F}_2^s\setminus E;\\ \phi_3(y)+a=\phi_1(y)}} (-1)^{g(y)\oplus b\cdot y}.
\]
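
The elementary fact used for the second class, on the character sums of the functions $(u \cdot x)(v \cdot x) \oplus w \cdot x$, can be confirmed by exhaustive search. This is a minimal sketch assuming vectors of $\mathbb{F}_2^r$ are encoded as integers and "·" is the usual inner product; the function names are hypothetical.

```python
def parity(v):
    """Sum modulo 2 of the bits of v."""
    return bin(v).count("1") & 1

def character_sum(u, v, w, r):
    """sum over x in F_2^r of (-1)^((u.x)(v.x) XOR w.x), by exhaustive summation."""
    return sum((-1) ** ((parity(u & x) & parity(v & x)) ^ parity(w & x))
               for x in range(1 << r))

def predicted_sum(u, v, w, r):
    """Value given by the case distinction on u, v, w stated in the text."""
    independent = u != 0 and v != 0 and u != v
    if independent:
        if w in (0, u, v):
            return 2 ** (r - 1)
        if w == u ^ v:
            return -(2 ** (r - 1))
        return 0                     # w outside <u, v>: the function is balanced
    if (w == 0 and (u == 0 or v == 0)) or (u == v == w):
        return 2 ** r
    return 0

r = 3
mismatches = [(u, v, w)
              for u in range(1 << r) for v in range(1 << r) for w in range(1 << r)
              if character_sum(u, v, w, r) != predicted_sum(u, v, w, r)]
print(len(mismatches))
```

Summing these values over y, weighted by $(-1)^{g(y) \oplus b \cdot y}$ and with $(u, v, w) = (\phi_1(y), \phi_2(y), \phi_3(y) + a)$, gives the three-term expression of the Walsh transform.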

5.3 Cubic functions


The Hamming weights and the Walsh spectra of nonquadratic cubic Boolean functions (i.e.,
of the codewords in RM(3, n) \ RM(2, n)) behave in a much less peculiar way than quadratic
functions.7 This has been shown in [210] as follows. Let $f_1$, $f_2$ and $f_3$ be any Boolean
functions on $\mathbb{F}_2^n$. Define the function on $\mathbb{F}_2^{n+2}$: $f(x, y_1, y_2) = y_1y_2 \oplus y_1f_1(x) \oplus y_2f_2(x) \oplus
f_3(x)$. Then we have
\[
F(f) = \sum_{x\in\mathbb{F}_2^n;\ y_1,y_2\in\mathbb{F}_2} (-1)^{(y_1\oplus f_2(x))(y_2\oplus f_1(x))\oplus f_1(x)f_2(x)\oplus f_3(x)}
\]
\[
= \sum_{x\in\mathbb{F}_2^n;\ y_1,y_2\in\mathbb{F}_2} (-1)^{y_1y_2\oplus f_1(x)f_2(x)\oplus f_3(x)} = 2 \sum_{x\in\mathbb{F}_2^n} (-1)^{f_1(x)f_2(x)\oplus f_3(x)}.
\]

7 Except that, according to McEliece's theorem, the Hamming weights are divisible by $2^{\lceil n/3\rceil-1}$ and the Walsh
transform values are divisible by $2^{\lceil n/3\rceil}$.

So, starting with a function $g = f_1f_2 \oplus f_3$, we can relate F(g) to F(f), in two more
variables, in which the term $f_1f_2$ has been replaced by $y_1y_2 \oplus y_1f_1(x) \oplus y_2f_2(x)$. Applying
this repeatedly ("breaking" this way all the monomials of degrees at least 4), this shows that,
for every Boolean function g on $\mathbb{F}_2^n$, there exists an integer m and a Boolean function f of
algebraic degree at most 3 on $\mathbb{F}_2^{n+2m}$ whose Walsh transform takes the value $W_f(0_{n+2m}) =
2^m W_g(0_n)$ at zero. This proves that the functions of algebraic degree 3 can have Hamming
weights much more diverse than functions of degrees at most 2, since function g from which
we started can have for Hamming weight any integer between 0 and $2^n$, and then $W_g(0_n)$
can take any even value between $-2^n$ and $2^n$.
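
One step of this "breaking" construction can be illustrated as follows. The sketch assumes an integer encoding in which x occupies the low n bits and $y_1$, $y_2$ the next two bits; the functions $f_1 = x_1x_2$, $f_2 = x_3x_4$ and $f_3 = x_1 \oplus x_2x_3$ are hypothetical choices that make $g = f_1f_2 \oplus f_3$ quartic while the function built in two more variables is cubic.

```python
def F(h, nvars):
    """F(h) = sum over all inputs of (-1)^h, equal to 2^nvars - 2 * weight(h)."""
    return sum((-1) ** h(z) for z in range(1 << nvars))

def break_product(f1, f2, f3, n):
    """Return f(x, y1, y2) = y1y2 XOR y1 f1(x) XOR y2 f2(x) XOR f3(x) on n + 2 variables."""
    mask = (1 << n) - 1
    def f(z):
        x, y1, y2 = z & mask, (z >> n) & 1, (z >> (n + 1)) & 1
        return (y1 & y2) ^ (y1 & f1(x)) ^ (y2 & f2(x)) ^ f3(x)
    return f

def bit(x, i):
    """Coordinate x_i of x, with x_1 stored in the least significant bit."""
    return (x >> (i - 1)) & 1

n = 4
f1 = lambda x: bit(x, 1) & bit(x, 2)                 # x1x2
f2 = lambda x: bit(x, 3) & bit(x, 4)                 # x3x4
f3 = lambda x: bit(x, 1) ^ (bit(x, 2) & bit(x, 3))   # x1 XOR x2x3
g = lambda x: (f1(x) & f2(x)) ^ f3(x)                # quartic function f1 f2 XOR f3
f = break_product(f1, f2, f3, n)                     # cubic function in n + 2 variables

print(F(f, n + 2), 2 * F(g, n))
```

Repeating the step on each remaining monomial of degree at least 4 multiplies F by 2 each time, which gives the factor $2^m$ after m steps.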
Note, however, that the weights of some cubic functions (and even some quartic ones) are
easily determined. The weight of the product fg of two quadratic functions and of its sum
with any affine function can be deduced from $fg = \frac{f+g-(f\oplus g)}{2}$. And the Fourier–Hadamard
transform being $\mathbb{R}$-linear, we have $\widehat{fg} = \frac{\widehat{f}+\widehat{g}-\widehat{f\oplus g}}{2}$. This works, for instance, for $\sigma_3 = \sigma_1\sigma_2$,
where $\sigma_i$ is the ith elementary symmetric Boolean function. See also [745].

5.4 Indicators of flats


As we have already seen, a Boolean function f is the indicator of a flat A of codimension r
if and only if it has the form $f(x) = \prod_{i=1}^{r} (a_i \cdot x \oplus \epsilon_i)$, where $a_1, \dots, a_r \in \mathbb{F}_2^n$ are $\mathbb{F}_2$-linearly
independent and $\epsilon_1, \dots, \epsilon_r \in \mathbb{F}_2$. Then f has Hamming weight $2^{n-r}$. Moreover, for
any $a \in \mathbb{F}_2^n$, if a is $\mathbb{F}_2$-linearly independent of $a_1, \dots, a_r$, then the function $f(x) \oplus a \cdot x$
is balanced (and hence $W_f(a) = 0$), since it is linearly equivalent to a function of the
form $g(x_1, \dots, x_r) \oplus x_{r+1}$. If a is $\mathbb{F}_2$-linearly dependent of $a_1, \dots, a_r$, say $a = \sum_{i=1}^{r} \eta_i a_i$
with $\eta_i \in \mathbb{F}_2$, then $a \cdot x$ takes constant value $\bigoplus_{i=1}^{r} \eta_i (a_i \cdot x) = \bigoplus_{i=1}^{r} \eta_i (\epsilon_i \oplus 1)$ on the flat;
hence, $\widehat{f}(a) = \sum_{x\in A} (-1)^{a\cdot x}$ equals $2^{n-r}(-1)^{\bigoplus_{i=1}^{r} \eta_i(\epsilon_i\oplus 1)}$. Thus, if $a = \sum_{i=1}^{r} \eta_i a_i \neq 0_n$,
then we have $W_f(a) = -2^{n-r+1}(-1)^{\bigoplus_{i=1}^{r} \eta_i(\epsilon_i\oplus 1)}$; and we have $W_f(0_n) = 2^n - 2|A| =
2^n - 2^{n-r+1}$.
Note that the nonlinearity of f equals $2^{n-r}$ and is bad as soon as $r \ge 2$. But indicators
of flats can be used to design Boolean functions with good nonlinearities, by concatenating
sums of indicators of flats and of affine functions; see below.
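
The Walsh spectrum of the indicator of a flat can be compared with a direct computation. The sketch below assumes vectors are encoded as integers and uses the hypothetical data n = 4, $a_1 = (1,1,0,0)$, $a_2 = (0,1,1,0)$, $\epsilon = (0, 1)$; the helper names are illustrative.

```python
def parity(v):
    """Sum modulo 2 of the bits of v."""
    return bin(v).count("1") & 1

def flat_indicator(A, eps):
    """f(x) = product over i of (a_i.x XOR eps_i), for a list A of independent vectors."""
    return lambda x: int(all(parity(a & x) ^ e for a, e in zip(A, eps)))

def walsh_at(f, n, a):
    """W_f(a) = sum_x (-1)^(f(x) XOR a.x)."""
    return sum((-1) ** (f(x) ^ parity(a & x)) for x in range(1 << n))

def predicted_walsh(a, n, A, eps):
    """Value of W_f(a) according to the case distinction in the text."""
    r = len(A)
    for eta in range(1 << r):                  # look for a = sum of eta_i a_i
        combo = 0
        for i in range(r):
            if (eta >> i) & 1:
                combo ^= A[i]
        if combo == a:
            if a == 0:
                return 2 ** n - 2 ** (n - r + 1)
            s = sum(((eta >> i) & 1) & (eps[i] ^ 1) for i in range(r)) & 1
            return -(2 ** (n - r + 1)) * (-1) ** s
    return 0                                   # a is independent of a_1, ..., a_r

n, A, eps = 4, [0b0011, 0b0110], [0, 1]
f = flat_indicator(A, eps)
weight = sum(f(x) for x in range(1 << n))
agree = all(walsh_at(f, n, a) == predicted_walsh(a, n, A, eps) for a in range(1 << n))
print(weight, agree)
```

The search over the $2^r$ combinations $\eta$ is adequate for small r; for larger r one would solve the linear system $a = \sum_i \eta_i a_i$ by Gaussian elimination instead.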

Remark. As recalled in Section 4.1, the functions of RM(r, n) whose weights occur in
the range [2n−r ; 2n−r+1 [ have been characterized by Kasami and Tokura [670]; any such
function is the product of the indicator of a flat and of a quadratic function or is the sum
(modulo 2) of two indicators of flats. The Walsh spectra of such functions can also be
precisely computed.

5.4.1 Concatenations of sums of indicators of flats and affine functions


Concatenating sums of indicators of flats and of affine functions gives another superclass,
studied in [226], of Maiorana–McFarland's class. The functions of this generalized class are
of the form
\[
f(x, y) = \prod_{i=1}^{t(y)} (x \cdot \phi_i(y) \oplus g_i(y) \oplus 1) \oplus x \cdot \phi(y) \oplus g(y); \quad (x, y) \in \mathbb{F}_2^r \times \mathbb{F}_2^s, \tag{5.21}
\]
where t is a function from $\mathbb{F}_2^s$ into $\{0, 1, \dots, r\}$, and where $\phi_1, \dots, \phi_r, \phi$ are functions
from $\mathbb{F}_2^s$ into $\mathbb{F}_2^r$ such that, for every $y \in \mathbb{F}_2^s$, the vectors $\phi_1(y), \dots, \phi_{t(y)}(y)$ are linearly
independent; $g_1, \dots, g_r$ and g are Boolean functions on $\mathbb{F}_2^s$.
Let f be defined by (5.21). For every $a \in \mathbb{F}_2^r$ and every $b \in \mathbb{F}_2^s$, we have
\[
W_f(a, b) = 2^r \sum_{y\in\phi^{-1}(a)} (-1)^{g(y)\oplus b\cdot y}\ -\ \sum_{y\in F_a} 2^{r-t(y)+1} (-1)^{g(y)\oplus b\cdot y\oplus\bigoplus_{i=1}^{t(y)} \eta_i(a,y)\, g_i(y)},
\]
where $F_a$ is the set of all the vectors y of the space $\mathbb{F}_2^s$ such that a belongs to the flat $\phi(y) +
\langle \phi_1(y), \dots, \phi_{t(y)}(y)\rangle$ (by convention equal to $\{\phi(y)\}$ if t(y) = 0), and where $\eta_i(a, y)$ is
defined (with uniqueness) for every $i \le t(y)$ by the relation $a + \phi(y) = \sum_{i=1}^{t(y)} \eta_i(a, y)\,\phi_i(y)$.
The cryptographic parameters of such functions are studied in [226, section 5].

5.5 Functions admitting (partial) covering sequences


5.5.1 The case of Boolean functions
The notion of covering sequence of a Boolean function has been introduced in [326].

Definition 47 Let f be an n-variable Boolean function. An integer-valued 8 sequence



(λa )a∈Fn2 is called a covering sequence of f if the integer-valued function a∈Fn λa Da f (x)
2
takes a constant value. This constant value is called the level of the covering sequence. If
the level is nonzero, we say that the covering sequence is a nontrivial covering sequence.

For instance, any balanced quadratic function admits a nontrivial atomic covering
sequence (see page 171). Note that the sum a∈Fn2 λa Da f (x) involves both kinds of

additions: the addition in Z and the addition ⊕ in F2 (which is concealed inside Da f ).
It has been shown in [326] that any function admitting a nontrivial covering sequence is
balanced (see Proposition 61 below for a proof) and that any balanced function admits the
constant sequence 1 as covering sequence (the level of this sequence is 2n−1 ).
A characterization of covering sequences by means of the Walsh transform was also given
in [326]: denote again by supp(Wf ) the support {u ∈ Fn2 | Wf (u) = 0} of Wf ; then:

Proposition 60 Let $f$ be any $n$-variable Boolean function and $\lambda=(\lambda_a)_{a\in\mathbb{F}_2^n}$ an integer-valued sequence. Then $f$ admits $\lambda$ as covering sequence if and only if the Fourier–Hadamard transform $\widehat{\lambda}$ of the function $a\mapsto\lambda_a$ takes a constant value on $\mathrm{supp}(W_f)$. This constant value is $\sum_{a\in\mathbb{F}_2^n}\lambda_a-2\rho$, where $\rho$ is the level of the covering sequence.

Proof Replacing $D_af(x)$ by $\frac12-\frac12(-1)^{D_af(x)}=\frac12-\frac12(-1)^{f(x)}(-1)^{f(x+a)}$ in the equality $\sum_{a\in\mathbb{F}_2^n}\lambda_aD_af(x)=\rho$, we see that $f$ admits the covering sequence $\lambda$ with level $\rho$ if and only if, for every $x\in\mathbb{F}_2^n$, we have
\[
\sum_{a\in\mathbb{F}_2^n}\lambda_a(-1)^{f(x+a)}=\Big(\sum_{a\in\mathbb{F}_2^n}\lambda_a-2\rho\Big)(-1)^{f(x)}.
\]
These two integer-valued functions are equal if and only

8 or real-valued, or complex-valued; but taking real or complex sequences instead of integer-valued ones has no
practical sense.

if their Fourier–Hadamard transforms are equal to each other, that is, if for every $b\in\mathbb{F}_2^n$, the sum $\sum_{a,x\in\mathbb{F}_2^n}\lambda_a(-1)^{f(x+a)\oplus x\cdot b}$, which by changing $x$ into $x+a$ equals $\big(\sum_{a\in\mathbb{F}_2^n}\lambda_a(-1)^{a\cdot b}\big)W_f(b)=\widehat{\lambda}(b)\,W_f(b)$, equals $\big(\sum_{a\in\mathbb{F}_2^n}\lambda_a-2\rho\big)W_f(b)$. The characterization follows.
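
The characterization of Proposition 60 can be checked numerically on a small example. The sketch below is only an illustration: the helper names (`dot`, `walsh`, `covering_level`) and the sample function $f(x)=x_1\oplus x_2x_3$ are assumptions made here, not taken from [326]; vectors of $\mathbb{F}_2^n$ are encoded as integers whose bits are the coordinates.

```python
def dot(u, x):
    """Usual inner product u.x in F_2, with vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def walsh(f, n):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x) of a truth table f."""
    return [sum((-1) ** (f[x] ^ dot(u, x)) for x in range(1 << n))
            for u in range(1 << n)]

def covering_level(f, lam, n):
    """Return the level rho if lam is a covering sequence of f, else None."""
    values = {sum(lam[a] * (f[x] ^ f[x ^ a]) for a in range(1 << n))
              for x in range(1 << n)}
    return values.pop() if len(values) == 1 else None

n = 3
# Hypothetical example: f(x) = x1 XOR x2*x3, a balanced function.
f = [(x & 1) ^ (((x >> 1) & 1) & ((x >> 2) & 1)) for x in range(1 << n)]
lam = [1] * (1 << n)                      # the constant sequence 1
rho = covering_level(f, lam, n)           # level of this sequence
W = walsh(f, n)
support = [u for u in range(1 << n) if W[u] != 0]
# Fourier-Hadamard transform of a -> lam_a
lam_hat = [sum(lam[a] * (-1) ** dot(a, u) for a in range(1 << n))
           for u in range(1 << n)]
constant = sum(lam) - 2 * rho             # value predicted by Proposition 60
on_support = all(lam_hat[u] == constant for u in support)
print(rho, constant, on_support)
```

Here the constant sequence 1 has level $2^{n-1}$, and its transform vanishes at every nonzero point, which is consistent with $0_n$ lying outside the Walsh support of a balanced function.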

Any Boolean function $f$ on $\mathbb{F}_2^n$ is balanced (i.e., satisfies $0_n\notin\mathrm{supp}(W_f)$) if and only if it admits at least one nontrivial covering sequence: the condition is clearly sufficient according to Proposition 60 (since $\widehat{\lambda}(0_n)=\sum_{a\in\mathbb{F}_2^n}\lambda_a$ and $\rho\neq0$), and it is also necessary since the constant sequence 1 is a covering sequence for all balanced functions. See more in [308].
We shall see in Chapter 7 that covering sequences play a role with respect to correlation
immunity and resiliency. But knowing a covering sequence for f gives no information on
the nonlinearity of f , since it gives only information on the support of the Walsh transform,
not on the nonzero values it takes. In [231], the author weakens the definition of covering
sequence, so that it can help computing the (nonzero) values of the Walsh transform.

Definition 48 Let $f$ be a Boolean function on $\mathbb{F}_2^n$. A partial covering sequence for $f$ is a sequence $(\lambda_a)_{a\in\mathbb{F}_2^n}$ such that $\sum_{a\in\mathbb{F}_2^n}\lambda_aD_af(x)$ takes two values $\rho$ and $\rho'$ (distinct or not) called the levels of the sequence. The partial covering sequence is called nontrivial if one of the constants is nonzero.

The interest of nontrivial partial covering sequences is that they give information on the
Hamming weight and the Walsh transform.

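Proposition 61 Let $(\lambda_a)_{a\in\mathbb{F}_2^n}$ be a partial covering sequence of a Boolean function $f$, of levels $\rho$ and $\rho'$.
Let $A=\{x\in\mathbb{F}_2^n;\ \sum_{a\in\mathbb{F}_2^n}\lambda_aD_af(x)=\rho'\}$ (assuming that $\rho'\neq\rho$; otherwise, when $\lambda$ is in fact a covering sequence of level $\rho$, we set $A=\emptyset$).
Then, for every vector $b\in\mathbb{F}_2^n$, we have
\[
\big(\widehat{\lambda}(b)-\widehat{\lambda}(0_n)+2\rho\big)\,W_f(b)=2\,(\rho-\rho')\sum_{x\in A}(-1)^{f(x)\oplus b\cdot x}.
\]
Hence, if $\rho\neq0$, we have
\[
2^n-2w_H(f)=W_f(0_n)=\Big(1-\frac{\rho'}{\rho}\Big)\sum_{x\in A}(-1)^{f(x)}.
\]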

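Proof By definition, we have, for every $x\in\mathbb{F}_2^n$:
\[
\sum_{a\in\mathbb{F}_2^n}\lambda_aD_af(x)=\rho'\,1_A(x)+\rho\,1_{A^c}(x)
\]
and therefore
\[
\sum_{a\in\mathbb{F}_2^n}\lambda_a(-1)^{D_af(x)}=\sum_{a\in\mathbb{F}_2^n}\lambda_a\big(1-2D_af(x)\big)=\sum_{a\in\mathbb{F}_2^n}\lambda_a-2\rho'\,1_A(x)-2\rho\,1_{A^c}(x).
\]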

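We deduce:
\[
\sum_{a\in\mathbb{F}_2^n}\lambda_a(-1)^{f(x+a)}=(-1)^{f(x)}\Big(\sum_{a\in\mathbb{F}_2^n}\lambda_a-2\rho'\,1_A(x)-2\rho\,1_{A^c}(x)\Big). \tag{5.22}
\]
We have already seen that the Fourier–Hadamard transform of the function $(-1)^{f(x+a)}$ maps every vector $b\in\mathbb{F}_2^n$ to the value $(-1)^{a\cdot b}W_f(b)$. Hence, taking the Fourier–Hadamard transform of both terms of equality (5.22), we get
\[
\Big(\sum_{a\in\mathbb{F}_2^n}\lambda_a(-1)^{a\cdot b}\Big)W_f(b)=\Big(\sum_{a\in\mathbb{F}_2^n}\lambda_a\Big)W_f(b)-2\rho'\sum_{x\in A}(-1)^{f(x)\oplus b\cdot x}-2\rho\sum_{x\in A^c}(-1)^{f(x)\oplus b\cdot x},
\]
that is,
\[
\widehat{\lambda}(b)\,W_f(b)=\widehat{\lambda}(0_n)\,W_f(b)-2\rho\,W_f(b)+2\,(\rho-\rho')\sum_{x\in A}(-1)^{f(x)\oplus b\cdot x}.
\]
Hence
\[
\big(\widehat{\lambda}(b)-\widehat{\lambda}(0_n)+2\rho\big)\,W_f(b)=2\,(\rho-\rho')\sum_{x\in A}(-1)^{f(x)\oplus b\cdot x}.
\]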
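
The identity of Proposition 61 lends itself to a brute-force check. The following Python sketch is a hedged illustration under stated assumptions: the function name `check_proposition_61`, the convention of taking the larger of the two values as $\rho$, and the two sample functions are choices made here for the example only.

```python
def dot(u, x):
    """Usual inner product in F_2 of two vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def check_proposition_61(f, lam, n):
    """Verify (lam_hat(b) - lam_hat(0) + 2 rho) W_f(b) = 2 (rho - rho') sum_{x in A} (-1)^(f(x)+b.x)."""
    N = 1 << n
    s = [sum(lam[a] * (f[x] ^ f[x ^ a]) for a in range(N)) for x in range(N)]
    levels = sorted(set(s))
    if len(levels) > 2:
        raise ValueError("lam is not a partial covering sequence of f")
    rho, rho_prime = levels[-1], levels[0]      # assumed labelling of the two levels
    A = [x for x in range(N) if s[x] == rho_prime] if rho != rho_prime else []

    def lam_hat(b):
        return sum(lam[a] * (-1) ** dot(a, b) for a in range(N))

    for b in range(N):
        W_b = sum((-1) ** (f[x] ^ dot(b, x)) for x in range(N))
        lhs = (lam_hat(b) - lam_hat(0) + 2 * rho) * W_b
        rhs = 2 * (rho - rho_prime) * sum((-1) ** (f[x] ^ dot(b, x)) for x in A)
        if lhs != rhs:
            return False
    return True

n = 3
# Example 1 (assumed): f = x1*x2*x3 with the all-one sequence (levels 7 and 1).
f_and = [1 if x == 7 else 0 for x in range(1 << n)]
result_all_ones = check_proposition_61(f_and, [1] * (1 << n), n)

# Example 2 (assumed): f = x1*x2 XOR x3 with the indicator of a = e1 (levels 1 and 0).
f_mix = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(1 << n)]
lam_single = [0] * (1 << n)
lam_single[1] = 1
result_single = check_proposition_61(f_mix, lam_single, n)
print(result_all_ones, result_single)
```

In the second example the sequence is the indicator of a single vector, so the sum reduces to one derivative $D_af$, which always takes at most two values.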

A simple example of nontrivial partial covering sequence is as follows: let $\mathcal{E}$ be any set of derivatives of $f$. Assume that $\mathcal{E}$ contains a nonzero function and is stable under addition (i.e., is a nontrivial $\mathbb{F}_2$-vector space). Then $\sum_{g\in\mathcal{E}}g$ takes on values 0 and $\frac{|\mathcal{E}|}{2}$. Thus, if $\mathcal{E}=\{D_af;\ a\in E\}$ (where we choose $E$ minimum, so that any two different vectors of the set $E$ give different functions of $\mathcal{E}$), then $1_E$ is a nontrivial partial covering sequence.

Corollary 10 Let $\mathcal{E}$ be any set of derivatives of an $n$-variable Boolean function $f$. Assume that $\mathcal{E}$ contains a nonzero function and is stable under addition (i.e., is a nontrivial $\mathbb{F}_2$-vector space). Then
\[
2^n-2w_H(f)=W_f(0_n)=\sum_{x\in A}(-1)^{f(x)}.
\]

See more in [231], with the notion of linear set of derivatives (which are sets of derivatives
stable under addition and provide partial covering sequences), combined with Proposition
61, and applied to the computation of the Hamming weights and Walsh spectra of quadratic
and Maiorana–McFarland functions and of other examples of functions.

5.5.2 The case of vectorial functions


The generalization of the notion of covering sequence to vectorial functions has been studied in [319]. A covering sequence for a Boolean function can be seen as a function $\varphi$ from $\mathbb{F}_2^n$ into $\mathbb{R}$ such that $\sum_{a\in\mathbb{F}_2^n;\ D_aF(x)=1}\varphi(a)=\rho$, for every $x\in\mathbb{F}_2^n$. This generalizes to vectorial functions:

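Definition 49 We call covering sequence of an $(n,m)$-function $F$, a pair of functions $(\varphi,\psi)$ from, respectively, $\mathbb{F}_2^n$ and $\mathbb{F}_2^m$ into $\mathbb{R}$, such that
\[
\forall x\in\mathbb{F}_2^n,\ \forall b\in\mathbb{F}_2^m,\quad \sum_{a\in\mathbb{F}_2^n;\ D_aF(x)=b}\varphi(a)=\psi(b). \tag{5.23}
\]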


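Note that this equality between functions $b\mapsto\sum_{a\in\mathbb{F}_2^n;\ D_aF(x)=b}\varphi(a)$ and $b\mapsto\psi(b)$ is equivalent to the equality between their Fourier transforms, that is,
\[
\forall v\in\mathbb{F}_2^m,\ \forall x\in\mathbb{F}_2^n,\quad \sum_{a\in\mathbb{F}_2^n}\varphi(a)(-1)^{v\cdot D_aF(x)}=\sum_{b\in\mathbb{F}_2^m}\psi(b)(-1)^{v\cdot b},
\]
which is equivalent to
\[
\sum_{a\in\mathbb{F}_2^n}\varphi(a)(-1)^{v\cdot F(x+a)}=(-1)^{v\cdot F(x)}\,\widehat{\psi}(v),
\]
that is,
\[
\varphi\otimes\chi_F(\cdot,v)=\widehat{\psi}(v)\,\chi_F(\cdot,v), \tag{5.24}
\]
where $\chi_F(\cdot,v)$ denotes function $x\mapsto\chi_F(x,v)=(-1)^{v\cdot F(x)}$ and $\otimes$ is the convolutional product.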

Proposition 62 An $(n,m)$-function $F$ is balanced if and only if it admits at least one covering sequence $(\varphi,\psi)$ satisfying $\widehat{\psi}(v)\neq\widehat{\varphi}(0_n)$ for every nonzero vector $v$ of $\mathbb{F}_2^m$. Any balanced $(n,m)$-function $F$ admits the pair of constant functions $(1,2^{n-m})$ for covering sequence.

Proof Assume that $(\varphi,\psi)$ is a covering sequence of $F$, then Equation (5.24) is satisfied and by applying the Fourier transform at $0_n$ to both sides of this functional equality, we obtain
\[
\forall v\in\mathbb{F}_2^m,\quad \widehat{\varphi}(0_n)\,W_F(0_n,v)=\widehat{\psi}(v)\,W_F(0_n,v),
\]
that is, $\big(\widehat{\varphi}(0_n)-\widehat{\psi}(v)\big)W_F(0_n,v)=0$ for every $v\in\mathbb{F}_2^m$. This gives $\widehat{\varphi}(0_n)=\widehat{\psi}(0_m)$. If $\widehat{\varphi}(0_n)-\widehat{\psi}(v)$ is nonzero for every nonzero $v\in\mathbb{F}_2^m$, then the function $v\mapsto W_F(0_n,v)$ is null on $\mathbb{F}_2^m\setminus\{0_m\}$, which implies that $F$ is balanced, according to Proposition 35.
Conversely, if $F$ is balanced, then, for every pair $(b,x)\in\mathbb{F}_2^m\times\mathbb{F}_2^n$, the cardinality of the set $\{a\in\mathbb{F}_2^n;\ D_aF(x)=b\}$ is constant equaling $2^{n-m}$ since the equation $D_aF(x)=b$ is equivalent to $F(x+a)=b+F(x)$. Let $\varphi:\mathbb{F}_2^n\to\mathbb{R}$ and $\psi:\mathbb{F}_2^m\to\mathbb{R}$ be respectively the constant function $x\mapsto1$ and the constant function $y\mapsto2^{n-m}$, then the pair $(\varphi,\psi)$ is a covering sequence of $F$ satisfying the relation $\widehat{\psi}(v)=0\neq\widehat{\psi}(0_m)=\widehat{\varphi}(0_n)=2^{n}$ for every element $v$ of $\mathbb{F}_2^m\setminus\{0_m\}$.
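
Relation (5.23) and the converse part of this proof can be illustrated by exhaustive search. In the sketch below, the function `is_covering_sequence` and the two sample $(3,2)$-functions are hypothetical choices made for this illustration; outputs of $F$ are encoded as integers in the same way as inputs.

```python
def is_covering_sequence(F, phi, psi, n, m):
    """Check (5.23): for all x and b, the sum of phi(a) over {a : D_a F(x) = b} equals psi(b)."""
    for x in range(1 << n):
        for b in range(1 << m):
            total = sum(phi[a] for a in range(1 << n) if F[x] ^ F[x ^ a] == b)
            if total != psi[b]:
                return False
    return True

def bits3(x):
    """Coordinates (x1, x2, x3) of a 3-bit integer."""
    return x & 1, (x >> 1) & 1, (x >> 2) & 1

n, m = 3, 2
# Assumed balanced example: F(x) = (x1, x2 XOR x1*x3).
F_balanced = [x1 | ((x2 ^ (x1 & x3)) << 1) for x1, x2, x3 in map(bits3, range(1 << n))]
# Assumed unbalanced example: F(x) = (x1*x2, x1*x3).
F_unbalanced = [(x1 & x2) | ((x1 & x3) << 1) for x1, x2, x3 in map(bits3, range(1 << n))]

phi = [1] * (1 << n)                 # constant function 1 on F_2^n
psi = [2 ** (n - m)] * (1 << m)      # constant function 2^(n-m) on F_2^m

balanced_ok = is_covering_sequence(F_balanced, phi, psi, n, m)
unbalanced_ok = is_covering_sequence(F_unbalanced, phi, psi, n, m)
print(balanced_ok, unbalanced_ok)
```

For the unbalanced function the preimage sizes differ, so the count of vectors $a$ with $F(x+a)=b+F(x)$ is not constant and the pair $(1,2^{n-m})$ fails.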

Remark. Finding a second covering sequence is often a difficult problem. It is shown in [319] that the Maiorana–McFarland functions that satisfy the hypothesis of Proposition 128, page 315, admit several covering sequences.

Definition 50 A covering sequence $(\varphi,\psi)$ of an $(n,m)$-function $F$ is said to be nontrivial if $\widehat{\psi}(v)$ never equals $\widehat{\varphi}(0_n)$ (that is, $\widehat{\psi}(0_m)$) when $v$ ranges over $\mathbb{F}_2^m\setminus\{0_m\}$.

Thus, according to Proposition 62, an (n, m)-function F is balanced if and only if it admits
a nontrivial covering sequence. This definition and this observation generalize what was
known for Boolean functions.

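Remark. If $\psi$ is a function from $\mathbb{F}_2^m$ into $\mathbb{R}^+$, then we have $\widehat{\psi}(v)\neq\widehat{\psi}(0_m)$ for every element $v$ of $\mathbb{F}_2^m\setminus\{0_m\}$ if and only if the support of $\psi$ has rank $m$ (i.e., spans the whole vector space $\mathbb{F}_2^m$). Indeed, we have
\[
\forall v\in\mathbb{F}_2^m\setminus\{0_m\},\ \widehat{\psi}(v)\neq\widehat{\psi}(0_m)\iff\forall v\in\mathbb{F}_2^m\setminus\{0_m\},\ \sum_{b\in\mathbb{F}_2^m,\ b\in(v^\perp)^c}\psi(b)\neq0
\]
and, since $\psi(b)\ge0$, $\forall b\in\mathbb{F}_2^m$, this relation is equivalent to saying that the support of $\psi$ is not included in a linear hyperplane of $\mathbb{F}_2^m$.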

Let us now generalize to vectorial functions the characterization of covering sequences of Boolean functions by means of their Fourier transforms and of the Walsh support of $F$.

Proposition 63 Let $F$ be an $(n,m)$-function, and let $(\varphi,\psi)$ be any pair of real-valued functions respectively defined on $\mathbb{F}_2^n$ and on $\mathbb{F}_2^m$. Then $F$ admits $(\varphi,\psi)$ for covering sequence if and only if, for every pair $(u,v)$ belonging to $\mathrm{Supp}\,W_F$, we have $\widehat{\varphi}(u)=\widehat{\psi}(v)$.

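Proof Thanks to the bijectivity of the Fourier transform, for every nonzero vector $v\in\mathbb{F}_2^m$, the functions $\widehat{\psi}(v)\,\chi_F(\cdot,v)$ and $\varphi\otimes\chi_F(\cdot,v)$ of Relation (5.24) are equal if and only if their Fourier transforms on $\mathbb{F}_2^n$ are equal, that is:
\[
\forall v\in\mathbb{F}_2^m,\ \forall u\in\mathbb{F}_2^n,\quad \widehat{\varphi}(u)\,W_F(u,v)=\widehat{\psi}(v)\,W_F(u,v),
\]
that is, if and only if
\[
\big((u,v)\in\mathrm{Supp}\,W_F\big)\Longrightarrow\widehat{\varphi}(u)=\widehat{\psi}(v).
\]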

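Corollary 11 Let $F$ be an $(n,m)$-function admitting $(\varphi,\psi)$ for covering sequence. If the sets $\widehat{\varphi}\big(\{u\in\mathbb{F}_2^n\,/\,w_H(u)\le t\}\big)$ and $\widehat{\psi}\big(\mathbb{F}_2^m\setminus\{0_m\}\big)$ are disjoint, then $F$ is $t$-resilient.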

It is deduced in [319] from Proposition 63 that if an $(n,m)$-function $F$ admits a covering sequence $(\varphi,\psi)$ such that the functions $\varphi$ and $\psi$ are, respectively, different from the zero function on $\mathbb{F}_2^n$ and different from the zero function on $\mathbb{F}_2^m\setminus\{0_m\}$, then, for every vector $u\in\mathbb{F}_2^n$, there exists $v\in\mathbb{F}_2^m\setminus\{0_m\}$ such that $W_F(u,v)=0$, and for every vector $v\in\mathbb{F}_2^m$, there exists a vector $u\in\mathbb{F}_2^n$ such that $W_F(u,v)=0$.

We show now that the notion of covering sequence behaves well with respect to
composition.

Proposition 64 [319] Let $F:\mathbb{F}_2^n\to\mathbb{F}_2^m$ and $G:\mathbb{F}_2^m\to\mathbb{F}_2^k$ be two functions admitting respectively $(\varphi,\psi)$ and $(\psi,\theta)$ for covering sequences. Then, $(\varphi,\theta)$ is a covering sequence of $G\circ F$.

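Proof For every pair $(x,a)\in\mathbb{F}_2^n\times\mathbb{F}_2^n$, we have, denoting $D_aF(x)$ by $b$:
\[
D_a[G\circ F](x)=G(F(x))+G(F(x+a))=G(F(x))+G(F(x)+b)=(D_bG)(F(x)).
\]
Thus, for every pair $(x,c)\in\mathbb{F}_2^n\times\mathbb{F}_2^k$, we have
\[
\sum_{a\in\mathbb{F}_2^n,\ D_a[G\circ F](x)=c}\varphi(a)=\sum_{b\in\mathbb{F}_2^m,\ (D_bG)(F(x))=c}\Big(\sum_{a\in\mathbb{F}_2^n,\ D_aF(x)=b}\varphi(a)\Big).
\]
For every pair $(x,b)\in\mathbb{F}_2^n\times\mathbb{F}_2^m$, we have $\sum_{a\in\mathbb{F}_2^n,\ D_aF(x)=b}\varphi(a)=\psi(b)$ and thus
\[
\sum_{a\in\mathbb{F}_2^n,\ D_a[G\circ F](x)=c}\varphi(a)=\sum_{b\in\mathbb{F}_2^m,\ (D_bG)(F(x))=c}\psi(b).
\]
Let $y$ denote $F(x)$, then
\[
\sum_{a\in\mathbb{F}_2^n,\ D_a[G\circ F](x)=c}\varphi(a)=\sum_{b\in\mathbb{F}_2^m,\ D_bG(y)=c}\psi(b). \tag{5.25}
\]
Since $(\psi,\theta)$ is a covering sequence of $G$, the sum $\sum_{b\in\mathbb{F}_2^m,\ D_bG(y)=c}\psi(b)$ takes constant value $\theta(c)$ for every pair $(y,c)\in\mathbb{F}_2^m\times\mathbb{F}_2^k$ and we deduce
\[
\forall x\in\mathbb{F}_2^n,\ \forall c\in\mathbb{F}_2^k,\quad \sum_{a\in\mathbb{F}_2^n,\ D_a[G\circ F](x)=c}\varphi(a)=\theta(c).
\]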

In [319], the authors give a similar (more technical) result on the concatenation of
functions, with consequences on Maiorana–McFarland functions. An attack on ciphers using
functions admitting covering sequences is also presented.

5.6 Functions with low univariate degree and related functions


The following Weil's theorem is very well known in finite field theory (cf. [775, theorem 5.38]):

Theorem 11 Let $q$ be a prime power and $F(x)\in\mathbb{F}_q[x]$ a univariate polynomial of degree $d\ge1$ with $\gcd(d,q)=1$. Let $\chi$ be a nontrivial character of $\mathbb{F}_q$. Then
\[
\Big|\sum_{x\in\mathbb{F}_q}\chi(F(x))\Big|\le(d-1)\,q^{1/2}.
\]

For $q=2^n$, this Weil's bound means that, for every nonzero $a\in\mathbb{F}_{2^n}$, we have $\big|\sum_{x\in\mathbb{F}_{2^n}}(-1)^{tr_n(aF(x))}\big|\le(d-1)\,2^{\frac n2}$. And since adding a linear function $tr_n(bx)$ to the function $tr_n(aF(x))$ corresponds to adding $(b/a)\,x$ to $F(x)$ and does not change its univariate degree, we deduce that, if $d>1$ is odd and $a\neq0$, then
\[
nl(tr_n(aF))\ge2^{n-1}-(d-1)\,2^{\frac n2-1}.
\]
An extension of the Weil bound to the character sums of functions of the form $F(x)+G(1/x)$ (where $1/x=x^{2^n-2}$ takes value 0 at 0), among which are the so-called Kloosterman sums $\sum_{x\in\mathbb{F}_{2^n}}(-1)^{tr_n(1/x+ax)}$, has been first obtained by Carlitz and Uchiyama [333] and extended by Shanbhag et al. [1032]: if $F$ and $G$ have odd univariate degrees, then
\[
\Big|\sum_{x\in\mathbb{F}_{2^n}}(-1)^{tr_n(F(x^{-1})+G(x))}\Big|\le\big(d_{alg}(F)+d_{alg}(G)\big)\,2^{\frac n2}.
\]
x∈F2n

More can also be found in [678] for the case where a function with sparse univariate
representation is added to F .
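
The lower bound on $nl(tr_n(aF))$ derived above from the Weil bound can be compared with an exact computation in a small field. The sketch below is an illustration only: the choice $n=5$, $F(x)=x^3$, $a=1$, the modulus $x^5+x^2+1$, and the helper names `gf_mul` and `trace` are assumptions made here; field elements are encoded as integers whose bits are polynomial coefficients.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in F_{2^n} = F_2[x]/(poly), elements encoded as integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:              # reduce as soon as the degree reaches n
            a ^= poly
    return result

def trace(x, n, poly):
    """Absolute trace tr_n(x) = x + x^2 + ... + x^(2^(n-1)); the result is 0 or 1."""
    t, y = 0, x
    for _ in range(n):
        t ^= y
        y = gf_mul(y, y, n, poly)
    return t

n, poly = 5, 0b100101           # assumed modulus x^5 + x^2 + 1, irreducible over F_2
d = 3                           # univariate degree of F(x) = x^3 (odd, d > 1)
cube = [gf_mul(x, gf_mul(x, x, n, poly), n, poly) for x in range(1 << n)]

# Walsh transform of f = tr_n(x^3) with respect to the inner product tr_n(bx).
walsh = [sum((-1) ** trace(cube[x] ^ gf_mul(b, x, n, poly), n, poly)
             for x in range(1 << n))
         for b in range(1 << n)]

nl = 2 ** (n - 1) - max(abs(w) for w in walsh) // 2
weil_bound = 2 ** (n - 1) - (d - 1) * 2 ** (n / 2 - 1)
print(nl, weil_bound)
```

In this instance the exact nonlinearity exceeds the Weil lower bound, which is expected since the bound holds for every polynomial of the given degree and is not tight for this particular one.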
6

Bent functions and plateaued functions

Bent functions are fascinating extremal mathematical objects. Bent Boolean functions play
a role in coding theory, with Kerdock codes (see Subsection 6.1.22, page 254), and in
other domains of communications (for instance, they are used to build the so-called bent
function sequences for telecommunications [919] and are related to Golay Complementary
Sequences [416]). Bent vectorial functions allow constructing good codes [453, 865, 866]
and pose interesting problems related to coding theory [278, 854].
The role of bent Boolean functions in cryptography is less obvious nowadays since,
because of fast algebraic attacks and Theorem 22, page 332 (which shows that Boolean
functions obtained from bent functions by modifying a few values cannot allow resisting
them), we do not know an efficient construction using bent functions that would provide
Boolean functions having all the necessary features for being used in stream ciphers.
Concerning block ciphers, since bent vectorial functions are not balanced and do not exist
when $m>\frac n2$, they are rarely used as substitution boxes in block ciphers.$^1$ Bijectivity is
mandatory in the kind of ciphers called substitution permutation networks, and unbal-
ancedness can represent a weakness in the other kind of ciphers called Feistel (see, e.g.,
[957]). But vectorial bent functions can, however, be used in block ciphers at the cost of
additional diffusion/compression/expansion layers, or as building blocks for constructions of
substitution boxes. Moreover, constructions of bent Boolean functions are often transposable
into constructions of Boolean functions for stream ciphers, and bent vectorial functions
are used to construct algebraic manipulation detection codes (see Section 12.1.6), which
play an important role in cryptography. Hence, even from a cryptographic viewpoint, it
seems important to devote a chapter to them. This is all the more so because bent (Boolean or vectorial) functions possess properties that are cryptographically very interesting: they have optimal nonlinearity, by definition, and their derivatives are balanced (in other words, changing the input to a bent function by the addition of a nonzero vector induces a uniform change among the $2^n$ outputs; this is of course related to the differential attack on block ciphers). And it often happens that the cryptographic interest of notions on Boolean functions is renewed with the appearance of new ways of using them (see Section 12.1).
The notion of bent function has been generalized to functions over Z4 and to the wider
domain of generalized bent functions. The page limit of this book does not allow us to
address them.

1 But the S-boxes in the block ciphers CAST-128 and CAST-256 are modified from bent functions, as
well as the round functions in the cryptographic hash algorithms MD4, MD5, and HAVAL, and the
nonlinear-feedback shift registers (NLFSR) in the stream cipher Grain.


Plateaued functions are a generalization of bent functions that free themselves from some
cryptographic weaknesses inherent to bent functions (in particular, their unbalancedness,
the fact that their numbers of variables are necessarily even, and for vectorial functions
the nonexistence of bent (n, m)-functions when m > n/2) but not all of them (for instance,
they also have limited algebraic degree, which represents a weakness with respect to fast
algebraic attacks).
The history of bent functions begins in the 1960s.2 The first paper in English on bent
Boolean functions was written by O. Rothaus in 1966 and published ten years later [1005].
It seems that, already in 1962, bent functions had been studied in the Soviet Union under
the name of minimal functions, as mentioned by Tokareva in [1089]. V. A. Eliseev and O. P.
Stepchenkov had proved that their algebraic degree is bounded above by half the number
of variables (except in the case of two variables); they had also proposed an analogue of
the Maiorana–McFarland construction. Their technical reports have never been declassified.
The extension of the notion to vectorial (n, m)-functions is due to Kaisa Nyberg [906]. A
book by S. Mesnager [865] that we recommend and a slightly more recent survey [313] exist
on bent functions.
The introduction of plateaued Boolean functions is due to Zheng and Zhang [1173] as a
generalization of partially-bent functions [211]. Recently it has been shown in [247] that
plateaued vectorial functions share with quadratic vectorial functions most of their nice
properties, which considerably simplify in particular the study of their APNness; see Chapter
11 (but the property of plateauedness is not easy to prove in general).

6.1 Bent Boolean functions


We first recall for the convenience of the reader what we have seen on bent functions in
Section 3.1, and we add some observations:
• A Boolean function $f$ on $\mathbb{F}_2^n$ ($n$ even) is called bent if its Hamming distance to the code $RM(1,n)$ of $n$-variable affine functions (the nonlinearity of $f$) equals $2^{n-1}-2^{\frac n2-1}$ (i.e., is optimal).
• $f$ is bent if and only if its Walsh transform $W_f$ (with respect to some inner product) takes values $\pm2^{\frac n2}$ only.$^3$ This characterization is independent of the choice of the inner product on $\mathbb{F}_2^n$, since any other inner product has the form $\langle x,s\rangle=x\cdot L(s)$, where $L$ is an autoadjoint linear automorphism, i.e., when “$\cdot$” is the usual inner product, an automorphism whose associated matrix is symmetric. The condition in this characterization can be slightly weakened, without losing the property of being necessary and sufficient:

2 In fact, bent functions had been studied before the adjective “bent” was invented, since the supports of bent
Boolean functions are difference sets [651] in elementary Abelian 2-groups. Nevertheless, mathematicians
were not much interested in such groups at that time.
3 In [1093], the authors show that Boolean functions with two Walsh values are affine functions and bent
functions, possibly modified at 0n .

Lemma 5 Let $n\ge2$ be even. Any $n$-variable Boolean function $f$ is bent if and only if, for every $a\in\mathbb{F}_2^n$, we have $W_f(a)\equiv2^{\frac n2}\ [\bmod\ 2^{\frac n2+1}]$, or equivalently $\widehat{f}(a)\equiv2^{\frac n2-1}\ [\bmod\ 2^{\frac n2}]$.

Proof This necessary condition is also sufficient, since, if it is satisfied, then writing $W_f(a)=2^{\frac n2}\lambda_a$, where $\lambda_a$ is odd for every $a$, Parseval's relation (2.47) implies $\sum_{a\in\mathbb{F}_2^n}\lambda_a^2=2^n$, and hence $\lambda_a^2=1$ for every $a$.

A slightly different viewpoint on bent functions is that of bent sequences: for each vector $X$ in $\{-1,1\}^{2^n}$, define: $\widehat{X}=\frac{1}{\sqrt{2^n}}H_nX$, where $H_n$ is the Walsh–Hadamard matrix, recursively defined by
\[
H_n=\begin{pmatrix}H_{n-1}&H_{n-1}\\H_{n-1}&-H_{n-1}\end{pmatrix},\qquad H_0=[1].
\]
The vectors $X$ such that $\widehat{X}$ belongs to $\{-1,1\}^{2^n}$ are called bent sequences. They are the images by character $\chi=(-1)^{\cdot}$ of the bent functions on $\mathbb{F}_2^n$. In [993], the authors consider some generalized bent notions (among which the nega-bent notion) from the domain of quantum error-correcting codes, corresponding to flat spectra with respect to some unitary transforms (whose matrices $U$ are such that $UU^\dagger$ equals the identity matrix, where “$\dagger$” means transpose-conjugate, and generalize Walsh–Hadamard matrices).
• An $n$-variable Boolean function $f$ is bent if and only if its Hamming distance to any affine function equals $2^{n-1}\pm2^{\frac n2-1}$; then half of the elements of the Reed–Muller code of order 1 lie at distance $2^{n-1}+2^{\frac n2-1}$ from $f$ and half lie at distance $2^{n-1}-2^{\frac n2-1}$ (since if $\ell$ lies at distance $2^{n-1}+2^{\frac n2-1}$ from $f$, then $\ell\oplus1$ lies at distance $2^{n-1}-2^{\frac n2-1}$ and vice versa). Conversely, a Boolean function is affine if and only if it lies at maximal distance from the set of bent functions (this is shown in [1088] but was probably known earlier by Dillon, Dobbertin, and others, although maybe not explicitly written). In other words, the set of affine functions and the set of bent functions are metric complements of each other and constitute a so-called pair of metrically regular sets in the Boolean hypercube. It is shown by Tokareva after observing that for any nonaffine Boolean function $f$ in even dimension, there exists a bent function $g$ such that $f\oplus g$ is not bent.
• Bent Boolean functions are not balanced. As soon as $n$ is large enough (say $n\ge20$), the difference $2^{\frac n2-1}$ between their Hamming weights and the weight $2^{n-1}$ of balanced functions is very small with respect to this weight. However, according to [42, theorem 6], $2^n$ bits of the pseudorandom sequence output by $f$ in a combiner or a filter model are enough to distinguish it from a random sequence. Nevertheless, we shall see that highly nonlinear balanced functions can be built from bent functions.

Remark. Given a bent Boolean function $f$, the functions $f\oplus\ell$ where $\ell$ is affine are not balanced, but their weights are globally as close to $2^{n-1}$ as possible: according to Parseval's relation, there do not exist functions $f$ such that the functions $f\oplus\ell$ have all weights closer to $2^{n-1}$.
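
The observations recalled above (Walsh values $\pm2^{\frac n2}$, bent sequences, nonlinearity, and weight) can be checked on a small example. The sketch below is illustrative only: the sample function $f(x)=x_1x_2\oplus x_3x_4$ and the helper names are assumptions made here, and the Walsh transform is computed with the recursively built matrix $H_n$ rather than with a fast transform.

```python
def hadamard(n):
    """Walsh-Hadamard matrix H_n from the recursion H_n = [[H, H], [H, -H]], H_0 = [1]."""
    H = [[1]]
    for _ in range(n):
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def walsh_via_hadamard(f, n):
    """Compute H_n X, where X = ((-1)^f(x))_x is the sign sequence of f."""
    X = [(-1) ** bit for bit in f]
    return [sum(h * x for h, x in zip(row, X)) for row in hadamard(n)]

n = 4
# Assumed example: f(x) = x1*x2 XOR x3*x4 on F_2^4.
f = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
     for x in range(1 << n)]

W = walsh_via_hadamard(f, n)
is_bent = all(abs(w) == 2 ** (n // 2) for w in W)
X_hat = [w // 2 ** (n // 2) for w in W]            # normalized sequence (n even)
nonlinearity = 2 ** (n - 1) - max(abs(w) for w in W) // 2
weight = sum(f)
print(is_bent, nonlinearity, weight)
```

The normalized vector `X_hat` again has entries in $\{-1,1\}$, so the sign sequence of this function is a bent sequence in the sense described above, and the weight differs from $2^{n-1}$ by $2^{\frac n2-1}$.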

6.1.1 Extended affine invariance of bentness and automorphism group of a function
The nonlinearity being an EA invariant, so is the notion of bent function. A class of bent
functions shall be called a complete class if it is preserved by EA equivalence.
The automorphism group of the set of bent functions is the general affine group. It indeed
contains the general affine group, and the reverse inclusion is a direct consequence of the
property that, given a Boolean function g, if for every bent function f , function f ⊕ g is
also bent, then g is affine (which shows that the automorphism group of the set of all bent
functions is included in that of all affine functions; Proposition 51, page 155, completes then
the proof).
Other notions of equivalence between bent functions come from design theory; see
Subsection 6.1.9.
Given a (Boolean or vectorial) function f , recall that the group (already seen at page
72) of those affine automorphisms A that preserve f (alternatively,4 those that preserve its
graph) is called the automorphism group of function f and is denoted by Aut (f ). The
determination of Aut (f ) for f bent is often a difficult problem; see [56, 426, 659]. In
[1162], the authors only studied the so-called symmetric group (the subgroup of those input
coordinate permutations that preserve the function).

6.1.2 Characterization of bentness by the derivatives


Characterization by first-order derivatives
Thanks to Relation (2.53), page 62, and to the fact that the Fourier–Hadamard transform
of a pseudo-Boolean function is constant if and only if the function equals δ0 times some
constant, we have

Theorem 12 Any $n$-variable Boolean function ($n$ even$^5$) is bent if and only if, for any nonzero vector $a$, the Boolean function $D_af(x)=f(x)\oplus f(x+a)$ is balanced, that is, if and only if $f$ satisfies $PC(n)$.

In [190, 191] (see also [353]), the authors observed that, for every linear hyperplane $H$ of $\mathbb{F}_2^n$, the condition of Theorem 12 can be weakened into “for any nonzero $a$ in $H$, function $D_af$ is balanced.” Indeed, for $H=\{0_n,\alpha\}^\perp$, we have $W_f^2(\alpha)=\sum_{a\in\mathbb{F}_2^n}(-1)^{a\cdot\alpha}\mathcal{F}(D_af)$, $W_f^2(0_n)+W_f^2(\alpha)=2\sum_{a\in H}\mathcal{F}(D_af)$, and this necessary condition is also sufficient since, $n$ being even, the sum of these two squares equals $2^{n+1}$ if and only if each square equals $2^n$ (see, e.g., [191]). The functions whose derivatives $D_af$, $a\in H$, $a\neq0_n$ are all balanced for $n$ odd are also characterized in [190, 191] as well as, for every $n$, the functions whose derivatives $D_af$, $a\in E$, $a\neq0_n$ are all balanced, where $E$ is a vector subspace of $\mathbb{F}_2^n$ of dimension $n-2$.
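
Theorem 12 and the hyperplane identity used in the weakened condition can be verified by direct enumeration. The following sketch is a hedged illustration: the sample bent function, the vector `alpha`, and the helper names are assumptions made for this example.

```python
def dot(u, x):
    """Usual inner product in F_2 of two vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def char_sum(g):
    """F(g) = sum_x (-1)^g(x); it vanishes exactly when g is balanced."""
    return sum((-1) ** v for v in g)

def derivative(f, a):
    """Truth table of D_a f(x) = f(x) XOR f(x + a)."""
    return [f[x] ^ f[x ^ a] for x in range(len(f))]

n = 4
N = 1 << n
# Assumed example: the bent function f(x) = x1*x2 XOR x3*x4.
f = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(N)]

# Theorem 12: every derivative in a nonzero direction is balanced.
all_balanced = all(char_sum(derivative(f, a)) == 0 for a in range(1, N))

# Hyperplane H = {0, alpha}^perp and the identity W_f(0)^2 + W_f(alpha)^2 = 2 sum_{a in H} F(D_a f).
alpha = 0b0101
H = [a for a in range(N) if dot(a, alpha) == 0]
W_zero = char_sum(f)
W_alpha = sum((-1) ** (f[x] ^ dot(alpha, x)) for x in range(N))
lhs = W_zero ** 2 + W_alpha ** 2
rhs = 2 * sum(char_sum(derivative(f, a)) for a in H)
print(all_balanced, lhs, rhs)
```

Only the term $a=0_n$ contributes to the right-hand side when the derivatives in directions of $H\setminus\{0_n\}$ are balanced, which is why the sum of the two squares equals $2^{n+1}$.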

4 In the cases of Boolean functions and of bent functions, it makes less difference [149, 150].
5 In fact, according to the observations above, “n even” is implied by “f satisfies P C(n)”; functions satisfying
P C(n) do not exist for odd n.

Because of Theorem 12, bent (Boolean) functions are also called perfect nonlinear functions.$^6$ Equivalently, as noted by Rothaus and Welch, $f$ is bent if and only if the $2^n\times2^n$ matrix $H=[(-1)^{f(x+y)}]_{x,y\in\mathbb{F}_2^n}$ is a Hadamard matrix (i.e., satisfies $H\times H^t=2^nI$, where $I$ is the identity matrix). This implies that the Cayley graph $G_f$ (see Subsection 2.3.5, page 70) is strongly regular (see [68] for more precision and for a characterization).

Characterization by second-order derivatives and second-order covering sequences

Proposition 65 [317] An $n$-variable Boolean function $f$ is bent if and only if
\[
\forall x\in\mathbb{F}_2^n,\quad \sum_{a,b\in\mathbb{F}_2^n}(-1)^{D_aD_bf(x)}=2^n. \tag{6.1}
\]

Proof If we multiply both terms of Relation (6.1) by $f_\chi(x)=(-1)^{f(x)}$, we obtain the (equivalent) relation: $\forall x\in\mathbb{F}_2^n$, $f_\chi\otimes f_\chi\otimes f_\chi(x)=2^nf_\chi(x)$; indeed, we have
\[
f_\chi\otimes f_\chi\otimes f_\chi(x)=\sum_{b\in\mathbb{F}_2^n}\Big(\sum_{a\in\mathbb{F}_2^n}(-1)^{f(a)\oplus f(a+b)}\Big)(-1)^{f(b+x)}=\sum_{a,b\in\mathbb{F}_2^n}(-1)^{f(a+x)\oplus f(a+b+x)\oplus f(b+x)}.
\]
According to the bijectivity of the Fourier–Hadamard transform and to Relation (2.44), page 60, this is equivalent to
\[
\forall u\in\mathbb{F}_2^n,\quad W_f^3(u)=2^nW_f(u).
\]
Thus, we have $\sum_{a,b\in\mathbb{F}_2^n}(-1)^{D_aD_bf(x)}=2^n$ if and only if, for every $u\in\mathbb{F}_2^n$, $W_f(u)$ equals $\pm\sqrt{2^n}$ or 0. According to Parseval's relation, the value 0 cannot be achieved by $W_f$ and this is therefore equivalent to the bentness of $f$.

Relation (6.1) is equivalent to the relation $\sum_{a,b\in\mathbb{F}_2^n}(1-2D_aD_bf(x))=2^n$, that is, $\sum_{a,b\in\mathbb{F}_2^n}D_aD_bf(x)=2^{2n-1}-2^{n-1}$, and hence to the fact that $f$ admits the second-order covering sequence with all-1 coefficients and with level $2^{2n-1}-2^{n-1}$.
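
Relation (6.1) and the level of the second-order covering sequence can be checked by exhaustive computation for small $n$. In the sketch below, the function `second_order_sums` and the two sample functions (one bent, one quadratic but not bent) are assumptions made for this illustration.

```python
def second_order_sums(f, n):
    """For each x, return sum over (a, b) of (-1)^(D_a D_b f(x))."""
    N = 1 << n
    return [sum((-1) ** (f[x] ^ f[x ^ a] ^ f[x ^ b] ^ f[x ^ a ^ b])
                for a in range(N) for b in range(N))
            for x in range(N)]

n = 4
N = 1 << n
# Assumed examples on F_2^4: a bent function and a non-bent quadratic function.
bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(N)]
not_bent = [(x & 1) & ((x >> 1) & 1) for x in range(N)]       # x1*x2 viewed in 4 variables

sums_bent = second_order_sums(bent, n)
sums_not_bent = second_order_sums(not_bent, n)

# Level of the all-1 second-order covering sequence: sum over (a, b) of D_a D_b f(x).
level = (N * N - sums_bent[0]) // 2
print(sums_bent[0], sums_not_bent[0], level)
```

For the non-bent quadratic function the sum is still constant in $x$ but differs from $2^n$, which illustrates that it is the value $2^n$, and not mere constancy, that characterizes bentness in Relation (6.1).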

6.1.3 Characterization of bentness by power moments of the Walsh transform


For every even integer $w\ge4$, bent functions are characterized by the property that the sum $\sum_{a\in\mathbb{F}_2^n}W_f^w(a)$ is minimum:

Proposition 66 Let $n$ be any positive integer and $f$ be any $n$-variable Boolean function. Then, for every even integer $w\ge4$, we have
\[
\sum_{u\in\mathbb{F}_2^n}W_f^w(u)\ge2^{(\frac w2+1)n},
\]
with equality if and only if $f$ is bent.


6 The characterization of Theorem 12 leads to a generalization of the notion of bent function to nonbinary
functions. In fact, several generalizations exist [16, 718, 802] (see [266] for a survey); the equivalence between
being bent and being perfect nonlinear is no more valid if we consider functions defined over residue class
rings (see, e.g., [271]).

This is straightforward for $w=4$ by using for instance the Cauchy–Schwarz inequality and its case of equality, and for $w\ge6$, it is a direct consequence of the well-known inequality on the $L^w$ norm: if $w\ge w'$ and $\lambda_i\ge0$, $\forall i\in I$, then $\big(\sum_{i\in I}\lambda_i^{w}\big)^{\frac1w}\ge|I|^{\frac1w-\frac1{w'}}\big(\sum_{i\in I}\lambda_i^{w'}\big)^{\frac1{w'}}$.
Such sums (for even or odd $w$) play a role with respect to fast correlation attacks [189, 203] (when these sums have small magnitude for low values of $w$, this contributes to a good resistance to fast correlation attacks).
Note that for $w=4$, we have, according to (3.9) and (3.10), page 98:
\[
\sum_{u\in\mathbb{F}_2^n}W_f^4(u)=2^n\,V(f)=2^n\sum_{x,a,b\in\mathbb{F}_2^n}(-1)^{D_aD_bf(x)}.
\]

Hence:

Corollary 12 Let $n$ be any positive integer and $f$ any $n$-variable Boolean function. Then
\[
\sum_{x,a,b\in\mathbb{F}_2^n}(-1)^{D_aD_bf(x)}\ge2^{2n},
\]
with equality if and only if $f$ is bent.
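
The case $w=4$ of Proposition 66 and Corollary 12 can be illustrated numerically. The sketch below uses assumed helper names (`walsh`, `fourth_moment`, `second_derivative_total`) and the same two hypothetical sample functions as a bent and a non-bent instance.

```python
def walsh(f, n):
    """Walsh transform of a truth table f, with vectors encoded as integers."""
    return [sum((-1) ** (f[x] ^ (bin(u & x).count("1") & 1)) for x in range(1 << n))
            for u in range(1 << n)]

def fourth_moment(f, n):
    """Sum over u of W_f(u)^4."""
    return sum(w ** 4 for w in walsh(f, n))

def second_derivative_total(f, n):
    """Sum over (x, a, b) of (-1)^(D_a D_b f(x))."""
    N = 1 << n
    return sum((-1) ** (f[x] ^ f[x ^ a] ^ f[x ^ b] ^ f[x ^ a ^ b])
               for x in range(N) for a in range(N) for b in range(N))

n = 4
N = 1 << n
bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(N)]
not_bent = [(x & 1) & ((x >> 1) & 1) for x in range(N)]

m4_bent, m4_not_bent = fourth_moment(bent, n), fourth_moment(not_bent, n)
total_bent = second_derivative_total(bent, n)
total_not_bent = second_derivative_total(not_bent, n)
print(m4_bent, m4_not_bent, total_bent, total_not_bent)
```

Both quantities are linked by the factor $2^n$, so minimizing the fourth moment of the Walsh transform and minimizing the sum over second-order derivatives are the same criterion.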

Remark. There is no such characterization for w odd, except in particular cases, such as
Niho functions; see Proposition 82, at page 222.

Corollary 13 Let $n$ be any positive integer, $w$ any even integer larger than or equal to 4, and $E$ an $\mathbb{F}_2$-vector space of $n$-variable Boolean functions. There exists an $(n,m)$-function $F$ such that $E\setminus\{0\}$ is the set of component functions of $F$. All functions except the null one in $E$ are bent if and only if $F$ is bent, and this happens if and only if
\[
\Big|\Big\{(x_1,\dots,x_w)\in(\mathbb{F}_2^n)^w;\ \sum_{i=1}^wF(x_i)=0_m\ \text{and}\ \sum_{i=1}^wx_i=0_n\Big\}\Big|=2^{(w-1)n-m}+(2^m-1)\cdot2^{\frac{wn}{2}-m}.
\]

Proof The first assertion is by definition, and, according to Proposition 66, the component functions $v\cdot F$, $v\in\mathbb{F}_2^m\setminus\{0_m\}$, are all bent if and only if we have $\sum_{u\in\mathbb{F}_2^n,\,v\in\mathbb{F}_2^m}W_F^w(u,v)=2^{wn}+(2^m-1)\cdot2^{(\frac w2+1)n}$ (distinguishing the case $v=0_m$ from the cases $v\neq0_m$), that is, if and only if we have
\[
\sum_{u,x_1,\dots,x_w\in\mathbb{F}_2^n,\ v\in\mathbb{F}_2^m}(-1)^{v\cdot\sum_{i=1}^wF(x_i)\,\oplus\,u\cdot\sum_{i=1}^wx_i}
=2^{n+m}\,\Big|\Big\{(x_1,\dots,x_w)\in(\mathbb{F}_2^n)^w;\ \sum_{i=1}^wF(x_i)=0_m\ \text{and}\ \sum_{i=1}^wx_i=0_n\Big\}\Big|
=2^{wn}+(2^m-1)\cdot2^{(\frac w2+1)n}.
\]

6.1.4 Characterization of bentness by the NNF


The ANF does not allow directly characterizing bent functions, but the NNF does, and this
provides then a possible characterization through the ANF by using Relation (2.24), page 50
(however, this characterization is complex, and we then do not state it explicitly).
The direct relationship between the Walsh transform values and the coefficients of the
NNF gives

Proposition 67 [292] Let $f(x)=\sum_{I\subseteq\{1,\dots,n\}}\lambda_Ix^I$ be the NNF of a Boolean function $f$ on $\mathbb{F}_2^n$. Then $f$ is bent if and only if
1. For every $I$ such that $\frac n2<|I|<n$, the coefficient $\lambda_I$ is divisible by $2^{|I|-\frac n2}$;
2. $\lambda_{\{1,\dots,n\}}\equiv2^{\frac n2-1}\ [\bmod\ 2^{\frac n2}]$.

Proof According to Lemma 5, page 190, $f$ is bent if and only if, for every $a\in\mathbb{F}_2^n$, $\widehat{f}(a)\equiv2^{\frac n2-1}\ [\bmod\ 2^{\frac n2}]$. We deduce that, according to Relation (2.59), page 66, applied with $\varphi=f$, Conditions 1. and 2. are sufficient for $f$ to be bent.
Conversely, Condition 1. is necessary, according to Relation (2.60). Condition 2. is also necessary since $\widehat{f}(1_n)=(-1)^n\lambda_{\{1,\dots,n\}}$, from Relation (2.59) or (2.60).
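
The two conditions of Proposition 67 are easy to test once the NNF coefficients are computed. The sketch below obtains them with an integer Möbius transform, $\lambda_I=\sum_{\mathrm{supp}(x)\subseteq I}(-1)^{|I|-w_H(x)}f(x)$; the function names and the sample functions are assumptions made for this illustration, and subsets $I$ are encoded as bit masks.

```python
def nnf_coefficients(f, n):
    """NNF coefficients lam[I] via the Moebius transform over the integers."""
    lam = list(f)
    for i in range(n):
        for x in range(1 << n):
            if (x >> i) & 1:
                lam[x] -= lam[x ^ (1 << i)]
    return lam

def bent_by_nnf(f, n):
    """Test Conditions 1 and 2 of Proposition 67 (n even)."""
    lam = nnf_coefficients(f, n)
    half = n // 2
    cond1 = all(lam[I] % 2 ** (bin(I).count("1") - half) == 0
                for I in range(1 << n)
                if half < bin(I).count("1") < n)
    cond2 = lam[(1 << n) - 1] % 2 ** half == 2 ** (half - 1)
    return cond1 and cond2

n = 4
N = 1 << n
# Assumed examples: a bent function and two non-bent functions on F_2^4.
bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(N)]
monomial = [1 if x == N - 1 else 0 for x in range(N)]          # x1*x2*x3*x4
quadratic = [(x & 1) & ((x >> 1) & 1) for x in range(N)]       # x1*x2

top_coefficient = nnf_coefficients(bent, n)[N - 1]
print(top_coefficient, bent_by_nnf(bent, n), bent_by_nnf(monomial, n), bent_by_nnf(quadratic, n))
```

For the bent example the NNF is $x_1x_2+x_3x_4-2x_1x_2x_3x_4$, so the top coefficient is $-2$, which is congruent to $2^{\frac n2-1}$ modulo $2^{\frac n2}$ as Condition 2 requires.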

The related characterization of bent functions by the ANF mentioned above implies
conditions on the coefficients of the ANFs of bent functions, which have been observed
and used in [301] (see more at page 243) and also partially observed by Hou and Langevin
in [627].
Point 1 in Proposition 67 can be expressed by a single equation; see [293]. It is proved
in this same reference that bentness can also be characterized by the generalized degree
introduced at page 48.

6.1.5 Characterization of bentness by codes


A way of looking at bent functions deals with linear codes (as we mentioned in Section 4.1, at page 159): let $f$ be any $n$-variable Boolean function ($n$ even); we write its support $\mathrm{supp}(f)=\{x\in\mathbb{F}_2^n;\ f(x)=1\}$ as $\{u_1,\dots,u_{w_H(f)}\}$; we consider the matrix $G$ whose columns are all the vectors of $\mathrm{supp}(f)$, without repetition, and call $C$ the linear code generated by the rows of this matrix. Then $C$ is the set of all the vectors $U_v=(v\cdot u_1,\dots,v\cdot u_{w_H(f)})$, where $v$ ranges over $\mathbb{F}_2^n$, and:

Proposition 68 [1120] Let $n\ge4$ be an even integer. Any $n$-variable Boolean function $f$ is bent if and only if the linear code $C$ whose generator matrix is the matrix whose columns are all the vectors of $\mathrm{supp}(f)$ has dimension $n$, and has exactly two nonzero Hamming weights: $2^{n-2}$ and $w_H(f)-2^{n-2}$.

Indeed, for every nonzero $v$ in $\mathbb{F}_2^n$, the Hamming weight of codeword $U_v$ equals $\sum_{x\in\mathbb{F}_2^n}f(x)\times v\cdot x=\sum_{x\in\mathbb{F}_2^n}f(x)\,\frac{1-(-1)^{v\cdot x}}{2}=\frac{\widehat{f}(0_n)-\widehat{f}(v)}{2}$. Hence, according to Relation (2.32), page 55, relating Fourier–Hadamard and Walsh transforms, $w_H(U_v)$ equals

Table 6.1 Weight distribution of $C_f$ for $f$ bent.

Hamming weight                    Multiplicity
0                                 1
$2^{n-1}$                         $2^n-1$
$2^{n-1}-2^{\frac n2-1}$          $2^{n-1}+(-1)^{f(0_n)}\,2^{\frac n2-1}$
$2^{n-1}+2^{\frac n2-1}$          $2^{n-1}-(-1)^{f(0_n)}\,2^{\frac n2-1}$

$2^{n-2}+\frac{W_f(v)-W_f(0_n)}{4}$. Thus, if $f$ is bent, this weight is never null and $C$ has then dimension $n$; moreover, either $W_f(v)=W_f(0_n)$ and $w_H(U_v)=2^{n-2}$, or $W_f(v)=-W_f(0_n)$ and $w_H(U_v)=2^{n-2}-\frac{W_f(0_n)}{2}=2^{n-2}-\frac{2^n-2w_H(f)}{2}=w_H(f)-2^{n-2}$. Conversely, if $C$ has dimension $n$ and has exactly the two nonzero Hamming weights $2^{n-2}$ and $w_H(f)-2^{n-2}$, then according to the relation $w_H(U_v)=2^{n-2}+\frac{W_f(v)-W_f(0_n)}{4}$, for every $v$ we have either $W_f(v)=W_f(0_n)$ or $W_f(v)=W_f(0_n)+4w_H(f)-2^{n+1}=-W_f(0_n)$ and, according to Parseval's relation (2.48), page 61, $W_f(v)$ equals then $\pm2^{\frac n2}$ for every $v$, i.e., $f$ is bent.
$C$ being linear, the minimum distance of $C$ equals the minimum of these two nonzero weights: $2^{n-2}$ if $w_H(f)=2^{n-1}+2^{\frac n2-1}$ and $2^{n-2}-2^{\frac n2-1}$ if $w_H(f)=2^{n-1}-2^{\frac n2-1}$.

There exist two other characterizations by Wolfmann [1120] dealing with C:


1. C has dimension n and C has exactly two weights, whose sum equals wH (f ).
2. The length $w_H(f)$ of $C$ is even, $C$ has exactly two weights, and one of these weights is
$2^{n-2}$.
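
As a small numerical illustration of Proposition 68, the following sketch builds the codewords $U_v$ for one bent function and lists their weights. The choice $n = 4$ and $f = x_1x_2 \oplus x_3x_4$, as well as the packing of vectors into integers, are assumptions made only for this example.

```python
n = 4  # number of variables (even)

def f(x):
    # Bits of the integer x encode (x1, x2, x3, x4); f = x1 x2 + x3 x4 (a bent function).
    return ((x & 1) & (x >> 1) & 1) ^ ((x >> 2) & (x >> 3) & 1)

def dot(u, v):
    # Inner product u . v in F_2^n for vectors packed as integers.
    return bin(u & v).count("1") & 1

support = [x for x in range(2 ** n) if f(x) == 1]

# Codeword U_v = (v.u_1, ..., v.u_w); record its Hamming weight for each nonzero v.
weight_of = {v: sum(dot(v, u) for u in support) for v in range(1, 2 ** n)}
nonzero_weights = set(weight_of.values())

print(len(support), sorted(nonzero_weights))
```

No nonzero $v$ gives the zero codeword, so the code has dimension $n$, and the two weights found are $2^{n-2}$ and $w_H(f) - 2^{n-2}$, whose sum is $w_H(f)$ as in Wolfmann's first characterization.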
Of course, any bent Boolean function f can also be viewed as a (vectorial) (n, 1)-function
and be related to the code Cf seen at page 160, which has then weight distribution given by
Table 6.1 (deduced from the Parseval and inverse Walsh transform formulae).
In [633, 634], the authors introduce the so-called near weight enumerator of a bent
function $f$, equal to $W_{C_f}(X, Y) + 2^{\frac n2 - 1} X^n$, where $W_{C_f}$ is the weight enumerator (see page
14) of the code $C_f = \mathrm{supp}(f)$. A related MacWilliams-like identity is shown between dual
bent functions (see Definition 51, page 197), leading to a notion of formally self-dual bent
function and a Gleason-type theorem (see Gleason's theorem at page 16). As an application,
[634] proves the nonexistence of bent functions in $2n$ variables whose ANF has lowest degree of
nonconstant terms equal to $n - k$, for any nonnegative integer $k$ and $n \ge N$,
where $N$ is the smallest integer satisfying $\binom{N+k+1}{k+1} < 2^{N-1} - 1$.

6.1.6 Characterization of bentness by difference sets, relative difference sets, and structures of finite geometries

A subset $D$ of a finite additive group $G$ is called a $(|G|, |D|, \lambda)$-difference set in $G$ if every
nonzero element in $G$ can be written in exactly $\lambda$ ways as the difference between two
elements of $D$ (which implies $\lambda(|G| - 1) = |D|(|D| - 1)$). Equivalently, the incidence
matrix $[D]$ defined by $[D]_{u,v} = 1$ if $u + v \in D$ and $[D]_{u,v} = 0$ otherwise satisfies
$[D]^2 = (|D| - \lambda)I + \lambda J$, where $I$ is the identity matrix and $J$ the all-1 matrix [440].
Then $G \setminus D$ is also a difference set. Moreover, for any $g \in G$, $g + D$ is a difference set,
called a translate of $D$ (we shall see in Section 6.1.9 that the set of all translates forms a
symmetric block design).
It is observed in [441, 1005] that a Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2$ is bent if and only
if its support $\mathrm{supp}(f)$ is a nontrivial difference set in the elementary Abelian 2-group $\mathbb{F}_2^n$.
It is known from Mann [824] that the parameters of such a difference set must then be
$(|G|, |D|, \lambda) = (2^n,\ 2^{n-1} \pm 2^{\frac n2 - 1},\ 2^{n-2} \pm 2^{\frac n2 - 1})$. Such a difference set is called a Hadamard
difference set.
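
The difference set property can be checked directly on a small case. The sketch below assumes $n = 4$ and the bent function $f = x_1x_2 \oplus x_3x_4$ (an illustrative choice), and counts in how many ways each nonzero element of $\mathbb{F}_2^4$ is a difference of two elements of the support; in characteristic 2, the difference is a bitwise XOR.

```python
from collections import Counter

n = 4

def f(x):
    # f = x1 x2 + x3 x4 on bits (x1, x2, x3, x4) of the integer x.
    return ((x & 1) & (x >> 1) & 1) ^ ((x >> 2) & (x >> 3) & 1)

D = [x for x in range(2 ** n) if f(x) == 1]

# Count ordered pairs (d1, d2) of distinct elements of D with d1 - d2 = g.
diff_count = Counter(d1 ^ d2 for d1 in D for d2 in D if d1 != d2)

# Expected parameters for |D| = 2^(n-1) - 2^(n/2-1).
lam = 2 ** (n - 2) - 2 ** (n // 2 - 1)
print(len(D), lam, sorted(set(diff_count.values())))
```

Here $(|G|, |D|, \lambda) = (16, 6, 2)$, which also satisfies the counting identity $\lambda(|G| - 1) = |D|(|D| - 1)$.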
Note that the EA equivalence of two bent functions does not necessarily imply the
equivalence of the related difference sets (see, e.g., [695, page 265]).
A subset $R$ of a finite additive group $G$ is called a $\left(\frac{|G|}{|N|}, |N|, |R|, \lambda\right)$ relative difference
set in $G$ relative to a subgroup $N$ of $G$ if every element in $G \setminus N$ can be written in exactly
$\lambda$ ways as the difference between two elements of $R$ and no nonzero element of $N$ can be
written this way. An $n$-variable Boolean function is bent if and only if its graph is a relative
difference set relative to $\{0_n\} \times \mathbb{F}_2$. This property extends to vectorial functions. See more
in [965] on the connections between Boolean or vectorial functions and such structures.
In [428], the author also characterized some bent functions by means of the notion of
dimensional doubly dual hyperoval, in finite geometry.

6.1.7 The dual of a bent Boolean function


Like linear codes, bent functions come in pairs:

Definition 51 For every even $n$ and every bent $n$-variable Boolean function $f$, the dual
function $\tilde f$ of $f$ is defined by
\[ \forall u \in \mathbb{F}_2^n, \quad W_f(u) = 2^{\frac n2} (-1)^{\tilde f(u)}. \]

Proposition 69 [441, 1005] The dual of any bent function is also bent, and its own dual is
$f$ itself.

Indeed, the inverse Walsh transform property (2.43), page 59, gives, for every $a \in \mathbb{F}_2^n$:
$\sum_{u\in\mathbb{F}_2^n} (-1)^{\tilde f(u) \oplus a\cdot u} = 2^{\frac n2} (-1)^{f(a)}$.
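
The definition of the dual can be turned into a short computation: read off the signs of the Walsh transform. The sketch below assumes $n = 4$ and a Maiorana–McFarland-type function $f(x, y) = x \cdot \pi(y) \oplus g(y)$ with $x, y \in \mathbb{F}_2^2$; the permutation `PI`, the function `G`, and the packing `z = x | (y << 2)` are hypothetical choices for illustration only.

```python
n = 4
PI = [0, 2, 3, 1]   # a permutation of F_2^2, packed as integers (hypothetical choice)
G = [0, 0, 0, 1]    # g(y) = y1 y2

def dot(u, v):
    # Inner product in F_2^n for vectors packed as integers.
    return bin(u & v).count("1") & 1

def f(z):
    x, y = z & 3, z >> 2          # z encodes the pair (x, y)
    return dot(x, PI[y]) ^ G[y]

def walsh(func):
    # W_func(u) = sum over x of (-1)^(func(x) + u.x)
    return [sum((-1) ** (func(x) ^ dot(u, x)) for x in range(2 ** n))
            for u in range(2 ** n)]

def dual(func):
    w = walsh(func)
    assert all(abs(v) == 2 ** (n // 2) for v in w), "function is not bent"
    table = [0 if v > 0 else 1 for v in w]   # W(u) = 2^(n/2) (-1)^dual(u)
    return lambda u: table[u]

f_dual = dual(f)            # dual of f
f_dual_dual = dual(f_dual)  # dual of the dual

print(all(f_dual_dual(z) == f(z) for z in range(2 ** n)))
```

For this family, summing over $x$ first shows that the dual is $\tilde f(x, y) = y \cdot \pi^{-1}(x) \oplus g(\pi^{-1}(x))$, which the computed table reproduces.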
Let $f$ and $g$ be two bent functions, then Relation (2.46), page 60, applied with $\varphi(x) =
f_\chi(x) = (-1)^{f(x)}$ and $\psi = g_\chi$, shows that
\[ \mathcal{F}(\tilde f \oplus \tilde g) = \mathcal{F}(f \oplus g). \tag{6.2} \]

Thus, $f \oplus g$ and $\tilde f \oplus \tilde g$ have the same Hamming weight and:

Proposition 70 [209, 212] The mapping $f \mapsto \tilde f$ is an isometry of the class of bent
$n$-variable Boolean functions.

Remark. This isometry clearly cannot be extended into an isometry of the whole space
$\mathcal{BF}_n$. Indeed, there would exist then a permutation $\pi$ of $\mathbb{F}_2^n$ and an $n$-variable Boolean
function $g$ such that $\tilde f = f \circ \pi \oplus g$ for every bent function $f$, and the examples of duals of
bent functions we know (with Maiorana–McFarland functions, for instance) show that such
$\pi$, $g$ do not exist.

The mapping $f \mapsto \tilde f$ also preserves EA equivalence, as originally observed in [441] in
different terms. Indeed, denoting $b \cdot x$ by $\ell_b(x)$, for every linear automorphism $L$, we have
according to Relation (2.58), page 63, that $\widetilde{f \circ L} = \tilde f \circ L^*$, where $L^*$ is the adjoint operator of
$L^{-1}$, and, for every $a, b \in \mathbb{F}_2^n$, we have according to Lemma 4, page 58, that
$\widetilde{f \circ t_b \oplus \ell_a} = \tilde f \circ t_a \oplus \ell_b \oplus a \cdot b$, where $t_a$ is the translation by $a$.
Relation (6.2), applied with $g(x) = f(x + b) \oplus a \cdot x$, gives
\[ \mathcal{F}(D_a \tilde f \oplus \ell_b) = (-1)^{a\cdot b}\, \mathcal{F}(D_b f \oplus \ell_a), \tag{6.3} \]
and applied with $g(x) = f(x) \oplus \ell_a(x)$ and with $f(x + b)$ in the place of $f(x)$, it gives the
following property, first observed in [219] (and rediscovered in [193]):
\[ \mathcal{F}(D_a \tilde f \oplus \ell_b) = \mathcal{F}(D_b f \oplus \ell_a). \tag{6.4} \]
This implies in particular the following relation that we shall need in the sequel:
\[ \sum_{a,b\in\mathbb{F}_2^n} \mathcal{F}(D_a \tilde f \oplus \ell_b) = \sum_{a,b\in\mathbb{F}_2^n} \mathcal{F}(D_b f \oplus \ell_a). \tag{6.5} \]

Moreover, from Relations (6.3) and (6.4), we deduce

Proposition 71 [236] Let $f$ be any $n$-variable bent function. For every $a, b \in \mathbb{F}_2^n$, let us
denote $\ell_b(x) = b \cdot x$ and $\ell_a(x) = a \cdot x$. Then $D_a \tilde f$ and $D_b f$ satisfy Relation (6.4). Moreover,
if $a \cdot b = 1$, then $\mathcal{F}(D_a \tilde f \oplus \ell_b) = \mathcal{F}(D_b f \oplus \ell_a) = 0$.

In fact, Relation (6.4) is in a way characteristic of bent functions:

Proposition 72 [236] If a pair of $n$-variable Boolean functions $f$ and $f'$ satisfies the
relation $\mathcal{F}(D_a f' \oplus \ell_b) = \mathcal{F}(D_b f \oplus \ell_a)$ for every $a, b \in \mathbb{F}_2^n$, then these functions are bent
and are the dual of each other, up to the addition of a constant function.

Proof Taking $a = 0_n$ in the equality $\mathcal{F}(D_a f' \oplus \ell_b) = \mathcal{F}(D_b f \oplus \ell_a)$ shows that $D_b f$ is
balanced for every $b \ne 0_n$ and taking $b = 0_n$ shows that $D_a f'$ is balanced for every $a \ne 0_n$.
This proves the first assertion. Let us sum up the relations $\mathcal{F}(D_a f' \oplus \ell_b) = \mathcal{F}(D_b f \oplus \ell_a)$
for $b$ ranging over $\mathbb{F}_2^n$. We obtain the equalities
\[ \sum_{x,b\in\mathbb{F}_2^n} (-1)^{f'(x)\oplus f'(x+a)\oplus b\cdot x} = \sum_{x,b\in\mathbb{F}_2^n} (-1)^{f(x)\oplus f(x+b)\oplus a\cdot x} = \sum_{x,y\in\mathbb{F}_2^n} (-1)^{f(x)\oplus f(y)\oplus a\cdot x} = W_f(0_n) \times W_f(a), \]
and this gives $2^n (-1)^{f'(0_n)\oplus f'(a)} = 2^n (-1)^{\tilde f(0_n)\oplus \tilde f(a)}$, that is,
$f'(0_n) \oplus f'(a) = \tilde f(0_n) \oplus \tilde f(a)$, for every $a$.

Notice that, for every $a$ and $b$, we have $D_b f = \ell_a \oplus \epsilon$ if and only if $D_a \tilde f = \ell_b \oplus \epsilon$.


Rothaus already observed that “many” bent functions are equal to their duals, i.e., are
self-dual bent functions. The characterization of self-dual bent functions is an open problem,
partially addressed in [265, 502, 626] (the latter reference classifies self-dual bent quadratic
functions under the action of the orthogonal group, i.e., the group of $n \times n$ matrices $M$
such that $MM^t = I$). See also [806]. It is observed in [265] that a Boolean $n$-variable
function is self-dual bent or anti-self-dual bent (i.e., bent such that $\tilde f = f \oplus 1$) if and
only if its so-called Rayleigh quotient
$\sum_{x,y\in\mathbb{F}_2^n} (-1)^{f(x)\oplus f(y)\oplus x\cdot y} = \sum_{x\in\mathbb{F}_2^n} (-1)^{f(x)}\, W_f(x)$
has maximal modulus (that is, has modulus $2^{\frac{3n}{2}}$), which is easier to handle in the case of
quadratic functions: [626] uses that the associated symplectic matrix (see the footnote at page
171) is then involutive.
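
The Rayleigh quotient criterion is easy to try on small examples. The sketch below assumes $n = 4$ and three bent functions built from $x_1x_2 \oplus x_3x_4$ by adding linear terms (illustrative choices); the Walsh transform is computed with a standard in-place butterfly.

```python
n = 4

def walsh(tt):
    # Fast Walsh transform of a truth table: W(u) = sum_x (-1)^(f(x) + u.x).
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def rayleigh(tt):
    # Rayleigh quotient: sum_x (-1)^f(x) W_f(x).
    w = walsh(tt)
    return sum((1 - 2 * tt[x]) * w[x] for x in range(len(tt)))

# f1 = x1 x2 + x3 x4 (self-dual)
f1 = [((x & 1) & (x >> 1) & 1) ^ ((x >> 2) & (x >> 3) & 1) for x in range(2 ** n)]
# f2 = f1 + x1 (bent, neither self-dual nor anti-self-dual)
f2 = [b ^ (x & 1) for x, b in enumerate(f1)]
# f3 = f1 + x1 + x2 (anti-self-dual)
f3 = [b ^ (x & 1) ^ ((x >> 1) & 1) for x, b in enumerate(f1)]

print(rayleigh(f1), rayleigh(f2), rayleigh(f3))
```

Only `f1` and `f3` reach the maximal modulus $2^{3n/2} = 64$, with opposite signs for the self-dual and anti-self-dual cases.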

Remark. Since Boolean functions can be expressed in different forms, the question of
moving from one form to another is important. For general functions, we have addressed
this question at page 47. Regarding the duals, we have the following easily proved lemma
(see, e.g., [311]), in which an autodual basis is a pair $(u, v)$ such that
$tr^n_{n/2}(u) = tr^n_{n/2}(v) = 1$ and $tr^n_{n/2}(uv) = 0$.

Lemma 6 Let $n$ be even and $m = \frac n2$. Let $(u, v)$ be an autodual basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^m}$. Let
$f$ be bent over $\mathbb{F}_{2^n}$ and $g(x, y) = f(ux + vy)$, $x, y \in \mathbb{F}_{2^m}$.
Then

\[ W_f(au + bv) = W_g(a, b), \]

where $W_f$ is calculated with respect to the inner product $X \cdot Y = tr_n(XY)$ and $W_g$ is
calculated with respect to the inner product $(x, y) \cdot (x', y') = tr_m(xx' + yy')$. Hence, if $f$ is
bent, then $\tilde f(au + bv) = \tilde g(a, b)$.

Numerical normal form of the dual


The numerical normal form of $\tilde f$ can be deduced from that of $f$. Indeed, using equality
$\tilde f = \frac{1-(-1)^{\tilde f}}{2}$, we have $\tilde f = \frac12 - 2^{-\frac n2 - 1} W_f = \frac12 - 2^{\frac n2 - 1} \delta_0 + 2^{-\frac n2} \widehat{f}$. Applying now to
$\varphi = f$ Relation (2.59), page 66, expressing the value of the Fourier–Hadamard transform
by means of the coefficients of the NNF, we deduce that if $\sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$ is the NNF of
$f$, then
\[ \tilde f(x) = \frac12 - 2^{\frac n2 - 1} \delta_0(x) + (-1)^{w_H(x)} \sum_{I \subseteq \{1,\dots,n\};\ \mathrm{supp}(x) \subseteq I} 2^{\frac n2 - |I|} \lambda_I. \]

Changing $I$ into $\{1, \dots, n\} \setminus I$ in this relation, and observing that $\mathrm{supp}(x)$ is included in
$\{1, \dots, n\} \setminus I$ if and only if $x_i = 0$, $\forall i \in I$, we obtain the NNF of $\tilde f$ by expanding the
following relation:
\[ \tilde f(x) = \frac12 - 2^{\frac n2 - 1} \prod_{i=1}^{n} (1 - x_i) + (-1)^{w_H(x)} \sum_{I \subseteq \{1,\dots,n\}} 2^{|I| - \frac n2} \lambda_{\{1,\dots,n\} \setminus I} \prod_{i \in I} (1 - x_i). \tag{6.6} \]

We deduce again that, for every $I \ne \{1, \dots, n\}$ such that $|I| > \frac n2$, the coefficient of $x^I$ in the
NNF of $\tilde f$ (resp. of $f$) is divisible by $2^{|I| - \frac n2}$.
n

Reducing Relation (6.6) modulo 2 proves Rothaus’ bound (see Theorem 13 below) and
the following fact:

Proposition 73 [1005] Let $n \ge 4$ be even and $f$ be any $n$-variable bent Boolean function.
For every $I \subset \{1, \dots, n\}$ such that $|I| = \frac n2$, the coefficient of $x^I$ in the ANF of $\tilde f$ equals the
coefficient of $x^{\{1,\dots,n\} \setminus I}$ in the ANF of $f$.
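
Proposition 73 can be tested exhaustively on a small family. The sketch below assumes $n = 4$ and runs over all functions with a quadratic ANF and no constant term (a convenient family chosen for this example), keeps the bent ones, computes their duals from the signs of the Walsh transform, and compares ANF coefficients of degree $n/2$. Monomials $x^I$ are indexed by the bitmask of $I$.

```python
from itertools import combinations

n = 4
FULL = 2 ** n - 1
PAIRS = list(combinations(range(n), 2))   # index pairs (i, j) with i < j

def walsh(tt):
    # Fast Walsh transform of a truth table.
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def anf(tt):
    # Binary Moebius transform: a[I] is the ANF coefficient of the monomial x^I.
    a = list(tt)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j + h] ^= a[j]
        h *= 2
    return a

def truth_table(quad_mask, lin_mask):
    tt = []
    for x in range(2 ** n):
        bit = bin(lin_mask & x).count("1") & 1
        for k, (i, j) in enumerate(PAIRS):
            if (quad_mask >> k) & 1:
                bit ^= (x >> i) & (x >> j) & 1
        tt.append(bit)
    return tt

checked, holds = 0, True
for q in range(2 ** len(PAIRS)):
    for l in range(2 ** n):
        tt = truth_table(q, l)
        w = walsh(tt)
        if any(abs(v) != 2 ** (n // 2) for v in w):
            continue                       # not bent
        dual_tt = [0 if v > 0 else 1 for v in w]
        a_f, a_dual = anf(tt), anf(dual_tt)
        for I in range(2 ** n):
            if bin(I).count("1") == n // 2:
                holds = holds and (a_dual[I] == a_f[FULL ^ I])
        checked += 1

print(checked, holds)
```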

Using Relation (2.24), page 50, expressing the NNF by means of the ANF, Equality (6.6)
can be related to the main result of [619] (but this result by Hou is stated in a complex way).
The Poisson summation formula (2.39), page 58, applied to $\varphi = f_\chi = (-1)^f$ gives
(see [212]) that for every vector subspace $E$ of $\mathbb{F}_2^n$, and for all elements $a$ and $b$ of $\mathbb{F}_2^n$, we
have
\[ \sum_{x \in a+E} (-1)^{\tilde f(x) \oplus b\cdot x} = 2^{-\frac n2}\, |E|\, (-1)^{a\cdot b} \sum_{x \in b+E^\perp} (-1)^{f(x) \oplus a\cdot x}. \tag{6.7} \]
In particular, $f(x) \oplus a \cdot x$ is constant on $b + E^\perp$ if and only if $\sum_{x \in a+E} (-1)^{\tilde f(x) \oplus b\cdot x} = \pm 2^{\frac n2}$,
and if $E$ has dimension $\frac n2$, this is equivalent to the fact that $\tilde f(x) \oplus b \cdot x$ is constant on $a + E$ (with
the same value on $a + E$ as $f(x) \oplus a \cdot x$ on $b + E^\perp$ if $a \cdot b = 0$). Note that if $f(0_n) = 0$
and $b = 0_n$, this means that the constant value of $\tilde f(x)$ on $a + E$ is zero. This is particularly
interesting when $f$ is self-dual.

6.1.8 Bound on algebraic degree and related properties


The algebraic degree of any Boolean function f being equal to the maximum size of the
multi-index I such that x I has an odd coefficient in the NNF of f , Proposition 67, page 195,
gives:

Theorem 13 [441, 1005] Let $n \ge 4$ be an even integer. The algebraic degree of any bent
function on $\mathbb{F}_2^n$ is at most $\frac n2$.

In the case that n = 2, the bent functions have degree 2, since they have odd Hamming
weight (in fact, they are the functions of odd weights).
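
For $n = 4$, Rothaus' bound can be confirmed by brute force. The sketch below enumerates every 4-variable truth table whose weight is $2^{n-1} \pm 2^{n/2-1}$ (a necessary condition for bentness), keeps those with a flat Walsh spectrum, and collects their algebraic degrees; the integer encoding of truth tables is an assumption of this example.

```python
n = 4
SIZE = 2 ** n

def walsh(tt):
    # Fast Walsh transform of a truth table.
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def degree(tt):
    # Algebraic degree via the binary Moebius transform (ANF coefficients).
    a = list(tt)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j + h] ^= a[j]
        h *= 2
    return max((bin(m).count("1") for m, c in enumerate(a) if c), default=0)

bent = []
for code in range(2 ** SIZE):
    if bin(code).count("1") not in (6, 10):   # bent weights: 2^(n-1) -/+ 2^(n/2-1)
        continue
    tt = [(code >> x) & 1 for x in range(SIZE)]
    if all(abs(v) == 2 ** (n // 2) for v in walsh(tt)):
        bent.append(tt)

degrees = {degree(tt) for tt in bent}
print(len(bent), degrees)
```

All bent functions found have degree exactly $2 = n/2$, in agreement with the statement made later that bent functions in four variables are quadratic.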
The minimal possible Hamming distance between two bent $n$-variable functions is $2^{\frac n2}$,
since this is the minimum distance of $RM(\frac n2, n)$ (see Theorem 7), and since such distance is
achieved by bent functions.
The bound of Theorem 13 is called Rothaus' bound. It shows, as observed by Dillon
and Rothaus, that $n$-variable bent functions of algebraic degree $n/2$ cannot be the direct
sums (see page 232) of (necessarily bent) functions in fewer variables. Theorem 13 can also
be proved with the same method as for proving Theorem 2, page 63, which also allows
obtaining a bound, shown in [620], relating the gaps between $\frac n2$ and the algebraic degrees
of $f$ and $\tilde f$:

Proposition 74 The algebraic degrees of any $n$-variable bent function and of its dual
satisfy
\[ \frac n2 - d_{alg}(f) \ \ge\ \frac{\frac n2 - d_{alg}(\tilde f)}{d_{alg}(\tilde f) - 1}. \tag{6.8} \]

Proof of Proposition 74 and alternative proof of Theorem 13: Let us denote by $d$ (resp.
by $\tilde d$) the algebraic degree of $f$ (resp. of $\tilde f$) and consider a term $x^I$ of degree $d$ in the ANF
of $f$. The Poisson summation formula (2.40), page 59, applied to $\varphi = f_\chi$ and to the vector
space $E = \{u \in \mathbb{F}_2^n ;\ \forall i \in I,\ u_i = 0\}$ gives $\sum_{u\in E} (-1)^{\tilde f(u)} = 2^{\frac n2 - d} \sum_{x\in E^\perp} f_\chi(x)$. The
orthogonal $E^\perp$ of $E$ equals $\{u \in \mathbb{F}_2^n ;\ \forall i \notin I,\ u_i = 0\}$. According to Relation (2.4), page
33, the restriction of $f$ to $E^\perp$ has odd Hamming weight $w$, thus $\sum_{x\in E^\perp} f_\chi(x) = 2^d - 2w$
is not divisible by 4. Hence, $\sum_{u\in E} (-1)^{\tilde f(u)}$ is not divisible by $2^{\frac n2 - d + 2}$.
We deduce first Theorem 13: suppose that $d > \frac n2$, then $\sum_{u\in E} (-1)^{\tilde f(u)}$ is not even, a
contradiction since $E$ has an even size (indeed, we have $I \ne \{1, \dots, n\}$, because $f$ has
algebraic degree smaller than $n$, since it has even Hamming weight).
We prove now Proposition 74: according to McEliece's theorem (or Ax's theorem), page
156, $\sum_{u\in E} (-1)^{\tilde f(u)}$ is divisible by $2^{\left\lceil \frac{n-d}{\tilde d} \right\rceil}$. We deduce the inequality
$\frac n2 - d + 2 > \left\lceil \frac{n-d}{\tilde d} \right\rceil$, that is, $\frac n2 - d + 1 \ge \frac{n-d}{\tilde d}$, which is equivalent to (6.8).
d

Using Relation (2.22), page 49, instead of Relation (2.4) gives a more precise result than
Theorem 13, shown in [292], which will be given in Section 6.1.18.
Proposition 74 can also be deduced from Proposition 67 and from some divisibility
properties, shown in [292], of the coefficients of the NNFs of Boolean functions of algebraic
degree d.
More on the algebraic degree of bent functions can be said for homogeneous functions
(see page 248).

Remark. The numerical degree of a bent function equals n since the Walsh transform does
not vanish.

6.1.9 Bent Boolean functions and designs


A balanced incomplete block design (BIBD), or 2-design, is a collection of subsets (called
blocks) of the same size in some finite set, such that each pair of distinct elements is included
in the same number $\lambda$ of blocks (then any element is contained in the same number of blocks
as well). A BIBD is symmetric if the number of blocks equals the number of elements (when
$\lambda = 1$, we have a projective plane; the blocks are the lines). As
recalled in [313], at least two designs are associated with any bent function $f$ (cf. [441, 450,
656]):
1. The difference set design $D(f)$, in which the blocks are the translates $c + D$, $c \in \mathbb{F}_2^n$,
   of the support $D = \mathrm{supp}(f) = f^{-1}(1)$ (or of the co-support $f^{-1}(0)$).
   Suppose for instance that $f$ has Hamming weight $2^{n-1} + 2^{n/2-1}$ and that $D = f^{-1}(0)$;
   given a pair $\{x, y\}$ of distinct elements, the number of $c$ such that $\{x, y\} \subset c + D$ equals
   $|\{c \in \mathbb{F}_2^n ;\ f(x + c) = f(y + c) = 0\}|$, that is,
   $w_H(f \oplus 1) - \frac{w_H(D_{x+y}(f \oplus 1))}{2} = 2^{n-2} - 2^{n/2-1}$
   (since we have $|(x + D) \cap (y + D)| = |D| - \frac{|(x+D)\,\Delta\,(y+D)|}{2}$).
2. The code design $C(f)$, in which the blocks are the supports $D_c$ of the functions
   $f(x) \oplus c \cdot x \oplus \epsilon$, where $\epsilon$ is chosen such that $w_H(f(x) \oplus c \cdot x \oplus \epsilon) = 2^{n-1} - 2^{n/2-1}$.
   That is, $\epsilon = \tilde f(c)$; hence $D_c = \{x ;\ f(x) \oplus c \cdot x \oplus \tilde f(c) = 1\}$; this design has the
   same parameters as the difference set design (designs with such parameters are called
   Menon designs): denoting $l_x(c) = c \cdot x$, the number of those $c$ such that a pair $\{x, y\}$
   of distinct elements is included in $D_c$ equals
   \[ w_H\big((\tilde f \oplus l_x \oplus f(x))(\tilde f \oplus l_y \oplus f(y))\big) = \frac{w_H(\tilde f \oplus l_x \oplus f(x)) + w_H(\tilde f \oplus l_y \oplus f(y)) - w_H(l_{x+y} \oplus f(x) \oplus f(y))}{2} \]
   \[ = \frac{2^{n-1} - 2^{n/2-1} + 2^{n-1} - 2^{n/2-1} - 2^{n-1}}{2} = 2^{n-2} - 2^{n/2-1}. \]
   $D(f)$ admits all translations as automorphisms, but $C(f)$ has no obvious automorphism.
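
Both designs can be built explicitly for a small bent function and the pair-counting property verified. The sketch below assumes $n = 4$ and $f = x_1x_2 \oplus x_3x_4 \oplus x_1x_3$ (an illustrative bent function of weight 6); the helper names are hypothetical.

```python
from itertools import combinations

n = 4
POINTS = range(2 ** n)

def dot(u, v):
    return bin(u & v).count("1") & 1

def f(x):
    x1, x2, x3, x4 = x & 1, (x >> 1) & 1, (x >> 2) & 1, (x >> 3) & 1
    return (x1 & x2) ^ (x3 & x4) ^ (x1 & x3)

tt = [f(x) for x in POINTS]
walsh = [sum((-1) ** (tt[x] ^ dot(u, x)) for x in POINTS) for u in POINTS]
dual = [0 if v > 0 else 1 for v in walsh]     # W_f(u) = 2^(n/2) (-1)^dual(u)

support = [x for x in POINTS if tt[x] == 1]

# Difference set design D(f): blocks are the translates c + supp(f).
D_blocks = [frozenset(c ^ x for x in support) for c in POINTS]

# Code design C(f): block D_c is the support of f(x) + c.x + dual(c).
C_blocks = [frozenset(x for x in POINTS if (tt[x] ^ dot(c, x) ^ dual[c]) == 1)
            for c in POINTS]

def pair_counts(blocks):
    # Set of values taken by "number of blocks containing {x, y}" over all pairs.
    return {sum(1 for b in blocks if x in b and y in b)
            for x, y in combinations(POINTS, 2)}

print(pair_counts(D_blocks), pair_counts(C_blocks))
```

In both cases every pair lies in exactly $2^{n-2} - 2^{n/2-1} = 2$ blocks and every block has $6$ points, the Menon parameters for $n = 4$.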

Related notions of equivalence can then be studied: two bent functions f and g could
be called “difference set design equivalent” if D(f ) and D(g) are isomorphic designs, and
“code design equivalent” if C(f ) and C(g) are isomorphic designs.
Note that the designs $D(f)$ and $C(f)$ are equal if and only if $f$ is quadratic. Indeed, the
quadratic bent functions have the property that for every linear function $l(x)$, the function
$f(x) \oplus l(x)$ equals $f(x + a) \oplus \epsilon$, for some $a \in \mathbb{F}_2^n$ and some $\epsilon \in \mathbb{F}_2$. The set
$\{D_a f,\ a \in \mathbb{F}_2^n\} + \mathbb{F}_2$ equals then the Reed–Muller code of order 1; this allows proving that $D(f) = C(f)$.
Conversely, $D(f) = C(f)$ for a bent Boolean function implies that all derivatives have
algebraic degree at most 1, which is equivalent to “$f$ is quadratic.”

6.1.10 Bent Boolean functions and affine subspaces


The Poisson summation formula (2.39), page 58, applied on $f$ or on $f_\chi$ with $a = 0_n$,
shows that the size of the intersection between the support $D$ of an $n$-variable bent function and
a $k$-dimensional affine subspace $b + E$ of $\mathbb{F}_2^n$, where $k \ge n/2$, equals
$2^{k-n} \big( 2^{n-1} - 2^{n/2-1} \sum_{u\in E^\perp} (-1)^{\tilde f(u)\oplus b\cdot u} \big)$ and lies then between $2^{k-1} - 2^{n/2-1}$ and $2^{k-1} + 2^{n/2-1}$, as
observed by Dillon. This implies that $D$ can contain $b + E$ or be disjoint from $b + E$ only
if $k = n/2$, and that if $D$ contains $b + E$ (resp. is disjoint from $b + E$), then $D$ has balanced
intersection with any proper coset, and $D \setminus (b + E)$ (resp. $D \cup (b + E)$) is also a difference
set. Studying the intersection of the supports of bent functions and affine spaces results in
studying the sums of bent functions and indicators of flats:

Theorem 14 [212] Let $b + E$ be any flat in $\mathbb{F}_2^n$ ($E$ being a linear subspace of $\mathbb{F}_2^n$). Let $f$
be any bent function on $\mathbb{F}_2^n$ ($n$ even). The function $f^* = f \oplus 1_{b+E}$ is bent if and only if one
of the following equivalent conditions is satisfied:

1. For any $a$ in $\mathbb{F}_2^n \setminus E$, the function $D_a f$ is balanced on $b + E$.
2. The restriction of the function $\tilde f(x) \oplus b \cdot x$ to any coset of $E^\perp$ is either constant or
   balanced.

If $f$ and $f^*$ are bent, then $E$ has dimension larger than or equal to $n/2$ and the algebraic
degree of the restriction of $f$ to $b + E$ is at most $\dim(E) - \frac n2 + 1$.
If $f$ is bent, $E$ has dimension $\frac n2$, and the restriction of $f$ to $b + E$ has algebraic degree
at most $\dim(E) - \frac n2 + 1 = 1$, i.e., is affine, then conversely $f^*$ is bent too.

Proof The equivalence between Condition 1 and the bentness of $f^*$ is directly
deduced from the fact that $\mathcal{F}(D_a f^*)$ equals $\mathcal{F}(D_a f)$ if $a \in E$, and equals
$\mathcal{F}(D_a f) - 4 \sum_{x\in b+E} (-1)^{D_a f(x)}$ otherwise (since when $a \notin E$, the cosets $b + E$ and $b + a + E$ are
disjoint, and $D_a f$ takes the same values on both of them).
Condition 2 is also necessary and sufficient, since we have
$W_f(a) - W_{f^*}(a) = 2 \sum_{x\in b+E} (-1)^{f(x)\oplus a\cdot x}$, and using Relation (6.7), page 200, applied with $E^\perp$ in the place
of $E$, we have then, for every $a \in \mathbb{F}_2^n$:
\[ \sum_{u\in a+E^\perp} (-1)^{\tilde f(u)\oplus b\cdot u} = 2^{\dim(E^\perp) - \frac n2 - 1}\, (-1)^{a\cdot b} \big( W_f(a) - W_{f^*}(a) \big). \]
Then $W_f(a) - W_{f^*}(a)$ takes value 0 or $\pm 2^{\frac n2 + 1}$ for every $a$ (which is necessary, and is
sufficient according to Lemma 5, page 190) if and only if Condition 2 is satisfied.
Let us now assume that $f$ and $f^*$ are bent. Then $1_{b+E} = f^* \oplus f$ has algebraic degree at
most $\frac n2$, according to Rothaus' bound, and thus $\dim(E) \ge \frac n2$.
The values of the Walsh transform of the restriction of $f$ to $b + E$ being equal to those
of $\frac12 \big( W_f - W_{f^*} \big)$, they are divisible by $2^{\frac n2}$ and thus the restriction of $f$ to $b + E$ has algebraic
degree at most $\dim(E) - \frac n2 + 1$, according to Theorem 2.
If $f$ is bent, $E$ has dimension $\frac n2$, and the restriction of $f$ to $b + E$ is affine, then the
relation $W_f(a) - W_{f^*}(a) = 2 \sum_{x\in b+E} (-1)^{f(x)\oplus a\cdot x}$ shows that $f^*$ is bent too, according to
Lemma 5.
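
The last part of Theorem 14 can be illustrated on a small case. The sketch below assumes $n = 4$, $f = x_1x_2 \oplus x_3x_4$, $b = 0_n$, and two 2-dimensional subspaces (all illustrative choices): on `E_good` the restriction of $f$ is affine (here constant), on `E_bad` it is the quadratic function $x_3x_4$.

```python
n = 4

def walsh(tt):
    # Fast Walsh transform of a truth table.
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    return all(abs(v) == 2 ** (n // 2) for v in walsh(tt))

# f = x1 x2 + x3 x4
f = [((x & 1) & (x >> 1) & 1) ^ ((x >> 2) & (x >> 3) & 1) for x in range(2 ** n)]

# E_good = {x : x1 = x3 = 0}: f vanishes on it, hence is affine there.
E_good = {x for x in range(2 ** n) if (x & 1) == 0 and ((x >> 2) & 1) == 0}
# E_bad = {x : x1 = x2 = 0}: f restricts to x3 x4, which is not affine.
E_bad = {x for x in range(2 ** n) if (x & 1) == 0 and ((x >> 1) & 1) == 0}

f_star_good = [f[x] ^ (1 if x in E_good else 0) for x in range(2 ** n)]
f_star_bad = [f[x] ^ (1 if x in E_bad else 0) for x in range(2 ** n)]

print(is_bent(f), is_bent(f_star_good), is_bent(f_star_bad))
```

Adding the indicator of `E_good` keeps the function bent, while adding the indicator of `E_bad` does not, as the degree condition on the restriction predicts.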

Remark. Relation (6.7) applied to $E^\perp$ in the place of $E$, where $E$ is some $\frac n2$-dimensional
subspace, shows that, if $f$ is a bent function on $\mathbb{F}_2^n$, then $f(x) \oplus a \cdot x$ is constant on $b + E$ if
and only if $\tilde f(x) \oplus b \cdot x$ is constant on $a + E^\perp$. The same relation shows that $f(x) \oplus a \cdot x$
is then balanced on every other coset of $E$ and $\tilde f(x) \oplus b \cdot x$ is balanced on every other
coset of $E^\perp$. Notice that Relation (6.7) shows also that $f(x) \oplus a \cdot x$ cannot be constant
on a flat of dimension strictly larger than $\frac n2$ (i.e., that $f$ cannot be $k$-weakly normal with
$k > \frac n2$).

Remark. Let $f$ be bent on $\mathbb{F}_2^n$. Let $a$ and $a'$ be two linearly independent elements of $\mathbb{F}_2^n$. Let
us denote by $E$ the orthogonal of the subspace spanned by $a$ and $a'$. According to Condition
2 in Theorem 14, the function $\tilde f \oplus 1_E$ is bent if and only if $D_a D_{a'} f$ is null (indeed, a 2-
variable function is constant or balanced if and only if it has even Hamming weight, and $f$
has even weight on any coset of the vector subspace spanned by $a$ and $a'$ if and only if, for
every vector $x$, we have $f(x) \oplus f(x + a) \oplus f(x + a') \oplus f(x + a + a') = 0$). This result,
stated in [193] and used in [198, Corollary 15] to design a new class of bent functions, is
then a direct consequence of Theorem 14.

6.1.11 Affine spaces of bent Boolean functions


It is observed in [210] that k-dimensional affine spaces of bent Boolean n-variable functions
with k even correspond to bent functions in n + k variables of a particular form. We shall
denote these affine spaces in the form f + < f1 , . . . , fk >, where < f1 , . . . , fk > denotes
the vector space over F2 spanned by F2 -linearly independent functions f1 , . . . , fk .

Proposition 75 [210] For all positive even integers $n$, $k$, a $k$-dimensional affine space
of Boolean $n$-variable functions $f + \langle f_1, \dots, f_k \rangle$ contains only bent functions if and only
if the Boolean function
\[ h : (x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^k \mapsto \bigoplus_{i=1}^{k/2} \big( y_{2i-1} \oplus f_{2i-1}(x) \big)\big( y_{2i} \oplus f_{2i}(x) \big) \oplus f(x) \]
is bent.

The proof is a generalization of the calculations made in Section 5.3, page 180.

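Proof For every $(a, b) \in \mathbb{F}_2^n \times \mathbb{F}_2^k$, we have
\begin{align*}
W_h(a, b) &= \sum_{(x,y)\in\mathbb{F}_2^n\times\mathbb{F}_2^k} (-1)^{\bigoplus_{i=1}^{k/2} (y_{2i-1}\oplus f_{2i-1}(x)\oplus b_{2i})(y_{2i}\oplus f_{2i}(x)\oplus b_{2i-1}) \oplus f(x)\oplus a\cdot x} \cdot (-1)^{\bigoplus_{i=1}^{k/2} [b_{2i}b_{2i-1}\oplus b_{2i}f_{2i}(x)\oplus b_{2i-1}f_{2i-1}(x)]} \\
&= \sum_{(x,y)\in\mathbb{F}_2^n\times\mathbb{F}_2^k} (-1)^{\bigoplus_{i=1}^{k/2} y_{2i-1}y_{2i} \oplus f(x)\oplus a\cdot x \oplus \bigoplus_{i=1}^{k/2} [b_{2i}b_{2i-1}\oplus b_{2i}f_{2i}(x)\oplus b_{2i-1}f_{2i-1}(x)]} \\
&= 2^{\frac k2} \sum_{x\in\mathbb{F}_2^n} (-1)^{\bigoplus_{i=1}^{k/2} [b_{2i}b_{2i-1}\oplus b_{2i}f_{2i}(x)\oplus b_{2i-1}f_{2i-1}(x)] \oplus f(x)\oplus a\cdot x} \\
&= \pm 2^{\frac k2}\, W_{\bigoplus_{j=1}^{k} b_j f_j \oplus f}(a)
\end{align*}
(by making the changes of variables $y_{2i-1} \mapsto y_{2i-1} \oplus f_{2i-1}(x) \oplus b_{2i}$ and $y_{2i} \mapsto y_{2i} \oplus
f_{2i}(x) \oplus b_{2i-1}$ and using that $\sum_{y\in\mathbb{F}_2^k} (-1)^{\bigoplus_{i=1}^{k/2} y_{2i-1}y_{2i}} = 2^{\frac k2}$). Hence $h$ is bent if and only if
each function $\bigoplus_{j=1}^{k} b_j f_j(x) \oplus f(x)$ is bent.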
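
The smallest case $n = k = 2$ of Proposition 75 can be checked by direct computation. In the sketch below, the functions `f`, `f1`, `f2`, `f2_bad` and the packing `z = x | (y << 2)` are hypothetical choices: the space $f + \langle f_1, f_2 \rangle$ contains only bent (odd-weight) functions, whereas $f + \langle f_1, f_2^{bad} \rangle$ contains the non-bent function $x_2$.

```python
n, k = 2, 2

def walsh(tt):
    # Fast Walsh transform of a truth table.
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt, nvars):
    return all(abs(v) == 2 ** (nvars // 2) for v in walsh(tt))

def build_h(f, f1, f2):
    # h(x, y) = (y1 + f1(x)) (y2 + f2(x)) + f(x), with z = x | (y << n).
    tt = []
    for z in range(2 ** (n + k)):
        x, y = z & 3, z >> 2
        y1, y2 = y & 1, (y >> 1) & 1
        tt.append(((y1 ^ f1(x)) & (y2 ^ f2(x))) ^ f(x))
    return tt

def space_all_bent(f, f1, f2):
    # Check every element f + b1 f1 + b2 f2 of the affine space.
    return all(
        is_bent([f(x) ^ (b1 & f1(x)) ^ (b2 & f2(x)) for x in range(2 ** n)], n)
        for b1 in (0, 1) for b2 in (0, 1)
    )

f = lambda x: (x & 1) & (x >> 1)         # x1 x2
f1 = lambda x: x & 1                     # x1
f2 = lambda x: (x >> 1) & 1              # x2
f2_bad = lambda x: f(x) ^ f2(x)          # x1 x2 + x2, so f + f2_bad = x2

h_good = build_h(f, f1, f2)
h_bad = build_h(f, f1, f2_bad)
print(space_all_bent(f, f1, f2), is_bent(h_good, n + k))
print(space_all_bent(f, f1, f2_bad), is_bent(h_bad, n + k))
```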

Remark. The situation with k-dimensional affine spaces of bent functions is quite different
from what we have with k-dimensional vector spaces of Boolean functions whose nonzero
elements are all bent: these latter vector spaces are in correspondence with bent (n, k)-
functions: their nonzero elements are the component functions of these bent vectorial
functions (see Section 6.4, page 268) and can then exist only if $k \le \frac n2$ (see Proposition 104,
page 269).

An example of application of Proposition 75 is given in [210], providing a large number
of $(m - 2)$-variable bent functions of algebraic degree 4 from any $m$-variable cubic bent
function: let $h$ be any such function; each derivative $D_u h(x)$ is quadratic
and balanced for every $u \ne 0_m$, since $h$ is bent. According to Proposition 55, page 171
(see also the few lines following the proposition), for each $u \ne 0_m$, there exists $v$ such
that $D_v D_u h$ equals the constant function 1, that is, $D_u h(x + v) = D_u h(x) \oplus 1$, that is,
$h(x + u + v) = h(x) \oplus h(x + u) \oplus h(x + v) \oplus 1$, and hence:
\[ \forall x \in \mathbb{F}_2^m,\ \forall y_1, y_2 \in \mathbb{F}_2, \quad h(y_1 u + y_2 v + x) = h(x) \oplus y_1 D_u h(x) \oplus y_2 D_v h(x) \oplus y_1 y_2. \]
(This can be checked for each value of $(y_1, y_2)$.) We can then see that Proposition 75 can be
applied with $n = m - 2$, $k = 2$, by taking for $f$ the restriction of $h(x) \oplus D_u h(x) D_v h(x)$ to
an $(m - 2)$-dimensional vector space $E$ not containing $u$, $v$ nor $u + v$ (identifying then this
vector space with $\mathbb{F}_2^n$), for $f_1$ the restriction of $D_u h$ to $E$ and for $f_2$ the restriction of $D_v h$ to
$E$. We deduce that the two-dimensional affine space $(h \oplus D_u h\, D_v h)_{|E} + \langle D_u h_{|E}, D_v h_{|E} \rangle$
contains only bent functions. These bent functions have algebraic degree 4 in general.

6.1.12 A graph related to bent functions


In [716], the author studies the graph $G$ whose vertices are the bent functions and whose
edges connect vertices at Hamming distance $2^{\frac n2}$ from each other. It is shown that the degree of
any vertex is not more than $2^{\frac n2} \prod_{i=1}^{n/2} (2^i + 1)$, and that this bound is achieved with equality
by quadratic bent functions, and only by them.
The minimal codewords of Reed–Muller codes being indicators of affine spaces (see
Theorem 8, page 154), if two bent functions lie at distance $2^{\frac n2}$ from each other, then
according to Rothaus' bound and to Theorem 14, page 202, they are weakly normal and
they differ by the indicator of the $\frac n2$-dimensional space on which they are affine. Hence, if a
bent function is not weakly normal, there is no bent function at Hamming distance $2^{\frac n2}$ from
it. Since there exist bent functions that are not weakly normal for $n \ge 14$, $G$
is disconnected if $n \ge 14$ (it is connected if $n \le 6$; the question whether it is disconnected
for $8 \le n \le 12$ seems open). Does it remain disconnected when we remove all vertices
corresponding to functions that are not weakly normal? See more in [716].

6.1.13 Bent Boolean functions of low algebraic degrees


Quadratic bent functions
All the quadratic bent functions are known. According to the properties recalled in
Section 5.2, any quadratic function
\[ f(x) = \bigoplus_{1\le i<j\le n} a_{i,j}\, x_i x_j \oplus h(x) \quad (h \text{ affine},\ a_{i,j} \in \mathbb{F}_2) \]
is bent if and only if one of the following equivalent properties is satisfied:

1. Its Hamming weight is equal to $2^{n-1} \pm 2^{\frac n2 - 1}$.
2. Its associated symplectic form $\beta_f : (x, y) \mapsto f(0_n) \oplus f(x) \oplus f(y) \oplus f(x + y)$ is
   nondegenerate (i.e., has kernel $\{0_n\}$).
3. The matrix of this symplectic form, that is, the skew-symmetric matrix $M =
   (m_{i,j})_{i,j\in\{1,\dots,n\}}$ over $\mathbb{F}_2$, defined by $m_{i,j} = a_{i,j}$ if $i < j$, $m_{i,j} = 0$ if $i = j$, and
   $m_{i,j} = a_{j,i}$ if $i > j$, is nonsingular (i.e., has determinant 1).
4. $f(x)$ is equivalent, up to an affine nonsingular transformation, to the function $x_1 x_2 \oplus
   x_3 x_4 \oplus \cdots \oplus x_{n-1} x_n \oplus \epsilon$ ($\epsilon \in \mathbb{F}_2$).
Hence, there is a unique EA equivalence class of quadratic bent functions, as Rothaus and Dillon
already observed in different terms.
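
Characterization 3 gives a cheap bentness test for quadratic functions: compute the rank over $\mathbb{F}_2$ of the symplectic matrix. The sketch below assumes $n = 4$, runs over all 64 quadratic forms without affine part, and compares this test with the Walsh-transform definition; rows of the matrix are packed as bitmasks, which is a choice made for this example.

```python
from itertools import combinations

n = 4
PAIRS = list(combinations(range(n), 2))   # pairs (i, j), i < j, carrying a_{i,j}

def walsh(tt):
    # Fast Walsh transform of a truth table.
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def gf2_rank(rows):
    # Gaussian elimination over F_2 on rows packed as integers.
    rows, rank = list(rows), 0
    for bit in range(n):
        pivot = next((r for r in rows if (r >> bit) & 1), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        rows = [r ^ pivot if (r >> bit) & 1 else r for r in rows]
        rank += 1
    return rank

agree, bent_count = True, 0
for q in range(2 ** len(PAIRS)):
    coeffs = [(i, j) for k, (i, j) in enumerate(PAIRS) if (q >> k) & 1]
    tt = [sum(((x >> i) & (x >> j) & 1) for i, j in coeffs) & 1 for x in range(2 ** n)]
    rows = [0] * n                        # symplectic matrix M, symmetric, zero diagonal
    for i, j in coeffs:
        rows[i] |= 1 << j
        rows[j] |= 1 << i
    nonsingular = gf2_rank(rows) == n
    bent = all(abs(v) == 2 ** (n // 2) for v in walsh(tt))
    agree = agree and (bent == nonsingular)
    bent_count += bent

print(agree, bent_count)
```

Adding any of the $2^{n+1} = 32$ affine functions does not change the symplectic matrix, so each nonsingular form found here accounts for 32 quadratic bent functions.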

Remark. According to these characterizations, there exist (quadratic) bent functions for
every even positive $n$ (we can take the simplest one $x_1 x_2 \oplus \cdots \oplus x_{n-1} x_n$). Thus, the covering
radius of the Reed–Muller code of order 1 equals $2^{n-1} - 2^{\frac n2 - 1}$ when $n$ is even.
n

Note that when $f$ is bent in Proposition 57, page 175, that is, when $E_f$ is the trivial vector
space, $E_f^\perp$ equals the whole space $\mathbb{F}_2^n$ and the linear functions $y \mapsto \beta_f(x, y)$ cover then all
linear forms on $\mathbb{F}_2^n$ (once each) when $x$ ranges over $\mathbb{F}_2^n$. Examples of quadratic bent functions
over $\mathbb{F}_2^n$ are
• The so-called Maiorana–McFarland (see below) quadratic bent functions $f(x, y) =
  x \cdot \pi(y) \oplus h(y)$, where $x, y \in \mathbb{F}_2^{n/2}$ and $\pi$ is an affine permutation of $\mathbb{F}_2^{n/2}$.
• The elementary quadratic symmetric Boolean function $\sigma_2(x) = \binom{w_H(x)}{2}$ [mod 2] $=
  \bigoplus_{1\le i<j\le n} x_i x_j$ (which is, up to the addition of an affine symmetric function, the only
  symmetric bent function; see Section 10.1).
  This function is bent because the kernel of its
  associated symplectic form $\varphi(x, y) = \bigoplus_{1\le i\ne j\le n} x_i y_j$, equal to $\{(x_1, \dots, x_n) \in \mathbb{F}_2^n ;\ \forall i =
  1, \dots, n,\ \bigoplus_{j\ne i} x_j = 0\}$, is reduced to $\{0_n\}$, since $n$ is even.

Quadratic bent functions in trace representation We have seen at page 176 how the
Hamming weight and Walsh transform values of quadratic Boolean functions in trace form
can be calculated.
A generic quadratic function $tr_n\big( \sum_{k=1}^{\frac n2 - 1} a_k x^{2^k+1} \big) + tr_{n/2}\big( a_{n/2}\, x^{2^{n/2}+1} \big) + \ell(x)$, where
$a_1, \dots, a_{\frac n2 - 1} \in \mathbb{F}_{2^n}$, $a_{n/2} \in \mathbb{F}_{2^{n/2}}$ and $\ell$ is affine, is bent if and only if the equation
$\sum_{k=1}^{\frac n2 - 1} \big( a_k x^{2^k} + a_k^{2^{n-k}} x^{2^{n-k}} \big) + a_{n/2}\, x^{2^{n/2}} = 0$ has 0 for the only solution, that is, the linearized
polynomial on the left-hand side is a permutation polynomial.
In the case of Gold Boolean functions $f(x) = tr_n(a x^{2^i+1})$; $a \in \mathbb{F}_{2^n}^*$, Carlitz' result
shows that, for $i = 1$, $f$ is bent if and only if $a$ is not a cube (see page 177). For general $i$,
raising the equation $a x^{2^i} + (ax)^{2^{n-i}} = 0$ to the $2^i$th power gives $a^{2^i} x^{2^{2i}} + ax = 0$. Hence,
$x \ne 0$ is a solution if and only if $(a x^{2^i+1})^{2^i-1} = 1$, that is, $a x^{2^i+1} \in \mathbb{F}_{2^n} \cap \mathbb{F}_{2^i} = \mathbb{F}_{2^{\gcd(i,n)}}$
and since $2^i + 1$ and $2^i - 1$ are coprime and $x \mapsto x^{2^i+1}$ is then a permutation in $\mathbb{F}_{2^{\gcd(i,n)}}$, the
existence of such $x$ is equivalent to that of $x$ such that $x^{2^i+1} = \frac 1a$; function $tr_n(a x^{2^i+1})$ is
then bent if and only if $a \notin \{x^{2^i+1},\ x \in \mathbb{F}_{2^n}\}$. Such $a$ exists if and only if function $x^{2^i+1}$ is not
a permutation on $\mathbb{F}_{2^n}$, that is, $\gcd(2^i + 1, 2^n - 1) \ne 1$, and since $2^{2i} - 1 = (2^i - 1)(2^i + 1)$ and
$2^i - 1$ and $2^i + 1$ are coprime, we have
$\gcd(2^i + 1, 2^n - 1) = \frac{\gcd(2^{2i} - 1,\, 2^n - 1)}{\gcd(2^i - 1,\, 2^n - 1)} = \frac{2^{\gcd(2i,n)} - 1}{2^{\gcd(i,n)} - 1}$;
the condition is then that $\frac{n}{\gcd(i,n)}$ is even. Being quadratic, these functions belong to the
completed Maiorana–McFarland class.
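
The criterion for Gold functions can be checked numerically in a small field. The sketch below assumes $n = 4$ and represents $\mathbb{F}_{16}$ as $\mathbb{F}_2[X]/(X^4 + X + 1)$ with elements packed as 4-bit integers; this representation and the helper names are choices made for the example. It compares bentness of $tr_n(a x^{2^i+1})$, computed from the Walsh transform with inner product $tr_n(ux)$, against the condition $a \notin \{x^{2^i+1}\}$.

```python
N = 4
MOD = 0b10011   # X^4 + X + 1, irreducible over F_2

def gf_mul(a, b):
    # Multiplication in F_16 (shift-and-add with reduction modulo MOD).
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def tr(x):
    # Absolute trace F_16 -> F_2: x + x^2 + x^4 + x^8 (the result is 0 or 1).
    t = 0
    for _ in range(N):
        t ^= x
        x = gf_mul(x, x)
    return t

def is_bent_gold(a, i):
    d = 2 ** i + 1
    for u in range(2 ** N):
        # tr is additive, so tr(a x^d) + tr(u x) = tr(a x^d + u x).
        total = sum(1 - 2 * tr(gf_mul(a, gf_pow(x, d)) ^ gf_mul(u, x))
                    for x in range(2 ** N))
        if abs(total) != 2 ** (N // 2):
            return False
    return True

def image(i):
    # Nonzero values taken by x -> x^(2^i + 1).
    return {gf_pow(x, 2 ** i + 1) for x in range(1, 2 ** N)}

bent_i1 = [a for a in range(1, 2 ** N) if is_bent_gold(a, 1)]
bent_i2 = [a for a in range(1, 2 ** N) if is_bent_gold(a, 2)]
print(len(bent_i1), len(bent_i2))
```

For $i = 1$ the bent cases are exactly the 10 non-cubes of $\mathbb{F}_{16}^*$; for $i = 2$ they are the 12 elements outside $\mathbb{F}_4$, since $x^5$ ranges over $\mathbb{F}_4^*$.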
Another classical example of quadratic bent function is
\[ f(x) = tr_n\Big( \sum_{i=1}^{\frac n2 - 1} x^{2^i+1} \Big) + tr_{\frac n2}\big( x^{2^{n/2}+1} \big). \]
The equation $\sum_{i=1}^{\frac n2 - 1} \big( x^{2^i} + x^{2^{n-i}} \big) + x^{2^{n/2}} = 0$, that is, $x + tr_n(x) = 0$ has indeed clearly 0
as the only solution, since $tr_n(x) \in \mathbb{F}_2$ and $tr_n(1) = 0$.
Quadratic bent functions are studied in [355, 629, 632, 699, 1144] (with the viewpoint
of linearized permutation polynomials in the latter reference) and deduced in [768] from
generalized bent functions.

An example of bivariate bent function over $\mathbb{F}_{2^n}$ for every $n$ is from [236]. Function
\[ f(x, y) = tr_n\big( x^{2^i+1} + y^{2^i+1} + xy \big); \quad x, y \in \mathbb{F}_{2^n},\ \gcd(n, 3) = \gcd(n, i) = 1, \]
is bent. Its associated symplectic form equals
$\beta_f : ((x, y), (x', y')) \mapsto f(0, 0) \oplus f(x, y) \oplus f(x', y') \oplus f(x + x', y + y') = tr_n\big( x^{2^i} x' + x x'^{2^i} + y^{2^i} y' + y y'^{2^i} + x y' + x' y \big)$.
The kernel of $\beta_f$ equals
\[ \Big\{ (x, y) \in \mathbb{F}_{2^n}^2 ;\ x^{2^i} + x^{2^{n-i}} + y = 0 \ \text{and}\ y^{2^i} + y^{2^{n-i}} + x = 0 \Big\}, \]
equal to $\{(0, 0)\}$ since denoting $z = x + y$
we have $z^{2^i} + z^{2^{n-i}} + z = 0$, which implies $z^{2^{2i}} = z^{2^i} + z$ and therefore $z^{2^{3i}} = z$, that is, $z \in
\mathbb{F}_{2^{3i}}$, and therefore $z \in \mathbb{F}_2$ and $z = 1$ being not a solution of the equation $z^{2^i} + z^{2^{n-i}} + z = 0$,
we have $x = y$ and $x^{2^i} + x^{2^{n-i}} + x = 0$, that is, $x = 0$.

Remark. Another representation of Boolean functions, in which, instead of identifying
$x = (x_1, \dots, x_n)$ with a field element (by the use of a basis of the vector space $\mathbb{F}_{2^n}$ over
$\mathbb{F}_2$), we identify $(x_1, \dots, x_{n-1})$ with a field element in $\mathbb{F}_{2^{n-1}}$, and keep $x_n$ in $\mathbb{F}_2$, leads to the
Kerdock code; see Section 6.1.22, where the bent functions leading to this code are given.
The so-called cyclic bent functions, such that, for any $a \ne b \in \mathbb{F}_{2^{n-1}}$ and any $\epsilon \in \mathbb{F}_2$,
$f(a x_1, x_2) + f(b x_1, x_2 + \epsilon)$ is bent (as well as $f$ itself), are proposed in [458], with applica-
tions in codes, codebooks, designs, mutually unbiased bases (MUBs), and sequences.

The unique EA equivalence class of quadratic bent functions has simplest representative
$tr_m(x^{2^m+1})$ in univariate representation and $tr_m(xy)$ in bivariate representation, where $m = \frac n2$.


Cubic bent functions


Any Boolean function f being bent if and only if every derivative of f in a nonzero direction
is balanced, and every quadratic Boolean function being balanced if and only if one of its
derivatives is the constant function 1, we have:

Proposition 76 Let n be any positive integer and f any cubic n-variable Boolean function.
Then f is bent if and only if, for every nonzero a ∈ Fn2 , there exists b ∈ Fn2 such that the
second-order derivative Da Db f equals constant function 1.

Up to an affine transformation, we may assume in Proposition 76 that a = (1, 0, . . . , 0)


and b = (0, 1, 0, . . . , 0) and any cubic bent function is then affinely equivalent to a function
of the form x1 x2 ⊕x1 f1 (x3 , . . . , xn )⊕x2 f2 (x3 , . . . , xn )⊕f3 (x3 , . . . , xn ), but it seems difficult
to go further in the determination of cubic bent functions.
The characterization given by Corollary 12, page 194, simplifies in the case
of a cubic function: denoting the set $\{(a, b) \in (\mathbb{F}_2^n)^2 ;\ D_a D_b f = cst\}$ by $E_f^{(2)}$, we
have
\[ \sum_{x,a,b\in\mathbb{F}_2^n} (-1)^{D_a D_b f(x)} = 2^n \sum_{(a,b)\in E_f^{(2)}} (-1)^{D_a D_b f(0_n)}, \]
since the second-order derivatives of $f$, which are affine, are constant or balanced. Then $f$ is bent if and only if
$\sum_{(a,b)\in E_f^{(2)}} (-1)^{D_a D_b f(0_n)} = 2^n$. Note that, for every $a \in \mathbb{F}_2^n$, the section $\{b \in \mathbb{F}_2^n ;\ (a, b) \in
E_f^{(2)}\}$ of $E_f^{(2)}$ at $a$ equals $E_{D_a f}$ and is then a linear subspace of $\mathbb{F}_2^n$. Hence, $E_f^{(2)}$ is a bilinear
space. Moreover, according to the property observed for quadratic functions at page 171,
and since $D_a f$ is quadratic, function $b \mapsto D_a D_b f(0_n)$ is linear over $E_{D_a f}$ for every $a$, that
is, function $(a, b) \mapsto D_a D_b f(0_n)$ is bilinear over $E_f^{(2)}$.
We deduce that
\[ \sum_{(a,b)\in E_f^{(2)}} (-1)^{D_a D_b f(0_n)} = \sum_{\substack{a\in\mathbb{F}_2^n \\ \forall b\in E_{D_a f},\ D_a D_b f(0_n)=0}} |E_{D_a f}| \]
and that $f$ is bent if and only if
\[ \sum_{\substack{a\in\mathbb{F}_2^n \\ \forall b\in E_{D_a f},\ D_a D_b f(0_n)=0}} |E_{D_a f}| = 2^n, \]
and since for $a = 0_n$ we have $E_{D_a f} = \mathbb{F}_2^n$ and
$\forall b \in E_{D_a f},\ D_a D_b f(0_n) = 0$, this proves again Proposition 76.


However, characterizing the bent functions of algebraic degree 3 (that is, classifying them under the action of the general affine group) is still an open problem. This has been done for $n \leq 6$ in [1005] (see also [968], where the number of bent functions is computed for these values of $n$). For $n = 8$, it has been done in [618] (and completed in [9]); all of these functions have at least one affine derivative $D_a f$, $a \neq 0_n$ (it has been proved in [193] that this happens for $n \leq 8$ only).

6.1.14 Bent Boolean functions in few variables


Bent functions in two variables are the Boolean functions of odd Hamming weight, i.e., of
algebraic degree 2.
Bent functions in four variables are quadratic and therefore known.
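These two statements are small enough to confirm by exhaustive search. The sketch below is an illustration under the assumption that truth tables are indexed by the integer encoding of the input; it enumerates all Boolean functions in two and four variables, keeps the bent ones, and reads their algebraic degree off the ANF.

```python
from itertools import product
from math import isqrt

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))  # 2^(n/2) for even n
    return all(abs(v) == r for v in walsh(tt))

def degree(tt):
    """Algebraic degree, read off the ANF given by the binary Moebius transform."""
    a = list(tt)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j + h] ^= a[j]
        h *= 2
    return max((bin(i).count("1") for i, c in enumerate(a) if c), default=0)

# Exhaustive search: 2^4 functions in two variables, 2^16 in four variables.
bent2 = [t for t in product((0, 1), repeat=4) if is_bent(t)]
bent4 = [t for t in product((0, 1), repeat=16) if is_bent(t)]

print(len(bent2), "bent functions in 2 variables")
print(len(bent4), "bent functions in 4 variables")
```

The four-variable search takes a few seconds in pure Python; the same brute-force approach is not feasible for six variables, which is why the classification described next relies on affine equivalence.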

Bent Boolean functions in six variables


The determination of all bent 6-variable functions has been done in [1005], where a search for all cubic bent functions in six variables was made, i.e., of all 6-variable bent functions of maximal algebraic degree. Rothaus determined, up to affine equivalence, four possible degree 3 parts of cubic Boolean functions in six variables. Determining all 6-variable bent functions was then possible by visiting all $2^{\binom{6}{2}} = 2^{15}$ quadratic parts for each of these four cases. Bart Preneel, in his thesis [968], redid this work and found a fifth class of degree 3 parts, but this fifth class did not give any bent function. It was also proved by R. E. Kibler (as mentioned by Dillon in [440]) and rediscovered 30 years later in [124] (while classifying EA equivalence classes of 6-variable Boolean functions according to some cryptographic properties) that every bent function in six variables is affinely equivalent to a function of the Maiorana–McFarland class (see below). It was later observed in [212] that any bent function of algebraic degree 3 in six variables is affinely equivalent to a function of the form $x_1 x_2 x_3 \oplus x_1 h_1(x_4, x_5, x_6) \oplus x_2 h_2(x_4, x_5, x_6) \oplus x_3 h_3(x_4, x_5, x_6) \oplus g(x_4, x_5, x_6)$, where the mapping $(x_4, x_5, x_6) \mapsto (h_1(x_4, x_5, x_6), h_2(x_4, x_5, x_6), h_3(x_4, x_5, x_6))$ is a permutation and where $h_1 \oplus h_2 \oplus h_3$ is affine (for any function of this form, this double condition is necessary and sufficient). This implies in particular that any bent function in at most six variables is affinely equivalent to its dual.

Bent Boolean functions in eight variables


The (impressive) determination of all bent 8-variable functions has been completed in [743], after Langevin and Leander had enumerated them in [741] (Hou had previously classified cubic bent functions in [618]).

In [347], the authors constructed bent homogeneous functions (i.e., bent functions whose ANFs are the sums of monomials of the same degree) in 12 (or fewer) variables by using invariant theory (which makes the computer searches feasible).

6.1.15 Primary constructions of bent Boolean functions


Except for small values of n, there does not exist a classification of bent functions under
the action of the general affine group, and the structure of the set of bent functions is
not clear. To understand this structure better, and also to have bent functions for applications, researchers have designed constructions. We describe them below. It is not clear whether these constructions give some insight into general bent functions or whether, on the contrary, they draw our attention to peculiar bent functions. Nevertheless, they represent
some important knowledge and have practical interest. Some of the known constructions
are ex nihilo (from scratch). We call them primary constructions and address them in the
present subsection. The others, which use as building blocks previously constructed bent
functions (often called initial functions), and sometimes lead to recursive constructions, are
called secondary constructions. We shall address them in the next subsection.

1. The Maiorana–McFarland original class $\mathcal{M}$ (see [441, 834]) is the set of all the Boolean functions on $\mathbb{F}_2^n = \{(x, y);\ x, y \in \mathbb{F}_2^{n/2}\}$, of the form
$$f(x, y) = x \cdot \pi(y) \oplus g(y), \tag{6.9}$$
where “·” is an inner product in $\mathbb{F}_2^{n/2}$, $\pi$ is any permutation on $\mathbb{F}_2^{n/2}$, and $g$ any Boolean function on $\mathbb{F}_2^{n/2}$.
In bivariate representation, this gives $f(x, y) = \mathrm{tr}_{n/2}(x\,\pi(y)) + g(y)$, where $\pi$ is any permutation polynomial over $\mathbb{F}_{2^{n/2}}$ and $g$ any Boolean function on $\mathbb{F}_{2^{n/2}}$.

Proposition 77 Any function of the form (6.9) is bent if and only if $\pi$ is a permutation. The dual of this bent function then equals $\tilde{f}(a, b) = b \cdot \pi^{-1}(a) \oplus g(\pi^{-1}(a))$, where $\pi^{-1}$ is the inverse permutation of $\pi$.

This is a direct consequence⁸ of Proposition 53, page 166, which writes here
$$W_f(a, b) = 2^{n/2} \sum_{y \in \pi^{-1}(a)} (-1)^{g(y) \oplus b \cdot y}, \tag{6.10}$$
where $\pi^{-1}(a)$ denotes the preimage of $a$ by $\pi$. We see that the dual function of $f$ also belongs to the Maiorana–McFarland class but with its two inputs swapped.
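Proposition 77 and Relation (6.10) can be illustrated numerically. The following sketch assumes $n = 6$, the usual dot product on bit vectors, and a hypothetical permutation `pi` and function `g` chosen only for the example; the pair $(x, y)$ is encoded as the integer $8y + x$.

```python
from math import isqrt

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))  # 2^(n/2) for even n
    return all(abs(v) == r for v in walsh(tt))

def dot(u, v):
    return bin(u & v).count("1") & 1

m = 3                              # n/2
q = 1 << m
pi = [3, 6, 0, 5, 1, 7, 2, 4]      # hypothetical permutation of F_2^3
g = [0, 1, 1, 0, 1, 0, 0, 0]       # hypothetical Boolean function on F_2^3

def mm_truth_table(pi, g):
    """Truth table of f(x, y) = x . pi(y) + g(y), indexed by q*y + x."""
    return [dot(x, pi[y]) ^ g[y] for y in range(q) for x in range(q)]

f = mm_truth_table(pi, g)
W = walsh(f)

# Dual predicted by Proposition 77, indexed by q*b + a.
pi_inv = [0] * q
for y, v in enumerate(pi):
    pi_inv[v] = y
dual = [dot(b, pi_inv[a]) ^ g[pi_inv[a]] for b in range(q) for a in range(q)]

# When pi is not a permutation, some a has an empty preimage and W_f(a, b) = 0.
f_bad = mm_truth_table([0] * q, g)

print(is_bent(f), W == [q * (1 - 2 * d) for d in dual], is_bent(f_bad))
```

The comparison `W == [q * (1 - 2 * d) for d in dual]` is exactly the statement $W_f(a, b) = 2^{n/2} (-1)^{\tilde{f}(a, b)}$.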
As we saw already in Section 5.1, the fundamental idea of Maiorana–McFarland’s construction consists in concatenating affine functions. If we order all the binary words of length $n$ in lexicographic order, with the bit of higher weight on the right, then the truth table of $f$ is the concatenation of the restrictions of $f$ obtained by setting the value of $y$ and letting $x$ freely range over $\mathbb{F}_2^{n/2}$. These restrictions are affine.

8 The input is here cut in two pieces x and y of the same length; cutting it in pieces of different lengths is
addressed in Proposition 79 below; bentness is then obviously not characterized by the bijectivity of π .

Of course, $\mathcal{M}$ is a particular case of the general Maiorana–McFarland construction of Boolean functions seen in Subsection 5.1.1, which has been a generalization of $\mathcal{M}$ first investigated in [181].
Note that function f above is such that, for every function h(y), function f (x, y) ⊕ h(y)
is bent. This property is characteristic of the functions of the form (6.9):

Proposition 78 A Boolean function $f(x, y)$, $x, y \in \mathbb{F}_2^{n/2}$, belongs to class $\mathcal{M}$ if and only if, for every function $h(y)$, the function $f(x, y) \oplus h(y)$ is bent.

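Proof The condition is necessary, according to Proposition 77. For proving that it is also sufficient, let us take $h = \delta_a$ (the indicator of $\{a\}$). For every $a, u, v \in \mathbb{F}_2^{n/2}$, we have that
$$\sum_{x,y \in \mathbb{F}_2^{n/2}} (-1)^{f(x,y) \oplus u \cdot x \oplus v \cdot y \oplus \delta_a(y)} = \sum_{x,y \in \mathbb{F}_2^{n/2}} (-1)^{f(x,y) \oplus u \cdot x \oplus v \cdot y} - 2 \sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{f(x,a) \oplus u \cdot x \oplus v \cdot a},$$
and if $f(x, y)$ and $f(x, y) \oplus \delta_a(y)$ are both bent, then we have $\pm 2^{n/2} = \pm 2^{n/2} \pm 2 \sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{f(x,a) \oplus u \cdot x}$. Hence for every $a, u \in \mathbb{F}_2^{n/2}$, we have $\sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{f(x,a) \oplus u \cdot x} \in \{0, \pm 2^{n/2}\}$. Clearly, having, for some $a$, that $\sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{f(x,a) \oplus u \cdot x} = 0$ for every $u$ is impossible because of Parseval’s relation. Then, for every $a \in \mathbb{F}_2^{n/2}$, there exists $u \in \mathbb{F}_2^{n/2}$ such that $\sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{f(x,a) \oplus u \cdot x} = \pm 2^{n/2}$, that is, $f(x, a) = u \cdot x$ or $f(x, a) = u \cdot x \oplus 1$. Hence every restriction $x \mapsto f(x, a)$ is affine, that is, $f(x, y) = x \cdot \pi(y) \oplus g(y)$ for some mapping $\pi$ and some Boolean function $g$, and $\pi$ is a permutation according to Proposition 77, since $f$ is bent.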
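For $n = 4$ the characterization of Proposition 78 can be confirmed exhaustively. The sketch below is an illustration only: it assumes $x$ is stored in the two low bits and $y$ in the two high bits of the truth-table index, collects the bent functions that remain bent after adding every $h(y)$, and checks that they are exactly the functions of the form (6.9).

```python
from itertools import product
from math import isqrt

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))  # 2^(n/2) for even n
    return all(abs(v) == r for v in walsh(tt))

def bent_for_every_h(t):
    """Check that f + h(y) is bent for all 16 Boolean functions h of y."""
    for hbits in range(16):
        shifted = [t[z] ^ (hbits >> (z >> 2) & 1) for z in range(16)]
        if not is_bent(shifted):
            return False
    return True

def mm_form(t):
    """Return the list pi when every restriction x -> f(x, y) is affine, else None."""
    pi = []
    for y in range(4):
        v = t[4 * y: 4 * y + 4]
        if v[0] ^ v[1] ^ v[2] ^ v[3]:
            return None          # restriction to this y is not affine
        pi.append((v[0] ^ v[1]) | ((v[0] ^ v[2]) << 1))
    return pi

bent = [t for t in product((0, 1), repeat=16) if is_bent(t)]
class_M = [t for t in bent if bent_for_every_h(t)]

print(len(bent), "bent functions,", len(class_M), "stable under adding any h(y)")
```

The count of stable functions matches the number $4! \cdot 2^4$ of pairs $(\pi, g)$, each of which gives a distinct function of the form (6.9).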

When a new method of construction of bent functions is found, it is necessary (for showing that it does not only provide functions that could be obtained with already known methods) to prove that some constructed functions are affinely inequivalent to Maiorana–McFarland functions.⁹ We know thanks to Proposition 54, page 167, that an $n$-variable Boolean function with $n$ even belongs to the completed Maiorana–McFarland class (the smallest possible complete class including $\mathcal{M}$) if and only if there exists an $\frac{n}{2}$-dimensional linear subspace $E$ of $\mathbb{F}_2^n$ such that function $D_a D_b f$ is identically null for every $a, b \in E$. According to Proposition 78, this is also equivalent to the fact that there exists an $\frac{n}{2}$-dimensional affine subspace $A$ and an $\frac{n}{2}$-dimensional linear subspace $E$ of $\mathbb{F}_2^n$ such that every element of $\mathbb{F}_2^n$ can be expressed in a unique way in the form $x + y$, where $x \in E$, $y \in A$, and such that $f \oplus h$ is bent for every function $h(x + y)$ depending only on $y$.
The completed class of M contains all bent functions in at most six variables [440] and
all quadratic bent functions (according to point 4 in the characterization of quadratic bent
functions of page 205, taking π = id and g constant in (6.9)).

Derived classes C and D Two classes of bent functions have been obtained in [212] by
adding to functions of Maiorana–McFarland’s class the indicators of vector subspaces:

9 This should ideally be checked for all known classes of bent functions; this represents much (hard) work;
checking this with class M is usually considered as mandatory because M is simpler and provides the widest
class of bent functions.

– The class, denoted by $\mathcal{D}$, whose elements are the functions of the form $f(x, y) = x \cdot \pi(y) \oplus 1_{E_1}(x) 1_{E_2}(y)$, where $\pi$ is any permutation on $\mathbb{F}_2^{n/2}$ and $E_1$, $E_2$ are two linear subspaces of $\mathbb{F}_2^{n/2}$ such that $\pi(E_2) = E_1^{\perp}$ ($1_{E_1}$ and $1_{E_2}$ denote their indicators). The dual of $f$ belongs to the completed class of $\mathcal{D}$.
A subclass $\mathcal{D}_0$ of $\mathcal{D}$ has for elements the functions of the form $f(x, y) = x \cdot \pi(y) \oplus \delta_0(x)$ (recall that $\delta_0$ is the Dirac symbol). The dual of such $f$ is the function $y \cdot \pi^{-1}(x) \oplus \delta_0(y)$. It is proved in [212] that $\mathcal{D}_0$ is not included¹⁰ in the completed versions of classes $\mathcal{M}$ and $\mathcal{PS}$ and that every bent function in six variables is affinely equivalent to a function of this class, up to the addition of an affine function.
– The class $\mathcal{C}$ of all the functions of the form $f(x, y) = x \cdot \pi(y) \oplus 1_L(x)$, where $L$ is any linear subspace of $\mathbb{F}_2^{n/2}$ and $\pi$ any permutation on $\mathbb{F}_2^{n/2}$ such that, for any element $a$ of $\mathbb{F}_2^{n/2}$, the set $\pi^{-1}(a + L^{\perp})$ is a flat. It is a simple matter to see, as shown in [198], that, under the same hypothesis on $\pi$, if $g$ is a Boolean function whose restriction to every flat $\pi^{-1}(a + L^{\perp})$ is affine, then the function $x \cdot \pi(y) \oplus 1_L(x) \oplus g(y)$ is also bent.
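The subclass $\mathcal{D}_0$ and the stated form of its dual can be checked on a small instance. The sketch below assumes $n = 6$, the usual dot product, and a hypothetical permutation `pi`; it is meant as an illustration of the statement, not as the construction of [212].

```python
from math import isqrt

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))  # 2^(n/2) for even n
    return all(abs(v) == r for v in walsh(tt))

def dot(u, v):
    return bin(u & v).count("1") & 1

q = 8                               # 2^(n/2) with n = 6
pi = [3, 6, 0, 5, 1, 7, 2, 4]       # hypothetical permutation of F_2^3
pi_inv = [0] * q
for y, v in enumerate(pi):
    pi_inv[v] = y

# f(x, y) = x . pi(y) + delta_0(x), indexed by q*y + x.
f = [dot(x, pi[y]) ^ int(x == 0) for y in range(q) for x in range(q)]
W = walsh(f)

# Stated dual y . pi^{-1}(x) + delta_0(y), written in the Walsh variables (a, b)
# paired with (x, y) and indexed by q*b + a.
dual = [dot(b, pi_inv[a]) ^ int(b == 0) for b in range(q) for a in range(q)]

print(is_bent(f), W == [q * (1 - 2 * d) for d in dual])
```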

The fact that any function in class D or class C is bent comes from Theorem 14, page
202. In [822], existence and nonexistence results of such π and L are given for many of
the known classes of permutations, inducing generic methods of constructions. In [1154],
sufficient conditions on π and L so that f is provably outside the completed Maiorana–
McFarland class are found. In particular, it is shown that the C functions described in [822]
do not belong to the completed Maiorana–McFarland class. The more difficult question
whether these functions are also outside the completed PS class remains open.

Maiorana–McFarland construction as a secondary construction

The original Maiorana–McFarland’s construction is a particular case of a more general construction of bent functions, which is a secondary construction for $r < \frac{n}{2}$ and a primary one for $r = \frac{n}{2}$:

Proposition 79 [223] Let $n = r + s$ ($r \leq s$) be even. Let $\phi$ be any mapping from $\mathbb{F}_2^s$ to $\mathbb{F}_2^r$ such that, for every $a \in \mathbb{F}_2^r$, the set $\phi^{-1}(a)$ is an $(n - 2r)$-dimensional affine subspace of $\mathbb{F}_2^s$. Let $g$ be any Boolean function on $\mathbb{F}_2^s$ whose restriction to $\phi^{-1}(a)$ (viewed as a Boolean function on $\mathbb{F}_2^{n-2r}$ via an affine isomorphism between $\phi^{-1}(a)$ and this vector space) is bent for every $a \in \mathbb{F}_2^r$, if $n > 2r$ (no condition on $g$ being imposed if $n = 2r$). Then the function $f_{\phi,g}(x, y) = x \cdot \phi(y) \oplus g(y)$ is bent on $\mathbb{F}_2^n$.

Proof This is a direct consequence of Proposition 53, page 166, which writes
$$W_{f_{\phi,g}}(a, b) = 2^r \sum_{y \in \phi^{-1}(a)} (-1)^{g(y) \oplus b \cdot y}. \tag{6.11}$$

According to Relation (6.11), the function $f_{\phi,g}$ is bent if and only if $r \leq \frac{n}{2}$ and $\sum_{y \in \phi^{-1}(a)} (-1)^{g(y) \oplus b \cdot y} = \pm 2^{\frac{n}{2} - r}$ for every $a \in \mathbb{F}_2^r$ and every $b \in \mathbb{F}_2^s$. The hypothesis in Proposition 79 is a sufficient condition for that (but it is not a necessary one).
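A small instance makes Proposition 79 and Relation (6.11) concrete. The sketch below assumes $n = 6$, $r = 2$, $s = 4$; the partition of $\mathbb{F}_2^4$ into four 2-dimensional flats (not all parallel) and the function `g` are hypothetical choices. On a 2-dimensional flat, being bent simply means having odd weight.

```python
from math import isqrt

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))  # 2^(n/2) for even n
    return all(abs(v) == r for v in walsh(tt))

def dot(u, v):
    return bin(u & v).count("1") & 1

r, s = 2, 4
# Hypothetical partition of F_2^4 into 2^r flats of dimension s - r = 2.
flats = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
phi = [0] * (1 << s)
for a, flat in enumerate(flats):
    for y in flat:
        phi[y] = a                    # phi^{-1}(a) is the a-th flat

# g has weight 1 on every flat, hence is bent on each of them.
g = [int(y in (3, 4, 9, 14)) for y in range(1 << s)]

def build(g):
    """Truth table of f(x, y) = x . phi(y) + g(y), indexed by 4*y + x."""
    return [dot(x, phi[y]) ^ g[y] for y in range(1 << s) for x in range(1 << r)]

f = build(g)
W = walsh(f)

# A variant whose restriction to the first flat has even weight, hence is not bent.
g_bad = list(g)
g_bad[0] ^= 1
f_bad = build(g_bad)

print(is_bent(f), is_bent(f_bad))
```

The Walsh value at $(a, b)$ is stored at index $a + 4b$, which is the layout used when comparing with the right-hand side of (6.11).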

10 We have seen in Proposition 54 that there is a rather simple way to show that a function f does not belong to
the completed class of M; it is more difficult to show that it does not belong to the completed class of PS .

This construction is pretty general: the choice of any partition of $\mathbb{F}_2^s$ in $2^r$ flats of dimension $s - r = n - 2r$ and of an $(n - 2r)$-variable bent function on each of these flats leads to an $n$-variable bent function. Note that $\phi$, defined so that the elements of this partition are the preimages of the elements of $\mathbb{F}_2^r$ by $\phi$, is balanced (i.e., has output uniformly distributed over $\mathbb{F}_2^r$). In fact, it is observed in [802] that, if a bent function has the form $f_{\phi,g}$, then $\phi$ is balanced. This is a direct consequence of the characterization of balanced vectorial functions by Proposition 35, page 112, and of the fact that, for every nonzero $a \in \mathbb{F}_2^r$, the Boolean function $a \cdot \phi$ is balanced, since it equals the derivative $D_{(a, 0_s)} f_{\phi,g}$.
Obviously, every Boolean function can be represented (in several ways) in the form $f_{\phi,g}$ for some values of $r \geq 1$ and $s$ and for some mapping $\phi : \mathbb{F}_2^s \to \mathbb{F}_2^r$ and Boolean function $g$ on $\mathbb{F}_2^s$.

Remark. There exist $\frac{n}{2}$-dimensional vector spaces of $n$-variable Boolean functions whose nonzero elements are all bent. This is equivalent to the existence of bent $(n, \frac{n}{2})$-functions. Maiorana–McFarland’s construction allows constructing such functions, as we shall see at page 270. Dimension $\frac{n}{2}$ is maximal, since a result by K. Nyberg shows that bent $(n, m)$-functions cannot exist for $m > \frac{n}{2}$.

2. The partial spread class $\mathcal{PS}$, introduced in [441] by J. Dillon, is the set of all the sums (modulo 2) of the indicators of $2^{n/2-1}$ or $2^{n/2-1} + 1$ pairwise supplementary subspaces of dimension $\frac{n}{2}$ of $\mathbb{F}_2^n$ (i.e., such that the intersection of any two of them equals $\{0_n\}$, and given their dimension, whose sum is direct and equals $\mathbb{F}_2^n$). A set of pairwise supplementary subspaces is called a partial spread, and a (full) spread if it covers $\mathbb{F}_2^n$. Some $\mathcal{PS}$ functions are built with partial spreads that are parts of full spreads, and some are built with partial spreads that cannot be extended into full spreads (we shall see a quadratic example below).

Proposition 80 Any sum (modulo 2) of the indicators of $2^{n/2-1}$ or $2^{n/2-1} + 1$ pairwise supplementary subspaces of dimension $\frac{n}{2}$ of $\mathbb{F}_2^n$ is a bent function. The dual of such function has the same form, all the $\frac{n}{2}$-dimensional spaces involved in the definition being replaced by their orthogonals.

Definition 52 Class $\mathcal{PS}$ is the set of bent functions defined in Proposition 80. The sums of $2^{n/2-1}$ such indicators constitute subclass $\mathcal{PS}^-$ (whose elements have Hamming weight $2^{n/2-1}(2^{n/2} - 1) = 2^{n-1} - 2^{n/2-1}$) and the sums of $2^{n/2-1} + 1$ of them constitute subclass $\mathcal{PS}^+$ (whose elements have Hamming weight $(2^{n/2-1} + 1)\, 2^{n/2} - 2^{n/2-1} = 2^{n-1} + 2^{n/2-1}$).

We shall see that Proposition 80 is a particular case of a theorem (Theorem 17) that we shall state at page 241. The bentness of the functions in $\mathcal{PS}$ can also be alternatively shown: for each pair of supplementary subspaces $E_i$ and $E_j$ and every $a \in \mathbb{F}_2^n$, the set $E_i \cap (a + E_j)$ is a singleton; this allows proving that, for every nonzero $a$, the product of any function $f(x)$ in $\mathcal{PS}^-$ (resp. in $\mathcal{PS}^+$) with its shifted function $f_a(x) = f(x + a)$ has Hamming weight $2^{n/2-1}(2^{n/2-1} - 1) = 2^{n-2} - 2^{n/2-1}$ if $f(a) = 0$ and $(2^{n/2-1} - 1)(2^{n/2-1} - 2) + 2^{n/2} - 2 = 2^{n-2} - 2^{n/2-1}$ if $f(a) = 1$ (resp. $(2^{n/2-1} + 1)\, 2^{n/2-1} = 2^{n-2} + 2^{n/2-1}$ if $f(a) = 0$ and $2^{n/2-1}(2^{n/2-1} - 1) + 2^{n/2} = 2^{n-2} + 2^{n/2-1}$ if $f(a) = 1$), which implies in all cases that the derivative $D_a f$ (whose Hamming weight equals $2(w_H(f) - w_H(f f_a))$) is balanced, and thus that $f$ is bent, according to Theorem 12, page 192.
The PS − functions built with a full spread are the complements of the elements of PS +
built with the same full spread and vice versa, but the complement of a general PS bent
function is not necessarily in PS .

Remark. The Boolean functions equal to the sums of some number of indicators of pairwise supplementary $\frac{n}{2}$-dimensional subspaces of $\mathbb{F}_2^n$ share with quadratic functions the nice and convenient property of being bent if and only if they have the Hamming weight of a bent function (which is $2^{n-1} \pm 2^{n/2-1}$).
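Proposition 80, the weights in Definition 52, and the remark above can all be observed on a concrete spread. The sketch below is an illustration under stated assumptions: it uses the lines through the origin of the plane $\mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$ for $m = 2, 3$ (a full spread), with field arithmetic modulo the irreducible polynomials $x^2 + x + 1$ and $x^3 + x + 1$, and encodes $(x, y)$ as the integer $x + 2^m y$.

```python
from itertools import combinations
from math import isqrt

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))  # 2^(n/2) for even n
    return all(abs(v) == r for v in walsh(tt))

def gf_mul(a, b, m, poly):
    """Multiplication in F_{2^m}, elements as bit masks, reduction modulo poly."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m & 1:
            a ^= poly
    return r

def spread_lines(m, poly):
    """The 2^m + 1 lines through the origin of F_{2^m} x F_{2^m}."""
    q = 1 << m
    lines = [[q * y for y in range(q)]]                       # x = 0
    lines += [[x + q * gf_mul(a, x, m, poly) for x in range(q)] for a in range(q)]
    return lines

def indicator_sum(lines, n):
    """Sum modulo 2 of the indicators of the given subspaces."""
    tt = [0] * (1 << n)
    for line in lines:
        for p in line:
            tt[p] ^= 1
    return tt

results = {}
for m, poly in ((2, 0b111), (3, 0b1011)):
    n, k = 2 * m, 1 << (m - 1)
    spread = spread_lines(m, poly)
    minus = [indicator_sum(c, n) for c in combinations(spread, k)]       # PS^-
    plus = [indicator_sum(c, n) for c in combinations(spread, k + 1)]    # PS^+
    # Bentness of the sum of the first j lines, for every possible j.
    by_count = [is_bent(indicator_sum(spread[:j], n)) for j in range(len(spread) + 1)]
    results[n] = (minus, plus, by_count)
    print(n, len(minus), len(plus), by_count)
```

Only the counts $2^{n/2-1}$ and $2^{n/2-1}+1$ yield bent functions in the last list, in line with the weight argument of the remark.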

All the elements of $\mathcal{PS}^-$ have algebraic degree $\frac{n}{2}$ exactly (indeed, by applying a linear isomorphism of $\mathbb{F}_2^n$, we may assume that $\mathbb{F}_2^{n/2} \times \{0_{n/2}\}$ is among the $2^{n/2-1}$ pairwise supplementary spaces defining the function, and since the function vanishes at $0_n$, Relation (2.4), page 33, shows that the monomial $x_1 \cdots x_{n/2}$ appears in its ANF).
On the contrary, the elements of $\mathcal{PS}^+$ do not all have algebraic degree $\frac{n}{2}$: Dillon observed in [441] that, when $\frac{n}{2}$ is even, all quadratic bent functions are $\mathcal{PS}^+$ functions or their complements. Indeed, by affine equivalence, we can restrict ourselves to the function $(x, \epsilon, y, \eta) \in \mathbb{F}_{2^{n/2-1}} \times \mathbb{F}_2 \times \mathbb{F}_{2^{n/2-1}} \times \mathbb{F}_2 \mapsto \mathrm{tr}_{n/2-1}(xy) \oplus \epsilon\eta \oplus 1$, where $\mathrm{tr}_{n/2-1}$ is the trace function from $\mathbb{F}_{2^{n/2-1}}$ to $\mathbb{F}_2$; the support of this function equals the union of the $2^{n/2-1} + 1$ vector spaces of dimension $n/2$ (and very much related to the Kerdock code) $S_\emptyset = \{0\} \times \{0\} \times \mathbb{F}_{2^{n/2-1}} \times \mathbb{F}_2$ and $S_a = \{(x, \epsilon, a^2 x + a\,\mathrm{tr}_{n/2-1}(ax) + a\epsilon, \mathrm{tr}_{n/2-1}(ax));\ (x, \epsilon) \in \mathbb{F}_{2^{n/2-1}} \times \mathbb{F}_2\}$ for $a \in \mathbb{F}_{2^{n/2-1}}$. Indeed, we have $\mathrm{tr}_{n/2-1}(xy) \oplus \epsilon\eta = 0$ if and only if $x = \epsilon = 0$ or there exists $a$ such that $y = a^2 x + a\,\mathrm{tr}_{n/2-1}(ax) + a\epsilon$ and $\eta = \mathrm{tr}_{n/2-1}(ax)$. Note that since $f$ has algebraic degree strictly less than $\frac{n}{2}$ for $n \geq 8$, this partial spread is not extendable to a full spread.
It is an open problem to characterize the algebraic normal forms of the elements of class
PS or their trace representations. It is then necessary to identify within the PS construction,
classes of explicit bent functions.11

Class $\mathcal{PS}_{ap}$ in bivariate representation

J. Dillon exhibits in [441] a subclass of $\mathcal{PS}^-$, denoted by $\mathcal{PS}_{ap}$ (where ap stands for “affine plane”), whose elements (that we shall call Dillon’s functions) are defined in an explicit form, that we already addressed in Subsection 5.1.2 (more precisely at page 169).
The vector space $\mathbb{F}_2^n$ is identified with the affine plane $\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$ (an inner product being $(x, y) \cdot (x', y') = \mathrm{tr}_{n/2}(xx' + yy')$; we know that the notion of bent function is independent of the choice of the inner product). The affine plane $\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$ is equal to the union of its $2^{n/2} + 1$ lines through the origin $E_\emptyset = \{0\} \times \mathbb{F}_{2^{n/2}}$ and $E_a = \{(x, ax);\ x \in \mathbb{F}_{2^{n/2}}\}$, $a \in \mathbb{F}_{2^{n/2}}$; these lines are $n/2$-dimensional $\mathbb{F}_2$-subspaces of $\mathbb{F}_2^n$ and constitute the so-called Desarguesian spread. Choosing any $2^{n/2-1}$ of the lines, and taking them different from $E_0$

11 The situation with PS is then similar to the situation with general bent functions: we have a nice and simple
definition, but no systematic way of determining all the elements that satisfy it.

and $E_\emptyset$ (of equations $y = 0$ and $x = 0$, respectively), leads, by definition, to an element of $\mathcal{PS}_{ap}$, of the form $f(x, y) = g\left(x\, y^{2^{n/2}-2}\right)$, i.e., $g\left(\frac{x}{y}\right)$ with $\frac{x}{y} = 0$ if $y = 0$, where $g$ is a balanced Boolean function on $\mathbb{F}_{2^{n/2}}$ which vanishes at 0. In the sequel, we shall always take this convention that $\frac{1}{0} = 0$ and write $\frac{x}{y}$ instead of $x\, y^{2^{n/2}-2}$. The bentness of the resulting function is a consequence of Relation (5.9), page 170, with $\phi = f(0) = \epsilon = a = 0$.
The complements $g\left(\frac{x}{y}\right) \oplus 1$ of these functions are the functions $h\left(\frac{x}{y}\right)$ where $h$ is balanced and does not vanish at 0; they belong to class $\mathcal{PS}^+$.
For every balanced function $g$, the dual of the bent function $g\left(\frac{x}{y}\right)$ is $g\left(\frac{y}{x}\right)$ (this will be a direct consequence of Theorem 17, page 241).
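The bentness of Dillon's functions and the form of their dual can be verified directly for $n = 6$. The sketch below is illustrative: the field $\mathbb{F}_8$ is realized modulo $x^3 + x + 1$, the balanced function `g` is a hypothetical choice, and the Walsh transform is computed with the inner product $\mathrm{tr}_{n/2}(xx' + yy')$ mentioned above, since the dual depends on that choice.

```python
M, POLY = 3, 0b1011        # F_8 = F_2[x]/(x^3 + x + 1)
Q = 1 << M

def mul(a, b):
    """Multiplication in F_{2^M}, elements as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M & 1:
            a ^= POLY
    return r

def inv(a):
    """a^(2^M - 2): the inverse of a, with the convention 1/0 = 0."""
    r = 1
    for _ in range(Q - 2):
        r = mul(r, a)
    return r

def tr(a):
    """Absolute trace of F_{2^M}; the result is 0 or 1."""
    t, s = 0, a
    for _ in range(M):
        t ^= s
        s = mul(s, s)
    return t

# Hypothetical balanced function on F_8 vanishing at 0 (support {1, 2, 4, 7}).
g = [0, 1, 1, 0, 1, 0, 0, 1]

def f(x, y):
    return g[mul(x, inv(y))]          # g(x / y)

def walsh(a, b):
    """W_f(a, b) for the inner product tr(a*x + b*y)."""
    return sum((-1) ** (f(x, y) ^ tr(mul(a, x) ^ mul(b, y)))
               for x in range(Q) for y in range(Q))

W = {(a, b): walsh(a, b) for a in range(Q) for b in range(Q)}
weight = sum(f(x, y) for x in range(Q) for y in range(Q))

print(weight, sorted(set(W.values())))
```

Comparing `W[a, b]` with `Q * (-1) ** g[mul(b, inv(a))]` expresses that the dual is $g\left(\frac{y}{x}\right)$.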

Class $\mathcal{PS}_{ap}$ in univariate representation

We have already seen at pages 167 and 169 the notion of $\mathcal{PS}_{ap}$ Boolean function in univariate representation (but without studying the condition under which such a function is bent). A univariate representation of the elements of Desarguesian spread is $\{u\,\mathbb{F}_{2^{n/2}},\ u \in U\}$, where $U = \{u \in \mathbb{F}_{2^n};\ u^{2^{n/2}+1} = 1\}$ is the cyclic group of $(2^{n/2} + 1)$th roots of unity in $\mathbb{F}_{2^n}$ (i.e., the multiplicative subgroup of $\mathbb{F}_{2^n}^*$ of order $2^{n/2} + 1$). Each line through the origin of the plane $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^{n/2}}$, instead of being identified by the constant value $x/y$ of its nonzero elements $(x, y) \in \mathbb{F}_{2^{n/2}}^2$ (which makes with the convention $1/0 = 0$ that the two lines of equations $x = 0$ and $y = 0$ provide necessarily the same output by the $\mathcal{PS}_{ap}$ function) is identified by the unique element of $U$ it contains. Then $g$ is viewed as a Boolean function over $U$ such that $g(\alpha_1) = g(\alpha_2) = 0 = f(0)$, where $(\alpha_1, \alpha_2)$ is the basis chosen for the plane $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^{n/2}}$, assuming without loss of generality that $\alpha_1, \alpha_2$ both belong to $U$. Relation (5.5), page 168, with $m = n/2$, and $(-1)^{f(0)} - \sum_{\mu \in U} (-1)^{g(\mu)} = 0$ (since $g$ is taken balanced on $U \setminus \{\alpha_i\}$, $i = 1, 2$), and $\phi = 0$ gives an alternative proof of the bentness of the $\mathcal{PS}_{ap}$ functions defined by Dillon, since $\mathrm{tr}^n_m(z) = 0$ if and only if $z \in \mathbb{F}_{2^m}$. Moreover, for every $x \in \mathbb{F}_{2^{n/2}}^*$ and every $u \in U$, we have $(ux)^{2^{n/2}-1} = u^{-2}$ and $u \mapsto u^{-2}$ is a permutation of $U$; this leads to an expression of $\mathcal{PS}_{ap}$ bent functions of the form $h\left(z^{2^{n/2}-1}\right)$, $z \in \mathbb{F}_{2^n}$, where $h$ is a Boolean function over $\mathbb{F}_{2^n}$ such that $h(0) = 0$ and whose restriction to $U$ has Hamming weight $2^{n/2-1}$.
Dillon shows in [442] that all bent functions of the form $\mathrm{tr}_n\left(a z^{2^{n/2}-1}\right)$, $z \in \mathbb{F}_{2^n}$, are affinely inequivalent to the Maiorana–McFarland functions.
It is possible to deduce the univariate representation of $\mathcal{PS}_{ap}$ functions from their bivariate representation. We have seen at page 47 that any bivariate function $f(x, y)$ over $\mathbb{F}_{2^{n/2}}$ can be represented as a function of $z \in \mathbb{F}_{2^n}$, which we shall also denote by $f(z)$ (by abuse of notation), by posing $x = \mathrm{tr}^n_{n/2}(az) = az + (az)^{2^{n/2}}$ and $y = \mathrm{tr}^n_{n/2}(bz) = bz + (bz)^{2^{n/2}}$ for some elements $a, b \in \mathbb{F}_{2^n}$ that need to be $\mathbb{F}_{2^{n/2}}$-linearly independent (for instance, we can choose $\omega \in \mathbb{F}_{2^n} \setminus \mathbb{F}_{2^{n/2}}$, and the pair $(1, \omega)$ is then a basis of the $\mathbb{F}_{2^{n/2}}$-vector space $\mathbb{F}_{2^n}$; we then take for $(a, b)$ a basis orthonormal with $(1, \omega)$). For $f(x, y) = g\left(\frac{x}{y}\right)$, we have then the following expression valid for $z \neq 0$:
$$f(z) = g\left(\left(a + a^{2^{n/2}} z^{2^{n/2}-1}\right)\left(b + b^{2^{n/2}} z^{2^{n/2}-1}\right)^{2^{n/2}-2}\right).$$
Given a primitive element $\alpha$ of $\mathbb{F}_{2^n}$, we have for $i = 0, \dots, 2^{n/2}$ and $j = 0, \dots, 2^{n/2} - 2$:
$$f\left(\alpha^{i + j(2^{n/2}+1)}\right) = g\left(\left(a + a^{2^{n/2}} \beta^i\right)\left(b + b^{2^{n/2}} \beta^i\right)^{2^{n/2}-2}\right),$$
where $\beta = \alpha^{2^{n/2}-1}$.

Dillon [442], observing that function $\mathrm{tr}_n\left(a z^{2^{n/2}-1}\right)$, $z \in \mathbb{F}_{2^n}$, $a \in \mathbb{F}_{2^n}^*$, is bent if and only if (see above) the restriction of $\mathrm{tr}_n(az)$ to $U$ has Hamming weight $2^{n/2-1}$, conjectures that such $a$ exists for every even $n$. He gives the translation of this conjecture in terms of cyclic codes: let $\theta$ be a primitive element of $U$ (i.e., a primitive $(2^{n/2} + 1)$th root of unity in $\mathbb{F}_{2^n}$), then the condition is that the cyclic code $C = \{(\mathrm{tr}_n(a), \mathrm{tr}_n(a\theta), \mathrm{tr}_n(a\theta^2), \mathrm{tr}_n(a\theta^3), \dots, \mathrm{tr}_n(a\theta^{2^{n/2}}));\ a \in \mathbb{F}_{2^n}\}$ contains codewords of Hamming weight $2^{n/2-1}$. Since multiplying $a$ by an element of $U$ corresponds to a cyclic shift, he can restrict himself to $a \in \mathbb{F}_{2^{n/2}}$. Then $\mathrm{tr}_n(a\theta^j) = \mathrm{tr}_{n/2}\left(a\,\mathrm{tr}^n_{n/2}(\theta^j)\right) = \mathrm{tr}_{n/2}\left(a(\theta^j + \theta^{-j})\right)$. We know (see Appendix, page 491 and foll.) that when $j = 1, \dots, 2^{n/2}$, $\theta^j + \theta^{-j}$ (i.e., when $z \in U \setminus \{1\}$, $z + z^{-1}$) takes twice each value in $\{x \in \mathbb{F}_{2^{n/2}}^*;\ \mathrm{tr}_{n/2}(x^{-1}) = 1\}$. The condition on $a$ is then that
$$\sum_{x \in \mathbb{F}_{2^{n/2}}^*;\ \mathrm{tr}_{n/2}(x^{-1}) = 1} (-1)^{\mathrm{tr}_{n/2}(ax)} = 2^{n/2-1} - 2 \cdot 2^{n/2-2} = 0,$$
which is equivalent to $\sum_{x \in \mathbb{F}_{2^{n/2}}^*} \left(1 - (-1)^{\mathrm{tr}_{n/2}(x^{-1})}\right) (-1)^{\mathrm{tr}_{n/2}(ax)} = 0$. The conjecture is then that $a$ exists in $\mathbb{F}_{2^{n/2}}^*$ such that
$$\sum_{x \in \mathbb{F}_{2^{n/2}}^*} (-1)^{\mathrm{tr}_{n/2}(x^{-1} + ax)} = -1.$$
We have already seen at page 188 that such sum added with 1 is called a Kloosterman sum. Lachaud and Wolfmann proved this conjecture in [733]; they proved that the values of such Kloosterman sums are all the numbers divisible by 4 in the range $[-2^{n/4+1} + 1;\ 2^{n/4+1} + 1]$, by relating such sums to elliptic curves (and this relation was exploited later in [781] for deriving an algorithm checking bentness more efficiently in such a context).
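The Kloosterman sums involved here are easy to tabulate for small fields. The following sketch is an illustration, with $m = n/2$ and the fields $\mathbb{F}_8$ and $\mathbb{F}_{16}$ realized modulo $x^3 + x + 1$ and $x^4 + x + 1$ (any irreducible polynomial would give the same set of values); `kloosterman_sums` and `lachaud_wolfmann_values` are names introduced for the example.

```python
def make_field(m, poly):
    """Return (q, mul, inv, tr) for F_{2^m} = F_2[x]/(poly)."""
    q = 1 << m

    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if a >> m & 1:
                a ^= poly
        return r

    def inv(a):
        r = 1
        for _ in range(q - 2):      # a^(q-2), which maps 0 to 0
            r = mul(r, a)
        return r

    def tr(a):
        t, s = 0, a
        for _ in range(m):
            t ^= s
            s = mul(s, s)
        return t

    return q, mul, inv, tr

def kloosterman_sums(m, poly):
    """K(a) = sum over all x in F_{2^m} of (-1)^tr(1/x + a*x), with 1/0 = 0."""
    q, mul, inv, tr = make_field(m, poly)
    return {a: sum((-1) ** tr(inv(x) ^ mul(a, x)) for x in range(q))
            for a in range(1, q)}

def lachaud_wolfmann_values(m):
    """Multiples of 4 in the range [-2^(m/2+1) + 1, 2^(m/2+1) + 1]."""
    bound = 2 ** (m / 2 + 1)
    return {v for v in range(-(1 << m), (1 << m) + 1, 4)
            if -bound + 1 <= v <= bound + 1}

K3 = kloosterman_sums(3, 0b1011)     # n = 6
K4 = kloosterman_sums(4, 0b10011)    # n = 8
print(sorted(set(K3.values())), sorted(set(K4.values())))
print([a for a, v in K4.items() if v == 0])   # candidates a for n = 8
```

The term $x = 0$ contributes the "+1" of the definition, so a value $K(a) = 0$ corresponds to the sum over $\mathbb{F}_{2^{n/2}}^*$ being equal to $-1$.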
It has been later observed that all these results remain valid with exponents of the form $j \cdot (2^{n/2} - 1)$, where $\gcd(j, 2^{n/2} + 1) = 1$, with the same arguments (the mapping $x \mapsto x^j$ by which function $x \mapsto x^{2^{n/2}-1}$ is composed being a permutation of $U$). These exponents are now widely called Dillon exponents. Leander [750] has found another proof that gives more insight; a small error in his proof has been corrected in [350].
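For $n = 8$ and $a = 1$ the Kloosterman sum over $\mathbb{F}_{16}$ vanishes, so monomial functions with Dillon exponents can be tested directly. The sketch below is an illustration assuming $\mathbb{F}_{2^8}$ is realized modulo $x^8 + x^4 + x^3 + x + 1$; bentness does not depend on the inner product, so the ordinary Walsh transform on the bit representation is used.

```python
from math import isqrt

N, POLY = 8, 0x11B          # F_{2^8} = F_2[x]/(x^8 + x^4 + x^3 + x + 1)

def mul(a, b):
    """Multiplication in F_{2^N}, elements as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N & 1:
            a ^= POLY
    return r

def power(a, e):
    """a^e by square and multiply."""
    r = 1
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return r

def tr(a):
    """Absolute trace of F_{2^N}; the result is 0 or 1."""
    t, s = 0, a
    for _ in range(N):
        t ^= s
        s = mul(s, s)
    return t

def walsh(tt):
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def is_bent(tt):
    r = isqrt(len(tt))
    return all(abs(v) == r for v in walsh(tt))

def trace_monomial(e):
    """Truth table of z -> tr_N(z^e)."""
    return [tr(power(z, e)) for z in range(1 << N)]

f15 = trace_monomial(15)          # exponent 2^(n/2) - 1
f45 = trace_monomial(3 * 15)      # j = 3, gcd(3, 17) = 1
f255 = trace_monomial(17 * 15)    # j = 17 is not coprime to 2^(n/2) + 1

print(is_bent(f15), is_bent(f45), is_bent(f255), sum(f15))
```

The last exponent gives the zero function (since $z^{255} = 1$ for $z \neq 0$ and $\mathrm{tr}_8(1) = 0$), which shows why the coprimality condition on $j$ cannot simply be dropped.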
Dillon checked that one of the functions in $\mathcal{PS}_{ap}$ does not belong to the completed $\mathcal{M}$ (Maiorana–McFarland) class: function $\mathrm{tr}_8(x^{15})$ over $\mathbb{F}_{2^8}$ is affinely inequivalent to $\mathcal{M}$ functions because (we omit the proof) there cannot exist an $n/2$-dimensional subspace $W$ of $\mathbb{F}_2^n$ such that $D_a D_b f$ is null for every $a$ and $b$ both in $W$.
It may be more difficult to prove that a given function is not affinely equivalent to $\mathcal{PS}$ functions than to $\mathcal{M}$ functions; see an example in [212].

Extended $\mathcal{PS}_{ap}$ class

Class $\mathcal{PS}_{ap}$ is slightly extended into the subclass of $\mathcal{PS}^-$ denoted by $\mathcal{PS}^{\#}_{ap}$, of those Boolean functions over $\mathbb{F}_{2^n}$ that can be obtained from those of $\mathcal{PS}_{ap}$ by composition by the transformations $x \in \mathbb{F}_{2^n} \mapsto \delta x$, $\delta \neq 0$, and by addition of a constant.¹² The elements of $\mathcal{PS}^{\#}_{ap}$ are the Boolean functions $f$ of Hamming weight $2^{n-1} \pm 2^{n/2-1}$

12 The functions of $\mathcal{PS}_{ap}$ are those among them satisfying $f(0) = f(1) = 0$.



on $\mathbb{F}_{2^n}$ such that, denoting by $\alpha$ a primitive element of this field, $f(\alpha^{2^{n/2}+1} x) = f(x)$ for every $x \in \mathbb{F}_{2^n}$. We shall see in Subsection 6.1.20 that the functions in $\mathcal{PS}^{\#}_{ap}$ have a stronger property than bentness, called hyper-bentness. It is proved in [278] (by extension of the results of [441]) that they are the functions of Hamming weight $2^{n-1} \pm 2^{n/2-1}$, which can be written as $\sum_{i=1}^{r} \mathrm{tr}_n(a_i x^{j_i})$ for $a_i \in \mathbb{F}_{2^n}$ and $j_i$ a multiple of $2^{n/2} - 1$ with $j_i \leq 2^n - 1$.

Other classes of PS functions in explicit form The functions in PS ap are not the
only PS bent functions that can be given with explicit trace representation (useful for
applications, e.g., in telecommunications).
For instance, the $\mathcal{PS}$ bent functions related to André’s spreads¹³ have been studied in [246]. These spreads introduced by J. André in the 1950s and independently by Bruck later are defined as follows: let $k$ and $m$ be positive integers such that $k$ divides $m$, say $m = kl$. Let $N^m_k$ be the norm map from $\mathbb{F}_{2^m}$ to $\mathbb{F}_{2^k}$:
$$N^m_k(x) = x^{\frac{2^m-1}{2^k-1}}.$$
Let $\phi$ be any function from $\mathbb{F}_{2^k}$ to $\mathbb{Z}/l\mathbb{Z}$. Then, denoting $\phi \circ N^m_k$ by $\varphi$ (it can be any function from $\mathbb{F}_{2^m}$ to $\mathbb{Z}/l\mathbb{Z}$ that is constant on any coset of the subgroup $U$ of order $\frac{2^m-1}{2^k-1}$ of $\mathbb{F}_{2^m}^*$), the $\mathbb{F}_2$-vector subspaces:
$$\{(0, y),\ y \in \mathbb{F}_{2^m}\} \quad \text{and} \quad \{(x, x^{2^{k\varphi(z)}} z),\ x \in \mathbb{F}_{2^m}\},$$
where $z \in \mathbb{F}_{2^m}$, form together a spread of $\mathbb{F}_{2^m}^2$. Indeed, these subspaces have trivial pairwise intersection: suppose that $x^{2^{k\varphi(y)}} y = x^{2^{k\varphi(z)}} z$ for some nonzero elements $x, y, z$ of $\mathbb{F}_{2^m}$ (the other cases of trivial intersection are obvious), then we have $N^m_k(x^{2^{k\varphi(y)}} y) = N^m_k(x^{2^{k\varphi(z)}} z)$, that is, $N^m_k(x^{2^{k\varphi(y)}}) N^m_k(y) = N^m_k(x^{2^{k\varphi(z)}}) N^m_k(z)$. Equivalently, since $x \mapsto x^{2^{k\varphi(z)}}$ is in the Galois group of $\mathbb{F}_{2^m}$ over $\mathbb{F}_{2^k}$, $N^m_k(x) N^m_k(y) = N^m_k(x) N^m_k(z)$, hence $N^m_k(y) = N^m_k(z)$ and $\varphi(y) = \varphi(z)$, which together with $x^{2^{k\varphi(y)}} y = x^{2^{k\varphi(z)}} z$ implies then $y = z$.
Those spreads provide asymptotically the largest part of the known examples, due to the large number of choices for the map $\phi$.
The trace representation of the $\mathcal{PS}$ bent functions associated to André’s spreads is easily obtained. A pair $(x, y) \in \mathbb{F}_{2^m}^* \times \mathbb{F}_{2^m}$ belongs to $\{(x, x^{2^{k\varphi(z)}} z),\ x \in \mathbb{F}_{2^m}\}$ if and only if
$$y = x^{2^{k\varphi(z)}} z = x^{2^{k\phi(N^m_k(z))}} z = x^{2^{k\phi\left(\frac{N^m_k(y)}{N^m_k(x)}\right)}} z = x^{2^{k\varphi(y/x)}} z. \tag{6.12}$$
Then a Boolean function in this class has the form
$$f(x, y) = g\left(\frac{y}{x^{2^{k\varphi(y/x)}}}\right) \tag{6.13}$$
(with the usual convention $\frac{y}{0} = 0$) where $g$ is balanced on $\mathbb{F}_{2^m}$ and vanishes at 0. Such a bent function is in $\mathcal{PS}$ and is potentially inequivalent to $\mathcal{PS}_{ap}$ functions (this needs to be further studied, though).
Let us study now the dual of $f$. If $S$ is the support of $g$, then since $0 \notin S$, the support of $f$ is equal to the union $\bigcup_{z \in S} \{(x, x^{2^{k\varphi(z)}} z),\ x \in \mathbb{F}_{2^m}\}$, less $\{0\}$. The support of the dual
13 We thank W. Kantor for mentioning these spreads, which lead to numerous bent functions.

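of $f$ is the union of the orthogonals of these subspaces, less $\{0\}$ as well (see Proposition 80). A pair $(x', y') \in \mathbb{F}_{2^m}^2$ belongs to the orthogonal of $\{(x, x^{2^{k\varphi(z)}} z),\ x \in \mathbb{F}_{2^m}\}$ if and only if $\mathrm{tr}_m\left(xx' + x^{2^{k\varphi(z)}} z y'\right) = \mathrm{tr}_m\left(\left(x' + (zy')^{2^{m-k\varphi(z)}}\right)x\right)$ equals 0 for all $x \in \mathbb{F}_{2^m}$, that is, if $x' + (zy')^{2^{m-k\varphi(z)}} = 0$, that is, if $x' = y' = 0$ or $z = \frac{x'^{2^{k\varphi(z)}}}{y'}$. Hence we have
$$\tilde{f}(x, y) = g\left(\frac{x^{2^{k\varphi(x/y)}}}{y}\right). \tag{6.14}$$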
Of course, if g does not vanish at 0, the function defined by (6.13) is bent as well. We can
see this by changing g into its complement g ⊕ 1 (which changes f and its dual into their
complements as well).
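The spread property and Relations (6.13) and (6.14) can be checked on the smallest nontrivial case. The sketch below is an illustration under stated assumptions: $m = 4$, $k = 2$, $l = 2$ (so $n = 8$), the field $\mathbb{F}_{16}$ realized modulo $x^4 + x + 1$, and hypothetical choices for the map `phi` on the subfield $\mathbb{F}_4$ and for the balanced function `g`. The Walsh transform uses the inner product $\mathrm{tr}_m(xx' + yy')$.

```python
from collections import Counter

M, K, POLY = 4, 2, 0b10011       # F_16 = F_2[x]/(x^4 + x + 1), subfield F_4
Q = 1 << M

def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M & 1:
            a ^= POLY
    return r

MUL = [[mul(a, b) for b in range(Q)] for a in range(Q)]

def power(a, e):
    r = 1
    for _ in range(e):
        r = MUL[r][a]
    return r

INV = [power(a, Q - 2) for a in range(Q)]            # 1/0 = 0 by convention
NORM = [power(a, (Q - 1) // ((1 << K) - 1)) for a in range(Q)]   # x^5

def frob(x, e):
    """x^(2^e)."""
    for _ in range(e):
        x = MUL[x][x]
    return x

def tr(a):
    t, s = 0, a
    for _ in range(M):
        t ^= s
        s = MUL[s][s]
    return t

TR = [tr(c) for c in range(Q)]

subfield = sorted(set(NORM))                          # the four elements of F_4
phi = {c: i % 2 for i, c in enumerate(subfield)}      # hypothetical map F_4 -> Z/2Z

def vphi(z):
    return phi[NORM[z]]

# The 2^m + 1 subspaces of the Andre spread, as sets of pairs (x, y).
spread = [[(0, y) for y in range(Q)]]
spread += [[(x, MUL[frob(x, K * vphi(z))][z]) for x in range(Q)] for z in range(Q)]
cover = Counter(p for sub in spread for p in sub if p != (0, 0))

g = [0] + [1] * 8 + [0] * 7        # hypothetical balanced g with g(0) = 0

def f(x, y):                       # Relation (6.13)
    return g[MUL[y][INV[frob(x, K * vphi(MUL[y][INV[x]]))]]]

def dual(x, y):                    # Relation (6.14)
    return g[MUL[frob(x, K * vphi(MUL[x][INV[y]]))][INV[y]]]

F = [[f(x, y) for y in range(Q)] for x in range(Q)]

def walsh(a, b):
    return sum((-1) ** (F[x][y] ^ TR[MUL[a][x] ^ MUL[b][y]])
               for x in range(Q) for y in range(Q))

W = {(a, b): walsh(a, b) for a in range(Q) for b in range(Q)}
print(len(spread), len(cover), sorted(set(W.values())))
```

Every nonzero point of the plane is covered exactly once, and the signs of the Walsh values agree with (6.14).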
Note that class $\mathcal{PS}_{ap}$ corresponds to the case where $\varphi$ is the null function. It also corresponds to the case $k = m$, since we have then $f(x, y) = g\left(\frac{y}{x}\right)$, because $x^{2^m} = x$.
Note finally that if $k = 1$, then $N^m_k(x) = 1$ for every $x \neq 0$, and the groups of the spread are $\{(0, y),\ y \in \mathbb{F}_{2^m}\}$, $\{(x, 0),\ x \in \mathbb{F}_{2^m}\}$, and $\{(x, x^{2^j} z),\ x \in \mathbb{F}_{2^m}\}$, $z \in \mathbb{F}_{2^m}^*$ for some $j$ and $f(x, y) = g\left(\frac{y}{x^{2^j}}\right)$; the functions are in the $\mathcal{PS}_{ap}$ class up to linear equivalence.
Finite prequasifield spreads from finite geometry (see [963]) have also been investigated by Wu [1123] to give explicit forms of the related functions in $\mathcal{PS}$ and of their duals, thanks to the determination of the compositional inverses of certain parametric permutation polynomials. In particular, Wu has considered the Dempwolff–Muller prequasifields and the Knuth presemifields to obtain the expressions of the corresponding $\mathcal{PS}$ bent functions. The constructed functions and their dual functions have a shape similar to that of the $\mathcal{PS}_{ap}$ functions, but are more complex. See more in [663].
Explicit constructions of bent functions derived from symplectic presemifields associated to pseudoplanar functions (see page 269) $\sum_{i<j} a_{i,j}\, x^{2^i+2^j}$ (whose multiplicative operation is $x \circ y = xy + \sum_{i<j} a_{i,j}^{2^{m-j}} x^{2^{m+i-j}} y^{2^{m-j}} + \sum_{i<j} a_{i,j}^{2^{m-i}} x^{2^{j-i}} y^{2^{m-i}}$) have been obtained in [4]; see also [660].
Class $\mathcal{PS}$ has been generalized into the generalized partial spread class $\mathcal{GPS}$; see Definition 56, page 242.

3. Class $H$ and Niho functions: We have already seen in Subsection 5.1.2, at pages 167 and 169, the principle of Niho Boolean functions, among which we shall characterize (in Corollary 14) those that are bent. It is proved in [311, proposition 5] that all bent functions affine on each coset of $\mathbb{F}_{2^{n/2}}^*$ are EA equivalent to $\mathcal{PS}_{ap}$ or Niho functions possibly added with the indicator of one coset of $\{0\} \cup \mathbb{F}_{2^{n/2}}^*$. As observed in [311], Niho bent functions happen to be the univariate version of bivariate bent Boolean functions that we shall introduce with class $\mathcal{H}$ below in Definition 53, and that are closely related to the functions introduced by Dillon in [441] as the elements of a family that he denoted by $H$. The functions of this family were defined as $f(x, y) = \mathrm{tr}_{n/2}\left(y + x\,G\left(y x^{2^{n/2}-2}\right)\right)$; $x, y \in \mathbb{F}_{2^{n/2}}$, where $G$ is a permutation¹⁴ of $\mathbb{F}_{2^{n/2}}$ such that, for every $b \in \mathbb{F}_{2^{n/2}}^*$, the function $G(x) + bx$ is two-to-one (that is, the preimage of any element of $\mathbb{F}_{2^{n/2}}$ by this function contains zero or
14 Dillon also assumed that G(x) + x does not vanish, but this condition is not necessary for bentness.

two elements). We shall see below why these conditions characterize bentness. New bent functions were found recently within this framework. The linear term $\mathrm{tr}_{n/2}(y)$ being not useful in the function above, we take it off and consider those functions of the form
$$f(x, y) = \begin{cases} \mathrm{tr}_{n/2}\left(x\,G\left(\frac{y}{x}\right)\right) & \text{if } x \neq 0 \\ 0 & \text{if } x = 0, \end{cases} \tag{6.15}$$

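where G is any function from $\mathbb{F}_{2^{n/2}}$ to itself. As seen at page 170, we have

$$\begin{aligned}
W_f(a, b) &= \sum_{x \in \mathbb{F}_{2^{n/2}}^*,\, y \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(x G\left(\frac{y}{x}\right) + ax + by\right)} + \sum_{y \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}(by)} \\
&= \sum_{x \in \mathbb{F}_{2^{n/2}}^*,\, z \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(x (G(z) + a + bz)\right)} + 2^{n/2}\,\delta_0(b) \\
&= 2^{n/2} \Big( \big|\{z \in \mathbb{F}_{2^{n/2}};\ G(z) + a + bz = 0\}\big| + \delta_0(b) - 1 \Big).
\end{aligned} \tag{6.16}$$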

Proposition 81 [311, 441] Any Boolean function of the form (6.15) is bent if and only if
G is a permutation of F2n/2 and

for every b ∈ F∗2n/2 , the function z → G(z) + bz is 2-to-1 on F2n/2 . (6.17)

The dual function of f in (6.15) is

$$\tilde f(a, b) = \begin{cases} 1 & \text{if the equation } G(z) + bz = a \text{ has no solution in } \mathbb{F}_{2^{n/2}} \\ 0 & \text{otherwise.} \end{cases}$$
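Proposition 81 can be checked by brute force on a very small field. The following Python sketch is only an illustration under stated assumptions: it fixes n = 6, represents $\mathbb{F}_8$ as $\mathbb{F}_2[t]/(t^3+t+1)$ with elements encoded as integers, and all helper names (`mul`, `power`, `tr`, `f_from`, `satisfies_6_17`, `dual_matches`) are hypothetical. It tests the monomials $z^2$, $z^3$ and $z^6$, which are all permutations of $\mathbb{F}_8$.

```python
# Brute-force illustration of Proposition 81 over F_8 (n = 6, n/2 = 3).
M, POLY = 3, 0b1011            # assumption: F_8 = F_2[t]/(t^3 + t + 1)
Q = 1 << M                     # field size 2^(n/2)

def mul(a, b):
    """Multiply two field elements encoded as integers in [0, Q)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:              # reduce as soon as the degree reaches M
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def tr(a):
    """Absolute trace F_8 -> F_2, returned as the integer 0 or 1."""
    s = 0
    for _ in range(M):
        s ^= a
        a = mul(a, a)
    return s

def f_from(G):
    """Function (6.15): f(x, y) = tr(x G(y/x)) if x != 0, and 0 otherwise."""
    def f(x, y):
        if x == 0:
            return 0
        return tr(mul(x, G(mul(y, power(x, Q - 2)))))   # x^(Q-2) = 1/x
    return f

def walsh(f, a, b):
    return sum((-1) ** (f(x, y) ^ tr(mul(a, x) ^ mul(b, y)))
               for x in range(Q) for y in range(Q))

def is_bent(f):
    return all(abs(walsh(f, a, b)) == Q for a in range(Q) for b in range(Q))

def satisfies_6_17(G):
    """Condition (6.17): z -> G(z) + bz is 2-to-1 for every nonzero b."""
    for b in range(1, Q):
        hits = [0] * Q
        for z in range(Q):
            hits[G(z) ^ mul(b, z)] += 1
        if any(h not in (0, 2) for h in hits):
            return False
    return True

def dual_matches(G):
    """Check W_f(a, b) = +2^(n/2) iff G(z) + bz = a is solvable (dual value 0)."""
    f = f_from(G)
    for a in range(Q):
        for b in range(Q):
            solvable = any(G(z) ^ mul(b, z) == a for z in range(Q))
            if walsh(f, a, b) != (Q if solvable else -Q):
                return False
    return True

for d in (2, 3, 6):
    G = lambda z, d=d: power(z, d)
    print(d, satisfies_6_17(G), is_bent(f_from(G)))
```

For each exponent the two printed flags agree, as the proposition predicts: $z^2$ and $z^6$ satisfy (6.17) and give bent functions, while $z^3$ does neither.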

Note that an n-variable function (6.15), or a Niho function (see below), is then bent if and only if $\sum_{u \in \mathbb{F}_2^n} W_f^3(u) = 2^{2n}$, that is, $\sum_{x, y \in \mathbb{F}_2^n} (-1)^{f(x) \oplus f(y) \oplus f(x+y)} = 2^n$ (the same characterization is valid for quadratic functions vanishing at $0_n$, but for general functions, we have only a necessary condition). Indeed, $|\{z \in \mathbb{F}_{2^{n/2}};\ G(z) + a + bz = 0\}| + \delta_0(b) - 1 \geq -1$ implies $\big(|\{z \in \mathbb{F}_{2^{n/2}};\ G(z) + a + bz = 0\}| + \delta_0(b) - 1\big)^3 \geq |\{z \in \mathbb{F}_{2^{n/2}};\ G(z) + a + bz = 0\}| + \delta_0(b) - 1$ and then $\sum_{u \in \mathbb{F}_2^n} W_f^3(u) \geq 2^n \sum_{u \in \mathbb{F}_2^n} W_f(u) = 2^{2n}$, with equality if and only if $W_f(u) \in \{\pm 2^{n/2}, 0\}$ for all u, and therefore $W_f(u) \in \{\pm 2^{n/2}\}$ because of Parseval's relation.

Class H The restrictions of f to the lines through the origin of the affine plane are linear. More generally, any function whose restriction to each subspace in the Desarguesian spread is linear has the form

$$g(x, y) = \begin{cases} \mathrm{tr}_{n/2}\big(x\,\psi\big(\tfrac{y}{x}\big)\big) & \text{if } x \neq 0 \\ \mathrm{tr}_{n/2}(\mu y) & \text{if } x = 0, \end{cases} \tag{6.18}$$

where μ ∈ F2n/2 and ψ is a mapping from F2n/2 to itself; this is a particular case of (5.8).

Definition 53 The set of those bent functions of the form (6.18) (i.e., which are linear over
each element of the Desarguesian spread) is denoted by H.

All the functions in class H being clearly EA equivalent to functions of the form (6.15),
Proposition 81 settles the case of all Niho bent functions ([311] also settled the more general
case where the restrictions are affine).
As seen in [311] (see Lemma 7 below for a proof), Condition (6.17) implies the bijectivity
of G and is then necessary and sufficient for f to be bent. The set of those functions G that
satisfy (6.17) is stable under some transformations, among which G → G−1 , and [311]
observed that the functions corresponding to G and G−1 are in general EA inequivalent.
Three other transformations, leading to bent functions that are in general EA inequivalent as
well, have been investigated in [154].

H functions and o-polynomials A connection between functions in class H and oval polynomials has been shown in [311]; oval polynomials (also called o-polynomials) are a notion in finite geometry related to hyperovals in the projective plane $PG(2, 2^{n/2})$. Recall that, for a given power q of 2, PG(2, q) has for points all the one-dimensional subspaces of $\mathbb{F}_q^3$ and for lines all the two-dimensional subspaces of $\mathbb{F}_q^3$. In other words, the points of
this projective plane are the equivalence classes of F3q \ {(0, 0, 0)} modulo the equivalence
relation of proportionality.15 Then two distinct lines always intersect in one point. More
precisely, the projective plane can be obtained from the affine plane by adding points at
infinity in the following way: each set of parallel lines in the affine plane defines a point at
infinity, and this gives one point at infinity corresponding to the parallel lines x = a, and
q others corresponding to the parallel lines y = bx + a. The lines of the projective plane
are the lines of the affine plane completed with their corresponding points at infinity and the
line at infinity (made of all points at infinity). A hyperoval of the projective plane P G(2, q)
is a set of q + 2 points no three of which are on a same line; any hyperoval is equivalent to a
hyperoval containing the following four points: (1 : 0 : 0), (0 : 1 : 0), (0 : 0 : 1), (1 : 1 : 1);
it can then be represented as {(1 : t : G(t)); t ∈ Fq } ∪ {(0 : 1 : 0), (0 : 0 : 1)}, where
G(0) = 0, G(1) = 1 and G is equivalently an o-polynomial on Fq :

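Definition 54 Let m be any positive integer. A permutation polynomial G over $\mathbb{F}_{2^m}$ is called an o-polynomial (an oval polynomial) if, for every $c \in \mathbb{F}_{2^m}$, the function

$$z \in \mathbb{F}_{2^m} \mapsto \begin{cases} \dfrac{G(z+c) + G(c)}{z} & \text{if } z \neq 0 \\ 0 & \text{if } z = 0 \end{cases}$$

is a permutation of $\mathbb{F}_{2^m}$.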
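The hyperoval representation $\{(1 : t : G(t))\} \cup \{(0 : 1 : 0), (0 : 0 : 1)\}$ recalled above can be tested directly: three points of PG(2, q) are collinear exactly when the 3×3 determinant of their homogeneous coordinates vanishes. The Python sketch below is a small illustration under stated assumptions: q = 8, $\mathbb{F}_8 = \mathbb{F}_2[t]/(t^3+t+1)$ with integer-encoded elements, and hypothetical helper names (`mul`, `power`, `det3`, `is_hyperoval`).

```python
from itertools import combinations

M, POLY = 3, 0b1011            # assumption: F_8 = F_2[t]/(t^3 + t + 1)
Q = 1 << M

def mul(a, b):
    """Multiply two field elements encoded as integers in [0, Q)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def det3(p, q, r):
    """3x3 determinant in characteristic 2 (subtraction coincides with XOR)."""
    return (mul(p[0], mul(q[1], r[2]) ^ mul(q[2], r[1]))
            ^ mul(p[1], mul(q[0], r[2]) ^ mul(q[2], r[0]))
            ^ mul(p[2], mul(q[0], r[1]) ^ mul(q[1], r[0])))

def is_hyperoval(G):
    """q + 2 points, no three of which are collinear."""
    pts = [(1, t, G(t)) for t in range(Q)] + [(0, 1, 0), (0, 0, 1)]
    return all(det3(*triple) != 0 for triple in combinations(pts, 3))

for d in (2, 3, 6):
    print(d, is_hyperoval(lambda t, d=d: power(t, d)))
```

With $G(t) = t^3$, the points with parameters $t, s, t+s$ are collinear, so the test fails; $t^2$ and $t^6$ pass.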

As observed in [311]:

Lemma 7 Condition (6.17) is equivalent to the fact that G is an o-polynomial on $\mathbb{F}_{2^{n/2}}$.

15 The coordinates (x : y : z) of a point in P G(2, q), which are defined up to multiplication by a nonzero
element of Fq , are called homogeneous coordinates; we can consider that P G(2, q) contains one special affine
plane whose points have the form (1 : x : y) while points at infinity are of the form (0 : x : y), among which
is the so-called nucleus (0 : 1 : 0).

Proof For every $b, c \in \mathbb{F}_{2^m}$, m = n/2, the equation G(z) + bz = G(c) + bc is satisfied by c. Thus, if Condition (6.17) is satisfied, then for every $b \in \mathbb{F}_{2^m}^*$ and every $c \in \mathbb{F}_{2^m}$, there exists exactly one $z \in \mathbb{F}_{2^m}^*$ such that G(z + c) + b(z + c) = G(c) + bc, that is, $\frac{G(z+c)+G(c)}{z} = b$. Then, for every $c \in \mathbb{F}_{2^m}$, the function $z \in \mathbb{F}_{2^m}^* \mapsto \frac{G(z+c)+G(c)}{z} \in \mathbb{F}_{2^m}^*$ is bijective, that is, G and the function

$$z \in \mathbb{F}_{2^m} \mapsto \begin{cases} \dfrac{G(z+c)+G(c)}{z} & \text{if } z \neq 0 \\ 0 & \text{if } z = 0 \end{cases}$$

are permutations. Hence, G is an o-polynomial. Conversely, if G is an o-polynomial, then for every $c \in \mathbb{F}_{2^m}$, we have $\frac{G(z+c)+G(c)}{z} \neq 0$ for every $z \neq 0$, and for every $b \neq 0$ there exists exactly one nonzero z such that G(z + c) + G(c) = bz. Then for every $u \in \mathbb{F}_{2^m}$, either the equation G(z) + bz = u has no solution, or it has at least a solution c and then exactly one second solution z + c ($z \neq 0$). This completes the proof.
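The equivalence stated in Lemma 7 can be observed experimentally on monomials. The sketch below is an illustration only, under the following assumptions: m = 5, $\mathbb{F}_{32} = \mathbb{F}_2[t]/(t^5+t^2+1)$ with integer-encoded elements, functions stored as lookup tables, and hypothetical helper names. For every exponent d it evaluates Condition (6.17) and Definition 54 independently and records both answers.

```python
M, POLY = 5, 0b100101          # assumption: F_32 = F_2[t]/(t^5 + t^2 + 1)
Q = 1 << M

def mul(a, b):
    """Multiply two field elements encoded as integers in [0, Q)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

INV = [power(z, Q - 2) for z in range(Q)]    # INV[z] = 1/z for z != 0

def satisfies_6_17(G):
    """z -> G(z) + bz is 2-to-1 for every nonzero b (G is a lookup table)."""
    for b in range(1, Q):
        hits = [0] * Q
        for z in range(Q):
            hits[G[z] ^ mul(b, z)] += 1
        if any(h not in (0, 2) for h in hits):
            return False
    return True

def is_o_polynomial(G):
    """Definition 54: G and every quotient map z -> (G(z+c)+G(c))/z are permutations."""
    if len(set(G)) != Q:
        return False
    for c in range(Q):
        image = {0 if z == 0 else mul(G[z ^ c] ^ G[c], INV[z]) for z in range(Q)}
        if len(image) != Q:
            return False
    return True

results = {}
for d in range(1, Q - 1):
    G = [power(z, d) for z in range(Q)]
    results[d] = (satisfies_6_17(G), is_o_polynomial(G))

o_exponents = [d for d, (_, is_o) in results.items() if is_o]
print(o_exponents)
```

The two tests agree for every d, and the resulting list contains in particular the exponents 2 and 6 from the first two classes listed below (i = 1, and m odd).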

Remark. We have already observed with Lemma 5, page 190, that any Boolean function is bent if and only if its Walsh transform takes values congruent with $2^{n/2}$ modulo $2^{n/2+1}$. This property and Relation (6.16) show that a permutation polynomial is an o-polynomial if and only if any equation G(z) + bz = c with $b \neq 0$ has an even number of solutions.

The known classes of inequivalent o-polynomials are$^{16}$ (see [168, 311] and their references):
1. $G(z) = z^{2^i}$, where i is co-prime with m.
2. $G(z) = z^6$, where m is odd.
3. $G(z) = z^{3 \cdot 2^k + 4}$, where m = 2k − 1.
4. $G(z) = z^{2^k + 2^{2k}}$, where m = 4k − 1.
5. $G(z) = z^{2^{2k+1} + 2^{3k+1}}$, where m = 4k + 1.
6. $G(z) = z^{2^k} + z^{2^k + 2} + z^{3 \cdot 2^k + 4}$, where m = 2k − 1.
7. $G(z) = z^{1/6} + z^{1/2} + z^{5/6}$, where m is odd; note that $G(z) = D_5\big(z^{1/6}\big)$, where $D_5$ is the Dickson polynomial of index 5 (see the definition of Dickson polynomials at page 389).
8. $G(z) = \dfrac{\delta^2 (z^4 + z) + \delta^2 (1 + \delta + \delta^2)(z^3 + z^2)}{z^4 + \delta^2 z^2 + 1} + z^{1/2}$, where $\mathrm{tr}_m(1/\delta) = 1$ and, if m ≡ 2 [mod 4], then $\delta \notin \mathbb{F}_4$.
9. $G(z) = \dfrac{1}{\mathrm{tr}^n_m(v)} \Big[ \mathrm{tr}^n_m(v^r)(z + 1) + \mathrm{tr}^n_m\big((vz + v^{2^m})^r\big) \big(z + \mathrm{tr}^n_m(v)\, z^{1/2} + 1\big)^{1-r} \Big] + z^{1/2}$, where m is even, $r = \pm \frac{2^m - 1}{3}$, $v \in \mathbb{F}_{2^{2m}}$, $v^{2^m + 1} = 1$ and $v \neq 1$.
The two last classes are related to Subiaco and Adelaide hyperovals, whose description
has been simplified in [3] thanks to a new type of homogeneous coordinates. The known
o-polynomials provided a number of potentially new bent functions detailed in [311],
since each class of o-polynomials gives rise to several EA inequivalent classes of bent
functions; see more in [154, 942]. Continuing the work of the author and Mesnager, [2]
gives geometrical characterization of Niho bent functions; it shows that they are in one-to-
one correspondence with the so-called line ovals in the affine plane (which are sets of q + 1
nonparallel lines no three of which are concurrent, where q is the order of the base field)

16 Two more, given in [168], are equivalent to $z^{2^i}$; another in the list of [769] has a typo.

and that their dual functions are the complements of the characteristic functions of these line
ovals; it extends this to arbitrary spreads.

Remark. A new notion of equivalence between bent functions in class H is deduced from Lemma 7. Hyperovals being called equivalent if they are mapped to each other by collineations (i.e., permutations mapping lines to lines), it provides a notion of equivalence between o-polynomials, and between the related bent functions, called projective equivalence. In particular, as recalled in [414], the group $P\Gamma L(2, 2^m)$ of all $\mathbb{F}_2$-linear automorphisms of $\mathbb{F}_{2^m}$ of the form $L(x^{2^j})$, where L is an element of $GL(2, 2^m)$ (associated with a 2×2 matrix over $\mathbb{F}_{2^m}$) acts on $PG(2, 2^m)$, and then acts on o-polynomials; see more in
[414]. EA equivalence classes of Niho bent functions are in one-to-one correspondence with
projective equivalence classes of ovals in the projective plane P G(2, q) [2, 942]. Notions
of duality for bent functions and duality for projective planes are consistent for Niho bent
functions (a duality of P G(2, q) is a bijection from the set of points of PG(2,q) to the set of
lines, which preserves incidence of points and lines).

Niho bent functions In univariate representation, functions in class H are those functions whose restrictions to the multiplicative cosets $\mu\,\mathbb{F}_{2^{n/2}}$ of $\mathbb{F}_{2^{n/2}}^*$ are linear, i.e., are Niho functions (5.6). Niho bent functions have been investigated in [479] and [749, 752] without the authors noticing their relationship with class H. Relation (5.5), page 168 (in which we can take for U the multiplicative subgroup of $\mathbb{F}_{2^n}^*$ of order $2^m + 1$ since n = 2m) gives for g = 0 and f(0) = 0

$$\forall u \in \mathbb{F}_{2^n}, \quad W_f(u) = 2^m \Big( \big|\{\mu \in U;\ \varphi(\mu) + \mathrm{tr}^n_m(u\mu) = 0\}\big| - 1 \Big).$$

We deduce, denoting by U the multiplicative subgroup of $\mathbb{F}_{2^n}^*$ of order $2^m + 1$:

Corollary 14 Let f be any Niho function (5.6) in n variables (n even). Then f is bent if and only if, for every $u \in \mathbb{F}_{2^n}$, we have $|\{\mu \in U;\ \varphi(\mu) + \mathrm{tr}^n_{n/2}(u\mu) = 0\}| \in \{0, 2\}$.

A few examples of infinite classes of Niho bent functions are known up to affine equivalence. The simplest one is quadratic and has been already encountered in Section 5.2: $\mathrm{tr}_{n/2}\big(a x^{2^{n/2}+1}\big)$, where $a \in \mathbb{F}_{2^{n/2}}^*$, $x \in \mathbb{F}_{2^n}$. The other examples, from [479], are binomials of the form $f(x) = \mathrm{tr}_n(\alpha_1 x^{d_1} + \alpha_2 x^{d_2})$, $x \in \mathbb{F}_{2^n}$, $d_1, d_2 \in \mathbb{Z}/(2^n - 1)\mathbb{Z}$, where $2 d_1 = 2^{n/2} + 1$ and $\alpha_1, \alpha_2 \in \mathbb{F}_{2^n}^*$ are such that $(\alpha_1 + \alpha_1^{2^{n/2}})^2 = \alpha_2^{2^{n/2}+1}$. Equivalently, denoting $a = (\alpha_1 + \alpha_1^{2^{n/2}})^2$ and $b = \alpha_2$, we have $a = b^{2^{n/2}+1} \in \mathbb{F}_{2^{n/2}}^*$ and $f(x) = \mathrm{tr}_{n/2}\big(a x^{2^{n/2}+1}\big) + \mathrm{tr}_n(b x^{d_2})$ (note that if b = 0 and $a \neq 0$, then f is also bent, but it belongs then to the class of quadratic Niho bent functions seen above). The values of $d_2$ are (see [479] for the proofs):
1. $d_2 = (2^{n/2} - 1)\, 3 + 1$ (originally in [479] was included the condition that, if n ≡ 4 [mod 8], then $b = \alpha_2$ is the fifth power of an element in $\mathbb{F}_{2^n}$, but as observed in [596], the value of b can be taken arbitrarily under the condition that $a = b^{2^{n/2}+1}$).
2. $4 d_2 = (2^{n/2} - 1) + 4$ (with the condition that n/2 is odd). This example has been extended by Leander and Kholosha [749, 752] into the functions: $\mathrm{tr}_n\Big(\alpha x^{2^{n/2}+1} + \sum_{i=1}^{2^{r-1}-1} x^{s_i}\Big)$, r > 1 such that gcd(r, n/2) = 1, $\alpha \in \mathbb{F}_{2^n}$ such that $\alpha + \alpha^{2^{n/2}} = 1$, $s_i = (2^{n/2} - 1)\frac{i}{2^r} + 1$ (mod $2^{n/2} + 1$), $i \in \{1, \dots, 2^{r-1} - 1\}$. It is shown in [763] that the functions $\sum_{i=1}^{2^r - 1} \mathrm{tr}_n\big(\alpha x^{(i 2^{n/2-r} + 1)(2^{n/2} - 1) + 1}\big)$; $\alpha \in \mathbb{F}_{2^n}$, $\alpha + \alpha^{2^{n/2}} \neq 0$, enter in this class up to EA equivalence while they cover it for $\alpha + \alpha^{2^{n/2}} = 1$, with a nice original proof of their bentness.
3. $6 d_2 = (2^{n/2} - 1) + 6$ (with the condition that n/2 is even).

As observed in [479] and in [155], these functions have respectively algebraic degree n/2,
3 and n/2. In [475], the value distribution of the Walsh spectrum of the monomial function
corresponding to the first exponent d2 above was determined for n/2 odd, in terms of
Kloosterman sums.
After [311], several works investigated the properties of the known Niho bent functions
and their relation with o-polynomials (when transformed from univariate form to bivariate
form); we follow here the survey [313] on bent functions:
– The dual function of the second example above (with 4d2 = (2n/2 − 1) + 4) has been
calculated (in [311]) as well as that of the Niho bent function consisting of 2r exponents
(see [155, 296]); it has been shown in [155, 311] that the dual bent functions are not of
the Niho type; this replied negatively to an open question stated in [479].
– The quadratic monomial and (as shown in [311]) the second example above belong to
the completed M class, but (as proved in [155]), when m = n/2 > 2, the two others
and the generalization of the second example do not; this gives a positive answer to an
open question (since 1974) whether completed class H differs from completed class M.
– It is shown in [296] that the o-polynomials associated with the Leander–Kholosha bent
functions are equivalent to Frobenius automorphisms; the relation between the binomial
Niho bent functions with d2 = (2m − 1) 3 + 1 and 6d2 = (2m − 1) + 6 and the Subiaco
and Adelaide classes of hyperovals (related to the two last o-polynomials above) was
found in [596]; this allowed when m ≡ 2 (mod 4) to expand the class of bent functions
corresponding to Subiaco hyperovals. Later, in [168], the o-polynomials associated to
all known Niho bent functions have been identified and the class of Niho bent functions
consisting of 2r terms has been extended by inserting coefficients of the power terms in
the original function; it can then give any Niho bent function. Several classes of explicit
Niho bent functions have been deduced (as also detailed in [769, section 3]).

Remark. We have seen in Proposition 66, page 193, a characterization of bent functions
by power moments of even exponents of the Walsh transform. In the case of Niho functions,
we have a characterization with odd exponents as well:

Proposition 82 Let n = 2m be any even positive integer, w any odd integer such that w ≥ 3, and f any Niho n-variable Boolean function. Then we have

$$\sum_{u \in \mathbb{F}_{2^n}} W_f^w(u) \geq 2^{(w+1)m}, \quad \text{i.e.} \quad \sum_{x_1, \dots, x_{w-1} \in \mathbb{F}_{2^n}} (-1)^{\bigoplus_{i=1}^{w-1} f(x_i) \,\oplus\, f\left(\sum_{i=1}^{w-1} x_i\right)} \geq 2^{(w-1)m},$$

with equality if and only if f is bent.



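Proof We still denote by U the multiplicative subgroup of $\mathbb{F}_{2^n}^*$ of order $2^m + 1$, where n = 2m. Let $f(\mu x) = \mathrm{tr}_m(x\,\varphi(\mu))$, $\mu \in U$, $x \in \mathbb{F}_{2^m}$, where φ is some function from U to $\mathbb{F}_{2^m}$. We have $W_f(0) = F(0) = \sum_{x \in \mathbb{F}_{2^m},\, \mu \in U} (-1)^{\mathrm{tr}_m(x \varphi(\mu))} - 2^m = 2^m (|\varphi^{-1}(0)| - 1)$.
For every $u \in \mathbb{F}_{2^n}$, the function $f(z) + \mathrm{tr}_n(uz)$ is Niho too since its value at z = μx equals $\mathrm{tr}_m(x\,\varphi_u(\mu))$, where $\varphi_u(\mu) = \varphi(\mu) + \mathrm{tr}^n_m(u\mu)$. We have then $W_f(u) = 2^m (|\varphi_u^{-1}(0)| - 1)$ and $\sum_{u \in \mathbb{F}_{2^n}} W_f^w(u) = 2^{wm} \sum_{u \in \mathbb{F}_{2^n}} \big(|\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 1\big)^w$.

For all $u \in \mathbb{F}_{2^n}$, we have $|\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 1 \geq -1$ and therefore

$$\big(|\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 1\big)^w \geq |\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 1. \tag{6.19}$$

We deduce that $\sum_{u \in \mathbb{F}_{2^n}} \big(|\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 1\big)^w \geq \sum_{u \in \mathbb{F}_{2^n}} \big(|\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 1\big) = \sum_{\mu \in U} |\{u \in \mathbb{F}_{2^n};\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| - 2^n$. For each μ, since uμ ranges over $\mathbb{F}_{2^n}$ when u ranges over $\mathbb{F}_{2^n}$, and since $\mathrm{tr}^n_m(z)$ ranges uniformly over $\mathbb{F}_{2^m}$ when z ranges over $\mathbb{F}_{2^n}$, we have $|\{u \in \mathbb{F}_{2^n};\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| = 2^m$. Hence $\sum_{u \in \mathbb{F}_{2^n}} W_f^w(u) \geq 2^{wm} \big((2^m + 1) 2^m - 2^n\big) = 2^{(w+1)m}$ with equality if and only if, for every $u \in \mathbb{F}_{2^n}$, we have equality in (6.19), that is, $|\{\mu \in U;\ \varphi(\mu) = \mathrm{tr}^n_m(u\mu)\}| \in \{0, 1, 2\}$, that is,$^{17}$ $W_f(u) \in \{-2^m, 0, 2^m\}$. Moreover, this last condition is equivalent to $W_f(u) \in \{-2^m, 2^m\}$ for every u, that is, f is bent, because of the Parseval identity $\sum_{u \in \mathbb{F}_{2^n}} W_f^2(u) = 2^{2n}$. And we have $\sum_{u \in \mathbb{F}_{2^n}} W_f^w(u) = 2^n \sum_{x_1, \dots, x_{w-1} \in \mathbb{F}_{2^n}} (-1)^{\bigoplus_{i=1}^{w-1} f(x_i) \,\oplus\, f\left(\sum_{i=1}^{w-1} x_i\right)}$.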

Proposition 82 allows proving the bentness of classes of Niho functions: a set E of Niho functions is made of bent functions if and only if $\sum_{f \in E} \sum_{a \in \mathbb{F}_{2^n}} W_f^w(a) = 2^{(w+1)m} |E|$. And handling w = 3 is easier than w = 4. Corollary 13, page 194, and the remark that follows it generalize to odd exponents.
Note that this characterization is not valid for all Boolean functions, even if their algebraic degree is bounded above by n/2 (like bent functions). For instance, it is easily seen that the function, which is null when $x_1 = x_2 = \cdots = x_{n/2} = 0$ and has value 1 everywhere else, has a negative value of $\sum_{a \in \mathbb{F}_{2^n}} W_f^3(a)$ as soon as n ≥ 6 and has algebraic degree bounded above by n/2 (since the value of the function depends on half of its variables only).
However, it is interesting to see that this characterization is also valid for those quadratic functions that are null at 0, since for any quadratic function f, we have $\sum_{a \in \mathbb{F}_{2^n}} W_f^3(a) = (-1)^{f(0)}\, 2^n \sum_{x, y \in \mathbb{F}_{2^n}} (-1)^{\beta_f(x, y)} = (-1)^{f(0)}\, 2^{2n}\, |E_f|$, where $\beta_f$ is the symplectic form $\beta_f(x, y) = f(x + y) \oplus f(x) \oplus f(y) \oplus f(0)$ and $E_f$ is its kernel, and we know that f is bent if and only if $E_f = \{0\}$.
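The identities above for the third power moment are easy to confirm numerically. The following Python sketch is an illustration under assumptions that are not in the text: n = 6, vectors of $\mathbb{F}_2^6$ encoded as integers, and three hypothetical test functions (a bent quadratic form, a degenerate quadratic form, and the function that is null exactly when the first n/2 coordinates vanish).

```python
N = 6                                    # number of variables (assumption)
POINTS = range(1 << N)

def bit(x, i):
    return (x >> i) & 1

def dot(x, u):
    """Inner product over F_2 of two vectors encoded as integers."""
    return bin(x & u).count("1") & 1

def walsh_spectrum(f):
    return [sum((-1) ** (f(x) ^ dot(x, u)) for x in POINTS) for u in POINTS]

def cube_sum(f):
    """Sum over u of W_f(u)^3."""
    return sum(w ** 3 for w in walsh_spectrum(f))

def kernel_size(f):
    """|E_f| for beta_f(x, y) = f(x + y) + f(x) + f(y) + f(0)."""
    return sum(all(f(x ^ y) ^ f(x) ^ f(y) ^ f(0) == 0 for y in POINTS)
               for x in POINTS)

def bent_quadratic(x):                   # x1 x2 + x3 x4 + x5 x6
    return (bit(x, 0) & bit(x, 1)) ^ (bit(x, 2) & bit(x, 3)) ^ (bit(x, 4) & bit(x, 5))

def degenerate_quadratic(x):             # x1 x2 alone, not bent in 6 variables
    return bit(x, 0) & bit(x, 1)

def half_or(x):                          # 0 iff x1 = x2 = x3 = 0, degree n/2
    return int((x & 0b111) != 0)

for f in (bent_quadratic, degenerate_quadratic):
    print(f.__name__, cube_sum(f), 2 ** (2 * N) * kernel_size(f))
print(half_or.__name__, cube_sum(half_or))
```

For the two quadratic functions the two printed numbers coincide, and the value for `half_or` is negative, which illustrates that the lower bound $2^{2n}$ cannot hold for arbitrary functions of degree at most n/2.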
It has been shown in [861] that the only bent functions of the form (5.4), page 168, equivalently (5.8), are, up to translation, those corresponding to Niho-bent and $PS^{\#}_{ap}$ classes.

Niho-like and H-like bent functions It is possible to extend to other spreads (than the
Desarguesian spread) the principle of H and Niho functions (i.e., considering Boolean
functions whose restrictions to the elements of a spread are linear). This has been done
with André’s spreads in [246] and with three spreads from prequasifields and presemifields

17 Recall that w is odd; for w ≥ 4 even, there cannot be equality, and we know it already from Proposition 66.

in [335] and in [246], independently.18 Probably many other spreads could be investigated,
since many more exist; see [425, 649, 661]. But we wish to find explicit examples of such
bent functions (and the associated o-like-polynomials).
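Let us first study the general framework into which these four examples of spreads will fit. Consider a spread whose elements are the subspace $\{(0, y),\ y \in \mathbb{F}_{2^{n/2}}\}$ and the $2^{n/2}$ subspaces of the form $\{(x, L_z(x)),\ x \in \mathbb{F}_{2^{n/2}}\}$, where, for every $z \in \mathbb{F}_{2^{n/2}}$, function $L_z$ is linear. The property of being a spread corresponds to the fact that, for every nonzero $x \in \mathbb{F}_{2^{n/2}}$, the mapping $z \mapsto L_z(x)$ is a permutation of $\mathbb{F}_{2^{n/2}}$. Let us denote by $\Gamma_x$ the compositional inverse of this bijection.
A Boolean function over $\mathbb{F}_{2^{n/2}}^2$ is linear over each element of the spread if and only if there exists a mapping $G : \mathbb{F}_{2^{n/2}} \to \mathbb{F}_{2^{n/2}}$ and an element ν of $\mathbb{F}_{2^{n/2}}$ such that, for every $y \in \mathbb{F}_{2^{n/2}}$, $f(0, y) = \mathrm{tr}_{n/2}(\nu y)$ and for every $x, z \in \mathbb{F}_{2^{n/2}}$, $x \neq 0$,

$$f(x, L_z(x)) = \mathrm{tr}_{n/2}(G(z)\, x). \tag{6.20}$$

Note that, up to EA equivalence, we can assume that ν = 0. Indeed, we can add the linear n-variable function $(x, y) \mapsto \mathrm{tr}_{n/2}(\nu y)$ to f; this changes ν into 0 and G(z) into $G(z) + L_z^*(\nu)$, where $L_z^*$ is the adjoint operator of $L_z$, since for $y = L_z(x)$, we have $\mathrm{tr}_{n/2}(\nu y) = \mathrm{tr}_{n/2}(x L_z^*(\nu))$. We take ν = 0 and define $\Gamma_0(y) = 0$. By definition of $\Gamma_x$, Relation (6.20) is equivalent to

$$\forall x, y \in \mathbb{F}_{2^{n/2}}, \quad f(x, y) = \mathrm{tr}_{n/2}\big(G(\Gamma_x(y))\, x\big). \tag{6.21}$$

The value of the Walsh transform $W_f(a, b) = \sum_{x, y \in \mathbb{F}_{2^{n/2}}} (-1)^{f(x, y) + \mathrm{tr}_{n/2}(ax + by)}$ equals then, for every $(a, b) \in \mathbb{F}_{2^{n/2}}^2$,

$$\begin{aligned}
\sum_{(x, y) \in \mathbb{F}_{2^{n/2}}^2} (-1)^{\mathrm{tr}_{n/2}\left(G(\Gamma_x(y)) x + ax + by\right)}
&= 2^{n/2}\, \delta_0(b) + \sum_{x \in \mathbb{F}_{2^{n/2}}^*,\, z \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(G(z) x + ax + b L_z(x)\right)} \\
&= 2^{n/2} (\delta_0(b) - 1) + \sum_{z \in \mathbb{F}_{2^{n/2}}} \sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left((G(z) + a + L_z^*(b))\, x\right)} \\
&= 2^{n/2} \Big( \delta_0(b) - 1 + \big|\{z \in \mathbb{F}_{2^{n/2}};\ G(z) + a + L_z^*(b) = 0\}\big| \Big).
\end{aligned}$$

Hence f is bent if and only if G is a permutation and

$$\big|\{z \in \mathbb{F}_{2^{n/2}};\ G(z) + a + L_z^*(b) = 0\}\big| \in \{0, 2\}, \quad \forall a, b \in \mathbb{F}_{2^{n/2}},\ b \neq 0. \tag{6.22}$$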
This condition on G(z) is similar to the definition of o-polynomials. In the case of André’s
spreads, it is a generalization of the notion of o-polynomial.

18 Reference [335] is a little more general: it deals with functions affine on each spread and also addresses odd characteristics. It shows that bent functions from $\mathbb{F}_p^m \times \mathbb{F}_p^m$ to $\mathbb{F}_p$, which are affine on the elements of a given spread of $\mathbb{F}_p^m \times \mathbb{F}_p^m$, either arise from partial spread bent functions, or are a generalization in characteristic 2 of class H. Reference [246] is slightly more general as well since it also addresses spreads not related to prequasifields.

Remark. As explained for instance in the nice survey by W. Kantor [661], every spread has a dual in the space of linear forms. Viewing this in $\mathbb{F}_{2^{n/2}}^2$, the subspaces belonging to this spread are the orthogonals of those corresponding to the original spread. In other words, the fact that for every $x \neq 0$ and every $b \neq 0$, the function $z \mapsto \mathrm{tr}_{n/2}(b L_z(x)) = \mathrm{tr}_{n/2}(x L_z^*(b))$ is balanced implies that function $z \mapsto L_z^*(b)$ is also a permutation and the elements of the dual spread are the subspace $\{(x, 0),\ x \in \mathbb{F}_{2^{n/2}}\}$ and the $2^{n/2}$ subspaces $\{(L_z^*(y), y),\ y \in \mathbb{F}_{2^{n/2}}\}$.

It is shown in [246] that, as in the case of o-polynomials, the condition that G is a permutation is implied by Relation (6.22).
The question that can lead to new bent functions when addressed positively is: can we efficiently build permutations G of $\mathbb{F}_{2^{n/2}}$ and linear mappings $L_z : \mathbb{F}_{2^{n/2}} \to \mathbb{F}_{2^{n/2}}$, with $z \in \mathbb{F}_{2^{n/2}}$, such that function $z \mapsto L_z(x)$ is bijective for every $x \neq 0$ and that the equation $G(z) + L_z(b) = a$ has zero or two solutions for every $a \in \mathbb{F}_{2^{n/2}}$ and every $b \in \mathbb{F}_{2^{n/2}}^*$? Equivalently, by denoting by $H_x$ the permutation $z \mapsto L_z(x)$, can we find a permutation G and a set of permutations $H_x$, $x \in \mathbb{F}_{2^{n/2}}^*$, such that, denoting $H_0 = 0$, the set $\{H_x,\ x \in \mathbb{F}_{2^{n/2}}\}$ is a vector space and every function $G + H_x$, $x \in \mathbb{F}_{2^{n/2}}^*$, is two-to-one? Note that finding
nine classes of o-polynomials has been a hard 40-year-long mathematical work and we can
expect that finding such o-like-polynomials will be also difficult, except maybe for a few
simple cases like with o-polynomials.
In the case of André's spreads, we have $L_z(x) = x^{2^{k\phi(z)}} z$. According to (6.12), we have then $\Gamma_x(y) = \dfrac{y}{x^{2^{k\phi(y/x)}}}$ and $L_z^*(b) = (bz)^{2^{m - k\phi(z)}}$. Relation (6.21) becomes

$$\forall x, y \in \mathbb{F}_{2^{n/2}}, \quad f(x, y) = \mathrm{tr}_{n/2}\left( G\left( \frac{y}{x^{2^{k\phi(y/x)}}} \right) x \right). \tag{6.23}$$

The condition for such f to be bent is that, for every $b \in \mathbb{F}_{2^{n/2}}^*$ and every $a \in \mathbb{F}_{2^{n/2}}$, there exist two values of z or none such that $G(z) + (bz)^{2^{m - k\phi(z)}} = a$.
As shown for instance in [425, 649] (and recalled by Kantor in [662]), a spread can
be derived from any prequasifield, that is, any Abelian finite group having a second law
∗ that is left-distributive with respect to the first law and is such that the right and left
multiplications by a nonzero element are bijective, and that the multiplications by 0 are
absorbent. The elements of this spread are the F2 -vector subspaces {(0, y), y ∈ F2n/2 } and
{(x, z ∗ x), x ∈ F2n/2 }, z ∈ F2n/2 . Wu [1123] has studied three particular examples for
designing PS functions (many others could have been studied), and he determined explicitly
the related functions $\Gamma_x$. Let us see what we obtain with them in the framework of Niho-like
functions.
The Dempwolff–Müller prequasifield is defined as follows. Let k and m be co-prime odd integers. Let $e = 2^{m-1} - 2^{k-1} - 1$, $L(x) = \sum_{i=0}^{k-1} x^{2^i}$, and define $x * y = x^e L(xy)$. Then $(\mathbb{F}_{2^m}, +, *)$ is a prequasifield [431], leading to the spread of the $\mathbb{F}_2$-vector subspaces $\{(0, y),\ y \in \mathbb{F}_{2^m}\}$ and $\{(x, z * x),\ x \in \mathbb{F}_{2^m}\} = \{(x, z^e L(xz)),\ x \in \mathbb{F}_{2^m}\}$, $z \in \mathbb{F}_{2^m}$.
Then $\Gamma_x(y) = \dfrac{1}{x\, D_d\!\left( \frac{y^2}{x^{2^k + 1}} \right)}$, where $D_d$ is the Dickson polynomial (see the definition at page 389) of index the inverse d of $2^k - 1$ modulo $2^m - 1$, and $L_z^*(b) = \sum_{i=0}^{k-1} (b z^e)^{2^{-i}} z$. Relation (6.21) becomes

$$\forall x, y \in \mathbb{F}_{2^m}, \quad f(x, y) = \mathrm{tr}_m\left( G\left( \frac{1}{x\, D_d\!\left( \frac{y^2}{x^{2^k + 1}} \right)} \right) x \right), \tag{6.24}$$

and such f is bent if and only if the equation $G(z) + \sum_{i=0}^{k-1} (b z^e)^{2^{-i}} z = a$ has 0 or 2 solutions for every $b \neq 0$ and every a.
The Knuth commutative presemifield is defined as follows. Let m be an odd integer
and b ∈ F∗2m . Then x ∗ y = xy + x 2 trm (by) + y 2 trm (bx) defines a presemifield (a
prequasifield that remains one when a ∗ b is replaced by b ∗ a), leading to the spread of
the F2 -vector subspaces {(0, y), y ∈ F2m } and {(x, z ∗ x), x ∈ F2m } = {(x, zx + x 2 trm (bz) +
z2 trm (bx)), x ∈ F2m }, z ∈ F2m .  

Then x (y) = (1 + trm (bx)) yx + xtrm b yx + xtrm (bx)C 1 xy2 , where Ca (x) =
m−1 2i , where c = 1 + 1 + · · · +
bx

i=0 ic x 0 2i 3·2i
1
(m−3)·2i
, ci = 1 + 1
2i
+ 3·2
1
i + · · · + (i−2)·2i +
1
a a a a a a
1
(i+1)·2i
+ ··· + 1
(m−1)·2i
if i is odd and ci = 1 + 1
2·2i
+ 1
4·2i
+ ··· + 1
(i−2)·2i
+ 1
i +
a a a a a a (i+1)·2
· · · + (m−2)·2 ∗ 2m−1 2m−1
i if i is even. We have Lz (b) = bz + b trm (bz) + btrm (b
1
z). Relation
a
(6.21) becomes
  y  y  y  
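$$f(x, y) = \mathrm{tr}_m\left( G\left( (1 + \mathrm{tr}_m(bx)) \frac{y}{x} + x\, \mathrm{tr}_m\!\left( b \frac{y}{x} \right) + x\, \mathrm{tr}_m(bx)\, C_{\frac{1}{bx}}\!\left( \frac{y}{x^2} \right) \right) x \right), \tag{6.25}$$

and such f is bent if and only if the equation $G(z) + bz + b^{2^{m-1}} \mathrm{tr}_m(bz) + b\, \mathrm{tr}_m(b^{2^{m-1}} z) = a$ has zero or two solutions for every $b \neq 0$ and every a.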
There are more examples of semifields due to Knuth [710] that could be studied.
A third example is the dual of the symplectic version of the Knuth commutative
presemifield. Assume m is an odd integer. Then x ∗ y = x 2 y + trm (xy) + xtrm (y) defines
a presemifield [659], leading to two spreads:
– The spread of the F2 -vector subspaces {(0, y), y ∈ F2m } and {(x, z ∗ x), x ∈ F2m } =
{(x, z2 x + trm (zx) + ztrm (x)), x ∈ F2m }
– The spread of the F2 -vector subspaces {(0, y), y ∈ F2m } and {(x, x ∗ z), x ∈ F2m } =
{(x, x 2 z + trm (xz) + xtrm (z)), x ∈ F2m }), where z ∈ F2m (two such spreads are
sometimes called opposite of each other)

In the first case, the corresponding function $\Gamma_x$ has been determined in [1123] and $L_z^*(b) = b z^2 + z\, \mathrm{tr}_m(b) + \mathrm{tr}_m(bz)$. Then f(x, y) equals
⎛ ⎛⎡ ⎤
m−1 m−3
2 2
⎜ ⎜⎢ m−1 2i −1 2i ⎥ trm (x)
trm ⎝G ⎝⎣(xy)2 + (xy)2 + x 2 trm (xy)⎦
x
i=0 i=0
⎞ ⎞
m−1 −1 m−1 m−1 −1 ⎟ ⎟
+x 2 y2 + x2 trm (xy)⎠ x ⎠ , (6.26)

and such f is bent if and only if the equation $G(z) + b z^2 + z\, \mathrm{tr}_m(b) + \mathrm{tr}_m(bz) = a$ has zero or two solutions for every $b \neq 0$ and every a.

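In the second case, the relation $y = x^2 z + \mathrm{tr}_m(xz) + x\, \mathrm{tr}_m(z)$ implies for $x \neq 0$ that

$$\begin{cases}
z = \dfrac{y}{x^2} + \dfrac{\mathrm{tr}_m(xz)}{x^2} + \dfrac{\mathrm{tr}_m(z)}{x} \\[2mm]
\mathrm{tr}_m(xz) = \mathrm{tr}_m\!\left( \dfrac{y}{x} \right) + \mathrm{tr}_m(xz)\, \mathrm{tr}_m\!\left( \dfrac{1}{x} \right) + \mathrm{tr}_m(z) \\[2mm]
\mathrm{tr}_m(z) = \mathrm{tr}_m\!\left( \dfrac{y}{x^2} \right) + \big( \mathrm{tr}_m(xz) + \mathrm{tr}_m(z) \big)\, \mathrm{tr}_m\!\left( \dfrac{1}{x} \right)
\end{cases}$$

and is equivalent to

$$z = \frac{y}{x^2} + \mathrm{tr}_m\!\left( \frac{1}{x} \right) \left( \frac{\mathrm{tr}_m\!\left( \frac{y}{x^2} \right)}{x^2} + \frac{\mathrm{tr}_m\!\left( \frac{y}{x} \right)}{x} \right) + \left( \mathrm{tr}_m\!\left( \frac{1}{x} \right) + 1 \right) \left( \frac{\mathrm{tr}_m\!\left( \frac{y}{x^2} \right) + \mathrm{tr}_m\!\left( \frac{y}{x} \right)}{x^2} + \frac{\mathrm{tr}_m\!\left( \frac{y}{x^2} \right)}{x} \right),$$

which gives $\Gamma_x(y)$. We have $L_z^*(b) = (bz)^{2^{m-1}} + z\, \mathrm{tr}_m(b) + b\, \mathrm{tr}_m(z)$. Then f(x, y) equals

$$\mathrm{tr}_m\left( G\left( \frac{y}{x^2} + \mathrm{tr}_m\!\left( \frac{1}{x} \right) \left( \frac{\mathrm{tr}_m\!\left( \frac{y}{x^2} \right)}{x^2} + \frac{\mathrm{tr}_m\!\left( \frac{y}{x} \right)}{x} \right) + \left( \mathrm{tr}_m\!\left( \frac{1}{x} \right) + 1 \right) \left( \frac{\mathrm{tr}_m\!\left( \frac{y}{x^2} \right) + \mathrm{tr}_m\!\left( \frac{y}{x} \right)}{x^2} + \frac{\mathrm{tr}_m\!\left( \frac{y}{x^2} \right)}{x} \right) \right) x \right), \tag{6.27}$$

and such f is bent if and only if the equation $G(z) + (bz)^{2^{m-1}} + z\, \mathrm{tr}_m(b) + b\, \mathrm{tr}_m(z) = a$ has zero or two solutions for every $b \neq 0$ and every a.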
See in [5] more constructions of bent functions linear on elements of presemifield
spreads and a survey on this topic, with explicit descriptions of such functions for known
commutative presemifields and related (new types of) oval polynomials.

4. Class C + has been introduced by Dillon [441]. Since $PS_{ap}$ functions have for supports the unions of multiplicative cosets of the subgroup $\mathbb{F}_{2^{n/2}}^*$ of $\mathbb{F}_{2^n}^*$ (i.e., the subgroup of all $(2^{n/2} + 1)$th powers), plus possibly the 0 element, he addressed the other possible subgroup U of $(2^{n/2} - 1)$th powers in $\mathbb{F}_{2^n}^*$, and studied then the functions of the form $f(z) = g\big(z^{2^{n/2}+1}\big)$, where $z \in \mathbb{F}_{2^n}$ and g is balanced over $\mathbb{F}_{2^{n/2}}$ (note that $z^{2^{n/2}+1} \in \mathbb{F}_{2^{n/2}}$) and vanishes at 0 (if not, we can apply the result to g ⊕ 1). He showed that such function is bent if and only if the mapping $a \in \mathbb{F}_{2^{n/2}} \mapsto W_g(a^{-1}) = \sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{g(x) + \mathrm{tr}_{n/2}(a^{-1} x)}$, with the convention $0^{-1} = 0$, equals the Walsh transform of some Boolean function over $\mathbb{F}_{2^{n/2}}$.
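Dillon refers to the Singer difference set in his proof. An elementary proof is as follows: using polar representation ux (with $x \in \mathbb{F}_{2^{n/2}}^*$, $u \in U$) in $\mathbb{F}_{2^n}^*$, we have for every $\lambda \in \mathbb{F}_{2^n}$ that

$$W_f(\lambda) = \sum_{z \in \mathbb{F}_{2^n}} (-1)^{g\left(z^{2^{n/2}+1}\right) + \mathrm{tr}_n(\lambda z)} = 1 + \sum_{u \in U} \sum_{x \in \mathbb{F}_{2^{n/2}}^*} (-1)^{g(x^2) + \mathrm{tr}_{n/2}\left( (\lambda u + \lambda^{2^{n/2}} u^{2^{n/2}})\, x \right)} = -2^{n/2} + \sum_{u \in U} W_g\Big( \big(\lambda u + \lambda^{2^{n/2}} u^{2^{n/2}}\big)^2 \Big).$$

If λ = 0, then $W_f(\lambda) = -2^{n/2} + (2^{n/2} + 1) W_g(0) = -2^{n/2}$. Otherwise, we can assume without loss of generality that λ belongs to $\mathbb{F}_{2^{n/2}}^*$ (since $W_f(\lambda)$ is clearly invariant when multiplying λ by an element of U), and we have then $W_f(\lambda) = -2^{n/2} + \sum_{u \in U} W_g\Big( \lambda^2 \big(u + u^{2^{n/2}}\big)^2 \Big)$. It is well known that, when u ranges over U \ {1}, $u + u^{2^{n/2}}$ ranges twice over the set $\{z \in \mathbb{F}_{2^{n/2}}^*,\ \mathrm{tr}_{n/2}(z^{-1}) = 1\}$ (indeed, we have $u^{2^{n/2}} = u^{-1}$ and the equation $u + u^{-1} = z$, i.e., $\left(\frac{u}{z}\right)^2 + \frac{u}{z} = z^{-2}$, has solutions in U \ {1} if and only if $\mathrm{tr}_{n/2}(z^{-1}) = 1$). Then, since g is balanced, $W_f(\lambda)$ equals

$$-2^{n/2} + 2 \sum_{z \in \mathbb{F}_{2^{n/2}}^*;\ \mathrm{tr}_{n/2}(z^{-1}) = 1} W_g\big( (\lambda z)^2 \big) = -2^{n/2} + \sum_{z \in \mathbb{F}_{2^{n/2}}} W_g\big( (\lambda z)^2 \big) \Big( 1 - (-1)^{\mathrm{tr}_{n/2}(z^{-2})} \Big)$$

and $W_f(\lambda)$ is equal to $-2^{n/2} + 2^{n/2} (-1)^{g(0)} - \sum_{z \in \mathbb{F}_{2^{n/2}}} W_g\big( \lambda^2 z^2 \big) (-1)^{\mathrm{tr}_{n/2}(z^{-1})} = - \sum_{z \in \mathbb{F}_{2^{n/2}}} W_g\big( z^{-1} \big) (-1)^{\mathrm{tr}_{n/2}(\mu z)}$, where $\mu = \lambda^2 \neq 0$.
Hence, f is bent if and only if $\sum_{z \in \mathbb{F}_{2^{n/2}}} W_g\big( z^{-1} \big) (-1)^{\mathrm{tr}_{n/2}(\mu z)} \in \{2^{n/2}, -2^{n/2}\}$ for every $\mu \neq 0$, which is then equivalent to the condition stated by Dillon, according to the inverse Fourier transform formula, since it is always verified for μ = 0.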
Dillon mentions the example where g is the absolute trace function trn/2 (x) over F2n/2 ;
the resulting function is quadratic and so belongs to class M completed, and it also belongs
to class H, up to EA equivalence. No example is known yet lying outside known completed
classes.
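Dillon's example with g equal to the absolute trace can be verified directly on a small field. The sketch below is illustrative only and rests on assumptions not made in the text: n = 6, $\mathbb{F}_{64} = \mathbb{F}_2[t]/(t^6+t+1)$ with integer-encoded elements, and hypothetical helper names. It computes the Walsh spectrum of $f(z) = \mathrm{tr}_{n/2}(z^{2^{n/2}+1})$ and compares it with the values obtained in the proof above.

```python
N, POLY = 6, 0b1000011         # assumption: F_64 = F_2[t]/(t^6 + t + 1)
Q = 1 << N
HALF = N // 2                  # n/2 = 3, so z^(2^HALF + 1) lies in the subfield F_8

def mul(a, b):
    """Multiply two field elements encoded as integers in [0, Q)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def tr(a, k):
    """a + a^2 + ... + a^(2^(k-1)); the absolute trace of F_(2^k) when a lies in it."""
    s = 0
    for _ in range(k):
        s ^= a
        a = mul(a, a)
    return s

def f(z):
    # f(z) = g(z^(2^(n/2)+1)) with g = tr_{n/2}, Dillon's example
    return tr(power(z, (1 << HALF) + 1), HALF)

F_VALUES = [f(z) for z in range(Q)]

def walsh(lam):
    return sum((-1) ** (F_VALUES[z] ^ tr(mul(lam, z), N)) for z in range(Q))

spectrum = [walsh(lam) for lam in range(Q)]
print(spectrum[0], sorted(set(spectrum)))
```

The value at λ = 0 is $-2^{n/2} = -8$, as computed in the proof, and every other value is ±8, so this f is bent.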

5. Dobbertin's class, introduced in [466], is a class of bent functions that contains both $PS_{ap}$ and M and is based on the so-called triple construction. The elements of this class are the functions f defined by $f(x, \phi(y)) = g\left( \frac{x + \psi(y)}{y} \right)$, where g is a balanced Boolean function on $\mathbb{F}_{2^{n/2}}$ and φ, ψ are two mappings from $\mathbb{F}_{2^{n/2}}$ to itself such that, if T denotes the affine subspace of $\mathbb{F}_{2^{n/2}}$ spanned by the support of function $W_g$, then, for any a in $\mathbb{F}_{2^{n/2}}$, the functions φ and ψ are affine on aT = {ax, x ∈ T}. The mapping φ must additionally be one-to-one. Dobbertin gives two explicit examples of bent functions constructed this way. In both, φ is a power function.

6. The class of functions γ related to almost bent functions exists when n ≡ 2 [mod 4]. Recall that a vectorial Boolean function $F : \mathbb{F}_{2^{n/2}} \to \mathbb{F}_{2^{n/2}}$ is called almost bent (see Definition 31, page 119) if the Walsh transforms of all component functions v · F, $v \neq 0$ in $\mathbb{F}_{2^{n/2}}$, take values in $\{-2^{\frac{n/2+1}{2}}, 0, 2^{\frac{n/2+1}{2}}\}$. The function $\gamma_F(a, b)$, $a, b \in \mathbb{F}_{2^{n/2}}$, equal to 1 if the equation F(x) + F(x + a) = b admits solutions, with $a \neq 0$ in $\mathbb{F}_{2^{n/2}}$, and equal to 0 otherwise, is then bent (see Proposition 158, page 375), and the dual of $\gamma_F$ is the indicator of the Walsh support of F, deprived of (0, 0). Several classes of AB functions are known (see Section 11.4, page 394). The bent functions $\gamma_F$ associated to known AB functions have been investigated in [152]. We give them below:

– Gold: $F(x) = x^{2^i+1}$, gcd(i, n/2) = 1, $\gamma_F(a, b) = \mathrm{tr}_{n/2}\left( \frac{b}{a^{2^i+1}} \right)$ with $\frac{1}{0} = 0$
– Inverse: $F(x) = x^{2^{n/2}-2}$, $\gamma_F(a, b) = \mathrm{tr}_{n/2}\left( \frac{1}{ab} \right) + 1 + \delta_0(a) + \delta_0(b) + \delta_0(a)\delta_0(b) + \delta_0(ab + 1)$, where $\delta_0(x)$ is the Dirac (or Kronecker) function
– Kasami–Welch $F(x) = x^{2^{2i} - 2^i + 1}$, gcd(i, n/2) = 1, Welch $F(x) = x^{2^{\frac{n/2-1}{2}} + 3}$, Niho $F(x) = x^{2^{\frac{n/2-1}{2}} + 2^{\frac{n/2-1}{4}} - 1}$ if n/2 ≡ 1 [mod 4], $F(x) = x^{2^{\frac{n/2-1}{2}} + 2^{\frac{3(n/2-1)/2 + 1}{2}} - 1}$ if n/2 ≡ 3 [mod 4]

We have $F(x + 1) + F(x) = q(x^{2^s} + x)$, where gcd(s, n/2) = 1 and q is in each case a permutation determined by Dobbertin (see [470]):
1. Kasami–Welch: s = i, $q(x) = \dfrac{x^{2^i+1}}{\sum_{j=1}^{i'} x^{2^{ji}} + \alpha\, \mathrm{tr}_{n/2}(x)} + 1$, where i′ ≡ 1/i [mod n/2], and α = 0 if i′ is odd, α = 1 otherwise.
2. Welch: $s = \frac{n/2-1}{2}$, $q(x) = x^{2^{s+1}+1} + x^3 + x + 1$.
3. Niho: $s = \frac{n/2-1}{4}$ if n/2 ≡ 1 [mod 4] and $s = \frac{3(n/2-1)/2 + 1}{2}$ if n/2 ≡ 3 [mod 4], $q(x) = \dfrac{1}{g(x^{2^s - 1}) + 1} + 1$ if $x \notin \mathbb{F}_2$ and q(x) = 1 otherwise, where $g(x) = x^{2^{2s+1} + 2^{s+1} + 1} + x^{2^{2s+1} + 2^{s+1} - 1} + x^{2^{2s+1} + 1} + x^{2^{2s+1} - 1} + x$,

and F(x + 1) + F(x) = b has solutions if and only if $\mathrm{tr}_{n/2}(q^{-1}(b)) = 0$. Then

$$\gamma_F(a, b) = \begin{cases} \mathrm{tr}_{n/2}\big( q^{-1}(b / a^d) \big) + 1 & \text{if } a \neq 0, \\ 0 & \text{otherwise.} \end{cases}$$
The functions γF associated to Kasami–Welch, Welch, and Niho functions with n/2 =
7, 9, are neither in the completed M class nor in the completed PS ap class.
The other known infinite classes of AB functions are quadratic; their associated γF belong
to the completed M class.
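For the simplest AB function, the Gold function $x^3$, both the definition of $\gamma_F$ and its closed form can be compared by exhaustive search. The Python sketch below is illustrative and assumes n = 6, $\mathbb{F}_8 = \mathbb{F}_2[t]/(t^3+t+1)$ with integer-encoded elements, and hypothetical helper names; it is not taken from [152].

```python
M, POLY = 3, 0b1011            # assumption: F_8 = F_2[t]/(t^3 + t + 1), n/2 = 3
Q = 1 << M

def mul(a, b):
    """Multiply two field elements encoded as integers in [0, Q)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def tr(a):
    """Absolute trace F_8 -> F_2, returned as 0 or 1."""
    s = 0
    for _ in range(M):
        s ^= a
        a = mul(a, a)
    return s

def F(x):
    return power(x, 3)          # Gold function x^(2^i + 1) with i = 1

def gamma(a, b):
    """1 iff a != 0 and F(x) + F(x + a) = b has a solution x."""
    if a == 0:
        return 0
    return int(any(F(x) ^ F(x ^ a) == b for x in range(Q)))

def gamma_closed_form(a, b):
    """tr(b / a^3) with the convention 1/0 = 0."""
    inv_a = power(a, Q - 2)     # equals 0 when a = 0
    return tr(mul(b, power(inv_a, 3)))

def walsh(u, v):
    return sum((-1) ** (gamma(a, b) ^ tr(mul(u, a) ^ mul(v, b)))
               for a in range(Q) for b in range(Q))

formula_ok = all(gamma(a, b) == gamma_closed_form(a, b)
                 for a in range(Q) for b in range(Q))
gamma_is_bent = all(abs(walsh(u, v)) == Q for u in range(Q) for v in range(Q))
print(formula_ok, gamma_is_bent)
```

Both flags come out true: the exhaustive definition agrees with $\mathrm{tr}_{n/2}(b / a^3)$, and $\gamma_F$ is bent in 6 variables.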

7. Classes of bent monomial Boolean univariate functions (which can more simply be
called monomial bent functions and are sometimes called power bent functions), that is,
functions of the form $f(x) = \mathrm{tr}_n(ax^d)$, where $x \in \mathbb{F}_{2^n}$ and $a$ belongs to some subset19 of
$\mathbb{F}_{2^n}^*$.
Obviously, $\mathrm{tr}_n(ax^d)$ can be bent only if the mapping $x \mapsto x^d$ is not a permutation
(otherwise, the function would be balanced, a contradiction), that is, if $d$ is not coprime
with $2^n - 1$.
19 It is impossible that $\mathrm{tr}_n(ax^d)$ be bent for every $a \neq 0$ since this would mean that the $(n, n)$-function $x^d$ is bent,
and we shall see in Proposition 104, page 269, that this is impossible.
It has been proved in [750] that $d$ must be coprime either with $2^{n/2} - 1$ or with $2^{n/2} + 1$.
Indeed, since $f(x)$ is invariant under multiplication of $x$ by $\beta = \alpha^{\frac{2^n-1}{\gcd(d, 2^n-1)}}$, where $\alpha$ is a
primitive element of $\mathbb{F}_{2^n}$, and is then invariant under multiplication by any element of the
multiplicative group of order $\gcd(d, 2^n - 1)$, we have $W_f(0) \equiv 1$ [mod $\gcd(d, 2^n - 1)$].
Hence, $\gcd(d, 2^n - 1)$, which equals $\gcd(d, 2^{n/2} - 1)\,\gcd(d, 2^{n/2} + 1)$, since $2^{n/2} - 1$ and
$2^{n/2} + 1$ are coprime, divides $W_f(0) - 1$. If $W_f(0) = 2^{n/2}$, then $\gcd(d, 2^{n/2} + 1) = 1$, and if
$W_f(0) = -2^{n/2}$, then $\gcd(d, 2^{n/2} - 1) = 1$.
Apart from the particular case of quadratic bent function $f(x) = \mathrm{tr}_{n/2}(x^{2^{n/2}+1})$, already
encountered, the known values of $d$ for which there exists at least one $a$ such that $\mathrm{tr}_n(ax^d)$
is bent (such values are called bent exponents) are the following (up to conjugacy $d \to 2^j d$
[mod $2^n - 1$]):
– The Gold exponents (already seen at page 206) $d = 2^j + 1$, where $\frac{n}{\gcd(j,n)}$ is even and
$a \notin \{x^d, x \in \mathbb{F}_{2^n}\} = \{x^{\gcd(d, 2^n-1)}, x \in \mathbb{F}_{2^n}\}$; being quadratic, function $\mathrm{tr}_n(ax^{2^j+1})$ belongs
to the completed Maiorana–McFarland class; these functions have been generalized in
[355, 629, 669, 699, 701, 808, 1144] to functions of the form
$\mathrm{tr}_n\left(\sum_{i=1}^{n/2-1} a_i x^{2^i+1}\right) + \mathrm{tr}_{n/2}\left(a_{n/2} x^{2^{n/2}+1}\right)$, $a_i \in \mathbb{F}_2$. Being quadratic, these functions all belong to completed
class $\mathcal{M}$.
A particular case of Gold exponents is when $\gcd(j, n) = 1$; function $\mathrm{tr}_n(ax^{2^j+1})$ is
then bent if and only if $a$ is not the $(2^j + 1)$th power of an element of $\mathbb{F}_{2^n}$, that is (since
$\gcd(2^j + 1, 2^n - 1) = 3$), $a$ is not a cube in $\mathbb{F}_{2^n}$. The same result exists with
– The Kasami exponents: $2^{2i} - 2^i + 1$ with $\gcd(i, n) = 1$: function $\mathrm{tr}_n(ax^{2^{2i}-2^i+1})$ is bent
if and only if $a$ is not a cube (this is proved in [448, Theorem 11] for $n$ not divisible
by 3 and is true also for $n$ divisible by 3 as seen by Leander [750]). Note that since the
functions in Maiorana–McFarland’s and $\mathcal{PS}^+$ classes are normal and functions in the
$\mathcal{PS}^-$ class have algebraic degree $\frac{n}{2}$, the Kasami bent functions, which have algebraic
degree $w_2(4^i - 2^i + 1) = i + 1$, do not belong, in general, to these classes (see
page 253).
– The Dillon exponents [440] (already seen at page 215): $d = j \cdot (2^{n/2} - 1)$, where
$\gcd(j, 2^{n/2} + 1) = 1$; function $\mathrm{tr}_n(ax^d)$, with $a \in \mathbb{F}_{2^{n/2}}$ without loss of generality, is
bent if and only if the Kloosterman sum $\sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}(x^{-1}+ax)}$ is null, where $1/0 = 0$
(it belongs then to the $\mathcal{PS}_{ap}$ class); see also [750].
– Two exponents that we give without proof:
• The Leander exponent $d = (2^{n/4} + 1)^2$, where $n$ is divisible by 4 but not by 8; see
[750]; see also [352], where the set of all $a$ such that the corresponding function
$\mathrm{tr}_n(ax^d)$ is bent is determined: $a = a' b^i$, $a' \in w\mathbb{F}_{2^{n/4}}$, $w \in \mathbb{F}_4 \setminus \mathbb{F}_2$, $b \in \mathbb{F}_{2^n}$; the
function belongs to the Maiorana–McFarland class.
• The Canteaut–Charpin–Kyureghyan exponent [197] $d = 2^{n/3} + 2^{n/6} + 1$, where $n$
is divisible by 6 (the corresponding function $\mathrm{tr}_n(ax^d)$ is bent if and only if $a = a' b^i$,
$a' \in \mathbb{F}_{2^{n/2}}$ such that $\mathrm{tr}^{n/2}_{n/6}(a') = a' + a'^{2^{n/6}} + a'^{2^{2n/6}} = 0$, $b \in \mathbb{F}_{2^n}$; it belongs to the
Maiorana–McFarland class).

It has been checked by Canteaut that all bent functions $\mathrm{tr}_n(ax^i)$ are covered by these classes
for $n \leq 20$ and shown in [352] that there is no other cubic exponent giving infinite classes
of bent functions in the Maiorana–McFarland class.
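As a small sanity check of the Gold case with $\gcd(j, n) = 1$ and of the congruence $W_f(0) \equiv 1$ [mod $\gcd(d, 2^n - 1)$], here is a sketch for $n = 4$, $d = 3$. The field representation ($X^4 + X + 1$) and the helper names are assumptions made for this illustration only.

```python
from math import gcd

# GF(16) = F_2[X]/(X^4 + X + 1); an element is an integer 0..15.
N = 4
MOD = 0b10011  # assumed modulus for this sketch

def gf_mul(u, v):
    """Multiply two elements of GF(16)."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u & (1 << N):
            u ^= MOD
    return r

def gf_pow(u, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, u)
    return r

def tr(u):
    """Absolute trace GF(16) -> GF(2): u + u^2 + u^4 + u^8."""
    return u ^ gf_pow(u, 2) ^ gf_pow(u, 4) ^ gf_pow(u, 8)

d = 3  # Gold exponent 2^j + 1 with j = 1

def walsh(a):
    """Walsh transform of x -> tr(a x^d), with inner product tr(u x)."""
    return [sum((-1) ** (tr(gf_mul(a, gf_pow(x, d))) ^ tr(gf_mul(u, x)))
                for x in range(16))
            for u in range(16)]

cubes = {gf_pow(x, 3) for x in range(1, 16)}
bent_a = {a for a in range(1, 16) if all(abs(w) == 4 for w in walsh(a))}
g = gcd(d, 2 ** N - 1)
congruence_ok = all((walsh(a)[0] - 1) % g == 0 for a in range(1, 16))
print(sorted(bent_a), congruence_ok)
```

In this instance the bent coefficients are exactly the non-cubes, and since $\gcd(3, 2^{2} - 1) = 3 \neq 1$, the value $W_f(0) = -2^{n/2}$ cannot occur for them.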

Remark. The bent sequences given in [1137] are particular cases of the constructions
given above (using also some of the secondary constructions given below).

8. Classes of bent polynomial functions in univariate representation. We also give them
without proof. See more in [227]:
• Quadratic bent functions; see page 206.
• $f(x) = \mathrm{tr}_n\left(a\left[x^{2^i+1} + (x^{2^i} + x + 1)\,\mathrm{tr}_n(x^{2^i+1})\right]\right)$, where $n \geq 6$, $\frac{n}{2}$ does not divide $i$,
$\frac{n}{\gcd(i,n)}$ even, $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_{2^i}$, $\{a, a + 1\} \cap \{x^{2^i+1}; x \in \mathbb{F}_{2^n}\} = \emptyset$; these functions, found in
[150] by applying CCZ equivalence to nonbent vectorial functions, belong to completed
$\mathcal{M}$ when $a \in \mathbb{F}_{2^{n/2}}$
• $f(x) = \mathrm{tr}_n\left(a\left[x + \mathrm{tr}^n_3\left(x^{2(2^i+1)} + x^{4(2^i+1)}\right) + \mathrm{tr}_n(x)\,\mathrm{tr}^n_3\left(x^{2^i+1} + x^{2^{2i}(2^i+1)}\right)\right]^{2^i+1}\right)$,
where $6 \mid n$, $\frac{n}{2}$ does not divide $i$, $\frac{n}{\gcd(i,n)}$ even, $a + d + d^2 \notin \{x^{2^i+1}; x \in \mathbb{F}_{2^n}\}$ for every
$d \in \mathbb{F}_{2^3}$; these functions (found in [150]) belong to completed $\mathcal{M}$
• The four known classes of Niho bent functions studied above;
• Classes of bent functions via Dillon exponents and their generalizations [350, 441, 448,
478, 479, 629, 749, 752, 764, 851, 852, 853, 871, 1144] (we develop some of them in
other subsections of this book and do not have the room for detailing each)
• The trace function of the multinomial APN functions that we shall describe at page 406
[116].
• Sums of some known bent functions and products of linear functions [1131].

9. Classes of bent polynomial functions in bivariate representation. Apart from Maiorana–
McFarland functions, $\mathcal{PS}_{ap}$ functions, and functions in class $\mathcal{H}$ in bivariate form, there is
the isolated class seen at page 207, $f(x, y) = \mathrm{tr}_m(x^{2^i+1} + y^{2^i+1} + xy)$, $x, y \in \mathbb{F}_{2^m}$, where
$\gcd(3, n) = \gcd(i, n) = 1$.

10. Bent functions obtained as restrictions and extensions. In [734], the authors studied
whether the restrictions to hyperplanes of Gold functions $\mathrm{tr}_n(x^{2^i+1})$ (see page 206) on $\mathbb{F}_{2^n}$, for $n$
odd, $\gcd(i, n) = 1$, could be bent. It is shown that this happens with any linear hyperplane
not containing element 1. It was already known20 from [57] that, for any $(n, n)$-function $F$
satisfying $F(0) = 0$ and such that, for every $a \in \mathbb{F}_{2^n}^*$, the set $H_a = \{D_aF(x); x \in \mathbb{F}_2^n\}$ is
the complement of a linear hyperplane,21 the restriction of the Boolean function $1_{H_a} \circ F$ to
any linear hyperplane not containing $a$ is bent. Note that the restriction to the complement
of this hyperplane (its translate by $a$) is bent too, since $1_{H_a} \circ F(x) + 1_{H_a} \circ F(x + a)$ equals constant function 1, because
$F(x) + F(x + a) \in H_a$ implies that $F(x) \in H_a$ is equivalent to $F(x + a) \notin H_a$.

20 But [734] also determines the dual function.
21 Reference [57] calls these functions crooked, but we shall use this term at page 278 for a slightly more general
notion.

For $F(x) = x^{2^i+1}$, we have $H_a = \{a^{2^i+1}(x^{2^i} + x + 1); x \in \mathbb{F}_{2^n}\}$ and $1_{H_a} \circ F(x) =
\mathrm{tr}_n\left(\left(\frac{x}{a}\right)^{2^i+1}\right)$. In [548], the authors prove that the restriction of any Gold AB function to
any linear hyperplane is bent.
Dillon and McGuire studied in [449] the more difficult case of Kasami functions
$\mathrm{tr}_n(x^{4^i-2^i+1})$ (see page 230) on $\mathbb{F}_{2^n}$, for $n$ odd, $\gcd(i, n) = 1$. They showed that for $n$
not divisible by 3, there is one Kasami exponent with $n = 3k \pm 1$ for which the function
is bent when restricted to one particular hyperplane (of equation $\mathrm{tr}_n(x) = 0$). This function
is not bent when restricted to any other hyperplane. They also presented a criterion for
the restriction of a near-bent function (see Subsection 6.2.4) to a hyperplane to be bent.
More investigations between bent restrictions and near-bent extensions were made in [754];
see also some results in [191]. In [755], Leander and McGuire considered the converse
problem of going from a near-bent $n$-variable function to a bent $(n + 1)$-variable function.
They used the construction of bent functions by the concatenation $(f, g)$ of
two near-bent functions $f, g$ whose Walsh spectra are complementary, that is, have disjoint
supports (this condition is straightforwardly necessary and sufficient). Consequently, the function
$(f, f \oplus \mathrm{tr}_n)$ is bent if and only if the indicator $h$ of the support of $W_f$ satisfies $D_1h = 1$,
where $1 \in \mathbb{F}_{2^n}$ (which is also easily seen and is equivalent to the bentness of the restrictions
of $f$ to $\{x \in \mathbb{F}_{2^n}; \mathrm{tr}_n(x) = 0\}$ and its complement). From the Kasami near-bent
functions, they deduced the first examples of non-weakly normal bent functions in dimensions 10 and 12.

6.1.16 Secondary constructions of bent Boolean functions


Since very few bent functions are known from primary constructions, it seems useful to
derive secondary constructions.22 We have already seen in Proposition 79, page 211, a
secondary construction based on the Maiorana–McFarland construction. We describe now
the others (which have been found so far).

1. The direct sum is the first secondary construction given by J. Dillon and O. Rothaus
in [441, 1005]: let $f$ be a bent function on $\mathbb{F}_2^n$ ($n$ even) and $g$ a bent function on $\mathbb{F}_2^m$ ($m$
even), then the function $h$ defined on $\mathbb{F}_2^{n+m}$ by $h(x, y) = f(x) \oplus g(y)$ is bent. Indeed, a
straightforward calculation gives
\[ W_h(a, b) = W_f(a) \times W_g(b). \tag{6.28} \]
This construction provides decomposable functions only (a Boolean function is called
decomposable if it is equivalent to the sum of two functions that depend on two disjoint
subsets of coordinates). Such peculiarity is easy to detect and can be used for designing
divide-and-conquer attacks, as pointed out by J. Dillon in [442]. However, in some cases
(see an example in [839]), this construction provides nice solutions to specific problems.
Anyway, if the direct sum provides weak functions in a given framework, the indirect sum
(see below) is an alternative, since it has almost the same property with respect to the Walsh
transform and does not have the drawback of direct sum.
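Relation (6.28) is easy to confirm by brute force. The sketch below assumes two small bent functions chosen for illustration ($f$ on $\mathbb{F}_2^2$, $g$ on $\mathbb{F}_2^4$) and a naive Walsh transform helper `walsh` that is not part of the text.

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of f: F_2^n -> F_2, as a dict {a: W_f(a)} keyed by bit tuples."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

def f(x):  # bent on F_2^2
    return x[0] & x[1]

def g(y):  # bent on F_2^4
    return (y[0] & y[1]) ^ (y[2] & y[3])

def h(z):  # direct sum on F_2^6: h(x, y) = f(x) + g(y)
    return f(z[:2]) ^ g(z[2:])

Wf, Wg, Wh = walsh(f, 2), walsh(g, 4), walsh(h, 6)
product_rule = all(Wh[a + b] == Wf[a] * Wg[b] for a in Wf for b in Wg)
h_is_bent = all(abs(w) == 2 ** 3 for w in Wh.values())
print(product_rule, h_is_bent)
```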
22 However, as Dobbertin and Leander write in [477], “most bent functions appear without any roots to bent
functions in lower dimensions which could explain their existence.”

2. The Rothaus construction was introduced by the same authors: if $g$, $h$, $k$, and
$g \oplus h \oplus k$ are bent on $\mathbb{F}_2^n$ ($n$ even), then the function defined at every element
$(x, y_1, y_2)$ of $\mathbb{F}_2^{n+2}$ ($x \in \mathbb{F}_2^n$, $y_1, y_2 \in \mathbb{F}_2$) by
\[ f(x, y_1, y_2) = g(x)h(x) \oplus g(x)k(x) \oplus h(x)k(x) \oplus [g(x) \oplus h(x)]y_1 \oplus [g(x) \oplus k(x)]y_2 \oplus y_1y_2 \]
is bent. We do not give a proof since this construction will be a particular case of Theorem
15 below (see also [876]). A method is proposed in [329] to construct three bent functions
that can be used as initial functions in the Rothaus construction.
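A minimal numerical instance of the Rothaus construction, assuming three illustrative bent functions $g, h, k$ on $\mathbb{F}_2^2$ (any three bent functions in two variables have a bent sum, so the hypothesis is automatic here) and a brute-force helper `walsh`:

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of f: F_2^n -> F_2, as a dict keyed by bit tuples."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

def is_bent(f, n):
    return all(abs(w) == 2 ** (n // 2) for w in walsh(f, n).values())

# Illustrative choices of g, h, k on F_2^2.
def g(x): return x[0] & x[1]
def h(x): return (x[0] & x[1]) ^ x[0]
def k(x): return (x[0] & x[1]) ^ x[1]

def rothaus(z):
    """f(x, y1, y2) as in the Rothaus construction, with z = (x_1, x_2, y1, y2)."""
    x, y1, y2 = z[:2], z[2], z[3]
    gx, hx, kx = g(x), h(x), k(x)
    return ((gx & hx) ^ (gx & kx) ^ (hx & kx)
            ^ ((gx ^ hx) & y1) ^ ((gx ^ kx) & y2) ^ (y1 & y2))

hypotheses = all(is_bent(u, 2) for u in (g, h, k, lambda x: g(x) ^ h(x) ^ k(x)))
conclusion = is_bent(rothaus, 4)
print(hypotheses, conclusion)
```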

3. The indirect sum generalizes the direct sum. It has first been found as a construction
of resilient functions, which generalized and unified several previous constructions; see
Theorem 21, page 300. The same principle allows constructing bent functions:

Proposition 83 [225] Let $f_1$ and $f_2$ be two $n$-variable bent functions ($n$ even) and let $g_1$
and $g_2$ be two $m$-variable bent functions ($m$ even). Define23
\[ h(x, y) = f_1(x) \oplus g_1(y) \oplus (f_1 \oplus f_2)(x)\,(g_1 \oplus g_2)(y); \quad x \in \mathbb{F}_2^n,\ y \in \mathbb{F}_2^m. \]
Then $h$ is bent and its dual is obtained from $\tilde{f}_1$, $\tilde{f}_2$, $\tilde{g}_1$, and $\tilde{g}_2$ by the same formula as $h$ is
obtained from $f_1$, $f_2$, $g_1$, and $g_2$.

We do not give a proof of this result either, since we shall see that it is also a particular
case of Theorem 15 below.
Similarly to the direct sum and contrary to the Rothaus construction above and to the
bent concatenation construction below, the indirect sum requires no condition on the bent
functions f1 , f2 , g1 , and g2 used.
An interest of this construction, compared to the direct sum, is that it allows designing
functions h, which are more complex (in particular, which may have larger algebraic degree
and algebraic immunity) than the functions f1 , f2 , g1 , and g2 used.
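The statement of Proposition 83, including the formula for the dual, can be illustrated on a toy case. The sketch below assumes $n = m = 2$ and four bent functions picked for illustration; `walsh`, `dual`, and `indirect_sum` are hypothetical helper names.

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of f: F_2^n -> F_2, as a dict keyed by bit tuples."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

def dual(f, n):
    """Dual of a bent function f, read off the signs of its Walsh transform."""
    W = walsh(f, n)
    return lambda a: 0 if W[tuple(a)] > 0 else 1

def indirect_sum(f1, f2, g1, g2, n):
    """h(x, y) = f1(x) + g1(y) + (f1 + f2)(x) (g1 + g2)(y), with x the first n bits."""
    return lambda z: (f1(z[:n]) ^ g1(z[n:])
                      ^ ((f1(z[:n]) ^ f2(z[:n])) & (g1(z[n:]) ^ g2(z[n:]))))

# Illustrative bent functions in two variables.
def f1(x): return x[0] & x[1]
def f2(x): return (x[0] & x[1]) ^ x[0]
def g1(y): return y[0] & y[1]
def g2(y): return (y[0] & y[1]) ^ y[1]

h = indirect_sum(f1, f2, g1, g2, 2)
h_dual_formula = indirect_sum(dual(f1, 2), dual(f2, 2), dual(g1, 2), dual(g2, 2), 2)

Wh = walsh(h, 4)
h_is_bent = all(abs(w) == 4 for w in Wh.values())
dual_matches = all(Wh[a] == 4 * (-1) ** h_dual_formula(a) for a in Wh)
print(h_is_bent, dual_matches)
```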
The indirect sum has been modified and generalized in several ways. These generaliza-
tions often require conditions on the functions used. In [329], the authors introduced the
constructions
1. $f(x, y) = f_1(x) \oplus g_1(y) \oplus (f_1 \oplus f_2)(x)(g_1 \oplus g_2)(y) \oplus (f_2 \oplus f_3)(x)(g_2 \oplus g_3)(y)$,
where $f_1$, $f_2$, and $f_3$ are bent functions in $n$ variables such that $f_1 \oplus f_2 \oplus f_3$ is bent and
has $\tilde{f}_1 \oplus \tilde{f}_2 \oplus \tilde{f}_3$ for dual, and $g_1$, $g_2$, and $g_3$ are bent functions in $m$ variables such that
$g_1 \oplus g_2 \oplus g_3$ is bent.
2. $f(x, y) = f_0(x) \oplus g_0(y) \oplus (f_0 \oplus f_1)(x)(g_0 \oplus g_1)(y) \oplus (f_1 \oplus f_2)(x)(g_1 \oplus g_2)(y) \oplus
(f_2 \oplus f_3)(x)(g_2 \oplus g_3)(y)$,
with a slightly more complex condition on functions $f_0, \ldots, f_3, g_0, \ldots, g_3$.

23 h can be seen as the concatenation of the four functions f1 , f1 ⊕ 1, f2 , and f2 ⊕ 1, in an order controlled by
g1 (y) and g2 (y).

A modified indirect sum is also introduced in [1153], in which functions f1 and f2


(resp. g1 and g2 ) are the restrictions of a bent function f (resp. g) to two hyperplanes,
complementary of each other.

4. The semidirect sum [336] f (x) ⊕ g(y + H (x)), where f and g are bent and H is such
that f ⊕ u · H is bent for every u.

5. The bent concatenation construction generalizes the direct sum, the Rothaus construc-
tion, the indirect sum, and the semidirect sum (but as with the semidirect sum, it needs to
find initial bent functions satisfying additional conditions):

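Theorem 15 [215] Let $n$ and $m$ be two even positive integers. Let $f$ be a Boolean function
on $\mathbb{F}_2^{n+m} = \mathbb{F}_2^n \times \mathbb{F}_2^m$ such that, for any element $y$ of $\mathbb{F}_2^m$, the function $f_y : x \in \mathbb{F}_2^n \mapsto f(x, y)$
is bent. Then $f$ is bent if and only if, for any element $s$ of $\mathbb{F}_2^n$, the function
\[ \varphi_s : y \mapsto \tilde{f}_y(s) \]
is bent on $\mathbb{F}_2^m$. If this condition is satisfied, then the dual of $f$ is the function $\tilde{f}(s, t) = \tilde{\varphi}_s(t)$
(taking as inner product in $\mathbb{F}_2^n \times \mathbb{F}_2^m$: $(x, y) \cdot (s, t) = x \cdot s \oplus y \cdot t$).

This very general result is easy to prove, using that, for every $s \in \mathbb{F}_2^n$,
\[ \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x,y) \oplus x \cdot s} = 2^{n/2} (-1)^{\tilde{f}_y(s)} = 2^{n/2} (-1)^{\varphi_s(y)}, \]
and thus that $W_f(s, t) = 2^{n/2} \sum_{y \in \mathbb{F}_2^m} (-1)^{\varphi_s(y) \oplus y \cdot t}$.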
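The equivalence in Theorem 15 can be explored by brute force on small examples. This is only a sketch under stated assumptions: $n = m = 2$, three test functions chosen for illustration, and hypothetical helpers `walsh` and `theorem15_check`.

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of f: F_2^n -> F_2, as a dict keyed by bit tuples."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

def theorem15_check(f, n, m):
    """Return (f is bent, every phi_s is bent); every restriction f_y must be bent."""
    ys = list(product((0, 1), repeat=m))
    ss = list(product((0, 1), repeat=n))
    # Walsh transforms of the restrictions f_y: x -> f(x, y).
    W_restr = {y: walsh(lambda x, y=y: f(x + y), n) for y in ys}
    assert all(abs(w) == 2 ** (n // 2) for W in W_restr.values() for w in W.values())
    # phi_s(y) is the dual of f_y at s, i.e. the sign bit of W_{f_y}(s).
    phi_bent = all(
        all(abs(w) == 2 ** (m // 2)
            for w in walsh(lambda y, s=s: int(W_restr[y][s] < 0), m).values())
        for s in ss)
    f_bent = all(abs(w) == 2 ** ((n + m) // 2) for w in walsh(f, n + m).values())
    return f_bent, phi_bent

direct = lambda z: (z[0] & z[1]) ^ (z[2] & z[3])                   # direct sum
indirect = lambda z: (z[0] & z[1]) ^ (z[2] & z[3]) ^ (z[0] & z[3])  # an indirect sum
degenerate = lambda z: z[0] & z[1]                                  # does not depend on y

results = [theorem15_check(u, 2, 2) for u in (direct, indirect, degenerate)]
print(results)
```

In the third case every restriction $f_y$ is bent but each $\varphi_s$ is constant, so neither side of the equivalence holds.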
A very particular case of this construction had been previously considered by Adams
and Tavares [7] under the name of bent-based functions, and later studied by J. Seberry
and X.-M. Zhang in [1025]. The direct sum and Rothaus’ constructions are particular cases
of Theorem 15 (the latter covers the case $m = 2$). Several classes of bent functions have
been deduced in [215], and later in [620]. It is also deduced in [245] (where more details
on secondary constructions can be found) that if $f$, $g$ are $n$-variable Boolean functions, $g$
being bent, and if $\phi$ is a mapping from $\mathbb{F}_2^n$ to itself, then the $2n$-variable function
$f(x) \oplus \tilde{g}(y) \oplus \phi(x) \cdot y$ is bent if and only if $f(x) \oplus g(\phi(x) + b)$ is bent for every $b$. Three cases of
application are exhibited; all three use two bent functions $g$ and $h$, and we have
• If $g$ and $h$ differ by a quadratic function, then the $2n$-variable function $(g \oplus h)(x) \oplus
\tilde{g}(y) \oplus x \cdot y$ is bent.
• If $g$ is quadratic and $\phi$ is an affine permutation, then the $2n$-variable function $g(\phi(x)) \oplus
h(x) \oplus \tilde{g}(y) \oplus \phi(x) \cdot y$ is bent.
• If $Im(\phi) = \{\phi(x); x \in \mathbb{F}_2^n\}$ is either included in or disjoint from any translate of
$supp(g)$, then the $2n$-variable function $f(x) \oplus \tilde{g}(y) \oplus \phi(x) \cdot y$ is bent.

The indirect sum is a particular case of the bent concatenation construction of Theorem
15: let $h$ be defined as in Proposition 83, then for every $y$, the function $h_y(x)$ of Theorem 15
(with $h$ instead of $f$) equals $f_1(x)$ plus the constant $g_1(y)$ if $g_1(y) = g_2(y)$ and $f_2(x)$ plus
the constant $g_1(y)$ if $g_1(y) \neq g_2(y)$; thus it is bent and function $\varphi_s(y)$ equals $\tilde{f}_1(s) \oplus g_1(y)$
if $g_1(y) = g_2(y)$ and $\tilde{f}_2(s) \oplus g_1(y)$ if $g_1(y) \neq g_2(y)$, that is, equals $\tilde{f}_1(s) \oplus g_1(y) \oplus (\tilde{f}_1 \oplus
\tilde{f}_2)(s)\,(g_1 \oplus g_2)(y)$; hence, $\varphi_s(y)$ is bent too since it equals $\tilde{f}_1(s) \oplus g_1(y)$ or $\tilde{f}_1(s) \oplus g_2(y)$
according to whether $(\tilde{f}_1 \oplus \tilde{f}_2)(s)$ vanishes or not, and according to Theorem 15, $h$ is then
bent and its dual equals
\[ \tilde{h}(s, t) = \tilde{f}_1(s) \oplus \tilde{g}_1(t) \oplus (\tilde{f}_1 \oplus \tilde{f}_2)(s)\,(\tilde{g}_1 \oplus \tilde{g}_2)(t). \]
The semidirect sum is also a direct consequence thanks to $\widetilde{g \circ t_a} = \tilde{g} \oplus \ell_a$, $t_a(y) = y + a$,
$\ell_a(s) = a \cdot s$.
Another simple application of Theorem 15, called extension of Maiorana–McFarland
type, is given in [270]: let $m$ be even and $\pi$ be a permutation of $\mathbb{F}_2^{m/2}$ and $g$ an $m/2$-variable
Boolean function, and let $f_{\pi,g}(z, y) = z \cdot \pi(y) \oplus g(y)$ be the related Maiorana–McFarland
bent function; let $(h_y)_{y \in \mathbb{F}_2^{m/2}}$ be a collection of bent functions on $\mathbb{F}_2^n$ for some even integer
$n$, then the function
\[ (x, y, z) \in \mathbb{F}_2^n \times \mathbb{F}_2^{m/2} \times \mathbb{F}_2^{m/2} \mapsto h_y(x) \oplus f_{\pi,g}(z, y) \tag{6.29} \]
is bent. Indeed, Theorem 15 with $(z, y)$ in the place of $y$ applies with $x \mapsto h_y(x) \oplus f_{\pi,g}(z, y)$
in the place of $f_y$, and with $\varphi_s(z, y) = \tilde{h}_y(s) \oplus f_{\pi,g}(z, y)$, which is a bent Maiorana–
McFarland function.
This generalizes a construction due to Davis and Jedwab [415] that was slightly posterior
to [215] but anterior to [270]: let $n$ and $m$ be two positive even integers; let $h_y(x)$ be a
collection of bent functions on $\mathbb{F}_2^n$ for $y \in \mathbb{F}_2^{m/2}$, then the function $(x, y, z) \in \mathbb{F}_2^n \times \mathbb{F}_2^{m/2} \times
\mathbb{F}_2^{m/2} \mapsto h_y(x) \oplus y \cdot z$ is bent.
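The Davis–Jedwab form is simple enough to test directly. A minimal sketch, assuming $n = 2$, $m = 2$ (so $y, z \in \mathbb{F}_2$) and an illustrative collection $h_0, h_1$ of bent functions on $\mathbb{F}_2^2$:

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of f: F_2^n -> F_2, as a dict keyed by bit tuples."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

# Illustrative collection (h_y), one bent function on F_2^2 for each y in F_2.
def h0(x): return x[0] & x[1]
def h1(x): return (x[0] & x[1]) ^ x[0] ^ 1

def davis_jedwab(w):
    """(x, y, z) -> h_y(x) + y.z, with w = (x_1, x_2, y, z)."""
    x, y, z = w[:2], w[2], w[3]
    return (h1(x) if y else h0(x)) ^ (y & z)

is_bent = all(abs(v) == 4 for v in walsh(davis_jedwab, 4).values())
print(is_bent)
```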
Note that in (6.29), no term involves both $x$ and $z$, so the structure of the bent function
is peculiar (to a lesser extent than for a direct sum, though); one can try instead $(x, y) \in
\mathbb{F}_2^n \times \mathbb{F}_2^n \mapsto h_y(x) \oplus f_{\pi,g}(x, y)$. The restriction of such function when fixing $y$ is bent
since that of $f_{\pi,g}(x, y)$ is affine. Then for the global function to be bent, it is necessary and
sufficient that $\varphi_s(y) = \tilde{h}_y(s + \pi(y)) \oplus g(y)$ be bent for all $s$. Note that the semidirect sum
is a particular case of its dual.
Of course, if $f(x, y)$ is an $(n + s)$-variable function such that, for any $y \in \mathbb{F}_2^s$, the
$n$-variable function $f_y : x \mapsto f(x, y)$ is $s$-plateaued (see the definition at page 258) and
the supports of the Walsh transforms of these functions $f_y$ are pairwise disjoint, then these
supports constitute a partition of $\mathbb{F}_2^n$ and $f$ is bent.

6. A permutation-based construction due to X.-D. Hou and P. Langevin is built on a very


simple observation that leads to potentially new bent functions:

Proposition 84 [627] Let $f$ be a Boolean function on $\mathbb{F}_2^n$, $n$ even. Let $\sigma$ be a permutation
of $\mathbb{F}_2^n$. We denote its coordinate functions by $\sigma_1, \ldots, \sigma_n$ and we assume that, for every $a \in \mathbb{F}_2^n$,
we have
\[ d_H\left(f, \bigoplus_{i=1}^n a_i \sigma_i\right) = 2^{n-1} \pm 2^{\frac{n}{2}-1}. \]
Then $f \circ \sigma^{-1}$ is bent.

Indeed, the Hamming distance between $f \circ \sigma^{-1}$ and the linear function $\ell_a(x) = a \cdot x$
equals $d_H\left(f, \bigoplus_{i=1}^n a_i \sigma_i\right)$.
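The distance identity behind Proposition 84 holds for any Boolean function and any permutation, which the following sketch checks on random data. The seed, the dimension $n = 4$, and the table-based representation are assumptions made for illustration.

```python
import random
from itertools import product

n = 4
pts = list(product((0, 1), repeat=n))
rng = random.Random(0)  # fixed seed so that the run is reproducible

# A random permutation sigma of F_2^n and a random Boolean function f, as lookup tables.
images = pts[:]
rng.shuffle(images)
sigma = dict(zip(pts, images))
sigma_inv = {v: k for k, v in sigma.items()}
f = {x: rng.randint(0, 1) for x in pts}

def dot(a, x):
    return sum(u & v for u, v in zip(a, x)) & 1

def dist(u, v):
    """Hamming distance between two Boolean functions given as callables."""
    return sum(u(x) != v(x) for x in pts)

# d_H(f o sigma^{-1}, l_a) == d_H(f, a . sigma), where a . sigma = XOR of the a_i sigma_i.
identity_holds = all(
    dist(lambda x: f[sigma_inv[x]], lambda x: dot(a, x))
    == dist(lambda x: f[x], lambda x: dot(a, sigma[x]))
    for a in pts)
print(identity_holds)
```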
Hou and Langevin proposed two frameworks for applying Proposition 84:
– If $h$ is an affine function on $\mathbb{F}_2^n$ and $f_1$, $f_2$, and $g$ are Boolean functions on $\mathbb{F}_2^n$ such that
the following function is bent
\[ f(x_1, x_2, x) = x_1x_2h(x) \oplus x_1f_1(x) \oplus x_2f_2(x) \oplus g(x); \quad x \in \mathbb{F}_2^n,\ x_1, x_2 \in \mathbb{F}_2, \]
then the function
\[ f(x_1, x_2, x) \oplus (h(x) \oplus 1)\,f_1(x)f_2(x) \oplus f_1(x) \oplus (x_1 \oplus h(x) \oplus 1)\,f_2(x) \oplus x_2h(x) \]
is bent; in [932] are given cases of application by taking $f$ as the indirect sum of bent
functions and using semi-bent 4-decomposition of bent functions.
– If $f$ is a bent function on $\mathbb{F}_2^n$ whose algebraic degree is at most 3, and if $\sigma$ is a permutation
of $\mathbb{F}_2^n$ such that, for every $i = 1, \ldots, n$, there exists a subset $U_i$ of $\mathbb{F}_2^n$ and an affine
function $h_i$ such that
\[ \sigma_i(x) = \bigoplus_{u \in U_i} (f(x) \oplus f(x + u)) \oplus h_i(x), \]
then $f \circ \sigma^{-1}$ is bent.
X.-D. Hou in [620] deduced that if $f(x, y)$ ($x, y \in \mathbb{F}_2^{n/2}$) is a Maiorana–McFarland
function of the particular form $x \cdot y \oplus g(y)$ and if $\sigma_1, \ldots, \sigma_n$ are all of the form
$\bigoplus_{1 \leq i < j \leq \frac{n}{2}} a_{i,j} x_i y_j \oplus b \cdot x \oplus c \cdot y \oplus h(y)$, then $f \circ \sigma^{-1}$ is bent. He gave several examples
of application of this result.

7. A construction without extension of the number of variables24 has been introduced in


[227] and is based on the following result:

Proposition 85 Let $f_1$, $f_2$, and $f_3$ be three Boolean functions on $\mathbb{F}_2^n$. Denote by $s_1$ the
Boolean function equal to $f_1 \oplus f_2 \oplus f_3$ and by $s_2$ the Boolean function equal to $f_1f_2 \oplus
f_1f_3 \oplus f_2f_3$. Then we have $f_1 + f_2 + f_3 = s_1 + 2s_2$. This implies the following equality
between the Fourier–Hadamard transforms: $\widehat{f_1} + \widehat{f_2} + \widehat{f_3} = \widehat{s_1} + 2\widehat{s_2}$, and the similar equality
between the Walsh transforms
\[ W_{f_1} + W_{f_2} + W_{f_3} = W_{s_1} + 2W_{s_2}. \tag{6.30} \]

Proof The fact that $f_1 + f_2 + f_3 = s_1 + 2s_2$ (the sums being computed in $\mathbb{Z}$ and not modulo
2) can be checked easily. The $\mathbb{R}$-linearity of the Fourier–Hadamard transform implies then
$\widehat{f_1} + \widehat{f_2} + \widehat{f_3} = \widehat{s_1} + 2\widehat{s_2}$. The equality $f_1 + f_2 + f_3 = s_1 + 2s_2$ also directly implies
$f_{1\chi} + f_{2\chi} + f_{3\chi} = s_{1\chi} + 2s_{2\chi}$, thanks to the equality $f_\chi = 1 - 2f$ valid for every Boolean
function, which implies Relation (6.30).

24 Note that Hou–Langevin’s permutation-based construction above does not increase either the number of
variables, contrary to most other secondary constructions.
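Both identities of Proposition 85 hold for arbitrary Boolean functions, so they can be tested on random truth tables. A minimal sketch, assuming $n = 4$, a fixed random seed, and functions stored as dictionaries:

```python
import random
from itertools import product

n = 4
pts = list(product((0, 1), repeat=n))
rng = random.Random(1)  # fixed seed so that the run is reproducible
f1, f2, f3 = ({x: rng.randint(0, 1) for x in pts} for _ in range(3))

s1 = {x: f1[x] ^ f2[x] ^ f3[x] for x in pts}
s2 = {x: (f1[x] & f2[x]) ^ (f1[x] & f3[x]) ^ (f2[x] & f3[x]) for x in pts}

def walsh(t):
    """Walsh transform of a truth table t, as a dict keyed by bit tuples."""
    return {a: sum((-1) ** (t[x] ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

# f1 + f2 + f3 = s1 + 2 s2, the sums being computed in Z.
integer_identity = all(f1[x] + f2[x] + f3[x] == s1[x] + 2 * s2[x] for x in pts)

# Relation (6.30) between the Walsh transforms.
W1, W2, W3, Ws1, Ws2 = (walsh(t) for t in (f1, f2, f3, s1, s2))
walsh_identity = all(W1[a] + W2[a] + W3[a] == Ws1[a] + 2 * Ws2[a] for a in pts)
print(integer_identity, walsh_identity)
```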

Remark. It is observed in [8, lemma 1] that, given four Boolean functions $f_1$, $f_2$, $f_3$, $f_4$,
the pseudo-Boolean function $\frac{1}{2}(W_{f_1} + W_{f_2} + W_{f_3} + W_{f_4})$ is the Walsh transform of a
Boolean function, say $g$, if and only if $f_1 \oplus f_2 \oplus f_3 \oplus f_4$ equals constant function 1 (this
is easily deduced from the fact that, by the $\mathbb{R}$-linearity of the Fourier–Hadamard transform
on pseudo-Boolean functions and its bijectivity, we have equivalently $(-1)^g = \frac{1}{2}\left((-1)^{f_1} +
(-1)^{f_2} + (-1)^{f_3} + (-1)^{f_4}\right)$). This means that $f_4 = s_1 \oplus 1$ and then according to (6.30), we
have $g = f_1f_2 \oplus f_1f_3 \oplus f_2f_3$, as also observed in [8].

Proposition 85 leads then to a double construction of bent functions:

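Corollary 15 [227] Let $f_1$, $f_2$, and $f_3$ be three $n$-variable bent functions, $n$ even. Let
$s_1 = f_1 \oplus f_2 \oplus f_3$ and $s_2 = f_1f_2 \oplus f_1f_3 \oplus f_2f_3$. Then:
– If $s_1$ is bent and if $\tilde{s}_1 = \tilde{f}_1 \oplus \tilde{f}_2 \oplus \tilde{f}_3$, then $s_2$ is bent, and $\tilde{s}_2 = \tilde{f}_1\tilde{f}_2 \oplus \tilde{f}_1\tilde{f}_3 \oplus \tilde{f}_2\tilde{f}_3$.
– If $W_{s_2}(a)$ is divisible by $2^{n/2}$ for every $a$ (e.g., if $s_2$ is bent, or quadratic, or more generally
if it is plateaued; see the definition in Section 6.2), then $s_1$ is bent.

Proof
– If $s_1$ is bent and if $\tilde{s}_1 = \tilde{f}_1 \oplus \tilde{f}_2 \oplus \tilde{f}_3$, then, for every $a$, Relation (6.30) implies
\[ W_{s_2}(a) = \left[(-1)^{\tilde{f}_1(a)} + (-1)^{\tilde{f}_2(a)} + (-1)^{\tilde{f}_3(a)} - (-1)^{\tilde{f}_1(a) \oplus \tilde{f}_2(a) \oplus \tilde{f}_3(a)}\right] 2^{\frac{n-2}{2}}
= (-1)^{\tilde{f}_1(a)\tilde{f}_2(a) \oplus \tilde{f}_1(a)\tilde{f}_3(a) \oplus \tilde{f}_2(a)\tilde{f}_3(a)}\, 2^{\frac{n}{2}}. \]
Indeed, as we already saw above with the relation $f_{1\chi} + f_{2\chi} + f_{3\chi} = s_{1\chi} + 2s_{2\chi}$, for every
bits $\epsilon$, $\eta$, and $\tau$, we have $(-1)^\epsilon + (-1)^\eta + (-1)^\tau - (-1)^{\epsilon \oplus \eta \oplus \tau} = 2\,(-1)^{\epsilon\eta \oplus \epsilon\tau \oplus \eta\tau}$.
– If $W_{s_2}(a)$ is divisible by $2^{n/2}$ for every $a$, then the number $W_{s_1}(a)$, which is equal
to $\left[(-1)^{\tilde{f}_1(a)} + (-1)^{\tilde{f}_2(a)} + (-1)^{\tilde{f}_3(a)}\right] 2^{\frac{n}{2}} - 2W_{s_2}(a)$, according to Relation (6.30), is
congruent with $2^{n/2}$ modulo $2^{n/2+1}$ for every $a$. This is sufficient to imply that $s_1$ is bent,
according to Lemma 5, page 190.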
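The first part of Corollary 15 can be illustrated with three bent functions that differ by linear terms. The sketch below assumes $n = 4$, the base function $x \cdot y$ on $\mathbb{F}_2^2 \times \mathbb{F}_2^2$, and helper names (`walsh`, `is_bent`, `dual`) introduced only for this example; it checks the hypothesis on the duals before checking the conclusion.

```python
from itertools import product

n = 4
pts = list(product((0, 1), repeat=n))

def walsh(f):
    """Walsh transform on F_2^4, as a dict keyed by bit tuples."""
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

def is_bent(f):
    return all(abs(w) == 4 for w in walsh(f).values())

def dual(f):
    """Dual of a bent function, read off the signs of its Walsh transform."""
    W = walsh(f)
    return lambda a: int(W[a] < 0)

# z = (x_1, x_2, y_1, y_2); q(z) = x . y is a Maiorana-McFarland bent function.
def q(z): return (z[0] & z[2]) ^ (z[1] & z[3])
def f1(z): return q(z)
def f2(z): return q(z) ^ z[0]
def f3(z): return q(z) ^ z[1]
def s1(z): return f1(z) ^ f2(z) ^ f3(z)
def s2(z): return (f1(z) & f2(z)) ^ (f1(z) & f3(z)) ^ (f2(z) & f3(z))

d1, d2, d3 = dual(f1), dual(f2), dual(f3)
hypothesis = is_bent(s1) and all(dual(s1)(a) == d1(a) ^ d2(a) ^ d3(a) for a in pts)
conclusion = is_bent(s2) and all(
    dual(s2)(a) == (d1(a) & d2(a)) ^ (d1(a) & d3(a)) ^ (d2(a) & d3(a)) for a in pts)
print(hypothesis, conclusion)
```

Here $s_2 = x \cdot y \oplus x_1x_2$, so even this simple choice produces a bent function that is not one of the three inputs.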

Corollaries are deduced in [227] that revisit results from [327] (this latter reference also
includes constructions of plateaued functions).
This construction has been used in [860, 864] (where it is observed that, conversely, if
$f_1$, $f_2$, $f_3$, $s_1$, and $s_2$ are bent, then $\tilde{s}_1 = \tilde{f}_1 \oplus \tilde{f}_2 \oplus \tilde{f}_3$) and is called Carlet’s secondary
construction in [386, 873]. It is used in [711, 873] with linear structures. In the continuation
of [863], it is shown in [386] that using Corollary 15, three involutions whose sum is
an involution give rise through the Maiorana–McFarland construction to bent functions in
bivariate representation.
The construction of Corollary 15 was extended to more than three functions:

Proposition 86 [227] Let $f_1, \ldots, f_m$ be Boolean functions on $\mathbb{F}_2^n$. For every positive
integer $l$, let $s_l$ be the Boolean function defined by
\[ s_l = \bigoplus_{1 \leq i_1 < \ldots < i_l \leq m}\ \prod_{j=1}^{l} f_{i_j} \ \text{ if } l \leq m \quad \text{and} \quad s_l = 0 \text{ otherwise.} \]
Then we have $f_1 + \ldots + f_m = \sum_{i \geq 0} 2^i s_{2^i}$ (sums in $\mathbb{Z}$). This implies $\widehat{f_1} + \cdots + \widehat{f_m} =
\sum_{i \geq 0} 2^i\, \widehat{s_{2^i}}$. Moreover, if $m$ is primitive, say $m = 2^r - 1$, then
\[ W_{f_1} + \cdots + W_{f_m} = \sum_{i=0}^{r-1} 2^i\, W_{s_{2^i}}. \tag{6.31} \]

Proof Let $x \in \mathbb{F}_2^n$ and $j_x = \sum_{k=1}^m f_k(x)$. According to Lucas’ theorem (see page 487),
the binary expansion of $j_x$ is $\sum_{i \geq 0} 2^i \left(\binom{j_x}{2^i} \text{ [mod 2]}\right)$. It is a simple matter to check that
$\binom{j_x}{2^i}$ [mod 2] $= s_{2^i}(x)$. Thus, $f_1 + \ldots + f_m = \sum_{i \geq 0} 2^i s_{2^i}$. The linearity of the Fourier–Hadamard
transform with respect to the addition in $\mathbb{R}$ implies then directly $\widehat{f_1} + \cdots + \widehat{f_m} = \sum_{i \geq 0} 2^i\, \widehat{s_{2^i}}$.
If $m = 2^r - 1$ (recall that in coding theory, such number is called primitive), then we
have $m = \sum_{i=0}^{r-1} 2^i$. Thus, we deduce $(-1)^{f_1} + \ldots + (-1)^{f_m} = \sum_{i=0}^{r-1} 2^i (-1)^{s_{2^i}}$ from
$f_1 + \ldots + f_m = \sum_{i=0}^{r-1} 2^i s_{2^i}$. The linearity of the Fourier–Hadamard transform implies then Relation
(6.31).
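Proposition 86 can be checked on random data in the primitive case $m = 7 = 2^3 - 1$. This is a sketch under assumptions: $n = 3$, a fixed seed, truth tables stored as dictionaries, and a helper `s` that builds $s_l$ from its definition.

```python
import random
from itertools import combinations, product

n, m, r = 3, 7, 3  # m = 2^r - 1 is "primitive"
pts = list(product((0, 1), repeat=n))
rng = random.Random(2)  # fixed seed so that the run is reproducible
fs = [{x: rng.randint(0, 1) for x in pts} for _ in range(m)]

def s(l):
    """Truth table of s_l: XOR, over all l-subsets, of the product of the chosen f_i."""
    return {x: sum(all(f[x] for f in c) for c in combinations(fs, l)) & 1 for x in pts}

def walsh(t):
    """Walsh transform of a truth table t, as a dict keyed by bit tuples."""
    return {a: sum((-1) ** (t[x] ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

S = [s(2 ** i) for i in range(r)]  # s_1, s_2, s_4

# f_1 + ... + f_m = sum_i 2^i s_{2^i}, the sums being computed in Z.
integer_identity = all(
    sum(f[x] for f in fs) == sum(2 ** i * S[i][x] for i in range(r)) for x in pts)

# Relation (6.31).
Wf = [walsh(f) for f in fs]
Ws = [walsh(t) for t in S]
walsh_identity = all(
    sum(W[a] for W in Wf) == sum(2 ** i * Ws[i][a] for i in range(r)) for a in pts)
print(integer_identity, walsh_identity)
```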

Corollary 16 [227] Let $n$ be any positive even integer and $f_1, \ldots, f_m$ ($m \leq 7$) be bent
functions on $\mathbb{F}_2^n$.
• Assume that $s_1$ is bent, and that, for every $a \in \mathbb{F}_2^n$, the number $W_{s_4}(a)$ is divisible by
$2^{n/2}$. Then:
– If $m = 5$ and $\tilde{s}_1 = \tilde{f}_1 \oplus \ldots \oplus \tilde{f}_5 \oplus 1$, then $s_2$ is bent.
– If $m = 7$ and $\tilde{s}_1 = \tilde{f}_1 \oplus \ldots \oplus \tilde{f}_7$, then $s_2$ is bent.

• Assume that $m \in \{5, 7\}$ and, for every $a \in \mathbb{F}_2^n$, the number $W_{s_4}(a)$ is divisible by $2^{n/2-1}$
and the number $W_{s_2}(a)$ is divisible by $2^{n/2}$, then $s_1$ is bent.

Proof We have for $i = 1, \ldots, m$ and for every vector $a \neq 0$: $W_{f_i}(a) = -2\widehat{f_i}(a) =
(-1)^{\tilde{f}_i(a)}\, 2^{n/2}$ and $\widehat{f_1}(a) + \cdots + \widehat{f_m}(a) = \sum_{i \geq 0} 2^i\, \widehat{s_{2^i}}(a)$.

– If $s_1$ is bent and, for every $a \in \mathbb{F}_2^n$, the number $W_{s_4}(a)$ is divisible by $2^{n/2}$, then $W_{s_2}(a)$
is congruent with $\left[(-1)^{\tilde{f}_1(a)} + \cdots + (-1)^{\tilde{f}_m(a)} - (-1)^{\tilde{s}_1(a)}\right] 2^{n/2-1}$ modulo $2^{n/2+1}$, for
every $a \neq 0_n$.
If $m = 5$ and $\tilde{s}_1 = \tilde{f}_1 \oplus \ldots \oplus \tilde{f}_5 \oplus 1$ then, denoting by $k$ the Hamming weight of the
word $(\tilde{f}_1(a), \ldots, \tilde{f}_5(a))$, the number $W_{s_2}(a)$ is congruent with $[5 - 2k + (-1)^k]\, 2^{n/2-1}$
modulo $2^{n/2+1}$.
If $m = 7$ and $\tilde{s}_1 = \tilde{f}_1 \oplus \ldots \oplus \tilde{f}_7$, then, denoting by $k$ the Hamming weight of the
word $(\tilde{f}_1(a), \ldots, \tilde{f}_7(a))$, the number $W_{s_2}(a)$ is congruent with $[7 - 2k - (-1)^k]\, 2^{n/2-1}$
modulo $2^{n/2+1}$. So, in both cases, we have $W_{s_2}(a) \equiv 2^{n/2}$ [mod $2^{n/2+1}$], and $s_2$ is bent,
according to Lemma 5, page 190.
– If, for every $a \in \mathbb{F}_2^n$, the number $W_{s_4}(a)$ is divisible by $2^{n/2-1}$ and the number $W_{s_2}(a)$
is divisible by $2^{n/2}$, then, for every $a \neq 0_n$, the number $W_{s_1}(a)$ is congruent with
$\left[(-1)^{\tilde{f}_1(a)} + \cdots + (-1)^{\tilde{f}_m(a)}\right] 2^{n/2}$ mod $2^{n/2+1}$. Since $m \in \{5, 7\}$, it is then congruent
with $2^{n/2}$ mod $2^{n/2+1}$ and $s_1$ is bent, according to Lemma 5 again.

8. A construction related to the notion of normal extension of bent function can be found
in Proposition 93, page 253.

9. A construction related to bent rectangles. In [8, 10], $n$-variable Boolean
functions $f$ are represented by matrices called rectangles (among which squares, when $n$ is even). The
rows of such matrices are the Walsh transforms of the restrictions of $f$ obtained by fixing
$m$ coordinates at fixed positions (say, at the first $m$ positions), where $1 \leq m \leq n - 1$:
denoting by $f_u$ the restriction of $f$ obtained by fixing $x_i = u_i$ for $i = 1, \ldots, m$,
the term at the row indexed by $u \in \mathbb{F}_2^m$ and column indexed by $v \in \mathbb{F}_2^{n-m}$ equals25
$W_{f_u}(v) = \sum_{y \in \mathbb{F}_2^{n-m}} (-1)^{f(u,y) \oplus v \cdot y}$ (i.e. the row is the Walsh transform vector of $f_u$). It
is proved in [10] that $f$ is bent if and only if the columns, when multiplied by $2^{m-\frac{n}{2}}$,
are also the Walsh transforms of Boolean functions. This is, in a way, a generalization of
Theorem 15, page 234, since $m$ does not need to be even and the restrictions do not need
to be bent. The condition is necessary since, for every $a \in \mathbb{F}_2^m$ and $v \in \mathbb{F}_2^{n-m}$, we have
\[ \sum_{u \in \mathbb{F}_2^m} W_{f_u}(v)(-1)^{u \cdot a} = \sum_{u \in \mathbb{F}_2^m,\, y \in \mathbb{F}_2^{n-m}} (-1)^{f(u,y) \oplus v \cdot y \oplus u \cdot a} = W_f(a, v) = 2^{\frac{n}{2}} (-1)^{\tilde{f}(a,v)}, \]
and denoting by $\tilde{f}_v$ the restriction of $\tilde{f}$ obtained by fixing its $n - m$ last input coordinates to
the corresponding values of $v$, and by applying the inverse Walsh transform formula to $\tilde{f}_v$,
we see that the column indexed by $v$ and multiplied by $2^{m-\frac{n}{2}}$ equals the Walsh transform
of $\tilde{f}_v$. It is also easily seen that the condition is sufficient. Constructions of bent squares are
deduced in [10] by using so-called biaffine transformations and partitions of $\mathbb{F}_2^n$ into affine
planes of equal dimension (but it is not checked whether such constructions can provide new
bent functions nor whether the constructions themselves fall within known ones or not).

10. A general construction in the framework of the so-called Z-bent functions. Most
constructions above build bent functions from bent functions. The idea of Z-bent functions
is to extend the corpus in order to embed bent functions into a recursive context. This has
been initiated by Dobbertin in 2005, and G. Leander has presented the results and given
guidelines for further research in a paper posthumously coauthored by Hans Dobbertin [477]
(see also [476]).
Z-bent functions are integer-valued functions $\varphi$ on $\mathbb{F}_2^n$ whose normalized Fourier
transform $\widehat{\varphi}_{norm} = 2^{-n/2}\, \widehat{\varphi}$ is also integer valued. Bent Boolean functions (or
more precisely their sign functions) will be among Z-bent functions those which are
$\pm 1$ valued.

25 There seems to be a slight confusion between rows and columns in the description given at the bottom of page
5 in [10].

The following nested subsets of $\mathbb{Z}$ are defined: $W_0 = \{\pm 1\}$ and for $r \neq 0$, $W_r = \{w \in
\mathbb{Z} \mid -2^{r-1} \leq w \leq 2^{r-1}\}$. They satisfy $W_r \pm W_r = W_{r+1}$ for $r > 0$ and lead to a hierarchy
on Z-bent functions:

Definition 55 [477] A function $\varphi : \mathbb{F}_2^n \to W_r$ is called a Z-bent function of size $\frac{n}{2}$ and
level $r$ if $\widehat{\varphi}_{norm}$ is also valued in $W_r$.

In this hierarchy, the (sign functions of) usual bent functions are the zero level Z-bent
functions. Since the normalized Fourier transform is self-inverse, $\widehat{\varphi}_{norm}$ is then also a Z-
bent function of size $n/2$ and level $r$, which is called the dual of $\varphi$.
Z-bent functions of level r on n variables can be used to construct all Z-bent functions of
level r − 1 on (n + 2) variables. This is referred to as “gluing” technique. All bent functions
in (n + 2r) variables (i.e., all Z-bent functions of level 0 in (n + 2r) variables) are eventually
reached this way.
The construction of partial spread (PS ) bent functions has been generalized to partial
spread Z-bent functions of arbitrary level in [526]. This led to a new construction of bent
Boolean functions. A bent function in eight variables outside the completed M and PS ap
classes was deduced; all bent functions in six variables can be obtained, up to equivalence,
by this construction.
Secondary constructions of bent functions from near-bent functions have also been
proposed, for instance in [1122].

6.1.17 Decompositions of bent functions


The following theorem is a direct consequence of the second-order Poisson formula (2.57),
page 62, applied to $f \oplus \ell$ where $\ell$ is linear, and to a linear hyperplane $E$ of $\mathbb{F}_2^n$, and of the
well-known (easy to prove) fact that, for every even integer $n$, the sum of the squares of two
integers equals $2^n$ (resp. $2^{n+1}$) if and only if one of these squares is null and the other one
equals $2^n$ (resp. both squares equal $2^n$):

Theorem 16 [191] Let $n \geq 4$ be an even integer and let $f$ be an $n$-variable Boolean
function. Then the following properties are equivalent.
1. $f$ is bent.
2. For every (or some) hyperplane $E$ of $\mathbb{F}_2^n$, the restrictions of $f$ to $E$ and $\mathbb{F}_2^n \setminus E$ (viewed
as Boolean functions on $\mathbb{F}_2^{n-1}$) are plateaued with amplitude $2^{\frac{n}{2}}$ (i.e., are near-bent), and
their Walsh supports partition the whole space $\mathbb{F}_2^{n-1}$.
3. For every (or some) linear hyperplane $E$ of $\mathbb{F}_2^n$, every derivative $D_ef$, $e \in E \setminus \{0_n\}$, is
balanced.

The fact that Property 3 is enough comes from Relation (2.56), page 62. Note that we
have also (see [191]) that, if a function in an odd number of variables is such that, for some
nonzero $a \in \mathbb{F}_2^n$, every derivative $D_uf$, $u \neq 0_n$, $u \in a^\perp$, is balanced, then its restriction to
the linear hyperplane $a^\perp$ or to its complement is bent.
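Properties 2 and 3 of Theorem 16 can be observed on a small bent function. The sketch below assumes $f(x) = x_1x_2 \oplus x_3x_4$ on $\mathbb{F}_2^4$ and the hyperplane $E = \{x_4 = 0\}$, both chosen for illustration, together with a brute-force helper `walsh`.

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of f: F_2^n -> F_2, as a dict keyed by bit tuples."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f(x) ^ (sum(u & v for u, v in zip(a, x)) & 1)) for x in pts)
            for a in pts}

def f(x):  # bent on F_2^4
    return (x[0] & x[1]) ^ (x[2] & x[3])

# Property 2 for E = {x_4 = 0}: restrictions to E and to its complement, on F_2^3.
W0 = walsh(lambda x: f(x + (0,)), 3)
W1 = walsh(lambda x: f(x + (1,)), 3)
near_bent = all(abs(w) in (0, 4) for W in (W0, W1) for w in W.values())
supp0 = {a for a, w in W0.items() if w}
supp1 = {a for a, w in W1.items() if w}
partition = supp0.isdisjoint(supp1) and len(supp0 | supp1) == 8

# Property 3: every derivative D_e f with e in E \ {0} is balanced.
pts4 = list(product((0, 1), repeat=4))

def derivative_balanced(e):
    return sum(f(x) ^ f(tuple(u ^ v for u, v in zip(x, e))) for x in pts4) == 8

property3 = all(derivative_balanced(e) for e in pts4 if e[3] == 0 and any(e))
print(near_bent, partition, property3)
```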
It is also proved in [191] that the Walsh transforms of the four restrictions of a bent
function to an $(n - 2)$-dimensional vector subspace $E$ of $\mathbb{F}_2^n$ and its cosets have the same
sets of absolute values. It is a simple matter to see that, denoting by $a$ and $b$ two vectors
such that $E^\perp$ is the linear space spanned by $a$ and $b$, these four restrictions are bent if and
only if $D_aD_bf$ takes on constant value 1, and it is observed in [193] that26 $f \oplus 1_E$ is bent if
and only if $D_aD_bf$ takes on constant value 0 (see examples in [198, corollary 15]). More
on decomposing bent functions can be found in [191, 193, 349].

6.1.18 Class GPS and a geometric characterization of bent Boolean functions


Class PS generalizes to a class introduced in [213] and called GPS (for generalized
partial spreads), which led to a characterization of bent functions that we call geometric
characterization. This characterization, given below in Theorem 17, can be proved rather
simply by using Proposition 67, page 195, which is posterior to the introduction of GPS
and to Theorem 17:

Theorem 17 [290] Let $f$ be a Boolean function on $\mathbb{F}_2^n$. Then $f$ is bent if and only if there
exist $n/2$-dimensional subspaces $E_1, \dots, E_k$ of $\mathbb{F}_2^n$ (with no constraint on number $k$) and
integers $m_1, \dots, m_k$ (positive or negative) such that, for any element $x$ of $\mathbb{F}_2^n$:
$$f(x) \equiv \sum_{i=1}^{k} m_i 1_{E_i}(x) - 2^{\frac{n}{2}-1}\delta_0(x) \quad [\mathrm{mod}\ 2^{\frac{n}{2}}]. \tag{6.32}$$
If we have $f(x) = \sum_{i=1}^{k} m_i 1_{E_i}(x) - 2^{\frac{n}{2}-1}\delta_0(x)$, then the dual of $f$ equals
$\tilde{f}(x) = \sum_{i=1}^{k} m_i 1_{E_i^\perp}(x) - 2^{\frac{n}{2}-1}\delta_0(x)$.

Proof (sketch of): Relation (6.32) is a sufficient condition for f being bent, according to
Lemma 5 and to Relation (2.38), page 58. This same Relation (2.38) also implies the last
sentence of Theorem 17. Conversely, if f is bent, then Proposition 67 allows us to deduce
Relation (6.32), by expressing all the monomials $x^I$ by means of the indicators of subspaces
of dimension at least $n - |I|$ (indeed, the NNF of the indicator of the subspace
$\{x \in \mathbb{F}_2^n;\ x_i = 0, \forall i \in I\}$ being equal to $\prod_{i \in I}(1 - x_i) = \sum_{J \subseteq I} (-1)^{|J|} x^J$, the monomial $x^I$ can be
expressed by means of this indicator and of the monomials $x^J$, where $J$ is strictly included
in $I$) and by using Lemma 8 below.

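Lemma 8 Let $F$ be any $d$-dimensional subspace of $\mathbb{F}_2^n$. There exist $n/2$-dimensional
subspaces $E_1, \dots, E_k$ of $\mathbb{F}_2^n$ and integers $m, m_1, \dots, m_k$ such that, for any element $x$ of $\mathbb{F}_2^n$:
$$2^{\frac{n}{2}-d}\, 1_F(x) \equiv m + \sum_{i=1}^{k} m_i 1_{E_i}(x) \quad [\mathrm{mod}\ 2^{\frac{n}{2}}] \quad \text{if } d < \frac{n}{2}, \text{ and}$$
$$1_F(x) \equiv \sum_{i=1}^{k} m_i 1_{E_i}(x) \quad [\mathrm{mod}\ 2^{\frac{n}{2}}] \quad \text{if } d > \frac{n}{2}.$$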

This lemma completes the proof of Theorem 17 since $d \geq n - |I|$ implies $|I| - n/2 \geq n/2 - d$.
26 We have seen in the second remark of page 203 that this is a direct consequence of Theorem 14.

Definition 56 The class of those functions $f$ that satisfy the relation obtained from (6.32)
by withdrawing "$[\mathrm{mod}\ 2^{\frac{n}{2}}]$" is called generalized partial spread class and denoted by $\mathcal{GPS}$.

Class $\mathcal{GPS}$ includes $\mathcal{PS}$, see [213]. The dual $\tilde{f}$ of such a function $f$ of $\mathcal{GPS}$ equaling
$\tilde{f}(x) = \sum_{i=1}^{k} m_i 1_{E_i^\perp}(x) - 2^{\frac{n}{2}-1}\delta_0(x)$, it belongs to $\mathcal{GPS}$ too.
There is no uniqueness of the representation of a given bent function in the form (6.32).
There exists another characterization, shown in [291], in the form $f(x) = \sum_{i=1}^{k} m_i 1_{E_i}(x) \pm 2^{\frac{n}{2}-1}\delta_0(x)$,
where $E_1, \dots, E_k$ are vector subspaces of $\mathbb{F}_2^n$ of dimensions $n/2$ or $n/2 + 1$ and
where $m_1, \dots, m_k$ are integers (positive or negative). There is not a unique way, either, to
choose these spaces Ei . But it is possible to define some subclass of n/2-dimensional and
(n/2 + 1)-dimensional spaces such that there is uniqueness, if the spaces Ei are chosen in
this subclass.
P. Guillot has proved subsequently in [579] that, up to the composition by a translation
x → x + a, every bent function belongs to GPS . The proof is a little too technical for being
included here.

6.1.19 On the number of bent Boolean functions


Nonexistence of efficient lower bounds
The original Maiorana–McFarland class is one of the widest classes. The number of bent
functions of the form (6.9), page 209, equals $(2^{n/2})! \times 2^{2^{n/2}}$, which is asymptotically equivalent
to
$$\left(\frac{2^{\frac{n}{2}+1}}{e}\right)^{2^{n/2}} \sqrt{2^{\frac{n}{2}+1}\,\pi}$$
(according to Stirling's formula), while the other important straightforward construction of
bent functions, $\mathcal{PS}_{ap}$, leads only to
$$\binom{2^{n/2}}{2^{n/2-1}} \approx \frac{2^{2^{n/2}+\frac{1}{2}}}{\sqrt{\pi\, 2^{n/2}}}$$
functions.$^{27}$
However, the number of bent Maiorana–McFarland functions seems negligible with respect
to the total number of bent functions. The size of the completed Maiorana–McFarland class
is unknown; it is at most equal to the number of Maiorana–McFarland functions times the
number of affine automorphisms, which equals $2^n(2^n - 1)(2^n - 2) \cdots (2^n - 2^{n-1})$. It seems
also negligible with respect to the total number of bent functions. In fact, the lower bounds
that can be deduced from all known constructions of bent functions seem very far from the
actual number. For instance, in eight variables, there are approximately $2^{106}$ different bent
functions,$^{28}$ see below, and about $2^{77}$ correspond to the known constructions. The problem
of determining an efficient lower bound on the number of n-variable bent functions is then
open.
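
To give a feeling for the orders of magnitude involved, the counting formulas quoted above can be evaluated exactly for small $n$. This is only an illustrative sketch; the function names are hypothetical, and the formulas are the ones stated in the text for the Maiorana–McFarland functions, for $\mathcal{PS}_{ap}$, and for the affine automorphisms.

```python
from math import comb, factorial, log2

def maiorana_mcfarland_count(n):
    # (2^(n/2))! * 2^(2^(n/2)) functions of the form (6.9).
    size = 2 ** (n // 2)
    return factorial(size) * 2 ** size

def ps_ap_count(n):
    # Binomial coefficient C(2^(n/2), 2^(n/2 - 1)).
    size = 2 ** (n // 2)
    return comb(size, size // 2)

def affine_automorphism_count(n):
    # 2^n (2^n - 1)(2^n - 2) ... (2^n - 2^(n-1)).
    count = 2 ** n
    for i in range(n):
        count *= 2 ** n - 2 ** i
    return count

for n in (4, 6, 8):
    print(
        n,
        round(log2(maiorana_mcfarland_count(n)), 1),
        round(log2(ps_ap_count(n)), 1),
        round(log2(affine_automorphism_count(n)), 1),
    )
```

For $n = 8$ the Maiorana–McFarland count is roughly $2^{60}$, to be compared with the approximately $2^{106}$ bent functions mentioned above.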
There exists a related open question by N. Tokareva in [1087] (that she calls the bent sum
decomposition problem) whether all Boolean functions of algebraic degree at most n/2 are
equal to the sums of two n-variable bent functions (which is equivalent to asking whether the
set of such sums is stable under addition [977]). The reply to this question seems probably
negative, but there is no proof that it is, and it is shown in [977] that the reply is positive when
restricting ourselves to a number of subclasses (Boolean functions in at most six variables,

27 Its extension with André’s spreads, see page 216, has nevertheless more elements.
28 Among which probably many could lead to new infinite classes; this shows how limited is our knowledge.

quadratic Boolean functions, Maiorana–McFarland bent functions, partial spread functions).


And the usual parameters and properties of Boolean functions (ANF, NNF and numerical
degree, generalized degree, divisibility of the Fourier transform or of the coefficients of
the NNF, other properties of the Fourier or Walsh transform values) do not seem to allow
discriminating sums of two bent functions from other Boolean functions of degrees at most
n/2. If the reply to Tokareva’s question was finally positive, this would give a straightforward
lower bound on the number of bent functions that would be much better than what is known.

Upper bounds
Rothaus' inequality recalled in Section 6.1.8 (Theorem 13, page 200) states that any bent
function has algebraic degree at most n/2. Thus, the number of bent functions is at most
$$2^{1 + n + \cdots + \binom{n}{n/2}} = 2^{2^{n-1} + \frac{1}{2}\binom{n}{n/2}}.$$
We shall call this upper bound the naive bound. For n = 6, the number of bent functions
is known and is approximately equal to $2^{32.3}$ (see [968]), which is much less than what the naive
bound gives: $2^{42}$. For n = 8, the number is also known: it was first shown in [744]
that it is less than $2^{129.2}$; it was later calculated by Langevin et al. [743] and equals
approximately $2^{106.3}$ (the naive bound gives $2^{163}$). Hence picking at random an 8-variable
Boolean function of algebraic degree bounded above by 4 does not allow obtaining bent
functions (but more clever methods exist; see [278, 413]). An upper bound improving upon
the naive bound has been found in [301]. It is exponentially better than the naive bound since
it divides it by approximately $2^{2^{n/2} - \frac{n}{2} - 1}$. But it seems to be still far from the exact number of
bent functions: for n = 6, it gives roughly $2^{38}$ (to be compared with $2^{32.3}$), and for n = 8
it gives roughly $2^{152}$ (to be compared with $2^{106.3}$). The bound of [301] has not been
improved since it was obtained.
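
The exponents quoted in this paragraph follow from simple arithmetic, which the short sketch below reproduces. The function names are hypothetical; `improved_exponent` only applies the approximate division factor stated above and is not the exact bound of [301].

```python
from math import comb

def naive_exponent(n):
    # log2 of the naive bound: number of monomials of degree at most n/2.
    return sum(comb(n, d) for d in range(n // 2 + 1))

def improved_exponent(n):
    # Naive exponent minus the approximate gain 2^(n/2) - n/2 - 1 quoted in the text.
    return naive_exponent(n) - (2 ** (n // 2) - n // 2 - 1)

for n in (6, 8):
    # The closed form 2^(n-1) + C(n, n/2)/2 agrees with the direct sum.
    assert naive_exponent(n) == 2 ** (n - 1) + comb(n, n // 2) // 2
    print(n, naive_exponent(n), improved_exponent(n))
```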

Number of bent components of a vectorial function


It is shown in [962] that the number of bent components of any (n, n)-function is at most
$2^n - 2^{n/2}$ and that this upper bound is achieved with equality by the Niho power function $x^{2^{n/2}+1}$,
and the function $x^{2^i}(x + x^{2^{n/2}})$ for all $i = 0, \dots, n - 1$ (these latter functions are pairwise
EA/CCZ inequivalent for $i \neq 0, n/2$). In [877], it is shown that the set of those (n, n)-
functions having maximum number of bent components is preserved by CCZ equivalence
and does not contain any APN plateaued function.

6.1.20 Hyper-bent, homogeneous, symmetric/rotation symmetric bent Boolean functions

Hyper-bent Boolean functions
Hyper-bent functions were initially proposed by Golomb and Gong [554] in relation with
the security of symmetric cryptosystems, for the reason that when $\gcd(i, 2^n - 1) = 1$, both
functions $\mathrm{tr}_n(ax)$ and $\mathrm{tr}_n(ax^i)$ provide m-sequences. But no explicit attack was proposed.
In [202], Canteaut and Rotella showed that, in the context of filtered LFSR, a relevant
criterion is the minimum distance between the function and the Boolean functions of the
form $\mathrm{tr}_n(ax^i) \oplus \epsilon$, where $\gcd(i, 2^n - 1) = 1$, $a \in \mathbb{F}_{2^n}$ and $\epsilon \in \mathbb{F}_2$: they showed that if
$f(x) \oplus \mathrm{tr}_n(ax^i)$ is biased, then a fast correlation attack can be performed to recover the
initial state. Even the case when $i$ is not coprime with $2^n - 1$ leads to an attack, and this
provides a new criterion to evaluate the security of filtered LFSR. Nevertheless, these new
considerations confirm the interest of the definition introduced by Golomb and Gong.

Definition 57 Let $n$ be any even positive integer. An $n$-variable Boolean function $f$
on the field $\mathbb{F}_{2^n}$ is a hyper-bent function if, for every positive integer $i$ coprime with
$2^n - 1$, function $f(x^i)$ is bent (or equivalently, since the compositional inverse of a
generic power permutation $x^i$ is a generic power permutation, if for any such $i$, we have
$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \mathrm{tr}_n(a x^i)} = \pm 2^{n/2}$ for every $a \in \mathbb{F}_{2^n}$).

Remark. In [214], the author determined those Boolean functions on $\mathbb{F}_2^n$ such that, for a
given even integer $k$ ($2 \leq k \leq n - 2$), any of the Boolean functions on $\mathbb{F}_2^{n-k}$, obtained by
keeping constant $k$ coordinates among $x_1, \dots, x_n$, is bent (i.e., those functions that satisfy
the propagation criterion of degree $n - k$ and order $k$). These are the four bent symmetric
Boolean functions (see Section 10.1). They were called hyper-bent in [214], but we keep this
term for the notion introduced by Golomb and Gong.

Hyper-bent functions can be characterized in terms of the extended Walsh transform
[554]:
$$W_f(a, i) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \mathrm{tr}_n(a x^i)}, \quad \forall a \in \mathbb{F}_{2^n}, \text{ with } \gcd(i, 2^n - 1) = 1,$$
as those functions whose extended Walsh transform takes only the values $\pm 2^{n/2}$.
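
The extended Walsh transform can be evaluated exhaustively when $n$ is small. The sketch below does so for $n = 4$, under the assumption that $\mathbb{F}_{16}$ is represented modulo $X^4 + X + 1$ with elements encoded as 4-bit integers; all function names are hypothetical. As test functions it uses the monomial trace functions $\mathrm{tr}_4(ax^3)$, chosen here purely for illustration.

```python
from math import gcd

N = 4
POLY = 0b10011  # X^4 + X + 1, an assumed irreducible modulus for F_16

def gf_mul(a, b):
    # Carry-less multiplication with reduction modulo POLY.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def gf_pow(a, e):
    # Square-and-multiply exponentiation in the field.
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(x):
    # Absolute trace x + x^2 + ... + x^(2^(N-1)); the result is 0 or 1.
    t, y = 0, x
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t

def extended_walsh(f, a, i):
    # W_f(a, i) = sum over x of (-1)^(f(x) + tr(a x^i)).
    return sum((-1) ** (f(x) ^ trace(gf_mul(a, gf_pow(x, i)))) for x in range(1 << N))

def is_hyper_bent(f):
    # All values W_f(a, i), gcd(i, 2^N - 1) = 1, must equal +/- 2^(N/2).
    order = (1 << N) - 1
    return all(
        abs(extended_walsh(f, a, i)) == 1 << (N // 2)
        for i in range(1, order) if gcd(i, order) == 1
        for a in range(1 << N)
    )

def monomial(a):
    # Test function x -> tr(a x^3).
    return lambda x: trace(gf_mul(a, gf_pow(x, 3)))

hyper_bent_coefficients = [a for a in range(1, 1 << N) if is_hyper_bent(monomial(a))]
cubes = {gf_pow(x, 3) for x in range(1, 1 << N)}
print(len(hyper_bent_coefficients), sorted(cubes))
```

In this small case the coefficients $a$ that pass the test are exactly the ten elements of $\mathbb{F}_{16}^*$ that are not cubes.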
The condition seems difficult to satisfy. However, A. Youssef and G. Gong, who
introduced the term in [1143], showed that hyper-bent functions exist. Recall that class
$\mathcal{PS}_{ap}^{\#}$, defined at page 215, is the set of those bent functions over $\mathbb{F}_{2^n}$ that can be obtained
from those of $\mathcal{PS}_{ap}$ by composition by the transformations $x \in \mathbb{F}_{2^n} \mapsto \delta x$, $\delta \neq 0$, and by
addition of a constant. We have:

Proposition 87 [278] All the functions of class PS #ap are hyper-bent.

Let us give here a direct proof of this fact.

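Proof We can restrict ourselves without loss of generality to the functions of class $\mathcal{PS}_{ap}$.
Let $\omega$ be any element in $\mathbb{F}_{2^n} \setminus \mathbb{F}_{2^{n/2}}$. The pair $(1, \omega)$ is a basis of the $\mathbb{F}_{2^{n/2}}$-vector space $\mathbb{F}_{2^n}$.
Hence, we have $\mathbb{F}_{2^n} = \mathbb{F}_{2^{n/2}} + \omega\,\mathbb{F}_{2^{n/2}}$, and the elements of class $\mathcal{PS}_{ap}$ are the functions
$f(y' + \omega y) = g\left(\frac{y'}{y}\right)$, with $\frac{y'}{y} = 0$ if $y = 0$, where $g$ is balanced on $\mathbb{F}_{2^{n/2}}$ and vanishes
at 0. Note that every element $y$ of $\mathbb{F}_{2^{n/2}}$ satisfies $y^{2^{n/2}} = y$ and therefore
$\mathrm{tr}_n(y) = y + y^2 + \cdots + y^{2^{n/2-1}} + y + y^2 + \cdots + y^{2^{n/2-1}} = 0$. Consider the inner product in $\mathbb{F}_{2^n}$ defined
by $y \cdot y' = \mathrm{tr}_n(y\,y')$; the subspace $\mathbb{F}_{2^{n/2}}$ is then its own orthogonal; hence, according to
Relation (2.38), page 58, any sum of the form $\sum_{y \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_n(\lambda y)}$ is null if $\lambda \notin \mathbb{F}_{2^{n/2}}$ and
equals $2^{n/2}$ if $\lambda \in \mathbb{F}_{2^{n/2}}$. For every $a \in \mathbb{F}_{2^n}$, we have
$$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \mathrm{tr}_n(a x^i)} = \sum_{y, y' \in \mathbb{F}_{2^{n/2}}} (-1)^{g\left(\frac{y'}{y}\right) + \mathrm{tr}_n(a (y' + \omega y)^i)}.$$
Denoting $\frac{y'}{y}$ by $z$, we see that
$$\sum_{y \in \mathbb{F}_{2^{n/2}}^*,\ y' \in \mathbb{F}_{2^{n/2}}} (-1)^{g\left(\frac{y'}{y}\right) + \mathrm{tr}_n(a (y' + \omega y)^i)} = \sum_{z \in \mathbb{F}_{2^{n/2}},\ y \in \mathbb{F}_{2^{n/2}}^*} (-1)^{g(z) + \mathrm{tr}_n(a\, y^i (z + \omega)^i)}.$$
The remaining sum $\sum_{y' \in \mathbb{F}_{2^{n/2}}} (-1)^{g(0) + \mathrm{tr}_n(a\, y'^i)} = \sum_{y' \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_n(a\, y')}$ equals $2^{n/2}$ if $a \in \mathbb{F}_{2^{n/2}}$
and is null otherwise.
Thus, $\sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \mathrm{tr}_n(a x^i)}$ equals
$$\sum_{z \in \mathbb{F}_{2^{n/2}}} \left( (-1)^{g(z)} \sum_{y \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_n(a (z + \omega)^i y)} \right) - \sum_{z \in \mathbb{F}_{2^{n/2}}} (-1)^{g(z)} + 2^{n/2}\, 1_{\mathbb{F}_{2^{n/2}}}(a).$$
The sum $\sum_{z \in \mathbb{F}_{2^{n/2}}} (-1)^{g(z)}$ is null since $g$ is balanced.
The sum $\sum_{z \in \mathbb{F}_{2^{n/2}}} (-1)^{g(z)} \sum_{y \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_n(a (z + \omega)^i y)}$ equals $\pm 2^{n/2}$ if $a \notin \mathbb{F}_{2^{n/2}}$, since
we prove in the next lemma that there exists then exactly one $z \in \mathbb{F}_{2^{n/2}}$ such that
$a(z + \omega)^i \in \mathbb{F}_{2^{n/2}}$; and this sum is null if $a \in \mathbb{F}_{2^{n/2}}$ (this can be checked, if $a = 0$ thanks to the
balancedness of $g$, and if $a \neq 0$ because $y$ ranges over $\mathbb{F}_{2^{n/2}}$ and $a(z + \omega)^i \notin \mathbb{F}_{2^{n/2}}$). This
completes the proof.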

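Lemma 9 Let $n$ be any positive integer. Let $a$ and $\omega$ be two elements of the set $\mathbb{F}_{2^n} \setminus \mathbb{F}_{2^{n/2}}$,
and let $i$ be coprime with $2^n - 1$. There exists a unique element $z \in \mathbb{F}_{2^{n/2}}$ such that
$a(z + \omega)^i \in \mathbb{F}_{2^{n/2}}$.

Proof Let $j$ be the inverse of $i$ modulo $2^n - 1$. We have $a(z + \omega)^i \in \mathbb{F}_{2^{n/2}}$ if and only
if $z \in \omega + a^{-j} \times \mathbb{F}_{2^{n/2}}$. The sets $\omega + a^{-j} \times \mathbb{F}_{2^{n/2}}$ and $\mathbb{F}_{2^{n/2}}$ are two flats whose directions
$a^{-j} \times \mathbb{F}_{2^{n/2}}$ and $\mathbb{F}_{2^{n/2}}$ are subspaces whose sum is direct and equals $\mathbb{F}_{2^n}$. Hence, they have a
unique vector in their intersection.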

The duals of hyper-bent functions in PS #ap are also in PS #ap and then are hyper-bent.
Relationships between the notion of hyper-bent function and cyclic codes are studied in
[278], and it is deduced that:

Proposition 88 [278] Every hyper-bent function $f : \mathbb{F}_{2^n} \to \mathbb{F}_2$ can be represented as:
$f(x) = \sum_{i=1}^{r} \mathrm{tr}_n(a_i x^{t_i}) + \epsilon$, where $a_i \in \mathbb{F}_{2^n}$, $\epsilon \in \mathbb{F}_2$ and $w_2(t_i) = n/2$, where $w_2$ denotes
the 2-weight (see page 45). Consequently, all hyper-bent functions have algebraic degree
n/2.

It is also shown in [278] that the elements in $\mathcal{PS}_{ap}^{\#}$ are the functions of Hamming weight
$2^{n-1} \pm 2^{n/2-1}$, which can be written in the form $\sum_{i=1}^{r} \mathrm{tr}_n(a_i x^{j_i})$, where $a_i \in \mathbb{F}_{2^n}$ and $j_i$ is
a multiple of $2^{n/2} - 1$. Hence, $\mathcal{PS}_{ap}^{\#}$ coincides with the set of bent functions whose trace
form involves Dillon-like exponents $r(2^{n/2} - 1)$ only.
In [350], it is proved that, for every $n$ even, $\lambda \in \mathbb{F}_{2^{n/2}}^*$ and $r \in\, ]0; \frac{n}{2}[$ such that
the cyclotomic cosets of 2 modulo $2^{n/2} + 1$ containing respectively $2^r - 1$ and $2^r + 1$
have size $n$ and such that the function $\mathrm{tr}_{n/2}\left(\lambda x^{2^r+1}\right)$ is balanced on $\mathbb{F}_{2^{n/2}}$, the function
$\mathrm{tr}_n\left(\lambda \left(x^{(2^r-1)(2^{n/2}-1)} + x^{(2^r+1)(2^{n/2}-1)}\right)\right)$ is bent (i.e., hyper-bent) if and only if the
function $\mathrm{tr}_{n/2}\left(x^{-1} + \lambda x^{2^r+1}\right)$ is also balanced on $\mathbb{F}_{2^{n/2}}$.

Computer experiments have been reported in [278]. For n = 4, there exist hyper-bent
functions that are not in PS #ap . Hence, stricto sensu, the set of hyper-bent functions contains
strictly PS #ap , but no other example was found for n > 4. See more in [725].

Constructions of hyper-bent functions in univariate trace form and characterizations
The simplest examples of hyper-bent functions (belonging to $\mathcal{PS}_{ap}^{\#}$) in trace form are the
(generalized) Dillon monomial functions $\mathrm{tr}_n(a x^{r(2^{n/2}-1)})$, $x \in \mathbb{F}_{2^n}$, $a \in \mathbb{F}_{2^n}^*$,
$\gcd(r, 2^{n/2} + 1) = 1$, where the restriction of $\mathrm{tr}_n(ax)$ to $U$ has Hamming weight $2^{n/2-1}$ (see page 215).
The bentness (hyper-bentness) of such functions has been studied by several authors: in the
case $r = 1$ by Dillon [441], next by Leander [750], and when $r$ is coprime with $2^{n/2} + 1$, by
Charpin and Gong [350]:
1. The bentness of $\mathrm{tr}_n(a x^{r(2^{n/2}-1)})$ does not depend on the choice of $r$.
2. It is bent if and only if the Kloosterman sum $\sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(a^{2^{n/2}+1} x + \frac{1}{x}\right)}$ equals 0.
3. When bent, $\mathrm{tr}_n(a x^{r(2^{n/2}-1)})$ is self-dual.

The other known examples are

– Binomial hyper-bent functions, mainly due to S. Mesnager [851, 853] who made deep
work on this subject; these functions are the sums of a Dillon monomial function and of a
function expressed by means of the trace function over the subfield $\mathbb{F}_4$ of $\mathbb{F}_{2^n}$:
• $\mathrm{tr}_n\left(a x^{r(2^{n/2}-1)}\right) + \mathrm{tr}_2\left(b x^{\frac{2^n-1}{3}}\right)$, where $a \in \mathbb{F}_{2^n}^*$, $b \in \mathbb{F}_4^*$, $\gcd(r, 2^{n/2} + 1) = 1$.
When $n/2$ is odd larger than 3, such function is hyper-bent if and only if
$\sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(a^{2^{n/2}+1} x + \frac{1}{x}\right)} = 4$ (and this implies $\mathrm{tr}_{n/2}\left(a^{\frac{2^{n/2}+1}{3}}\right) = 0$); the function
belongs then to class $\mathcal{PS}_{ap}^{\#}$ (it belongs to $\mathcal{PS}_{ap}$ if $b \in \mathbb{F}_2$). The dual has the same form.
When $n/2$ is even, the characterization of the bentness of this function is an open problem
(but we know that $\sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(a^{2^{n/2}+1} x + \frac{1}{x}\right)} = 4$ is necessary), and it is not known
whether the function, when bent, belongs to the class $\mathcal{PS}^-$ or not.
• $\mathrm{tr}_n\left(a \zeta^i x^{3(2^{n/2}-1)}\right) + \mathrm{tr}_2\left(\beta^j x^{\frac{2^n-1}{3}}\right)$, where $a \in \mathbb{F}_{2^{n/2}}^*$, $\beta$ is a primitive element of $\mathbb{F}_4$, $\zeta$ a
generator of the cyclic group of $(2^{n/2} + 1)$th roots of unity, and with $n/2$ odd and not congruent
with 3 mod 6, is a hyper-bent function if and only if we are in one of the following
cases:
– $\mathrm{tr}_{n/2}(a^{1/3}) = 0$ and $\sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(ax + \frac{1}{x}\right)} = 4$,
– $\mathrm{tr}_{n/2}(a^{1/3}) = 1$, $i \in \{1, 2\}$, and $\sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}\left(ax + \frac{1}{x}\right)} + \sum_{x \in \mathbb{F}_{2^{n/2}}} (-1)^{\mathrm{tr}_{n/2}(a(x + x^3))} = 4$.

When these functions are bent, they belong to class $\mathcal{PS}^{\#}$ (and to $\mathcal{PS}_{ap}$ if $b \in \mathbb{F}_2$) and
the dual function has the same form.
Note that $n/2$ being odd, $3(2^{n/2} - 1)$ is not a Dillon exponent because 3 divides
$2^{n/2} + 1$, contrary to when $n/2$ is even; hence this second class is not included in the first
class.
In [871], the authors study the hyper-bentness of more general binomial functions and
obtain a long list of (potentially new) hyper-bent functions.
– Polynomial hyper-bent functions:

• In [350] in the form $\sum_{r \in R} \mathrm{tr}_n\left(\beta_r x^{r(2^{n/2}-1)}\right)$, $\beta_r \in \mathbb{F}_{2^n}$, where $R$ is a set of representatives
of full size$^{29}$ cyclotomic cosets modulo $2^{n/2} + 1$, with a characterization of hyper-
bentness by means of Dickson polynomials (see also [475, 782]). When $r$ is coprime
with $2^{n/2} + 1$, the functions are the sums of several Dillon monomial functions.
• In [546] in the form:
– $\sum_{i=1}^{2^{n/2-1}-1} \mathrm{tr}_n\left(\beta x^{i(2^{n/2}-1)}\right)$; $\beta \in \mathbb{F}_{2^{n/2}} \setminus \mathbb{F}_2$,
– $\sum_{i=1}^{2^{n/2-2}-1} \mathrm{tr}_n\left(\beta x^{i(2^{n/2}-1)}\right)$; $n/2$ odd and $\beta^{(2^{n/2}-4)^{-1}} \in \{x \in \mathbb{F}_{2^{n/2}}^*;\ \mathrm{tr}_{n/2}(x) = 0\}$.
• In [852] (in [350] for $b = 0$) in the form $\sum_{r \in R} \mathrm{tr}_n\left(a_r x^{r(2^{n/2}-1)}\right) + \mathrm{tr}_2\left(b x^{\frac{2^n-1}{3}}\right)$, $x \in \mathbb{F}_{2^n}$,
$b \in \mathbb{F}_4$.
Hyper-bentness can be characterized by means of exponential sums involving Dickson
polynomials (see also [511]).
When $b$ is a primitive element of $\mathbb{F}_4$, the condition reduces to the evaluation of the
Hamming weight of some Boolean functions.
• In [871] in the form $\sum_{r \in R} \mathrm{tr}_n\left(a_r x^{r(2^{n/2}-1)}\right) + \mathrm{tr}_t\left(b x^{s(2^{n/2}-1)}\right)$, where
– $R$ is a set of representatives of the cyclotomic classes modulo $2^{n/2} + 1$ (not necessarily
of maximal size).
– The coefficients $a_r$ are in $\mathbb{F}_{2^{n/2}}$.
– $s$ divides $2^{n/2} + 1$, i.e., $s(2^{n/2} - 1)$ is a Dillon-like exponent; we set $\tau = \frac{2^{n/2}+1}{s}$.
– $t$ is the size of the cyclotomic coset of $s$ modulo $2^{n/2} + 1$.
– $b \in \mathbb{F}_{2^t}$.

But the characterization of hyper-bentness in terms of exponential sums is so complex
that no new hyper-bent function could be deduced except in some particular cases.
• In [1062], more Dillon exponent hyper-bent functions (see also [764]), with coefficients
in $\mathbb{F}_{2^n}$ (with a general result unifying results from the references above), and generalized
exponents in [1063].

See also [512].

29 It has been shown later in [871] that it is enough to assume that the size does not divide n/2.

Homogeneous bent functions


Definition 58 [975] A Boolean function is called a homogeneous function if all the
monomials of its algebraic normal form have the same degree.

In [347], Charnes et al. showed how to use invariant theory to construct homogeneous bent
functions. They showed connections between homogeneous cubic functions and 1-designs
and certain graphs and proved that there exist cubic homogeneous bent functions in each
even number of variables n ≥ 6. They studied the equivalence between the constructed
bent functions and the properties of the associated elementary Abelian difference sets. It is
proved in [1126] that no homogeneous bent function of degree n/2 exists in n variables for
n > 6, and in [848] that, for any nonnegative integer k, if n is large enough, there exists no
homogeneous bent function in n variables having degree n/2 − k at least. Partial results
toward a conjectured nonexistence of homogeneous rotation symmetric bent functions (see
below) having algebraic degree larger than 2 have been obtained in [847].

Rotation symmetric bent functions and idempotent bent functions


Symmetry, that is, invariance under any permutation of input variables, simplifies the study
of Boolean functions, but all symmetric Boolean bent functions (see Section 10.1, page
352) are quadratic and belong then to one EA equivalence class of Boolean functions. The
superclass of rotation symmetric Boolean functions has then been introduced by Pieprzyk
and Qu in [954].

Definition 59 Let $n$ be any positive integer. A Boolean function over $\mathbb{F}_2^n$ is called
rotation symmetric (RS) if it is invariant under any cyclic shift of input coordinates, which
is equivalent to saying that it is invariant under a primitive cyclic shift, for instance:
$(x_0, x_1, \dots, x_{n-1}) \mapsto (x_{n-1}, x_0, x_1, \dots, x_{n-2})$.

RS functions are in fact linked to a notion that had been previously introduced by Filiol
and Fontaine in [503, 515], as observed by them:

Definition 60 Let $n$ be any positive integer. A Boolean function $f$ on $\mathbb{F}_{2^n}$ is called an
idempotent function (or briefly an idempotent) if it satisfies $f(x) = f(x^2)$, for all $x \in \mathbb{F}_{2^n}$.

Note that a Boolean function given in univariate form $f(x) = \sum_{j=0}^{2^n-1} \delta_j x^j$ (or in subfield
trace representation; see page 43) is an idempotent if and only if every coefficient $\delta_j$ belongs
to $\mathbb{F}_2$. The link between RS functions and idempotents is through normal bases. Recall that
for every $n$, there exists a primitive element $\alpha$ in $\mathbb{F}_{2^n}$ such that $(\alpha, \alpha^2, \alpha^{2^2}, \dots, \alpha^{2^{n-1}})$ is a
basis of the vector space $\mathbb{F}_{2^n}$ (see [775, 890]). Such basis is called a normal basis.

Proposition 89 For any Boolean function $f(x)$ over $\mathbb{F}_{2^n}$, and every normal basis
$(\alpha, \alpha^2, \dots, \alpha^{2^{n-1}})$ of $\mathbb{F}_{2^n}$, the function
$$(x_0, \dots, x_{n-1}) \in \mathbb{F}_2^n \mapsto f\left(\sum_{i=0}^{n-1} x_i \alpha^{2^i}\right)$$
is RS if and only if f is an idempotent.

This is easily proved. Hence the two notions are theoretically equivalent (but knowing
infinite classes for each notion is not equivalent). Proposition 89 leads to a notion of circulant
equivalence of RS functions; see, e.g., [245].
The bivariate representation and more general k-variate representation of RS functions
and of idempotent functions is studied in [281], where the link between these notions is
studied further; see Section 10.2, page 360.
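
The correspondence of Proposition 89 can be observed numerically on a small field. The sketch below works in $\mathbb{F}_{16}$, assumed to be represented modulo $X^4 + X + 1$ with elements encoded as 4-bit integers, searches for a normal basis, and compares an idempotent with a non-idempotent. The two test functions $\mathrm{tr}_4(x^3)$ and $\mathrm{tr}_4(\beta x^3)$, with $\beta$ the class of $X$, and all helper names are illustrative choices, not taken from the text.

```python
N = 4
POLY = 0b10011  # X^4 + X + 1, an assumed irreducible modulus for F_16

def gf_mul(a, b):
    # Carry-less multiplication with reduction modulo POLY.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(x):
    # Absolute trace; the result is 0 or 1.
    t, y = 0, x
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t

def conjugates(alpha):
    # (alpha, alpha^2, alpha^4, ..., alpha^(2^(N-1))).
    out = [alpha]
    for _ in range(N - 1):
        out.append(gf_mul(out[-1], out[-1]))
    return out

def is_basis(vectors):
    # The vectors are independent over F_2 iff their span has 2^N elements.
    span = {0}
    for v in vectors:
        span |= {s ^ v for s in span}
    return len(span) == 1 << N

normal_basis = next(conjugates(a) for a in range(1, 1 << N) if is_basis(conjugates(a)))

def coordinate_function(f):
    # (x_0, ..., x_{N-1}) -> f(sum of x_i * alpha^(2^i)).
    def g(bits):
        x = 0
        for bit, element in zip(bits, normal_basis):
            if bit:
                x ^= element
        return f(x)
    return g

def is_rotation_symmetric(g):
    for v in range(1 << N):
        bits = [(v >> i) & 1 for i in range(N)]
        shifted = bits[-1:] + bits[:-1]  # (x_{N-1}, x_0, ..., x_{N-2})
        if g(bits) != g(shifted):
            return False
    return True

idempotent = lambda x: trace(gf_pow(x, 3))                 # coefficient in F_2
non_idempotent = lambda x: trace(gf_mul(2, gf_pow(x, 3)))  # coefficient outside F_2

print(is_rotation_symmetric(coordinate_function(idempotent)))
print(is_rotation_symmetric(coordinate_function(non_idempotent)))
```

The key point is that squaring $x = \sum_i x_i \alpha^{2^i}$ shifts its coordinates cyclically in a normal basis, so invariance under $x \mapsto x^2$ and invariance under the cyclic shift are the same condition.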

Quadratic RS functions and idempotents The quadratic part of any quadratic RS
Boolean function has the form
$$\bigoplus_{i=1}^{n/2-1} c_i \left(\bigoplus_{j=0}^{n-1} x_j x_{i+j}\right) \oplus c_{n/2} \left(\bigoplus_{j=0}^{n/2-1} x_j x_{n/2+j}\right), \tag{6.33}$$

where c1 , . . . , cn/2 ∈ F2 and where the indices of x are modulo n. We have:

Proposition 90 [531] Let $n$ be any even integer. Any RS quadratic function (6.33) is bent
if and only if the polynomial $P(X) = \sum_{i=1}^{n/2-1} c_i (X^i + X^{n-i}) + c_{n/2} X^{n/2}$ is coprime with
$X^n + 1$, that is, the linearized polynomial $L(X) = \sum_{i=1}^{n/2-1} c_i (X^{2^i} + X^{2^{n-i}}) + c_{n/2} X^{2^{n/2}}$ is
a permutation polynomial.

Indeed, according to the characterization of quadratic bent functions recalled at page 205,
the function is bent if and only if the matrix of its associated symplectic form is nonsingular,
that is, the cyclic code generated by the rows of this matrix equals $\mathbb{F}_2^n$, and the generator
polynomial of this code equals $\gcd\left(\sum_{i=1}^{n/2-1} c_i (X^i + X^{n-i}) + c_{n/2} X^{n/2},\ X^n + 1\right)$.
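
The gcd criterion of Proposition 90 is easy to compare with a direct Walsh transform computation for small $n$. The following is a minimal sketch, in which polynomials over $\mathbb{F}_2$ are encoded as integer bit masks (bit $i$ is the coefficient of $X^i$); this encoding and all function names are assumptions made for the illustration.

```python
from itertools import product

def poly_mod(a, b):
    # Remainder of a modulo b in F_2[X].
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def poly_gcd(a, b):
    # Euclidean algorithm in F_2[X].
    while b:
        a, b = b, poly_mod(a, b)
    return a

def gcd_criterion(c, n):
    # c[i-1] holds c_i for i = 1, ..., n/2; builds P(X) and tests gcd(P, X^n + 1) = 1.
    p = 0
    for i in range(1, n // 2):
        if c[i - 1]:
            p ^= (1 << i) | (1 << (n - i))
    if c[n // 2 - 1]:
        p ^= 1 << (n // 2)
    return poly_gcd(p, (1 << n) | 1) == 1

def rs_quadratic(c, n):
    # The quadratic RS function (6.33) with coefficients c_1, ..., c_{n/2}.
    def f(x):
        value = 0
        for i in range(1, n // 2):
            if c[i - 1]:
                for j in range(n):
                    value ^= x[j] & x[(i + j) % n]
        if c[n // 2 - 1]:
            for j in range(n // 2):
                value ^= x[j] & x[n // 2 + j]
        return value
    return f

def is_bent(f, n):
    # Brute-force check that all Walsh values have absolute value 2^(n/2).
    points = list(product((0, 1), repeat=n))
    for a in points:
        w = sum((-1) ** (f(x) ^ (sum(ai & xi for ai, xi in zip(a, x)) & 1)) for x in points)
        if abs(w) != 2 ** (n // 2):
            return False
    return True

for n in (4, 6):
    for c in product((0, 1), repeat=n // 2):
        assert is_bent(rs_quadratic(c, n), n) == gcd_criterion(c, n)
print("criterion agrees with the Walsh transform for n = 4 and n = 6")
```

For $n = 4$, for instance, $X^4 + 1 = (X + 1)^4$, so the criterion reduces to $P(1) \neq 0$, that is, $c_2 = 1$.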
Infinite classes of bent quadratic RS functions have been deduced:
• $\bigoplus_{j=0}^{n/2-1} x_j x_{n/2+j}$ (and we can add $h(x_0 \oplus x_{n/2}, \dots, x_{n/2-1} \oplus x_{n-1})$ to this Maiorana-
McFarland function, where $h$ is any RS function, as observed in [1065]).
• $\bigoplus_{i=1}^{n/2-1} \left(\bigoplus_{j=0}^{n-1} x_j x_{i+j}\right) \oplus \left(\bigoplus_{j=0}^{n/2-1} x_j x_{n/2+j}\right)$.

These two examples correspond to $c_{n/2} = 1$ and $c_i = 0$ for $i \neq n/2$ in the former case
and $c_i = 1$ for $i = 1, \dots, n/2 - 1$ in the latter case. Note that $L(X)$ equals $X^{2^{n/2}}$ in the
former case and $X + \mathrm{tr}_n(X)$ in the latter case, and these are permutation polynomials since
$n$ is even; equivalently $P(X)$ equals $X^{n/2}$ in the former case and $\sum_{i=1}^{n-1} X^i$ in the latter case,
and these are coprime with $X^n + 1$. More examples can be found, as observed in [245]. For
instance, let $k$ be such that $2^k - 2$ divides $n$ (and $2^k - 1$ is coprime with $n$). Then we have
that the function $\left(\frac{X^{2^k-1}+1}{X+1}\right)^{\frac{n}{2^k-2}} + X^n + 1$ has the form $\sum_{i=1}^{n/2-1} c_i (X^i + X^{n-i}) + c_{n/2} X^{n/2}$
(indeed, $\left(\frac{X^{2^k-1}+1}{X+1}\right)^{\frac{n}{2^k-2}}$ is self-reciprocal, has degree $n$, and is normalized) and is coprime
with $X^n + 1$ (indeed, the zeros of $\frac{X^{2^k-1}+1}{X+1}$ in the algebraic closure of $\mathbb{F}_2$ are the
elements of $\mathbb{F}_{2^k} \setminus \mathbb{F}_2$ and for any $\xi \in \mathbb{F}_{2^k} \setminus \mathbb{F}_2$ we have $\xi^n + 1 \neq 0$, since $\xi \mapsto \xi^n$ is a
permutation of $\mathbb{F}_{2^k}^*$). Taking for example $k = 2$, we have
$$\left(X^2 + X + 1\right)^{n/2} + X^n + 1 = \sum_{\substack{0 \leq u, v, w \leq n/2 \\ u+v+w = n/2,\ 2u+v \notin \{0, n\}}} \frac{(n/2)!}{u!\, v!\, w!}\, X^{2u+v},$$
and for $n$ not divisible by 3, the following function is RS bent:
$$\bigoplus_{\substack{0 \leq u, v, w \leq n/2 \\ u+v+w = n/2,\ 2u+v \in \{1, \dots, n/2-1\}}} \frac{(n/2)!}{u!\, v!\, w!} \left(\bigoplus_{j=0}^{n-1} x_j x_{2u+v+j}\right) \oplus \left(\bigoplus_{j=0}^{n/2-1} x_j x_{n/2+j}\right),$$
where the coefficients are taken modulo 2.


Another example is as follows. If $n$ is a power of 2, then according to [1043, proposition
3.1], the function $\bigoplus_{i=1}^{n/2-1} c_i \left(\bigoplus_{j=0}^{n-1} x_j x_{i+j}\right) \oplus c_{n/2} \left(\bigoplus_{j=0}^{n/2-1} x_j x_{n/2+j}\right)$ is bent if and only if
$\sum_{i=0}^{n-1} c_i = 1$ (with $c_{n-i} = c_i$), that is, $c_{n/2} = 1$. See more in [245].
Quadratic bent idempotents have been also characterized: as shown in [808], for
$c_1, \dots, c_{n/2} \in \mathbb{F}_2$, the function equal to $\sum_{i=1}^{n/2-1} c_i\, \mathrm{tr}_n(x^{2^i+1}) + c_{n/2}\, \mathrm{tr}_{n/2}(x^{2^{n/2}+1})$ is bent if
and only if $\gcd\left(\sum_{i=1}^{n/2-1} c_i (X^i + X^{n-i}) + c_{n/2} X^{n/2},\ X^n + 1\right) = 1$ (and necessarily, $c_{n/2} = 1$).
This condition is the same as that obtained for quadratic RS bent functions above. The
infinite classes of RS bent functions seen above provide the following bent idempotents:
• The bent quadratic monomial idempotent $f'(x) = \mathrm{tr}_{n/2}(x^{2^{n/2}+1})$.
• Functions $f'(x) = \sum_{i=1}^{n/2-1} \mathrm{tr}_n(x^{2^i+1}) + \mathrm{tr}_{n/2}(x^{2^{n/2}+1})$.
• For $n$ a power of 2, all nonzero quadratic idempotents.
• For $n$ not divisible by 3, functions
$$\mathrm{tr}_{n/2}(z^{2^{n/2}+1}) + \sum_{\substack{0 \leq u, v, w \leq n/2 \\ u+v+w = n/2,\ 2u+v \in \{1, \dots, n/2-1\}}} \frac{(n/2)!}{u!\, v!\, w!}\, \mathrm{tr}_n(z^{2^{2u+v}+1}),$$
where the coefficients are taken modulo 2 [245]. Of course, what is written above for RS
functions when $n$ is a power of 2 is valid here.
• More results can be found in [1144].

The similarities between the quadratic RS bent functions and the quadratic bent idempotents
seen above lead to considering below a transformation of RS functions into idempotents.
Before that, let us recall what is known for nonquadratic functions.

Nonquadratic RS functions and idempotents Two infinite classes of cubic RS bent
functions (belonging to the completed Maiorana–McFarland class) are

• $\bigoplus_{i=0}^{n-1} (x_i x_{t+i} x_{n/2+i} \oplus x_i x_{t+i}) \oplus \bigoplus_{i=0}^{n/2-1} x_i x_{n/2+i}$, where $\frac{n/2}{\gcd(n/2,\, t)}$ is odd [531] (and here also
we can of course add $h(x_0 \oplus x_{n/2}, \dots, x_{n/2-1} \oplus x_{n-1})$ to this MM function, where $h$ is
any RS function, [1065]).
• $\bigoplus_{i=0}^{n-1} x_i x_{i+r} x_{i+2r} \oplus \bigoplus_{i=0}^{2r-1} x_i x_{i+2r} x_{i+4r} \oplus \bigoplus_{i=0}^{n/2-1} x_i x_{i+n/2}$, where $n/2 = 3r$ [282].

The Dillon and Kasami power functions with coefficient 1, and the Niho bent functions
$\mathrm{tr}_{n/2}(z^{2^{n/2}+1}) + \mathrm{tr}_n(z^{d_2})$ (see page 221) are bent idempotents. The extension of the second
class of Niho bent functions by Leander and Kholosha gives also a bent idempotent.
For $n = 6r$, $r \geq 1$, $\mathrm{tr}_n(z^{1+2^r+2^{2r}}) + \mathrm{tr}_{2r}(z^{1+2^{2r}+2^{4r}}) + \mathrm{tr}_{3r}(z^{1+2^t}) = \mathrm{tr}_r((z + z^{2^{3r}})^{1+2^r+2^{2r}}) + \mathrm{tr}_{3r}(z^{1+2^t})$,
with $t = n/2 = 3r$, is a bent idempotent [282].

More bent idempotents of any algebraic degrees between 2 and $n/2$ are given in [1075]
in the form $g(x) \oplus h(\mathrm{tr}_n(\alpha x), \mathrm{tr}_n(\alpha^2 x), \dots, \mathrm{tr}_n(\alpha^{2^{n/2-1}} x))$, where $g$ is an $n$-variable bent
function satisfying a strong condition and $h$ is an $n/2$-variable rotation symmetric function.

Remark. The generalized Dillon and Mesnager functions could be viewed as bent
idempotent candidates, but the conditions happen not to be satisfiable: it is known that
• For every $m = n/2$ such that $K_m(1)$ is null, $g_1(x) = \mathrm{tr}_n(x^{r(2^m-1)})$ is bent when
$\gcd(r, 2^m + 1) = 1$.
• For every $m = n/2$ odd such that $K_m(1) = 4$, $g_2(x) = \mathrm{tr}_n(x^{r(2^m-1)}) + \mathrm{tr}_2(x^{\frac{2^n-1}{3}})$ is
bent when $\gcd(r, 2^m + 1) = 1$.

But the condition $K_m(1) = 0$ never happens as shown in [783, theorem 2.2], and it
can be checked by computer that the condition $K_m(1) = 4$ never happens as well for
$5 \leq m \leq 20$.

Other nonquadratic functions A secondary construction of rotation symmetric functions


(and equivalently of idempotent bent functions) from near-bent RS functions (the definition
of near-bent functions is given in Subsection 6.2.4, page 262) based on the indirect sum (see
page 233) is given in [281] (see also [245]): let f1 and f2 be two m-variable RS near-bent
functions (m odd); if the Walsh supports of f1 and f2 are complementary, then function

h(x0 , y1 , x2 , y3 , . . . , xn−2 , yn−1 ) = f1 (x0 , x1 , . . . , xm−1 ) ⊕ f1 (y0 , y1 , . . . , ym−1 )⊕


(f1 ⊕ f2 )(x0 , x1 , . . . , xm−1 )(f1 ⊕ f2 )(y0 , y1 , . . . , ym−1 )

is bent RS. This provides constructions of RS functions and idempotent bent functions
of algebraic degree 4, for $m$ odd: given the two RS functions $f_1(x) = \bigoplus_{i=0}^{m-1} (x_i \oplus x_i x_{(m-1)/2+i})$
and $f_2(x) = \bigoplus_{i=0}^{m-1} x_i x_{1+i}$, where the subscripts are taken modulo $m$,
function $h(x_0, y_1, x_2, y_3, \dots, x_{n-2}, y_{n-1}) = f_1(x_0, \dots, x_{m-1}) \oplus f_1(y_0, \dots, y_{m-1}) \oplus (f_1 \oplus
f_2)(x_0, \dots, x_{m-1})(f_1 \oplus f_2)(y_0, \dots, y_{m-1})$ is an RS bent function. Similarly, given the
$m$-variable idempotent functions $f_1(x) = \mathrm{tr}_m(x) + \mathrm{tr}_m(x^{2^{(m-1)/2}+1})$ and $f_2(x) = \mathrm{tr}_m(x^3)$,
function $h(x, y) = f_1(x) \oplus f_1(y) \oplus (f_1 \oplus f_2)(x)\, (f_1 \oplus f_2)(y)$ is a bent idempotent.

Su and Tang [1054] have proposed, for any even n, constructions of rotation symmetric
bent functions with any possible algebraic degree ranging from 2 to n/2, obtained by
the modification of quadratic symmetric bent functions, and of bent idempotent functions
of algebraic degree n/2, obtained by the modification of the bent quadratic monomial
idempotent (see page 250).

A transformation As observed with quadratic RS functions and idempotents, there is a
natural way of transforming an RS function into an idempotent: let $f(x_0, \dots, x_{n-1}) =
\sum_{u \in \mathbb{F}_2^n} a_u \prod_{i=0}^{n-1} x_i^{u_i}$, $a_u \in \mathbb{F}_2$, be any Boolean RS function over $\mathbb{F}_2^n$, then $f'(x) =
f(x, x^2, \dots, x^{2^{n-1}}) = \sum_{u \in \mathbb{F}_2^n} a_u x^{\sum_{i=0}^{n-1} u_i 2^i}$ is a Boolean idempotent, and any idempotent
Boolean function can be obtained this way. We have seen that if $f$ is a quadratic RS function,
then $f$ is bent if and only if $f'$ is bent.$^{30}$ But for nonquadratic functions, it is shown in
[281, 282] that all cases can happen: examples are given of an infinite class of cubic bent RS
functions $f$ such that $f'$ is not bent, of an infinite class of cubic bent idempotents $f'$ such
that $f$ is not bent, and of infinite classes of bent RS functions $f$ such that $f'$ is bent.

6.1.21 Normal and nonnormal bent Boolean functions


We have seen the definition of normal functions in Definition 28, page 105.
As observed in [212] (see Theorem 14, page 202), if a bent function $f$ is normal (resp.
weakly normal), that is, constant (resp. affine) on an $n/2$-dimensional flat $b + E$, where
$E$ is a subspace of $\mathbb{F}_2^n$, then its dual $\tilde{f}$ is such that $\tilde{f}(u) \oplus b \cdot u$ is constant on $E^\perp$ (resp.
on $a + E^\perp$, where $a$ is a vector such that $f(x) \oplus a \cdot x$ is constant on $E$). Thus, $\tilde{f}$ is weakly
normal. Moreover, we have already seen that $f$ (resp. $f(x) \oplus a \cdot x$) is balanced on each of
the other cosets of the flat.
H. Dobbertin used normal bent functions to construct balanced functions with high nonlinearities: take a bent function f in n variables that is constant on an n/2-dimensional flat A of $\mathbb{F}_2^n$; replace the values of f on A by the values of a highly nonlinear balanced function on A (identified with a function g on $\mathbb{F}_2^{n/2}$); note that this process is recursive, since such an n/2-variable Boolean function g can be obtained by the same process (as long as n/2 is even) with n replaced by n/2; when n becomes odd (say n = 2k + 1), replace the constant value by a balanced function of best-known nonlinearity $nl_{2k+1}$ (larger than or equal to $2^{2k} - 2^k$); this provides a balanced function (as we shall see in Proposition 121, page 296) whose nonlinearity equals
$2^{n-1} - 2^{n/2-1} - \cdots - 2^{2k} + (nl_{2k+1} - 2^{2k}) \ge 2^{n-1} - 2^{n/2-1} - \cdots - 2^{2k} - 2^k$.
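The closed form above can be evaluated numerically. The sketch below is only an illustration: it assumes that one level of the replacement gives $nl_n = 2^{n-1} - 2^{n/2} + nl_{n/2}$ for even n (this is the closed form of the text unrolled one step), and it plugs in the stated lower bound $2^{2k} - 2^k$ at the odd level rather than a best-known value. The function name `dobbertin_lower_bound` is hypothetical.

```python
def dobbertin_lower_bound(n: int) -> int:
    """Lower bound on the nonlinearity given by the recursive replacement.

    Assumption: for even n, nl_n = 2^(n-1) - 2^(n/2) + nl_(n/2), i.e. the
    closed form of the text unrolled one level; for odd n = 2k + 1 the
    stated lower bound 2^(2k) - 2^k is used instead of a best-known value.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n % 2 == 1:
        k = (n - 1) // 2
        return 2 ** (2 * k) - 2 ** k
    return 2 ** (n - 1) - 2 ** (n // 2) + dobbertin_lower_bound(n // 2)


for n in (6, 8, 12):
    print(n, dobbertin_lower_bound(n))
```

For n = 12 the recursion descends through 12, 6, 3 (so k = 1), and the value agrees with $2^{11} - 2^5 - 2^2 - 2^1$.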
The existence of nonnormal (and even non-weakly normal) bent functions, i.e., bent functions that are nonconstant (resp. nonaffine) on every n/2-dimensional flat, has been shown, contradicting a conjecture made by several authors that such bent functions did not exist. It is proved in [448] that the so-called Kasami function defined over $\mathbb{F}_{2^n}$ by $f(x) = tr_n\left(a x^{2^{2k} - 2^k + 1}\right)$, with gcd(k, n) = 1, is bent if n is not divisible by 3 and if $a \in \mathbb{F}_{2^n}$ is not a cube. As shown in [198] (thanks to [412]), if $a \in \mathbb{F}_4 \setminus \mathbb{F}_2$ and k = 3, then for n = 10, the function $f(x) \oplus tr_n(bx)$ is nonnormal for some b, and for n = 14, the function f(x) is not weakly normal (while the Kasami function is normal for n divisible by 4 or k = 1). A nonnormal bent function in 12 variables is given in [278]. Cubic bent functions on eight variables are all normal, as shown in [349].

30 Note that if n ≡ 2 [mod 4], then there exists a self-dual normal basis of $\mathbb{F}_{2^n}$ and that f' expressed over $\mathbb{F}_2^n$ by means of such a basis is then the same function as f; this is also the case if n is odd.
The direct sum (see the definition in Subsection 6.1.16) of two normal functions is
obviously a normal function, while the direct sum of two nonnormal functions can be
normal. What about the sum of a normal bent function and of a nonnormal bent function?
This question has been studied in [270]. To this aim, a notion more general than normality
has been introduced as follows:

Definition 61 Let U ⊆ V be two vector spaces over $\mathbb{F}_2$. Let β : U → $\mathbb{F}_2$ and f : V → $\mathbb{F}_2$ be bent functions. Then we say that f is a normal extension of β, in symbols $\beta \preceq f$, if there is a direct decomposition V = U ⊕ W1 ⊕ W2 such that (i) β(u) = f(u + w1) for all u ∈ U, w1 ∈ W1, and (ii) dim W1 = dim W2.

Obviously, we get a normal extension of any β by taking any normal bent function g and making its direct sum with β. The relation $\preceq$ is transitive and if $\beta \preceq f$ then the same relation exists between the duals: $\tilde\beta \preceq \tilde f$.
A bent function f is normal if and only if $\epsilon \preceq f$, where $\epsilon \in \mathbb{F}_2$ is viewed as a Boolean function over the vector space $\mathbb{F}_2^0 = \{0\}$.
Examples of normal extensions are given in [270] (some by the construction of Theorem
15, page 234, and its particular cases, the indirect sum and the extension of Maiorana–
McFarland type).
The clarification about the sum of a normal bent function and of a nonnormal bent
function comes from the two following propositions (see the proofs in [270]):

Proposition 91 Let fi : Vi → F2, i = 1, 2, be bent functions. The direct sum f1 ⊕ f2 is normal if and only if bent functions β1 and β2 exist such that fi is a normal extension of βi (i = 1, 2) and either β1 and β2 or β1 and β2 ⊕ 1 are linearly equivalent.

Proposition 92 Suppose that $\beta \preceq f$ for bent functions β and f. If f is normal, then β is also normal.

Hence, since the direct sum of a bent function β and of a normal bent function g is a
normal extension of β, the direct sum of a normal and a nonnormal bent function is always
nonnormal.
Normal extension leads to a secondary construction of bent functions:

Proposition 93 Let β be a bent function on U and f a bent function on V = U × W × W. Assume that $\beta \preceq f$. Let
β' : U → F2
be any bent function. Modify f by setting, for all x ∈ U, y ∈ W,
f'(x, y, 0) = β'(x),
while f'(x, y, z) = f(x, y, z) for all x ∈ U, y, z ∈ W, z ≠ 0. Then f' is bent and we have $\beta' \preceq f'$.

Hence, we can replace β by any other bent function on U and get again a normal
extension.

6.1.22 Kerdock codes


For every even n, the Kerdock code Kn [689] is a supercode of RM(1, n) (i.e., con-
tains RM(1, n) as a subset) and is a subcode of RM(2, n). More precisely Kn is a
union of cosets fu ⊕ RM(1, n) of RM(1, n), where the functions fu are quadratic
(one of them is null and all the others have algebraic degree 2). The difference fu ⊕
fv between two distinct functions fu and fv being bent, Kn has minimum distance $2^{n-1} - 2^{n/2-1}$ (n even), which is the best possible minimum distance for a code equal to a union of cosets of RM(1, n), according to the covering radius bound. The size of Kn equals $2^{2n}$. This is the best possible size for such length and minimum distance (see [177, 422]).
The Kerdock code of length 16 is called the Nordstrom–Robinson code. We describe now
how the construction of Kerdock codes can be simply presented.

Construction of the Kerdock code


We revisit Kerdock’s construction, which was presented by means of idempotents, that we
shall not need here. The function, already seen at page 206,

$f(x) = \sigma_2(x) = \binom{w_H(x)}{2} \ [\mathrm{mod}\ 2] = \sum_{1 \le i < j \le n} x_i x_j$    (6.34)

is bent. Thus, the linear code RM(1, n) ∪ (f ⊕ RM(1, n)) has minimum distance $2^{n-1} - 2^{n/2-1}$.
We have recalled at page 41 and foll. and at page 248 some properties of the field $\mathbb{F}_{2^m}$ (where m is any positive integer). In particular, we have seen that $\mathbb{F}_{2^m}$ admits normal bases $(\alpha, \alpha^2, \dots, \alpha^{2^{m-1}})$. If m is odd, there exists a self-dual normal basis, that is, a normal basis such that $tr_m(\alpha^{2^i + 2^j}) = 1$ if i = j (that is, $tr_m(\alpha) = 1$) and $tr_m(\alpha^{2^i + 2^j}) = 0$ otherwise (see [775, 890]). As a consequence, for all $x = x_1 \alpha + \cdots + x_m \alpha^{2^{m-1}}$ in $\mathbb{F}_{2^m}$, we have

$tr_m(x) = \sum_{i=1}^{m} x_i, \qquad tr_m(x^{2^j+1}) = \sum_{i=1}^{m} x_i x_{i+j}$

(where i + j is taken mod m).


The function f of Relation (6.34), viewed as a function $f(x, x_n)$ on $\mathbb{F}_{2^m} \times \mathbb{F}_2$, where m = n − 1 is odd (say m = 2t + 1), can now be written as

$f(x, x_n) = tr_m\Big(\sum_{j=1}^{t} x^{2^j+1}\Big) + x_n\, tr_m(x)$,

and this expression can be taken as the definition of f. Notice that the symplectic form $\beta_f((x, x_n), (y, y_n))$ associated to f equals $tr_m(x)\,tr_m(y) + tr_m(xy) + x_n\, tr_m(y) + y_n\, tr_m(x)$.
Let us denote f (ux, xn ) by fu (x, xn ) (u ∈ F2m ), then Kn is defined as the union, when u
ranges over F2m , of the cosets fu + RM(1, n).
Kn contains all $2^{n+1}$ affine functions (since for u = 0, we have $f_u = 0$) and $2^{2n} - 2^{n+1}$ quadratic bent functions. Its minimum distance equals $2^{n-1} - 2^{n/2-1}$ since the sum of two distinct functions $f_u$ and $f_v$ is bent. Indeed, the kernel of the associated symplectic form equals the set of all ordered pairs $(x, x_n)$ such that
$tr_m(ux)\,tr_m(uy) + tr_m(u^2xy) + x_n\,tr_m(uy) + y_n\,tr_m(ux) = tr_m(vx)\,tr_m(vy) + tr_m(v^2xy) + x_n\,tr_m(vy) + y_n\,tr_m(vx)$
for every $(y, y_n) \in \mathbb{F}_{2^m} \times \mathbb{F}_2$, which is equivalent to
$u\,tr_m(ux) + u^2x + x_n u = v\,tr_m(vx) + v^2x + x_n v$ and $tr_m(ux) = tr_m(vx)$;
it is a simple matter to show that it equals {(0, 0)}.
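The bentness of $f_u \oplus f_v$ can be checked exhaustively for small parameters. The following is a minimal sketch, assuming a polynomial-basis representation of $\mathbb{F}_{2^m}$ (elements as integers, reduction by an irreducible polynomial passed as `poly`) and the usual bitwise inner product for the Walsh transform; bentness does not depend on these choices. All helper names are hypothetical.

```python
from itertools import combinations


def gf_mul(a, b, m, poly):
    """Multiply a and b in F_{2^m} (polynomial basis, reduction by poly)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return r


def trace(x, m, poly):
    """Absolute trace x + x^2 + ... + x^(2^(m-1)); the result is 0 or 1."""
    t, y = 0, x
    for _ in range(m):
        t ^= y
        y = gf_mul(y, y, m, poly)
    return t


def f_u(u, m, poly):
    """Truth table of f_u(x, x_n) = tr_m(sum_{j=1}^t (ux)^(2^j+1)) + x_n tr_m(ux)."""
    t = (m - 1) // 2
    table = []
    for p in range(2 ** (m + 1)):
        x, xn = p & (2 ** m - 1), p >> m
        z = gf_mul(u, x, m, poly)
        s, zq = 0, z
        for _ in range(t):
            zq = gf_mul(zq, zq, m, poly)      # z^(2^j)
            s ^= gf_mul(zq, z, m, poly)       # adds z^(2^j + 1)
        table.append(trace(s, m, poly) ^ (xn & trace(z, m, poly)))
    return table


def walsh(tt):
    """Walsh transform of a truth table via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def is_bent(tt):
    n = len(tt).bit_length() - 1
    return all(abs(v) == 2 ** (n // 2) for v in walsh(tt))


def kerdock_pairs_bent(m, poly):
    """True if f_u + f_v is bent for all distinct u, v in F_{2^m}."""
    tables = [f_u(u, m, poly) for u in range(2 ** m)]
    return all(
        is_bent([a ^ b for a, b in zip(tables[u], tables[v])])
        for u, v in combinations(range(2 ** m), 2)
    )


print(kerdock_pairs_bent(3, 0b1011))  # F_8 built with x^3 + x + 1, so n = 4
```

Since $f_0 = 0$, the pairs (0, v) also confirm that each $f_v$ with v ≠ 0 is itself bent.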
A more general approach to the construction of Kerdock codes is developed in [327].

Open problem: Other examples of codes having the same parameters exist; see [657] (see
also [658] and observations in [72, 208, 217]). All are equal to subcodes of the Reed–
Muller code of order 2, up to affine equivalence. We do not know how to obtain the same
parameters with nonquadratic functions (up to code equivalence). This would be useful for
cryptographic purposes and for the design of sequences for code division multiple access
(CDMA) in telecommunications.

Remark. The Kerdock codes are not linear. However, they share some nice properties with
linear codes: the distance distribution between any codeword and all the other codewords
does not depend on the choice of the codeword (we say that the Kerdock codes are distance
invariant; this results in the fact that their distance enumerators are equal to their weight
enumerators); and, as proved by Semakov and Zinoviev [1029], the weight enumerators of
the Kerdock codes satisfy a MacWilliams-like relation, similar to Relation (1.1), page 14, in
which C is replaced by Kn and C ⊥ is replaced by the so-called Preparata code [43] of the
same length (we say that the Kerdock codes and the Preparata codes are formally dual). An
explanation of this astonishing property has been given in [586]: the Kerdock code is stable
under an addition inherited from the addition in Z4 = Z/4Z (we say it is Z4-linear), and the
MacWilliams identity still holds in this different framework. Such an explanation had been
an open problem for two decades.

6.2 Partially-bent and plateaued Boolean functions


We have seen that bent Boolean functions can never be balanced, which makes them unsuitable for direct cryptographic use. This has led to research on superclasses of the class of bent functions, whose elements can have high nonlinearities, but can also be balanced (see footnote 31) and possibly be resilient.

31 The functions found will, however, still have bounded algebraic degree, which is cryptographically crippling
in many situations.

6.2.1 Partially-bent functions


A first superclass of possibly balanced functions with high nonlinearity has been obtained
as the set of those functions that achieve a bound conjectured by B. Preneel in [969] and
expressing some trade-off between the number of unbalanced derivatives (i.e., of nonzero
autocorrelation coefficients) of a Boolean function and the number of nonzero values of its
Walsh transform.

Proposition 94 [211] Let n be any positive integer. Let f be any Boolean function on $\mathbb{F}_2^n$. Let us denote the cardinalities of the sets $\{b \in \mathbb{F}_2^n \mid \mathcal{F}(D_b f) \neq 0\}$ and $\{a \in \mathbb{F}_2^n \mid W_f(a) \neq 0\}$ by $N_{\Delta_f}$ and $N_{W_f}$, respectively. Then:

$N_{\Delta_f} \times N_{W_f} \ge 2^n$.    (6.35)

Moreover, $N_{\Delta_f} \times N_{W_f} = 2^n$ if and only if, for every $b \in \mathbb{F}_2^n$, the derivative $D_b f$ is either balanced or constant. This property is also equivalent to the fact that there exist two linear subspaces E (of even dimension) and E' of $\mathbb{F}_2^n$, whose direct sum equals $\mathbb{F}_2^n$, and Boolean functions g, bent on E, and h, affine on E', such that

∀x ∈ E, ∀y ∈ E', f(x + y) = g(x) ⊕ h(y).    (6.36)

Inequality (6.35) comes directly from the Wiener–Khintchine relation (2.53), page 62: since the value of the autocorrelation coefficient $\mathcal{F}(D_b f)$ lies between $-2^n$ and $2^n$ for every $b \in \mathbb{F}_2^n$, the arithmetic mean of $(-1)^{u \cdot b} \mathcal{F}(D_b f)$ when b ranges over the set $\{b \in \mathbb{F}_2^n \mid \mathcal{F}(D_b f) \neq 0\}$ is at most $2^n$, for every $u \in \mathbb{F}_2^n$, and we have then

$N_{\Delta_f} \ge 2^{-n} \sum_{b \in \mathbb{F}_2^n} (-1)^{u \cdot b} \mathcal{F}(D_b f) = 2^{-n} W_f^2(u)$, and thus $N_{\Delta_f} \ge 2^{-n} \max_{u \in \mathbb{F}_2^n} W_f^2(u)$.

Moreover, we have

$N_{W_f} \ge \dfrac{\sum_{u \in \mathbb{F}_2^n} W_f^2(u)}{\max_{u \in \mathbb{F}_2^n} W_f^2(u)} = \dfrac{2^{2n}}{\max_{u \in \mathbb{F}_2^n} W_f^2(u)}$.

This proves Inequality (6.35).
This inequality is an equality if and only if both inequalities above are equalities, that is, for every $b \in \mathbb{F}_2^n$, the autocorrelation coefficient $\mathcal{F}(D_b f)$ equals 0 or $2^n (-1)^{u_0 \cdot b}$, where $\max_{u \in \mathbb{F}_2^n} W_f^2(u) = W_f^2(u_0)$ (and this implies that, for every $b \in \mathbb{F}_2^n$, $D_b f$ is either balanced or constant) and f is plateaued (see page 258).
The single condition that $D_b f$ is either balanced or constant for every b implies that f has the form (6.36). Indeed, let E be any supplementary space of the linear kernel $E_f$; then, E having trivial intersection with $E_f$, the restriction of f to E has balanced derivatives (their balancedness over E being equivalent to their balancedness over $\mathbb{F}_2^n$) and is then bent, and f has the form (6.36) with $E' = E_f$. Then it is easily seen that (6.35) is an equality. This completes the proof.
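Inequality (6.35) is easy to test on small truth tables. The sketch below assumes functions are given as lists of 0/1 values indexed by the integer whose bits are the input coordinates; the helper names and the two sample functions (a quadratic function in four variables and the cubic monomial in three variables) are chosen here for illustration only.

```python
def walsh(tt):
    """Walsh transform of a truth table via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def autocorrelation(tt):
    """F(D_b f) = sum_x (-1)^(f(x) + f(x + b)) for every b."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ tt[x ^ b]) for x in range(size))
            for b in range(size)]


def preneel_counts(tt):
    """Return (number of nonzero autocorrelation values, Walsh support size)."""
    n_delta = sum(1 for v in autocorrelation(tt) if v != 0)
    n_walsh = sum(1 for v in walsh(tt) if v != 0)
    return n_delta, n_walsh


# x1 x2 + x3 in 4 variables (quadratic, hence partially-bent)
quadratic = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(16)]
# x1 x2 x3 in 3 variables (not partially-bent)
cubic = [1 if x == 7 else 0 for x in range(8)]

for name, tt in (("quadratic", quadratic), ("cubic", cubic)):
    n_delta, n_walsh = preneel_counts(tt)
    print(name, n_delta, n_walsh, n_delta * n_walsh >= len(tt))
```

The quadratic example meets the bound with equality, while the cubic one exceeds it strictly.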
See some more properties in [338].
A generalization of Relation (6.35) to pseudo-Boolean functions has been obtained
in [986].

Definition 62 The n-variable Boolean functions such that (6.35) is an equality, that is,
whose derivatives are all either balanced or constant, that is, the functions of the form
(6.36), are called partially-bent functions.

Bounds similar to Relation (6.35) but different are obtained in [1178] and lead to other
characterizations of partially-bent functions.
Every quadratic function is partially-bent. Partially-bent functions share with quadratic functions almost all of their nice properties (the Walsh spectrum is easier to calculate, they have potential good nonlinearity, and good resiliency order); see [211], where the cryptographic properties of partially-bent functions are characterized. In particular, the values of the Walsh transform equal 0 or $\pm 2^{\dim(E') + \dim(E)/2}$. The support of such a plateaued function is a coset (i.e., a translate) of E. Note that, viewing a function of the form (6.36) as a bivariate function, its Walsh transform equals $W_f(u, v) = W_g(u) W_h(v)$.
Instead of using Relation (2.53), we can use Relations (3.7), page 97, and (3.10), page 98. We have then

$N_{\Delta_f} \ge 2^{-2n} \sum_{b \in \mathbb{F}_2^n} \mathcal{F}^2(D_b f) = 2^{-2n} V(f)$ and $N_{W_f} \le \dfrac{\sum_{u \in \mathbb{F}_2^n} W_f^4(u)}{\min\{W_f^4(u);\ u \in \mathbb{F}_2^n,\ W_f(u) \neq 0\}} = \dfrac{2^n V(f)}{\min\{W_f^4(u);\ u \in \mathbb{F}_2^n,\ W_f(u) \neq 0\}}$,

and therefore:

Proposition 95 Let n be any positive integer. Let f be any Boolean function on $\mathbb{F}_2^n$. With the same notation as in Proposition 94, we have

$\dfrac{N_{\Delta_f}}{N_{W_f}} \ge 2^{-3n} \min\{W_f^4(u);\ u \in \mathbb{F}_2^n,\ W_f(u) \neq 0\}$,

with equality if and only if f is partially-bent.

We can also use Relations (3.9) and (3.10). Denoting by $N^{(2)}_{\Delta_f}$ the size of the set $\{(a, b) \in (\mathbb{F}_2^n)^2 \mid \mathcal{F}(D_a D_b f) \neq 0\}$, we have then

$N^{(2)}_{\Delta_f} \ge 2^{-n} \sum_{a, b \in \mathbb{F}_2^n} \mathcal{F}(D_a D_b f) = 2^{-n} V(f)$ and $N_{W_f} \le \dfrac{2^n V(f)}{\min\{W_f^4(u);\ u \in \mathbb{F}_2^n,\ W_f(u) \neq 0\}}$,

and therefore

$\dfrac{N^{(2)}_{\Delta_f}}{N_{W_f}} \ge 2^{-2n} \min\{W_f^4(u);\ u \in \mathbb{F}_2^n,\ W_f(u) \neq 0\}$,    (6.37)

with equality if and only if both inequalities are equalities, which is equivalent to the fact that all second-order derivatives of f are either balanced or equal to the constant function 0 and that f is plateaued. We leave open the determination of such functions.
The functions achieving (6.37) with equality seem somewhat related to the so-called second-order bent functions introduced in [275], which are by definition those Boolean functions such that, for every pair of $\mathbb{F}_2$-linearly independent elements $a, b \in \mathbb{F}_2^n$ (i.e., $a \neq 0_n$, $b \neq 0_n$, $a \neq b$), $D_a D_b f$ is balanced (which is a more demanding condition on the second-order derivatives but does not require that f be plateaued). In fact, there is no intersection between the two sets of functions, because no second-order bent function can be plateaued. Indeed, it is shown in [275] that f is second-order bent if and only if, for all $b, c \in \mathbb{F}_2^n$, the sum $\sum_{u \in \mathbb{F}_2^n} W_f(u + b + c) W_f(u + b) W_f(u + c) W_f(u)$ equals

$-2^{2n+1}$ if $b \neq 0_n$, $c \neq 0_n$, $b \neq c$,
$3 \cdot 2^{3n} - 2^{2n+1}$ if $b = c = 0_n$,
$2^{3n} - 2^{2n+1}$ otherwise.

Then taking $b = c = 0_n$, we see that if f is plateaued, its amplitude (see Definition 63) must divide $2^{\lfloor (2n+1)/4 \rfloor}$ and therefore must divide $2^{(n-1)/2}$ (since n is odd; see below) and the size of the support of the Walsh transform of f is then a multiple of $3 \cdot 2^{n+2} - 2^3$, which is impossible since it cannot be larger than $2^n$.
The only known second-order bent functions are the 3-variable functions equal to x1 x2 x3
plus a quadratic function. It is shown in [275] that second-order bent n-variable functions
can exist only if n ≡ 3 [mod 4], and the existence of such functions in more than three
variables is an open question.

Remark. Partially-bent functions must not be mistaken for partial bent functions, studied by P. Guillot in [578]. By definition, the Fourier–Hadamard transforms of partial bent functions take exactly two values (see footnote 32) λ and $\lambda + 2^{n/2}$ on $\mathbb{F}_2^n \setminus \{0_n\}$ (n even). Rothaus’ bound on the degree generalizes to partial bent functions. The dual $\tilde f$ of f, defined by $\tilde f(u) = 0$ if $\hat f(u) = \lambda$ and $\tilde f(u) = 1$ if $\hat f(u) = \lambda + 2^{n/2}$, is also partial bent; and its dual is f. Two kinds of partial bent functions f exist: those such that $\hat f(0_n) - f(0_n) = -\lambda (2^{n/2} - 1)$, and those such that $\hat f(0_n) - f(0_n) = (2^{n/2} - \lambda)(2^{n/2} + 1)$. This can be deduced from Parseval’s relation (2.47). The sum of two partial bent functions of the same kind, whose supports share at most the zero vector, is partial bent. One interest of partial bent functions lies in the possibility of using them as building blocks for constructing bent functions.

6.2.2 Plateaued Boolean functions


In spite of their good properties, partially-bent functions, when they are not bent, have by
definition nonzero linear structures and so do not give full satisfaction. The class of plateaued
functions, already encountered above in Section 3.1 (and sometimes called three-valued
functions), is a natural extension of that of partially-bent functions. They were first studied by Zheng and Zhang in [1173, 1174, 1176] and more recently in [247, 317, 858, 1178].

Definition 63 A function is called plateaued if its Walsh transform takes at most one
nonzero absolute value λ, that is, takes at most three values 0 and ±λ (where λ is some
positive integer, which we call the amplitude of the plateaued function).

Because of Parseval’s relation (2.47), the amplitude λ of any plateaued function must be of the form $2^j$, where $j \ge n/2$ (since $N_{W_f} \le 2^n$). Then some authors call f a (2j − n)-plateaued function (i.e., call r-plateaued the plateaued functions of amplitude $2^{(n+r)/2}$), and bent functions are 0-plateaued, near-bent functions are 1-plateaued, and semi-bent functions in even dimension are 2-plateaued. According to Parseval’s relation (2.47), a plateaued function is bent if and only if its Walsh transform never takes the value 0. The Walsh spectrum of a plateaued function of amplitude $\lambda = 2^j$ is (thanks to Parseval’s and inverse Walsh transform formulae)

Walsh transform value    Frequency
$0$                      $2^n - 2^{2n-2j}$
$2^j$                    $2^{2n-2j-1} + (-1)^{f(0_n)} 2^{n-j-1}$
$-2^j$                   $2^{2n-2j-1} - (-1)^{f(0_n)} 2^{n-j-1}$

and we have $\sum_{a \in \mathbb{F}_2^n} W_f^3(a) = (-1)^{f(0_n)} 2^{n+2j}$ and $\sum_{a \in \mathbb{F}_2^n} W_f^4(a) = 2^{2n+2j}$.

32 Partial bent functions are the indicators of partial difference sets.
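The frequencies in this table can be compared with an actual Walsh spectrum. The sketch below assumes truth tables indexed by the integer encoding of the input; the sample function $x_1x_2 \oplus x_3x_4$ viewed in five variables (so n = 5, j = 3) and the helper names are illustrative choices, not taken from the text.

```python
from collections import Counter


def walsh(tt):
    """Walsh transform of a truth table via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def spectrum_frequencies(tt):
    """Count how often each Walsh value occurs."""
    return Counter(walsh(tt))


def predicted_frequencies(n, j, f0):
    """Frequencies from the table for amplitude 2^j and f(0_n) = f0."""
    sign = (-1) ** f0
    return {
        0: 2 ** n - 2 ** (2 * n - 2 * j),
        2 ** j: 2 ** (2 * n - 2 * j - 1) + sign * 2 ** (n - j - 1),
        -(2 ** j): 2 ** (2 * n - 2 * j - 1) - sign * 2 ** (n - j - 1),
    }


# x1 x2 + x3 x4 as a function of 5 variables: plateaued with amplitude 2^3
tt = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
      for x in range(32)]
observed = spectrum_frequencies(tt)
expected = predicted_frequencies(n=5, j=3, f0=tt[0])
print(all(observed[v] == c for v, c in expected.items()))
```

Using a `Counter` means that a value absent from the spectrum is reported with frequency 0, which is convenient for bent functions, where the first row of the table vanishes.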
The characterization of bent functions by difference sets has been extended in [918] to a
characterization of plateaued functions by so-called one-and-half difference sets.
Of course, an n-variable Boolean function f is plateaued with amplitude λ if and only if its Walsh transform satisfies $W_f^2 = \lambda^2\, 1_{supp(W_f)}$, where $supp(W_f)$ is the Walsh support of f and $1_{supp(W_f)}$ is its indicator. Since the autocorrelation function $\Delta_f$ has $W_f^2$ for Fourier transform, partially-bent functions are then those plateaued functions whose Walsh support is an affine subspace of $\mathbb{F}_2^n$. Indeed, this condition is necessary and it is also sufficient since Relation (6.35) is then an equality because $N_{\Delta_f}$ equals then the size of the dual of the vector space equal to the direction of $supp(W_f)$, and it equals then $2^n / N_{W_f}$.
Note that, according to Parseval’s relation, for every n-variable Boolean function f, we have $N_{W_f} \times \max_{a \in \mathbb{F}_2^n} W_f^2(a) \ge 2^{2n}$ and therefore, according to Relation (3.1), page 79, $nl(f) \le 2^{n-1}\left(1 - \dfrac{1}{\sqrt{N_{W_f}}}\right)$. Equality is achieved if and only if f is plateaued.
According to Theorem 2, page 63, we have:

Proposition 96 The algebraic degree of any n-variable plateaued function is bounded above by n − j + 1, where $\lambda = 2^j$ is the amplitude of f, and therefore by n/2 + 1 if n is even (and by n/2 in the particular case of bent functions), and by (n + 1)/2 if n is odd.

Note that the second part of the remark at page 67 gives additional information on the
ANF of plateaued functions.
Proposition 96 makes all plateaued functions weak against fast algebraic and Rønjom–
Helleseth attacks on stream ciphers. The class of plateaued functions contains those
functions that achieve the best possible trade-offs among resiliency, nonlinearity, and
algebraic degree: the order of resiliency and the nonlinearity of any Boolean function
are bounded by Sarkar et al.’s bound (see Chapter 7 below), and the best compromise
between those two criteria is achieved by plateaued functions only; the third criterion –
the algebraic degree – is then also optimal. Other properties of plateaued functions can be
found in [191, 692].

6.2.3 Characterizations of plateaued Boolean functions


A few characterizations of plateaued functions are given in [1173] for Boolean functions,
which are direct consequences of the definition. Plateaued functions have been more recently
characterized by their derivatives, their autocorrelation functions, and power moments of
their Walsh transforms.

Characterization by means of the derivatives


Proposition 97 [317] A Boolean function f on $\mathbb{F}_2^n$ is plateaued if and only if there exists λ ∈ N such that, for every $x \in \mathbb{F}_2^n$,

$\sum_{a, b \in \mathbb{F}_2^n} (-1)^{D_a D_b f(x)} = \lambda^2$.    (6.38)

λ is then the amplitude of the plateaued function.

The proof is very similar to that of Proposition 6.1, page 193. A function f is plateaued with amplitude λ if and only if, for every $u \in \mathbb{F}_2^n$, we have $W_f(u)\left(W_f^2(u) - \lambda^2\right) = 0$, that is, $W_f^3(u) - \lambda^2 W_f(u) = 0$. Applying the Fourier–Hadamard transform to both terms of this equality and using Relations (2.42), page 59, and (2.44) iterated (with three functions), page 60, we see that this is equivalent to the fact that, for every $a \in \mathbb{F}_2^n$, we have

$\sum_{x, y \in \mathbb{F}_2^n} (-1)^{f(x) \oplus f(y) \oplus f(x+y+a)} = \lambda^2 (-1)^{f(a)}$,

and this completes the proof (after moving $(-1)^{f(a)}$ to the other side and changing x, y, a into x + a, x + b, x).
The fact that quadratic functions are plateaued is a direct consequence of Proposition
97, since their second-order derivatives are constant; and Proposition 97 gives more insight
on the relationship between the nonlinearity of a quadratic function and the number of its
nonzero second-order derivatives.
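Relation (6.38) can be checked by brute force for small n. The sketch below computes the sum over all pairs (a, b) for each x; the helper name and the two sample functions are assumptions made for this illustration. For the quadratic example the second-order derivatives do not depend on x, so the sum is the same at every x.

```python
def second_derivative_sums(tt):
    """For each x, return sum over (a, b) of (-1)^(D_a D_b f(x))."""
    size = len(tt)
    return [
        sum((-1) ** (tt[x] ^ tt[x ^ a] ^ tt[x ^ b] ^ tt[x ^ a ^ b])
            for a in range(size) for b in range(size))
        for x in range(size)
    ]


# x1 x2 + x3 x4 in 5 variables: quadratic, plateaued of amplitude 8
quadratic = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
             for x in range(32)]
# x1 x2 x3 in 3 variables: not plateaued
cubic = [1 if x == 7 else 0 for x in range(8)]

print(set(second_derivative_sums(quadratic)))   # a single value, lambda^2
print(len(set(second_derivative_sums(cubic))))  # more than one value
```

The cost is $2^{3n}$ evaluations, so this direct approach is only meant for very small n.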

Characterization by means of the autocorrelation function


A Boolean function f being plateaued of amplitude λ if and only if the functions $W_f^2 \times W_f^2$ and $\lambda^2 W_f^2$ are equal, applying the Fourier transform to both functions, and using the formula $\widehat{\varphi \times \psi} = 2^{-n}\, \hat\varphi \otimes \hat\psi$ with $\varphi = \psi = W_f^2$, where ⊗ denotes the convolutional product, gives

Proposition 98 [247] Let n be any positive integer and f any Boolean function. Let $\Delta_f(a) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus f(x+a)}$ be the autocorrelation function of f. Then f is plateaued of amplitude λ if and only if, for every $x \in \mathbb{F}_2^n$:

$\sum_{a \in \mathbb{F}_2^n} \Delta_f(a)\, \Delta_f(a + x) = \lambda^2 \Delta_f(x)$.

Characterization by means of power moments of the Walsh transform



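The sum $\sum_{a, b \in \mathbb{F}_2^n} (-1)^{D_a D_b f(x)}$ in Proposition 97 equals

$2^{-n} \sum_{a, b, c, w \in \mathbb{F}_2^n} (-1)^{f(x) \oplus f(a) \oplus f(b) \oplus f(c) \oplus w \cdot (x+a+b+c)}$.

Let us apply the Fourier transform to this real-valued function of x and use that any function of x is constant if and only if its Fourier transform is null at every nonzero vector α. We deduce that f is plateaued if and only if, for every nonzero $\alpha \in \mathbb{F}_2^n$, the sum $\sum_{x, a, b, c, w \in \mathbb{F}_2^n} (-1)^{f(x) \oplus f(a) \oplus f(b) \oplus f(c) \oplus w \cdot (x+a+b+c) \oplus \alpha \cdot x}$ is null. This latter sum equals:

$\sum_{w \in \mathbb{F}_2^n} \Big(\sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus (w+\alpha) \cdot x}\Big) \Big(\sum_{a \in \mathbb{F}_2^n} (-1)^{f(a) \oplus w \cdot a}\Big) \Big(\sum_{b \in \mathbb{F}_2^n} (-1)^{f(b) \oplus w \cdot b}\Big) \Big(\sum_{c \in \mathbb{F}_2^n} (-1)^{f(c) \oplus w \cdot c}\Big)$.

We deduce:

Proposition 99 [247] Any n-variable Boolean function f is plateaued if and only if, for every nonzero $\alpha \in \mathbb{F}_2^n$, we have

$\sum_{w \in \mathbb{F}_2^n} W_f(w + \alpha)\, W_f^3(w) = 0$.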

Another characterization of plateaued functions by means of the Walsh transform exists.
For a plateaued Boolean function of amplitude λ, we have, using Parseval’s relation, that $\sum_{a \in \mathbb{F}_2^n} W_f^4(a) = 2^{2n} \lambda^2$. We also have, for every $b \in \mathbb{F}_2^n$, that $\sum_{a \in \mathbb{F}_2^n} (-1)^{a \cdot b} W_f^3(a) = \lambda^2 \sum_{a \in \mathbb{F}_2^n} (-1)^{a \cdot b} W_f(a) = \lambda^2 2^n (-1)^{f(b)}$. A necessary condition for f to be plateaued is then that, for every $b \in \mathbb{F}_2^n$, $\sum_{a \in \mathbb{F}_2^n} W_f^4(a) = 2^n (-1)^{f(b)} \sum_{a \in \mathbb{F}_2^n} (-1)^{a \cdot b} W_f^3(a)$. Conversely, if this property is satisfied by f, then the function $b \in \mathbb{F}_2^n \mapsto (-1)^{f(b)} \sum_{a \in \mathbb{F}_2^n} (-1)^{a \cdot b} W_f^3(a)$ is constant. Then the Fourier transform of this function, that is, the function that maps every $\alpha \in \mathbb{F}_2^n$ to the sum $\sum_{b \in \mathbb{F}_2^n} \sum_{a \in \mathbb{F}_2^n} (-1)^{(a+\alpha) \cdot b \oplus f(b)} W_f^3(a) = \sum_{a \in \mathbb{F}_2^n} W_f(a + \alpha) W_f^3(a)$, is null at every nonzero α, and f is plateaued, according to Proposition 99:

Corollary 17 [247] Any n-variable Boolean function f is plateaued if and only if, for every $b \in \mathbb{F}_2^n$:

$\sum_{a \in \mathbb{F}_2^n} W_f^4(a) = 2^n (-1)^{f(b)} \sum_{a \in \mathbb{F}_2^n} (-1)^{a \cdot b} W_f^3(a)$.

More characterizations exist. An obvious one is that, for every positive integer k, an n-variable Boolean function f is plateaued if and only if there exists ν ∈ Z such that we have $\sum_{a \in \mathbb{F}_2^n} W_f^{2k}(a) \left(W_f^2(a) - \nu\right)^2 = 0$ (ν equals then the square of the amplitude of the plateaued function). This nonnegative expression of degree 2 in ν can be written $\sum_{a \in \mathbb{F}_2^n} W_f^{2k+4}(a) - 2\nu \sum_{a \in \mathbb{F}_2^n} W_f^{2k+2}(a) + \nu^2 \sum_{a \in \mathbb{F}_2^n} W_f^{2k}(a)$; hence, the reduced discriminant $\Big(\sum_{a \in \mathbb{F}_2^n} W_f^{2k+2}(a)\Big)^2 - \Big(\sum_{a \in \mathbb{F}_2^n} W_f^{2k+4}(a)\Big) \Big(\sum_{a \in \mathbb{F}_2^n} W_f^{2k}(a)\Big)$ is nonpositive and is null if and only if f is plateaued. We deduce (see more in [247]):

Proposition 100 [247, 858] For every n-variable Boolean function f and every $k \in \mathbb{N}^*$, we have

$\Big(\sum_{a \in \mathbb{F}_2^n} W_f^{2k+2}(a)\Big)^2 \le \Big(\sum_{a \in \mathbb{F}_2^n} W_f^{2k}(a)\Big) \Big(\sum_{a \in \mathbb{F}_2^n} W_f^{2k+4}(a)\Big)$,

with equality if and only if f is plateaued.
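The inequality of Proposition 100 lends itself to a direct numerical check on power moments of the Walsh transform. The sketch below is an illustration under the same truth-table convention as before; `moment` and `moment_gap` are hypothetical helper names, and the gap is defined as right-hand side minus left-hand side, so it is zero exactly for plateaued functions.

```python
def walsh(tt):
    """Walsh transform of a truth table via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def moment(tt, p):
    """Power moment sum_a W_f(a)^p."""
    return sum(v ** p for v in walsh(tt))


def moment_gap(tt, k=1):
    """(sum W^2k)(sum W^(2k+4)) - (sum W^(2k+2))^2, nonnegative by Prop. 100."""
    return moment(tt, 2 * k) * moment(tt, 2 * k + 4) - moment(tt, 2 * k + 2) ** 2


# x1 x2 + x3 in 4 variables (plateaued) and x1 x2 x3 in 3 variables (not plateaued)
quadratic = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(16)]
cubic = [1 if x == 7 else 0 for x in range(8)]

for k in (1, 2):
    print(k, moment_gap(quadratic, k), moment_gap(cubic, k) > 0)
```

Python integers are unbounded, so the high powers involved do not overflow.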



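Table 6.2 Weight distribution of $C_f$ for f plateaued of amplitude λ.

Hamming weight w            Multiplicity $A_w$
$0$                         $1$
$2^{n-1}$                   $2^{n+1} - \dfrac{2^{2n}}{\lambda^2} - 1$
$2^{n-1} - \dfrac{\lambda}{2}$      $\dfrac{2^{2n-1}}{\lambda^2} + (-1)^{f(0_n)} \dfrac{2^{n-1}}{\lambda}$
$2^{n-1} + \dfrac{\lambda}{2}$      $\dfrac{2^{2n-1}}{\lambda^2} - (-1)^{f(0_n)} \dfrac{2^{n-1}}{\lambda}$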

Characterization by means of codes


Any plateaued Boolean function f , viewed as a (vectorial) (n, 1)-function, can be related to
the code Cf seen at page 160, which has then the weight distribution given by Table 6.2.
See in [874] examples of such codes.
Langevin proved in [738] that, if f is a plateaued function, then the coset f ⊕RM(1, n) of
the Reed–Muller code of order 1 is an orphan of RM(1, n). The notion of orphan has been
introduced in [599] (with the term “urcoset” instead of orphan) and studied in [137]. A coset
of RM(1, n) is an orphan if it is maximum with respect to the following partial order relation:
g ⊕ RM(1, n) is smaller than f ⊕ RM(1, n) if there exists in g ⊕ RM(1, n) an element g1
of Hamming weight nl(g) (that is, of minimum Hamming weight in g ⊕ RM(1, n)), and in
f ⊕ RM(1, n) an element f1 of Hamming weight nl(f ), such that supp(g1 ) ⊆ supp(f1 ).
Clearly, if f is a function of maximum nonlinearity, then f ⊕ RM(1, n) is an orphan
of RM(1, n) (the converse is false, since plateaued functions with nonoptimal nonlinearity
exist). The notion of orphan can be used in algorithms searching for functions with high
nonlinearities.

6.2.4 The subclasses of semi-bent and near-bent functions


– Recall that for n odd, near-bent functions (also called semi-bent functions) are those plateaued functions of amplitude $2^{(n+1)/2}$. In [191], the authors observed that the class of so-called three-valued almost optimal functions, such that the coset f ⊕ RM(1, n) takes exactly three weights and whose nonlinearity is at least $2^{n-1} - 2^{(n-1)/2}$, coincides with that of near-bent functions (such functions are plateaued because there are three weights and because the coset is stable under complementation, and the amplitude of such plateaued functions is minimal). Parseval’s identity shows that the support of their Walsh transform has cardinality $2^{n-1}$. Other properties have been shown in [1121] in connection with the theory of cyclic codes and in [427] in connection with that of designs.
According to the properties seen in Section 5.2, page 170, quadratic Boolean functions
are near-bent if and only if their linear kernel has dimension 1, that is, their rank equals
n − 1.
Several constructions of quadratic near-bent functions exist; see a survey in [859]. All the component functions of almost bent (n, n)-functions (see Subsection 11.3, page 371) are near-bent, by definition, and the restriction of any bent Boolean function to an affine hyperplane is near-bent (the restrictions to an affine hyperplane and to its complement have complementary Walsh supports and conversely such a pair of near-bent functions arises from a bent function).
In [453], Ding extended Proposition 68, page 195, to near-bent functions: any n-variable Boolean function f such that $f(0_n) = 0$ is near-bent if and only if the dimension of the linear code $C_{supp(f)}$, whose generator matrix has for columns the vectors of supp(f), equals n, and has weight distribution given by Table 6.3. This provides codes with three weights.

Table 6.3 Weight distribution of the code $C_{supp(f)}$ for f near-bent such that $f(0_n) = 0$.

Hamming weight                        Multiplicity
$0$                                   $1$
$\dfrac{w_H(f) - 2^{(n-1)/2}}{2}$     $w_H(f)\left[1 - 2^{-n} w_H(f) - 2^{-(n+1)/2}\right]$
$\dfrac{w_H(f)}{2}$                   $2^n - 1 - w_H(f)\left(2^n - w_H(f)\right) 2^{-(n-1)}$
$\dfrac{w_H(f) + 2^{(n-1)/2}}{2}$     $w_H(f)\left[1 - 2^{-n} w_H(f) + 2^{-(n+1)/2}\right]$
– For n even, as also seen at page 178, semi-bent functions are those plateaued functions of amplitude $2^{(n+2)/2}$. The term semi-bent was introduced in [357], but as for n odd, these functions had previously been studied under the name of three-valued almost optimal Boolean functions in [191], where it is observed that the class of such functions whose nonlinearity is at least $2^{n-1} - 2^{n/2}$ coincides with that of semi-bent functions. In [312], the authors show that the sum of a Boolean function g equal to the linear combination of the indicators of the elements of a spread and of a Boolean function h whose restrictions to these elements are linear is semi-bent if and only if g and h are both bent; related infinite classes are specified and a version with partial spreads is also given. Other recent works on semi-bent functions are [206, 282, 711, 855, 856, 857, 868, 876, 1131]. Until recently, the known semi-bent functions were often quadratic or the component functions of power functions (see, e.g., [355]). More constructions have been proposed in [376] to derive semi-bent functions from bent functions. See a survey in [859].

6.2.5 Primary constructions of plateaued Boolean functions


All quadratic Boolean functions and all bent and semi-bent Boolean functions are plateaued.
We recall from [247] the other primary constructions. Most of them have been already
presented above for constructing bent functions; they are extended here to more general
plateaued functions.

Maiorana–McFarland (MM) functions


Any function $f_{\phi,h}(x, y) = x \cdot \phi(y) \oplus h(y)$; $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$, is plateaued if and only if $\left|\sum_{y \in \phi^{-1}(a)} (-1)^{b \cdot y \oplus h(y)}\right|$ can take two values, one of which is 0, when $(a, b) \in \mathbb{F}_2^r \times \mathbb{F}_2^s$, since

$W_{f_{\phi,h}}(a, b) = 2^r \sum_{y \in \phi^{-1}(a)} (-1)^{b \cdot y \oplus h(y)}$.

If φ is injective (resp. takes exactly two times each value of Im(φ)), then $f_{\phi,h}$ is plateaued of amplitude $2^r$ (resp. $2^{r+1}$). Note that the address function (see page 68) is plateaued, as observed in [692] and easily checked.
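Both cases (φ injective, and φ taking each of its values exactly twice) can be illustrated on small instances. The sketch below assumes the input (x, y) is packed into one integer with x in the r low bits and y in the high bits, and uses the bitwise inner product; the maps `phi_inj`, `phi_two` and the functions `h_inj`, `h_two` are arbitrary examples chosen here, not taken from the text.

```python
def walsh(tt):
    """Walsh transform of a truth table via the fast butterfly."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def parity(v):
    """Parity of the number of 1 bits, i.e. an inner product over F_2."""
    return bin(v).count("1") & 1


def mm_truth_table(r, phi, h):
    """Truth table of f(x, y) = x . phi(y) + h(y), with x on the r low bits."""
    mask = 2 ** r - 1
    return [parity((p & mask) & phi[p >> r]) ^ h[p >> r]
            for p in range(len(phi) * 2 ** r)]


def absolute_values(tt):
    """Sorted list of the distinct absolute Walsh values."""
    return sorted({abs(v) for v in walsh(tt)})


# r = 3, s = 2: phi injective from F_2^2 to F_2^3, expected amplitude 2^r = 8
phi_inj, h_inj = [1, 2, 4, 7], [0, 1, 1, 0]
# r = 2, s = 3: phi takes each value exactly twice, expected amplitude 2^(r+1) = 8
phi_two, h_two = [1, 1, 2, 2, 3, 3, 0, 0], [0, 0, 0, 1, 0, 1, 1, 0]

print(absolute_values(mm_truth_table(3, phi_inj, h_inj)))
print(absolute_values(mm_truth_table(2, phi_two, h_two)))
```

In both examples the Walsh transform takes the value 0 somewhere, so neither function is bent.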

Zheng–Zhang’s functions. In [1173], Zheng and Zhang introduce a class of plateaued functions and prove that some of them are not partially-bent. These functions are defined as follows: let t and k be two integers such that $k < 2^t < 2^k$ and let $E \subseteq \mathbb{F}_2^k$ be a subset of $2^t$ elements such that any linear nonnull function on $\mathbb{F}_2^k$ is not constant on E. For every element $e_i$ of E, let $\xi_i$ denote the truth table of the linear function $x \mapsto x \cdot e_i$ on $\mathbb{F}_2^k$. Then, the Boolean function f on $\mathbb{F}_2^{k+t}$ having for truth table the concatenation $\xi_0 \xi_1 \cdots \xi_{2^t - 1}$ of these truth tables is plateaued on $\mathbb{F}_2^{k+t}$ and its amplitude equals $2^k$. Such a function is the concatenation of distinct linear functions. Then, as already observed in [317], it belongs to the Maiorana–McFarland class and satisfies the first hypothesis above.

Generalizations of Maiorana–McFarland functions


Concatenations of quadratic functions in Dickson form. Let n and r be positive integers such that r ≤ n. As proved in [223] and recalled at page 179, the function

$f_{\psi,\phi,g}(x, y) = \sum_{i=1}^{t} x_{2i-1} x_{2i}\, \psi_i(y) \oplus x \cdot \phi(y) \oplus g(y) = \sum_{i=1}^{t} x_{2i-1} x_{2i}\, \psi_i(y) \oplus \sum_{j=1}^{r} x_j \phi_j(y) \oplus g(y)$; $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$,

where $t = \lfloor r/2 \rfloor$, satisfies

$W_{f_{\psi,\phi,g}}(a, b) = \sum_{y \in E_a} 2^{r - w(\psi(y))} (-1)^{\sum_{i=1}^{t} (\phi_{2i-1}(y) \oplus a_{2i-1})(\phi_{2i}(y) \oplus a_{2i}) \oplus g(y) \oplus y \cdot b}$,

where $w(\psi(y))$ denotes the Hamming weight of ψ(y) and $E_a$ is the superset of $\phi^{-1}(a)$ equal, if r is even, to

$\{y \in \mathbb{F}_2^s;\ \forall i \le t,\ \psi_i(y) = 0 \Rightarrow (\phi_{2i-1}(y) = a_{2i-1} \text{ and } \phi_{2i}(y) = a_{2i})\}$,

and, if r is odd, to

$\{y \in \mathbb{F}_2^s;\ \forall i \le t,\ \psi_i(y) = 0 \Rightarrow (\phi_{2i-1}(y) = a_{2i-1} \text{ and } \phi_{2i}(y) = a_{2i}),\ \text{and } \phi_r(y) = a_r\}$.

As observed in [317], if $E_a$ has size 0 or 1 (respectively 0 or 2) for every a and if ψ has constant weight, then $f_{\psi,\phi,g}$ is plateaued.

Concatenations of quadratic functions of rank 2. As seen at page 180, assuming that $\phi_2(y) \neq 0_r$ for every $y \in \mathbb{F}_2^s$ and denoting by E the set of $y \in \mathbb{F}_2^s$ such that $\phi_1(y)$ and $\phi_2(y)$ are linearly independent, the function $f_{\phi_1,\phi_2,\phi_3,g}(x, y) = (x \cdot \phi_1(y))(x \cdot \phi_2(y)) \oplus x \cdot \phi_3(y) \oplus g(y)$ satisfies

$W_{f_{\phi_1,\phi_2,\phi_3,g}}(a, b) = 2^{r-1} \sum_{y \in E;\ \phi_3(y)+a \in \{0_r, \phi_1(y), \phi_2(y)\}} (-1)^{g(y) \oplus b \cdot y} \;-\; 2^{r-1} \sum_{y \in E;\ \phi_3(y)+a = \phi_1(y)+\phi_2(y)} (-1)^{g(y) \oplus b \cdot y} \;+\; 2^{r} \sum_{y \in \mathbb{F}_2^s \setminus E;\ \phi_3(y)+a = \phi_1(y)} (-1)^{g(y) \oplus b \cdot y}$,

for every $a \in \mathbb{F}_2^r$ and $b \in \mathbb{F}_2^s$. As shown in [317], if $E = \mathbb{F}_2^s$ and the two-dimensional flats $\phi_3(y) + \langle \phi_1(y), \phi_2(y) \rangle$; $y \in \mathbb{F}_2^s$, are pairwise disjoint, then $f_{\phi_1,\phi_2,\phi_3,g}$ is plateaued of amplitude $2^{r-1}$. And assuming that $\phi_2(y)$ is nonzero for every $y \in \mathbb{F}_2^s$ and denoting by $F_a$ (resp. $F'_a$) the set of all $y \in \mathbb{F}_2^s$ such that $\phi_1(y)$ and $\phi_2(y)$ are linearly independent (resp. dependent) and such that a belongs to the flat $\phi_3(y) + \langle \phi_1(y), \phi_2(y) \rangle$ (resp. $a = \phi_3(y) + \phi_1(y)$), we have that if, for every $a \in \mathbb{F}_2^r$, the number $|F_a| + 2|F'_a|$ equals 0 or 2, then $f_{\phi_1,\phi_2,\phi_3,g}$ is plateaued of amplitude $2^r$. See a little more in [1058].

6.2.6 Secondary constructions of plateaued Boolean functions


The direct sum preserves plateauedness since $h(x, y) = f(x) \oplus g(y)$, $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$,
satisfies $W_h(a, b) = W_f(a)\,W_g(b)$ (and we have then $nl(h) = 2^s\, nl(f) + 2^r\, nl(g) -
2\, nl(f)\, nl(g)$). The indirect sum does too, under some conditions:

Proposition 101 [247] Let h(x, y) = f1 (x) ⊕ g1 (y) ⊕ (f1 ⊕ f2 )(x) (g1 ⊕ g2 )(y), then
if f1 and f2 are plateaued with the same amplitude, g1 and g2 are plateaued with the same
amplitude, and
• f1 and f2 have the same Walsh support (i.e., the same extended Walsh spectrum),
• or g1 and g2 have the same Walsh support (idem)
• or f1 and f2 have disjoint Walsh supports and g1 and g2 have disjoint Walsh supports,

then h is plateaued.

Proof We have seen already that

\[
W_h(a, b) = \frac{1}{2}\, W_{f_1}(a) \big[ W_{g_1}(b) + W_{g_2}(b) \big] + \frac{1}{2}\, W_{f_2}(a) \big[ W_{g_1}(b) - W_{g_2}(b) \big]. \tag{6.39}
\]

Moreover, if $f_1$ and $f_2$ have both amplitude $\lambda$, and if $g_1$ and $g_2$ have both amplitude $\mu$, then
according to Relation (6.39), we have the following:
• If $g_1$ and $g_2$ have the same Walsh support, then $W_h(a, b) \in \{0, \pm\lambda\mu\}$ (indeed, at most
one of the two values $W_{g_1}(b) + W_{g_2}(b)$ and $W_{g_1}(b) - W_{g_2}(b)$ is then nonzero, and this
value equals $\pm 2\mu$).
• If $f_1$ and $f_2$ have the same Walsh support, then $W_h(a, b) \in \{0, \pm\lambda\mu\}$ (same argument,
after exchanging the roles of the $f_i$'s and the $g_i$'s in (6.39)).
• If $f_1$ and $f_2$ have disjoint Walsh supports and $g_1$ and $g_2$ have disjoint Walsh supports,
then $W_h(a, b) \in \{0, \pm\frac{\lambda\mu}{2}\}$.

Hence, h is plateaued.
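Relation (6.39), and its special case $W_h(a,b) = W_f(a)W_g(b)$ for the direct sum, can be verified exhaustively on small random functions. The following is a minimal sketch under stated assumptions: truth tables are Python lists, the input $(x, y)$ is encoded as the integer `(y << r) | x`, and the names `indirect_sum` and `satisfies_relation` are hypothetical helpers introduced only for this illustration.

```python
import random

def dot(u, x):
    """Inner product u.x in F_2, with vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def walsh(table, n):
    """Walsh transform of a truth table on F_2^n, as a list indexed by u."""
    return [sum((-1) ** (table[x] ^ dot(u, x)) for x in range(1 << n))
            for u in range(1 << n)]

def indirect_sum(f1, f2, g1, g2, r, s):
    """h(x, y) = f1(x) + g1(y) + (f1 + f2)(x) (g1 + g2)(y); index (y << r) | x."""
    return [f1[x] ^ g1[y] ^ ((f1[x] ^ f2[x]) & (g1[y] ^ g2[y]))
            for y in range(1 << s) for x in range(1 << r)]

def satisfies_relation(f1, f2, g1, g2, r, s):
    """Check 2 W_h(a,b) = W_f1(a)[W_g1(b)+W_g2(b)] + W_f2(a)[W_g1(b)-W_g2(b)]."""
    wh = walsh(indirect_sum(f1, f2, g1, g2, r, s), r + s)
    wf1, wf2 = walsh(f1, r), walsh(f2, r)
    wg1, wg2 = walsh(g1, s), walsh(g2, s)
    return all(
        2 * wh[(b << r) | a]
        == wf1[a] * (wg1[b] + wg2[b]) + wf2[a] * (wg1[b] - wg2[b])
        for a in range(1 << r) for b in range(1 << s))

rng = random.Random(0)
r, s = 3, 3
f1, f2 = ([rng.randrange(2) for _ in range(1 << r)] for _ in range(2))
g1, g2 = ([rng.randrange(2) for _ in range(1 << s)] for _ in range(2))
print(satisfies_relation(f1, f2, g1, g2, r, s))   # indirect sum
print(satisfies_relation(f1, f1, g1, g1, r, s))   # direct sum f1(x) + g1(y)
```

Taking $f_2 = f_1$ and $g_2 = g_1$ reduces the check to the direct-sum identity.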

In [1152], a secondary construction of plateaued functions (with disjoint supports) is


given from three bent functions and three plateaued functions, under some conditions.
The construction without extension of the number of variables viewed at page 236 can be
easily adapted to plateaued functions:

Proposition 102 [247] Let $f_1$, $f_2$ and $f_3$ be three $n$-variable Boolean functions. Denote
by $s_1$ the Boolean function $f_1 + f_2 + f_3$ and by $s_2$ the Boolean function $f_1 f_2 + f_1 f_3 + f_2 f_3$.
We have
\[
W_{f_1} + W_{f_2} + W_{f_3} = W_{s_1} + 2\, W_{s_2}.
\]
Moreover,
• If $f_1$, $f_2$, $f_3$, and $s_1$ are plateaued with the same amplitude $\lambda$ and with disjoint Walsh
supports, then $s_2$ is plateaued with amplitude $\lambda/2$.
• If $f_1$, $f_2$, $f_3$, and $s_1$ are plateaued with the same amplitude $\lambda$ and with Walsh supports
whose multiset equals twice some subset of $\mathbb{F}_2^n$, then $s_2$ is plateaued with amplitude $\lambda$.
• If $f_1$, $f_2$, $f_3$ are plateaued with the same amplitude $\lambda$ and with disjoint Walsh supports,
and $s_2$ is plateaued with amplitude $\lambda/2$ and Walsh support disjoint from those of $f_1$, $f_2$,
$f_3$, then $s_1$ is plateaued with amplitude $\lambda$.
• If $f_1$, $f_2$, $f_3$ are plateaued with the same amplitude $\lambda$, $s_2$ is plateaued with amplitude $\lambda/2$
and the Walsh supports of $f_1$, $f_2$, $f_3$ and $s_2$ make a multiset equal to twice some subset
of $\mathbb{F}_2^n$, then $s_1$ is plateaued with amplitude $2\lambda$.
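The Walsh identity of Proposition 102 holds for arbitrary $f_1, f_2, f_3$, since pointwise $(-1)^{f_1} + (-1)^{f_2} + (-1)^{f_3} = (-1)^{s_1} + 2(-1)^{s_2}$. The following is a minimal sketch that checks it on random 4-variable functions; the random choice of functions and the helper name `check_identity` are assumptions made for illustration.

```python
import random

def dot(u, x):
    """Inner product u.x in F_2, with vectors encoded as integers."""
    return bin(u & x).count("1") & 1

def walsh(table, n):
    """Walsh transform of a truth table on F_2^n, as a list indexed by u."""
    return [sum((-1) ** (table[x] ^ dot(u, x)) for x in range(1 << n))
            for u in range(1 << n)]

def check_identity(f1, f2, f3, n):
    """Check W_f1 + W_f2 + W_f3 = W_s1 + 2 W_s2 at every point."""
    s1 = [a ^ b ^ c for a, b, c in zip(f1, f2, f3)]              # f1 + f2 + f3
    s2 = [(a & b) ^ (a & c) ^ (b & c) for a, b, c in zip(f1, f2, f3)]
    w1, w2, w3 = walsh(f1, n), walsh(f2, n), walsh(f3, n)
    ws1, ws2 = walsh(s1, n), walsh(s2, n)
    return all(w1[u] + w2[u] + w3[u] == ws1[u] + 2 * ws2[u]
               for u in range(1 << n))

rng = random.Random(1)
n = 4
functions = [[rng.randrange(2) for _ in range(1 << n)] for _ in range(3)]
print(check_identity(*functions, n))
```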

6.3 Bent4 and partially-bent4 functions


There exist several generalizations of the notion of bent function; see, e.g., [313]. We shall
not address them here since we focus on Boolean functions. But bent4 functions [17, 18,
529, 927, 993] are Boolean functions (whose definition is a modification of that of bent
function); we need then to give the main definitions and results on them, even if their use
in cryptography and coding is not so clear.33 In even dimension, bent4 functions are defined
as bent functions, but with respect to a transformation called unitary transformation that
we recall below, and which generalizes the Walsh transform. They can also be defined by
the balancedness of so-called modified derivatives. In odd dimension, there is a one-to-
one correspondence between the set of bent4 functions and the set of semi-bent functions
satisfying additional properties that we shall describe as well.

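Definition 64 [993, 18] Let $n$ be any positive integer and $f$ a Boolean function over $\mathbb{F}_{2^n}$.
For any element $c \in \mathbb{F}_{2^n}$, the unitary transformation $V_f^c : \mathbb{F}_{2^n} \to \mathbb{C}$ is defined as
\[
V_f^c(u) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \sigma^c(x)}\, i^{\,\mathrm{tr}_n(cx)}\, (-1)^{\mathrm{tr}_n(ux)},
\]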

33 The motivation given in [993] comes from the quantum domain; another motivation comes from the relation
to the notion of modified planar functions; see [18], where it is proved that bent4 functions describe the
components of modified planar functions.
where $\sigma^c(x)$ is the Boolean function whose univariate representation equals

\[
\sigma^c(x) = \sum_{0 \le i < j \le n-1} (cx)^{2^i} (cx)^{2^j}.
\]

For c = 0, the transformation Vfc is simply the well-known Walsh transform. For c = 1,
Vfc is the nega-Hadamard transform (see [927]).
In even dimension, the class of bent4 functions can be defined as follows in terms of the
unitary transformation:

Definition 65 Let $n$ be an even integer. A Boolean function $f$ is called a $c$-bent$_4$ function,
for some $c \in \mathbb{F}_{2^n}$, if the unitary transformation $V_f^c$ satisfies $|V_f^c(u)| = 2^{n/2}$ for all $u \in \mathbb{F}_{2^n}$.
A function is bent$_4$ if it is $c$-bent$_4$ for some $c \in \mathbb{F}_{2^n}$.

In other words, a Boolean function is bent$_4$ if it has a flat spectrum with respect to at
least one of the transforms $V_f^c$. Note that when $c = 0$, a $c$-bent$_4$ function is a classical bent
function, and when $c = 1$, a $c$-bent$_4$ function is a so-called nega-bent function.

Proposition 103 [18] Let $n$ be an even integer. A Boolean function $f : \mathbb{F}_{2^n} \to \mathbb{F}_2$ is
$c$-bent$_4$ if and only if $f \oplus \sigma^c$ is bent.

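Proof We will employ Jacobi's two-square theorem stating that for an even integer $n$, the
integer solutions of the Diophantine equation $R^2 + I^2 = 2^n$ are $(R, I) = (0, \pm 2^{n/2})$ or
$(\pm 2^{n/2}, 0)$. One has
\begin{align*}
V_f^c(u) &= \sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \sigma^c(x)}\, i^{\,\mathrm{tr}_n(cx)}\, (-1)^{\mathrm{tr}_n(ux)} \\
&= \sum_{x \in \mathbb{F}_{2^n}} (-1)^{f(x) + \sigma^c(x) + \mathrm{tr}_n(ux)} \left( \frac{1 + (-1)^{\mathrm{tr}_n(cx)}}{2} + i\, \frac{1 - (-1)^{\mathrm{tr}_n(cx)}}{2} \right) \\
&= \frac{W_{f \oplus \sigma^c}(u) + W_{f \oplus \sigma^c}(u + c)}{2} + i\, \frac{W_{f \oplus \sigma^c}(u) - W_{f \oplus \sigma^c}(u + c)}{2}.
\end{align*}
If $f$ is $c$-bent$_4$ then $|V_f^c(u)| = 2^{n/2}$, that is,
\[
\big( W_{f \oplus \sigma^c}(u) + W_{f \oplus \sigma^c}(u + c) \big)^2 + \big( W_{f \oplus \sigma^c}(u) - W_{f \oplus \sigma^c}(u + c) \big)^2 = 2^{n+2}. \tag{6.40}
\]

Now, by Jacobi's two-square theorem, one has $|W_{f \oplus \sigma^c}(u)| = |W_{f \oplus \sigma^c}(u + c)| = 2^{n/2}$,
which proves that $f \oplus \sigma^c$ is bent. The converse of the statement comes immediately from
Equation (6.40).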

Some authors call such f a shifted bent function (i.e., f is the shifted version of the bent
function f ⊕ σ c ).
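Proposition 103 can be checked exhaustively in the smallest even dimension. The following is a minimal sketch, assuming $n = 2$ and the representation $\mathbb{F}_4 = \mathbb{F}_2[t]/(t^2 + t + 1)$ with elements encoded as integers 0..3; all function names are hypothetical. The real and imaginary parts of $V_f^c(u)$ are accumulated separately so that the test $|V_f^c(u)|^2 = 2^n$ stays in exact integer arithmetic.

```python
from itertools import product

N = 2            # even dimension; F_4 = F_2[t]/(t^2 + t + 1) (assumed representation)
POLY = 0b111

def gf_mul(a, b):
    """Multiplication in F_{2^N}, elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r

def conjugates(z):
    """[z, z^2, z^4, ..., z^(2^(N-1))]."""
    out = [z]
    for _ in range(N - 1):
        out.append(gf_mul(out[-1], out[-1]))
    return out

def trace(z):
    """Absolute trace tr_n(z), returned as 0 or 1."""
    t = 0
    for c in conjugates(z):
        t ^= c
    return t

def sigma(c, x):
    """sigma^c(x) = sum over i < j of (cx)^(2^i) (cx)^(2^j), returned as 0 or 1."""
    p = conjugates(gf_mul(c, x))
    s = 0
    for i in range(N):
        for j in range(i + 1, N):
            s ^= gf_mul(p[i], p[j])
    return s

def is_bent(g):
    """All Walsh values of g have absolute value 2^(N/2)."""
    return all(
        sum((-1) ** (g[x] ^ trace(gf_mul(u, x))) for x in range(1 << N)) ** 2 == 1 << N
        for u in range(1 << N))

def is_c_bent4(f, c):
    """|V_f^c(u)|^2 = 2^N for all u; i^tr(cx) is 1 or i, so split real/imaginary parts."""
    for u in range(1 << N):
        re = im = 0
        for x in range(1 << N):
            sign = (-1) ** (f[x] ^ sigma(c, x) ^ trace(gf_mul(u, x)))
            if trace(gf_mul(c, x)):
                im += sign
            else:
                re += sign
        if re * re + im * im != 1 << N:
            return False
    return True

agree = all(
    is_c_bent4(f, c) == is_bent([f[x] ^ sigma(c, x) for x in range(1 << N)])
    for f in product((0, 1), repeat=1 << N) for c in range(1 << N))
print(agree)
```

The loop runs over all 16 Boolean functions on $\mathbb{F}_4$ and all four values of $c$; larger even $n$ would need another irreducible polynomial in `POLY` and sampling instead of exhaustive search.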

Remark. $\frac{W_{f \oplus \sigma^c}(u) + W_{f \oplus \sigma^c}(u+c)}{2}$ (resp. $\frac{W_{f \oplus \sigma^c}(u) - W_{f \oplus \sigma^c}(u+c)}{2}$) ranges (twice, when $u$ ranges
over $\mathbb{F}_{2^n}$) over the Walsh spectrum of the restriction of $f \oplus \sigma^c$ to the linear hyperplane of
equation $\mathrm{tr}_n(cx) = 0$ (resp. its complement) and we know from Theorem 16, page 240,
that $f \oplus \sigma^c$ is bent if and only if these two restrictions are semi-bent (i.e., near-bent) with
complementary Walsh supports.

An alternative definition of a $c$-bent$_4$ function $f$ can be given in relation to the so-called
modified derivative of $f$. More specifically, it has been proved in [18] that $f$ is $c$-bent$_4$ if
and only if the modified derivative $f(x + a) \oplus f(x) \oplus \mathrm{tr}_n(c^2 a x)$ is balanced for all nonzero
$a \in \mathbb{F}_{2^n}$. This corresponds to the characterization of bent functions via derivatives when
$c = 0$.
Bent$_4$ functions exist also in odd dimension. More precisely, let $n$ be an odd integer. Then
a function $f : \mathbb{F}_{2^n} \to \mathbb{F}_2$ is $c$-bent$_4$ if and only if $f \oplus \sigma^c$ is a semi-bent function and
$|W_{f \oplus \sigma^c}(u)| \ne |W_{f \oplus \sigma^c}(u + c)|$ for all $u \in \mathbb{F}_{2^n}$ (by the two-square argument above applied
to $2^{n+1}$, exactly one of the two values is then zero).
In [17], the authors have introduced the notion of partially-bent$_4$ functions, which are
functions whose modified derivative is either constant or balanced for every element of the
input set. It is known that every quadratic function is partially $c$-bent$_4$.

6.4 Bent vectorial functions


Definition 66 An $(n, m)$-function is called bent if all its component functions $v \cdot F$, $v \in
\mathbb{F}_2^m \setminus \{0_m\}$ (where "$\cdot$" is an inner product in $\mathbb{F}_2^m$), are bent Boolean functions, that is, if
$W_F^2(u, v) = 2^n$ for every $v \in \mathbb{F}_2^m \setminus \{0_m\}$ and every $u \in \mathbb{F}_2^n$. Equivalently, all the derivatives
$D_a F$, $a \in \mathbb{F}_2^n \setminus \{0_n\}$, are balanced.

The equivalence between these two characteristic properties, called respectively bentness
and perfect nonlinearity,34 is a direct consequence of Theorem 12, page 192, which implies
that $F$ is bent if and only if, for every $v \in \mathbb{F}_2^m \setminus \{0_m\}$ and every $a \in \mathbb{F}_2^n \setminus \{0_n\}$, the function
$v \cdot D_a F$ is balanced, and of Proposition 35, page 112, applied to $D_a F$.


Up to linear equivalence (precisely, up to the composition on the left by a linear
automorphism), the knowledge of a bent (n, m)-function is equivalent to that of an m-
dimensional F2 -vector space of Boolean functions, all being bent except the zero one (the
vector space is made of all component functions and of the zero function); from such an
m-dimensional space E, we can build a bent (n, m)-function by choosing its coordinate
functions as m linearly independent functions in E.
Bent vectorial functions are never balanced since their component functions are
not balanced. More precisely, we saw at page 113 that their imbalance
$Nb_F = \sum_{b \in \mathbb{F}_2^m} \big( |F^{-1}(b)| - 2^{n-m} \big)^2$ satisfies
$Nb_F = \sum_{a \in \mathbb{F}_2^n} |(D_a F)^{-1}(0_m)| - 2^{2n-m} = 2^n - 2^{n-m}$
(and that $Nb_{F+L} = 2^n - 2^{n-m}$ for every linear function $L$). We have also seen that
$NB_F = \sum_{a \in \mathbb{F}_2^n \setminus \{0_n\}} Nb_{D_a F}$ equals 0 if and only if $F$ is bent.
The algebraic degree of any bent $(n, m)$-function is at most $n/2$, since this bound is true
for any component function.
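The two equivalent definitions and the imbalance value $2^n - 2^{n-m}$ can be illustrated on a small example. The following is a minimal sketch, assuming the $(4, 2)$-function $F(x, y) = xy$ over $\mathbb{F}_4 = \mathbb{F}_2[t]/(t^2 + t + 1)$ (an illustrative Maiorana–McFarland-type choice, with the input encoded as `(x << 2) | y`); the helper names are hypothetical.

```python
from collections import Counter

N, M = 4, 2   # F is a (4, 2)-function

def gf4_mul(a, b):
    """Multiplication in F_4 = F_2[t]/(t^2 + t + 1), elements encoded as 0..3."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:          # reduce t^2 to t + 1
        r ^= 0b111
    return r

def dot(u, x):
    """Inner product u.x in F_2, with vectors encoded as integers."""
    return bin(u & x).count("1") & 1

# F(x, y) = x y, with the input encoded as z = (x << 2) | y.
F = [gf4_mul(z >> 2, z & 3) for z in range(1 << N)]

def components_are_bent(F):
    """Every component v.F, v != 0, has all Walsh values equal to +-2^(N/2)."""
    for v in range(1, 1 << M):
        comp = [dot(v, F[z]) for z in range(1 << N)]
        for u in range(1 << N):
            w = sum((-1) ** (comp[z] ^ dot(u, z)) for z in range(1 << N))
            if w * w != 1 << N:
                return False
    return True

def derivatives_are_balanced(F):
    """Every derivative D_a F, a != 0, takes each value 2^(N-M) times."""
    for a in range(1, 1 << N):
        counts = Counter(F[z ^ a] ^ F[z] for z in range(1 << N))
        if any(counts[b] != 1 << (N - M) for b in range(1 << M)):
            return False
    return True

def imbalance(F):
    """Nb_F = sum over b of (|F^{-1}(b)| - 2^(N-M))^2."""
    sizes = Counter(F)
    return sum((sizes[b] - (1 << (N - M))) ** 2 for b in range(1 << M))

print(components_are_bent(F), derivatives_are_balanced(F), imbalance(F))
```

Here the preimage sizes are 7, 3, 3, 3, so the imbalance is $9 + 3 = 12 = 2^4 - 2^2$, in agreement with the formula.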

Remark. We have seen with Proposition 37, page 120, that it is possible, as for Boolean
functions, to characterize the bentness of (n, m)-functions F by a property of the functions
34 There are then (over Fn2 ) two different terminologies for the same class of functions.

F + L, where L is a linear (n, m)-function expressing that F + L is not far from a balanced
function.

Bent vectorial functions have been initially considered by Nyberg who proved:

Proposition 104 [906] Bent $(n, m)$-functions exist if and only if $n$ is even and $m \le \frac{n}{2}$.

Proof It is easily seen that the condition is sufficient, thanks to the constructions of bent
functions that we shall see in Subsection 6.4.1, page 270. Let us prove that it is necessary. We
have seen in Relation (3.17), page 113, that, for every $(n, m)$-function $F$ and any element $b \in
\mathbb{F}_2^m$, the size of $F^{-1}(b)$ is equal to $2^{-m} \sum_{x \in \mathbb{F}_2^n;\, v \in \mathbb{F}_2^m} (-1)^{v \cdot (F(x) + b)}$. Assuming that $F$ is bent
and denoting, for every $v \in \mathbb{F}_2^m \setminus \{0_m\}$, by $\widetilde{v \cdot F}$ the dual of the bent Boolean function $x \mapsto v \cdot
F(x)$, we have, by definition, $\sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x)} = 2^{\frac{n}{2}} (-1)^{\widetilde{v \cdot F}(0_n)}$. The size of $F^{-1}(b)$ equals
then $2^{n-m} + 2^{\frac{n}{2} - m} \sum_{v \in \mathbb{F}_2^m \setminus \{0_m\}} (-1)^{\widetilde{v \cdot F}(0_n) \oplus v \cdot b}$. Since the sum $\sum_{v \in \mathbb{F}_2^m \setminus \{0_m\}} (-1)^{\widetilde{v \cdot F}(0_n) \oplus v \cdot b}$ has
an odd value ($\mathbb{F}_2^m \setminus \{0_m\}$ having an odd size), we deduce that, if $m \le n$, then $2^{\frac{n}{2} - m}$ must be
an integer. And it is also easily shown that $m > n$ is impossible.

Remark. The situation with PN functions is different for odd characteristic, in which
PN $(n, n)$-functions (defined similarly) do exist for every $n$ (they are also called planar).
A notion of planar function in characteristic 2 (stating that $x \in \mathbb{F}_{2^n} \mapsto D_a F(x) + ax$
is bijective for every $a \ne 0$), sometimes called pseudoplanar or modified planar, has
been proposed in [1023] (see also [963]). Such functions share many of the properties of
planar functions in odd characteristic, in relation with relative difference sets and finite
geometries.

A survey on bent vectorial functions can be found in [310].
In [337], the bent $(n, m)$-functions having the property that the duals of their component
functions form, together with the zero function, a vector space of dimension $m$ are called
dual-bent vectorial functions; these duals are then the component functions of some vectorial
bent function, called a vectorial dual of $F$. Classical classes are studied there from this viewpoint.
CCZ equivalence and EA equivalence coincide for bent functions [148, 150]: let $F$ be
a bent $(n, m)$-function ($n$ even, $m \le n/2$) and let (without loss of generality) $L_1$ and $L_2$
be two linear functions from $\mathbb{F}_2^n \times \mathbb{F}_2^m$ to (respectively) $\mathbb{F}_2^n$ and $\mathbb{F}_2^m$, such that $(L_1, L_2)$ is
a permutation of $\mathbb{F}_2^n \times \mathbb{F}_2^m$ and $F_1(x) = L_1(x, F(x))$ is a permutation of $\mathbb{F}_2^n$. For every
vector $v$ in $\mathbb{F}_2^n$, the function $v \cdot F_1$ is necessarily non-bent since, if $v = 0_n$, then it is null,
and if $v \ne 0_n$, then it is balanced. Let us denote $L_1(x, y) = L'(x) + L''(y)$. We have
then $F_1(x) = L'(x) + L'' \circ F(x)$. The adjoint operator $L''^*$ of $L''$ (satisfying by definition
$v \cdot L''(y) = L''^*(v) \cdot y$) is then the null function, since if $L''^*(v) \ne 0_m$, then $v \cdot F_1(x) =
v \cdot L'(x) \oplus L''^*(v) \cdot F(x)$ is bent. This means that $L''$ is null and $L_1$ depends then only on $x$,
which corresponds to EA equivalence.
We have seen in Proposition 104 that bent (n, m)-functions exist if and only if n is even
and m ≤ n/2. Better bounds than the covering radius bound are open problems for:
– $n$ odd and $m < n$ (for $m \ge n$, the Sidelnikov–Chabaud–Vaudenay bound, and other
bounds if $m$ is large enough, are better)
– $n$ even and $n/2 < m < n$

In [459], the authors provided a coding-theoretic characterization of bent vectorial func-


tions and used them for the construction of a two-parameter family of binary linear codes
that do not satisfy the conditions of the Assmus–Mattson theorem [36], but nevertheless hold
2-designs.

6.4.1 Primary constructions of bent vectorial functions


Recall that bent $(n, m)$-functions can exist only for $n$ even and $m \le n/2$, which we
shall assume satisfied. The main classes of bent Boolean functions lead to classes of bent
$(n, m)$-functions (this was first observed in [906] by Nyberg, who proposed constructions
within the Maiorana–McFarland and $PS_{ap}$ constructions).

Constructions in bivariate representation

The first three primary constructions below are by increasing order of generality. We follow
[310, 313] for the description. When necessary (i.e., when we need to make multiplications
or divisions), we endow $\mathbb{F}_2^{n/2}$ with the structure of the field $\mathbb{F}_{2^{n/2}}$ and we identify $\mathbb{F}_2^n$ with
$\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$.
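• Bent $(n, m)$-functions in the strict class of Maiorana–McFarland are defined as $F(x, y) =
L(x\, \pi(y)) + G(y)$, $x, y \in \mathbb{F}_{2^{n/2}}$, where $\pi$ is a permutation of $\mathbb{F}_{2^{n/2}}$, $L : \mathbb{F}_{2^{n/2}} \to \mathbb{F}_2^m$ is
linear surjective, and $G$ is any $(n/2, m)$-function. An example is given in [1018] (which
achieves optimal algebraic degree $n/2$): the $i$-th coordinate of this function is defined as
$f_i(x, y) = \mathrm{tr}_{n/2}(x\, \phi_i(y)) \oplus g_i(y)$, $x, y \in \mathbb{F}_{2^{n/2}}$, where $g_i$ is any Boolean function on $\mathbb{F}_{2^{n/2}}$
and where
\[
\phi_i(y) = \begin{cases} 0 & \text{if } y = 0 \\ \alpha^{\mathrm{dec}(y) + i - 1} & \text{otherwise,} \end{cases}
\]
where $\alpha$ is a primitive element of $\mathbb{F}_{2^{n/2}}$ and
$\mathrm{dec}(y) = 2^{\frac{n}{2} - 1} y_1 + 2^{\frac{n}{2} - 2} y_2 + \cdots + y_{\frac{n}{2}}$. This function belongs to the strict Maiorana–
McFarland class because the mapping
\[
y \mapsto \begin{cases} 0 & \text{if } y = 0 \\ \alpha^{\mathrm{dec}(y)} & \text{otherwise} \end{cases}
\]
is a permutation from $\mathbb{F}_2^{n/2}$ to $\mathbb{F}_{2^{n/2}}$, and the function
$L : x \in \mathbb{F}_{2^{n/2}} \mapsto (\mathrm{tr}_{n/2}(x), \mathrm{tr}_{n/2}(\alpha x), \ldots, \mathrm{tr}_{n/2}(\alpha^{\frac{n}{2} - 1} x)) \in \mathbb{F}_2^{n/2}$
is an isomorphism.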
• Bent (n, m)-functions in the extended class of Maiorana–McFarland are defined as
F (x, y) = ψ(x, y) + G(y), where G is any (n/2, m)-function and ψ is such that, for all
y ∈ F2n/2 , the function x → ψ(x, y) is linear and for all x ∈ F2n/2 \ {0}, the function
y → ψ(x, y) is balanced.
• Bent $(n, m)$-functions in the general class of Maiorana–McFarland are defined such that,
for all $v \in \mathbb{F}_{2^m}^*$, function $v \cdot F$ belongs, up to affine equivalence, to the Maiorana–
McFarland class of Boolean bent functions. Whether some bent quadratic functions, elements of
the general class, may not belong to the strict class is an open problem.
• Modifications of the Maiorana–McFarland bent functions have been proposed in [909],
using the classes C and D of bent Boolean functions.

• Bent $(n, m)$-functions in the $PS_{ap}$ class of vectorial functions are defined as $F(x, y) =
G(x y^{2^{n/2} - 2}) = G\!\left(\frac{x}{y}\right)$, with the convention $1/0 = 0$, where $G$ is a balanced $(n/2, m)$-
function. These functions are hyper-bent in the sense that their component functions are
hyper-bent. In [804], the authors give their expression in the polar representation that
we saw at page 168.
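• Bent $(n + n', m)$-functions (where $n'$ is also even) can be defined in the form $F(x, y, z, t) =
K\!\left(\frac{x}{y}, \frac{z}{t}\right)$, where $K$ is a $\left(\frac{n + n'}{2}, m\right)$-function such that, for all $x \in \mathbb{F}_{2^{n/2}}$, the function $y \in
\mathbb{F}_{2^{n'/2}} \mapsto K(x, y)$ is balanced and for all $y \in \mathbb{F}_{2^{n'/2}}$, the function $x \in \mathbb{F}_{2^{n/2}} \mapsto K(x, y)$
is balanced.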
• Bent $(n, n/2)$-functions from class $H$ of bent Boolean functions (see page 218) are
defined as: $F(x, y) = x\, G(y x^{2^{n/2} - 2})$, where $G$ is an o-polynomial on $\mathbb{F}_{2^{n/2}}$; see [862]. A
version in univariate form can be found in [479]; see also [310].


• Bent $(n, m)$-functions are built from $m$-dimensional vector spaces of functions whose
nonzero elements are all bent. Examples are $(n, 2)$-functions derived from the Kerdock
codes; see [310]. Another example (found by the author in common with G. Leander)
takes $n \equiv 2$ [mod 4]; then $\mathbb{F}_{2^{n/2}}$ consists of cubes only (since $\gcd(3, 2^{n/2} - 1) = 1$). If
$w \in \mathbb{F}_{2^n}$ is not a cube, then all the nonzero elements of the vector space $E = w\, \mathbb{F}_{2^{n/2}}$
are noncubes. Then if $F(z) = z^d$, where $d = 2^i + 1$ (Gold exponent) or $2^{2i} - 2^i + 1$
(Kasami exponent) and $\gcd(n, i) = 1$, all the functions $\mathrm{tr}_n(v F(z))$, where $v \in E^*$, are
bent. This leads to the bent $(n, \frac{n}{2})$-functions $z \in \mathbb{F}_{2^n} \mapsto (\mathrm{tr}_n(\beta_1 w z^d), \ldots, \mathrm{tr}_n(\beta_{\frac{n}{2}} w z^d)) \in
\mathbb{F}_2^{n/2}$, where $(\beta_1, \ldots, \beta_{\frac{n}{2}})$ is a basis of $\mathbb{F}_{2^{n/2}}$ over $\mathbb{F}_2$. To make such function valued
in $\mathbb{F}_{2^{n/2}}$, we choose a basis $(\alpha_1, \ldots, \alpha_{\frac{n}{2}})$ of $\mathbb{F}_{2^{n/2}}$ orthogonal to $(\beta_1, \ldots, \beta_{\frac{n}{2}})$, that is,
such that $\mathrm{tr}_{n/2}(\alpha_i \beta_j) = \delta_{i,j}$ (the Kronecker symbol). For every $y \in \mathbb{F}_{2^{n/2}}$, we have
then $y = \sum_{j=1}^{n/2} \alpha_j\, \mathrm{tr}_{n/2}(\beta_j y)$. The image of every $z \in \mathbb{F}_{2^n}$ by the function equals
\[
\sum_{j=1}^{n/2} \alpha_j\, \mathrm{tr}_n(\beta_j w z^d) = \sum_{j=1}^{n/2} \alpha_j\, \mathrm{tr}_{n/2}\big(\beta_j (w z^d + (w z^d)^{2^{n/2}})\big) = w z^d + (w z^d)^{2^{n/2}}.
\]
In the case of the Gold exponent, it can be made a function from $\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$ to $\mathbb{F}_{2^{n/2}}$: we express
$z$ in the form $x + wy$, where $x, y \in \mathbb{F}_{2^{n/2}}$ and if $n$ is not a multiple of 3, we can take
$w$ primitive in $\mathbb{F}_4$ (otherwise, all elements of $\mathbb{F}_4$ are cubes and we have then to take $w$
outside $\mathbb{F}_4$), for which we have then $w^2 = w + 1$, $w^{2^i} = w^2$ (since $i$ is necessarily
odd) and $w^{2^i + 1} = w^3 = 1$. We have then $z^d = x^{2^i + 1} + w x^{2^i} y + w^2 x y^{2^i} + y^{2^i + 1}$ and
\[
w z^d + (w z^d)^{2^{n/2}} = (w + w^2)\, x^{2^i + 1} + (w^2 + w)\, x^{2^i} y + (w^3 + w^3)\, x y^{2^i} + (w + w^2)\, y^{2^i + 1} = x^{2^i + 1} + x^{2^i} y + y^{2^i + 1}.
\]
We can extend the construction to $\gcd(i, n) \ne 1$; the exact condition
is that $\frac{n}{\gcd(i, n)}$ is even and $v \notin \{x^d,\ x \in \mathbb{F}_{2^n}\}$.
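The bivariate Gold-based function $F(x, y) = x^{2^i+1} + x^{2^i} y + y^{2^i+1}$ obtained in the last item can be tested through the perfect nonlinearity criterion of Definition 66. The following is a minimal sketch, assuming $n = 10$ (so $n \equiv 2 \pmod 4$ and $3 \nmid n$), the representation $\mathbb{F}_{32} = \mathbb{F}_2[t]/(t^5 + t^2 + 1)$, and the input encoding `(x << 5) | y`; these choices and the helper names are illustrative assumptions.

```python
M = 5                 # n/2, so n = 10 (n = 2 mod 4 and n not a multiple of 3)
POLY = 0b100101       # t^5 + t^2 + 1, irreducible over F_2 (assumed representation)

def gf_mul(a, b):
    """Multiplication in F_{2^M}, elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation in F_{2^M}."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def gold_bivariate_table(i):
    """Table of F(x, y) = x^(2^i+1) + x^(2^i) y + y^(2^i+1), index (x << M) | y."""
    table = []
    for z in range(1 << (2 * M)):
        x, y = z >> M, z & ((1 << M) - 1)
        x2i, y2i = gf_pow(x, 1 << i), gf_pow(y, 1 << i)
        table.append(gf_mul(x2i, x) ^ gf_mul(x2i, y) ^ gf_mul(y2i, y))
    return table

def all_derivatives_balanced(table, n, m):
    """True if every D_a F, a != 0, takes each value of F_2^m exactly 2^(n-m) times."""
    target = 1 << (n - m)
    for a in range(1, 1 << n):
        counts = [0] * (1 << m)
        for z in range(1 << n):
            counts[table[z ^ a] ^ table[z]] += 1
        if any(c != target for c in counts):
            return False
    return True

print(all_derivatives_balanced(gold_bivariate_table(1), 2 * M, M))
```

With $i = 1$ this tests $x^3 + x^2 y + y^3$ as a $(10, 5)$-function; the check loops over all $2^{10} - 1$ nonzero directions, which takes on the order of a second in pure Python.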


n

Constructions of bent vectorial functions in univariate representation


The bent $(n, m)$-functions built from $m$-dimensional vector spaces of functions above
provide first examples, like $\mathrm{tr}^n_{n/2}(w x^d)$, where $w$ is not a cube and $d = 2^i + 1$ or $4^i - 2^i + 1$,
$\gcd(i, n) = 1$. The other functions above, which are defined in bivariate representation (over
$\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$ and valued in $\mathbb{F}_{2^{n/2}}$), can be seen in univariate representation, from $\mathbb{F}_{2^n}$ to itself.
If $n/2$ is odd, this is quite easy: we have then $\mathbb{F}_{2^{n/2}} \cap \mathbb{F}_4 = \mathbb{F}_2$ and we can choose the basis
$(1, w)$ of the two-dimensional vector space $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^{n/2}}$, where $w$ is a primitive element
of $\mathbb{F}_4$. Then $w^2 = w + 1$ and $w^{2^{n/2}} = w^2$ since $n/2$ is odd. A general element of $\mathbb{F}_{2^n}$
has the form $z = x + wy$, where $x, y \in \mathbb{F}_{2^{n/2}}$ and we have $z^{2^{n/2}} = x + w^2 y = z + y$
and therefore $y = z + z^{2^{n/2}}$, and $x = z^{2^{n/2}} + w^2 y = w^2 z + w z^{2^{n/2}}$. For instance, the
univariate representation of the simplest Maiorana–McFarland function, that is, the function
$(x, y) \mapsto xy$, is $(z + z^{2^{n/2}})(w^2 z + w z^{2^{n/2}})$, that is, up to linear terms: $z^{1 + 2^{n/2}}$.
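The change of representation $z = x + wy$ can be checked by direct computation in a small field. The following is a minimal sketch, assuming $n = 6$ (so $n/2 = 3$ is odd) and $\mathbb{F}_{64} = \mathbb{F}_2[t]/(t^6 + t + 1)$; the element `w` is found by search as a primitive cube root of unity, and all names are hypothetical. Expanding the product gives $xy = w^2 z^2 + z^{1+2^{n/2}} + w z^{2 \cdot 2^{n/2}}$, whose first and last terms are linear over $\mathbb{F}_2$.

```python
N = 6                  # n = 6, so n/2 = 3 is odd
Q = 1 << (N // 2)      # 2^(n/2) = 8
POLY = 0b1000011       # t^6 + t + 1, irreducible over F_2 (assumed representation)

def gf_mul(a, b):
    """Multiplication in F_{2^N}, elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation in F_{2^N}."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

# A primitive element of F_4 inside F_64: w != 1 with w^3 = 1, hence w^2 = w + 1.
w = next(e for e in range(2, 1 << N) if gf_pow(e, 3) == 1)
w2 = gf_mul(w, w)

def coordinates(z):
    """Return (x, y) with z = x + w y, using y = z + z^Q and x = w^2 z + w z^Q."""
    zq = gf_pow(z, Q)
    return gf_mul(w2, z) ^ gf_mul(w, zq), z ^ zq

def check(z):
    x, y = coordinates(z)
    in_subfield = gf_pow(x, Q) == x and gf_pow(y, Q) == y
    recombines = z == x ^ gf_mul(w, y)
    product = gf_mul(w2, gf_pow(z, 2)) ^ gf_pow(z, 1 + Q) ^ gf_mul(w, gf_pow(z, 2 * Q))
    return in_subfield and recombines and gf_mul(x, y) == product

print(all(check(z) for z in range(1 << N)))
```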
We describe now the constructions that are given directly in univariate form.
In [935], the authors observed that if $\mathrm{tr}_n(a x^d)$ is a bent Boolean function and $x^d$ permutes
$\mathbb{F}_{2^m}$ for some divisor $m$ of $n \ge 4$, then $\mathrm{tr}^n_m(a x^d)$ is bent (the double condition is necessary
if $m = n/2$; see [1133]); more is obtained in [892] for multiple trace term functions
with Dillon-like exponents. In [483, 935, 1064], the authors studied (further) bent vectorial
functions of the form $\mathrm{tr}^n_{n/2}(a x^d)$. All functions $\mathrm{tr}^n_m(a x^d)$ where $m$ divides $n$ are addressed
in the recent paper [1133], where it is proved that if $m \mid n$ and $\gcd\!\left(2^m - 1, \frac{2^n - 1}{2^m - 1}\right) = 1$, and
if the $(n, m)$-function $\mathrm{tr}^n_m(a x^d)$ is bent, then $\gcd(d, 2^m - 1) = 1$. Characterizations are given
when $d$ is a Gold $2^i + 1$ (with any $i$), a Kasami $2^{2i} - 2^i + 1$ (idem), a Leander $(2^{n/4} + 1)^2$,
a Canteaut–Charpin–Kyureghyan $2^{n/3} + 2^{n/6} + 1$, and a Dillon $j \cdot (2^{\frac{n}{2}} - 1)$ exponent
(with precisions and corrections of errors from previous papers) as well as functions with
multiple terms with Niho and Dillon exponents. The authors of [483] also propose a method
to construct bent vectorial functions based on $PS^-$ and $PS^+$ bent functions. In [892], the
authors derive three necessary and sufficient conditions for a function of the form $F(x) =
\mathrm{tr}^n_{n/2}\!\left(\sum_{i=1}^{r} a_i x^{r_i (2^{n/2} - 1)}\right)$ to be bent. The first characterization is a direct consequence of a
result in [854]. The second characterization provides an interesting link between the bentness
of $F$ and its evaluation on the cyclic group $U$. The third characterization is stated in terms
of the evaluation of certain elementary symmetric polynomials, and can be transformed into
some explicit conditions regarding the choice of some coefficients. In [961], the authors
studied the quadratic vectorial functions of the form $F(x) = \mathrm{tr}^n_{n/2}\big(a x^{2^i} (x^{2^j} + (x^{2^j})^{2^{n/2}})\big)$,
where $n \ge 4$ is even and $a \in \mathbb{F}_{2^{n/2}}$, which are all bent.


The existence, and the constructions in case of existence, of bent vectorial functions of
the form $\mathrm{tr}^n_{n/2}(P(x))$ where $P(x) \in \mathbb{F}_{2^n}[x]$ has been studied on the basis of known Boolean
bent functions of the form $\mathrm{tr}_n(P(x))$. For instance, the nonexistence of some bent vectorial
functions with binomial trace representation in $PS^-$ has been proved in [930, 931]: for
$n \equiv 0 \pmod 4$, there is no bent vectorial function of the form $F(x) = \mathrm{tr}^n_{n/2}\big(x^{2^{n/2} - 1} +
a x^{r(2^{n/2} - 1)}\big)$, where $1 \le r \le 2^{n/2}$ and $a \in \mathbb{F}_{2^n}$.

We have seen at pages 30 and 269 that CCZ equivalence on bent functions coincides
with EA equivalence and then does not provide new (bent) functions. However, applied
to nonbent functions, it can give functions having some bent components and lead to bent
vectorial functions with fewer output bits (but possibly larger algebraic degree). Examples like
$F(x) = x^{2^i + 1} + (x^{2^i} + x + 1)\, \mathrm{tr}_n(x^{2^i + 1})$, for $n \ge 6$ even, and
\[
F(x) = \Big( x + \mathrm{tr}^n_3\big(x^{2(2^i + 1)} + x^{4(2^i + 1)}\big) + \mathrm{tr}_n(x)\, \mathrm{tr}^n_3\big(x^{2^i + 1} + x^{2^{2i}(2^i + 1)}\big) \Big)^{2^i + 1},
\]
where $6 \mid n$ and in both cases $\frac{n}{\gcd(i, n)}$ even, are
given in [150] (deduced from functions in [163]). Ideas for deriving bent vectorial functions
from AB functions are given in [248, Subsection 4.3].
In [1143], Youssef and Gong have extended the notion of hyper-bent function to vectorial
functions: such $F$ is called hyper-bent if all its component functions are hyper-bent.
Muratović-Ribić et al. [893] have characterized a class of vectorial hyper-bent functions of
the form $F(x) = \mathrm{tr}^n_{n/2}\!\left(\sum_{i=0}^{2^{n/2}} a_i x^{i(2^{n/2} - 1)}\right)$ from the class $PS_{ap}$, and determined the number
of such hyper-bent functions.

6.4.2 Secondary constructions of bent vectorial functions


Given any bent (n, m)-function F , any chopped function obtained by deleting some
coordinates of F (or more generally by composing it on the left with any surjective affine
mapping) is obviously still bent. But there exist other more useful secondary constructions
(that is, constructions of new bent functions from known ones). The secondary construction
of Boolean bent functions of Proposition 79, page 211, generalizes directly to vectorial
functions [234]:

Proposition 105 Let $r$ and $s$ be two positive integers with the same parity and such that
$r \le s/3$. Let $\psi$ be any (balanced) mapping from $\mathbb{F}_2^s$ to $\mathbb{F}_{2^r}$ such that, for every $a \in \mathbb{F}_{2^r}$, the
set $\psi^{-1}(a)$ is an $(s - r)$-dimensional affine subspace of $\mathbb{F}_2^s$. Let $H$ be any $(s, r)$-function
whose restriction to $\psi^{-1}(a)$ (viewed as an $(s - r, r)$-function via an affine isomorphism
between $\psi^{-1}(a)$ and $\mathbb{F}_2^{s-r}$) is bent for every $a \in \mathbb{F}_{2^r}$. Then the function $F_{\psi,H}(x, y) =
x\, \psi(y) + H(y)$, $x \in \mathbb{F}_{2^r}$, $y \in \mathbb{F}_2^s$, is a bent function from $\mathbb{F}_2^{r+s}$ to $\mathbb{F}_{2^r}$.

Indeed, taking $x \cdot y = \mathrm{tr}_r(xy)$ for inner product in $\mathbb{F}_{2^r}$, for every $v \in \mathbb{F}_{2^r}^*$, the function
$\mathrm{tr}_r(v\, F_{\psi,H}(x, y))$ is bent, according to Proposition 79, with $\phi(y) = v\, \psi(y)$ and $g(y) =
\mathrm{tr}_r(v\, H(y))$ (the more restrictive condition $r \le s/3$ is meant so that $r \le \frac{s - r}{2}$, which is
necessary, according to Proposition 104, for allowing the restrictions of $H$ to be bent). The
condition on $\psi$ being easily satisfied,35 it is then a simple matter to choose $H$. Hence,
this construction is quite effective (but only for designing bent $(n, m)$-functions such that
$m \le n/4$, since $r \le s/3$ is equivalent to $r \le \frac{r + s}{4}$).
The construction of Theorem 15, page 234, can also be adapted to vectorial functions as
follows [234]:

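Proposition 106 Let $r$ and $s$ be two positive even integers and $m$ a positive integer such
that $m \le r/2$. Let $H$ be a function from $\mathbb{F}_2^n = \mathbb{F}_2^r \times \mathbb{F}_2^s$ to $\mathbb{F}_2^m$. Assume that, for every $y \in \mathbb{F}_2^s$,
the function $H_y : x \in \mathbb{F}_2^r \mapsto H(x, y)$ is a bent $(r, m)$-function. For every nonzero $v \in \mathbb{F}_2^m$
and every $a \in \mathbb{F}_2^r$ and $y \in \mathbb{F}_2^s$, let us denote by $f_{a,v}(y)$ the value at $a$ of the dual of the
Boolean function $v \cdot H_y$, defined by $\sum_{x \in \mathbb{F}_2^r} (-1)^{v \cdot H(x, y) \oplus a \cdot x} = 2^{r/2} (-1)^{f_{a,v}(y)}$. Then $H$ is
bent if and only if, for every nonzero $v \in \mathbb{F}_2^m$ and every $a \in \mathbb{F}_2^r$, the Boolean function $f_{a,v}$ is
bent.

Indeed, we have, for every nonzero $v \in \mathbb{F}_2^m$ and every $a \in \mathbb{F}_2^r$ and $b \in \mathbb{F}_2^s$:
\[
\sum_{\substack{x \in \mathbb{F}_2^r \\ y \in \mathbb{F}_2^s}} (-1)^{v \cdot H(x, y) \oplus a \cdot x \oplus b \cdot y} = 2^{r/2} \sum_{y \in \mathbb{F}_2^s} (-1)^{f_{a,v}(y) \oplus b \cdot y}.
\]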

35 Note that it does not make ψ necessarily affine.



An example of application of Proposition 106 is when we choose every $H_y$ in the
Maiorana–McFarland's class: $H_y(x, x') = x\, \pi_y(x') + G_y(x')$, $x, x' \in \mathbb{F}_{2^{r/2}}$, where $\pi_y$ is
bijective for every $y \in \mathbb{F}_2^s$. According to the results on the duals of Maiorana–McFarland's
functions, for every $v \in \mathbb{F}_{2^{r/2}}^*$ and every $a, a' \in \mathbb{F}_{2^{r/2}}$, we have then
\[
f_{(a,a'),v}(y) = \mathrm{tr}_{\frac{r}{2}}\!\left( a'\, \pi_y^{-1}\!\left(\frac{a}{v}\right) + v\, G_y\!\left(\pi_y^{-1}\!\left(\frac{a}{v}\right)\right) \right),
\]
where $\mathrm{tr}_{\frac{r}{2}}$ is the trace function from $\mathbb{F}_{2^{r/2}}$ to $\mathbb{F}_2$.
Then $H$ is bent if and only if, for every $v \in \mathbb{F}_{2^{r/2}}^*$ and every $a, a' \in \mathbb{F}_{2^{r/2}}$, the function
$y \mapsto \mathrm{tr}_{\frac{r}{2}}\big( a'\, \pi_y^{-1}(a) + v\, G_y(\pi_y^{-1}(a)) \big)$ is bent on $\mathbb{F}_2^s$. A simple possibility for achieving
this is for $s = r/2$ to choose $\pi_y^{-1}$ such that, for every $a$, the mapping $y \mapsto \pi_y^{-1}(a)$ is an
affine automorphism of $\mathbb{F}_{2^{r/2}}$ (e.g., $\pi_y^{-1}(a) = \pi_y(a) = a + y$) and to choose $G_y$ such that,
for every $a$, the function $y \mapsto G_y(a)$ is bent.
An obvious corollary of Proposition 106 is that the so-called direct sum of bent functions
gives bent functions: we define $H(x, y) = F(x) + G(y)$, where $F$ is any bent $(r, m)$-function
and $G$ any bent $(s, m)$-function, and we have then $f_{a,v}(y) = \widetilde{v \cdot F}(a) \oplus v \cdot G(y)$, which is
a bent Boolean function for every $a$ and every $v \ne 0_m$. Hence, $H$ is bent.

Remark. Identifying $\mathbb{F}_2^m$ with $\mathbb{F}_{2^m}$ and defining $H(x, y) = F_1(x) + G_1(y) + (F_1(x) +
F_2(x))\,(G_1(y) + G_2(y))$, a component function $v \cdot H_y(x) = \mathrm{tr}_m(v\, F_1(x)) + \mathrm{tr}_m(v\, G_1(y)) +
\mathrm{tr}_m(v\, (F_1(x) + F_2(x))\, (G_1(y) + G_2(y)))$ does not enter, in general, in the framework
of Proposition 83 nor of Proposition 106. Note that the function $f_{a,v}$ exists under the
sufficient condition that, for every nonzero ordered pair $(v, w) \in \mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$, the function
$\mathrm{tr}_m(v\, F_1(x)) + \mathrm{tr}_m(w\, F_2(x))$ is bent (which is equivalent to saying that the $(r, 2m)$-function
$(F_1, F_2)$ is bent).
There are particular cases where the construction works, as shown in [310]: let $F_1$ and
$F_2$ be two bent $(n, r)$-functions and $G = (g_1, \ldots, g_{r+1})$ an $(m, r + 1)$-function such that
for every nonzero $v$ in $\mathbb{F}_2^{r+1}$ different from $(1, 0, \ldots, 0)$, the component function $v \cdot G$ is
bent, then the function $H(x, y) = F_1(x) + G_1(y) + g_1(y)(F_1(x) + F_2(x))$, where $G_1$ is
the $(m, r)$-function $(g_2, \ldots, g_{r+1})$, is a bent $(n + m, r)$-function. This indirect sum has been
generalized in [310].

Remark. In [18], bent4 functions have been extended to vectorial bent4 functions (over
finite fields), which correspond to relative difference sets in certain groups. The authors
have provided conditions under which Maiorana–McFarland functions are bent4 .

6.5 Plateaued vectorial functions


There exist three notions of plateauedness for vectorial functions:

Definition 67 An $(n, m)$-function is called strongly plateaued if all its component functions
$v \cdot F$; $v \in \mathbb{F}_2^m$, $v \ne 0_m$, where "$\cdot$" is an inner product in $\mathbb{F}_2^m$, are partially-bent (see Definition
62, page 256).


An (n, m)-function is called plateaued with single amplitude if all its component functions
are plateaued with the same amplitude (see Definition 63, page 258).

An (n, m)-function is called plateaued if all its component functions are plateaued, with
possibly different amplitudes.

The reason why the first notion is called strongly plateaued will be made clear with
Corollary 18 below. The two first notions are independent in the sense that none is a
particular case of the other (there exist indeed strongly plateaued vectorial functions with
different amplitudes and plateaued functions with single amplitude that are not strongly
plateaued). Both are a particular case of the third. Quadratic functions (which are all strongly
plateaued) can have components with different amplitudes (this is the case for instance of
the Gold functions $x^{2^i + 1}$, $\gcd(i, n) = 1$, for $n$ even). They can also have single amplitude
(this is the case of Gold functions for $n$ odd). Of course, the two definitions of plateaued
functions and of plateaued with single amplitude functions coincide for Boolean functions.
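The behaviour of the Gold function $x^3$ (that is, $i = 1$) in odd and even dimension can be observed on small fields. The following is a minimal sketch, assuming the field representations $\mathbb{F}_8 = \mathbb{F}_2[t]/(t^3 + t + 1)$ and $\mathbb{F}_{16} = \mathbb{F}_2[t]/(t^4 + t + 1)$ and the inner product $v \cdot F = \mathrm{tr}_n(vF)$; the function `component_amplitudes` is a hypothetical helper that returns the sorted set of amplitudes of the components, or `None` if some component is not plateaued.

```python
def gf_mul(a, b, n, poly):
    """Multiplication in F_{2^n} defined by the irreducible polynomial `poly`."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
    return r

def gf_pow(a, e, n, poly):
    """Square-and-multiply exponentiation in F_{2^n}."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n, poly)
        a = gf_mul(a, a, n, poly)
        e >>= 1
    return r

def trace(z, n, poly):
    """Absolute trace tr_n(z), returned as 0 or 1."""
    t, c = 0, z
    for _ in range(n):
        t ^= c
        c = gf_mul(c, c, n, poly)
    return t

def component_amplitudes(n, poly, d):
    """Amplitudes of the components tr_n(v x^d), v != 0 (None if one is not plateaued)."""
    size = 1 << n
    F = [gf_pow(x, d, n, poly) for x in range(size)]
    tr = [trace(z, n, poly) for z in range(size)]
    amplitudes = set()
    for v in range(1, size):
        comp = [tr[gf_mul(v, F[x], n, poly)] for x in range(size)]
        values = set()
        for u in range(size):
            w = sum((-1) ** (comp[x] ^ tr[gf_mul(u, x, n, poly)]) for x in range(size))
            if w:
                values.add(abs(w))
        if len(values) != 1:
            return None
        amplitudes |= values
    return sorted(amplitudes)

print(component_amplitudes(3, 0b1011, 3))    # x^3 over F_8  (n odd)
print(component_amplitudes(4, 0b10011, 3))   # x^3 over F_16 (n even)
```

For $n = 3$ a single amplitude $2^{(n+1)/2} = 4$ appears, while for $n = 4$ both $2^{n/2} = 4$ (bent components) and $2^{n/2+1} = 8$ occur.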
Note that, since the Walsh transform values of plateaued $(n, m)$-functions are divisible by
$2^{\lceil n/2 \rceil}$ and the Walsh transform of $F$ equals the Fourier transform of the indicator $1_{G_F}$ of its
graph $G_F$, the algebraic degree of $1_{G_F}$ is at most $n + m - \lceil \frac{n}{2} \rceil = \lfloor \frac{n}{2} \rfloor + m$, according to
Theorem 2, page 63. Applying Relation (2.7), page 40, we have then that, for every subset
$J$ of $\{1, \ldots, m\}$, we have $d_{alg}\big( \prod_{j \in \{1, \ldots, m\} \setminus J} (f_j \oplus 1) \big) \le \lfloor \frac{n}{2} \rfloor + m - |J|$, where the $f_j$ are
the coordinate functions of $F$. And if $F$ is plateaued with single amplitude $2^r$, then we have
$d_{alg}\big( \prod_{j \in \{1, \ldots, m\} \setminus J} (f_j \oplus 1) \big) \le n + m - r - |J|$. This gives much more information than
the single inequality $d_{alg}(F) \le \lfloor \frac{n}{2} \rfloor + 1$ (resp. $\le n - r + 1$) provided by Proposition 96,
page 259.
It has been proved in [174] that, when $n$ is a power of 2, no power plateaued $(n, n)$-
permutation exists36 and in [835] that, when $n$ is divisible by 4, no such function exists with
the Walsh spectrum $\{0, \pm 2^{\frac{n}{2} + 1}\}$.
The set of plateaued vectorial functions with single amplitude is CCZ invariant: if the
graphs $\{(x, F(x));\ x \in \mathbb{F}_2^n\}$ and $\{(x, G(x));\ x \in \mathbb{F}_2^n\}$ of two $(n, m)$-functions $F$, $G$
correspond to each other by an affine permutation of $\mathbb{F}_2^n \times \mathbb{F}_2^m$, then one is plateaued
with single amplitude if and only if the other is. The larger set of plateaued vectorial
functions is (only) EA invariant: it is indeed invariant under composition on the right by
affine automorphisms and under addition of an affine function, and it is also invariant under
composition on the left by a linear automorphism $L$ since $W_{L \circ F}(u, v) = W_F(u, L^*(v))$,
where $L^*$ is the adjoint operator of $L$.

6.5.1 Characterizations of plateaued vectorial functions


The characterization of plateaued Boolean functions by Proposition 97, page 260, has been
generalized to vectorial functions for each notion, by means of the value distributions of
their derivatives. This allowed one to derive several characterizations of APN functions in
this framework. Characterizations of plateaued vectorial functions have been also obtained
by means of their autocorrelation functions and of the power moments of their Walsh
transforms. We survey below all these results from [247].

36 A conjecture by T. Helleseth states that there is no power permutation having three Walsh transform values
when n is a power of 2.

Characterization by means of the derivatives


Applying Proposition 97, page 260, an (n, m)-function F is plateaued if and only if, for
every $v \in \mathbb{F}_2^m$, the expression $\sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot D_a D_b F(x)}$ does not depend on $x \in \mathbb{F}_2^n$, and F is
plateaued with single amplitude if and only if this sum does not depend on x nor on $v \ne 0_m$.

Theorem 18 [247] Let F be an (n, m)-function. Then:


• F is plateaued if and only if, for every $w \in \mathbb{F}_2^m$, the size of the set

$\{(a, b) \in (\mathbb{F}_2^n)^2 ;\ D_a D_b F(x) = w\}$ (6.41)

does not depend on $x \in \mathbb{F}_2^n$ (in other words, the value distribution of $D_a D_b F(x)$ when
(a, b) ranges over $(\mathbb{F}_2^n)^2$ is independent of $x \in \mathbb{F}_2^n$).
• F is plateaued with single amplitude if and only if the size of the set in (6.41) does not
depend on $x \in \mathbb{F}_2^n$, nor on $w \in \mathbb{F}_2^m$ when $w \ne 0_m$.

Moreover:
• For every (n, m)-function F , the value distribution of Da Db F (x) when (a, b) ranges
over (Fn2 )2 equals the value distribution of Da F (b) + Da F (x).
• If two plateaued functions F , G have the same such distribution, then for every v, their
component functions v · F and v · G have the same amplitude.

Proof Recall that any two integer-valued functions over $\mathbb{F}_2^n$ are equal if and only if their
Fourier transforms are equal, and that any integer-valued function is constant except at
$0_n$ if and only if its Fourier transform is constant except at $0_n$ as well. Applying this to
the functions $v \mapsto \sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot D_a D_b F(x)}$ for different values of x, we deduce that F is
plateaued if and only if, for every $w \in \mathbb{F}_2^m$, the sum
$\sum_{v \in \mathbb{F}_2^m} \sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot D_a D_b F(x) \oplus v \cdot w}$, which is equal to

$$\sum_{a,b \in \mathbb{F}_2^n} \sum_{v \in \mathbb{F}_2^m} (-1)^{v \cdot (D_a D_b F(x) + w)} = 2^m \, |\{(a, b) \in (\mathbb{F}_2^n)^2 ;\ D_a D_b F(x) = w\}|,$$

does not depend on $x \in \mathbb{F}_2^n$, and F is plateaued with single amplitude if and only if this size does
not depend on x nor on $w \ne 0_m$. This proves the first part.
By the change of variable b → b + x, we have that |{(a, b) ∈ (Fn2 )2 ; Da Db F (x) = w}|
equals |{(a, b) ∈ (Fn2 )2 ; Da Db+x F (x) = w}|, that is, |{(a, b) ∈ (Fn2 )2 ; F (x) + F (x + a) +
F (b) + F (b + a) = w}|. This proves the first item of the second part.
The last item is a direct consequence of the fact that, for a plateaued function F, the sum
$\sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot D_a D_b F(x)}$ equals the square of the amplitude of $v \cdot F$.

It is observed in [194, 196] that
$|\{(a, b, x) \in (\mathbb{F}_2^n)^3 ;\ D_a D_b F(x) = 0_n,\ a \ne 0_n,\ b \ne 0_n,\ a \ne b\}| \le (2^n - 1)\big(\max_{u,v \in \mathbb{F}_2^n,\, v \ne 0_n} W_F(u, v)^2 - 2^{n+1}\big)$,
for every (n, n)-function F, with
equality if and only if F is plateaued with single amplitude.
Note that the algebraic degree d = 2 (for which the first item of Theorem 18 is
straightforwardly satisfied since the second-order derivatives are then constant) is the only
one for which all functions of algebraic degree at most d are plateaued, since we know that

cubic Boolean functions can have values of the Walsh transform at 0n different from 0 and
from powers of 2 (see Section 5.3, page 180), and therefore be nonplateaued.
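
As a hedged illustration of this remark, the following Python sketch computes by brute force the Walsh spectrum of the cubic function f(x) = x1 x2 x3 in three variables. The helper name `walsh_spectrum` and the encoding of inputs as integers (bit i of x is coordinate i) are our own conventions for the example, not notation from the text.

```python
def walsh_spectrum(truth_table, n):
    """Return [W_f(u) for u = 0 .. 2^n - 1], with W_f(u) = sum_x (-1)^(f(x) + u.x)."""
    spectrum = []
    for u in range(1 << n):
        total = 0
        for x in range(1 << n):
            dot = bin(u & x).count("1") & 1      # inner product u.x over F_2
            total += -1 if (truth_table[x] ^ dot) else 1
        spectrum.append(total)
    return spectrum

n = 3
# f(x) = x1 x2 x3 takes the value 1 only at the all-one input.
cubic = [1 if x == 0b111 else 0 for x in range(1 << n)]
spectrum = walsh_spectrum(cubic, n)
abs_values = sorted({abs(w) for w in spectrum})
print(spectrum[0], abs_values)
```

The value at 0_n is 6, which is neither 0 nor a power of 2, and the spectrum has two distinct nonzero absolute values, so this cubic function is not plateaued.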

Examples
1. Almost bent (AB) functions (see Definition 31, page 119) are an example of plateaued
functions with single amplitude. The distribution of values of the second-order derivatives
in Relation (6.41) is as follows: the equation $D_a D_b F(x) = w$ has $3 \cdot 2^n - 2$ solutions
(a, b) for any x if $w = 0_n$ and $2^n - 2$ solutions if $w \ne 0_n$ (see Corollary 27, page 377).
Conversely, any function having this property is AB.
2. Let n now be even and $F(x) = x^{2^i+1}$ be a Gold APN function, (i, n) = 1. We have
$D_a D_b F(x) = a^{2^i} b + a b^{2^i}$. The number of solutions (a, b) of $D_a D_b F(x) = 0$ equals
again $3 \cdot 2^n - 2$ (as for any APN function), and for $w \ne 0$ the number of solutions (a, b)
of $D_a D_b F(x) = w$ is constant when w ranges over a coset of the multiplicative group of
all cubes in $\mathbb{F}_{2^n}^*$, since for every $\lambda \in \mathbb{F}_{2^n}^*$, $(\lambda a)^{2^i} (\lambda b) + (\lambda a)(\lambda b)^{2^i} = \lambda^{2^i+1} (a^{2^i} b + a b^{2^i})$
and $\lambda \mapsto \lambda^{2^i+1}$ is three-to-one over $\mathbb{F}_{2^n}^*$ and has the group of cubes for range. This allows
taking w = 1 without loss of generality when w is a cube, and $a^{2^i} b + a b^{2^i} = 1$ is
equivalent when $b \ne 0$ to $\left(\frac{a}{b}\right)^{2^i} + \frac{a}{b} = \frac{1}{b^{2^i+1}}$ and has two solutions a for every b such
that $\frac{1}{b^{2^i+1}}$ has null trace (and none otherwise). The number of such nonzero b equals
$2^{n-1} \pm 2^{n/2} - 1$ since $f(x) = tr_n(x^{2^i+1})$ has the same Hamming weight as $tr_n(x^3)$,
which is $2^{n-1} \pm 2^{n/2}$ according to Carlitz’ result recalled at page 177. When w is not a
cube, $a^{2^i} b + a b^{2^i} = w$ is equivalent when $b \ne 0$ to $\left(\frac{a}{b}\right)^{2^i} + \frac{a}{b} = \frac{w}{b^{2^i+1}}$ and has two
solutions a for every b such that $\frac{w}{b^{2^i+1}}$ has null trace. The number of such nonzero b
equals $2^{n-1} \pm 2^{n/2-1} - 1$ since $tr_n(w b^{2^i+1})$ is bent (see page 206). Hence the number of
solutions (a, b) of $D_a D_b F(x) = w$ equals

$$\begin{cases} 3 \cdot 2^n - 2 & \text{for } w = 0, \\ 2^n \pm 2^{n/2+1} - 2 & \text{for } w \text{ a nonzero cube } \left(\tfrac{2^n-1}{3} \text{ cases}\right), \\ 2^n \pm 2^{n/2} - 2 & \text{for } w \text{ a non-cube } \left(2 \cdot \tfrac{2^n-1}{3} \text{ cases}\right), \end{cases}$$

where, among the two “±” above, one is a “+” and one is a “−.” We shall see below that
the Kasami APN functions (see page 400) have the same distribution.
3. The case of functions $F(x, y) = (x\pi(y) + \phi(y),\ x(\pi(y))^{2^i} + \psi(y))$ (which are plateaued
(n, n)-functions when π is a permutation, as we shall see in Proposition 115, page 282)
is studied in [247].
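
The distribution in Example 2 can be checked numerically in the smallest nontrivial case. The sketch below is only an illustration under our own assumptions: it takes n = 4 and i = 1 (so F(x) = x^3), represents $\mathbb{F}_{2^4}$ as polynomials modulo x^4 + x + 1 (a representation we chose), and uses hypothetical helper names `gf_mul`, `gold` and `second_derivative_distribution`.

```python
N = 4
MODULUS = 0b10011  # x^4 + x + 1, irreducible over F_2 (our choice of representation)

def gf_mul(a, b):
    """Multiply two elements of F_{2^N} stored as integers (polynomial basis)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # reduce as soon as the degree reaches N
            a ^= MODULUS
    return result

def gold(x):
    """Gold function with i = 1: F(x) = x^(2^1 + 1) = x^3."""
    return gf_mul(gf_mul(x, x), x)

def second_derivative_distribution(x):
    """counts[w] = number of pairs (a, b) with D_a D_b F(x) = w."""
    size = 1 << N
    counts = [0] * size
    for a in range(size):
        for b in range(size):
            w = gold(x) ^ gold(x ^ a) ^ gold(x ^ b) ^ gold(x ^ a ^ b)
            counts[w] += 1
    return counts

reference = second_derivative_distribution(0)
independent_of_x = all(
    second_derivative_distribution(x) == reference for x in range(1 << N)
)
cubes = {gold(x) for x in range(1, 1 << N)}
print(independent_of_x, reference[0], sorted(set(reference)))
```

For n = 4 the three cases of the formula give 3·16 − 2 = 46 for w = 0, 16 − 8 − 2 = 6 for the 5 nonzero cubes, and 16 + 4 − 2 = 18 for the 10 non-cubes, which sum to 256 as they must.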

A particular case where the condition of Theorem 18 is satisfied is when, for each fixed
value of a, the value distribution of the function b → Da Db F (x) is independent of x. It
is easily seen, as in the proof of Theorem 18, that an (n, m)-function F has this property if
and only if all of its component functions have it, and that, for every Boolean function f ,
the size, for every $a \in \mathbb{F}_2^n$, $w \in \mathbb{F}_2$, of the set $\{b \in \mathbb{F}_2^n ;\ D_a f(b) = D_a f(x) + w\}$ does not
depend on x if and only if the derivatives of f are either constant or balanced, that is, f is
partially-bent. The condition is indeed sufficient, and it is necessary because if $D_a f$ is not
constant, then it means that $|\{b \in \mathbb{F}_2^n ;\ D_a f(b) = 0\}| = |\{b \in \mathbb{F}_2^n ;\ D_a f(b) = 1\}|$. Hence:

Corollary 18 [247] A vectorial function F is strongly plateaued if and only if, for every a
in Fn2 and every w, the size of the set {b ∈ Fn2 ; Da Db F (x) = w} does not depend on x ∈ Fn2 ,
or equivalently the size of the set {b ∈ Fn2 ; Da F (b) = Da F (x) + w} does not depend on
x ∈ Fn2 .

Proposition 107 [247] For every strongly plateaued (n, m)-function F, the image set
$\mathrm{Im}(D_a F) = (D_a F)(\mathbb{F}_2^n)$ of any derivative $D_a F$ is an affine space.

Proof By hypothesis, every derivative $D_a F$ of F matches the same number of times any
two values $D_a F(x) + w$ and $D_a F(y) + w$. Hence, it matches at least once $D_a F(x) + w$ (i.e.,
we have $w \in D_a F(x) + \mathrm{Im}(D_a F)$) if and only if it matches at least once $D_a F(y) + w$ (i.e.,
we have $w \in D_a F(y) + \mathrm{Im}(D_a F)$). Hence, the set $\mathrm{Im}(D_a F)$ is invariant under translation
by any element of $\mathrm{Im}(D_a F) + \mathrm{Im}(D_a F)$ and is then an affine space.

Crooked functions According to Proposition 107, if F is a strongly plateaued APN


(n, n)-function, then it is a so-called crooked function, in the sense37 of [80, 727, 729]:

Definition 68 An (n, n)-function F is called crooked if, for every nonzero a, the set
{Da F (x); x ∈ Fn2 } is an affine hyperplane (i.e., a linear hyperplane or its complement).
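
To make Definition 68 concrete, here is a small sketch that tests crookedness by brute force on one example, the Gold function x^3 over $\mathbb{F}_{2^3}$. The field representation (polynomials modulo x^3 + x + 1) and the helper names `gf_mul`, `cube`, `derivative_image` and `is_affine_subspace` are assumptions made for this illustration only.

```python
N = 3
MODULUS = 0b1011  # x^3 + x + 1, irreducible over F_2 (our choice of representation)

def gf_mul(a, b):
    """Multiply two elements of F_{2^N} stored as integers (polynomial basis)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MODULUS
    return result

def cube(x):
    """Quadratic APN power function F(x) = x^3."""
    return gf_mul(gf_mul(x, x), x)

def derivative_image(a):
    """The set {D_a F(x) ; x in F_{2^N}}."""
    return {cube(x ^ a) ^ cube(x) for x in range(1 << N)}

def is_affine_subspace(points):
    """A nonempty subset of F_2^N is affine iff it is closed under x + y + z."""
    return all((x ^ y ^ z) in points for x in points for y in points for z in points)

images = {a: derivative_image(a) for a in range(1, 1 << N)}
is_crooked = all(
    len(image) == 1 << (N - 1) and is_affine_subspace(image)
    for image in images.values()
)
print(is_crooked)
```

An affine subspace of size 2^(n−1) is an affine hyperplane, so the check returns True here; since n = 3 is odd and x^3 is a permutation, none of these hyperplanes contains 0.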

Conversely, crooked functions are strongly plateaued (and APN), i.e., their component
functions are partially-bent [247, 252, 730], because the affine hyperplane $\{D_a F(x),\ x \in \mathbb{F}_2^n\}$
is matched twice and the function $y \mapsto v \cdot y$ restricted to an affine hyperplane is either
constant or balanced for every v. This allows us to show more directly some results that were
first obtained in [729, 730]: crooked functions are plateaued, and for n odd, they are then
AB (since we know that “plateaued APN” implies AB for n odd; see Proposition 163, page
382). Their component functions being partially-bent, they all satisfy $N_{\Delta_{v \cdot F}} \times N_{W_{v \cdot F}} = 2^n$
(see page 256); therefore, in the case of n odd, we have $N_{\Delta_{v \cdot F}} = 2$ for every $v \ne 0_n$, that
is, there exists a unique $a \ne 0_n$ such that $\Delta_{v \cdot F}(a) \ne 0$, i.e., $\{D_a F(x),\ x \in \mathbb{F}_2^n\} = \{0_n, v\}^\perp$
or $\{D_a F(x),\ x \in \mathbb{F}_2^n\} = \mathbb{F}_2^n \setminus \{0_n, v\}^\perp$. And for every n, a function F is crooked if and
only if, for every $a \ne 0_n$, there exists a unique $v \ne 0_n$ such that $W_{D_a F}(0_n, v) \ne 0$ and
then $W_{D_a F}(0_n, v) = 2^n$ and $\{D_a F(x),\ x \in \mathbb{F}_2^n\} = \{0_n, v\}^\perp$, or $W_{D_a F}(0_n, v) = -2^n$ and
$\{D_a F(x),\ x \in \mathbb{F}_2^n\} = \mathbb{F}_2^n \setminus \{0_n, v\}^\perp$. Indeed, a set E is an affine hyperplane if and only if
there exists a unique $v \ne 0_n$ such that $\sum_{y \in E} (-1)^{v \cdot y}$ equals $\pm|E|$ and such sum is null
for any other nonzero v. This characterization can be expressed by means of the Walsh transform of
F since $W_{D_a F}(0_n, v) = \Delta_{v \cdot F}(a) = 2^{-n} \sum_{u \in \mathbb{F}_2^n} W_F^2(u, v) (-1)^{u \cdot a}$.
Of course, all quadratic APN functions are crooked; the question of knowing whether
nonquadratic crooked functions exist is open. It is proved in [728, 729] that the reply is no
for power functions (monomials) and in [80] that it is no for binomials.

37 This is nowadays the most used definition of crooked functions, but originally in [57], they were defined such
that, for every nonzero a, the set {Da F (x); x ∈ Fn2 } is the complement of a linear hyperplane; this restricted
definition required that crooked functions be bijective; they were also AB. Some authors call “generalized
crooked” the functions we call crooked here.

Assuming that $F(0_n) = 0_n$, it is proved in [57] that the set $H_a = \{D_a F(x);\ x \in \mathbb{F}_2^n\}$
is the complement of a linear hyperplane for every nonzero a (i.e., F is crooked in the
original restricted sense of [57]) if and only if F is APN and for every nonzero a, we have
$D_a F(x) + D_a F(y) + D_a F(z) \ne 0_n$ for every x, y, z; n is then necessarily odd. Then, F is
bijective (take x = y = z) and AB, and we have seen that all the sets $H_a$, for $a \ne 0_n$, are
distinct (and therefore every complement of a linear hyperplane equals $H_a$ for some unique
$a \ne 0_n$). More characterizations are given in [539], in relation with nonlinear codes. Note
that crookedness may represent a weakness; see [200].

The case of power functions It is often simpler to consider power functions than general
functions. In the case of plateaued functions, we have:

Corollary 19 [247] Let $F(x) = x^d$ be any power function. Then, for every $w \in \mathbb{F}_{2^n}$, every
$x \in \mathbb{F}_{2^n}$, and every $\lambda \in \mathbb{F}_{2^n}^*$, $|\{(a, b) \in \mathbb{F}_{2^n}^2 ;\ D_a F(b) + D_a F(x) = w\}|$ equals
$|\{(a, b) \in \mathbb{F}_{2^n}^2 ;\ D_a F(b) + D_a F(x/\lambda) = w/\lambda^d\}|$ and $|\{(a, b) \in \mathbb{F}_{2^n}^2 ;\ D_a F(b) + D_a F(0) = w\}|$ is
invariant when w is multiplied by any dth power in $\mathbb{F}_{2^n}^*$. Then:
• F is plateaued if and only if, for every $w \in \mathbb{F}_{2^n}$,

$|\{(a, b) \in \mathbb{F}_{2^n}^2 ;\ D_a F(b) + D_a F(1) = w\}| = |\{(a, b) \in \mathbb{F}_{2^n}^2 ;\ D_a F(b) + D_a F(0) = w\}|.$
• F is plateaued with single amplitude if and only if additionally this common size does not
depend on $w \ne 0$.

If d is co-prime with $2^n - 1$, then F is plateaued if and only if it is plateaued with single
amplitude.

This is a more or less direct consequence of the fact that, for every $\lambda \ne 0$, we have
$D_{\lambda a} F(\lambda x) = \lambda^d D_a F(x)$.

The case of unbalanced components In the particular case where all the component
functions of a function are unbalanced (we shall see that this is for instance the case of
all APN power functions $x^d$ when n is even, since they satisfy, as proved by Dobbertin,
see Proposition 165, page 385, that $\gcd(d, 2^n - 1) = 3$), plateauedness is simpler to study
because, for each v, the value of $|W_F(0_n, v)|$, being nonzero, equals the amplitude of the
component function $v \cdot F$. Hence, according to Proposition 97, page 260, if F is plateaued
with unbalanced components then, for every v, x, the sum $\sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot D_a D_b F(x)}$ equals
$W_F^2(0_n, v) = \sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot (F(a) + F(b))}$. The converse is straightforward too since, when
constant, $\sum_{a,b \in \mathbb{F}_2^n} (-1)^{v \cdot D_a D_b F(x)}$ is equal to the squared amplitude and cannot then be null,
and this gives by the same method as in the proof of Theorem 18:

Theorem 19 [247] Let F be any (n, m)-function. Then F is plateaued with component
functions all unbalanced if and only if, for every $w \in \mathbb{F}_2^m$ and $x \in \mathbb{F}_2^n$, we have

$|\{(a, b) \in (\mathbb{F}_2^n)^2 ;\ D_a D_b F(x) = w\}| = |\{(a, b) \in (\mathbb{F}_2^n)^2 ;\ F(a) + F(b) = w\}|.$

Moreover, F is then plateaued with single amplitude if and only if, additionally, this common
value does not depend on w for $w \ne 0_m$.

This theorem will have interesting consequences in Subsection 11.3, page 371.

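Characterization by means of the autocorrelation functions and related value distributions

We have seen in Proposition 98, page 260, that a Boolean function f is plateaued of
amplitude λ if and only if $\sum_{a \in \mathbb{F}_2^n} \Delta_f(a) \Delta_f(a + x) = \lambda^2 \Delta_f(x)$. To be able to deduce a
characterization of plateaued vectorial functions, we need to eliminate $\lambda^2$ from this relation.
The value of λ can be obtained from this same relation, with $x = 0_n$:
$\sum_{a \in \mathbb{F}_2^n} \Delta_f^2(a) = \lambda^2 \Delta_f(0_n) = \lambda^2 2^n$. Hence, if f is plateaued, then
$2^n\, \Delta_f \otimes \Delta_f = \big[\sum_{a \in \mathbb{F}_2^n} \Delta_f^2(a)\big] \Delta_f$.
Conversely, if $2^n\, \Delta_f \otimes \Delta_f = \big(\sum_{a \in \mathbb{F}_2^n} \Delta_f^2(a)\big) \Delta_f$ then $\Delta_f \otimes \Delta_f = \lambda^2 \Delta_f$, where
$\sum_{a \in \mathbb{F}_2^n} \Delta_f^2(a) = \lambda^2 2^n$, and f is plateaued of amplitude λ. We deduce:

Proposition 108 Any (n, m)-function F is plateaued if and only if, for every $x \in \mathbb{F}_2^n$ and
every $v \in \mathbb{F}_2^m$, we have

$$2^n \sum_{a \in \mathbb{F}_2^n} \Delta_{v \cdot F}(a)\, \Delta_{v \cdot F}(a + x) = \Big[\sum_{a \in \mathbb{F}_2^n} \Delta_{v \cdot F}^2(a)\Big] \Delta_{v \cdot F}(x).$$

It is plateaued with single amplitude λ if and only if, for every $x \in \mathbb{F}_2^n$ and every $v \in \mathbb{F}_2^m$, we
have

$$\sum_{a \in \mathbb{F}_2^n} \Delta_{v \cdot F}(a)\, \Delta_{v \cdot F}(a + x) = \lambda^2 \Delta_{v \cdot F}(x).$$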

Characterization by means of power moments of the Walsh transform


We have seen in Proposition 99, page 261, that any n-variable Boolean function f is
plateaued if and only if, for every nonzero $\alpha \in \mathbb{F}_2^n$, we have $\sum_{u \in \mathbb{F}_2^n} W_f(u + \alpha)\, W_f^3(u) = 0$.
We deduce:

Proposition 109 [247] Any (n, m)-function F is plateaued if and only if

$$\forall v \in \mathbb{F}_2^m,\ \forall \alpha \in \mathbb{F}_2^n,\ \alpha \ne 0_n, \quad \sum_{u \in \mathbb{F}_2^n} W_F(u + \alpha, v)\, W_F^3(u, v) = 0.$$

F is plateaued with single amplitude if and only if, additionally, $\sum_{u \in \mathbb{F}_2^n} W_F^4(u, v)$ does not
depend on v for $v \ne 0_m$.

We deduce also from Corollary 17, page 261:

Corollary 20 [247] Any (n, m)-function F is plateaued if and only if, for every $b \in \mathbb{F}_2^n$
and every $v \in \mathbb{F}_2^m$,

$$\sum_{a \in \mathbb{F}_2^n} W_F^4(a, v) = 2^n (-1)^{v \cdot F(b)} \sum_{a \in \mathbb{F}_2^n} (-1)^{a \cdot b}\, W_F^3(a, v).$$

And F is plateaued with single amplitude if and only if, additionally, these sums do not
depend on v, for $v \ne 0_m$.

We have seen that plateaued functions can be characterized by the constancy of the ratio
of two consecutive Walsh power moments of even orders [858].
We deduce from Proposition 100, page 261:

Proposition 110 [247, 858] For every (n, m)-function F, and every $k \in \mathbb{N}^*$, we have

$$\sum_{v \in \mathbb{F}_2^m} \Big(\sum_{a \in \mathbb{F}_2^n} W_F^{2k+2}(a, v)\Big)^2 \le \sum_{v \in \mathbb{F}_2^m} \Big(\sum_{a \in \mathbb{F}_2^n} W_F^{2k}(a, v)\Big) \Big(\sum_{a \in \mathbb{F}_2^n} W_F^{2k+4}(a, v)\Big),$$

with equality if and only if F is plateaued.
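
The inequality can be observed term by term on single Boolean functions, which is the situation of each component v·F in the sum above. The sketch below is an illustration only: `walsh_spectrum`, `power_moment` and `moment_gap` are hypothetical helper names, and the two test functions (a bent quadratic function in 4 variables and the cubic x1 x2 x3) are our own choice.

```python
def walsh_spectrum(truth_table, n):
    """Return [W_f(u) for u = 0 .. 2^n - 1], with W_f(u) = sum_x (-1)^(f(x) + u.x)."""
    spectrum = []
    for u in range(1 << n):
        total = 0
        for x in range(1 << n):
            dot = bin(u & x).count("1") & 1
            total += -1 if (truth_table[x] ^ dot) else 1
        spectrum.append(total)
    return spectrum

def power_moment(spectrum, order):
    """Sum over u of W_f(u)^order."""
    return sum(w ** order for w in spectrum)

def moment_gap(truth_table, n, k=1):
    """M_{2k} * M_{2k+4} - M_{2k+2}^2, which is >= 0 by the Cauchy-Schwarz inequality."""
    s = walsh_spectrum(truth_table, n)
    return (power_moment(s, 2 * k) * power_moment(s, 2 * k + 4)
            - power_moment(s, 2 * k + 2) ** 2)

# x1 x2 + x3 x4 is bent (hence plateaued); x1 x2 x3 is not plateaued.
bent = [((x & 1) & (x >> 1 & 1)) ^ ((x >> 2 & 1) & (x >> 3 & 1)) for x in range(16)]
cubic = [1 if x == 0b111 else 0 for x in range(8)]
print(moment_gap(bent, 4), moment_gap(cubic, 3))
```

The gap is zero for the plateaued function and strictly positive for the non-plateaued one, matching the equality case of the proposition.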

See more in [247, 858].

6.5.2 CCZ and EA equivalence of plateaued functions


In [247], the author deduced from Theorem 18, page 276, the following:

Corollary 21 Let n be any even integer, n ≥ 4. Let F be an (n, n)-function CCZ equivalent
to a Gold APN function $G(x) = x^{2^i+1}$ or to a Kasami APN function $G(x) = x^{4^i - 2^i + 1}$,
(i, n) = 1. Then F is plateaued if and only if it is EA equivalent to G(x).

This result has been later generalized in [1141] by S. Yoshiara:

Proposition 111 Let F and G be plateaued APN functions on F2n with n even. Assume
that F is a power function, then it is CCZ equivalent to G if and only if F is EA equivalent
to G.

This same author had proved in [1139]:

Proposition 112 Two quadratic APN functions are CCZ equivalent if and only if they are
EA equivalent.

and in [1140]:

Proposition 113 For any n ≥ 3, two power APN functions $x^d$ and $x^e$ over $\mathbb{F}_{2^n}$ are CCZ
equivalent if and only if there is an integer a such that 0 ≤ a ≤ n − 1 and either $e \equiv 2^a d \pmod{2^n - 1}$
or $de \equiv 2^a \pmod{2^n - 1}$, where the latter case occurs only when n is odd.

Proposition 114 Any quadratic APN function is CCZ equivalent to a power APN function
if and only if it is EA equivalent to one of the Gold APN functions.

From Proposition 113 are deduced all cases of CCZ equivalence/inequivalence between
the known APN functions; see Proposition 177.

6.5.3 Constructions of plateaued vectorial functions


Primary constructions
All quadratic functions are plateaued. The Maiorana–McFarland construction F (x, y) =
xπ(y) + φ(y); x, y ∈ F2m , allows constructing nonquadratic ones; it gives a plateaued
(2m, m)-function when π is a permutation (F is then bent) and when π is 2-to-1, φ being
any (m, m)-function in both cases. Erasing some coordinates from their output provides
plateaued (n, m)-functions with m ≤ n/2.
We recall from [247] an example of primary construction of plateaued (n, n)-functions
also based on the Maiorana–McFarland construction. Let π be a permutation of $\mathbb{F}_{2^m}$ and
φ, ψ two functions from $\mathbb{F}_{2^m}$ to $\mathbb{F}_{2^m}$. Let i be an integer coprime with m. We define the
(2m, 2m)-function $F(x, y) = (x\pi(y) + \phi(y),\ x(\pi(y))^{2^i} + \psi(y)) \in \mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$. For every
element $(a, b) \in \mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$, the Walsh transform at (a, b) of $(u, v) \cdot F(x, y)$ equals

$$\sum_{x, y \in \mathbb{F}_{2^m}} (-1)^{tr_m(u x \pi(y) + v x (\pi(y))^{2^i} + u\phi(y) + v\psi(y) + ax + by)}$$
$$= \sum_{y \in \mathbb{F}_{2^m}} (-1)^{tr_m(u\phi(y) + v\psi(y) + by)} \Big(\sum_{x \in \mathbb{F}_{2^m}} (-1)^{tr_m((u\pi(y) + v(\pi(y))^{2^i} + a)x)}\Big)$$
$$= 2^m \sum_{y \in \mathbb{F}_{2^m};\ u\pi(y) + v(\pi(y))^{2^i} = a} (-1)^{tr_m(u\phi(y) + v\psi(y) + by)}.$$

The number of solutions of the equation $u\pi(y) + v(\pi(y))^{2^i} = a$ equals the number of
solutions of the linear equation $uy + vy^{2^i} = a$. If u = 0 and $v \ne 0$, or if $u \ne 0$ and v = 0,
the number of solutions of this equation equals 1; hence, $(u, v) \cdot F$ is plateaued of amplitude
$2^m$ (i.e., is bent). If $u \ne 0$ and $v \ne 0$, this number either equals 0 or equals the number
of solutions of the associated homogeneous equation $uy + vy^{2^i} = 0$, that is, 2 (indeed,
$uy + vy^{2^i} = 0$ is equivalent to y = 0 or $y^{2^i - 1} = u/v \ne 0$ and, i being coprime with m,
$2^i - 1$ is coprime with $2^m - 1$); hence, $(u, v) \cdot F(x, y)$ is plateaued of amplitude $2^{m+1}$ (i.e.,
is semi-bent). Then:

Proposition 115 [247] Let m be a positive integer, π a permutation of $\mathbb{F}_{2^m}$, and φ, ψ two
functions from $\mathbb{F}_{2^m}$ to $\mathbb{F}_{2^m}$. Let i be an integer coprime with m. Then the function
$F(x, y) = (x\pi(y) + \phi(y),\ x(\pi(y))^{2^i} + \psi(y))$ is plateaued (but does not have single amplitude).
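
The two amplitudes of Proposition 115 can be observed on a toy instance. The following sketch is a hedged illustration under our own assumptions: m = 3, i = 1, π the identity, φ = ψ = 0, and $\mathbb{F}_{2^3}$ represented modulo x^3 + x + 1; the names `gf_mul`, `MUL`, `TR` and `component_abs_values` are hypothetical helpers, not notation from the text.

```python
M = 3
SIZE = 1 << M
MODULUS = 0b1011  # x^3 + x + 1, irreducible over F_2 (our choice of representation)

def gf_mul(a, b):
    """Multiply two elements of F_{2^M} stored as integers (polynomial basis)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MODULUS
    return result

# Precomputed multiplication table and absolute trace tr_m(z) = z + z^2 + z^4.
MUL = [[gf_mul(a, b) for b in range(SIZE)] for a in range(SIZE)]
TR = [z ^ MUL[z][z] ^ MUL[MUL[z][z]][MUL[z][z]] for z in range(SIZE)]

def component_abs_values(u, v):
    """Absolute Walsh values of (u, v).F for F(x, y) = (x y, x y^2)."""
    values = set()
    for a in range(SIZE):
        for b in range(SIZE):
            total = 0
            for x in range(SIZE):
                for y in range(SIZE):
                    y2 = MUL[y][y]  # pi(y)^(2^i) with pi = identity and i = 1
                    z = MUL[u][MUL[x][y]] ^ MUL[v][MUL[x][y2]] ^ MUL[a][x] ^ MUL[b][y]
                    total += -1 if TR[z] else 1
            values.add(abs(total))
    return values

one_zero = [component_abs_values(u, v)
            for u in range(SIZE) for v in range(SIZE) if (u == 0) != (v == 0)]
both_nonzero = [component_abs_values(u, v)
                for u in range(1, SIZE) for v in range(1, SIZE)]
print(one_zero[0], both_nonzero[0])
```

Components with exactly one of u, v equal to zero are bent (all Walsh values ±2^3), while components with u and v both nonzero take values in {0, ±2^4}, so the function is plateaued with two different amplitudes.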

Of course, this observation more generally applies when $(\pi(y))^{2^i}$ is replaced by any other
permutation $\pi'(y)$ such that, for every $u \ne 0$, $v \ne 0$, the equation $u\pi(y) + v\pi'(y) = a$ has
0 or a fixed number (depending on u and v only) of solutions.

There exist other examples of nonquadratic plateaued (n, n)-functions, such as AB
functions (see page 395) and Kasami APN functions in even dimension (see page 400);
erasing some coordinates gives plateaued (n, m)-functions with n/2 < m ≤ n.

Secondary constructions
Let r, s, t, p be positive integers. Let F be a plateaued (r, t)-function and G a plateaued
(s, p)-function; then the function $H(x, y) = (F(x), G(y))$; $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$, is a plateaued
(r + s, t + p)-function. Indeed, for every $(a, b) \in \mathbb{F}_2^r \times \mathbb{F}_2^s$ and every $(u, v) \in \mathbb{F}_2^t \times \mathbb{F}_2^p$, we
have: $W_H((a, b), (u, v)) = W_F(a, u)\, W_G(b, v)$. Note that this works even if u or v is null,
but such a function never has single amplitude, except when F and G are affine.
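
The product relation for the Walsh transform of H can be verified exhaustively on small functions. In the sketch below, the lookup-table encoding, the helper names `dot`, `walsh` and `is_plateaued`, and the particular choices F(x) = x1 x2 and G(y) = (y1 y2, y1) are assumptions made for illustration.

```python
def dot(a, b):
    """Inner product over F_2 of two bit vectors stored as integers."""
    return bin(a & b).count("1") & 1

def walsh(table, n, a, u):
    """W_F(a, u) = sum over x of (-1)^(u.F(x) + a.x), F given as a lookup table."""
    return sum(-1 if dot(u, table[x]) ^ dot(a, x) else 1 for x in range(1 << n))

def is_plateaued(table, n, m):
    """Every component u.F has at most one nonzero absolute Walsh value."""
    for u in range(1 << m):
        amplitudes = {abs(walsh(table, n, a, u)) for a in range(1 << n)} - {0}
        if len(amplitudes) > 1:
            return False
    return True

r, t = 2, 1   # F is an (r, t)-function
s, p = 2, 2   # G is an (s, p)-function
F = [(x & 1) & (x >> 1) for x in range(1 << r)]                      # F(x) = x1 x2
G = [((y & 1) & (y >> 1)) | ((y & 1) << 1) for y in range(1 << s)]   # G(y) = (y1 y2, y1)

# H(x, y) = (F(x), G(y)); x occupies the low r input bits, F(x) the low t output bits.
H = [F[z & ((1 << r) - 1)] | (G[z >> r] << t) for z in range(1 << (r + s))]

product_rule_holds = all(
    walsh(H, r + s, a | (b << r), u | (v << t)) == walsh(F, r, a, u) * walsh(G, s, b, v)
    for a in range(1 << r) for b in range(1 << s)
    for u in range(1 << t) for v in range(1 << p)
)
print(product_rule_holds, is_plateaued(H, r + s, t + p))
```

Since the relation holds for all (a, b) and (u, v), every component of H inherits a single amplitude from the corresponding components of F and G.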
7

Correlation immune and resilient functions

The notion of correlation immune Boolean function is due to Siegenthaler [1041] as a


criterion for resistance to his correlation attack on the combiner model of stream cipher, as
we saw at page 86. Balanced correlation immune functions have soon been called resilient
after [370], which dealt with another cryptographic issue: the bit extraction problem. It has
been later observed in [181] that the notion of correlation immune Boolean function already
existed in combinatorics (in a wider framework) under another name, since the support of
a correlation immune function is an orthogonal array (see Definition 22, page 86). Resilient
functions have been extensively studied in the 1990s in relation with nonlinearity. But in
2003 were invented the fast algebraic attack [388] and the Rønjom–Helleseth attack [1003],
which are very efficient against stream ciphers using nonlinear functions whose algebraic
degrees are not large. Since correlation immune and resilient functions have algebraic
degree bounded from above, this made them weak. But, as we already recalled at page 147,
correlation immune and resilient Boolean functions can be employed for secret sharing, as
shown in [461]. Recently, the interest of correlation immune functions has been also renewed
in the framework of side-channel attacks (see [286] and Section 12.1). The functions need
then to have low Hamming weight (and this excludes resilient functions).

7.1 Correlation immune and resilient Boolean functions


For the convenience of the reader, we recall in the next definition what we have seen in
Section 3.1, page 86, on correlation immune and resilient functions.

Definition 69 Let n be a positive integer and t ≤ n a nonnegative integer. An n-variable


Boolean function f is called a t-th order correlation immune (t-CI) function if its output
distribution probability (i.e., the density of the support) is unaltered when at most t (or,
equivalently, exactly t) of its input bits are kept constant, that is, if the code equal to its
support has dual distance (see Definition 4, page 16) at least t + 1. It is called a t-resilient
function if it is balanced and t-th order correlation immune. Equivalently, f is t-th order
correlation immune if $W_f(u) = 0$, i.e. $\widehat{f}(u) = 0$, for all $u \in \mathbb{F}_2^n$ such that $1 \le w_H(u) \le t$,
and it is t-resilient if $W_f(u) = 0$ for all $u \in \mathbb{F}_2^n$ such that $w_H(u) \le t$.
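
The Walsh-transform form of Definition 69 translates directly into a brute-force test. The sketch below is a minimal illustration; the function names `walsh_spectrum`, `correlation_immunity_order` and `is_balanced`, and the truth-table encoding, are our own conventions.

```python
def walsh_spectrum(truth_table, n):
    """Return [W_f(u) for u = 0 .. 2^n - 1], with W_f(u) = sum_x (-1)^(f(x) + u.x)."""
    spectrum = []
    for u in range(1 << n):
        total = 0
        for x in range(1 << n):
            dot = bin(u & x).count("1") & 1
            total += -1 if (truth_table[x] ^ dot) else 1
        spectrum.append(total)
    return spectrum

def correlation_immunity_order(truth_table, n):
    """Largest t such that W_f(u) = 0 whenever 1 <= w_H(u) <= t."""
    spectrum = walsh_spectrum(truth_table, n)
    order = 0
    for t in range(1, n + 1):
        if any(spectrum[u] != 0 for u in range(1, 1 << n) if bin(u).count("1") == t):
            break
        order = t
    return order

def is_balanced(truth_table):
    """A function is balanced when it takes the value 1 on half of its inputs."""
    return 2 * sum(truth_table) == len(truth_table)

xor3 = [bin(x).count("1") & 1 for x in range(8)]   # x1 + x2 + x3
and2 = [0, 0, 0, 1]                                # x1 x2
print(correlation_immunity_order(xor3, 3), is_balanced(xor3))
print(correlation_immunity_order(and2, 2), is_balanced(and2))
```

The sum of three variables is balanced and 2nd order correlation immune, hence 2-resilient, whereas x1 x2 is not even 1st order correlation immune.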

This generalizes to other alphabets [178]. Note that thanks to the R-linearity of the
Fourier–Hadamard transform, the sum of t-th order correlation immune functions with
disjoint supports is a t-th order correlation immune function.

The combining functions in stream ciphers must be t-resilient with large t. As with any
cryptographic functions, they must also have high algebraic degrees (which is partially
contradictory with correlation immunity, but trade-offs can be found), high nonlinearities
(idem), and since 2003 high resistance to algebraic attacks and fast algebraic attacks (which
is problematic).
Notation: By an (n, t, d, N )- function, we mean an n-variable, t-resilient function having
algebraic degree at least d and nonlinearity at least N .

7.1.1 Bound on the correlation immunity order


The correlation immunity order of n-variable functions (i.e., the maximum d such that they
are d-CI) is unbounded (that is, it can be as high as n, since constant functions are n-th order
correlation immune), and their resiliency order is only bounded above by n − 1, since the
Boolean function $\bigoplus_{i=1}^n x_i$ is (n − 1)-resilient. In the case of unbalanced and nonconstant
correlation immune functions, the situation is different:

Proposition 116 [514] Let f be an unbalanced nonconstant t-th order correlation immune
Boolean function. Then $t \le \frac{2n}{3} - 1$.

Proof Let f be an unbalanced nonconstant t-CI Boolean function. Since f is unbalanced,
we have $W_f(0_n) \ne 0$, and since f is nonconstant, there exists $a \in \mathbb{F}_2^n$ nonzero such that
$W_f(a) \ne 0$. The Golomb–Xiao–Massey characterization (Theorem 5, page 87) gives that
$w_H(a) \ge t + 1$.
Suppose that $t > \frac{2n}{3} - 1$. By the Titsworth relation (2.51), page 61, we have:

$$\sum_{u \in \mathbb{F}_2^n} W_f(u)\, W_f(u + a) = 0. \qquad (7.1)$$

For $u = 0_n$, the summand in the left part of (7.1) equals $W_f(0_n)\, W_f(a)$, which is nonzero.
If $1 \le w_H(u) \le \frac{2}{3} n < t + 1$, then $W_f(u) = 0$. If $w_H(u) > \frac{2}{3} n$, then the vectors u and
a have more than $\frac{n}{3}$ common 1s, therefore $w_H(u + a) < \frac{2}{3} n$, so that $W_f(u + a) = 0$ unless u = a.
Thus the left-hand side of
Equation (7.1) has exactly two equal nonzero summands (for $u = 0_n$ and u = a), therefore
the equality in Equation (7.1) cannot be achieved.

7.1.2 Bounds on algebraic degree


The Siegenthaler bound states:

Proposition 117 [1041] Let n be any positive integer and let 0 ≤ t ≤ n. Any t-th order
correlation immune n-variable Boolean function has Hamming weight divisible by $2^t$ and
algebraic degree smaller than or equal to n − t. Any t-resilient function has algebraic degree
smaller than or equal to n − t − 1 if t ≤ n − 2 and to 1 (i.e., is affine) if t = n − 1. Moreover,
if a t-th order correlation immune function has Hamming weight divisible by $2^{t+1}$, then it
satisfies the same bound as t-resilient functions.
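
Siegenthaler's bound can be checked on a small example by computing the algebraic degree with the binary Möbius transform and testing correlation immunity through the weights of restrictions. This is a sketch under our own assumptions: the helper names `algebraic_degree` and `restriction_weights` are hypothetical, and the test function f = x1 ⊕ x2 ⊕ x3 x4 is our choice.

```python
from itertools import combinations, product

def algebraic_degree(truth_table, n):
    """Degree of the ANF, obtained with the binary Moebius transform."""
    anf = list(truth_table)
    for i in range(n):
        for x in range(1 << n):
            if (x >> i) & 1:
                anf[x] ^= anf[x ^ (1 << i)]
    return max((bin(x).count("1") for x in range(1 << n) if anf[x]), default=0)

def restriction_weights(truth_table, n, t):
    """Set of Hamming weights of all restrictions obtained by fixing t input bits."""
    weights = set()
    for positions in combinations(range(n), t):
        mask = sum(1 << i for i in positions)
        for values in product((0, 1), repeat=t):
            fixed = sum(v << i for i, v in zip(positions, values))
            weights.add(
                sum(truth_table[x] for x in range(1 << n) if (x & mask) == fixed)
            )
    return weights

n = 4
# f(x) = x1 + x2 + x3 x4, with x1 stored in the lowest bit of the input index.
f = [(x & 1) ^ (x >> 1 & 1) ^ ((x >> 2 & 1) & (x >> 3 & 1)) for x in range(1 << n)]

weights_t1 = restriction_weights(f, n, 1)   # a single weight means f is 1-CI
weights_t2 = restriction_weights(f, n, 2)   # several weights mean f is not 2-CI
degree = algebraic_degree(f, n)
print(weights_t1, weights_t2, degree)
```

Here f is balanced and 1st order correlation immune but not 2nd order, so it is 1-resilient, and its algebraic degree 2 meets the bound n − t − 1 = 2 with equality.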

Siegenthaler’s bound gives an example of the trade-offs that must be accepted in the
design of combiner generators.1
The first assertion in Proposition 117 comes directly from the fact that all the restrictions
obtained by fixing t coordinates of the input have the same Hamming weight. The
other results can be proved directly by using Relation (2.4), page 33, since the bit
$\bigoplus_{x \in \mathbb{F}_2^n;\ supp(x) \subseteq I} f(x)$ equals the parity of the Hamming weight of the restriction of f
obtained by setting to 0 the coordinates of x that lie outside I. It is then null if |I| > n − t.
In the case that the restriction by fixing t input coordinates has even Hamming weight, that
is, when $w_H(f)$ is divisible by $2^{t+1}$, this bit is null if |I| ≥ n − t. Note that we can also use
the Golomb–Xiao–Massey characterization (Theorem 5, page 87, resulting in Definition 69,
page 284) together with the Poisson summation formula (2.40), page 59, applied to ϕ = f
and with E ⊥ = {x ∈ Fn2 ; supp(x) ⊆ I }, where I has size strictly larger than n − t − 1. But
this gives a less simple proof.

7.1.3 Characterization by the NNF


Siegenthaler’s bound is also a direct consequence of a characterization of correlation
immune and of resilient functions through their NNFs and of the facts that the ANF equals
the NNF mod 2 and that the Hamming weight of a t-th order correlation immune function
with t ≥ 1 is even (the Walsh transform at 0n is then divisible by 4).

Proposition 118 [220, 293] Let n be any positive integer and t < n a non-negative integer.
A Boolean function f on $\mathbb{F}_2^n$ is t-th order correlation immune if and only if the numerical
normal form $\sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$ of the function $g(x) = f(x) \oplus x_1 \oplus \cdots \oplus x_n$ satisfies that,
for every I of size larger than or equal to n − t, $(-2)^{n-|I|} \lambda_I$ is independent of the choice
of I. And f is t-resilient if and only if the numerical normal form $\sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$ of g has
degree at most n − t − 1.

Proof For each vector $a \in \mathbb{F}_2^n$, we denote by $\bar{a}$ the componentwise complement of a,
equal to $a + 1_n$. We have $W_f(a) = W_g(\bar{a})$. Thus, f is t-th order correlation immune (resp.
t-resilient) if and only if, for every vector $u \ne (1, \dots, 1)$ of Hamming weight larger than or
equal to n − t (resp. for every vector u of Hamming weight larger than or equal to n − t),
the number $W_g(u)$ is null. According to Relations (2.61), (2.62), page 67, and (2.32), page
55, applied to g, we have for nonzero u

$$W_g(u) = (-1)^{w_H(u)+1} \sum_{I \subseteq \{1,\dots,n\};\ supp(u) \subseteq I} 2^{n-|I|+1} \lambda_I,$$

and for nonempty I

$$\lambda_I = 2^{-n} (-2)^{|I|-1} \sum_{u \in \mathbb{F}_2^n;\ I \subseteq supp(u)} W_g(u).$$

This completes the proof.

1 One approach to avoid such a trade-off is to allow memory in the nonlinear combination generator, that is, to
replace the combining function by a finite state machine; see [845].

Proposition 118 proves, by applying Relation (2.64), page 67, to $g(x) = f(x) \oplus x_1 \oplus \cdots \oplus x_n$,
that if t is the resiliency order of an n-variable function f of algebraic degree at
least 2, and each variable $x_i$ is effective in g(x), then $n - t - 1 \ge n\, 2^{-d_{alg}(f)+1}$, that is,

$$d_{alg}(f) \ge \log_2 \frac{2n}{n - t - 1}.$$

Remark. According to Proposition 118, a nonaffine balanced n-variable Boolean function


g has its algebraic degree and numerical degree equal to each other if and only if, given
Boolean function f (x) = g(x) ⊕ x1 ⊕ · · · ⊕ xn and its resiliency order, Siegenthaler’s bound
is an equality.

Proposition 118 has been used by X.-D. Hou in [621] for constructing resilient functions.

7.1.4 Bounds on the nonlinearity


Sarkar and Maitra showed that:

Proposition 119 [1012] The values of the Walsh transform of an n-variable, t-resilient
(resp. t-th order correlation immune) function are divisible by $2^{t+2}$ (resp. $2^{t+1}$) if 0 ≤ t ≤
n − 3 (resp. 1 ≤ t ≤ n − 2).

A more precise result being given in Proposition 120 below, we skip the proof of
Proposition 119. More is proved in [220, 322]; in particular: if the Hamming weight
of a t-th order correlation immune function is divisible by $2^{t+1}$, then the values of its
Walsh transform are divisible by $2^{t+2}$. The Sarkar–Maitra divisibility bound and its
extension have provided nontrivial upper bounds on the nonlinearity of resilient functions,
independently obtained by Tarannikov [1080] and by Zheng and Zhang [1177]:

Theorem 20 [1012, 1080, 1177] For every n and t ≤ n − 2, the nonlinearity of any
tth-order correlation immune (resp. t-resilient) function is bounded above by $2^{n-1} - 2^t$ (resp.
$2^{n-1} - 2^{t+1}$).

Of course, this brings information only if $2^{n-1} - 2^t$ (resp. $2^{n-1} - 2^{t+1}$) is smaller than
$2^{n-1} - 2^{n/2-1}$. Zheng and Zhang [1177] showed that correlation immune functions of high
orders satisfy the same upper bound on the nonlinearity as resilient functions of the same
orders. In [1083] (where the authors also obtained a bound on f for f resilient and
studied the resiliency order of all quadratic functions), Tarannikov et al. showed for each
i ∈ {1, 2} that if t is larger than some rather complex expression of i and n, then for every
unbalanced nonconstant t-CI function, we have $nl(f) \le 2^{n-1} - 2^{t+i}$. The maximal
higher-order nonlinearity of resilient functions has also been studied in [101, 719] and determined
for low order (≤ 2) or low number of variables (≤ 7).
The bound of Theorem 20 for resilient functions is tight when t ≥ 0.6 n, see [1080,
1081]. We shall call it Sarkar et al.’s bound. Notice that, if a t-resilient function f achieves
nonlinearity $2^{n-1} - 2^{t+1}$, then f is plateaued. Indeed, the distances between f and affine
functions lie then between $2^{n-1} - 2^{t+1}$ and $2^{n-1} + 2^{t+1}$ and must be therefore equal to
$2^{n-1} - 2^{t+1}$, $2^{n-1}$ and $2^{n-1} + 2^{t+1}$ because of the divisibility result of Sarkar and Maitra.
Thus, the Walsh transform of f takes three values 0 and $\pm 2^{t+2}$. Moreover, it is proved
in [1080] (and is a direct consequence of Proposition 120 below) that such function f also
achieves Siegenthaler’s bound (and as proved in [814], achieves minimum sum-of-squares
indicator).
If $2^{n-1} - 2^{t+1}$ is larger than the best possible nonlinearity of all balanced functions (and in
particular if it is larger than the covering radius bound), then, obviously, a better bound than
in Theorem 20 exists. In the case of n even, the best possible nonlinearity of all balanced
functions being strictly smaller than $2^{n-1} - 2^{n/2-1}$, Sarkar and Maitra deduce that
$nl(f) \le 2^{n-1} - 2^{n/2-1} - 2^{t+1}$ for every t-resilient function f with $t \le \frac{n}{2} - 2$. In the case of n odd, they
state that nl(f) is smaller than or equal to the highest multiple of $2^{t+1}$ that is less than
or equal to the best possible nonlinearity of all Boolean functions. But a potentially better
upper bound can be given, whatever the parity of n. Indeed, Sarkar–Maitra’s divisibility
bound shows that $W_f(a) = \omega(a) \times 2^{t+2}$, where ω(a) is integer-valued. Parseval’s
relation (2.48), page 61, and the fact that $W_f(a)$ is null for every vector a of Hamming weight
≤ t imply

$$\sum_{a \in \mathbb{F}_2^n;\ w_H(a) > t} \omega^2(a) = 2^{2n-2t-4}$$

and, thus,

$$\max_{a \in \mathbb{F}_2^n} |\omega(a)| \ge \sqrt{\frac{2^{2n-2t-4}}{2^n - \sum_{i=0}^{t} \binom{n}{i}}} = \frac{2^{n-t-2}}{\sqrt{2^n - \sum_{i=0}^{t} \binom{n}{i}}}.$$

Hence, we have $\max_{a \in \mathbb{F}_2^n} |\omega(a)| \ge \left\lceil \frac{2^{n-t-2}}{\sqrt{2^n - \sum_{i=0}^{t} \binom{n}{i}}} \right\rceil$, and this implies

$$nl(f) \le 2^{n-1} - 2^{t+1} \left\lceil \frac{2^{n-t-2}}{\sqrt{2^n - \sum_{i=0}^{t} \binom{n}{i}}} \right\rceil. \qquad (7.2)$$

When n is even and $t \le n/2 - 2$, this number is always less than or equal to the number
$2^{n-1} - 2^{n/2-1} - 2^{t+1}$ (given by Sarkar and Maitra), because
$\frac{2^{n-t-2}}{\sqrt{2^n - \sum_{i=0}^{t} \binom{n}{i}}}$ is strictly larger than $2^{n/2-t-2}$ and $2^{n/2-t-2}$ is an
integer, and, thus, $\left\lceil \frac{2^{n-t-2}}{\sqrt{2^n - \sum_{i=0}^{t} \binom{n}{i}}} \right\rceil$ is at least $2^{n/2-t-2} + 1$. And when
n increases, the right-hand side of Relation (7.2) is strictly smaller than $2^{n-1} - 2^{n/2-1} - 2^{t+1}$
for an increasing number of values of $t \le n/2 - 2$ (but this improvement does not appear
when we compare the values we obtain with this bound to the values indicated in the table
given by Sarkar and Maitra in [1012], because the values of n they consider in this table are
small).
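The comparison between Relation (7.2) and the bound of Sarkar and Maitra for even n can be tabulated with a few lines of code. The following is only a sketch: the function names are ours, and the ceiling is computed in exact integer arithmetic to avoid floating-point rounding.

```python
from math import comb, isqrt

def ceil_div_sqrt(num, den):
    """Smallest integer m with m >= num / sqrt(den), using exact integer arithmetic."""
    q = -(-(num * num) // den)          # ceil(num^2 / den)
    return isqrt(q - 1) + 1 if q > 0 else 0

def bound_7_2(n, t):
    """Right-hand side of Relation (7.2) for a t-resilient n-variable function."""
    den = 2 ** n - sum(comb(n, i) for i in range(t + 1))
    return 2 ** (n - 1) - 2 ** (t + 1) * ceil_div_sqrt(2 ** (n - t - 2), den)

def bound_sarkar_maitra_even(n, t):
    """Bound 2^(n-1) - 2^(n/2-1) - 2^(t+1), stated for n even and t <= n/2 - 2."""
    return 2 ** (n - 1) - 2 ** (n // 2 - 1) - 2 ** (t + 1)

# Columns: n, t, bound (7.2), bound of Sarkar and Maitra.
for n in (8, 12, 16, 20, 24):
    for t in range(n // 2 - 1):
        print(n, t, bound_7_2(n, t), bound_sarkar_maitra_even(n, t))
```

For small n the two columns may coincide, which is consistent with the observation above that the improvement is not visible for the values of n tabulated in [1012].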
When n is odd, it is difficult to say if Inequality (7.2) is better than the bound given by
Sarkar and Maitra, because their bound involves a value that is unknown for n ≥ 9 (the best
possible nonlinearity of all balanced Boolean functions). In any case, this makes (7.2) easier
to use.
We know (see [809, page 310]) that $\sum_{i=0}^{t} \binom{n}{i} \ge \frac{2^{nH_2(t/n)}}{\sqrt{8t(1-t/n)}}$, where
$H_2(x) = -x \log_2(x) - (1-x) \log_2(1-x)$, the so-called binary entropy function, satisfies
$H_2(\frac{1}{2} - x) = 1 - 2x^2 \log_2 e + o(x^2)$. Thus, we have

$$nl(f) \le 2^{n-1} - 2^{t+1} \left\lceil \frac{2^{n-t-2}}{\sqrt{2^n - \frac{2^{nH_2(t/n)}}{\sqrt{8t(1-t/n)}}}} \right\rceil. \qquad (7.3)$$

Remark. If a Boolean function f is t-th order correlation immune (resp. t-resilient), then
for every 1 ≤ e ≤ t and every set $\{i_1, \dots, i_e\}$ of size e, its restriction obtained by fixing
coordinates $x_{i_1}, \dots, x_{i_e}$ is a (t−e)-th order correlation immune (resp. (t−e)-resilient)
(n−e)-variable function. But the n-variable function equal to the product of f with the monomial
function $m(x) = \prod_{j=1}^{e} x_{i_j}$ of degree e is not (t − e)-th order correlation immune, although
the support of f m equals the intersection of the support of f with the support of m (the set
of vectors x such that $x_{i_1} = \dots = x_{i_e} = 1$): fixing (t − e) coordinates of x preserves the
output distribution probability only if these coordinates are outside $\{i_1, \dots, i_e\}$.
Nevertheless, it is possible to prove that such f m has the same Walsh divisibility property as
a (t − e)-th order correlation immune (resp. (t − e)-resilient) function.

Proposition 119 has been improved:

Proposition 120 [220, 322] Let n be any positive integer and let t ≤ n−2 be a nonnegative
integer. Let f be any n-variable t-th order correlation immune function (resp. any t-resilient
function or any t-th order correlation immune function whose Hamming weight is divisible
by $2^{t+1+\lfloor \frac{n-t-2}{d} \rfloor}$) and let d be its algebraic degree. The values of the Walsh transform
of f are divisible by $2^{t+1+\lfloor \frac{n-t-1}{d} \rfloor}$ (resp. by $2^{t+2+\lfloor \frac{n-t-2}{d} \rfloor}$). Hence the nonlinearity of f is
divisible by $2^{t+\lfloor \frac{n-t-1}{d} \rfloor}$ (resp. by $2^{t+1+\lfloor \frac{n-t-2}{d} \rfloor}$).

A little more can be said in the former case; see [322].


The approach for proving this tight bound was first to use the numerical normal form (we
refer the reader to [220] for this proof, for the tightness, and for an improvement when the
number of terms of highest degree in the ANF is small enough). Later, a second proof using
only the properties of the Fourier–Hadamard transform was given in [322]:

Proof The Poisson summation formula (2.40), page 59, applied to $\varphi = f_\chi$ and to
the vector space $E = \{u \in \mathbb{F}_2^n;\ \forall i \in \{1, \dots, n\},\ u_i \le v_i\}$, where v is some vector
of $\mathbb{F}_2^n$, whose orthogonal equals $E^\perp = \{u \in \mathbb{F}_2^n;\ \forall i \in \{1, \dots, n\},\ u_i \le v_i \oplus 1\}$, gives

$$\sum_{u \in E} W_f(u) = 2^{w_H(v)} \sum_{x \in E^\perp} f_\chi(x).$$

It is then a simple matter to prove the result by induction on the Hamming weight of v,
starting with the vectors of weight t (resp. t + 1), and using McEliece’s divisibility property
(see Subsection 4.1.5, page 156).
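The identity used in this proof is easy to check numerically. The sketch below is an illustration only: the helper names and the encoding of vectors as integers are our own choices. It evaluates both sides for a random 5-variable function and for every vector v, taking $f_\chi = (-1)^f$.

```python
import random

def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), by the fast butterfly algorithm."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def poisson_sides(tt, n, v):
    """Both sides of sum_{u in E} W_f(u) = 2^{w_H(v)} sum_{x in E^perp} f_chi(x)."""
    W = walsh(tt)
    lhs = sum(W[u] for u in range(1 << n) if (u & v) == u)       # u_i <= v_i for all i
    rhs = 2 ** bin(v).count("1") * sum(
        1 - 2 * tt[x] for x in range(1 << n) if (x & v) == 0     # x_i <= v_i + 1 for all i
    )
    return lhs, rhs

random.seed(1)                       # arbitrary seed, for reproducibility only
n = 5
tt = [random.randint(0, 1) for _ in range(1 << n)]
identity_holds = all(
    lhs == rhs for lhs, rhs in (poisson_sides(tt, n, v) for v in range(1 << n))
)
print(identity_holds)
```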

Proposition 120 gives directly more precise upper bounds on the nonlinearity of any
t-resilient function of degree d: for instance, this nonlinearity is bounded above by
$2^{n-1} - 2^{t+1+\lfloor \frac{n-t-2}{d} \rfloor}$. This gives a simpler proof that it can be equal to $2^{n-1} - 2^{t+1}$ only if
d = n − t − 1, i.e., if Siegenthaler’s bound is achieved with equality. Moreover, the proof
above also shows that the nonlinearity of any t-resilient n-variable Boolean function is
bounded above by $2^{n-1} - 2^{t+1+\lfloor \frac{n-t-2}{d} \rfloor}$, where d is the minimum algebraic degree of the
restrictions of f to the subspaces $\{u \in \mathbb{F}_2^n;\ \forall i \in \{1, \dots, n\},\ u_i \le v_i \oplus 1\}$ such that v has
Hamming weight t + 1 and $W_f(v) \ne 0$. See more in [322].

7.1.5 Bound on the maximum correlation with index subsets


An upper bound on the maximum correlation of t-resilient functions with respect to subsets
I of {1, . . . , n} can be directly deduced from Relation (3.14), page 102, and from Sarkar et
al.’s bound. Note that we get an improvement by using that the support of $W_f$, restricted to
the set of vectors $u \in \mathbb{F}_2^n$ such that $u_i = 0$, $\forall i \notin I$, contains at most $\sum_{i=t+1}^{|I|} \binom{|I|}{i}$ vectors. In
particular, if |I| = t + 1, the maximum correlation of f with respect to I equals $2^{-n} |W_f(u)|$,
where u is the vector of support I, see [187, 203, 1155]. The optimal number of LFSRs that
should be considered together in a correlation attack on a cryptosystem using a t-resilient
combining function is t + 1; see [187].

7.1.6 Relationship with other criteria


The relationships between resiliency and other criteria have been studied in [354, 814, 1083,
1175]. For instance, t-resilient PC(l) functions can exist only if t + l ≤ n − 1. This is a direct
consequence of Relation (2.56), page 62, applied with $a = b = 0_n$, $E = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \in I\}$
and $E^\perp = \{x \in \mathbb{F}_2^n;\ x_i = 0,\ \forall i \notin I\}$, where I has size n − t: if l ≥ n − t, then
the right-hand side term of Relation (2.56) is nonzero while the left-hand side term is null.
Equality t + l = n − 1 is possible only if l = n − 1, n is odd and t = 0 [354, 1175]. The
known upper bounds on the nonlinearity can then be improved for such functions.
The definition of resiliency has been weakened in [126, 294, 720, 721] in order to relax
some of the trade-offs recalled above, without weakening the cryptosystem against the
correlation attack.
Resiliency is related to the notion of corrector (useful for the generation of random
sequences having good statistical properties) introduced by Lacharme in [732].

7.1.7 Relationship with covering sequences


According to Proposition 60, page 182, knowing a covering sequence $\lambda = (\lambda_a)_{a \in \mathbb{F}_2^n}$ (trivial
or not) of a function f allows knowing that $supp(W_f) \subseteq \widehat{\lambda}^{-1}(\widehat{\lambda}(0_n) - 2\rho)$, where ρ is the
level of the sequence. Hence, as observed in [326], f is t-th order correlation immune where
t + 1 is the minimum Hamming weight of nonzero $b \in \mathbb{F}_2^n$ such that $\widehat{\lambda}(b) = \widehat{\lambda}(0_n) - 2\rho$,
and if ρ ≠ 0, it is then t-resilient. Conversely, if f is t-th order correlation immune (resp.
t-resilient) and if it is not (t + 1)-th order correlation immune (resp. (t + 1)-resilient), then
there exists at least one (nontrivial) covering sequence $\lambda = (\lambda_a)_{a \in \mathbb{F}_2^n}$ with level ρ such that
t + 1 is the minimum Hamming weight of $b \in \mathbb{F}_2^n$ satisfying $\widehat{\lambda}(b) = \widehat{\lambda}(0_n) - 2\rho$.
A particularly simple covering sequence is the indicator of the set of vectors of Hamming
weight 1. The functions that admit this covering sequence are called regular; they are
(ρ − 1)-resilient, where ρ is the level. More generally, any function admitting as covering
sequence the indicator of a set of vectors of weight 1 has this same property (this
generalizes to any vectors with disjoint supports). We speak then of a simple covering
sequence; see [326], where the algebraic degree and the nonlinearity of regular functions
are studied, and where constructions are given as well as bounds on the number of
variables.

7.1.8 Primary constructions of correlation immune and resilient functions


In the 1990s, high-order resilient functions with the best possible algebraic degree and
nonlinearity were needed for applications in stream ciphers using the combiner model.
But fast algebraic attacks (FAA) have changed the situation. The combiner model is now
considered problematic, because of Siegenthaler’s bound and the fact that combiner or
filter functions need to have very high algebraic degree for resisting FAA. For the sake
of completeness and also because building correlation immune functions means building
orthogonal arrays (see Definition 22, page 86), which are of interest in combinatorics and
statistics, and because a new way of using low-weight correlation immune functions exists
(see Section 12.1), and new ways of using resilient functions may be found in the future, we
report the state of the art for constructing highly nonlinear correlation immune and resilient
functions. As we shall see, most constructions build in fact resilient functions, and these
constructions unfortunately do not allow us to construct low-weight correlation immune
functions. More work is then needed to build such functions. Such work, which we shall
report at the end of this subsection, has been initiated in [258] and continued in [1104].
The primary constructions (which allow designing resilient functions without using
known ones) are supposed to lead potentially to wider classes of functions than secondary
constructions (recall that the number of Boolean functions on n − 1 variables is only equal
to the square root of the number of n-variable Boolean functions). But the known primary
constructions of resilient Boolean functions do not lead to very large classes of functions.
In fact, only one reasonably large class of Boolean functions is known, whose elements can
be analyzed with respect to the cryptographic criteria recalled in Section 3.1. So we observe
some imbalance in the knowledge on cryptographic functions for stream ciphers: much is
known on the properties of resilient functions, but little is known on how to construct them.
Examples of t-resilient functions achieving the best possible nonlinearity $2^{n-1} - 2^{t+1}$ (and
thus the best algebraic degree) have been obtained for n ≤ 10 in [934, 1011, 1012] and
for every t ≥ 0.6 n [1080, 1081] (n being then not limited). But n ≤ 10 is too small
for applications and t ≥ 0.6 n is too large (because of Siegenthaler’s bound).2 Moreover,
these examples give very limited numbers of functions (they are often defined recursively
or obtained after a computer search), and many of these functions have cryptographic
weaknesses such as linear structures (see [354, 814]). Balanced Boolean functions with
high nonlinearities have been obtained by Fontaine in [515] and by Filiol and Fontaine
in [503], who made a computer investigation – but for n = 7, 9, which is too small –
on the corpus of idempotent functions (see the definition at page 248). These functions,
whose ANFs are invariant under the cyclic shifts of the coordinates $x_i$, have later been called
rotation symmetric (see Section 10.2, page 360). Other ad hoc constructions can be found in
[819, 1011].

2 And almost nothing is known on the immunity of these functions to algebraic attacks; anyway, their resistance
to FAA is bad.

A construction derived from the characterization of correlation immunity by the dual distance
It has been observed in [417] that the characterization of Corollary 6, page 88, can be
straightforwardly applied to build correlation immune functions from linear codes. In fact,
this was already known from [58].

Corollary 22 Let C be any (linear) [n, k, d]-code and G a generator matrix of C. Then for
every k-variable function g, the n-variable function $f(x) = g(x \times G^t)$ is (d − 1)-th order
correlation immune (and it is (d − 1)-resilient if g is balanced).

Proof If g is the indicator $\delta_0$ of the singleton $\{0_k\}$, the result is a direct consequence of
Corollary 6, page 88, since we have $f(x) \ne 0$ if and only if $x \times G^t = 0_k$, that is, $x \in C^\perp$.
It is easily seen that if $g = \delta_a$, we have then that $f(x) \ne 0$ if and only if x belongs either to
the empty set or to a coset of $C^\perp$. Then f is (d − 1)-th order correlation immune according
to Corollary 6, since the dual distance is invariant by translation. And if g is any sum of
such atomic functions, that is, any Boolean function, we have the same result since the sum
of t-th order correlation immune functions with disjoint supports is a t-th order correlation
immune function. Finally, G being a generator matrix, the function $x \in \mathbb{F}_2^n \mapsto x \times G^t \in \mathbb{F}_2^k$ is
balanced and then f is balanced if and only if g is balanced.
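As an illustration of Corollary 22, one may take for C a [7, 4, 3] Hamming code and check on the Walsh spectrum that $f(x) = g(x \times G^t)$ is second-order correlation immune. The sketch below makes several assumptions of our own: the particular generator matrix, the two sample functions g, the encoding of vectors as integers, and all helper names.

```python
def walsh(tt):
    """Walsh transform W_f(a) = sum_x (-1)^(f(x) + a.x), by the fast butterfly algorithm."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def weight(v):
    return bin(v).count("1")

# Rows of a generator matrix of a [7, 4, 3] Hamming code, written as 7-bit integers.
G = [0b1000110, 0b0100101, 0b0010011, 0b0001111]
n, k = 7, len(G)

def min_distance(rows):
    """Minimum weight of the nonzero codewords spanned by the given rows."""
    weights = []
    for m in range(1, 1 << len(rows)):
        c = 0
        for j, row in enumerate(rows):
            if (m >> j) & 1:
                c ^= row
        weights.append(weight(c))
    return min(weights)

def lift(g_tt, rows, nvars):
    """Truth table of f(x) = g(x * G^t): bit j of the input of g is the parity of x & rows[j]."""
    return [g_tt[sum((weight(x & row) & 1) << j for j, row in enumerate(rows))]
            for x in range(1 << nvars)]

def ci_order(tt):
    """Largest t such that W_f(a) = 0 whenever 1 <= w_H(a) <= t."""
    W = walsh(tt)
    nonzero = [weight(a) for a in range(1, len(W)) if W[a] != 0]
    return min(nonzero) - 1 if nonzero else len(tt).bit_length() - 1

d = min_distance(G)
delta0 = [1] + [0] * (2 ** k - 1)                                             # unbalanced g
g_bal = [((y & 1) & ((y >> 1) & 1)) ^ ((y >> 2) & 1) for y in range(2 ** k)]  # y1 y2 + y3
f_unbalanced, f_balanced = lift(delta0, G, n), lift(g_bal, G, n)
print(d, ci_order(f_unbalanced), ci_order(f_balanced), sum(f_balanced) == 2 ** (n - 1))
```

With the balanced choice of g the function obtained is moreover balanced, hence resilient of the same order, as stated in the corollary.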

Such a correlation immune function can have at most algebraic degree dalg (g) ≤ k (and
≤ k − 1 if it is resilient).

Remark. Given k < n, a k-variable function g, a surjective linear mapping $L : \mathbb{F}_2^n \to \mathbb{F}_2^k$,
and an element u of $\mathbb{F}_2^n$, the function $f(x) = g \circ L(x) \oplus u \cdot x$ is (d − 1)-resilient, where
d is the Hamming distance between u and the linear code C whose generator matrix equals
the matrix of L. Indeed, for any vector $a \in \mathbb{F}_2^n$ of Hamming weight at most d − 1, the
vector u + a does not belong to C. This implies that the Boolean function $f(x) \oplus a \cdot x$ is
linearly equivalent to the function $g(x_1, \dots, x_k) \oplus x_{k+1}$, since we may assume without loss
of generality that L is systematic (i.e., has the form $[I_k | N]$). The Boolean function $f(x) \oplus a \cdot x$
is therefore balanced. This construction is similar to that of Corollary 22 but different (note
that g does not need to be balanced for f to be balanced).
In both constructions, f has nonzero linear structures since it is EA equivalent to
$g(x_1, \dots, x_k)$; then it does not give full satisfaction.

Maiorana–McFarland’s construction
An extension of the class of bent functions that we called above the Maiorana–McFarland
original class has been given in [181] (where the quadratic n-variable correlation immune
functions of order n − 3 are also characterized), based on the same principle of concatenating
affine functions3 (we have already met this generalization in Section 5.1): let r be a positive
integer smaller than n; we denote n − r by s; let g be any Boolean function on $\mathbb{F}_2^s$ and let φ
be a mapping from $\mathbb{F}_2^s$ to $\mathbb{F}_2^r$. Then we define the function

$$f_{\phi,g}(x, y) = x \cdot \phi(y) \oplus g(y) = \bigoplus_{i=1}^{r} x_i \phi_i(y) \oplus g(y), \quad x \in \mathbb{F}_2^r,\ y \in \mathbb{F}_2^s, \qquad (7.4)$$

where $\phi_i(y)$ is the ith coordinate function of φ(y).


For every $a \in \mathbb{F}_2^r$ and every $b \in \mathbb{F}_2^s$, we have seen in Section 6.1.15 that

$$W_{f_{\phi,g}}(a, b) = 2^r \sum_{y \in \phi^{-1}(a)} (-1)^{g(y) \oplus b \cdot y}. \qquad (7.5)$$

This can be used to design resilient functions: if every element in $\phi(\mathbb{F}_2^s)$ has Hamming
weight strictly larger than t, then $f_{\phi,g}$ is t-resilient (in particular, if $\phi(\mathbb{F}_2^s)$ does not contain
the null vector, then $f_{\phi,g}$ is balanced). Indeed, if $w_H(a) \le t$ then $\phi^{-1}(a)$ is empty in
Relation (7.5); hence, if $w_H(a) + w_H(b) \le t$, then $W_{f_{\phi,g}}(a, b)$ is null. The t-resiliency
of $f_{\phi,g}$ under this hypothesis can also be deduced from the facts that any affine function
$x \in \mathbb{F}_2^r \mapsto c \cdot x \oplus \epsilon$ ($c \in \mathbb{F}_2^r$ nonzero, $\epsilon \in \mathbb{F}_2$) is $(w_H(c) - 1)$-resilient, and that any
Boolean function equal to the concatenation of t-resilient functions is a t-resilient function
(see secondary construction 3 below).
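A small instance may help fix ideas. In the sketch below, the parameters r = 4 and s = 2, the mapping φ (injective, with all values of Hamming weight 3) and the function g are arbitrary choices of ours, as are the helper names. The code checks Relation (7.5) on the whole spectrum and reads off the resiliency order and the nonlinearity.

```python
def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), by the fast butterfly algorithm."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def weight(v):
    return bin(v).count("1")

r, s = 4, 2
n = r + s
phi = [0b0111, 0b1011, 0b1101, 0b1110]   # phi(y) for y = 0..3, each of weight 3 > t = 2
g = [0, 1, 1, 0]                         # arbitrary Boolean function on F_2^s

# Truth table of f(x, y) = x . phi(y) + g(y); the index stores x in its r low-order bits.
tt = [(weight(x & phi[y]) & 1) ^ g[y] for y in range(1 << s) for x in range(1 << r)]
W = walsh(tt)

def walsh_7_5(a, b):
    """Right-hand side of (7.5): 2^r times the sum over phi^{-1}(a) of (-1)^(g(y) + b.y)."""
    return 2 ** r * sum(
        (-1) ** (g[y] ^ (weight(b & y) & 1)) for y in range(1 << s) if phi[y] == a
    )

formula_ok = all(
    W[a | (b << r)] == walsh_7_5(a, b) for a in range(1 << r) for b in range(1 << s)
)
resiliency = min(weight(u) for u in range(1 << n) if W[u] != 0) - 1   # -1 means unbalanced
nonlinearity = 2 ** (n - 1) - max(abs(w) for w in W) // 2
print(formula_ok, resiliency, nonlinearity)
```

Here the function is 2-resilient and its nonlinearity equals $2^{n-1} - 2^{t+1} = 24$, so this toy example meets Sarkar et al.’s bound with equality.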
It is possible (see [221, 223, 398]) to obtain a t-resilient function with (7.4) when every
element in $\phi(\mathbb{F}_2^s)$ has Hamming weight larger than or equal to t (instead of strictly larger):
we know that such a function is (t − 1)-resilient by the observation above, and it is moreover
t-resilient if, for every $a \in \mathbb{F}_2^r$ of Hamming weight t, we have $\sum_{y \in \phi^{-1}(a)} (-1)^{g(y)} = 0$. We
just need then that, for every $a \in \mathbb{F}_2^r$ of Hamming weight t, if $\phi^{-1}(a) \ne \emptyset$, then $\phi^{-1}(a)$ has
even size and the restriction of g to $\phi^{-1}(a)$ is balanced.
It is more difficult to construct unbalanced correlation immune functions with this
method: in practice, we need that every nonzero element in $\phi(\mathbb{F}_2^s)$ has Hamming weight
strictly larger than t and that, for every $b \in \mathbb{F}_2^s$ such that $1 \le w_H(b) \le t$, we have
$\sum_{y \in \phi^{-1}(0_r)} (-1)^{g(y) \oplus b \cdot y} = 0$. If $\phi^{-1}(0_r)$ is an affine space, then this results in a condition on
the restriction of g to $\phi^{-1}(0_r)$, which is similar to t-th order correlation immunity (this gives
a construction that is more secondary than primary) and if $\phi^{-1}(0_r)$ has no such structure,
then g needs to be built from scratch (very little work has been done on that).
Degree: The algebraic degree of $f_{\phi,g}$ is at most s + 1 = n − r + 1. It equals s + 1 if and
only if φ has algebraic degree s (i.e., if at least one of its coordinate functions has algebraic
degree s, that is, has odd Hamming weight, which is equivalent to $\sum_{y \in \mathbb{F}_2^s} \phi(y) \ne 0_r$). If
we assume that every element in $\phi(\mathbb{F}_2^s)$ has Hamming weight strictly larger than t, then φ
can have algebraic degree s only if t ≤ r − 2, since if t = r − 1, then φ is constant. Thus,
the algebraic degree of $f_{\phi,g}$ reaches Siegenthaler’s bound n − t − 1 if and only if either
t = r − 2 and φ has algebraic degree s = n − t − 2 or t = r − 1 and g has algebraic degree
s = n − t − 1.

3 These functions have also been studied under the name of linear-based functions in [7, 1137].
Nonlinearity: Relations (3.1), page 79, relating the nonlinearity to the Walsh transform, and
(7.5) above lead straightforwardly to a general lower bound on the nonlinearity of
Maiorana–McFarland’s functions (first observed in [1026]):

$$nl(f_{\phi,g}) \ge 2^{n-1} - 2^{r-1} \max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)| \qquad (7.6)$$

(where $|\phi^{-1}(a)|$ denotes the size of $\phi^{-1}(a)$). An upper bound obtained in [221] strengthens
a bound previously obtained in [358, 359], which stated $nl(f_{\phi,g}) \le 2^{n-1} - 2^{r-1}$:

$$nl(f_{\phi,g}) \le 2^{n-1} - 2^{r-1} \left\lceil \sqrt{\max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)|} \right\rceil. \qquad (7.7)$$

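Proof of (7.7): The sum

$$\sum_{b \in \mathbb{F}_2^s} \Big( \sum_{y \in \phi^{-1}(a)} (-1)^{g(y) \oplus b \cdot y} \Big)^2 = \sum_{y, z \in \phi^{-1}(a);\ b \in \mathbb{F}_2^s} (-1)^{g(y) \oplus g(z) \oplus b \cdot (y+z)}$$

equals $2^s |\phi^{-1}(a)|$ (since the sum $\sum_{b \in \mathbb{F}_2^s} (-1)^{b \cdot (y+z)}$ is null if y ≠ z). The maximum of a set
of values being always larger than or equal to its arithmetic mean, we deduce

$$\max_{b \in \mathbb{F}_2^s} \Big| \sum_{y \in \phi^{-1}(a)} (-1)^{g(y) \oplus b \cdot y} \Big| \ge \sqrt{|\phi^{-1}(a)|}$$

and thus, according to Relation (7.5):

$$\max_{a \in \mathbb{F}_2^r;\ b \in \mathbb{F}_2^s} |W_{f_{\phi,g}}(a, b)| \ge 2^r \left\lceil \sqrt{\max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)|} \right\rceil.$$

Relation (3.1) completes the proof.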

This bound allowed characterizing the Maiorana–McFarland’s functions $f_{\phi,g}$ such that
$w_H(\phi(y)) > k$ for every y and achieving nonlinearity $2^{n-1} - 2^{k+1}$: Relation (7.7) implies
$\sqrt{\max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)|} \le 2^{k-r+2}$ and thus k + 1 ≤ r ≤ k + 2 since $\max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)| \ge 1$, and
it also implies the inequality $nl(f_{\phi,g}) \le 2^{n-1} - \frac{2^{r+\frac{s}{2}-1}}{\sqrt{\sum_{i=k+1}^{r} \binom{r}{i}}}$.
If r = k + 1, then φ is the constant $1_r$ and $\max_{a \in \mathbb{F}_2^r} |\phi^{-1}(a)| = 2^s$, thus s ≤ 2(k − r + 2) =
2 and n ≤ k + 3. Either s = 1 and g(y) is then any function in one variable, or s = 2 and g
(which is then bent) is any function of the form $y_1 y_2 \oplus \ell(y)$ where ℓ is affine.
If r = k + 2, then φ is injective, therefore $2^s \le \binom{r}{r-1} + \binom{r}{r} = r + 1$ and thus
$n \le k + 2 + \log_2(k + 3)$, g is any function on n − k − 2 variables and $d_{alg}(f_{\phi,g}) \le 1 + \log_2(k + 3)$.
See more in [221] on how to optimize the nonlinearity.
A simple example of k-resilient Maiorana–McFarland’s functions such that $nl(f_{\phi,g}) =
2^{n-1} - 2^{k+1}$ (and thus achieving Sarkar et al.’s bound) can be given for any $r \ge 2^s - 1$ and
for k = r − 2 (see [221]). And, for every even n ≤ 10, Sarkar et al.’s bound with $t = \frac{n}{2} - 2$ can
be achieved by Maiorana–McFarland’s functions. Also, functions with high nonlinearities
but not achieving Sarkar et al.’s bound with equality exist in Maiorana–McFarland’s class
(for every n ≡ 1 [mod 4], there exist such $\frac{n-1}{4}$-resilient functions on $\mathbb{F}_2^n$ with nonlinearity
$2^{n-1} - 2^{\frac{n-1}{2}}$).

Generalizations of Maiorana–McFarland’s construction


Such generalizations, whose general frameworks have been seen in the present book
in Subsections 5.2.2 and 5.4.1, have been introduced in [221] and [317]; the latter
generalization has been further generalized into a class introduced in [226]. A motivation for
introducing such generalizations is that Maiorana–McFarland’s functions have the weakness
that x → fφ,g (x, y) is affine for every y ∈ Fs2 and have high divisibilities of their Fourier–
Hadamard spectra (indeed, if we want to ensure that f is t-resilient with a large value of t,
then we need to choose r large; then the Walsh spectrum of f is divisible by 2r according
to Relation (7.5); there is a risk that this property can be used in attacks, as it is used
in [204] to attack block ciphers). The functions constructed in [221, 317] are concatenations
of quadratic functions and those of [226] concatenations of indicators of flats. We have seen
already in Subsections 5.2.2 and 5.4.1 the two classes:

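1. $f_{\psi,\phi,g}(x, y) = \bigoplus_{i=1}^{k} x_{2i-1} x_{2i} \psi_i(y) \oplus x \cdot \phi(y) \oplus g(y)$, with $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$,
where n = r + s, $k = \lfloor \frac{r}{2} \rfloor$, and where $\psi : \mathbb{F}_2^s \to \mathbb{F}_2^k$, $\phi : \mathbb{F}_2^s \to \mathbb{F}_2^r$
and $g : \mathbb{F}_2^s \to \mathbb{F}_2$ can be chosen arbitrarily.

2. $\forall (x, y) \in \mathbb{F}_2^r \times \mathbb{F}_2^s$, $f(x, y) = \prod_{i=1}^{\varphi(y)} (x \cdot \phi_i(y) \oplus g_i(y) \oplus 1) \oplus x \cdot \phi(y) \oplus g(y)$,
where ϕ is a function from $\mathbb{F}_2^s$ into {0, 1, . . . , r}, $\phi_1, \dots, \phi_r$ and φ are functions from $\mathbb{F}_2^s$ into
$\mathbb{F}_2^r$ such that, for every $y \in \mathbb{F}_2^s$, the vectors $\phi_1(y), \dots, \phi_{\varphi(y)}(y)$ are linearly independent, and
$g_1, \dots, g_r$ and g are Boolean functions on $\mathbb{F}_2^s$.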
We have seen at pages 179 and 181 the formulae for the Walsh transforms of the functions
of these classes, which result in sufficient conditions for their resiliency and in bounds on
their nonlinearities; see [221, 226], where the author also studied how to optimize these
parameters.
More complex ways of adapting the Maiorana–McFarland construction and other con-
structions can be found in [817, 928, 934, 1013, 1163, 1164], where some better parameters
can be found but trade-offs are less clear.

Other constructions
A construction derived from the $PS_{ap}$ construction is introduced in [216] to obtain resilient
functions: let k and r be positive integers and n ≥ r; we denote n − r by s; the vector
space $\mathbb{F}_2^r$ is identified to the Galois field $\mathbb{F}_{2^r}$. Let g be any Boolean function on $\mathbb{F}_{2^r}$ and φ an
$\mathbb{F}_2$-linear mapping from $\mathbb{F}_2^s$ to $\mathbb{F}_{2^r}$; set $a \in \mathbb{F}_{2^r}$ and $b \in \mathbb{F}_2^s$ such that, for every y in $\mathbb{F}_2^s$ and
every z in $\mathbb{F}_{2^r}$, a + φ(y) is nonzero and $\phi^*(z) + b$ has Hamming weight larger than k, where
$\phi^*$ is the adjoint of φ (satisfying $u \cdot \phi(x) = \phi^*(u) \cdot x$ for every x and u). Then, the function

$$f(x, y) = g\left(\frac{x}{a + \phi(y)}\right) \oplus b \cdot y, \quad \text{where } x \in \mathbb{F}_{2^r},\ y \in \mathbb{F}_2^s, \qquad (7.8)$$

is t-resilient with t ≥ k. There exist bounds on the nonlinearities of these functions
(see [223]), similar to those existing for Maiorana–McFarland’s functions. But this class
has much fewer elements than Maiorana–McFarland’s class, because φ is linear.
Dobbertin’s construction: We have seen at page 252 this method for modifying bent
functions into balanced functions with high nonlinearities. Up to affine equivalence, we can
assume that the bent function that starts the method, say f(x, y), $x \in \mathbb{F}_2^{n/2}$, $y \in \mathbb{F}_2^{n/2}$, is
such that $f(x, 0_{n/2}) = \epsilon$ ($\epsilon \in \mathbb{F}_2$) for every $x \in \mathbb{F}_2^{n/2}$ and that ε = 0 (otherwise, consider
f ⊕ 1).

Proposition 121 Let f(x, y), $x \in \mathbb{F}_2^{n/2}$, $y \in \mathbb{F}_2^{n/2}$, be any bent function such that
$f(x, 0_{n/2}) = 0$ for every $x \in \mathbb{F}_2^{n/2}$ and let g be any balanced function on $\mathbb{F}_2^{n/2}$. Then the
Walsh transform of the function $h(x, y) = f(x, y) \oplus \delta_0(y) g(x)$, where $\delta_0$ is the Dirac (or
Kronecker) symbol, satisfies

$$W_h(u, v) = 0 \text{ if } u = 0_{n/2} \text{ and } W_h(u, v) = W_f(u, v) + W_g(u) \text{ otherwise.} \qquad (7.9)$$

Proof We have $W_h(u, v) = W_f(u, v) - \sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{u \cdot x} + \sum_{x \in \mathbb{F}_2^{n/2}} (-1)^{g(x) \oplus u \cdot x} =
W_f(u, v) - 2^{n/2} \delta_0(u) + W_g(u)$. Function g being balanced, we have $W_g(0_{n/2}) = 0$. And
$W_f(0_{n/2}, v)$ equals $2^{n/2}$ for every v, since f is null on $\mathbb{F}_2^{n/2} \times \{0_{n/2}\}$ and according to
Relation (6.7), page 200, applied to $E = \{0_{n/2}\} \times \mathbb{F}_2^{n/2}$ and $a = b = 0_{n/2}$ (or see the
remark after Theorem 14, page 203).
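Relation (7.9) can be checked on a toy case. In the sketch below we assume n = 6, take for f the bent function $x \cdot y$ (which is null for $y = 0_{n/2}$) and for g the balanced function $x_1 x_2 \oplus x_3$; these choices and the helper names are ours and serve only as an illustration.

```python
def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), by the fast butterfly algorithm."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def nl(W):
    """Nonlinearity deduced from a Walsh spectrum: 2^(n-1) - max|W| / 2."""
    return len(W) // 2 - max(abs(w) for w in W) // 2

half = 3                      # n / 2, with n = 6
size = 1 << half

# The index of (x, y) stores x in the low-order half of the bits.
f = [bin(x & y).count("1") & 1 for y in range(size) for x in range(size)]   # bent: x . y
g = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(size)]      # x1 x2 + x3
# h(x, y) = f(x, y) + delta_0(y) g(x): only the block y = 0 of the truth table changes.
h = [f[x | (y << half)] ^ (g[x] if y == 0 else 0)
     for y in range(size) for x in range(size)]

Wf, Wg, Wh = walsh(f), walsh(g), walsh(h)
relation_7_9 = all(
    Wh[u | (v << half)] == (0 if u == 0 else Wf[u | (v << half)] + Wg[u])
    for u in range(size) for v in range(size)
)
print(relation_7_9, nl(Wf), nl(Wg), nl(Wh))
```

In this example h is balanced and has nonlinearity 26, while f and g have nonlinearities 28 and 2, respectively.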

We deduce that

$$\max_{u, v \in \mathbb{F}_2^{n/2}} |W_h(u, v)| \le \max_{u, v \in \mathbb{F}_2^{n/2}} |W_f(u, v)| + \max_{u \in \mathbb{F}_2^{n/2}} |W_g(u)|,$$

i.e., that $2^n - 2\,nl(h) \le 2^n - 2\,nl(f) + 2^{n/2} - 2\,nl(g)$, that is,

$$nl(h) \ge nl(f) + nl(g) - 2^{\frac{n}{2}-1} = 2^{n-1} - 2^{\frac{n}{2}} + nl(g).$$

Applying recursively this principle (if $\frac{n}{2}$ is even, g can be constructed in the same way),
we see that if $n = 2^k n'$ (n' odd), Dobbertin’s method allows reaching the nonlinearity
$2^{n-1} - 2^{\frac{n}{2}-1} - 2^{\frac{n}{4}-1} - \dots - 2^{n'-1} - 2^{\frac{n'-1}{2}}$ since we know that, for every odd n', the
nonlinearity of functions on $\mathbb{F}_2^{n'}$ can be as high as $2^{n'-1} - 2^{\frac{n'-1}{2}}$, and that balanced (quadratic)
functions can achieve this value. If n' ≤ 7, then this value is the best possible and
$2^{n-1} - 2^{\frac{n}{2}-1} - 2^{\frac{n}{4}-1} - \dots - 2^{n'-1} - 2^{\frac{n'-1}{2}}$ is therefore the best-known nonlinearity of balanced
functions in general. For n' > 7, the best nonlinearity of balanced n'-variable functions
is larger than $2^{n'-1} - 2^{\frac{n'-1}{2}}$ (see the paragraph devoted to nonlinearity in Section 3.1) and
$2^{n-1} - 2^{\frac{n}{2}-1} - 2^{\frac{n}{4}-1} - \dots - 2^{2n'-1} - 2^{n'} + nl(g)$, where g is an n'-variable balanced function,
can therefore reach higher values.
Dobbertin’s conjecture on balanced functions is that his construction allows reaching the
best nonlinearities of balanced functions in even numbers of variables. This question is still
open, and it is, in particular, an open problem to find an 8-variable balanced Boolean function
with nonlinearity 118.
Unfortunately, according to Relation (7.9), Dobbertin’s construction cannot produce
t-resilient functions with t > 0 since, g being a function defined on $\mathbb{F}_2^{n/2}$, there cannot
exist more than one vector a such that $W_g(a)$ equals $\pm 2^{n/2}$. Modifying bent functions into
resilient functions has been studied in [821].

7.1.9 Secondary constructions of correlation immune and resilient functions


There exist several simple secondary constructions, which can be combined to obtain
resilient functions achieving the bounds of Sarkar et al. and Siegenthaler. We list them
below in chronological order.

I. The direct sum of functions


A. Adding a variable
Let f be an r-variable t-resilient function. The Boolean function on $\mathbb{F}_2^{r+1}$:

$$h(x_1, \dots, x_r, x_{r+1}) = f(x_1, \dots, x_r) \oplus x_{r+1}$$

is (t + 1)-resilient [1041], since, for $a \in \mathbb{F}_2^r$ and $a_{r+1} \in \mathbb{F}_2$, we have $W_h(a, a_{r+1}) =
2\, W_f(a)\, \delta_1(a_{r+1})$. If f is an $(r, t, r - t - 1, 2^{r-1} - 2^{t+1})$ function4, then h is an
$(r + 1, t + 1, r - t - 1, 2^r - 2^{t+2})$ function, and thus achieves Siegenthaler’s and Sarkar et al.’s
bounds. But h has the linear structure (0, . . . , 0, 1).
4 Recall that, by an (n, m, d, N)-function, we mean an n-variable, m-resilient function having algebraic degree at
least d and nonlinearity at least N.

B. Generalization
If f is an r-variable t-resilient function (t ≥ 0) and if g is an s-variable m-resilient
function (m ≥ 0), then the function

$$h(x_1, \dots, x_r, x_{r+1}, \dots, x_{r+s}) = f(x_1, \dots, x_r) \oplus g(x_{r+1}, \dots, x_{r+s})$$

is (t + m + 1)-resilient, since

$$W_h(a, b) = W_f(a) \times W_g(b), \quad a \in \mathbb{F}_2^r,\ b \in \mathbb{F}_2^s. \qquad (7.10)$$

We have also $d_{alg}(h) = \max(d_{alg}(f), d_{alg}(g))$ and, thanks to Relation (3.1), page 79,
relating the nonlinearity to the Walsh transform, $nl(h) = 2^{r+s-1} - \frac{1}{2}(2^r - 2\,nl(f))(2^s -
2\,nl(g)) = 2^r nl(g) + 2^s nl(f) - 2\,nl(f)\,nl(g)$. Such a function, called decomposable, does
not give full satisfaction since such particular structure may be used in attacks. Moreover,
h has a low algebraic degree, in general. And if $nl(f) = 2^{r-1} - 2^{t+1}$ (t ≤ r − 2) and
$nl(g) = 2^{s-1} - 2^{m+1}$ (m ≤ s − 2, which is not the case when adding one variable), i.e.,
if nl(f) and nl(g) have maximum possible values, then $nl(h) = 2^{r+s-1} - 2^{t+m+3}$ and h
does not achieve Sarkar and Maitra’s bound. Function h has no nonzero linear structure if
and only if f and g both have no nonzero linear structure (we see then that having no linear
structure is not a sufficient criterion).
Note that the result does not work with unbalanced functions.
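Relation (7.10) and the resulting parameters of the direct sum can be checked on small functions. The sketch below uses two functions of our own choosing, $f = x_1 x_2 \oplus x_3$ (0-resilient, r = 3) and $g = y_1 \oplus y_2 \oplus y_3 y_4$ (1-resilient, s = 4); the helper names are also ours.

```python
def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), by the fast butterfly algorithm."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def resiliency(W):
    """Largest t with W(u) = 0 for all u of weight <= t (-1 if the function is unbalanced)."""
    return min(bin(u).count("1") for u in range(len(W)) if W[u] != 0) - 1

def nl(W):
    """Nonlinearity deduced from a Walsh spectrum: 2^(n-1) - max|W| / 2."""
    return len(W) // 2 - max(abs(w) for w in W) // 2

r, s = 3, 4
f = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(1 << r)]
g = [(y & 1) ^ ((y >> 1) & 1) ^ (((y >> 2) & 1) & ((y >> 3) & 1)) for y in range(1 << s)]
h = [f[x] ^ g[y] for y in range(1 << s) for x in range(1 << r)]   # direct sum, x in low bits

Wf, Wg, Wh = walsh(f), walsh(g), walsh(h)
product_ok = all(
    Wh[a | (b << r)] == Wf[a] * Wg[b] for a in range(1 << r) for b in range(1 << s)
)
nl_formula = 2 ** r * nl(Wg) + 2 ** s * nl(Wf) - 2 * nl(Wf) * nl(Wg)
print(product_ok, resiliency(Wf), resiliency(Wg), resiliency(Wh), nl(Wh), nl_formula)
```

The resiliency orders observed are t = 0, m = 1 and t + m + 1 = 2, and the nonlinearity of h agrees with the formula above.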

II. Siegenthaler’s construction


Let f and g be two Boolean functions on $\mathbb{F}_2^r$. Let us consider the function

$$h(x_1, \dots, x_r, x_{r+1}) = (x_{r+1} \oplus 1) f(x_1, \dots, x_r) \oplus x_{r+1}\, g(x_1, \dots, x_r)$$

on $\mathbb{F}_2^{r+1}$. Note that the truth table of h can be obtained by concatenating the truth tables of f
and g. Then:

$$W_h(a_1, \dots, a_r, a_{r+1}) = W_f(a_1, \dots, a_r) + (-1)^{a_{r+1}} W_g(a_1, \dots, a_r). \qquad (7.11)$$

Thus:
1. If f and g are t-resilient, then h is t-resilient [1041]; moreover, if for every $a \in \mathbb{F}_2^r$
of Hamming weight t + 1, we have $W_f(a) + W_g(a) = 0$, then h is (t + 1)-resilient.
Note that the construction recalled in I.A corresponds to g = f ⊕ 1 and satisfies this
condition. Another possible choice of a function g satisfying this condition (first pointed
out in [181]) is $g(x) = f(x_1 \oplus 1, \dots, x_r \oplus 1) \oplus \epsilon$, where ε = t [mod 2], since $W_g(a) =
\sum_{x \in \mathbb{F}_2^r} (-1)^{f(x) \oplus \epsilon \oplus (x \oplus 1_r) \cdot a} = (-1)^{\epsilon + w_H(a)} W_f(a)$. It leads to a function h having also a
nonzero linear structure.
2. The value $\max_{a_1, \dots, a_{r+1} \in \mathbb{F}_2} |W_h(a_1, \dots, a_r, a_{r+1})|$ is bounded above by the number
$\max_{a_1, \dots, a_r \in \mathbb{F}_2} |W_f(a_1, \dots, a_r)| + \max_{a_1, \dots, a_r \in \mathbb{F}_2} |W_g(a_1, \dots, a_r)|$; this implies
$2^{r+1} - 2\,nl(h) \le 2^{r+1} - 2\,nl(f) - 2\,nl(g)$, that is $nl(h) \ge nl(f) + nl(g)$.
a. If f and g achieve maximum possible nonlinearity $2^{r-1} - 2^{t+1}$ and if h is (t + 1)-
resilient, then the nonlinearity $2^r - 2^{t+2}$ of h is the best possible.
b. If f and g are such that, for every vector a, at least one of the numbers
$W_f(a)$, $W_g(a)$ is null (in other words, if the supports of the Walsh transforms
of f and g are disjoint), then we have $\max_{a_1, \dots, a_{r+1} \in \mathbb{F}_2} |W_h(a_1, \dots, a_r, a_{r+1})| =
\max\big(\max_{a_1, \dots, a_r \in \mathbb{F}_2} |W_f(a_1, \dots, a_r)|;\ \max_{a_1, \dots, a_r \in \mathbb{F}_2} |W_g(a_1, \dots, a_r)|\big)$. Hence we
have $2^{r+1} - 2\,nl(h) = 2^r - 2 \min(nl(f), nl(g))$ and nl(h) equals therefore
$2^{r-1} + \min(nl(f), nl(g))$; thus, if f and g achieve best possible nonlinearity
$2^{r-1} - 2^{t+1}$, then h achieves best possible nonlinearity $2^r - 2^{t+1}$.
3. If the monomials of highest degree in the algebraic normal forms of f and g are not
all the same, then $d_{alg}(h) = 1 + \max(d_{alg}(f), d_{alg}(g))$. Note that this condition is not
satisfied in the two cases indicated above in 1, for which h is (t + 1)-resilient.
4. For every $a = (a_1, \dots, a_r) \in \mathbb{F}_2^r$ and every $a_{r+1} \in \mathbb{F}_2$, we have, denoting $(x_1, \dots, x_r)$
by x: $D_{(a, a_{r+1})} h(x, x_{r+1}) = D_a f(x) \oplus a_{r+1} (f \oplus g)(x) \oplus x_{r+1} D_a (f \oplus g)(x) \oplus a_{r+1} D_a (f \oplus g)(x)$.
If $d_{alg}(f \oplus g) \ge d_{alg}(f)$, then $D_{(a,1)} h$ is nonconstant, for every a. And if,
additionally, there does not exist $a \ne 0_r$ such that $D_a f$ and $D_a g$ are constant and equal
to each other, then h admits no nonzero linear structure.
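Relation (7.11) and the choice $g(x) = f(x \oplus 1_r) \oplus \epsilon$ of item 1 can be illustrated as follows. The starting function $f = x_1 \oplus x_2 \oplus x_3 x_4$ (1-resilient, r = 4) is an arbitrary choice of ours, as are the helper names; this is a sketch rather than a general implementation.

```python
def walsh(tt):
    """Walsh transform W_f(u) = sum_x (-1)^(f(x) + u.x), by the fast butterfly algorithm."""
    w = [1 - 2 * bit for bit in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def resiliency(W):
    """Largest t with W(u) = 0 for all u of weight <= t (-1 if the function is unbalanced)."""
    return min(bin(u).count("1") for u in range(len(W)) if W[u] != 0) - 1

def nl(W):
    """Nonlinearity deduced from a Walsh spectrum: 2^(n-1) - max|W| / 2."""
    return len(W) // 2 - max(abs(w) for w in W) // 2

r, t = 4, 1
ones = (1 << r) - 1
eps = t % 2
f = [(x & 1) ^ ((x >> 1) & 1) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(1 << r)]
g = [f[x ^ ones] ^ eps for x in range(1 << r)]    # g(x) = f(x + 1_r) + eps
h = f + g      # concatenated truth tables: block x_{r+1} = 0, then block x_{r+1} = 1

Wf, Wg, Wh = walsh(f), walsh(g), walsh(h)
relation_7_11 = all(
    Wh[a | (c << r)] == Wf[a] + (-1) ** c * Wg[a] for a in range(1 << r) for c in (0, 1)
)
print(relation_7_11, resiliency(Wf), resiliency(Wh), nl(Wf), nl(Wh))
```

The function h obtained is (t + 1)-resilient and its nonlinearity $2^r - 2^{t+2} = 8$ is twice that of f, in accordance with items 1 and 2a.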
This construction allows obtaining from any two t-resilient functions f and g having
disjoint Walsh spectra, achieving nonlinearity $2^{r-1} - 2^{t+1}$ and such that $d_{alg}(f \oplus g) =
r - t - 1$, a t-resilient function h having algebraic degree r − t and having nonlinearity
$2^r - 2^{t+1}$, that is, achieving Siegenthaler’s and Sarkar et al.’s bounds; note that this construction
increases (by 1) the algebraic degrees of f and g. And since, from any t-resilient function
f having algebraic degree r − t − 1 and nonlinearity $2^{r-1} - 2^{t+1}$, we can deduce a function
h having resiliency order t + 1 and nonlinearity $2^r - 2^{t+2}$, that is, achieving Siegenthaler’s
and Sarkar et al.’s bounds and having the same algebraic degree as f (but having nonzero linear
structures), we can, by combining these two methods, keep best trade-offs among resiliency
order, algebraic degree, and nonlinearity, and increase by 1 the degree and the resiliency
order.

Generalization: Let $(f_y)_{y \in \mathbb{F}_2^s}$ be a family of r-variable t-resilient functions; then the function
on $\mathbb{F}_2^{r+s}$ defined by $f(x, y) = f_y(x)$ ($x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$) is t-resilient. Indeed, we have
$W_f(a, b) = \sum_{y \in \mathbb{F}_2^s} (-1)^{b \cdot y} W_{f_y}(a)$. Function f corresponds to the concatenation of
the functions $f_y$; hence, this secondary construction can be viewed as a generalization
of Maiorana–McFarland’s construction (in which the functions $f_y$ are t-resilient affine
functions).
More on the resilient functions achieving high nonlinearities and constructed by using,
among others, the secondary constructions above (as well as algorithmic methods) can be
found in [696].

III. Tarannikov’s elementary construction


Let g be any Boolean function on $\mathbb{F}_2^r$. We define the Boolean function h on $\mathbb{F}_2^{r+1}$
by $h(x_1, \dots, x_r, x_{r+1}) = x_{r+1} \oplus g(x_1, \dots, x_{r-1}, x_r \oplus x_{r+1})$. By the change of
variable $x_r \leftarrow x_r \oplus x_{r+1}$, we see that the Walsh transform $W_h(a_1, \dots, a_{r+1})$ is equal to
\[ \sum_{x_1, \dots, x_{r+1} \in \mathbb{F}_2} (-1)^{a \cdot x \,\oplus\, g(x_1, \dots, x_r) \,\oplus\, (a_r \oplus a_{r+1} \oplus 1)\, x_{r+1}}, \]
where $a = (a_1, \dots, a_r)$ and $x = (x_1, \dots, x_r)$;
if $a_r \oplus a_{r+1} = 0$, then this value is null, and if $a_r \oplus a_{r+1} = 1$, then it equals
$2\, W_g(a_1, \dots, a_{r-1}, a_r)$. Thus:
1. nl(h) = 2 nl(g).
2. If g is t-resilient, then h is t-resilient, since wH (a1 , . . . , ar ) ≤ wH (a1 , . . . , ar+1 ). And
h is (t + 1)-resilient if and only if, for every vector (a1 , . . . , ar+1 ) of Hamming weight
t + 1 such that ar ⊕ ar+1 = 1, we have Wg (a1 , . . . , ar ) = 0, and the only case not
implied by the t-resiliency of g is when ar = 1 and ar+1 = 0; hence, h is (t + 1)-resilient
if and only if Wg (a1 , . . . , ar−1 , 1) is null for every vector (a1 , . . . , ar−1 ) of Hamming
weight t; note that, in such a case, if g has nonlinearity $2^{r-1} - 2^{t+1}$ then the nonlinearity
of h, which equals $2^r - 2^{t+2}$, then achieves Sarkar et al.'s bound too. The condition that
Wg (a1 , . . . , ar−1 , 1) is null for every vector (a1 , . . . , ar−1 ) of Hamming weight at most t
is achieved if g does not actually depend on its last input bit; but the construction is then
a particular case of the construction recalled in I.A. The condition is also achieved if g is
obtained from two t-resilient functions, by using Siegenthaler’s construction (recalled in
II), according to Relation (7.11).
3. dalg (h) = dalg (g) if dalg (g) ≥ 1.
4. h has the nonzero linear structure (0, . . . , 0, 1, 1).
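The Walsh transform relation and the equality nl(h) = 2 nl(g) stated above can be checked numerically on a small example. The following is a minimal sketch, not taken from the book: the helper names (`walsh`, `nonlinearity`, `tarannikov`, `walsh_relation_holds`) and the bit encoding of the inputs are assumptions made for illustration, and g is simply a random 4-variable function.

```python
import random

def walsh(tt, n):
    """W_f(a) = sum over x of (-1)^(f(x) XOR a.x), for a truth table tt of length 2^n."""
    return [sum(1 - 2 * (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(1 << n))
            for a in range(1 << n)]

def nonlinearity(tt, n):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2."""
    return (1 << (n - 1)) - max(abs(w) for w in walsh(tt, n)) // 2

def tarannikov(g, r):
    """Truth table of h(x_1..x_{r+1}) = x_{r+1} XOR g(x_1, ..., x_{r-1}, x_r XOR x_{r+1}).
    Encoding assumed here: bit i-1 of the table index holds x_i."""
    h = []
    for z in range(1 << (r + 1)):
        x, x_last = z & ((1 << r) - 1), z >> r
        h.append(x_last ^ g[x ^ (x_last << (r - 1))])
    return h

def walsh_relation_holds(g, r):
    """Check W_h(a, a_{r+1}) = 2 W_g(a) if a_r XOR a_{r+1} = 1, and 0 otherwise."""
    Wg, Wh = walsh(g, r), walsh(tarannikov(g, r), r + 1)
    for A in range(1 << (r + 1)):
        a, a_r, a_last = A & ((1 << r) - 1), (A >> (r - 1)) & 1, A >> r
        if Wh[A] != (2 * Wg[a] if a_r ^ a_last else 0):
            return False
    return True

random.seed(0)
r = 4
g = [random.randint(0, 1) for _ in range(1 << r)]  # arbitrary test function
h = tarannikov(g, r)
```

Calling `walsh_relation_holds(g, r)` and comparing `nonlinearity(h, r + 1)` with `2 * nonlinearity(g, r)` exercises items 1 and 2 of the list on this instance; it is a sanity check on one function, not a proof.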

Tarannikov combined in [1080] this construction with the direct sum and Siegenthaler
constructions recalled in I and II, to build a more complex secondary construction, which
allows increasing at the same time the resiliency order and the algebraic degree of the
functions and which leads to an infinite sequence of functions achieving Siegenthaler’s
and Sarkar et al.'s bounds. Increasing then, by using the construction recalled in I.A, the
set of ordered pairs (r, t) for which such functions can be constructed, he deduced the
existence of r-variable t-resilient functions achieving Siegenthaler's and Sarkar et al.'s
bounds for any number of variables r and any resiliency order t such that $t \ge \frac{2r-7}{3}$ and
$t > \frac{r}{2} - 2$ (but these functions have nonzero linear structures). In [934], Pasalic et al.
slightly modified this more complex construction of Tarannikov into a construction that we
shall call Tarannikov et al.'s construction, which, when iterated together with the
construction recalled in I.A, allowed relaxing slightly the condition on t into $t \ge \frac{2r-10}{3}$
and $t > \frac{r}{2} - 2$.

IV. Indirect sum of functions


Tarannikov et al.’s construction has been in its turn generalized into a construction that has
been named indirect sum a few years after it was introduced, and that we already encountered
at page 233 as a construction of bent functions. Indirect sum builds a function h from four
functions, while the previous constructions used at most two functions. All the secondary
constructions listed above are particular cases of it: they correspond to fixing two or three of
the four functions.

Theorem 21 [225] Let r and s be positive integers and let t and m be non-negative integers
such that t < r and m < s. Let $f_1$ and $f_2$ be two r-variable functions. Let $g_1$ and $g_2$ be two
s-variable functions. We define the (r + s)-variable function:
\[ h(x, y) = f_1(x) \oplus g_1(y) \oplus (f_1 \oplus f_2)(x)\,(g_1 \oplus g_2)(y); \quad x \in \mathbb{F}_2^r,\ y \in \mathbb{F}_2^s. \]
If $f_1$ and $f_2$ are distinct and if $g_1$ and $g_2$ are distinct, then the algebraic degree
of h equals $\max(d_{alg}(f_1), d_{alg}(g_1), d_{alg}(f_1 \oplus f_2) + d_{alg}(g_1 \oplus g_2))$; otherwise, it
equals $\max(d_{alg}(f_1), d_{alg}(g_1))$. The Walsh transform of h takes value at (a, b), where
$a \in \mathbb{F}_2^r$, $b \in \mathbb{F}_2^s$:
\[ W_h(a, b) = \frac{1}{2} W_{f_1}(a) \big[ W_{g_1}(b) + W_{g_2}(b) \big] + \frac{1}{2} W_{f_2}(a) \big[ W_{g_1}(b) - W_{g_2}(b) \big]. \tag{7.12} \]
If $f_1$ and $f_2$ are t-resilient and $g_1$ and $g_2$ are m-resilient, then h is (t + m + 1)-resilient.
If the Walsh transforms of $f_1$ and $f_2$ have disjoint supports and if the Walsh transforms
of $g_1$ and $g_2$ have disjoint supports, then
\[ nl(h) = \min_{i,j \in \{1,2\}} \Big( 2^{r+s-2} + 2^{r-1}\, nl(g_j) + 2^{s-1}\, nl(f_i) - nl(f_i)\, nl(g_j) \Big). \tag{7.13} \]

In particular, if $f_1$ and $f_2$ are two $(r, t, -, 2^{r-1} - 2^{t+1})$ functions with disjoint Walsh
supports, if $g_1$ and $g_2$ are two $(s, m, -, 2^{s-1} - 2^{m+1})$ functions with disjoint Walsh supports,
and if $f_1 \oplus f_2$ has algebraic degree r − t − 1 and $g_1 \oplus g_2$ has algebraic degree s − m − 1, then h
is an $(r + s, t + m + 1, r + s - t - m - 2, 2^{r+s-1} - 2^{t+m+2})$ function, and thus achieves
Siegenthaler's and Sarkar et al.'s bounds.
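
Proof For every $a \in \mathbb{F}_2^r$, $b \in \mathbb{F}_2^s$, we have:
\[
\begin{aligned}
W_h(a, b) &= \sum_{\substack{y \in \mathbb{F}_2^s;\\ (g_1 \oplus g_2)(y) = 0}} \Big( \sum_{x \in \mathbb{F}_2^r} (-1)^{f_1(x) \oplus a \cdot x} \Big) (-1)^{g_1(y) \oplus b \cdot y}
 + \sum_{\substack{y \in \mathbb{F}_2^s;\\ (g_1 \oplus g_2)(y) = 1}} \Big( \sum_{x \in \mathbb{F}_2^r} (-1)^{f_2(x) \oplus a \cdot x} \Big) (-1)^{g_1(y) \oplus b \cdot y} \\
&= W_{f_1}(a) \sum_{\substack{y \in \mathbb{F}_2^s;\\ (g_1 \oplus g_2)(y) = 0}} (-1)^{g_1(y) \oplus b \cdot y}
 + W_{f_2}(a) \sum_{\substack{y \in \mathbb{F}_2^s;\\ (g_1 \oplus g_2)(y) = 1}} (-1)^{g_1(y) \oplus b \cdot y} \\
&= W_{f_1}(a) \sum_{y \in \mathbb{F}_2^s} (-1)^{g_1(y) \oplus b \cdot y}\, \frac{1 + (-1)^{(g_1 \oplus g_2)(y)}}{2}
 + W_{f_2}(a) \sum_{y \in \mathbb{F}_2^s} (-1)^{g_1(y) \oplus b \cdot y}\, \frac{1 - (-1)^{(g_1 \oplus g_2)(y)}}{2}.
\end{aligned}
\]

We deduce Relation (7.12). If (a, b) has Hamming weight at most t + m + 1, then a has
Hamming weight at most t or b has Hamming weight at most m; hence we have $W_h(a, b) = 0$.
Thus, h is (t + m + 1)-resilient.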
If $f_1 \oplus f_2$ and $g_1 \oplus g_2$ are nonconstant, then the algebraic degree of h equals
$\max(d_{alg}(f_1), d_{alg}(g_1), d_{alg}(f_1 \oplus f_2) + d_{alg}(g_1 \oplus g_2))$ because the terms of highest degrees
in $(g_1 \oplus g_2)(y)\,(f_1 \oplus f_2)(x)$, in $f_1(x)$ and in $g_1(y)$ cannot cancel each other. We deduce
from Relation (7.12) that if the supports of the Walsh transforms of $f_1$ and $f_2$ are disjoint,
as well as those of $g_1$ and $g_2$, then
\[ \max_{(a,b) \in \mathbb{F}_2^r \times \mathbb{F}_2^s} |W_h(a, b)| = \frac{1}{2} \max_{i,j \in \{1,2\}} \Big( \max_{a \in \mathbb{F}_2^r} |W_{f_i}(a)| \, \max_{b \in \mathbb{F}_2^s} |W_{g_j}(b)| \Big) \]
and according to Relation (3.1) relating the nonlinearity to the Walsh transform, this implies
\[ 2^{r+s} - 2\, nl(h) = \frac{1}{2} \max_{i,j \in \{1,2\}} \Big( (2^r - 2\, nl(f_i))(2^s - 2\, nl(g_j)) \Big), \]
which is equivalent to Relation (7.13).
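Relation (7.12) holds for arbitrary functions, so it can be tested directly on random truth tables. The sketch below is illustrative only: the function names and the index convention (x in the low bits, y in the high bits) are assumptions made here, and the four input functions are random rather than resilient.

```python
import random

def walsh(tt, n):
    """W_f(a) = sum over x of (-1)^(f(x) XOR a.x)."""
    return [sum(1 - 2 * (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(1 << n))
            for a in range(1 << n)]

def indirect_sum(f1, f2, g1, g2, r, s):
    """Truth table of h(x, y) = f1(x) + g1(y) + (f1 + f2)(x) (g1 + g2)(y) over F_2.
    Index convention assumed here: index = x + (y << r)."""
    return [f1[x] ^ g1[y] ^ ((f1[x] ^ f2[x]) & (g1[y] ^ g2[y]))
            for y in range(1 << s) for x in range(1 << r)]

def relation_7_12_holds(f1, f2, g1, g2, r, s):
    """Compare 2 W_h(a, b) with W_f1(a)[W_g1(b) + W_g2(b)] + W_f2(a)[W_g1(b) - W_g2(b)]."""
    Wf1, Wf2 = walsh(f1, r), walsh(f2, r)
    Wg1, Wg2 = walsh(g1, s), walsh(g2, s)
    Wh = walsh(indirect_sum(f1, f2, g1, g2, r, s), r + s)
    return all(
        2 * Wh[a + (b << r)] == Wf1[a] * (Wg1[b] + Wg2[b]) + Wf2[a] * (Wg1[b] - Wg2[b])
        for a in range(1 << r) for b in range(1 << s))

rng = random.Random(0)
r, s = 3, 2
f1, f2 = ([rng.randint(0, 1) for _ in range(1 << r)] for _ in range(2))
g1, g2 = ([rng.randint(0, 1) for _ in range(1 << s)] for _ in range(2))
```

The comparison is done on 2 W_h(a, b) to stay in integer arithmetic. Checking the resiliency and nonlinearity statements of Theorem 21 would additionally require inputs satisfying its hypotheses, which this sketch does not construct.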

Note that function h, defined this way, is the concatenation of the four functions f1 , f1 ⊕1,
f2 and f2 ⊕ 1, in an order controlled by g1 (y) and g2 (y).
This construction is nicely general and does not need the initial functions f1 , f2 and g1 , g2
to satisfy complex conditions, contrary to other constructions that have been derived later for
building bent functions (see pages 233 and foll.) and could be adapted for designing resilient
functions.
Examples of pairs (f1 , f2 ) (or (g1 , g2 )) satisfying the hypotheses of Theorem 21 can be
found in [225]. The interest of the indirect sum compared to the direct sum is that it allows
designing functions h that are more complex (have larger algebraic degree and possibly
larger algebraic immunity and fast algebraic immunity).

Remark. The indirect sum (as well as all its particular cases viewed above) is less
well adapted to constructing correlation immune functions: Relation (7.12) shows that
if $W_{f_1}(a) = W_{f_2}(a) = W_{g_1}(b) = W_{g_2}(b) = 0$, then $W_h(a, b) = 0$, but when for
instance $a = 0_r$ and $b \neq 0_s$, we have
\[ W_h(a, b) = \frac{1}{2} W_{f_1}(0_r) \big[ W_{g_1}(b) + W_{g_2}(b) \big] + \frac{1}{2} W_{f_2}(0_r) \big[ W_{g_1}(b) - W_{g_2}(b) \big], \]
and there are additional conditions on the values of
$W_{f_1}(0_r)$, $W_{f_2}(0_r)$, $W_{g_1}(b)$, and $W_{g_2}(b)$ when $w_H(b) \ge m + 1$ (and on the values of
$W_{g_1}(0_s)$, $W_{g_2}(0_s)$, $W_{f_1}(a)$, and $W_{f_2}(a)$ when $w_H(a) \ge t + 1$) for allowing h to be more
than (min(t, m))-th order correlation immune.

V. Constructions without extension of the number of variables


Proposition 85, page 236, leads to the following construction:

Proposition 122 [227] Let n be any positive integer and t any nonnegative integer such
that t ≤ n. Let f1 , f2 , and f3 be three t-th order correlation immune (resp. t-resilient)
functions. Then the function s1 = f1 ⊕ f2 ⊕ f3 is t-th order correlation immune (resp.
t-resilient) if and only if the function s2 = f1 f2 ⊕ f1 f3 ⊕ f2 f3 is t-th order correlation
immune (resp. t-resilient). Moreover,
\[ nl(s_2) \ge \frac{1}{2} \Big( nl(s_1) + \sum_{i=1}^{3} nl(f_i) \Big) - 2^{n-1} \tag{7.14} \]
and if the Walsh supports of $f_1$, $f_2$, and $f_3$ are pairwise disjoint (that is, if at most one value
$W_{f_i}(s)$, i = 1, 2, 3, is nonzero, for every vector s), then
\[ nl(s_2) \ge \frac{1}{2} \Big( nl(s_1) + \min_{1 \le i \le 3} nl(f_i) \Big). \tag{7.15} \]

Proof Relation (6.30), page 236, and the fact that, for every nonzero vector (resp. any
vector) a of Hamming weight at most t, we have Wfi (a) = 0 for i = 1, 2, 3 imply
that Ws1 (a) = 0 if and only if Ws2 (a) = 0. Relations (7.14) and (7.15) are also direct
consequences of Relation (6.30) and of Relation (3.1), page 79, relating the nonlinearity to
the Walsh transform.
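As a numerical illustration of the argument above, the sketch below checks, on random functions, the identity $W_{s_2} = \frac{1}{2}(W_{f_1} + W_{f_2} + W_{f_3} - W_{s_1})$ and the bound (7.14). Relation (6.30) is not reproduced in this section; the identity used here is the one satisfied by the majority function $s_2$, and taking it to be the content of (6.30) is an assumption. All helper names are hypothetical.

```python
import random

def walsh(tt, n):
    """W_f(a) = sum over x of (-1)^(f(x) XOR a.x)."""
    return [sum(1 - 2 * (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(1 << n))
            for a in range(1 << n)]

def nl(tt, n):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2."""
    return (1 << (n - 1)) - max(abs(w) for w in walsh(tt, n)) // 2

def check_majority_construction(n, seed):
    """Return (identity_ok, bound_ok) for three random n-variable functions."""
    rng = random.Random(seed)
    f1, f2, f3 = ([rng.randint(0, 1) for _ in range(1 << n)] for _ in range(3))
    s1 = [a ^ b ^ c for a, b, c in zip(f1, f2, f3)]
    s2 = [(a & b) ^ (a & c) ^ (b & c) for a, b, c in zip(f1, f2, f3)]
    W1, W2, W3 = walsh(f1, n), walsh(f2, n), walsh(f3, n)
    Ws1, Ws2 = walsh(s1, n), walsh(s2, n)
    identity_ok = all(2 * Ws2[a] == W1[a] + W2[a] + W3[a] - Ws1[a]
                      for a in range(1 << n))
    # Relation (7.14), multiplied by 2 to stay in integers.
    bound_ok = 2 * nl(s2, n) >= nl(s1, n) + nl(f1, n) + nl(f2, n) + nl(f3, n) - (1 << n)
    return identity_ok, bound_ok
```

For instance, `check_majority_construction(4, 0)` runs the check on three random 4-variable functions. For random inputs the bound (7.14) is usually far from tight; it becomes informative when the $f_i$ and $s_1$ all have high nonlinearity.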

Note that this secondary construction is suited to achieving high algebraic
immunity with $s_2$, given functions $f_1$, $f_2$, $f_3$, and $s_1$ with lower algebraic immunities,
since the support of $s_2$ can be made more complex than those of these functions. This is
done without changing the number of variables and while keeping similar resiliency order and
nonlinearity.

Remark. Let g and h be two Boolean functions on $\mathbb{F}_2^n$ with disjoint supports and let f
be equal to $g \oplus h = g + h$. Then f is balanced if and only if $w_H(g) + w_H(h) = 2^{n-1}$.
By linearity of the Fourier–Hadamard transform, we have $\widehat{f} = \widehat{g} + \widehat{h}$. Thus, if g and h are
t-th order correlation immune, then f, when balanced, is t-resilient. For every nonzero $a \in \mathbb{F}_2^n$, we have
$|W_f(a)| = 2\,|\widehat{f}(a)| \le 2\,|\widehat{g}(a)| + 2\,|\widehat{h}(a)| = |W_g(a)| + |W_h(a)|$. Thus, assuming that f is
balanced, we have $nl(f) \ge nl(g) + nl(h) - 2^{n-1}$. The algebraic degree of f is bounded
above by (and can be equal to) the maximum of the algebraic degrees of g and h.

The largest part of the secondary constructions of bent functions described in Subsection
6.1.16 can be altered into constructions of correlation immune and resilient functions;
see [216].
The generalization of Proposition 85 given by Proposition 86, page 237, leads to:

Proposition 123 [227] Let n be any positive integer and k any nonnegative integer such
that k ≤ n. Let $f_1, \dots, f_7$ be k-th order correlation immune (resp. k-resilient) functions.
If two among the functions $s_1 = f_1 \oplus \dots \oplus f_7$, $s_2 = f_1 f_2 \oplus f_1 f_3 \oplus \dots \oplus f_6 f_7$ and
$s_4 = \bigoplus_{1 \le i_1 < \dots < i_4 \le 7} \prod_{j=1}^{4} f_{i_j}$ are k-th order correlation immune (resp. k-resilient), then the
third one is k-th order correlation immune (resp. k-resilient).

Low Hamming weight correlation immune functions


Except for the secondary construction without extension of the number of variables, the
primary and secondary constructions of resilient functions recalled above do not work well
for building unbalanced correlation immune functions, as we observed for the indirect sum
in the remark at the head of page 302. We shall see in Section 12.1, page 425, that low
Hamming weight correlation immune functions are useful for countermeasures to side-
channel attacks. More constructions are then needed.
We denote by $CI_{n,t}$ the set of n-variable t-th order correlation immune Boolean functions
and by $\omega_{n,t}$ the minimal Hamming weight of nonzero functions in $CI_{n,t}$.
According to Proposition 120, page 289, the Hamming weight of a t-th order correlation
immune function f is divisible by $2^{\,t + \lfloor (n-t-1)/d_{alg}(f) \rfloor}$.
The only n-variable n-th order correlation immune Boolean functions are the two constant
functions. The only (n − 1)-th order correlation immune nonconstant Boolean functions are
the (n − 1)-resilient functions $\bigoplus_{i=1}^{n} x_i$ and $\bigoplus_{i=1}^{n} x_i \oplus 1$. Then $\omega_{n,n} = 2^n$ and $\omega_{n,n-1} = 2^{n-1}$.
We have of course $\omega_{n,t} \le \omega_{n,t+1}$ and more precisely:

Lemma 10 Let 1 ≤ t ≤ n be integers. Then
\[ \omega_{n+1,t} \le 2\, \omega_{n,t} \le \omega_{n+1,t+1}. \]

Proof For every $f \in CI_{n,t}$, the (n + 1)-variable function $g(x, x_{n+1}) = f(x)$ belongs
to $CI_{n+1,t}$, since, for every a, we have $\widehat{g}(a, 0) = 2\,\widehat{f}(a)$ and $\widehat{g}(a, 1) = 0$. Moreover,
g has Hamming weight $2\, w_H(f)$. This proves the left-hand side inequality. For every
$f \in CI_{n+1,t+1}$, the restriction of f to the hyperplane of equation $x_{n+1} = 0$ is a t-th
order correlation immune Boolean function with half weight. This proves the right-hand
side inequality.
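For very small n, the quantity $\omega_{n,t}$ can be computed by exhaustive search, which gives a direct way to see Lemma 10 at work. The sketch below is a naive brute force written for illustration only (the names `is_ci` and `omega` are hypothetical); it enumerates supports by increasing size and is practical only up to about n = 4.

```python
from itertools import combinations

def is_ci(support, n, t):
    """True if the Fourier-Hadamard transform of the indicator of `support`
    vanishes at every nonzero u of Hamming weight at most t."""
    for u in range(1, 1 << n):
        if bin(u).count("1") <= t:
            if sum(1 - 2 * (bin(u & x).count("1") & 1) for x in support) != 0:
                return False
    return True

def omega(n, t):
    """Minimal Hamming weight of a nonzero n-variable t-th order
    correlation immune function, found by exhaustive search."""
    for w in range(1, (1 << n) + 1):
        for support in combinations(range(1 << n), w):
            if is_ci(support, n, t):
                return w
```

On these small cases the search returns `omega(3, 1) == 2`, `omega(3, 2) == 4` and `omega(4, 2) == 8`, so that $\omega_{4,1} \le 2\,\omega_{3,1} \le \omega_{4,2}$ reads 2 ≤ 4 ≤ 8, in agreement with the lemma.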

As observed in [591], the largest dimension $k_{max}(n, t + 1)$ of a binary linear code [n, k,
t + 1] provides the upper bound $\omega_{n,t} \le 2^{\,n - k_{max}(n, t+1)}$, according to Corollary 6, page 88,
and to the fact that the dual of a linear code of dimension k has dimension n − k. Since a
binary MDS code of parameters [n, n − 1, 2] exists, we have then $\omega_{n,1} = 2$ for every n.

Table 7.1 Lower bound on $\omega_{n,t}$ from Delsarte's linear programming bound [74].

n \ t      1      2      3      4      5      6      7      8      9     10     11     12     13

    1      2
    2      2      4
    3      2      4      8
    4      2      6      8     16
    5      2      8     12     16     32
    6      2      8     16     32     32     64
    7      2      8     16     48     64     64    128
    8      2     10     16     64     88    112    128    256
    9      2     12     20     96    128    192    224    256    512
   10      2     12     24     96    192    320    384    512    512  1,024
   11      2     12     24     96    192    512    640  1,024  1,024  1,024  2,048
   12      2     14     24    112    176    768  1,024  1,536  1,792  2,048  2,048  4,096
   13      2     16     28    128    224  1,024  1,536  2,560  3,072  3,584  4,096  4,096  8,192

Table 7.2 Minimum weight of t-th order correlation immune nonzero n-variable functions.

n \ t      1      2      3      4      5      6      7      8      9     10     11     12     13

    1      2
    2      2      4
    3      2      4      8
    4      2      8      8     16
    5      2      8     16     16     32
    6      2      8     16     32     32     64
    7      2      8     16     64     64     64    128
    8      2     12     16     64    128    128    128    256
    9      2     12     24    128    128    256    256    256    512
   10      2     12     24    128    256    512    512    512    512  1,024
   11      2     12     24    128    256    512  1,024  1,024  1,024  1,024  2,048
   12      2     16     24    ???    256    512  1,024  2,048  2,048  2,048  2,048  4,096
   13      2     16     32    ???    ???    ???  1,024  4,096  4,096  4,096  4,096  4,096  8,192

As also observed in [591], $\omega_{n,t}$ being equal to the minimal number of rows in a simple
binary orthogonal array of strength t, Delsarte's linear programming bound [422] provides
a lower bound on $\omega_{n,t}$ that we give in Table 7.1.
The Satisfiability Modulo Theory (SMT) tool has been used to search for correlation
immune Boolean functions in [74] together with the upper bound deduced from known
constructions of binary codes, the lower bound of Table 7.1, and the divisibility of $\omega_{n,t}$
by $2^t$.
Table 7.2, displaying the known values of $\omega_{n,t}$ for n ≤ 13, is taken from [74, 258, 287,
1104]. The entries in light gray follow from $\omega_{n,1} = 2$ and $\omega_{n,n} = 2^n$. The entries in dark gray
follow from $\omega_{n,n-1} = 2^{n-1}$ and from Lemma 10 above, which imply $\omega_{n,t} \le \omega_{n,n-1} = 2^{n-1}$,
and from Theorem 116, page 285, which implies that $\omega_{n,t} = 2^{n-1}$ for $\lceil \frac{2n-2}{3} \rceil \le t \le n - 1$.
The entry n = 11, t = 4 is obtained in [74, 1104] and the entries n = 11, t = 5;
n = 12, t = 5; and n = 12, t = 7 follow from Proposition 124 below. The entries in
bold have been obtained by the SMT tool. A triple question mark in this table indicates
that the value is unknown. Note, however, that upper bounds are known for these entries:
[946] makes a detailed exploration of several evolutionary algorithms for finding Boolean
functions that have various orders of correlation immunity and minimal Hamming weight.
These investigations show that $\omega_{11,4} \le 128$, $\omega_{11,5} \le 256$, $\omega_{12,5} \le 256$, $\omega_{12,6} \le 1{,}024$,
$\omega_{13,7} \le 2{,}048$.
It is an open question to determine whether the columns in this table (and more generally
for every value of n and t) are nondecreasing, that is, $\omega_{n,t} \le \omega_{n+1,t}$ for every n and t. If the
reply to this question is positive, then these values are optimal. It is also shown in [946] that
$\omega_{12,4} \le 256$, $\omega_{13,4} \le 256$, $\omega_{13,5} \le 512$, $\omega_{13,6} \le 1{,}024$. See also [112], where nonexistence
results are proved.
It is shown in [1104] that $\omega_{n,2} \ge 4 \lceil \frac{n+1}{4} \rceil$ for n ≥ 2 (and the proof can be slightly
simplified): the Golomb–Xiao–Massey characterization of correlation immune functions
(Theorem 5, page 87) directly gives that a Boolean function f is in $CI_{n,2}$ if and only if
the matrix $H = \big( (-1)^{e_i \cdot x} \big)_{x \in supp(f),\; i = 0, 1, \dots, n}$, where $e_0 = 0_n$ and $(e_1, \dots, e_n)$ is the canonical basis of
$\mathbb{F}_2^n$ over $\mathbb{F}_2$, satisfies $H^t \times H = w_H(f)\, I_{n+1}$, where $I_{n+1}$ is the identity matrix (i.e., H is a
Hadamard matrix); this shows that $\omega_{n,2} \ge 4 \lceil \frac{n+1}{4} \rceil$ since we know that 4 divides $\omega_{n,2}$ and
that matrix $I_{n+1}$ has rank n + 1, while matrix H has rank at most $w_H(f)$ and $H^t \times H$ has
necessarily rank smaller than or equal to that of H.
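The matrix criterion above is easy to test on a concrete support. The following sketch builds the Gram matrix $H^t \times H$ for a given support and compares it with a multiple of the identity. The function names are hypothetical, and the two supports used (the even-weight vectors of length 3, and the pair {000, 111}) are examples chosen here, not taken from the text.

```python
def gram_matrix(support, n):
    """H^t x H for H = ((-1)^(e_i . x)), with rows indexed by x in `support`
    and columns by i = 0..n, where e_0 is the zero vector."""
    cols = [[1] * len(support)]
    cols += [[1 - 2 * ((x >> i) & 1) for x in support] for i in range(n)]
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

def is_scaled_identity(G, scale):
    """True if G equals scale times the identity matrix."""
    return all(G[i][j] == (scale if i == j else 0)
               for i in range(len(G)) for j in range(len(G)))

even3 = [x for x in range(8) if bin(x).count("1") % 2 == 0]  # 2nd order CI, weight 4
pair3 = [0b000, 0b111]                                       # only 1st order CI
```

Here `is_scaled_identity(gram_matrix(even3, 3), 4)` holds, while the same test fails for `pair3` with scale 2, since two distinct columns $e_i$, $e_j$ with $i, j \ge 1$ are not orthogonal on that support.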
It is deduced in [1104] that for each known Hadamard 4k × 4k matrix, a function in
$CI_{4k-1,2}$ of (minimum) Hamming weight 4k (and functions in $CI_{4k+i,2}$ of Hamming weight
4k for every i = 0, 1, 2) can be deduced. It has been conjectured by J. Hadamard that there
exists a 4k × 4k Hadamard matrix for every k. According to the observations above, this
conjecture is equivalent to conjecturing that $\omega_{n,2} = 4 \lceil \frac{n+1}{4} \rceil$ for every n.

Proposition 124 [258] Let t be any even integer such that 2 ≤ t ≤ n. Then
\[ \omega_{n+1,t+1} = 2\, \omega_{n,t}. \]

Proof For every $f \in CI_{n,t}$, the (n + 1)-variable function
\[ g(x, x_{n+1}) = \begin{cases} f(x), & \text{when } x_{n+1} = 0 \\ f(x + 1_n), & \text{when } x_{n+1} = 1, \end{cases} \]
has Hamming weight $2\, w_H(f)$ and is a (t + 1)-th order correlation immune Boolean
function. Indeed, for any $u \in \mathbb{F}_2^n$ and any $u_{n+1} \in \mathbb{F}_2$, we have:
\[
\begin{aligned}
\widehat{g}(u, u_{n+1}) &= \sum_{(x, x_{n+1}) \in \mathbb{F}_2^{n+1}} g(x, x_{n+1}) (-1)^{(u, u_{n+1}) \cdot (x, x_{n+1})} \\
&= \sum_{x \in \mathbb{F}_2^n} f(x) (-1)^{u \cdot x} + \sum_{x \in \mathbb{F}_2^n} f(x + 1_n) (-1)^{(u, u_{n+1}) \cdot (x, 1)} \\
&= \widehat{f}(u) + \sum_{x \in \mathbb{F}_2^n} f(x) (-1)^{(u, u_{n+1}) \cdot (x + 1_n, 1)} \\
&= \big( 1 + (-1)^{w_H(u, u_{n+1})} \big)\, \widehat{f}(u).
\end{aligned}
\]
If $w_H(u, u_{n+1}) = t + 1$, then since t is an even integer, we have $1 + (-1)^{w_H(u, u_{n+1})} = 1 + (-1)^{t+1} = 0$, thus $\widehat{g}(u, u_{n+1}) = 0$.
If $u = 0_n$ and $u_{n+1} = 1$, then $1 + (-1)^{w_H(u, u_{n+1})} = 0$, and $\widehat{g}(u, u_{n+1}) = 0$.
If $1 \le w_H(u, u_{n+1}) \le t$ and $u \neq 0_n$, we have that $1 \le w_H(u) \le t$, and since $f \in CI_{n,t}$, we have $\widehat{f}(u) = 0$, then $\widehat{g}(u, u_{n+1}) = 0$.
Hence, if $1 \le w_H(u, u_{n+1}) \le t + 1$, then $\widehat{g}(u, u_{n+1}) = 0$ and $g(x, x_{n+1})$ is a (t + 1)-th
order correlation immune Boolean function. Thus, $\omega_{n+1,t+1} \le 2\,\omega_{n,t}$ when t is even, and
since $2\,\omega_{n,t} \le \omega_{n+1,t+1}$ for any 1 ≤ t ≤ n according to Lemma 10, this completes the
proof.
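The construction used in this proof can be tried on a small case. The sketch below lifts the support of a function f to the support of g (with g(x, 0) = f(x) and g(x, 1) = f(x + 1_n)) and measures the correlation immunity order before and after. The helper names and the choice of f (the indicator of the even-weight vectors of length 3, which lies in $CI_{3,2}$) are assumptions made for illustration.

```python
def fourier(support, u):
    """Fourier-Hadamard transform at u of the indicator function of `support`."""
    return sum(1 - 2 * (bin(u & x).count("1") & 1) for x in support)

def ci_order(support, n):
    """Largest t such that the transform vanishes at all u with 1 <= w_H(u) <= t."""
    bad = [bin(u).count("1") for u in range(1, 1 << n) if fourier(support, u) != 0]
    return min(bad) - 1 if bad else n

def lift(support, n):
    """Support of g(x, x_{n+1}): f(x) on x_{n+1} = 0 and f(x + 1_n) on x_{n+1} = 1.
    Bit n of the integer encodes x_{n+1}."""
    ones = (1 << n) - 1
    return sorted(set(support) | {(x ^ ones) | (1 << n) for x in support})

n = 3
f_support = [0b000, 0b011, 0b101, 0b110]   # even-weight vectors, in CI_{3,2}
g_support = lift(f_support, n)
```

With these inputs, `ci_order(f_support, 3)` is 2 and `ci_order(g_support, 4)` is 3, while the weight doubles from 4 to 8, as the proposition predicts for even t = 2.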
This leads to the bound $\omega_{n,3} \ge 8 \lceil \frac{n}{4} \rceil$ for n ≥ 3. It is conjectured in [258] that for
n ≥ 3, we have $\omega_{n,3} = 8 \lceil \frac{n}{4} \rceil$. Using the characterization of functions in $CI_{n,2}$ given above
Proposition 124 by means of Hadamard matrices and the known existence of infinitely
many 4k × 4k Hadamard matrices, Wang deduced in [1104] that infinitely many values
n ≡ i [mod 4] satisfy the conjecture for each i = −1, 0, 1, 2. He observed that the
conjecture (which is still open) is equivalent to that of Hadamard, which is more than 100
years old.
A construction of functions of weight $2^m$ in $CI_{n,t}$ has been given in [1104], which
defines their support as made of the $2^m$ vectors of the form $(v \cdot u_1, \dots, v \cdot u_n)$, where v
ranges over $\mathbb{F}_2^m$ and the $u_j$ are such that none of them depends linearly on at most t − 1
others. This construction is nothing more than Corollary 6, page 88, with a linear code
whose generator matrix is made of the $u_j$ by columns (recall that the dual distance of such
a code is the minimum number of linearly dependent columns), or Corollary 22, page 292.
The construction, however, allowed completing some entries in the table that was given in
[74, 287] (Table 7.2 is the completed table).

Using the Fourier–Hadamard transform instead of the Walsh transform to construct correlation immune functions

We have seen that correlation immune functions are characterized by both the Fourier–
Hadamard transform and the Walsh transform. We have also seen that most known
constructions of correlation immune functions were based on the properties of the Walsh
transform and that they built in fact resilient functions, mostly. The Fourier–Hadamard
transform and the Walsh transform are closely related through Relation (2.32), page 55.
However, they behave differently with respect to the operations in $\mathcal{BF}_n$: while the Walsh
transform behaves well with respect to the addition of Boolean functions (for instance,
the Walsh transform of a direct sum equals the product of the Walsh transforms; see
Relation (7.10), page 297), the Fourier–Hadamard transform behaves well with respect to
the multiplication of functions; in particular, the Fourier–Hadamard transform of a direct
product equals the product of the Fourier–Hadamard transforms, since
\[ \sum_{x \in \mathbb{F}_2^n,\, y \in \mathbb{F}_2^m} f(x)\, g(y)\, (-1)^{a \cdot x \oplus b \cdot y} = \Big( \sum_{x \in \mathbb{F}_2^n} f(x) (-1)^{a \cdot x} \Big) \Big( \sum_{y \in \mathbb{F}_2^m} g(y) (-1)^{b \cdot y} \Big). \]

Multiplying Boolean functions produces unbalanced functions, and if the functions have
low Hamming weights, the product has low Hamming weight as well. A related general
construction of correlation immune functions by multiplication is deduced in [258] that we
report now. In the next proposition, given a matrix $M \in \mathbb{F}_2^{ns \times ns}$ and given i, j = 1, . . . , s,
we denote by $M^{(i,j)}$ the n × n matrix (called a block of M) obtained from M by selecting its
rows of indices between n(i − 1) + 1 and ni and its columns of indices between n(j − 1) + 1
and nj. Assuming that M is nonsingular, denoting the inverse matrix of M by $M^{-1}$ and the
transposed matrix of $M^{-1}$ by $M'$, we denote by $(M^{-1})^{(i,j)}$ and $M'^{(i,j)}$ the matrices obtained
similarly from $M^{-1}$ and $M'$. Since $M'^{(j,i)}$ is the transposed matrix of $(M^{-1})^{(i,j)}$, we have, for
any $x, y \in \mathbb{F}_2^n$:
\[ x \cdot \big( y \times (M^{-1})^{(i,j)} \big) = y \cdot \big( x \times M'^{(j,i)} \big), \tag{7.16} \]

where “·” is the usual inner product.

Proposition 125 [258] Let s be a positive integer and M be an ns × ns nonsingular
matrix over $\mathbb{F}_2$. Let $f_j \in CI_{n,t_j}$ for some nonnegative integers $t_j$, 1 ≤ j ≤ s. Define the
following ns-variable function h, whose input is written in the form $(x^{(1)}, x^{(2)}, \dots, x^{(s)})$,
where $x^{(1)}, x^{(2)}, \dots, x^{(s)} \in \mathbb{F}_2^n$:
\[ h(x^{(1)}, x^{(2)}, \dots, x^{(s)}) = \prod_{j=1}^{s} f_j \Big( \sum_{i=1}^{s} x^{(i)} \times M^{(i,j)} \Big). \]
Assume that if $1 \le w_H(u^{(1)}, u^{(2)}, \dots, u^{(s)}) \le t$, then there exists 1 ≤ j ≤ s such that
\[ 1 \le w_H \Big( \sum_{i=1}^{s} u^{(i)} \times M'^{(i,j)} \Big) \le t_j. \]
Then h belongs to $CI_{ns,t}$ and has Hamming weight $\prod_{j=1}^{s} w_H(f_j)$.

Proof For any $(u^{(1)}, u^{(2)}, \dots, u^{(s)}) \in (\mathbb{F}_2^n)^s$, we have
\[ \widehat{h}(u^{(1)}, u^{(2)}, \dots, u^{(s)}) = \sum_{x^{(1)}, \dots, x^{(s)} \in \mathbb{F}_2^n} \prod_{j=1}^{s} f_j \Big( \sum_{i=1}^{s} x^{(i)} \times M^{(i,j)} \Big) (-1)^{\sum_{j=1}^{s} u^{(j)} \cdot x^{(j)}}. \]
Replace $\sum_{i=1}^{s} x^{(i)} \times M^{(i,j)}$ by $y^{(j)}$ for 1 ≤ j ≤ s; then $(y^{(1)}, y^{(2)}, \dots, y^{(s)}) =
(x^{(1)}, x^{(2)}, \dots, x^{(s)}) \times M$, according to the well-known method of multiplication of matrices
by blocks. Thus
\[ (x^{(1)}, x^{(2)}, \dots, x^{(s)}) = (y^{(1)}, y^{(2)}, \dots, y^{(s)}) \times M^{-1}, \]
which means $x^{(j)} = \sum_{i=1}^{s} y^{(i)} \times (M^{-1})^{(i,j)}$ for 1 ≤ j ≤ s. Using (7.16), we have
\[
\begin{aligned}
\widehat{h}(u^{(1)}, u^{(2)}, \dots, u^{(s)})
&= \sum_{y^{(1)}, \dots, y^{(s)} \in \mathbb{F}_2^n} \Big( \prod_{j=1}^{s} f_j(y^{(j)}) \Big) (-1)^{\sum_{j=1}^{s} u^{(j)} \cdot \big( \sum_{i=1}^{s} y^{(i)} \times (M^{-1})^{(i,j)} \big)} \\
&= \sum_{y^{(1)}, \dots, y^{(s)} \in \mathbb{F}_2^n} \Big( \prod_{j=1}^{s} f_j(y^{(j)}) \Big) (-1)^{\sum_{i=1}^{s} y^{(i)} \cdot \big( \sum_{j=1}^{s} u^{(j)} \times M'^{(j,i)} \big)} \\
&= \sum_{y^{(1)}, \dots, y^{(s)} \in \mathbb{F}_2^n} \Big( \prod_{j=1}^{s} f_j(y^{(j)}) \Big) (-1)^{\sum_{j=1}^{s} y^{(j)} \cdot \big( \sum_{i=1}^{s} u^{(i)} \times M'^{(i,j)} \big)} \\
&= \prod_{j=1}^{s} \Big( \sum_{y^{(j)} \in \mathbb{F}_2^n} f_j(y^{(j)}) (-1)^{y^{(j)} \cdot \big( \sum_{i=1}^{s} u^{(i)} \times M'^{(i,j)} \big)} \Big) \\
&= \prod_{j=1}^{s} \widehat{f_j} \Big( \sum_{i=1}^{s} u^{(i)} \times M'^{(i,j)} \Big).
\end{aligned}
\]
According to the hypothesis, for any $(u^{(1)}, u^{(2)}, \dots, u^{(s)}) \in (\mathbb{F}_2^n)^s$ satisfying
$1 \le w_H(u^{(1)}, u^{(2)}, \dots, u^{(s)}) \le t$, there exists 1 ≤ j ≤ s such that
\[ 1 \le w_H \Big( \sum_{i=1}^{s} u^{(i)} \times M'^{(i,j)} \Big) \le t_j. \]
Since $f_j$ is a $t_j$-th order correlation immune Boolean function for any 1 ≤ j ≤ s, h is a
t-th order correlation immune Boolean function. And h being affine equivalent to the direct
product of the $f_j$, we have $w_H(h) = \prod_{j=1}^{s} w_H(f_j)$.

Corollary 23 [258] Let n, t, s be positive integers satisfying t ≤ n and s ≥ 2. Assume that
$f_1 \in CI_{n,t}$ and $f_j \in CI_{n, \lfloor t/2 \rfloor}$ for any 2 ≤ j ≤ s. Define
\[ h(x^{(1)}, x^{(2)}, \dots, x^{(s)}) = f_1(x^{(1)}) \prod_{j=2}^{s} f_j(x^{(1)} + x^{(j)}), \]
where $x^{(1)}, x^{(2)}, \dots, x^{(s)} \in \mathbb{F}_2^n$. Then h belongs to $CI_{ns,t}$ and has Hamming weight
$\prod_{j=1}^{s} w_H(f_j)$.

Proof Let M be the ns × ns nonsingular matrix whose representation by n × n blocks
equals
\[ M = \begin{bmatrix} I & I & I & \cdots & I \\ 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ 0 & 0 & 0 & \cdots & I \end{bmatrix}, \]
where I is the identity n × n matrix and 0 is the all-0 n × n matrix. Then:
\[ M' = \begin{bmatrix} I & 0 & 0 & \cdots & 0 \\ I & I & 0 & \cdots & 0 \\ I & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ I & 0 & 0 & \cdots & I \end{bmatrix}. \]
We have:
\[ h(x^{(1)}, x^{(2)}, \dots, x^{(s)}) = \prod_{j=1}^{s} f_j \Big( \sum_{i=1}^{s} x^{(i)} \times M^{(i,j)} \Big). \]
For any $(u^{(1)}, u^{(2)}, \dots, u^{(s)}) \in (\mathbb{F}_2^n)^s$ satisfying $1 \le w_H(u^{(1)}, u^{(2)}, \dots, u^{(s)}) \le t$, we have
either
\[ 1 \le w_H \Big( \sum_{i=1}^{s} u^{(i)} \times M'^{(i,1)} \Big) = w_H \Big( \sum_{i=1}^{s} u^{(i)} \Big) \le t, \]
or $\sum_{i=1}^{s} u^{(i)} = 0_n$, in which case there exists 2 ≤ j ≤ s such that $u^{(j)} \neq 0_n$ and
\[ w_H(u^{(j)}) = w_H \Big( \sum_{i=1,\, i \neq j}^{s} u^{(i)} \Big) = \frac{w_H(u^{(j)}) + w_H \big( \sum_{i=1,\, i \neq j}^{s} u^{(i)} \big)}{2} \le \frac{\sum_{i=1}^{s} w_H(u^{(i)})}{2} \le \frac{t}{2}. \]
Proposition 125 completes the proof.

Corollary 24 [258] Let n, s, t be positive integers satisfying t ≤ n and s ≥ 2. We have the
following:
\[ \omega_{ns,t} \le \big( \omega_{n, \lfloor t/2 \rfloor} \big)^{s-1}\, \omega_{n,t}. \]
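A small instance of the construction of Corollary 23 can be verified exhaustively. The sketch below takes n = 3, s = 2 and t = 2, with $f_1$ the indicator of the even-weight vectors (in $CI_{3,2}$) and $f_2$ the indicator of {000, 111} (in $CI_{3,1}$); these inputs, the helper names, and the packing of $(x^{(1)}, x^{(2)})$ into one integer are choices made here for illustration.

```python
def fourier(support, u):
    """Fourier-Hadamard transform at u of the indicator function of `support`."""
    return sum(1 - 2 * (bin(u & x).count("1") & 1) for x in support)

def ci_order(support, n):
    """Largest t such that the transform vanishes at all u with 1 <= w_H(u) <= t."""
    bad = [bin(u).count("1") for u in range(1, 1 << n) if fourier(support, u) != 0]
    return min(bad) - 1 if bad else n

n = 3
S1 = [x for x in range(1 << n) if bin(x).count("1") % 2 == 0]  # f1 in CI_{3,2}, weight 4
S2 = [0b000, 0b111]                                            # f2 in CI_{3,1}, weight 2

# h(x1, x2) = f1(x1) f2(x1 + x2): (x1, x2) is in the support of h
# iff x1 is in S1 and x1 + x2 is in S2. x1 sits in the low n bits, x2 in the high n bits.
Sh = [x1 | ((x1 ^ z) << n) for x1 in S1 for z in S2]
```

Here `len(Sh)` is 8 and `ci_order(Sh, 2 * n)` is 2, so h is a 6-variable second-order correlation immune function of weight 8. This is consistent with the bound $\omega_{6,2} \le \omega_{3,1}\, \omega_{3,2} = 2 \cdot 4$ of Corollary 24 and with the entry 8 of Table 7.2.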

A construction of low-weight t-th order correlation immune Boolean functions through Kronecker sum

The Kronecker sum of two vectors
\[ (x^{(1)}, x^{(2)}) = \big( (x^{(1)}_1, \dots, x^{(1)}_{n_2}), (x^{(2)}_1, \dots, x^{(2)}_{n_1}) \big) \in \mathbb{F}_2^{n_2} \times \mathbb{F}_2^{n_1} \;\mapsto\; x^{(1)} \boxplus x^{(2)} = \big( x^{(1)}_{i_2} \oplus x^{(2)}_{i_1} \big)_{\substack{1 \le i_1 \le n_1 \\ 1 \le i_2 \le n_2}} \in \mathbb{F}_2^{n_1 n_2} \]
generalizes to s variables as follows: let $n_1, \dots, n_s$ be positive integers and $\mathcal{I} = \{1, \dots, n_1\}
\times \dots \times \{1, \dots, n_s\}$; then for every $I = (i_1, \dots, i_s) \in \mathcal{I}$ and every 1 ≤ r ≤ s, we denote by
$I^{(r)}$ the vector $(i_1, \dots, i_{r-1}, i_{r+1}, \dots, i_s)$. Writing
\[ x^{(r)} = \big( x^{(r)}_{i_1, \dots, i_{r-1}, i_{r+1}, \dots, i_s} \big)_{i_1 \in \{1, \dots, n_1\},\, \dots,\, i_s \in \{1, \dots, n_s\}} \in \mathbb{F}_2^{n_1 \cdots n_{r-1} n_{r+1} \cdots n_s}, \]
the s-th order Kronecker sum is defined as
\[ (x^{(1)}, x^{(2)}, \dots, x^{(s)}) \;\mapsto\; x^{(1)} \boxplus \dots \boxplus x^{(s)} = \Big( \bigoplus_{r=1}^{s} x^{(r)}_{I^{(r)}} \Big)_{I \in \mathcal{I}} \in \mathbb{F}_2^{n_1 \cdots n_s}. \]

Proposition 126 [258] Let s, t be positive integers such that $2^s > t$. Let $f_1(x^{(1)})$
be an $(n_2 \cdots n_s)$-variable t-th order correlation immune Boolean function and $f_2(x^{(2)})$
an $(n_1 n_3 \cdots n_s)$-variable $2\lfloor \frac{t}{2} \rfloor$-th order correlation immune Boolean function. For every
r = 3, 4, . . . , s, let $f_r(x^{(r)})$ be an $(n_1 \cdots n_{r-1} n_{r+1} \cdots n_s)$-variable Boolean function such
that, for every $w \in \mathbb{F}_2^{n_1 \cdots n_{r-1} n_{r+1} \cdots n_s}$ satisfying $1 \le w_H(w) \le t$ with $w_H(w)$ even, we have $\widehat{f_r}(w) = 0$.
We define the $(n_1 + 1) n_2 n_3 \cdots n_s$-variable function h by its support as follows:
\[ Supp(h) = \Big\{ \big( x^{(1)} \boxplus \dots \boxplus x^{(s)},\, x^{(1)} \big);\ x^{(1)} \in Supp(f_1),\ x^{(2)} \in Supp(f_2),\ \dots,\ x^{(s)} \in Supp(f_s) \Big\}; \]
then h is t-th order correlation immune of Hamming weight $\prod_{r=1}^{s} w_H(f_r)$.
In particular, if $f_1$ is a t-th order correlation immune Boolean function and if each
function $f_r$ is $2\lfloor \frac{t}{2} \rfloor$-th order correlation immune for r = 2, . . . , s, then h is a t-th order
correlation immune Boolean function of Hamming weight $\prod_{r=1}^{s} w_H(f_r)$.

Proof Let us calculate the Fourier–Hadamard transform of h. Its input is any pair (u, v),
where u is a binary vector of the same length as $x^{(1)} \boxplus \dots \boxplus x^{(s)}$ and v is a binary vector
of the same length as $x^{(1)}$, that is, $u = (u_I)_{I \in \mathcal{I}} \in \mathbb{F}_2^{n_1 n_2 \cdots n_s}$ and $v = (v_J)_{J \in \mathcal{J}} \in \mathbb{F}_2^{n_2 \cdots n_s}$,
$\mathcal{J} = \{1, \dots, n_2\} \times \dots \times \{1, \dots, n_s\}$. We have
\[ \widehat{h}(u, v) = \sum_{x^{(1)} \in Supp(f_1),\, \dots,\, x^{(s)} \in Supp(f_s)} (-1)^{\bigoplus_{I \in \mathcal{I}} u_I \big( \bigoplus_{r=1}^{s} x^{(r)}_{I^{(r)}} \big) \,\oplus\, v \cdot x^{(1)}}. \]
Let us write $u_{0, i_2, \dots, i_s} = v_{i_2, \dots, i_s}$;
$\vec{u}_1 = \big( \bigoplus_{i_1=0}^{n_1} u_{i_1, 1, \dots, 1},\ \dots,\ \bigoplus_{i_1=0}^{n_1} u_{i_1, n_2, \dots, n_s} \big)$ and
$\vec{u}_r = \big( \bigoplus_{i_r=1}^{n_r} u_{1, \dots, 1, i_r, 1, \dots, 1},\ \dots,\ \bigoplus_{i_r=1}^{n_r} u_{n_1, \dots, n_{r-1}, i_r, n_{r+1}, \dots, n_s} \big)$, for every r = 2, . . . , s. We have then
\[
\begin{aligned}
\widehat{h}(u, v) &= \Big( \sum_{x^{(1)} \in Supp(f_1)} (-1)^{\bigoplus_{I \in \mathcal{I}} u_I x^{(1)}_{I^{(1)}} \,\oplus\, v \cdot x^{(1)}} \Big)
 \times \prod_{r=2}^{s} \Big( \sum_{x^{(r)} \in Supp(f_r)} (-1)^{\bigoplus_{I \in \mathcal{I}} u_I x^{(r)}_{I^{(r)}}} \Big) \\
&= \widehat{f_1}(\vec{u}_1) \times \prod_{r=2}^{s} \widehat{f_r}(\vec{u}_r).
\end{aligned}
\]
For $1 \le w_H(u, v) \le t$, we have $w_H(\vec{u}_1) \le w_H(u, v) \le t$.
– If $w_H(\vec{u}_1) \neq 0$, then since $f_1$ is t-th order correlation immune, we have $\widehat{h}(u, v) = 0$.
– If $w_H(\vec{u}_1) = 0$, then $w_H(u, v)$ is even since $w_H(u, v) \pmod 2 = w_H(\vec{u}_1) \pmod 2$, and
then $w_H(u, v) \le 2\lfloor \frac{t}{2} \rfloor$. We have then that:
– If $w_H(\vec{u}_2) \neq 0$, then we have $1 \le w_H(\vec{u}_2) \le w_H(u) \le w_H(u, v) \le 2\lfloor \frac{t}{2} \rfloor$, and since $f_2$
is $2\lfloor \frac{t}{2} \rfloor$-th order correlation immune, we deduce $\widehat{h}(u, v) = 0$.
– If $w_H(\vec{u}_2) = \dots = w_H(\vec{u}_{j-1}) = 0$ and $w_H(\vec{u}_j) \neq 0$, where 3 ≤ j ≤ s, then $w_H(u)$ and
$w_H(\vec{u}_j)$ are even since $w_H(u) \pmod 2 = w_H(\vec{u}_2) \pmod 2 = w_H(\vec{u}_j) \pmod 2$ and
we have $2 \le w_H(\vec{u}_j) \le 2\lfloor \frac{t}{2} \rfloor$, and the hypothesis on $f_j$ implies $\widehat{h}(u, v) = 0$.
– If $w_H(\vec{u}_2) = \dots = w_H(\vec{u}_s) = 0$, then since $w_H(u, v) \neq 0$ and $w_H(\vec{u}_1) = 0$, there exist
$0 \le i_1 < i_1' \le n_1$, $1 \le i_2 \le n_2$, . . . , $1 \le i_s \le n_s$ such that $u_{i_1, i_2, \dots, i_s} = u_{i_1', i_2, \dots, i_s} = 1$.
Since $w_H(\vec{u}_s) = 0$, there exist in fact two values of $i_s$ such that $u_{i_1, i_2, \dots, i_s} = 1$. Since
$w_H(\vec{u}_{s-1}) = 0$, there exist then four values of $(i_{s-1}, i_s)$ such that $u_{i_1, i_2, \dots, i_s} = 1$. By
induction, we have then $w_H(u, v) \ge 2^s$. But we have $1 \le w_H(u, v) \le t < 2^s$ by
hypothesis, a contradiction. Hence, $w_H(\vec{u}_1) = w_H(\vec{u}_2) = \dots = w_H(\vec{u}_s) = 0$ cannot
happen. This completes the proof.
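The case s = 2 of this construction is small enough to check by exhaustive computation. The sketch below takes $n_1 = n_2 = 3$, t = 2, and both $f_1$ and $f_2$ equal to the indicator of the even-weight vectors of length 3 (which is second-order correlation immune). The ordering of the coordinates of the Kronecker sum, the tuple representation, and the helper names are assumptions made for this illustration.

```python
from itertools import combinations, product

def kronecker_sum(x1, x2):
    """Entries x1[i2] XOR x2[i1], listed with i1 as the outer index.
    x1 has length n2 and x2 has length n1 (tuples of bits)."""
    return tuple(a ^ b for b in x2 for a in x1)

def is_ci(support, length, t):
    """True if the Fourier-Hadamard transform of the indicator of `support`
    vanishes at every u with 1 <= w_H(u) <= t; u is given by its set of positions."""
    for w in range(1, t + 1):
        for positions in combinations(range(length), w):
            total = sum(1 - 2 * (sum(z[p] for p in positions) & 1) for z in support)
            if total != 0:
                return False
    return True

# f1 = f2 = indicator of even-weight vectors of length 3 (in CI_{3,2}, weight 4).
even3 = [x for x in product((0, 1), repeat=3) if sum(x) % 2 == 0]

# Supp(h) = {(x1 [Kronecker sum] x2, x1)}: vectors of length n1*n2 + n2 = 12.
supp_h = {kronecker_sum(x1, x2) + x1 for x1 in even3 for x2 in even3}
```

With these inputs, `len(supp_h)` is 16, the product of the two weights, and `is_ci(supp_h, 12, 2)` holds, so h is a 12-variable second-order correlation immune function of weight 16, in line with the statement of the proposition for s = 2 and t = 2.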

A corollary can be found in [258], as well as variants of the construction of Proposition
126, one of which needs weaker hypotheses but does not include the term in $x^{(1)}$ in the
support of h (and provides then functions in fewer variables) and the other deals with third-
order correlation immune functions.

7.1.10 On the number of correlation immune and resilient functions


It is important to ensure that the selected criteria for the Boolean functions, supposed to
be used in some cryptosystems, do not restrict the choice of the functions too severely.
Hence, the set of functions should be enumerated. But this enumeration is unknown for most
criteria, and the cases of correlation immune and resilient functions make no exception. We
recall below what is known. More than for bent functions, the class of resilient functions
produced by Maiorana–McFarland’s construction5 is by far the widest class, compared
to the classes obtained from the other usual constructions, and the number of provably
resilient Maiorana–McFarland functions seems negligible with respect to the total number
of functions with the same properties. For balanced (i.e., 0-resilient) functions, this can be
checked: for every positive r, the number of balanced Maiorana–McFarland functions (7.4)
obtained by choosing φ such that $\phi(y) \neq 0_r$, for every y, equals $(2^r - 1)^{2^s}\, 2^{2^s}$, and is
smaller than or equal to $2^{2^{n-1}}$ (since r ≥ 1, s = n − r). It is negligible with respect to the
number $\binom{2^n}{2^{n-1}} \approx \frac{2^{\,2^n + \frac{1}{2}}}{\sqrt{\pi\, 2^n}}$ of all balanced functions on $\mathbb{F}_2^n$. The number of t-resilient Maiorana–
McFarland functions obtained by choosing φ such that $w_H(\phi(y)) > t$ for every y equals
$\big( 2 \sum_{i=t+1}^{r} \binom{r}{i} \big)^{2^{n-r}}$, and is probably also very small compared to the number of all t-resilient
functions. But this number is unknown.
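The count of balanced Maiorana–McFarland functions can be confirmed by enumeration for tiny parameters. The sketch below assumes that the functions of (7.4) have the usual form $f(x, y) = x \cdot \phi(y) \oplus g(y)$ with $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$; since (7.4) is not reproduced in this section, that form, like the helper names, is an assumption.

```python
from itertools import product
from math import comb

def mm_balanced_count(r, s):
    """Closed-form count (2^r - 1)^(2^s) * 2^(2^s)."""
    return (2 ** r - 1) ** (2 ** s) * 2 ** (2 ** s)

def enumerate_mm(r, s):
    """All truth tables of f(x, y) = x . phi(y) XOR g(y) with phi(y) != 0 for every y."""
    tables = set()
    for phi in product(range(1, 1 << r), repeat=1 << s):
        for g in product((0, 1), repeat=1 << s):
            tables.add(tuple((bin(x & phi[y]).count("1") & 1) ^ g[y]
                             for y in range(1 << s) for x in range(1 << r)))
    return tables

tables = enumerate_mm(2, 1)      # n = r + s = 3 variables
all_balanced = comb(2 ** 3, 2 ** 2)  # number of balanced 3-variable functions
```

For r = 2 and s = 1 the enumeration yields 36 distinct functions, all of Hamming weight 4, matching the closed form, against 70 balanced functions in 3 variables. The gap widens quickly with n, since the closed form grows like $2^{(r+1) 2^{n-r}}$ at most while the binomial coefficient is close to $2^{2^n}$.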

5 We have seen that this construction hardly allows building unbalanced correlation immune functions.

The exact number of t-resilient functions is known for t ≥ n−3 (see [181], where (n−3)-
resilient functions are characterized) and (n − 4)-resilient functions have been characterized
[256, 125].
As for bent function, upper bounds on the numbers of correlation immune and resilient
functions come directly from the Siegenthaler bound on the algebraic degree: the number
of t-th order correlation
n−m n
immune (resp. t-resilient) n-variable functions is bounded above
n−m−1 n
by 2 i=0 i (resp. 2 i=0 ( i ) ). These bounds are the so-called naive bounds. In 1990,
( )
Yang and Guo published an upper bound on the number of first-order correlation immune
functions. At the same time, Denisov obtained a rather strong result (see below), but his
result being published in Russian, it was not known internationally. His paper was translated
into English two years later [433] but was not widely known either. This explains why
several papers appeared, some of which with weaker results, that we describe first. Park
et al. [926] improved upon Yang–Guo’s bound. Schneider [1024] proved that the number
of t-resilient n-variable Boolean functions is less than

$$\prod_{i=1}^{n-t} \binom{2^i}{2^{i-1}}^{\binom{n-i-1}{t-1}},$$

but this result was known; see [520]. A general upper bound on the number of Boolean
functions whose distances to affine functions are all divisible by $2^t$ has been obtained
in [301]. It implies an upper bound on the number of t-resilient functions, which improves
upon previous bounds for about half the values of (n, t) (it is better for t large). This bound
divides the naive bound by approximately $2^{\sum_{i=0}^{n-t-1} \binom{t-1}{i} - 1}$ if $t \ge \frac{n}{2}$ and by approximately
$2^{2^{2t+1} - 1}$ if $t < \frac{n}{2}$.
An upper bound on t-resilient functions ($t \ge \frac{n}{2} - 1$) partially improving upon this latter
bound thanks to a refinement of its method was obtained for $\frac{n}{2} - 1 \le t < n - 2$ in [285]:
the number of n-variable t-resilient functions is lower than

$$2^{\sum_{i=0}^{n-t-2} \binom{n}{i} + \frac{\binom{n}{n-t-1}}{2^{\binom{t+1}{n-t-1}+1}}} \prod_{i=1}^{n-t} \binom{2^i}{2^{i-1}}^{\binom{n-i-1}{t-1}}.$$

The expressions of these bounds seem difficult to compare mathematically. Tables have been
computed in [285].
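Small cases of such tables are easy to recompute. The sketch below evaluates, for t ≥ 1, the naive bound $2^{\sum_{i=0}^{n-t-1} \binom{n}{i}}$ and the product bound attributed above to Schneider, $\prod_{i=1}^{n-t} \binom{2^i}{2^{i-1}}^{\binom{n-i-1}{t-1}}$. Both formulas are read off the damaged typesetting of this section, so the code should be taken as an illustration under that assumption; the function names are hypothetical.

```python
import math

def naive_exponent(n: int, t: int) -> int:
    """Exponent sum_{i=0}^{n-t-1} C(n, i) of the naive bound on t-resilient functions."""
    return sum(math.comb(n, i) for i in range(n - t))

def schneider_bound(n: int, t: int) -> int:
    """Product over i = 1..n-t of C(2^i, 2^(i-1)) ** C(n-i-1, t-1), for t >= 1."""
    bound = 1
    for i in range(1, n - t + 1):
        bound *= math.comb(2 ** i, 2 ** (i - 1)) ** math.comb(n - i - 1, t - 1)
    return bound

if __name__ == "__main__":
    for n, t in [(4, 1), (5, 1), (5, 2), (6, 2)]:
        print(n, t,
              "naive: 2^%d" % naive_exponent(n, t),
              "product bound: 2^%.2f" % math.log2(schneider_bound(n, t)))
```

For (n, t) = (4, 1) the naive bound is $2^{11} = 2048$ while the product bound is 2 · 6 · 70 = 840.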
The problem of counting resilient functions is related to counting integer solutions of a
system of linear equations; see [850].
The main result given by Denisov in [433] is an asymptotic formula for the number of
t-th order correlation immune functions, where t is negligible compared to n. This formula
was later believed incorrect by the author, and a correction was given by him in [434], but it
has been shown later in [182] that the correct expression was the original one, at least under
the condition $1 \le t \le \left(\frac{\ln 2}{6} - \epsilon\right) \frac{n}{\ln n}$, where $\epsilon > 0$: this number is then equivalent to

$$2^{2^n - t + \sum_{j=0}^{t} j \binom{n}{j}} \left(2^{n-1}\pi\right)^{-\frac{\sum_{j=1}^{t} \binom{n}{j}}{2}},$$

and the number of t-resilient functions is equivalent to

$$2^{2^n + \sum_{j=0}^{t} j \binom{n}{j}} \left(2^{n-1}\pi\right)^{-\frac{\sum_{j=0}^{t} \binom{n}{j}}{2}}.$$

For large resiliency orders, Tarannikov and Kirienko showed in [1082] that, for every
positive integer m, there exists a number p(m) such that for n > p(m), any (n − m)-resilient
function $f(x_1, \dots, x_n)$ is equivalent, up to permutation of its input coordinates,
to a function of the form $g(x_1, \dots, x_{p(m)}) \oplus x_{p(m)+1} \oplus \cdots \oplus x_n$. It is then a simple matter
to deduce that the number of (n − m)-resilient functions equals $\sum_{i=0}^{p(m)} A(m, i) \binom{n}{i}$, where
A(m, i) is the number of i-variable (i − m)-resilient functions that depend on all inputs
$x_1, x_2, \dots, x_i$ nonlinearly. Hence, it is equivalent to $\frac{A(m, p(m))}{p(m)!}\, n^{p(m)}$ for m constant when n tends
to infinity, and it is at most $A_m n^{p(m)}$, where $A_m$ depends on m only. It is proved in [1083]
that $3 \cdot 2^{m-2} - 2 \le p(m) \le (m - 1) 2^{m-2}$ and in [1082] that p(4) = 10; hence the number
of (n − 4)-resilient functions equals $(1/2) n^{10} + O(n^9)$. It is also shown in [1082] that for
n ≥ 10, there does not exist an unbalanced nonconstant (n − 4)-th order correlation immune
function and that for n ≥ 11, there does not exist an (n − 4)-resilient function depending
nonlinearly on all its variables.
The classification of first-order correlation immune functions and of 1-resilient functions
has been studied in [758], with an exact enumeration for n = 7 and a precise estimation for
n = 8.

7.2 Resilient vectorial Boolean functions


For the convenience of the reader, we recall what we have seen in Section 3.3.1, page 129:
an (n, m)-function F (x) is t-th order correlation immune if its output distribution does not
change when at most t coordinates xi of x are kept constant.
S-boxes being better balanced, F is called t-resilient if it is balanced and t-th order
correlation immune. If such an (n, m, t)-function F exists, then we have the bounds
$t \le \left\lfloor \frac{2^{m-1} n}{2^m - 1} \right\rfloor$, $t \le 2 \left\lfloor \frac{2^{m-2}(n+1)}{2^m - 1} \right\rfloor - 1$, m ≤ n − t in general,
$m \le n - \log_2 \left[ \sum_{i=0}^{t/2} \binom{n}{i} \right]$ if t is even
and $m \le n - \log_2 \left[ \binom{n-1}{(t-1)/2} + \sum_{i=0}^{(t-1)/2} \binom{n}{i} \right]$ if t is odd, and more complex bounds based on
linear programming [78, 520].
Composing a t-resilient (n, m)-function by a permutation of $\mathbb{F}_2^m$ does not change its
resiliency order. Function F is t-resilient if and only if one of the following conditions
is satisfied (see Proposition 41, page 130):
(i) $\sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x} = 0$, for every $u \in \mathbb{F}_2^n$ such that $w_H(u) \le t$ and every $v \in \mathbb{F}_2^m \setminus \{0_m\}$.
(ii) $\sum_{x \in \mathbb{F}_2^n} (-1)^{g(F(x)) \oplus u \cdot x} = 0$, for every $u \in \mathbb{F}_2^n$ such that $w_H(u) \le t$ and every balanced
m-variable Boolean function g.
Finally, F is t-resilient if and only if:
(iii) for every vector $b \in \mathbb{F}_2^m$, the Boolean function $\varphi_b = \delta_{\{b\}} \circ F$ is t-th order correlation
immune and has Hamming weight $2^{n-m}$.
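Condition (i) lends itself to a direct exhaustive test for small parameters. The following sketch checks it for a toy (3, 2)-function chosen here purely for illustration; the helper names are not from the text, and the approach is only practical for small n.

```python
from itertools import product

def dot(a, b):
    """Inner product over F_2 of two bit tuples."""
    return sum(p & q for p, q in zip(a, b)) & 1

def is_t_resilient(F, n, m, t):
    """Condition (i): the Walsh sum vanishes for all u of weight <= t and all v != 0."""
    points = list(product((0, 1), repeat=n))
    images = [F(x) for x in points]
    for v in product((0, 1), repeat=m):
        if not any(v):
            continue
        for u in points:
            if sum(u) > t:
                continue
            total = sum((-1) ** (dot(v, y) ^ dot(u, x))
                        for x, y in zip(points, images))
            if total != 0:
                return False
    return True

def toy_function(x):
    """Illustrative (3, 2)-function: (x1 + x2, x2 + x3) over F_2."""
    return (x[0] ^ x[1], x[1] ^ x[2])

if __name__ == "__main__":
    print(is_t_resilient(toy_function, 3, 2, 1))
    print(is_t_resilient(toy_function, 3, 2, 2))
```

Every nonzero combination of the two coordinates of the toy function is a sum of exactly two input bits, so the function is 1-resilient but not 2-resilient.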

7.2.1 Constructions of resilient vectorial Boolean functions


Linear or affine resilient functions
The construction of t-resilient linear functions is easy: Bennett et al. [58] and Chor et al.
[370] give the connection between linear resilient functions and linear codes (correlation
immune functions being related to orthogonal arrays; see [181, 180], this relationship is in
fact due to Delsarte [422]). There exists a linear (n, m, t)-function if and only if there exists
a binary linear [n, m, t + 1] code.

Proposition 127 [58] Let G be a generating matrix for an [n, m, d] binary linear code.
We define $L : \mathbb{F}_2^n \to \mathbb{F}_2^m$ by the rule $L(x) = x \times G^T$, where $G^T$ is the transpose of G. Then
L is an (n, m, d − 1)-function.

This is a direct consequence of Corollary 22, page 292, and of Proposition 41. It can also
be seen directly: for every nonzero $v \in \mathbb{F}_2^m$, the function $v \cdot L(x) = v \cdot (x \times G^T)$ has the form
u · x, where u = v × G is a nonzero codeword. Hence, u has Hamming weight at least d
and the linear function v · L is (d − 1)-resilient, since it has at least d independent terms of
degree 1 in its ANF.
The converse of Proposition 127 is clearly also true.
Proposition 127 is still straightforwardly true if L is affine instead of linear, that is,
$L(x) = x \times G^T + a$, where a is a vector of $\mathbb{F}_2^m$.
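Proposition 127 can be illustrated with the [7, 4, 3] Hamming code. The sketch below uses one standard systematic generator matrix (an assumption of this example, as are all function names) and tests resiliency through the definition itself: after fixing any t input coordinates, the output must remain uniformly distributed.

```python
from collections import Counter
from itertools import combinations, product

# Systematic generator matrix of a [7, 4, 3] Hamming code (assumed for this example).
G = [(1, 0, 0, 0, 0, 1, 1),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1, 0),
     (0, 0, 0, 1, 1, 1, 1)]

def dot(a, b):
    return sum(p & q for p, q in zip(a, b)) & 1

def L(x):
    """L(x) = x * G^T: one output bit per row of G."""
    return tuple(dot(x, row) for row in G)

def min_distance():
    """Minimum weight of a nonzero codeword v * G."""
    n = len(G[0])
    best = n
    for v in product((0, 1), repeat=len(G)):
        if any(v):
            word = [sum(v[j] & G[j][i] for j in range(len(G))) & 1 for i in range(n)]
            best = min(best, sum(word))
    return best

def is_resilient(t):
    """Output stays uniform whenever t input coordinates are fixed."""
    n, m = len(G[0]), len(G)
    for fixed in combinations(range(n), t):
        free = [i for i in range(n) if i not in fixed]
        for vals in product((0, 1), repeat=t):
            counts = Counter()
            for rest in product((0, 1), repeat=n - t):
                x = [0] * n
                for i, b in zip(fixed, vals):
                    x[i] = b
                for i, b in zip(free, rest):
                    x[i] = b
                counts[L(x)] += 1
            if len(counts) != 2 ** m or len(set(counts.values())) != 1:
                return False
    return True

if __name__ == "__main__":
    print(min_distance(), is_resilient(2), is_resilient(3))
```

With d = 3 the function is 2-resilient and not 3-resilient: fixing the three coordinates in the support of a weight-3 codeword fixes one linear combination of the output bits.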
Stinson [1049] considered the equivalence between resilient functions and what he called
large sets of orthogonal arrays. According to Proposition 41, an (n, m)-function is t-resilient
if and only if there exists a set of $2^m$ disjoint binary arrays of dimensions $2^{n-m} \times n$, such that,
in any t columns of each array, each of the $2^t$ elements of $\mathbb{F}_2^t$ occurs in exactly $2^{n-m-t}$ rows
and no two rows are identical. The construction of (n, m, t)-functions by Proposition 127
can be generalized by considering nonlinear codes C of length n (that is, subsets of $\mathbb{F}_2^n$) and of
size $2^{n-m}$, whose dual distance $d^\perp$ (see Definition 4, page 16) is at least t + 1 (see [1050]). In
the case of Proposition 127, C is the dual of the code of generating matrix G. The nonlinear
code needs also to be systematic (that is, there must exist a subset I of {1, . . . , n} called an
information set of C, necessarily of size n − m since the code has size $2^{n-m}$, such that every
possible tuple occurs in exactly one codeword within the specified coordinates $x_i$, i ∈ I; we
have seen this notion at page 161) to allow the construction of an $(n, m, d^\perp - 1)$-function:
the image of a vector $x \in \mathbb{F}_2^n$ is the unique vector y of $\mathbb{F}_2^n$ such that $y_i = 0$ for every i ∈ I
and such that x ∈ y + C (in other words, to calculate y, we first determine the unique
codeword c of C that matches x on the information set, and we have y = x + c). It is
deduced in [1050] that, for every r ≥ 3, a $(2^{r+1}, 2^{r+1} - 2r - 2, 5)$-resilient function exists
(the construction is based on the Kerdock code), and that no affine resilient function with
such good parameters exists.

Maiorana–McFarland resilient functions


The idea of designing resilient vectorial functions by generalizing the Maiorana–McFarland
construction is natural. One can find a first reference of such construction
in a paper by Nyberg [906], but for generating perfect nonlinear functions. This technique
has been used by Kurosawa et al. [723], Johansson and Pasalic [648], Pasalic and Maitra
[933], and Gupta and Sarkar [580] to produce functions having high resiliency and high
nonlinearity.6

Definition 70 The class of Maiorana–McFarland (n, m)-functions is the set of those
functions F that can be written in the form

$$F(x, y) = x \times \begin{pmatrix} \varphi_{11}(y) & \cdots & \varphi_{1m}(y) \\ \vdots & \ddots & \vdots \\ \varphi_{r1}(y) & \cdots & \varphi_{rm}(y) \end{pmatrix} + H(y), \quad (x, y) \in \mathbb{F}_2^r \times \mathbb{F}_2^s, \qquad (7.17)$$

where r and s are two integers satisfying r + s = n, H is any (s, m)-function and, for every
i ≤ r and every j ≤ m, $\varphi_{ij}$ is a Boolean function on $\mathbb{F}_2^s$.

The concatenation of t-resilient functions being still t-resilient, if the transpose matrix of
the matrix involved in Equation (7.17) is the generator matrix of a linear [r, m, d]-code for
every vector y ranging over $\mathbb{F}_2^s$, then the (n, m)-function F is (d − 1)-resilient.
After denoting, for every i ≤ m, by $\phi_i$ the (s, r)-function that admits for coordinate
functions the Boolean functions $\varphi_{1i}, \dots, \varphi_{ri}$ (in the ith column of the matrix above), we can
rewrite Relation (7.17) as

$$F(x, y) = \left(x \cdot \phi_1(y) \oplus h_1(y), \dots, x \cdot \phi_m(y) \oplus h_m(y)\right). \qquad (7.18)$$

Resiliency Equivalently to what is written above in terms of codes, we have:

Proposition 128 Let n, m, r, and s be integers such that n = r + s. Let F be a Maiorana–McFarland
(n, m)-function defined as in (7.18) and such that, for every $y \in \mathbb{F}_2^s$, the family
$(\phi_i(y))_{i \le m}$ is a basis of an m-dimensional subspace of $\mathbb{F}_2^r$ having t + 1 for minimum
Hamming weight. Then F is at least t-resilient.

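Nonlinearity According to Proposition 53, page 166, the nonlinearity nl(F) of any
Maiorana–McFarland (n, m)-function defined as in Relation (7.18) satisfies

$$nl(F) = 2^{n-1} - 2^{r-1} \max_{(u, u') \in \mathbb{F}_2^r \times \mathbb{F}_2^s,\ v \in \mathbb{F}_2^m \setminus \{0_m\}} \left| \sum_{y \in E_{u,v}} (-1)^{v \cdot H(y) \oplus u' \cdot y} \right|, \qquad (7.19)$$

where $E_{u,v}$ denotes the set $\{y \in \mathbb{F}_2^s;\ \sum_{i=1}^{m} v_i \phi_i(y) = u\}$.
The bounds given by Relations (7.6) and (7.7), page 294, imply the following:

$$2^{n-1} - 2^{r-1} \max_{u \in \mathbb{F}_2^r,\ v \in \mathbb{F}_2^m \setminus \{0_m\}} |E_{u,v}| \ \le\ nl(F) \ \le\ 2^{n-1} - 2^{r-1} \left\lceil \sqrt{\max_{u \in \mathbb{F}_2^r,\ v \in \mathbb{F}_2^m \setminus \{0_m\}} |E_{u,v}|} \right\rceil .$$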

If, for every element y, the vector space spanned by the vectors $\phi_1(y), \dots, \phi_m(y)$ admits
m for dimension and has a minimum Hamming weight strictly larger than k (so that F is
t-resilient with t ≥ k), then we have

$$nl(F) \le 2^{n-1} - 2^{r-1} \left\lceil \frac{2^{s/2}}{\sqrt{\sum_{i=k+1}^{r} \binom{r}{i}}} \right\rceil . \qquad (7.20)$$

6 But, as seen in Subsection 3.3.2, this notion of nonlinearity is not relevant to S-boxes for stream ciphers. The
generalized nonlinearity, which is the correct notion, needs to be further studied for resilient functions and for
MM functions.

The nonlinearity can be exactly calculated in two situations (at least): if, for every vector
$v \in \mathbb{F}_2^m \setminus \{0_m\}$, the (s, r)-function $y \mapsto \sum_{i \le m} v_i \phi_i(y)$ is injective (resp. takes exactly two
times each value of its range), then F admits $2^{n-1} - 2^{r-1}$ (resp. $2^{n-1} - 2^r$) for nonlinearity.
Johansson and Pasalic described in [648] a way to specify the vectorial functions
$\phi_1, \dots, \phi_m$ so that this kind of condition is satisfied. Their result can be generalized in
the following form:

Lemma 11 Let C be a binary linear [r, m, t + 1] code. Let $\beta_1, \dots, \beta_m$ be a basis of the
$\mathbb{F}_2$-vector space $\mathbb{F}_{2^m}$, and $L_0$ a linear isomorphism between $\mathbb{F}_{2^m}$ and C. Then the functions
$L_i(z) = L_0(\beta_i z)$, i = 1, . . . , m, are such that, for every $v \in \mathbb{F}_2^m \setminus \{0_m\}$, the function
$z \in \mathbb{F}_{2^m} \mapsto \sum_{i=1}^{m} v_i L_i(z)$ is a bijection from $\mathbb{F}_{2^m}$ into C.

Proof For every vector v in $\mathbb{F}_2^m$ and every element z of $\mathbb{F}_{2^m}$, we have
$\sum_{i=1}^{m} v_i L_i(z) = L_0\left(\left(\sum_{i=1}^{m} v_i \beta_i\right) z\right)$. If the vector v is nonzero, then the element $\sum_{i=1}^{m} v_i \beta_i$ is nonzero. Hence,
the function $z \in \mathbb{F}_{2^m} \mapsto \sum_{i=1}^{m} v_i L_i(z)$ is a bijection.

Since the functions $L_1, L_2, \dots, L_m$ vanish at zero input, they do not satisfy the hypothesis
of Proposition 128. A solution to derive a family of vectorial functions also satisfying the
hypothesis of Proposition 128 is then to right-compose the functions $L_i$ with a same injective
(or two-to-one) function π from $\mathbb{F}_2^s$ into $\mathbb{F}_{2^m}^*$. Then, for every nonzero vector $v \in \mathbb{F}_2^m \setminus \{0_m\}$,
the function $y \in \mathbb{F}_2^s \mapsto \sum_{i=1}^{m} v_i L_i[\pi(y)]$ is injective (or two-to-one) from $\mathbb{F}_2^s$ into $C^*$. This
gives the following construction:7
Given integers m < r, let C be an [r, m, t + 1]-code such that t is as large as possible
(Grassl gives in [570] a precise overview of the best-known parameters of codes). Then
define m linear functions $L_1, \dots, L_m$ from $\mathbb{F}_{2^m}$ into C as in Lemma 11. Choose an integer
s strictly lower than m (resp. lower than or equal to m) and define an injective (resp. two-to-one)
function π from $\mathbb{F}_2^s$ into $\mathbb{F}_{2^m}^*$. Choose any (s, m)-function $H = (h_1, \dots, h_m)$ and
denote r + s by n. Then the (n, m)-function F whose coordinate functions are defined by
$f_i(x, y) = x \cdot [L_i \circ \pi](y) \oplus h_i(y)$ is t-resilient and admits $2^{n-1} - 2^{r-1}$ (resp. $2^{n-1} - 2^r$)
for nonlinearity.
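A very small instance of this construction can be worked out by machine. The sketch below takes m = 2, r = 3, s = 1, the [3, 2, 2] even-weight code for C (so t = 1), the basis (1, α) of $\mathbb{F}_4$ with α² = α + 1, and an injective π; all of these concrete choices, the function H, and the helper names are assumptions made for the example. It then computes the nonlinearity and the resiliency order of the resulting (4, 2)-function from its Walsh values.

```python
from itertools import product

M, R, S = 2, 3, 1                     # output bits m, code length r, extra inputs s
N = R + S
CODE_BASIS = [(1, 1, 0), (0, 1, 1)]   # basis of the [3, 2, 2] even-weight code C
BETA = [1, 2]                         # basis (1, alpha) of F_4, elements as bit masks
PI = {(0,): 1, (1,): 2}               # injective map from F_2^s into F_4^*

def gf4_mul(a, b):
    """Multiplication in F_4 = F_2[alpha] / (alpha^2 + alpha + 1)."""
    res = 0
    for _ in range(M):
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= 0b111
    return res

def L0(z):
    """Linear isomorphism F_4 -> C mapping 1 and alpha to the two basis codewords."""
    word = [0] * R
    for bit, c in enumerate(CODE_BASIS):
        if (z >> bit) & 1:
            word = [w ^ ci for w, ci in zip(word, c)]
    return tuple(word)

def dot(a, b):
    return sum(p & q for p, q in zip(a, b)) & 1

def H(y):
    """An arbitrary (s, m)-function, chosen here for illustration."""
    return (0, y[0])

def F(x, y):
    """Coordinates f_i(x, y) = x . L_i(pi(y)) + h_i(y), with L_i(z) = L0(beta_i z)."""
    z, h = PI[y], H(y)
    return tuple(dot(x, L0(gf4_mul(BETA[i], z))) ^ h[i] for i in range(M))

def nonlinearity_and_resiliency():
    max_abs, min_weight = 0, N + 1
    for v in product((0, 1), repeat=M):
        if not any(v):
            continue
        for w in product((0, 1), repeat=N):
            val = sum((-1) ** (dot(v, F(xy[:R], xy[R:])) ^ dot(w, xy))
                      for xy in product((0, 1), repeat=N))
            max_abs = max(max_abs, abs(val))
            if val != 0:
                min_weight = min(min_weight, sum(w))
    return 2 ** (N - 1) - max_abs // 2, min_weight - 1

if __name__ == "__main__":
    print(nonlinearity_and_resiliency())
```

The computed pair is (4, 1), in agreement with the predicted nonlinearity $2^{n-1} - 2^{r-1} = 8 - 4$ and resiliency order t = 1 for the injective case.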
All the primary constructions presented in [648, 723, 907, 933] are based on this principle.
The construction of (n, m, t)-functions defined in [580] is also a particular application of this
construction, as shown in [319].

7 Another construction based on Lemma 11 involves a family of nonintersecting codes (i.e., of codes with trivial
pairwise intersection) having the same length, dimension, and minimum distance; however, this construction is
often worse for large resiliency orders, as shown in [319].

Other constructions
Constructions of highly nonlinear resilient vectorial functions, based on elliptic curves
theory and on the trace of some power functions $x \mapsto x^d$ on finite fields, have been designed
respectively by Cheon [367] and by Khoo and Gong [696]. However, it is still an open
problem to design highly nonlinear functions with high algebraic degrees and high resiliency
orders with Cheon’s method. Besides, the number of functions that can be designed by
these methods is very small. Resilient functions whose nonlinearity exceeds the bent
concatenation bound are designed in [1157, 1159, 1161].
Zhang and Zheng proposed in [1168, 1170] a secondary construction consisting in the
composition F = G ◦ L of a linear resilient (n, m, t)-function L with a highly nonlinear
(m, k)-function G. The resulting function F is obviously t-resilient, admits $2^{n-m}\, nl(G)$ for
nonlinearity, where nl(G) denotes the nonlinearity of G, and its degree is the same as
that of G. Taking for function G the inverse function $x \mapsto x^{-1}$ on the finite field $\mathbb{F}_{2^m}$,
Zhang and Zheng obtained t-resilient functions having a nonlinearity larger than or equal to
$2^{n-1} - 2^{n-m/2}$ and having m − 1 for algebraic degree. But the linear (n, m)-functions involved
in the construction of Zhang and Zheng introduce a weakness: their unrestricted nonlinearity
(see Definition 38, page 131) being null, this kind of function cannot be used as a multioutput
combination function in stream ciphers. Nevertheless, this drawback can be avoided by
concatenating such functions (recall that the concatenation of t-resilient functions gives
t-resilient functions, and a good nonlinearity can be obtained by concatenating functions with
disjoint Walsh supports). We obtain this way a modified Maiorana–McFarland construction,
which could be investigated further.
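The relation $nl(G \circ L) = 2^{n-m}\, nl(G)$ can be observed on a toy case. In the sketch below, L is the linear (4, 3, 1)-function obtained from the [4, 3, 2] even-weight code and G is the balanced 3-variable Boolean function $z_1 z_2 \oplus z_3$ (so k = 1); both choices and all helper names are assumptions made for illustration, and G is taken balanced so that the composition is balanced as well.

```python
from itertools import product

GEN = [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)]   # [4, 3, 2] even-weight code

def dot(a, b):
    return sum(p & q for p, q in zip(a, b)) & 1

def L(x):
    """Linear (4, 3, 1)-function x -> x * GEN^T."""
    return tuple(dot(x, row) for row in GEN)

def G(z):
    """Balanced 3-variable function z1 z2 + z3 of nonlinearity 2."""
    return (z[0] & z[1]) ^ z[2]

def F(x):
    return G(L(x))

def walsh_values(f, n):
    pts = list(product((0, 1), repeat=n))
    return {w: sum((-1) ** (f(x) ^ dot(w, x)) for x in pts) for w in pts}

def nonlinearity(f, n):
    return 2 ** (n - 1) - max(abs(v) for v in walsh_values(f, n).values()) // 2

def resiliency_order(f, n):
    """Largest t such that the Walsh transform vanishes on all weights <= t."""
    nonzero = [sum(w) for w, v in walsh_values(f, n).items() if v != 0]
    return min(nonzero) - 1

if __name__ == "__main__":
    print(nonlinearity(G, 3), nonlinearity(F, 4), resiliency_order(F, 4))
```

Here nl(G) = 2 and nl(F) = 4 = $2^{4-3} \cdot 2$, while F inherits the resiliency order 1 of L.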
More secondary constructions of resilient vectorial functions can be derived from the
secondary constructions of resilient Boolean functions (see, e.g., [180, 225]).
8

Functions satisfying SAC, PC, and EPC, or having good GAC

The research on Boolean functions achieving the propagation criterion PC(l) of order
1 ≤ l < n was active in the 1990s. The class of PC(l) functions is a super-class of that
of bent functions (bent functions achieve PC(n)). For l ≤ n − 3 when n is even and
for l ≤ n − 1 when n is odd, its elements can be balanced and highly nonlinear. The strict
avalanche property (corresponding to l = 1) and propagation properties give more features
to Boolean functions in the framework of stream ciphers (see an example with [60]), even if
they are more related to block ciphers (and to the differential attack). In the framework of stream
ciphers, the invention of algebraic attacks, and the resulting difficulty of designing Boolean
functions satisfying all the mandatory criteria, have more or less refocused research on
Boolean functions meeting mandatory criteria only (including algebraic immunity and fast
algebraic immunity). In the framework of block ciphers, studying individually the coordinate
or component functions of S-boxes is not the most relevant approach. Nevertheless, to be
complete,1 we devote a short chapter to such avalanche criteria.

8.1 PC(l) criterion

For the convenience of the reader, we summarize Definition 24, page 97:

Definition 71 For 1 ≤ l ≤ n, an n-variable Boolean function f satisfies the propagation
criterion of order l (in brief, PC(l)) if $\mathcal{F}(D_e f) = 0$ (that is, if the derivative $D_e f$ is balanced) for every $e \in \mathbb{F}_2^n$ such that
$1 \le w_H(e) \le l$. Strict avalanche criterion (SAC) corresponds to PC(1) [516, 581].
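For small n the definition can be tested exhaustively by checking that every derivative in a direction of weight at most l is balanced. The following sketch encodes inputs as integers; the three example functions and all names are chosen here for illustration.

```python
def bit(x, i):
    return (x >> i) & 1

def derivative_is_balanced(f, n, e):
    """True if D_e f(x) = f(x) + f(x + e) takes the value 1 exactly 2^(n-1) times."""
    return sum(f(x) ^ f(x ^ e) for x in range(2 ** n)) == 2 ** (n - 1)

def satisfies_pc(f, n, l):
    """Propagation criterion of order l: all derivatives with 1 <= w_H(e) <= l are balanced."""
    return all(derivative_is_balanced(f, n, e)
               for e in range(1, 2 ** n) if bin(e).count("1") <= l)

def bent4(x):
    """x1 x2 + x3 x4, a bent function in 4 variables."""
    return (bit(x, 0) & bit(x, 1)) ^ (bit(x, 2) & bit(x, 3))

def quad3(x):
    """x1 x2 + x1 x3 + x2 x3, a quadratic function in 3 variables."""
    return (bit(x, 0) & bit(x, 1)) ^ (bit(x, 0) & bit(x, 2)) ^ (bit(x, 1) & bit(x, 2))

def partly_linear3(x):
    """x1 x2 + x3: its derivative in the direction of x3 is constant."""
    return (bit(x, 0) & bit(x, 1)) ^ bit(x, 2)

if __name__ == "__main__":
    print(satisfies_pc(bent4, 4, 4))
    print(satisfies_pc(quad3, 3, 2), satisfies_pc(quad3, 3, 3))
    print(satisfies_pc(partly_linear3, 3, 1))
```

The bent function satisfies PC(4), the symmetric quadratic function satisfies PC(2) but not PC(3) (its derivative in the all-one direction is constant), and the last function fails even SAC.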

It is shown in [605, 218, 219] that, if n is even, then PC(n − 2) implies PC(n); so for n
even we can find balanced n-variable PC(l) functions only if l ≤ n − 3. For odd n ≥ 3,
it is also known that the functions that satisfy PC(n − 1) are those functions of the form
$g(x_1 \oplus x_n, \dots, x_{n-1} \oplus x_n) \oplus \ell(x)$, where g is bent and ℓ is affine, and that the PC(n − 2)
functions are those functions of a similar form, but where, for at most one index i, the term
$x_i \oplus x_n$ may be replaced by $x_i$ or by $x_n$ (other equivalent characterizations exist [219]).
The algebraic degree of PC(l) functions is bounded above by n − 1. A lower bound on
their nonlinearity is easily shown [1169]: if there exists an l-dimensional subspace F such
that, for every nonzero e ∈ F, the derivative $D_e f$ is balanced, then $nl(f) \ge 2^{n-1} - 2^{n - \frac{l}{2} - 1}$.
Indeed, Relation (2.56), page 62, applied with $b = 0_n$ and $E = F^\perp$, shows that every value
$W_f^2(u)$ is then bounded above by $2^{2n-l}$; this implies, taking $F = \{e \in \mathbb{F}_2^n;\ e \preceq u\}$ for
$w_H(u) = l$, that PC(l) functions have nonlinearities bounded below by $2^{n-1} - 2^{n - \frac{l}{2} - 1}$.
Equality can occur only if l = n − 1 (n odd) and l = n (n even).

1 Note that the avalanche and propagation criteria also play a role with hash functions.
The maximum correlation of Boolean functions satisfying PC(l) (and in particular, of
bent functions) with respect to subsets of indices can be deduced from Relations (3.14),
page 102, and (2.56); see [187].
There exist characterizations of the propagation criterion. A first obvious one is that, according
to Relation (2.54), page 62, f satisfies PC(l) if and only if
$\sum_{u \in \mathbb{F}_2^n} (-1)^{a \cdot u}\, W_f^2(u) = 0$ for every nonzero vector a of Hamming weight at most l.
A second one (direct consequence of Relation (2.56), page 62) is:

Proposition 129 [219] Any n-variable Boolean function f satisfies PC(l) if and only if,
for every vector u of Hamming weight at least n − l, and every vector v, $\sum_{w \preceq u} W_f^2(w + v) = 2^{n + w_H(u)}$.

Maiorana–McFarland’s construction can be used to produce functions satisfying the
propagation criterion: the derivative $D_{(a,b)} f(x, y)$ of a function f of the form (5.1), page 165,
being equal to $x \cdot D_b\phi(y) \oplus a \cdot \phi(y + b) \oplus D_b g(y)$, the function satisfies PC(l) under the
sufficient condition that
1. For every nonzero $b \in \mathbb{F}_2^s$ of Hamming weight smaller than or equal to l, and every
vector $y \in \mathbb{F}_2^s$, the vector $D_b\phi(y)$ is nonzero (or equivalently every set $\phi^{-1}(u)$,
$u \in \mathbb{F}_2^r$, either is empty or is a singleton or has minimum distance strictly larger
than l).
2. Every linear combination of at least one and at most l coordinate functions of φ is
balanced (this condition corresponds to the case $b = 0_s$).
Constructions of such functions have been given in [218, 219, 722].
According to Proposition 129 above, Dobbertin’s construction cannot produce functions
satisfying PC(l) with $l \ge \frac{n}{2}$. Indeed, if u is for instance the vector with $\frac{n}{2}$ first coordinates
equal to 0, and with $\frac{n}{2}$ last coordinates equal to 1, we have, according to Relation (7.9),
page 296, $W_h^2(w) = 0$ for every $w \preceq u$.

8.2 PC(l) of order k and EPC(l) of order k criteria

Definition 72 An n-variable Boolean function satisfies the propagation criterion PC(l) of
order k (resp. the extended propagation criterion EPC(l) of order k) if it satisfies PC(l)
when k coordinates of the input x are kept constant (resp. if every derivative $D_e f$, with
$e \ne 0_n$ of weight at most l, is k-resilient).

According to the characterization of resilient functions and its proof, we have:

Proposition 130 [968] A function f satisfies EPC(l) (resp. PC(l)) of order k if and
only if, for any vector e of Hamming weight smaller than or equal to l and any vector c of
Hamming weight smaller than or equal to k, if $(e, c) \ne (0_n, 0_n)$ (resp. if $(e, c) \ne (0_n, 0_n)$
and if e and c have disjoint supports) then

$$W_{D_e f}(c) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus f(x+e) \oplus c \cdot x} = 0.$$

A characterization by the Walsh transform of f has been deduced in [987].


It has been shown in [970] that SAC(k) (i.e., PC(1) of order k) functions have algebraic
degrees at most n − k − 1. In [797], the criterion SAC(n − 3) was characterized through
the ANF of the function, and its properties were further studied. A construction of PC(l) of
order k functions based on Maiorana–McFarland’s method is given in [722] (the mapping φ
being linear and constructed from linear codes) and generalized in [218, 219] (the mapping
φ being not linear and constructed from nonlinear codes). A construction of n-variable
balanced functions satisfying SAC(k) and having algebraic degree n − k − 1 is given, for
n − k − 1 odd, in [722] and, for n − k − 1 even, in [1011] (where balancedness and nonlinearity
are also considered).
It is shown in [219] that, for every positive even l ≤ n − 4 (with n ≥ 6) and every odd l
such that 5 ≤ l ≤ n − 5 (with n ≥ 10), the functions that satisfy PC(l) of order n − l − 2
are the functions of the form $\bigoplus_{1 \le i < j \le n} x_i x_j \oplus h(x_1, \dots, x_n)$, where h is affine.

8.3 Absolute indicator


In [1167], the authors state the conjecture that any balanced function on an odd number n of
variables satisfies $\Delta_f \ge 2^{(n+1)/2}$. In [527, 820], for n = 15 and 21 the authors give balanced
functions with $\Delta_f < 2^{(n+1)/2}$ (an error on the 21-variable functions found by computer
investigation has been corrected in [681]). In [684], the first construction giving $\Delta_f < 2^{n/2}$
for even n (a balanced 10-variable function with $\Delta_f = 24$) is found. In [1074] a construction
is given of n-variable balanced functions with $\Delta_f < 2^{n/2}$, where n > 44 and n ≡ 2 [mod 4],
with specific examples for n = 18, 22, 26. In [683], results for n = 12, 14, . . . , 26 are
obtained (the journal version to appear also provides n-variable balanced functions with
$\Delta_f < 2^{n/2}$, where n > 50 and n ≡ 0 [mod 4]).
Bounds between the absolute indicator and the nonlinearity are given in [1178].

Remark. The block sensitivity bs(f) of an n-variable Boolean function f equals the
maximum number of vectors $a^{(1)}, \dots, a^{(k)} \in \mathbb{F}_2^n$ with disjoint supports and such that
$D_{a^{(i)}} f(x) = 1$, ∀i = 1, . . . , k. Its (basic) sensitivity s(f) is defined similarly with all vectors
$a^{(i)}$ of Hamming weight 1. The 30-year-old sensitivity conjecture states that there exists a
constant C independent of n such that $bs(f) \le (s(f))^C$ for every f [905]. This conjecture
has been proved in [631].
9

Algebraic immune functions

The invention of algebraic attacks and of fast algebraic attacks has deeply modified the
research on Boolean functions for stream ciphers. Before 2003, functions had about ten
variables (to be quickly computable) and were mainly supposed to be balanced, to have large
algebraic degree and nonlinearity and, in the case of the combiner generator, to ensure a good
trade-off between algebraic degree, nonlinearity, and resiliency order. Since 2003, the
designer also needs to ensure resistance to the algebraic attack (which needs in practice
optimal or almost optimal algebraic immunity) and good resistance to fast algebraic attacks
and to the Rønjom–Helleseth attack and its improvements. This implies a larger number of
variables (say, between 16 and 20; it can be more if the function is particularly quickly
computable) and an algebraic degree close to n (this is a necessary but not sufficient
condition for the resistance against fast algebraic attacks). For this reason, the combiner
generator seems less suitable nowadays; it needs to be made more complex, for instance
with memory. Even the filter generator has posed a problem: for five years, no function
usable in it could be found (the known functions with optimal algebraic immunity had bad
nonlinearity and bad resistance to fast algebraic attacks; see [27]). In 2008, an infinite class
of functions possessing all mandatory features was found in [273]. The functions in this
class are rather quickly computable, but since stream ciphers need to be faster than block
ciphers (which can be used as pseudorandom generators), there is still a need for functions
satisfying all mandatory criteria and being very fast to compute, like the hidden weight bit
function (HWBF), which has been more recently investigated; see below. To be complete
in this introduction, we need to mention that a new way of using Boolean functions came
recently with the so-called filter permutator, as in the FLIP cryptosystem [839], which
posed new problems on Boolean functions (see [306]); see more in Section 12.2.

9.1 Algebraic immune Boolean functions


For the convenience of the reader, we summarize the definitions seen in Section 3.1 on
algebraic immune functions.

Definition 73 Let f be any n-variable Boolean function. The minimum algebraic degree
of nonzero annihilators of f or of f ⊕ 1 (i.e., of nonzero multiples of f ⊕ 1 or of f), is
called the algebraic immunity of f and is denoted by AI(f).
The fast algebraic immunity of f is the integer:

$$FAI(f) = \min\left(2\,AI(f),\ \min\{d_{alg}(g) + d_{alg}(fg);\ 1 \le d_{alg}(g) < AI(f)\}\right).$$

The fast algebraic complexity of f is the integer:

$$FAC(f) := \min\{\max[d_{alg}(g) + d_{alg}(fg),\ 3\, d_{alg}(g)];\ 1 \le d_{alg}(g) < AI(f)\}.$$

All three parameters are stable under complementation f ↦ f ⊕ 1; see more in [324].
We have $AI(f) \le \min(d_{alg}(f), \lceil \frac{n}{2} \rceil)$ and FAI(f) ≤ FAC(f) ≤ n for any n-variable
function f.
A standard algebraic attack on a stream cipher using some Boolean function f in the
combiner model or the filter model is all the more efficient as AI(f) is smaller and many
linearly independent lowest degree annihilators of f or f ⊕ 1 exist. Parameter FAC(f)
and its simplified version FAI(f) play a similar role with respect to fast algebraic attacks.
In [793], the authors call perfect algebraic immune (PAI) the n-variable Boolean functions f
such that, for any pair of strictly positive1 integers (e, d) such that e + d < n and $e < \frac{n}{2}$, there
is no nonzero function g of algebraic degree at most e such that fg has algebraic degree at
most d (while we have seen at page 94 that for every n-variable function f and every (e, d)
such that e + d ≥ n, such function g exists). Such functions have perfect immunity against
the standard and fast algebraic attacks (indeed, as shown in [793, 789], a PAI function and
an almost PAI function with an even number of variables have optimal algebraic immunity,
where almost PAI is defined similarly with e + d < n − 1 and $e < \frac{n-1}{2}$ instead of e + d < n
and $e < \frac{n}{2}$). It is shown in [485, 793] that perfect algebraic immune functions, when
balanced, can exist only if n equals 1 plus a power of 2, and when unbalanced, can exist
only if n is a power of 2. Indeed, it is easily seen that, for any perfect algebraic immune
function f, we have $d_{alg}(f) \ge n - 1$ and it is proved in [793] that if $d_{alg}(f) = n - 1$ (resp.
$d_{alg}(f) = n$), then for $e < \frac{n}{2}$ such that $\binom{n-1}{e} \equiv 1$ [mod 2] (resp. $\binom{n-1}{e} \equiv 0$ [mod 2]), there
exists a nonzero function g such that $d_{alg}(g) \le e$ and $d_{alg}(fg) \le n - e - 1$, and such e
exists unless $n = 2^s + 1$ (resp. $2^s$). It is shown in [791] that no symmetric Boolean function
can be perfect algebraic immune for n ≥ 5.
In [929], Pasalic introduced a slightly different notion of optimal resistance to FAA: an
n-variable Boolean function f is said to satisfy the high-degree product property (HDP) of
order n if, for every n-variable Boolean function g of algebraic degree e such that
$1 \le e < \lceil \frac{n}{2} \rceil$ and that is not an annihilator of f, we have $d_{alg}(fg) \ge n - e$. Then [929] proves that
f ⊕ 1 has the same property and $AI(f) = \lceil \frac{n}{2} \rceil$; such a function is then called algebraic
attack resistant (AAR).
As we already saw at page 92, since a Boolean function g is an annihilator of f if and
only if g(x) = 0 for every element x in the support of f, to determine whether f (resp.
f ⊕ 1) admits nonzero annihilators of algebraic degree at most d, we consider a general
Boolean function g(x) by its ANF $g(x) = \sum_{I \subseteq \{1,\dots,n\},\ |I| \le d} a_I \prod_{i \in I} x_i$ and consider the system (see
page 92) of the $w_H(f)$ (resp. $2^n - w_H(f)$) equations in the $\sum_{i=0}^{d} \binom{n}{i}$ unknowns $a_I \in \mathbb{F}_2$
expressing that g(x) = 0 for x ∈ supp(f) (resp. x ∉ supp(f)). The matrix $M_{f,d}$ (resp.
$M_{f \oplus 1,d}$) of this system has term $\prod_{i \in I} x_i$ at a row indexed by x and a column indexed by I,
where x ∈ supp(f) (resp. x ∉ supp(f)). Calculating the algebraic immunity of a function
f by applying the definition consists then in determining the minimum value of d such that
the ranks $rk(M_{f,d})$ and $rk(M_{f \oplus 1,d})$ of the matrices of these two systems do not both equal
$\sum_{i=0}^{d} \binom{n}{i}$, and the dimension $\dim(An_d(f))$ of the vector space of annihilators of algebraic
degree at most d of f equals $\sum_{i=0}^{d} \binom{n}{i} - rk(M_{f,d})$.

1 Assuming that e can be null would oblige f to have algebraic degree n and it could then not be balanced.
The dimension of And (f ) has been determined for all d in [228] for some classes of
functions: minimum weight elements f of the Reed–Muller codes (i.e., indicators of affine
subspaces of Fn2 ), their complements f ⊕ 1, their sums with affine functions when these
are balanced, and complements of threshold functions (see more on these latter functions in
Subsection 10.1.7).

Remark. Given an n-variable Boolean function f, denoting by LDAn(f) the $\mathbb{F}_2$-vector
space made of the annihilators of f of algebraic degree AI(f) (assuming that some exist;
otherwise, we change f into f ⊕ 1) and the zero function, we have, as observed in [261]:

1. $\dim LDAn(f) \le \binom{n}{AI(f)}$, since two distinct annihilators of algebraic degree AI(f)
cannot have the same degree AI(f) part in their algebraic normal forms (otherwise, their
sum would be a nonzero annihilator of algebraic degree strictly smaller than AI(f)).
2. If f is balanced and $AI(f) = \frac{n}{2}$, n even, then $\dim LDAn(f) \ge \frac{1}{2} \binom{n}{n/2}$, since matrix $M_{f, \frac{n}{2}}$
has $2^{n-1}$ rows and $\sum_{i=0}^{n/2} \binom{n}{i} = 2^{n-1} + \frac{1}{2} \binom{n}{n/2}$ columns.
3. If f is such that $AI(f) = \frac{n+1}{2}$, n odd, then $\dim LDAn(f) = \binom{n}{\frac{n+1}{2}}$, since we know
that $w_H(f) = 2^{n-1}$, and $M_{f, \frac{n-1}{2}}$ is then a $2^{n-1} \times 2^{n-1}$ square matrix whose rank equals
$2^{n-1}$; matrix $M_{f, \frac{n+1}{2}}$ has then rank $2^{n-1}$.
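The rank computation described above translates directly into code for small n. The sketch below builds the matrix with entries $\prod_{i \in I} x_i$ (rows indexed by the points of the support, columns by the monomials of degree at most d), computes its rank over $\mathbb{F}_2$, and derives the dimension of the annihilator space and the algebraic immunity. Inputs are encoded as integers, rows as bit masks, and the majority function serves as the example; these encodings and all names are choices made for this illustration.

```python
from itertools import combinations

def monomial_masks(n, d):
    """Bit masks of all index sets I with |I| <= d."""
    return [sum(1 << i for i in subset)
            for size in range(d + 1)
            for subset in combinations(range(n), size)]

def rank_gf2(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bit masks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def annihilator_dimension(points, n, d):
    """Number of monomials of degree <= d minus the rank of the evaluation matrix."""
    masks = monomial_masks(n, d)
    rows = []
    for x in points:
        row = 0
        for col, mask in enumerate(masks):
            if (x & mask) == mask:     # the monomial evaluates to 1 at x
                row |= 1 << col
        rows.append(row)
    return len(masks) - rank_gf2(rows)

def algebraic_immunity(f, n):
    """Least d such that f or f + 1 has a nonzero annihilator of degree <= d."""
    support = [x for x in range(2 ** n) if f(x)]
    cosupport = [x for x in range(2 ** n) if not f(x)]
    for d in range(n + 1):
        if annihilator_dimension(support, n, d) or annihilator_dimension(cosupport, n, d):
            return d

def majority(n):
    return lambda x: int(bin(x).count("1") > n // 2)

if __name__ == "__main__":
    for n in (3, 5):
        f = majority(n)
        supp = [x for x in range(2 ** n) if f(x)]
        ai = algebraic_immunity(f, n)
        print(n, ai, annihilator_dimension(supp, n, ai))
```

For the majority function in 3 and 5 variables the computed algebraic immunity is 2 and 3, and the annihilator spaces of that degree have dimension 3 and 10, in line with item 3 of the remark.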

9.1.1 General properties of the algebraic immunity and its relationship with some other criteria

We have seen that the algebraic immunity of any n-variable Boolean function is an affine
invariant and is bounded above by $\lceil \frac{n}{2} \rceil$. The functions used in stream ciphers must have an
algebraic immunity close to this maximum. In the next paragraphs, we give properties and
characterizations, some of which are new.

Algebraic immunity of monomial functions


It has been shown in [899, 900] that if the number $r(d)$ of runs of 1s in the binary expansion
of the exponent $d$ of a power function $tr_n(ax^d)$ (that is, the number of full subsequences of
consecutive 1s) is smaller than $\sqrt{n}/2$, then the algebraic immunity is bounded above by

$$r(d)\lfloor\sqrt{n}\rfloor + \left\lceil \frac{n}{\lfloor\sqrt{n}\rfloor} \right\rceil - 1. \qquad (9.1)$$

This comes from the fact that there exists $g$ of algebraic degree $\left\lceil \frac{n}{\lfloor\sqrt{n}\rfloor} \right\rceil$ such that $fg$
has algebraic degree at most (9.1). This property also allows us to prove that
$FAI(f) \le r(d)\lfloor\sqrt{n}\rfloor + 2\left\lceil \frac{n}{\lfloor\sqrt{n}\rfloor} \right\rceil - 1$, as observed in [870].

Note that (9.1) is better than the general bound $\lceil n/2 \rceil$ for only a negligible part of power
mappings, but it addresses all those whose exponents have a constant 2-weight or a constant
number of runs – the power functions studied as potential S-boxes in block ciphers enter in
this framework. Moreover, the bound is further improved when n is odd and the function
is almost bent: the algebraic immunity of such functions is bad since bounded above by
$2\lceil\sqrt{n}\rceil$. The exact value of the algebraic immunity of the multiplicative inverse function
$tr_n(ax^{2^n-2})$, $a \ne 0$, has been given in [498]; it equals $\lceil 2\sqrt{n}\rceil - 2$, which is not good either.
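To get a feeling for the sizes involved, the quantities above can be evaluated for a concrete $n$. This is only an illustrative sketch: it assumes that runs are counted in the ordinary (non-cyclic) binary expansion of the exponent and that bound (9.1) reads $r(d)\lfloor\sqrt{n}\rfloor + \lceil n/\lfloor\sqrt{n}\rfloor\rceil - 1$; the function names are hypothetical.

```python
from math import isqrt

def runs_of_ones(d):
    """Number of maximal blocks of consecutive 1s in the binary expansion of d
    (non-cyclic reading, which is an assumption of this sketch)."""
    return sum(1 for block in bin(d)[2:].split("0") if block)

def bound_9_1(n, d):
    """r(d)*floor(sqrt(n)) + ceil(n / floor(sqrt(n))) - 1."""
    s = isqrt(n)                          # floor(sqrt(n))
    return runs_of_ones(d) * s + -(-n // s) - 1

def inverse_function_ai(n):
    """ceil(2*sqrt(n)) - 2, computed exactly with integer arithmetic."""
    return isqrt(4 * n - 1) + 1 - 2

n = 16
general_bound = (n + 1) // 2              # ceil(n/2)
print(runs_of_ones(0b1110111), bound_9_1(n, 7), inverse_function_ai(n), general_bound)
```

With $n = 16$ and an exponent having a single run, the value of (9.1) already falls below $\lceil n/2 \rceil$, and the value for the inverse function is smaller still.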

Algebraic immunity of a restriction


If the restriction of a function $f$ to an affine space $A$, for instance obtained by fixing $x_i$ to $a_i$
for any $i \in I \subseteq \{1, \dots, n\}$, has a nonzero annihilator $g$ of some algebraic degree $d$, then $f$
has for nonzero annihilator the function $g(x)1_A(x)$, equal to $g(x)\prod_{i\in I}(x_i \oplus a_i \oplus 1)$
in the latter example, in which case the algebraic degree equals $d + n - \dim(A) = d + |I|$.
By applying this to $f$ and to $f \oplus 1$, whose restrictions cannot be both null, the algebraic
immunity of the restriction is at least $AI(f) - n + \dim(A) = AI(f) - |I|$, as observed in
[407]. Moreover, the annihilators of the restriction of $f$ are the restrictions of the annihilators
of $f$.
To have a chance of having large algebraic immunity, a function then needs not only
to have large enough algebraic degree, but also that each of its restrictions to an affine space of
large dimension, for instance the restriction obtained by fixing a few input coordinates, has
large enough algebraic degree. This implies that Maiorana–McFarland functions defined by
Relation (5.1), page 165, with r large have bad algebraic immunity. It is observed in [279]
that a Maiorana–McFarland function $x \cdot \phi(y) \oplus g(y)$, where $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^{n-r}$, can have
algebraic immunity $n - r + 1$ (which is its maximal possible algebraic degree) only if, for
every affine subspace $A$ of $\mathbb{F}_2^{n-r}$, we have $\sum_{y\in A}\phi(y) \ne 0_r$. Indeed, the products of $f$
and $f \oplus 1$ by the indicator function of $\mathbb{F}_2^r \times A$ are annihilators of $f \oplus 1$ and $f$ and are
not both null; they equal the products of the restrictions of $f$ and $f \oplus 1$ to $\mathbb{F}_2^r \times A$ (which
have algebraic degree $1 + \dim A$ if this nonnullity condition is satisfied and at most $\dim A$
otherwise) and of the function of $y$ equal to the indicator function of $A$ (which has algebraic
degree $n - r - \dim A$).

Characterization of annihilators by the Walsh transform


For every $x \in \mathbb{F}_2^n$, we have $(fg)(x) = \left(\frac{1}{2} - \frac{(-1)^{f(x)}}{2}\right)\left(\frac{1}{2} - \frac{(-1)^{g(x)}}{2}\right) = \frac{1}{4}\left(1 - (-1)^{f(x)} - (-1)^{g(x)} + (-1)^{f(x)\oplus g(x)}\right)$.
Recall that the Fourier transform is its own inverse up to a
multiplicative factor and that this implies that any integer-valued function $\varphi$ over $\mathbb{F}_2^n$ is
1. Equal to the zero function if and only if its Fourier transform $\widehat{\varphi}$ is null
2. Constant if and only if $\widehat{\varphi}(a)$ is null at any input $a \ne 0_n$
We deduce a characterization first observed in [128], that we slightly complete:

Proposition 131 Let n be any positive integer and f, g any n-variable Boolean functions.
Then
$$g \in An(f) \iff \forall a \in \mathbb{F}_2^n,\ W_{f\oplus g}(a) + 2^n\delta_0(a) = W_f(a) + W_g(a),$$

where δ0 is the Dirac (or Kronecker) symbol. Moreover, if f is different from the constant
function 1, we have:

$$g \in An(f) \iff \forall a \ne 0_n,\ W_{f\oplus g}(a) = W_f(a) + W_g(a).$$

Indeed, the first equivalence is a consequence of observation 1 above applied to the two
members of the equality above or to ϕ = f + g − f ⊕ g = 2fg, using the linearity
of the Fourier transform and Relation (2.32), page 55. The second equivalence is then a
straightforward consequence of observation 2, since fg constant means fg = 0 because
fg = 1 is impossible, f being not constant function 1.
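The first equivalence of Proposition 131 can be checked exhaustively for a small number of variables. The sketch below is an illustration under the usual convention $W_f(a) = \sum_x (-1)^{f(x) \oplus a\cdot x}$; the function names are hypothetical and the test function (majority in three variables) is chosen only for convenience.

```python
def walsh(tt):
    """Walsh transform of a truth table (list of 0/1), via the fast butterfly."""
    w = [1 - 2 * v for v in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def annihilates_by_walsh(f, g):
    """Test W_{f+g}(a) + 2^n delta_0(a) = W_f(a) + W_g(a) for every a."""
    size = len(f)
    wf, wg = walsh(f), walsh(g)
    wx = walsh([a ^ b for a, b in zip(f, g)])
    return all(wx[a] + (size if a == 0 else 0) == wf[a] + wg[a]
               for a in range(size))

def annihilates_directly(f, g):
    """Test fg = 0 pointwise."""
    return all((a & b) == 0 for a, b in zip(f, g))

n = 3
f = [1 if bin(x).count("1") >= 2 else 0 for x in range(2 ** n)]
all_g = ([(v >> x) & 1 for x in range(2 ** n)] for v in range(2 ** (2 ** n)))
agree = all(annihilates_by_walsh(f, g) == annihilates_directly(f, g) for g in all_g)
print(agree)
```

The two tests agree on all 256 candidate functions $g$, as the proposition predicts.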
Note that the bound $AI(f) \le \lceil n/2 \rceil$ shows then that, for every nonconstant n-variable
Boolean function $f$, there exists a nonzero n-variable Boolean function $g$ of algebraic degree
at most $\lceil n/2 \rceil$, such that either $W_{f\oplus g}(a) = W_f(a) + W_g(a)$ for all $a \ne 0_n$ ($g$ being an
annihilator of $f \ne 1$) or $W_{f\oplus g}(a) = W_f(a) - W_g(a)$ for all $a \ne 0_n$ ($g$ being an annihilator
of $f \oplus 1 \ne 1$).
Moreover, since $(-1)^{(f\oplus g)(x)} = (-1)^{f(x)}(-1)^{g(x)}$, we have by applying Relation (2.45),
page 60,

$$2^n W_{f\oplus g} = W_f \otimes W_g,$$

where $(W_f \otimes W_g)(a) = \sum_{u\in\mathbb{F}_2^n} W_f(a + u)W_g(u)$. We deduce:

Corollary 25 Let n be any positive integer and f any n-variable Boolean function. We
have $g \in An(f)$ if and only if

$$\forall a \in \mathbb{F}_2^n,\quad (W_f \otimes W_g)(a) - 2^n W_g(a) = 2^n W_f(a) - 2^{2n}\delta_0(a) \qquad (9.2)$$

and if $f$ is not constant function 1, this condition with $a \ne 0_n$ suffices.

The Walsh transforms of the annihilators of $f$ are then the solutions of the system of
the $2^n$ linear equations (9.2) indexed by $a$, in the $2^n$ unknowns $W_g(a)$, $a \in \mathbb{F}_2^n$, whose
matrix equals $M - 2^n I$, where $I$ is the identity matrix and $M$ is the matrix whose coefficient
at a row indexed by $a$ and a column indexed by $u$ equals $W_f(a + u)$. Note that we have
$M \times M^t = M \times M = 2^{2n} I$, where $M^t$ is the transpose of $M$, since, for every $a, b \in \mathbb{F}_2^n$,
we have $\sum_{u\in\mathbb{F}_2^n} W_f(a + u)W_f(b + u) = \sum_{u\in\mathbb{F}_2^n} W_f(u)W_f(a + b + u) = 2^{2n}\delta_0(a + b)$,
according to the Parseval and Titsworth relations (2.48) and (2.51), page 61.
It is interesting to see that the annihilators of any Boolean function f combine two linear
algebraic properties over different fields:
– The set of annihilators of f is an F2 -vector space.
– The set of their Walsh transforms is the intersection between an $\mathbb{R}$-vector space (the set of
solutions of the system given above) and the set of integer-valued functions $W : \mathbb{F}_2^n \to \mathbb{Z}$
satisfying the equation $\sum_{u\in\mathbb{F}_2^n} W(a + u)W(u) = 2^{2n}\delta_0(a)$ for every $a \in \mathbb{F}_2^n$ (we know
indeed that these $2^n$ quadratic equations are characteristic of the Walsh transforms of
Boolean functions).

Characterization of annihilators by the NNF


The determination of annihilators can be handled in a simple way (with two equations over
$\mathbb{Z}$, one quadratic and one linear, instead of $w_H(f)$ linear ones over $\mathbb{F}_2$ as we saw with the
ANF) through the NNF representation (see Subsection 2.2.4, page 47): let $\sum_{I\subseteq\{1,\dots,n\}}\lambda_I x^I$,
$\lambda_I \in \mathbb{Z}$, be the NNF of a Boolean function $f(x)$; we know from (2.28), page 51, that an
integer-valued function $g(x) = \sum_{I\subseteq\{1,\dots,n\}}\mu_I x^I$, $\mu_I \in \mathbb{Z}$, is Boolean if and only if the
single quadratic equation

$$\sum_{I\subseteq\{1,\dots,n\}} 2^{n-|I|} \sum_{J,J'\subseteq\{1,\dots,n\};\ I=J\cup J'} \mu_J\mu_{J'} = \sum_{I\subseteq\{1,\dots,n\}} 2^{n-|I|}\mu_I \qquad (9.3)$$

is satisfied. We have that $g$ is an annihilator of $f$ if and only if² $\sum_{x\in\mathbb{F}_2^n} f(x)g(x) = 0$.

Hence, since $\sum_{x\in\mathbb{F}_2^n} x^I = |\{x \in \mathbb{F}_2^n;\ I \subseteq supp(x)\}| = 2^{n-|I|}$:

Proposition 132 Let $f$ be any n-variable Boolean function and let its NNF equal
$\sum_{I\subseteq\{1,\dots,n\}}\lambda_I x^I$, $\lambda_I \in \mathbb{Z}$. Then the annihilators of $f$ are the functions
$g(x) = \sum_{I\subseteq\{1,\dots,n\}}\mu_I x^I$, $\mu_I \in \mathbb{Z}$, which satisfy (9.3) and

$$\sum_{I\subseteq\{1,\dots,n\}} 2^{n-|I|} \sum_{J,J'\subseteq\{1,\dots,n\};\ I=J\cup J'} \lambda_J\mu_{J'} = 0.$$
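The two integer equations can be evaluated directly from NNF coefficients. The sketch below is an illustration only: it assumes that the NNF coefficients are obtained from the truth table by the integer Möbius transform, $\lambda_I = \sum_{x \preceq I} (-1)^{|I|-w_H(x)} f(x)$, with subsets $I$ encoded as bitmasks; the names `nnf` and `pairing` are hypothetical.

```python
def nnf(tt, n):
    """NNF coefficients of a truth table, by the integer Moebius transform."""
    lam = list(tt)
    for i in range(n):
        for x in range(2 ** n):
            if (x >> i) & 1:
                lam[x] -= lam[x ^ (1 << i)]
    return lam

def pairing(lam, mu, n):
    """Sum over I of 2^(n-|I|) times the sum over J u J' = I of lam_J * mu_J'."""
    total = 0
    for j in range(2 ** n):
        for k in range(2 ** n):
            total += 2 ** (n - bin(j | k).count("1")) * lam[j] * mu[k]
    return total

n = 3
f = [1 if bin(x).count("1") >= 2 else 0 for x in range(2 ** n)]
g = [1 - v for v in f]            # f + 1, which is an annihilator of f
lam, mu = nnf(f, n), nnf(g, n)

# Quadratic equation (9.3): characterizes that g is Boolean.
is_boolean = pairing(mu, mu, n) == sum(
    2 ** (n - bin(i).count("1")) * mu[i] for i in range(2 ** n))
# Linear equation (in the mu_I) of Proposition 132: characterizes fg = 0.
is_annihilator = pairing(lam, mu, n) == 0
print(is_boolean, is_annihilator)
```

The helper `pairing(lam, mu, n)` equals $\sum_x f(x)g(x)$ computed in $\mathbb{Z}$, which is why a single equation suffices for the annihilator condition.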

Algebraic immunity and codes


It is observed in [600, theorem 1 and corollary 1] that the problem of estimating the algebraic
immunity of Boolean functions over F2n is connected to cyclic codes. We modify the
statement of this result (and give a slightly different proof) so as to complete [600] by taking
into account the facts that if f (0) = 1, then the annihilators g of f must satisfy g(0) = 0
and that annihilators of algebraic degree n may exist.

Proposition 133 Let $f(x)$ be an n-variable Boolean function in univariate form.
Then the annihilators of $f(x)$ in univariate representation are those multiples $g(x)$ of
$\gcd(f(x) + 1, x^{2^n} + x)$ in $\mathbb{F}_{2^n}[x]/(x^{2^n} + x)$ that satisfy $(g(x))^2 = g(x)$.
If $f(0) = 0$, then the annihilators of algebraic degree at most $n - 1$ are the
codewords of the cyclic code of length $2^n - 1$ over $\mathbb{F}_{2^n}$ and of generator polynomial
$\gcd(f(x) + 1, x^{2^n-1} + 1)$ that satisfy $(g(x))^2 = g(x)$.

Proof We know that the annihilators of $f \in BF_n$ are those Boolean functions that are
multiples of $f \oplus 1$ in $BF_n$. These annihilators in univariate representation are then the
multiples of $f(x) + 1$ in $\mathbb{F}_{2^n}[x]/(x^{2^n} + x)$ that satisfy $(g(x))^2 = g(x)$, and since being such a
multiple is equivalent to being a multiple of $\gcd(f(x) + 1, x^{2^n} + x)$ [mod $x^{2^n} + x$], this proves
the first part. The rest is straightforward since, if $f(0) = 0$, then $\gcd(f(x) + 1, x^{2^n} + x) =
\gcd(f(x) + 1, x^{2^n-1} + 1)$, and since reducing mod $x^{2^n} + x$ or mod $x^{2^n-1} + 1$ a polynomial
of algebraic degree at most $n - 1$, that is, of degree at most $2^n - 2$, is the same.


2 Recall that $\sum$ denotes sums in $\mathbb{Z}$.

Corollary 26 Let $f(x)$ be an n-variable Boolean function in univariate form. Then $AI(f)$
equals the minimum, among all those nonzero elements $g(x)$ of $\mathbb{F}_{2^n}[x]/(x^{2^n} + x)$ that satisfy
$(g(x))^2 = g(x)$ and are multiples either of $\gcd(f(x) + 1, x^{2^n} + x)$ or of $\gcd(f(x), x^{2^n} + x)$,
of the maximum 2-weight of the exponents in the terms of these polynomials.

It is also shown in [600] that the spectral immunity (defined at page 96) of a Boolean
function $f(x)$ (in univariate form) is equal to the minimal weight of the nonzero codewords
of the cyclic codes over $\mathbb{F}_{2^n}$ of generator polynomials $\gcd(f(x) + 1, x^{2^n-1} + 1)$ and
$\gcd(f(x), x^{2^n-1} + 1)$.


n

In [869], it is shown that, given an n-variable Boolean function $f$, if the minimum distance
of the linear code $\{(a_0, \dots, a_{2^n-1}) \in \mathbb{F}_{2^n}^{2^n};\ \sum_{i=0}^{2^n-1} a_i x^i = 0,\ \forall x \in supp(f)\}$ (i.e., the vector
space of univariate representations of $\mathbb{F}_{2^n}$-valued annihilators of $f$), which we shall denote
by $C_f$, is strictly larger than $\sum_{i=0}^{d}\binom{n}{i}$ for a given $d$, then the minimum algebraic degree
of nonzero annihilators of $f$ is strictly larger than $d$. Indeed, if a nonzero annihilator of $f$
has algebraic degree at most $d$, then its Hamming weight as a codeword of $C_f$ is at most
$\sum_{i=0}^{d}\binom{n}{i}$, a contradiction.
There is, however, an issue with this result when $f(0) = 0$, since $C_f$ has then
minimum distance 2, because it includes the codeword $(1, 0, \dots, 0, 1)$; the result gives then
no information in that case. This difficulty can be easily addressed since the codeword
$(1, 0, \dots, 0, 1)$ corresponds to an annihilator of algebraic degree $n$ (the indicator of $\{0\}$,
i.e., function $\delta_0$) and presents then no interest from the viewpoint of algebraic immunity.
We can slightly modify the result of [869] by considering, instead of $C_f$, the code
$C'_f = \{(a_0, \dots, a_{2^n-2}) \in \mathbb{F}_{2^n}^{2^n-1};\ \sum_{i=0}^{2^n-2} a_i x^i = 0,\ \forall x \in supp(f)\}$ (i.e., the vector space of
univariate representations of $\mathbb{F}_{2^n}$-valued annihilators of $f$ of algebraic degree at most $n - 1$);
the result also works for $C'_f$, unless all nonzero annihilators of $f$ have algebraic degree $n$,
that is, unless $f = 1 \oplus \delta_a$ for some $a \in \mathbb{F}_{2^n}$.
We consider now the cyclic code (also introduced in [869])
$\overline{C}_f = \{(a_1, \dots, a_{2^n-1}) \in \mathbb{F}_{2^n}^{2^n-1};\ \sum_{i=1}^{2^n-1} a_i x^i = 0,\ \forall x \in supp(f)\}$ (i.e., the subcode of $C_f$ whose elements are the
univariate representations of $\mathbb{F}_{2^n}$-valued annihilators of $f$ null at position 0), punctured at 0.
The minimum distance (i.e., nonzero weight) of $C_f$ equals at least the minimum distance of
$\overline{C}_f$. Indeed, if $f(0) = 1$, then the minimum distances of $C_f$ and $\overline{C}_f$ are equal to each other,
and if $f(0) = 0$, then $C_f = \{(0, 0, \dots, 0, 0), (1, 0, \dots, 0, 1)\} + \{0\} \times \overline{C}_f$ and the minimum
distance of $C_f$ is then larger than or equal to that of $\overline{C}_f$.
The interest of this observation is that $\overline{C}_f$ is cyclic and we can apply the BCH bound
to this cyclic code. This gives a direct lower bound on the minimum algebraic degree of
nonzero annihilators of $f$.

Relationship between normality and algebraic immunity


Normality of order larger than $\frac{n}{2}$ represents a weakness with respect to algebraic immunity:

Proposition 134 For any positive $n$ and $k \le n$, if an n-variable function $f$ is k-normal,
then its algebraic immunity is at most $n - k$.

Indeed, the fact that $f(x) = \epsilon \in \mathbb{F}_2$ for every $x \in A$, where $A$ is a k-dimensional flat,
implies that the indicator of $A$ is an annihilator of $f + \epsilon$. This bound is tight since, being a
symmetric Boolean function, the majority function (see page 335) is $\lfloor n/2 \rfloor$-normal for every
$n$ and has algebraic immunity $\lceil n/2 \rceil$. Obviously, $AI(f) \le \ell$ does not imply conversely that
$f$ is $(n - \ell)$-normal, since when $n$ tends to infinity, for every $a > 1$, n-variable Boolean
functions are almost surely non-$(a \log_2 n)$-normal [222, 224] and the algebraic immunity is
always bounded above by $\lceil n/2 \rceil$.

Functions in odd numbers of variables with optimal algebraic immunity


In [188], A. Canteaut has observed the following property:

Proposition 135 If an n-variable balanced function $f$, with $n$ odd, admits no nonzero
annihilator of algebraic degree at most $\frac{n-1}{2}$, then it has optimal algebraic immunity $\frac{n+1}{2}$.

This result is a direct consequence of Proposition 136 below, which has been proved later.
It means that we do not need to check also that $f \oplus 1$ has no nonzero annihilator of algebraic
degree at most $\frac{n-1}{2}$ for showing that $f$ has optimal algebraic immunity.³

The original proof (simplified in the end) of Proposition 135 is as follows: consider the
Reed–Muller code of length $2^n$ and of order $\frac{n-1}{2}$. This code is self-dual (i.e., is its own dual),
according to Theorem 9, page 154. Let $G$ be a generator matrix of this code. Each column of
$G$ is labeled by the vector of $\mathbb{F}_2^n$ obtained by keeping its coordinates of indices $2, \dots, n + 1$
(assuming that the first row of $G$ is the all-1 vector, corresponding to constant function
1, and that the next $n$ rows correspond to the coordinate functions). Saying that $f$ has no
nonzero annihilator of algebraic degree at most $\frac{n-1}{2}$ is equivalent to saying that the matrix
obtained by selecting those columns of $G$ corresponding to the elements of the support of
$f$ has full rank $\sum_{i=0}^{\frac{n-1}{2}}\binom{n}{i} = 2^{n-1}$. By hypothesis, $f$ has Hamming weight $2^{n-1}$. In terms of
coding theory, the support of the function is an information set. Then the complement of the
support of $f$ being an information set of the dual (recall that if $G = [I_k : M]$ is a systematic
generator matrix of a linear code, then $[-M^t : I_{n-k}]$ is a parity check matrix of the code)
and the code being self-dual, this complement is also an information set of the code (i.e., the
code is complementary information set CIS; see page 459).

More relationship between the existence of low degree annihilators of f and of f ⊕ 1


We have, from [800] (we slightly modify the proof):

Proposition 136 If, for some $k < \lceil n/2 \rceil$, we have $rk(M_{f,k}) = w_H(f)$ (i.e., all the rows of
$M_{f,k}$ are $\mathbb{F}_2$-linearly independent), then $rk(M_{f\oplus 1,k}) = \sum_{i=0}^{k}\binom{n}{i}$ (i.e., $f \oplus 1$ has no nonzero
annihilator of algebraic degree at most $k$).

Proof Suppose there exists a nonzero annihilator $g$ of algebraic degree at most $k$ of $f \oplus 1$.
We have then $supp(g) \subseteq supp(f)$. Since all the rows of $M_{f,k}$ are $\mathbb{F}_2$-linearly independent,

3 The same has been shown for n even but for (less interesting) unbalanced functions.

all those of $M_{g,k}$ are $\mathbb{F}_2$-linearly independent, and for every choice of
$(b_x)_{x\in supp(g)} \in \mathbb{F}_2^{w_H(g)}$, the system of linear equations whose matrix is $M_{g,k}$ and whose constants are these
$b_x$ has a solution. In particular, for every $x \in supp(g)$, there exists $g'$ of algebraic degree at
most $k$ such that $gg' = \delta_x$ (the Dirac symbol at $x$, i.e., the indicator function of the singleton
$\{x\}$), a contradiction with $d_{alg}(gg') \le d_{alg}(g) + d_{alg}(g') < n$.

Minimum Hamming distance to functions of large algebraic immunity bounded below by means of the dimensions of vector spaces of functions
Lobanov has made in two papers [800, 801] the following observations (that we gather in a
single proposition):

Proposition 137 For any n-variable Boolean functions $f$, $h$ and any integers $0 \le k, l \le n$,
we have
$$d_H(f, h) \ge \dim(An_k(h)) - \dim(An_k(f)) + \dim(An_l(h \oplus 1)) - \dim(An_l(f \oplus 1)).$$
Moreover, if $d \le AI(f)$, then we have
$$d_H(f, h) \ge \dim(An_{d-1}(h)) + \dim(An_{d-1}(h \oplus 1)). \qquad (9.4)$$

Proof Among the $rk(M_{f,k})$ linearly $\mathbb{F}_2$-independent rows of $M_{f,k}$ that can be selected,
there exist at least $rk(M_{f,k}) - rk(M_{h,k}) = \dim(An_k(h)) - \dim(An_k(f))$ ones that are not
rows of $M_{h,k}$, and there are then at least the same number of distinct elements of $\mathbb{F}_2^n$ in the
support of $f$ that are not in the support of $h$. We can apply this to $f \oplus 1$ and $h \oplus 1$ as
well, with $l$ in the place of $k$. This gives the first inequality. Moreover, if $d \le AI(f)$, then
$\dim(An_{d-1}(f)) = \dim(An_{d-1}(f \oplus 1)) = 0$. This completes the proof.

Lobanov notes that, if $k \ge l$, then the mapping $(g_1, g_2) \mapsto g_1 \oplus g_2$ is an $\mathbb{F}_2$-linear
isomorphism between the vector spaces $An_k(h) \times An_l(h \oplus 1)$ and
$$B_{k,l}(h) = \{g \in BF_n;\ d_{alg}(g) \le k \text{ and } d_{alg}(hg) \le l\}.$$
Indeed, the image set of this mapping is included in $B_{k,l}(h)$, since we have $(g_1 \oplus g_2)h = g_2$,
and composing it with the mapping $g \in B_{k,l}(h) \mapsto (g \oplus hg, hg) \in An_k(h) \times An_l(h \oplus 1)$
gives identity. Hence, we have
• $\dim(An_k(h)) + \dim(An_l(h \oplus 1)) = \dim B_{k,l}(h)$
• $\dim(An_k(f)) + \dim(An_l(f \oplus 1)) = \dim B_{k,l}(f)$
• $\dim(An_{d-1}(h)) + \dim(An_{d-1}(h \oplus 1)) = \dim B_{d-1,d-1}(h)$.
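The dimension identity $\dim(An_k(h)) + \dim(An_l(h \oplus 1)) = \dim B_{k,l}(h)$ for $k \ge l$ can be confirmed by exhaustive enumeration in three variables. The following is a brute-force sketch written for illustration only (the names `degree` and `dimensions` are hypothetical, and the zero function is given degree $-1$ so that it belongs to every space); it does not scale beyond very small $n$.

```python
def degree(tt, n):
    """Algebraic degree via the binary Moebius transform (-1 for the zero function)."""
    a = list(tt)
    for i in range(n):
        for x in range(2 ** n):
            if (x >> i) & 1:
                a[x] ^= a[x ^ (1 << i)]
    return max((bin(x).count("1") for x in range(2 ** n) if a[x]), default=-1)

def product(u, v):
    return [a & b for a, b in zip(u, v)]

def dimensions(h, n, k, l):
    """Return (dim An_k(h), dim An_l(h+1), dim B_{k,l}(h)) by exhaustive search."""
    size = 2 ** n
    funcs = [[(v >> x) & 1 for x in range(size)] for v in range(2 ** size)]
    h1 = [b ^ 1 for b in h]
    an_k = sum(1 for g in funcs if degree(g, n) <= k and not any(product(g, h)))
    an_l = sum(1 for g in funcs if degree(g, n) <= l and not any(product(g, h1)))
    b_kl = sum(1 for g in funcs
               if degree(g, n) <= k and degree(product(h, g), n) <= l)
    # Each set is an F_2-vector space, so its size is a power of 2.
    return tuple(c.bit_length() - 1 for c in (an_k, an_l, b_kl))

n = 3
h = [1 if bin(x).count("1") >= 2 else 0 for x in range(2 ** n)]
dim_an_k, dim_an_l, dim_b = dimensions(h, n, k=2, l=1)
print(dim_an_k, dim_an_l, dim_b)
```

Here $h$ is the majority function in three variables, which has no nonzero annihilator of degree at most 1, so the whole dimension of $B_{2,1}(h)$ comes from $An_2(h)$.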

In [800], it is shown⁴ that, for every $d \le \lceil n/2 \rceil$ and every function $h$ such that
$\dim(An_{d-1}(h)) + \dim(An_{d-1}(h \oplus 1)) > 0$, there exists $f$ for which Bound (9.4) is
an equality and such that $AI(f) \ge d$. Let us give a proof of this astonishingly general
result. Let $C_1$ (resp. $C_0$) be a maximal subset of $supp(h)$ (resp. $supp(h \oplus 1)$) such that
the corresponding rows of $M_{h,d-1}$ (resp. $M_{h\oplus 1,d-1}$) are $\mathbb{F}_2$-linearly independent. We have
4 Originally it was assumed that the algebraic degree of $h$ is at most $\lceil n/2 \rceil$, but after clarifying the
proof with M. Lobanov, we could see that this is not necessary.
$|C_1| = \sum_{i=0}^{d-1}\binom{n}{i} - \dim(An_{d-1}(h))$ and $|C_0| = \sum_{i=0}^{d-1}\binom{n}{i} - \dim(An_{d-1}(h \oplus 1))$. According
to Proposition 136 applied to the indicator function $1_{C_1}$ (resp. $1_{C_0}$) and with $k = d - 1$,
the ranks of $M_{1_{C_1}\oplus 1,d-1}$ and $M_{1_{C_0}\oplus 1,d-1}$ both equal $\sum_{i=0}^{d-1}\binom{n}{i}$. Since $C_0 \subseteq supp(1_{C_1} \oplus 1)$
(resp. $C_1 \subseteq supp(1_{C_0} \oplus 1)$), there exists outside $C_1 \cup C_0$ a subset $C'_0$ of size
$\sum_{i=0}^{d-1}\binom{n}{i} - |C_0| = \dim(An_{d-1}(h \oplus 1))$ (resp. $C'_1$ of size $\sum_{i=0}^{d-1}\binom{n}{i} - |C_1| = \dim(An_{d-1}(h))$)
such that the rows of $M_{1_{C_1}\oplus 1,d-1}$ (resp. $M_{1_{C_0}\oplus 1,d-1}$) corresponding to the elements of
$C_0 \cup C'_0$ (resp. $C_1 \cup C'_1$) are $\mathbb{F}_2$-linearly independent. Since $C_0$ and $C_1$ were taken maximal,
we have $C'_1 \subseteq supp(h \oplus 1)$ and $C'_0 \subseteq supp(h)$. The function $f = h \oplus 1_{C'_0} \oplus 1_{C'_1}$ satisfies
$d_H(f, h) = \dim(An_{d-1}(h)) + \dim(An_{d-1}(h \oplus 1))$. And we have $AI(f) \ge d$, since
$rk(M_{f,d-1}) \ge |C_1| + |C'_1| = \sum_{i=0}^{d-1}\binom{n}{i}$ and similarly $rk(M_{f\oplus 1,d-1}) \ge \sum_{i=0}^{d-1}\binom{n}{i}$.

Relationship between algebraic immunity, Hamming weight, algebraic degree, nonlinearity, and higher-order nonlinearity
We have seen that nonlinearity and algebraic degree are rather uncorrelated: there are
Boolean functions with high nonlinearity and low algebraic degree (since there exist
quadratic bent functions), with low nonlinearity and low algebraic degree, with high
nonlinearity and high algebraic degree,5 and with low nonlinearity and high algebraic
degree. Interestingly, if we replace the algebraic degree by the algebraic immunity, the latter
case cannot happen. We need preliminary results that have their own interest.

Proposition 138 [261] For every n-variable Boolean function $f$, we have:

$$\sum_{i=0}^{AI(f)-1}\binom{n}{i} \le w_H(f) \le \sum_{i=0}^{n-AI(f)}\binom{n}{i}. \qquad (9.5)$$

Indeed, if the left-hand side inequality is not satisfied, then $M_{f,AI(f)-1}$ has rank at most
$w_H(f) < \sum_{i=0}^{AI(f)-1}\binom{n}{i}$, a contradiction. The right-hand side inequality is obtained from
the other one by replacing $f$ by $f \oplus 1$.
This implies again that $AI(f) \le \lceil n/2 \rceil$ (since applied with $AI(f) \ge \lceil n/2 \rceil + 1$, it leads to
a contradiction, because the lower bound is then strictly larger than the upper bound), and it
also implies that a function $f$ such that $AI(f) = \frac{n+1}{2}$ ($n$ odd) must be balanced.
The following has been stated in [261, Lemma 1]:

Proposition 139 For any two n-variable Boolean functions $f$ and $h$, we have

$$AI(f) - d_{alg}(h) \le AI(f \oplus h) \le AI(f) + d_{alg}(h). \qquad (9.6)$$

The proof was incomplete: let $g \ne 0$ be such that $fg = 0$ (resp. $(f \oplus 1)g = 0$) and have
algebraic degree $AI(f)$, then we have $(f \oplus h)((h \oplus 1)g) = 0$ (resp. $(f \oplus 1 \oplus h)((h \oplus
1)g) = 0$); it was written that this proves the inequality on the right since $d_{alg}((h \oplus 1)g) \le
AI(f) + d_{alg}(h)$, but this conclusion is correct only if $(h \oplus 1)g \ne 0$. Let us address the case
$(h \oplus 1)g = 0$: we have then $(f \oplus h \oplus 1)g = 0$ (resp. $(f \oplus h)g = 0$), and $g$ being a nonzero
5 But not with maximal nonlinearity and high algebraic degree because of the Rothaus bound.

annihilator of $f \oplus h \oplus 1$ (resp. $f \oplus h$), we have $AI(f \oplus h) \le AI(f) \le AI(f) + d_{alg}(h)$.
This completes the proof of the inequality on the right. Applying it to $f \oplus h$ instead of $f$
gives then the inequality on the left.
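Relation (9.6) can be verified exhaustively for $n = 3$. The sketch below is illustrative only: truth tables are stored as 8-bit integers, the algebraic immunity is computed by brute force over all nonzero candidate annihilators, and the zero function is given degree 0 (a convention chosen here so that $h = 0$ is covered). All names are hypothetical.

```python
n, SIZE = 3, 8
FULL = (1 << SIZE) - 1            # truth table of the constant function 1

def degree(tt):
    """Algebraic degree of the function whose truth table is the integer tt."""
    a = [(tt >> x) & 1 for x in range(SIZE)]
    for i in range(n):
        for x in range(SIZE):
            if (x >> i) & 1:
                a[x] ^= a[x ^ (1 << i)]
    return max((bin(x).count("1") for x in range(SIZE) if a[x]), default=0)

deg = [degree(t) for t in range(1 << SIZE)]

def ai(f):
    """Minimum degree of a nonzero g with fg = 0 or (f + 1)g = 0."""
    return min(deg[g] for g in range(1, 1 << SIZE)
               if (f & g) == 0 or ((f ^ FULL) & g) == 0)

AI = [ai(f) for f in range(1 << SIZE)]

# AI(f) - deg(h) <= AI(f + h) <= AI(f) + deg(h) for all pairs (f, h).
holds = all(abs(AI[f ^ h] - AI[f]) <= deg[h]
            for f in range(1 << SIZE) for h in range(1 << SIZE))
print(holds, max(AI))
```

The maximum of `AI` over all 3-variable functions equals $\lceil 3/2 \rceil$, in agreement with the general upper bound.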
Note that these relations are valid if f and h are defined on different (maybe intersecting)
sets of variables and n is the global number of variables (indeed, algebraic immunity does
not change if we consider a function with more variables, the additional variables being
fictitious). Moreover, if these sets of variables are disjoint, then we have AI (f ) ≤ AI (f ⊕
h) ≤ AI (f ) + dalg (h), since it is then possible to obtain a nonzero annihilator of algebraic
degree AI (f ⊕ h) of f or of f ⊕ 1 as the restriction of a nonzero annihilator of f ⊕ h or of
f ⊕ h ⊕ 1.
It is deduced in [261] that low nonlinearity implies low algebraic immunity (but high
algebraic immunity does not imply high nonlinearity, as well as high nonlinearity does
not imply high algebraic immunity): Relation (9.5) applied to $f \oplus h$ with $h$ affine and
Relation (9.6) show that
$$nl(f) \ge \sum_{i=0}^{AI(f)-2}\binom{n}{i}$$

and more generally (by applying Relation (9.5) to $f \oplus h$ with $d_{alg}(h) \le r$)
$$nl_r(f) \ge \sum_{i=0}^{AI(f)-r-1}\binom{n}{i}. \qquad (9.7)$$

These lower bounds, which play a role with respect to probabilistic algebraic attacks (see
[792, 793]), have been improved in all cases for the first-order nonlinearity into
$$nl(f) \ge 2\sum_{i=0}^{AI(f)-2}\binom{n-1}{i}$$

by Lobanov [798, 799] and in most cases for the r-th order nonlinearity into
$$nl_r(f) \ge 2\sum_{i=0}^{AI(f)-r-1}\binom{n-r}{i} \qquad (9.8)$$

in [228] (in fact, the improvement was slightly stronger than this, but more complex).
Another improvement
$$nl_r(f) \ge \sum_{i=0}^{AI(f)-r-1}\binom{n}{i} + \sum_{i=AI(f)-2r}^{AI(f)-r-1}\binom{n-r}{i} \qquad (9.9)$$

(which always improves upon (9.7) and improves upon (9.8) for low values of r) has been
subsequently obtained by Mesnager in [849] and slightly later by Lobanov in [800], who
gives a general proof for all these bounds, which we recall below. Precisions on the bounds,
involving the maximum between the minimal algebraic degree of the nonzero annihilators
of $f$ and the minimal algebraic degree of the nonzero annihilators of $f \oplus 1$, have been also
given in [998].
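The three lower bounds are easy to compare numerically. The following sketch simply evaluates the right-hand sides of (9.7), (9.8) and (9.9) for given $n$, $AI(f)$ and $r$; binomial coefficients with a negative lower index are taken to be zero, and the function names are hypothetical.

```python
from math import comb

def bound_9_7(n, ai, r):
    return sum(comb(n, i) for i in range(ai - r))

def bound_9_8(n, ai, r):
    return 2 * sum(comb(n - r, i) for i in range(ai - r))

def bound_9_9(n, ai, r):
    # Second sum runs over i = AI - 2r, ..., AI - r - 1 (negative i contribute 0).
    extra = sum(comb(n - r, i) for i in range(max(ai - 2 * r, 0), ai - r))
    return bound_9_7(n, ai, r) + extra

n, ai = 10, 5                     # optimal algebraic immunity for n = 10
for r in (1, 2):
    print(r, bound_9_7(n, ai, r), bound_9_8(n, ai, r), bound_9_9(n, ai, r))
```

For $r = 1$ the values of (9.8) and (9.9) coincide, by Pascal's rule, while for $r = 2$ the value of (9.9) is the largest of the three in this example.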

Here is Lobanov’s general proof: Bound (9.4), page 329, and the observations that follow
it imply that, for every n-variable Boolean function $f$ and every positive integer $r \le n$, we
have⁶
$$nl_r(f) \ge \min_{h\in BF_n,\ d_{alg}(h)\le r} \dim(B_{AI(f)-1,AI(f)-1}(h)). \qquad (9.10)$$

Then, if $d_{alg}(h) = r$:

• $\dim(B_{k,k}(h)) \ge \sum_{i=0}^{k-r}\binom{n}{i}$, because all n-variable functions of algebraic degree at most
$k - r$ belong to $B_{k,k}(h)$; then (9.10) implies (9.7).
• $\dim(B_{k,k}(h)) \ge 2\sum_{i=0}^{k-r}\binom{n-r}{i}$, because, if $\prod_{i\in I} x_i$ is a monomial of degree $r$ in the
ANF of $h$, then all n-variable functions of the form $hg_1 \oplus (h \oplus 1)g_2$ where $g_1, g_2$
have algebraic degree at most $k - r$ and depend only on variables $x_i$, $i \notin I$, belong to
$B_{k,k}(h)$ and are distinct since the linear mapping $(g_1, g_2) \mapsto hg_1 \oplus (h \oplus 1)g_2$ has trivial
kernel, because $hg_1 \oplus (h \oplus 1)g_2 = 0$ if and only if $hg_1 = (h \oplus 1)g_2 = 0$; then (9.10)
implies (9.8).
• $\dim(B_{k,k}(h)) \ge \sum_{i=0}^{k-r}\binom{n}{i} + \sum_{i=k-2r+1}^{k-r}\binom{n-r}{i}$, because, if $\prod_{i\in I} x_i$ is a monomial of
degree $r$ in the ANF of $h$, then all n-variable functions of the form $g_1 \oplus hg_2$, where
$g_1, g_2$ have algebraic degree at most $k - r$ and $g_2$ depends only on variables $x_i$, $i \notin I$,
and has only monomials of degree at least $k - 2r + 1$, belong to $B_{k,k}(h)$ and are
distinct since the linear mapping $(g_1, g_2) \mapsto g_1 \oplus hg_2$ has trivial kernel; then (9.10)
implies (9.9).

An obvious upper bound on the higher-order nonlinearity exists that also involves the
algebraic immunity, as observed in [227]: if $AI(f) \le r$ and if $f$ is balanced, then we have
$nl_r(f) \le 2^{n-1} - 2^{n-r}$, since by hypothesis, there exists a nonzero function $g$ of algebraic
degree at most $r$ such that $g \preceq f$ or $g \preceq f \oplus 1$, and $g$ being nonzero and belonging to the
Reed–Muller code of order $r$, it has Hamming weight at least the minimum distance of this
code, that is, $2^{n-r}$. If $g \preceq f$, for instance, then $d_H(f, g) = w_H(f \oplus g) = w_H(f) - w_H(g) \le
2^{n-1} - 2^{n-r}$ and $nl_r(f) \le 2^{n-1} - 2^{n-r}$.
A bound between $nl_r(f)$ and $FAI(f)$ has been also given in [1106]:
$nl_r(f) \ge \sum_{i=0}^{\lfloor (FAI(f)-r)/2 \rfloor}\binom{n}{i}$, but the proof has several shortcomings and the result seems false. For
instance, with the help of a computer, we can check that the unbalanced function in [1067]
with $n = 6$ has FAI 6 and the result above would imply that there exists a function with
second-order nonlinearity at least 22, but it is known that the covering radius of RM(2,6) is
18. In fact, $FAI(f) - r$ should be $FAI(f) - r - 1$ in this bound, but even if such correction
is made, the proof does not address all issues. Such a result is important to show that some
functions cannot have good behavior against fast algebraic attacks, like functions obtained
by modifying bent functions (e.g., those of [1091]). We give in the next theorem a corrected
result (the first bound in Theorem 22 is more or less the only interesting one; we include
also the second to give a correct alternative to a bound given in [1106] and to show what
were the difficulties missed by its proof).

6 A slightly more complex bound is deduced in [801] from the first bound in Proposition 137, which allows one
to improve upon lower bounds (9.8) and (9.9) in some subcases.

Theorem 22 For any positive integer $n$ and any nonnegative integer $r \le n$, let $f$ be any
n-variable function and $k = \min\{d_{alg}(g) + d_{alg}(fg);\ g \ne 0\}$. We have then

$$nl_r(f) \ge \sum_{i=0}^{\lfloor\frac{k-r-1}{2}\rfloor}\binom{n}{i}.$$

Moreover, if $nl_r(f) \ne 0$ and if $AI(f) > AI(f \oplus h)$ for at least one function $h$ of algebraic
degree at most $r$ such that $d_H(f, h) = nl_r(f)$, then
$$nl_r(f) \ge \sum_{i=0}^{\lfloor\frac{FAI(f)-r-1}{2}\rfloor}\binom{n}{i}.$$

Proof Suppose first that $nl_r(f) < \sum_{i=0}^{\lfloor\frac{k-r-1}{2}\rfloor}\binom{n}{i}$. Let $h$ be a Boolean function of algebraic
degree at most $r$ whose Hamming distance $w_H(f \oplus h)$ to $f$ equals $nl_r(f)$. Since $f \oplus h$
has Hamming weight strictly smaller than $\sum_{i=0}^{\lfloor\frac{k-r-1}{2}\rfloor}\binom{n}{i}$, the rank of matrix $M_{f\oplus h,\lfloor\frac{k-r-1}{2}\rfloor}$
is also strictly smaller and there exists a nonzero annihilator $g$ of $f \oplus h$ whose algebraic
degree is at most $\lfloor\frac{k-r-1}{2}\rfloor$. We have then $fg = hg$ with $g \ne 0$ and $d_{alg}(g) + d_{alg}(fg) =
d_{alg}(g) + d_{alg}(hg) \le 2\lfloor\frac{k-r-1}{2}\rfloor + r < k$, a contradiction.
We address now the second bound. Suppose that $nl_r(f) < \sum_{i=0}^{\lfloor\frac{FAI(f)-r-1}{2}\rfloor}\binom{n}{i}$ and let us
fix $h$ of algebraic degree at most $r$ such that $AI(f) > AI(f \oplus h)$. For such $h$, similarly to
above, there exist annihilators $g \ne 0$ of $f \oplus h$ such that $d_{alg}(g) \le \lfloor\frac{FAI(f)-r-1}{2}\rfloor$ and one
of these annihilators at least has algebraic degree $AI(f \oplus h) < AI(f)$. Then:
– If one of these annihilators equals constant function 1, then $f = h$ and therefore
$nl_r(f) = 0$, a contradiction.
– In the other case, we arrive at a contradiction as above.

For instance, if $k$ is close to $n$ and $r = n/2$, we have $nl_{n/2}(f) \ge \sum_{i=0}^{\lambda n}\binom{n}{i} \ge \frac{2^{nH_2(\lambda)}}{\sqrt{8n\lambda(1-\lambda)}}$,
where $\lambda \approx \frac{1}{4}$ (cf. [809, page 310]), and where $H_2(x) = -x\log_2(x) - (1 - x)\log_2(1 - x)$
is the entropy function, whose value at $\frac{1}{4}$ equals $\frac{1}{2} + \frac{3}{4}(2 - \log_2(3)) = 2 - \frac{3}{4}\log_2(3) \approx
0.8$. Note that $2^{\frac{n}{2}-1}$ is then negligible with respect to $\frac{2^{nH_2(\lambda)}}{\sqrt{8n\lambda(1-\lambda)}}$ (this will play a role at
page 339).
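The orders of magnitude in this estimate can be checked numerically. The sketch below is an illustration with arbitrarily chosen parameters ($n = 64$ and $\lambda = 1/4$, so that $\lambda n$ is an integer); it evaluates the entropy at $1/4$, the partial binomial sum, the entropy lower bound and the quantity $2^{n/2-1}$.

```python
from math import comb, log2, sqrt

def entropy(x):
    """Binary entropy function H_2(x) for 0 < x < 1."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

n, lam = 64, 0.25
h_quarter = entropy(lam)
partial_sum = sum(comb(n, i) for i in range(int(lam * n) + 1))
entropy_bound = 2 ** (n * h_quarter) / sqrt(8 * n * lam * (1 - lam))
bent_gap = 2 ** (n // 2 - 1)      # the term 2^(n/2 - 1)

print(round(h_quarter, 4))
print(partial_sum >= entropy_bound, bent_gap / entropy_bound)
```

Already for $n = 64$ the ratio printed on the last line is very small, which illustrates why $2^{n/2-1}$ is negligible in this regime.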

Remark. If $nl_r(f) \ne 0$, the condition $g \ne 0$ in the definition of $k$ can be replaced by
$d_{alg}(g) \ge 1$. Indeed, if there is no other nonzero annihilator of $f \oplus h$ of algebraic degree
at most $\lfloor\frac{k-r-1}{2}\rfloor$ than $g = 1$, this means that $k \le r + 1$. If $k \le r$, then $\lfloor\frac{k-r-1}{2}\rfloor < 0$ and
the result holds, and if $k = r + 1$, then the only case where the bound would not hold is if
$nl_r(f) = 0$, which is excluded.

9.1.2 The problem of finding functions achieving high algebraic immunity and high nonlinearity
Recall that, in the framework of stream ciphers, we do not have security proofs but we
need functions allowing resistance to all known attacks and having enough randomness to
hope that they will not be too weak against new attacks. These functions must be as quickly
computable as possible.
No known primary construction viewed in Chapters 5, 6, and 7 allows obtaining classes
of functions satisfying all important criteria, and no secondary construction is known for
designing new functions satisfying all the criteria from already defined functions satisfying
them. We know, however, that functions achieving optimal or suboptimal algebraic immunity
and at the same time high algebraic degree and high nonlinearity must exist thanks to the
results of [437, 1000]. But knowing that almost all functions have high algebraic immunity
does not mean that constructing such functions is easy.
Lobanov’s bound seen at page 331 does not ensure high enough nonlinearity:
• For $n$ even and $AI(f) = \frac{n}{2}$, it gives $nl(f) \ge 2^{n-1} - 2\binom{n-1}{\frac{n}{2}-1} = 2^{n-1} - \binom{n}{\frac{n}{2}}$, which is
much smaller than the best possible nonlinearity $2^{n-1} - 2^{\frac{n}{2}-1}$ and, more problematically,
much smaller than the asymptotic almost sure nonlinearity of Boolean functions, which
is, when $n$ tends to $\infty$, located in the neighborhood of $2^{n-1} - 2^{\frac{n}{2}-1}\sqrt{2n\ln 2}$ as we saw.
Until 2008, the best nonlinearity reached by the known functions with optimal AI was
that of the majority function and of the iterative construction (see more details below on
these functions): $2^{n-1} - \binom{n-1}{\frac{n}{2}} = 2^{n-1} - \frac{1}{2}\binom{n}{\frac{n}{2}}$ [409]. This was a little better than what
Lobanov’s bound gives, but insufficient.
• For $n$ odd and $AI(f) = \frac{n+1}{2}$, Lobanov’s bound gives $nl(f) \ge 2^{n-1} - \binom{n-1}{(n-1)/2} \approx
2^{n-1} - \frac{1}{2}\binom{n}{(n-1)/2}$, which is a little better than in the $n$ even case, but still far from the
average nonlinearity of Boolean functions. Until 2008, the best-known nonlinearity was
that of the majority function and matched this bound.
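To make the gaps concrete, the quantities in the two items above can be evaluated for small $n$. This is only a numerical sketch with hypothetical function names; the asymptotic value $2^{n-1} - 2^{n/2-1}\sqrt{2n\ln 2}$ is evaluated at a finite $n$ purely for comparison and should not be read as an exact statement for that $n$.

```python
from math import comb, log, sqrt

def lobanov_bound(n, ai):
    """2 * sum_{i=0}^{AI-2} C(n-1, i), the first-order bound recalled above."""
    return 2 * sum(comb(n - 1, i) for i in range(ai - 1))

def majority_nonlinearity_even(n):
    """2^(n-1) - C(n-1, n/2), the value quoted for n even."""
    return 2 ** (n - 1) - comb(n - 1, n // 2)

n = 10
lob_even = lobanov_bound(n, n // 2)
maj_even = majority_nonlinearity_even(n)
bent_even = 2 ** (n - 1) - 2 ** (n // 2 - 1)
typical_even = 2 ** (n - 1) - 2 ** (n / 2 - 1) * sqrt(2 * n * log(2))

lob_odd = lobanov_bound(9, 5)     # n = 9, AI = (n + 1) / 2
print(lob_even, maj_even, round(typical_even, 1), bent_even, lob_odd)
```

For $n = 10$ the four values are ordered as described in the first item: Lobanov's bound, then the majority function, then the typical value, then the bent bound.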
Efficient algorithms have been given in [27, 438, 439] for computing the algebraic
immunity, with respective complexities $O\left(\binom{n}{AI(f)}^2\right)$ and $O\left(n2^n\binom{n}{AI(f)}\right)$ (the latter
being slightly worse but on the other hand the amount of memory needed being smaller).
Algorithms for evaluating the immunity to fast algebraic attacks are also given in these
d+1 n n n 3 
references with complexity O e e e d + e , where e is significantly smaller than
 n n
AI (f ) and d is comparable to AI (f ), and O n2 k , where k is the degree of the algebraic
system to be solved in the last step of the attacks. They showed the poor resistance of the
majority function to FAA. In [997, 999], Rizomiliotis introduced three matrices to evaluate
the behavior of Boolean functions against fast algebraic attacks using univariate polynomial
representation. It was later shown in [794] that one matrix is enough.

9.1.3 The functions with high algebraic immunity found so far and their parameters
Sporadic functions
In [407], 7-variable rotation symmetric (RS) functions with nonlinearity 56, resiliency order
2, algebraic immunity 4, and a large number of 8-variable RS functions with nonlinearity

116, resiliency order 1, and algebraic immunity 4 are exhibited. These authors claimed there
exist such functions having good resistance against fast algebraic attacks, but Siegenthaler’s
bound shows that this resistance is limited and eight variables is small; rotation symmetry
presents also a risk that the attacker can use such a strong structure in specific attacks.
Balanced highly nonlinear functions in up to 20 variables (derived from power functions)
with high algebraic immunities have been exhibited in [279] and [27]. Some other interesting
ideas of constructions have been proposed, either using simulated annealing [836] (but the
number of variables is limited, the gain in terms of nonlinearity is not large, and, of course,
this cannot produce infinite classes) or using the genetic hill climbing algorithm, starting
from the function of Theorem 23 that we shall see at page 337 and applying a few swaps on
its truth table [694] (this can increase a little its nonlinearity, but it could not lead to infinite
classes either).
The algebraic immunity and nonlinearity of the 20-variable function used as nonlinear
filter in the lightweight stream cipher Hitag2 are calculated in [1031]. The algebraic
immunity is no more than 6.
Note that the construction of Proposition 85, page 236, allows increasing the complexity
of Boolean functions while keeping their high nonlinearities and may allow increasing their
algebraic immunity as well.

Primary constructions of infinite classes of functions, with insufficient nonlinearity


– The majority function, considered by Key et al. [691] in the context (equivalent to
that of algebraic immune functions) of the erasure channel, and rediscovered in the
context of algebraic immunity [127, 409], defined as f(x) = 1 if $w_H(x) \ge \frac{n}{2}$ and
f(x) = 0 otherwise,⁷ has optimal algebraic immunity. Note that, for n odd, Proposition
135 materializes, in the case of this function, in a rather simple way, since $(f \oplus 1)(x) =
f(x + 1_n)$ and f and f ⊕ 1 are then affine equivalent. The proof of its optimal algebraic
immunity is easy. We give it for n odd (the case n even is slightly more technical):
since an annihilator of f ⊕ 1 equals 0 at every input of Hamming weight at most
$\frac{n-1}{2}$, Relation (2.4), page 33, implies that its ANF has no term of degree at most $\frac{n-1}{2}$; a
nonzero annihilator must then have algebraic degree at least $\frac{n+1}{2}$. The majority function
is balanced when n is odd. It is a symmetric Boolean function (which can represent a
weakness but also allows using it with more variables while ensuring the same or even
a better speed), and when n is odd it is the only one with optimal AI, up to the addition
of function 1; see more at page 357. It has two main weaknesses: its nonlinearity is
weak⁸ and its resistance to fast algebraic attacks is bad too (as shown in [27], there exist
Boolean functions g ≠ 0 and h such that fg = h, where $d_{alg}(h) = \lfloor n/2 \rfloor + 1$ and
$d_{alg}(g) = d_{alg}(h) - 2^j$, where j is maximum so that this number is strictly positive).
The nonlinearity has been determined in [409]; the proof using Krawtchouk polynomials
is very technical and cannot be included here. A simpler proof has been given by Cusick
(and not published). There is a way of showing that the nonlinearity of the majority
function cannot be good (a way that is adaptable to many other functions similar to it): let us
take for instance n odd and apply the second-order Poisson formula (2.57), page 62, with
7 Changing $w_H(x) \ge \frac{n}{2}$ into $w_H(x) > \frac{n}{2}$ or $w_H(x) \le \frac{n}{2}$ or $w_H(x) < \frac{n}{2}$ changes the function into an affinely
equivalent one, up to addition of the constant 1, and therefore does not change the AI.
8 This crippling drawback is shared by all the classes of rotation symmetric functions (see definition in Section
10.2, page 360) with optimal AI presented in numerous papers, which are not mentioned in this book.
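$E = \{x \in \mathbb{F}_2^n;\ x \preceq b\}$, where b is a vector of Hamming weight $\frac{n-1}{2}$, and $E' = E^\perp =
\{x \in \mathbb{F}_2^n;\ x \preceq b + 1_n\}$. For every $a \in E'$, when x ranges over E, the Hamming weight of
a + x equals $w_H(a) + w_H(x)$ (since the two vectors have disjoint supports) and is larger
than or equal to $\frac{n+1}{2}$ if and only if $w_H(x) \ge \frac{n+1}{2} - w_H(a)$. Hence, we have

$$F(h_a) = \sum_{i=0}^{\frac{n-1}{2}-w_H(a)} \binom{\frac{n-1}{2}}{i} - \sum_{i=\frac{n+1}{2}-w_H(a)}^{\frac{n-1}{2}} \binom{\frac{n-1}{2}}{i} = \sum_{i=0}^{\frac{n-1}{2}-w_H(a)} \binom{\frac{n-1}{2}}{i} - \sum_{i=0}^{w_H(a)-1} \binom{\frac{n-1}{2}}{i}$$

$$= \begin{cases} \displaystyle\sum_{i=w_H(a)}^{\frac{n-1}{2}-w_H(a)} \binom{\frac{n-1}{2}}{i} & \text{if } w_H(a) \le \frac{n-1}{4} \\[2ex] \displaystyle -\sum_{i=\frac{n+1}{2}-w_H(a)}^{w_H(a)-1} \binom{\frac{n-1}{2}}{i} & \text{if } w_H(a) \ge \frac{n+3}{4}. \end{cases}$$

The absolute value of $F(h_a)$ is then larger than or equal to $\binom{\frac{n-1}{2}}{\lfloor \frac{n-1}{4} \rfloor}$, unless it is null,
that is, unless $\frac{n-1}{2} - w_H(a) = w_H(a) - 1$, which happens only if n ≡ 3 [mod 4]
and $w_H(a) = \frac{n+1}{4}$. Since $b + 1_n$ has Hamming weight $\frac{n+1}{2}$, the size of $E'$ equals
$2^{\frac{n+1}{2}}$. According to (2.57), the arithmetic mean of $W_f^2(u)$ when $u \in E^\perp$ is then at
least $\left(2^{\frac{n+1}{2}} - 1\right) \binom{\frac{n-1}{2}}{\lfloor \frac{n-1}{4} \rfloor}^2 \approx 2^{\frac{3n-1}{2}}$ (recall that the Stirling formula implies that $\binom{n}{n/2} \sim \frac{2^n}{\sqrt{\pi n/2}}$),
and therefore nl(f) is smaller than or equal to approximately $2^{n-1} - 2^{\frac{3n-3}{4}}$.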
Some variants of the majority function have also optimal algebraic immunity and are
balanced for n even, but they have more or less the same drawbacks.
It is proved in [127] that, for n even, changing the value at $1_n$ of the majority function
preserves its optimal algebraic immunity, as well as, for n ≥ 8, changing its values
at the inputs of Hamming weights $\frac{n}{2} \pm 4$ and, for n ≥ 10, making these two changes
simultaneously. All such functions happen to be weak against FAA, as shown in [27].
– An iterative construction of an infinite class of functions with optimal algebraic
immunity has been given in [408] and further studied in [261]; however, the functions it
produces are neither balanced (which can be fixed) nor highly nonlinear (which cannot,
unless many variables are added), and it is weak against fast algebraic attacks, as also
shown in [27].
– More numerous functions with optimal algebraic immunity were given in [230]. Among
them are functions with better nonlinearities, but the method did not allow reaching high
nonlinearities (see [328]) and some functions constructed in [766, 767] seem still worse
from this viewpoint. In [929], Pasalic introduced an iterative concatenation method for
constructing maximum AI functions with suboptimal FAI, but the nonlinearity of the
resulting functions was insufficient. Hence, the question of designing infinite classes of
functions achieving all the necessary criteria remained open after these papers.
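
For small n, the statements made in this list about the majority function can be checked numerically. The following Python sketch is only an illustration: the function names are ours, and the algebraic immunity is obtained by brute-force linear algebra over F_2 on the matrix of monomial evaluations, which is much less efficient than the algorithms of [27, 438, 439]. The nonlinearity is computed with a fast Walsh–Hadamard transform.

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of rows encoded as Python ints (bit j = column j)."""
    basis = {}                            # leading bit -> reduced row
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in basis:
                basis[top] = r
                break
            r ^= basis[top]
    return len(basis)

def set_ai(points, n):
    """Least degree of a nonzero n-variable function vanishing on `points`."""
    for d in range(n + 1):
        monos = [sum(1 << i for i in c)
                 for k in range(d + 1) for c in combinations(range(n), k)]
        # Row of x: values taken at x by all monomials of degree <= d.
        rows = [sum(1 << j for j, m in enumerate(monos) if m & x == m)
                for x in points]
        if gf2_rank(rows) < len(monos):   # nontrivial kernel = annihilator
            return d
    return n + 1   # our convention when only the zero function vanishes

def algebraic_immunity(tt, n):
    ones = [x for x in range(1 << n) if tt[x]]
    zeros = [x for x in range(1 << n) if not tt[x]]
    return min(set_ai(ones, n), set_ai(zeros, n))

def nonlinearity(tt, n):
    w = [1 - 2 * b for b in tt]           # (-1)^f(x)
    h = 1
    while h < len(w):                     # in-place Walsh-Hadamard butterfly
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return (1 << (n - 1)) - max(abs(v) for v in w) // 2

def majority(n):
    """Truth table of f(x) = 1 iff w_H(x) >= n/2."""
    return [int(2 * bin(x).count("1") >= n) for x in range(1 << n)]

for n in range(3, 8):
    tt = majority(n)
    print(n, algebraic_immunity(tt, n), nonlinearity(tt, n))
```

The printed algebraic immunities can be compared with the optimal value ⌈n/2⌉ and the nonlinearities with the values given by Lobanov's bound; the approach is only practical for small numbers of variables.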

A first primary construction of an infinite class of functions satisfying all criteria


A function with optimal algebraic immunity, good immunity to fast algebraic attacks,
provably much better nonlinearity than the functions mentioned above, and, in fact,
according to computer investigations, quite sufficient nonlinearity has been exhibited in
2008 (five years after the invention of algebraic attacks) in [273]. This primary construction
is defined over the field F2n . It has been originally defined as the Boolean function whose
support equals $\{0, 1, \alpha, \ldots, \alpha^{2^{n-1}-2}\}$, where α is any primitive element of $\mathbb{F}_{2^n}$. This original
function is the Boolean (single-output) case of a class of vectorial functions studied in [500],
where the optimal algebraic immunity was proved. The contribution of [273] (which resulted
in the authors of the subsequent papers giving these functions the name of Carlet–Feng
functions) was to observe that all the cryptographic parameters of this function were good
(not only the algebraic immunity) and to provide a simpler proof of the optimal algebraic
immunity, which gave a better view of why it happens. The proof has been later slightly
simplified further in [242]. The authors of the papers that, soon after 2008, modified
this function in order to find more functions [1091, 1148] preferred using the function of
support $\{\alpha^s, \ldots, \alpha^{2^{n-1}+s-1}\}$, where s is some integer. Since the two definitions coincide for
$s = 2^{n-1} - 1$ up to addition of the constant function 1, and two different values of s give
linearly equivalent functions, the two definitions deal with essentially the same function. In
the next theorem, we take the modified definition.⁹

Theorem 23 [273, 500] For every positive integer n, every integer s, and every
primitive element α of $\mathbb{F}_{2^n}$, the balanced Boolean function over $\mathbb{F}_{2^n}$ whose support is
$\{\alpha^s, \ldots, \alpha^{2^{n-1}+s-1}\}$ has optimal algebraic degree n − 1 and optimal algebraic immunity
$\lceil \frac{n}{2} \rceil$.

Proof It is shown in [273] that the univariate representation of the original function equals
$1 + \sum_{i=1}^{2^n-2} \frac{\alpha^i}{(1+\alpha^i)^{1/2}} x^i$, where $u^{1/2} = u^{2^{n-1}}$, which shows that the algebraic degree of f
equals n − 1 (optimal for a balanced function). This proves the first property. Up to linear
equivalence, s can be taken equal to 0. Let g be any Boolean function of algebraic degree
strictly less than n and $g(x) = \sum_{i=0}^{2^n-2} g_i x^i$, $g_i \in \mathbb{F}_{2^n}$, its univariate representation in the
field $\mathbb{F}_{2^n}$ (since g has algebraic degree less than n, we have $g_{2^n-1} = 0$). Then:

$$\begin{pmatrix} g(1) \\ g(\alpha) \\ g(\alpha^2) \\ \vdots \\ g(\alpha^{2^n-2}) \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n-2} \\ 1 & \alpha^2 & \alpha^4 & \cdots & \alpha^{2(2^n-2)} \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ 1 & \alpha^{2^n-2} & \alpha^{2(2^n-2)} & \cdots & \alpha^{(2^n-2)(2^n-2)} \end{pmatrix} \times \begin{pmatrix} g_0 \\ g_1 \\ g_2 \\ \vdots \\ g_{2^n-2} \end{pmatrix}.$$

If g is an annihilator of f, then $g(1) = g(\alpha) = \cdots = g(\alpha^{2^{n-1}-1}) = 0$ and the coefficients
$g_0, \ldots, g_{2^n-2}$ then satisfy

$$\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n-2} \\ 1 & \alpha^2 & \alpha^4 & \cdots & \alpha^{2(2^n-2)} \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ 1 & \alpha^{2^{n-1}-1} & \alpha^{2(2^{n-1}-1)} & \cdots & \alpha^{(2^{n-1}-1)(2^n-2)} \end{pmatrix} \times \begin{pmatrix} g_0 \\ g_1 \\ g_2 \\ \vdots \\ g_{2^n-2} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

If at most $2^{n-1}$ of the $g_i$ are nonzero, then erasing $2^{n-1} - 1$ null coefficients (and the
corresponding matrix columns) from the system above leads to a homogeneous system of
9 This same function has been later rediscovered with another presentation by Q. Wang, J. Peng, H. Kan and X.
Xue in IEEE Transactions on Inf. Th., as shown in [238] (and in another paper published later by H. Chen, T.
Tian and W. Qi in DCC).
linear equations whose matrix is a $2^{n-1} \times 2^{n-1}$ Vandermonde matrix and is then nonsingular.
We have then proved that the vector $(g_0, \ldots, g_{2^n-2})$ is either null or has Hamming weight at
least $2^{n-1} + 1$ (in the framework of coding theory, this result is called the BCH bound). This
implies that any nonzero annihilator of f has algebraic degree at least $\lceil \frac{n}{2} \rceil$ (since otherwise,
the number of its nonzero coefficients would be at most $2^{n-1}$, because $\sum_{i=0}^{\lceil n/2 \rceil - 1} \binom{n}{i} \le
2^{n-1}$).
If g is an annihilator of f ⊕ 1, then we have $g(\alpha^i) = 0$ for every $i = 2^{n-1}, \ldots, 2^n - 2$,
and for the same reasons as above, the vector $(g_0, \ldots, g_{2^n-2})$ has Hamming weight at least
$2^{n-1}$. Moreover, suppose that function g has algebraic degree at most $\frac{n-1}{2}$ and that the vector
$(g_0, \ldots, g_{2^n-2})$ has Hamming weight $2^{n-1}$ exactly. Then n is odd and all the coefficients
$g_i$; $0 \le i \le 2^n - 2$, $w_2(i) \le (n-1)/2$, are nonzero¹⁰, but $g_0 \ne 0$ then contradicts g(0) = 0.
This completes the proof.
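
Theorem 23 can be illustrated on small fields. The sketch below is a hedged illustration, not the construction code of [273]: it assumes particular primitive polynomials (listed in `PRIMITIVE`, our choice), takes α to be the class of X, identifies field elements with integers through the polynomial basis, and reuses a brute-force computation of the algebraic immunity. All names are ours.

```python
from itertools import combinations

# Assumed primitive polynomials, bit k = coefficient of X^k
# (X^3+X+1, X^4+X+1, X^5+X^2+1, X^6+X+1).
PRIMITIVE = {3: 0b1011, 4: 0b10011, 5: 0b100101, 6: 0b1000011}

def carlet_feng(n, poly, s=0):
    """Truth table of the function with support {alpha^s, ..., alpha^(2^(n-1)+s-1)}."""
    order = (1 << n) - 1
    tt = [0] * (1 << n)
    x = 1                                  # x = alpha^k, starting with k = 0
    for k in range(order):
        if (k - s) % order < (1 << (n - 1)):
            tt[x] = 1
        x <<= 1                            # multiply by alpha = X
        if x >> n:
            x ^= poly                      # reduce modulo the polynomial
    return tt

def gf2_rank(rows):
    basis = {}
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in basis:
                basis[top] = r
                break
            r ^= basis[top]
    return len(basis)

def set_ai(points, n):
    """Least degree of a nonzero function vanishing on `points`."""
    for d in range(n + 1):
        monos = [sum(1 << i for i in c)
                 for k in range(d + 1) for c in combinations(range(n), k)]
        rows = [sum(1 << j for j, m in enumerate(monos) if m & x == m)
                for x in points]
        if gf2_rank(rows) < len(monos):
            return d
    return n + 1

def algebraic_immunity(tt, n):
    return min(set_ai([x for x in range(1 << n) if tt[x]], n),
               set_ai([x for x in range(1 << n) if not tt[x]], n))

def algebraic_degree(tt, n):
    anf = list(tt)
    for i in range(n):                     # binary Moebius transform
        for x in range(1 << n):
            if (x >> i) & 1:
                anf[x] ^= anf[x ^ (1 << i)]
    return max((bin(u).count("1") for u in range(1 << n) if anf[u]), default=0)

for n, poly in PRIMITIVE.items():
    tt = carlet_feng(n, poly)
    print(n, sum(tt), algebraic_degree(tt, n), algebraic_immunity(tt, n))
```

Each printed line gives n, the Hamming weight of the function, its algebraic degree, and its algebraic immunity, to be compared with $2^{n-1}$, n − 1, and ⌈n/2⌉, respectively.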

The nonlinearity of the function is also good, at least for values of n for which the function
can be used in stream ciphers. In fact, this nonlinearity had been previously studied in [129]
(but the algebraic immunity was not considered there) and a lower bound on the nonlinearity
was shown, similar to the one later given in [273]:
$$nl(f) \ge 2^{n-1} - n \cdot \ln 2 \cdot 2^{\frac{n}{2}} - 1. \qquad (9.11)$$
Bound (9.11) is not sufficient for showing that f has good nonlinearity. It has been improved
several times, but the improvements are marginal and insufficient for asserting that the
function allows resisting the fast correlation attack. The actual values of the nonlinearity
have been computed up to n = 26 and happen to be very good and quite sufficient for such
a resistance. Note that the nonlinearity depends on the choice of the primitive element α
and the bounds mentioned above are in fact bounds on the minimum Hamming distance
between f and all functions of the form $tr(ax^j + b)$ where j is coprime with $2^n - 1$, which
we can call the hyper-nonlinearity (in relation to the notion of hyper-bent function seen in
Definition 57, page 244). It is an open question to determine whether a significantly better
lower bound on the hyper-nonlinearity of f can be proved (some ideas are given in [248,
subsection 4.2]) or if the gap between the bound and the actual hyper-nonlinearity reduces
when n takes values larger than 26.
The good resistance to fast algebraic attacks has first been checked by computer for
n ≤ 12, using an algorithm from [27], and later shown mathematically in [793] for all n:

Proposition 140 Let e be a positive integer less than $\frac{n}{2}$ and f be the function of Theorem
23. Then, if $\binom{n-1}{e}$ is even, there exists no nonzero function g with algebraic degree at most
e such that fg has algebraic degree at most n − e − 1, and if $\binom{n-1}{e}$ is odd, there exists no
nonzero function g with degree at most e such that fg has degree at most n − e − 2.

In particular, f is PAI (see page 322) when n is a power of 2, plus 1 (this was known
before only for n = 3, 5, 9).

10 There is a small inaccuracy in what is written in the proofs provided in [236, 242, 273] since the $g_i$ are not
necessarily in $\mathbb{F}_2$.
The computation of the function of Theorem 23 is reasonably fast, at least for some values
of n ≤ 20. This may seem surprising, because the complexity of its computation is clearly
the same as that of the discrete logarithm, which is known to be asymptotically high (this has
led to a whole branch of public-key cryptography), but for small values of n (like n ≤ 20),
the function is fast to compute, all the more so if $2^n - 1$ is a product of small factors
(this is the case for n = 18 and n = 20, for instance), because this allows using the Pohlig–Hellman
algorithm; for these two values of n, computing one output bit per cycle is possible
with 40,000 transistors, as observed in [238]. This avoids the need for a look-up
table (of about one megabit, which is too heavy for some devices) for computing the output
of the function.
Hence, the functions of this class gather all the properties needed for the stream
ciphers using them as filtering functions to resist all the main attacks (the Berlekamp–
Massey [BM] algorithm, fast correlation attacks, standard and fast algebraic attacks, and
Rønjom–Helleseth attacks).

Modifications of the functions of Theorem 23


– Classes of functions have been proposed, obtained by replacing a part of the support of
the function by another part of the same size. In [997], Rizomiliotis proposed a matrix
approach (instead of the BCH bound) for proving optimal AI, and the (balanced) function
 n2 −1 n  n2 −1 n  n2  n
of support {1, α, . . . , α i=1 ( i ) } ∪ U , where U ⊂ {α i=0 ( i ) , . . . , α 1+ i=0 ( i ) } has
 n2 −1 n
size 2n−1 − i=0 i , is proved to have optimal AI. In [1148], three classes based on
the same method are proposed; for some values of n, better nonlinearity than with the
function of Theorem 23 could be reached, and for other values of n, the nonlinearity
is worse. The good resistance to FAA has been checked by computer for small
values of n.
– Another kind of modification of this same function has been proposed in [1091]. It is
based on the $PS_{ap}$ construction. The so-called Tu–Deng function is the 2n-variable
function defined over $\mathbb{F}_{2^n}^2$ mapping (x, y) to $f(xy^{2^n-2})$, where f is the function of
Theorem 23. Note that $xy^{2^n-2}$ equals $\frac{x}{y}$ when y ≠ 0 and is null when y = 0. Since
f is balanced, the Tu–Deng function is bent (and therefore has optimal nonlinearity
$2^{2n-1} - 2^{n-1}$) as we saw at page 213. Moreover, its AI has optimal value n, up to a
combinatorial conjecture, which was still an open problem (studied in [513] and other
papers but not solved yet) as this book was written, but which has been checked up to
n = 29; this is quite enough in cryptographic context, since n = 29 makes 58 variables.
We know that bent functions are not balanced, but it is shown in [1091] that modifying
$2^{n-1}$ output values of the Tu–Deng function can give a balanced function with optimal
AI and very large nonlinearity.
Unfortunately, the resulting balanced function lies then at Hamming distance at most
$2^{n-1}$ from the Reed–Muller code of order n and length $2^{2n}$ (the set of Boolean functions
in 2n variables of algebraic degree at most n), since because of Theorem 13, page 200,
any 2n-variable bent function has algebraic degree at most n. According to Theorem 22,
page 332, and to the observation that follows it, applied with 2n instead of n and with
r = n, the balanced function is weak against fast algebraic attacks (see more precise
calculations in [1106, lemmas 1–2]), as are the 1-resilient functions obtained from it in
some papers by modifying a few terms.
The Tu–Deng construction has been generalized to vectorial functions in [501].
– The Tu–Deng function has been modified in [1067] into a class of 2n-variable functions
having the same nice properties as the function of Theorem 23. As recalled in the survey
[242]:

Proposition 141 [1067] Let $n = 2^r m \ge 2$, where r ≥ 0 and m > 0 is odd, and let f be
the function of Theorem 23. We consider the functions

$$f_1(x, y) = f(xy); \quad x, y \in \mathbb{F}_{2^n}, \qquad (9.12)$$

$$f_2(x, y) = \begin{cases} f_1(x, y), & x \ne 0 \\ u(y), & x = 0 \end{cases} \qquad (9.13)$$

where u is a balanced Boolean function on $\mathbb{F}_{2^n}$ satisfying u(0) = 0, deg(u) = n − 1, and
$\max_{a \in \mathbb{F}_{2^n}} |W_u(a)| \le 2^{\frac{m+1}{2}}$ if r = 0 and $\max_{a \in \mathbb{F}_{2^n}} |W_u(a)| \le \sum_{i=1}^{r} 2^{\frac{n}{2^i}} + 2^{\frac{m+1}{2}}$ if r ≥ 1. Then
$f_2$ is balanced, $f_1$ and $f_2$ have optimal AI (equal to n), $f_1$ has algebraic degree 2n − 2, $f_2$
has algebraic degree 2n − 1, and $nl(f_1) > 2^{2n-1} - \left(\frac{\ln 2}{\pi} n + 0.42\right) 2^n - 1$.

There is a little more complex lower bound on nl(f2 ), first given in [1067] and later
improved in [1108]; it is slightly smaller than for nl(f1 ). Function u does exist; see [1076,
1149].
The proof of optimal AI is obtained up to a conjecture similar to that of Tu–Deng, but
slightly different, which has been finally proved in [374]. The same gap between the bound
on the nonlinearity of f2 and its actual values is observed when computing them up to
n = 19; see [1067]. The nonlinearities of f2 and f are similar when they are taken with
the same numbers of variables; in some cases, nl(f ) is better, and in some cases, nl(f2 )
is better. The good behavior of f2 with respect to FAA has been shown mathematically in
[794].
In [789, 790], the authors introduced a larger class of functions achieving optimal
algebraic immunity and almost perfect immunity to fast algebraic attacks. The exact
nonlinearity of some functions of this larger class is good (slightly smaller than that of
Carlet–Feng functions), and some functions of this family have a slightly larger nonlinearity
than those of [1067] with the same numbers of variables. The class of [789, 790] also
contains a class presented in [644], whose resistance to fast algebraic attacks is also studied
in [789, 790, 794], without a clearly positive answer being obtained. The class of [1067] is
modified in [1068] to ensure first-order resiliency.
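
The bentness of the Tu–Deng function mentioned above can be observed on small cases. The following sketch assumes a primitive polynomial of our choosing for each n, takes α to be the class of X, represents field elements as integers in the polynomial basis, and uses logarithm tables for the division; the names are illustrative and the code is not taken from [1091].

```python
def field_tables(n, poly):
    """Antilog list and log dict of GF(2^n) for alpha = X (poly assumed primitive)."""
    order = (1 << n) - 1
    exp, log = [0] * order, {}
    x = 1
    for k in range(order):
        exp[k], log[x] = x, k
        x <<= 1
        if x >> n:
            x ^= poly
    return exp, log

def tu_deng(n, poly, s=0):
    """Truth table of (x, y) -> f(x * y^(2^n - 2)), indexed by x * 2^n + y."""
    exp, log = field_tables(n, poly)
    order = (1 << n) - 1

    def f(z):                              # function of Theorem 23, f(0) = 0
        return int(z != 0 and (log[z] - s) % order < (1 << (n - 1)))

    tt = []
    for x in range(1 << n):
        for y in range(1 << n):
            if x == 0 or y == 0:
                z = 0                      # x * y^(2^n - 2) is null when y = 0
            else:
                z = exp[(log[x] - log[y]) % order]   # equals x / y
            tt.append(f(z))
    return tt

def walsh_spectrum(tt):
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):                      # in-place Walsh-Hadamard butterfly
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

for n, poly in ((3, 0b1011), (4, 0b10011)):
    spectrum = walsh_spectrum(tu_deng(n, poly))
    print(2 * n, sorted(set(abs(v) for v in spectrum)))
```

A 2n-variable function is bent exactly when all its Walsh values have absolute value $2^n$, which is what the printed sets allow one to check.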

Other constructions
Constructions that we shall not detail are given in [761, 777, 997] and other papers, as well
as constructions in [587, 796, 1124, 1172] based on the decomposition of the multiplicative
group $\mathbb{F}_{2^n}^*$ corresponding to what we called polar representation at page 168 or more
general multiplicative decompositions.
A new method is proposed in [776], based on deriving new properties of minimal
codewords of the punctured Reed–Muller code $RM^*(\lfloor \frac{n-1}{2} \rfloor, n)$. Recall that we say that
a vector $(a_0, \ldots, a_{N-1}) \in \mathbb{F}_2^N$ is covered by a vector $(c_0, \ldots, c_{N-1}) \in \mathbb{F}_2^N$ if for every
i = 0, . . . , N − 1, we have $c_i = 0 \Rightarrow a_i = 0$, and that the codewords of cyclic codes are
represented by polynomials (see page 12).

Proposition 142 [776] Let n be an integer, $\alpha \in \mathbb{F}_{2^n}$ be a primitive element, and f be
the n-variable Boolean function with $supp(f) = \{\alpha^{m_0}, \ldots, \alpha^{m_s}\}$, where
$0 = m_0 < m_1 < \cdots < m_s < 2^n - 1$. Then f ⊕ 1 has no annihilator with algebraic degree less than $\lceil \frac{n}{2} \rceil$
if and only if there is no nonzero even weight codeword of the cyclic code $RM^*(\lfloor \frac{n-1}{2} \rfloor, n)$
covered by $c(x) = 1 + x^{m_1} + \cdots + x^{m_s}$.

This result allows generalizing the function of Theorem 23 for any n, and leads for n odd,
thanks to Proposition 135, to large classes of new functions with optimal algebraic immunity
and good behavior against fast algebraic attacks, and high nonlinearity.

9.1.4 Secondary constructions of algebraic immune functions


Algebraic immunity and direct sum
For any positive integers n, m, any n-variable function f , and any m-variable function g
depending on disjoint sets of variables, denoting r = max (dalg (f ), dalg (g)), we have
max(AI (f ), AI (g)) ≤ AI (f ⊕ g) ≤ min (AI (f ) + AI (g), r). (9.14)
Indeed, for some $\epsilon, \eta \in \mathbb{F}_2$, let h be a nonzero annihilator of algebraic degree AI(f) of
f ⊕ ε and k a nonzero annihilator of algebraic degree AI(g) of g ⊕ η; then the product of h
and k is a nonzero¹¹ annihilator of algebraic degree at most AI(f) + AI(g) of f ⊕ g ⊕ ε ⊕ η,
and we know also that $AI(f \oplus g) \le d_{alg}(f \oplus g)$. This proves the right-hand side inequality.
And if h is a nonzero annihilator of the (n + m)-variable function f ⊕ g, then at least one
of its restrictions obtained by fixing x (resp. y) in f (x) ⊕ g(y) is nonzero; this proves the
left-hand side inequality.

Remark. When the sum is not direct, the inequality AI (f ⊕ g) ≤ AI (f ) + AI (g) can
be false [227]: let h be an n-variable Boolean function and let l be an n-variable nonzero
linear function, then the functions f = hl and g = h(l ⊕ 1) have algebraic immunities at
most 1, since f (l ⊕ 1) = gl = 0, and their sum equals h. If AI (h) > 2, we obtain a counter
example.

Of course, the double inequality of (9.14) generalizes to the direct sum of more than
two functions. We have also FAI(f ⊕ g) ≥ max(FAI(f), FAI(g)) and FAC(f ⊕ g) ≥
max(FAC(f), FAC(g)). These inequalities are not valid if the sum of f and g is not direct.
The algebraic immunity of direct sums of monomials is studied in Section 10.3; see
Relation (10.6) at page 363. The upper bound in (9.14) is tight. It is shown in [305] that the
upper bound is achieved with equality when the function with the lower algebraic immunity
11 Thanks to the fact that h and k depend on disjoint sets of variables.
(in a broad sense) is nonconstant and the other function f and its complement f ⊕ 1 have
different nonzero annihilator minimum degrees (this is applied in particular to determine
the algebraic immunity of the direct sum of a threshold function, see page 358, and affine
functions). Another example where the upper bound is achieved with equality is with the
direct sum g of an n-variable function f and of a monomial m of degree AI(f) + 1; as
shown in [279], this gives indeed a function of algebraic immunity AI(f) + 1 because the
restriction $h_1$ to $\mathbb{F}_2^n \times \{0_{AI(f)+1}\}$ of a nonzero annihilator h of g of degree at most AI(f) is an
annihilator of f, which then either has algebraic degree AI(f) or is null, and in the former
case, gh = 0 is impossible because $mh_1$ has degree 2AI(f) + 1 while $m(h_1 \oplus h)$ has degree
at most 2AI(f) (since each monomial of $h \oplus h_1$ has at least one variable in common with
m), and in the latter case, fh cannot contain multiples of m and then cannot equal mh since
$d_{alg}(h) \le AI(f) < d_{alg}(m)$. We shall see at page 363 with triangular functions an example
of application.
Note that the upper bound in (9.14) shows that the direct sum of two functions can have
optimal algebraic immunity only if each has optimal algebraic immunity, except when both
are in odd dimension (an example of a direct sum with maximal algebraic immunity of two
functions not both having optimal algebraic immunity is function x1 ⊕ x2 x3 ⊕ x4 x5 x6 , of
algebraic immunity 3 in six variables, which is the direct sum of x1 ⊕ x2 x3 , of algebraic
immunity 2 in three variables, and of x4 x5 x6 , of algebraic immunity 1 in three variables).
The lower bound of (9.14) is also tight. An example where the lower bound is an equality
is with two functions f (x) and g(y) whose algebraic immunities equal their algebraic
degrees, since we have then max(dalg (f ), dalg (g)) = max(AI (f ), AI (g)) ≤ AI (f ⊕ g) ≤
dalg (f ⊕ g) = max(dalg (f ), dalg (g)). In [305], it is observed that if dalg (g) > 0, and if
(say) max (AI (f ), AI (g)) = AI (f ), and if f ⊕ 1 has no nonzero annihilator of algebraic
degree AI (f ), then the lower bound cannot be an equality.
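
The six-variable example given above for the upper bound of (9.14) can be verified by direct computation. The sketch below uses a brute-force search for annihilators; the helper names are ours and the method is only suitable for a few variables.

```python
from itertools import combinations

def gf2_rank(rows):
    basis = {}
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in basis:
                basis[top] = r
                break
            r ^= basis[top]
    return len(basis)

def set_ai(points, n):
    """Least degree of a nonzero n-variable function vanishing on `points`."""
    for d in range(n + 1):
        monos = [sum(1 << i for i in c)
                 for k in range(d + 1) for c in combinations(range(n), k)]
        rows = [sum(1 << j for j, m in enumerate(monos) if m & x == m)
                for x in points]
        if gf2_rank(rows) < len(monos):
            return d
    return n + 1

def algebraic_immunity(tt, n):
    return min(set_ai([x for x in range(1 << n) if tt[x]], n),
               set_ai([x for x in range(1 << n) if not tt[x]], n))

def truth_table(func, n):
    """Variable x_i is read from bit i-1 of the table index."""
    return [func(*[(x >> i) & 1 for i in range(n)]) for x in range(1 << n)]

f = truth_table(lambda x1, x2, x3: x1 ^ (x2 & x3), 3)
g = truth_table(lambda x4, x5, x6: x4 & x5 & x6, 3)
# Direct sum: the variables of f sit in the low bits, those of g in the high bits.
direct_sum = [f[x & 7] ^ g[x >> 3] for x in range(64)]

ai_f, ai_g = algebraic_immunity(f, 3), algebraic_immunity(g, 3)
ai_sum = algebraic_immunity(direct_sum, 6)
r = 3                                   # max(d_alg(f), d_alg(g)) = max(2, 3)
print(ai_f, ai_g, ai_sum)
print(max(ai_f, ai_g) <= ai_sum <= min(ai_f + ai_g, r))
```

Here the direct sum reaches the upper bound AI(f) + AI(g) although x4x5x6 is far from having optimal algebraic immunity in three variables.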

Algebraic immunity and Siegenthaler’s construction


Proposition 143 [261] Let f, g be two n-variable Boolean functions with $AI(f) = d_1$
and $AI(g) = d_2$. Let $h = (1 \oplus x_{n+1})f \oplus x_{n+1}g \in \mathcal{BF}_{n+1}$. Then:
1. If $d_1 \ne d_2$ then $AI(h) = \min\{d_1, d_2\} + 1$.
2. If $d_1 = d_2 = d$, then d ≤ AI(h) ≤ d + 1, and AI(h) = d if and only if there exist
$f_1, g_1 \in \mathcal{BF}_n$ of algebraic degree d such that $\{ff_1 = 0, gg_1 = 0\}$ or $\{(1 \oplus f)f_1 = 0,
(1 \oplus g)g_1 = 0\}$ and $d_{alg}(f_1 \oplus g_1) \le d - 1$.

Proof 1. If f has a nonzero annihilator $f_1$ of algebraic degree $d_1$, and g has a nonzero
annihilator $g_1$ of algebraic degree $d_2$, then we have $(1 \oplus x_{n+1})f_1 h = 0$ and $x_{n+1}g_1 h = 0$,
which proves, after addressing similarly the cases where $f_1$ is an annihilator of f ⊕ 1
and/or $g_1$ is an annihilator of g ⊕ 1, that AI(h) ≤ min{AI(f), AI(g)} + 1.
Let $p = (1 \oplus x_{n+1})p_1 \oplus x_{n+1}p_2$ be a lowest algebraic degree nonzero annihilator of
h. We have $hp = (1 \oplus x_{n+1})fp_1 \oplus x_{n+1}gp_2 = 0$. So $fp_1 = 0$ and $gp_2 = 0$. Similarly, if
p is an annihilator of h ⊕ 1, then $(1 \oplus f)p_1 = 0$ and $(1 \oplus g)p_2 = 0$. Now there can be
three cases in both scenarios:
(i) $p_1$ is zero and $p_2$ is nonzero, then $d_{alg}(p_2) \ge d_2$, which gives $d_{alg}(p) \ge d_2 + 1$.
(ii) $p_2$ is zero and $p_1$ is nonzero, then $d_{alg}(p_1) \ge d_1$, which gives $d_{alg}(p) \ge d_1 + 1$.
(iii) Both $p_1$, $p_2$ are nonzero, then $d_{alg}(p_1) \ge d_1$ and $d_{alg}(p_2) \ge d_2$, which gives
$d_{alg}(p) \ge \max\{d_1, d_2\} + 1$, when $d_1 \ne d_2$ and $d_{alg}(p) \ge d$, when $d_1 = d_2 = d$.
So for $d_1 \ne d_2$ we get $AI(h) \ge \min\{d_1, d_2\} + 1$.
2. According to the observations above, we have d ≤ AI(h) ≤ d + 1. And AI(h) equals d
if and only if we are in case (iii) and the degree d terms of $p_1$ and $p_2$ are the same.

Corollaries are given in [261], and more complex constructions are studied in [407].
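
As a small illustration of Proposition 143, the following sketch builds h from two 3-variable functions with different algebraic immunities and computes AI(h) by brute force. The example functions and all helper names are our own choices; the new variable $x_{n+1}$ is taken as the most significant bit of the truth-table index, so that the truth table of h is the concatenation of those of f and g.

```python
from itertools import combinations

def gf2_rank(rows):
    basis = {}
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in basis:
                basis[top] = r
                break
            r ^= basis[top]
    return len(basis)

def set_ai(points, n):
    """Least degree of a nonzero n-variable function vanishing on `points`."""
    for d in range(n + 1):
        monos = [sum(1 << i for i in c)
                 for k in range(d + 1) for c in combinations(range(n), k)]
        rows = [sum(1 << j for j, m in enumerate(monos) if m & x == m)
                for x in points]
        if gf2_rank(rows) < len(monos):
            return d
    return n + 1

def algebraic_immunity(tt, n):
    return min(set_ai([x for x in range(1 << n) if tt[x]], n),
               set_ai([x for x in range(1 << n) if not tt[x]], n))

def siegenthaler(f, g):
    """Truth table of (1 + x_{n+1}) f + x_{n+1} g, with x_{n+1} as top index bit."""
    return list(f) + list(g)

f = [x & 1 for x in range(8)]                                   # x1, AI = 1
g = [(x & 1) ^ ((x >> 1) & (x >> 2) & 1) for x in range(8)]     # x1 + x2 x3, AI = 2

d1, d2 = algebraic_immunity(f, 3), algebraic_immunity(g, 3)
h = siegenthaler(f, g)
print(d1, d2, algebraic_immunity(h, 4))       # case d1 != d2
print(algebraic_immunity(siegenthaler(g, g), 4))   # case d1 = d2 = d
```

In the first case the result can be compared with min{d1, d2} + 1, and in the second case with the interval [d, d + 1].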

9.1.5 Another direction of research of Boolean functions suitable for stream ciphers
All the functions described in Subsection 9.1.3 are of optimal algebraic immunity, and the
best ones have good other parameters, given their number of variables. They should be taken
with a number of variables large enough for ensuring sufficient resistance to all attacks but
also small enough for ensuring good speed. An alternative method is to find functions with
good but not optimal parameters, which would be quickly enough computable for being
used with larger numbers of variables, so as to ensure same (and possibly better) resistance
to attacks and also same and possibly better speed. The main example of this kind is with the
Boolean function (mentioned by Knuth in Vol. 4 of “The Art of Computer Programming”)
called hidden weight bit function (HWBF). The principle of this function is as follows: we
compute from the input $x = (x_1, \ldots, x_n) \in \mathbb{F}_2^n$ a value, say φ(x), belonging to {1, . . . , n},
and the output of the function is the value of the coordinate of index φ(x):

$$f(x_1, \ldots, x_n) = x_{\varphi(x)}, \quad \text{where } \varphi : \mathbb{F}_2^n \to \{1, \ldots, n\}.$$

If the computation of φ(x) is fast, then that of the Boolean function is fast. In the case
of HWBF, φ(x) equals $w_H(x)$ if $x \ne 0_n$ and $\varphi(0_n)$ equals any integer between 1 and
n (the value of $f(0_n)$ being 0 for any choice). It is proved in [1105] that the function is
then balanced and has algebraic degree n − 1 (optimal) for n ≥ 3 and that its algebraic
immunity is at least $\lfloor \frac{n}{3} \rfloor + 1$, which is quite good since the function can be taken in many
more variables than the function of Theorem 23, for instance. But the nonlinearity equals
$2^{n-1} - 2\binom{n-2}{\lceil \frac{n-2}{2} \rceil}$, which is not quite good, since this gives a too large bias of the nonlinearity
with respect to $2^{n-1}$, that is, $\epsilon = \frac{2^{n-1} - nl(f)}{2^n}$, and the complexity of the fast correlation
attack is then too small; see page 78. The too large value of ε is here not compensated by the
number of variables, but as shown in [1105], it can be reduced by making a direct sum with
a function with large nonlinearity (however, the direct sum represents some risk of attacks).
Nevertheless, more functions of this kind need to be investigated. Some attempts have been
made but with no significant gain.
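
The hidden weight bit function is easy to implement, which makes its parameters simple to explore. The sketch below (names are ours; coordinate $x_i$ is stored in bit i − 1 of the integer encoding x) builds its truth table and prints, for small n, whether it is balanced, its algebraic degree, its computed nonlinearity, and the value of the nonlinearity formula quoted above for comparison.

```python
from math import comb

def hwbf(n):
    """Truth table of f(x) = x_{w_H(x)}, with f(0) = 0."""
    tt = []
    for x in range(1 << n):
        w = bin(x).count("1")
        tt.append((x >> (w - 1)) & 1 if w else 0)
    return tt

def algebraic_degree(tt, n):
    anf = list(tt)
    for i in range(n):                     # binary Moebius transform
        for x in range(1 << n):
            if (x >> i) & 1:
                anf[x] ^= anf[x ^ (1 << i)]
    return max((bin(u).count("1") for u in range(1 << n) if anf[u]), default=0)

def nonlinearity(tt, n):
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):                      # in-place Walsh-Hadamard butterfly
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return (1 << (n - 1)) - max(abs(v) for v in w) // 2

def quoted_nonlinearity(n):
    # ceil((n - 2) / 2) equals (n - 1) // 2 for integers n >= 2
    return (1 << (n - 1)) - 2 * comb(n - 2, (n - 1) // 2)

for n in range(3, 11):
    tt = hwbf(n)
    print(n, sum(tt) == 1 << (n - 1), algebraic_degree(tt, n),
          nonlinearity(tt, n), quoted_nonlinearity(n))
```

Since evaluating the function only requires a Hamming weight and one bit selection, the truth table is never needed in practice; it is built here only to compute the parameters.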

9.1.6 An additional condition modifying the study of Boolean functions
for stream ciphers
As recalled in [324], a stronger condition than balancedness is necessary in the filter
model, if we wish to avoid additionally those attacks which are able, for some choice
of the tapping sequence (i.e., of the positions inside the LFSR where the inputs to the
filter function are taken), to distinguish the keystream $(s_i)_{i \in \mathbb{N}}$ output by the pseudorandom
generator from a random sequence, by the observation of the distribution of a vectorial
sequence of the form $(s_{i+j_1}, \ldots, s_{i+j_n})$; see page 89. We have seen that, for avoiding such
attacks, the filter function must have one of the two equivalent forms $x_1 \oplus f(x_2, \ldots, x_n)$
and $f(x_1, \ldots, x_{n-1}) \oplus x_n$ [189, 545, 1044]. Studying if a function of the desired form
$f(x_1, \ldots, x_{n-1}) \oplus x_n$ (say) satisfies the criteria listed above is not equivalent to the same
study for f (taking a function in n − 1 variables providing the best trade-off between all
criteria and adding the extra variable xn in order to obtain the desired form gives an algebraic
immunity that can be either equal to that of the original function or larger by 1, and it results
in functions that no longer ensure the best possible algebraic degree). The constructions in
Subsection 9.1.3 have been modified in [324] in order to achieve inside this desired form the
best possible values.
Constructions of 1-resilient algebraic immune functions have also been found, but
only in even dimension;¹² see, e.g., [261, 1055, 1070, 1076, 1092, 1111, 1115],
but many lack good nonlinearity and/or have bad resistance to FAA (some because
their $\frac{n}{2}$-th order nonlinearity is low) and the behavior of the others may not be
optimal.

9.2 Algebraic immune vectorial functions


We have seen at page 125 that algebraic attacks also concern vectorial functions used in
stream ciphers and in block ciphers. As far as we know, only standard algebraic attacks have
been considered in the literature for stream ciphers using vectorial functions (whose PRG
outputs several bits at each clock cycle), and fast algebraic attacks have no reality for
block ciphers. Different related notions of algebraic immunity exist for vectorial Boolean
functions, according to whether these functions are used as multioutput filters in stream
ciphers or as S-boxes in block ciphers. They have been studied in [29, 32, 235]. We first give
the definition of the algebraic immunity of a set:

Definition 74 We call annihilator of a subset E of $\mathbb{F}_2^n$ any n-variable Boolean function
vanishing on E. We call algebraic immunity of E, and we denote by AI(E), the minimum
algebraic degree of all nonzero annihilators of E.

The algebraic immunity of an n-variable Boolean function f is then equal to
$\min(AI(f^{-1}(0)), AI(f^{-1}(1)))$, according to Definition 23, page 91.
The first generalization of algebraic immunity to S-boxes, introduced in [29], is its direct
extension:

Definition 75 The basic algebraic immunity AI(F) of any (n, m)-function F is the
minimum algebraic immunity of all the preimages $F^{-1}(z)$ of the elements z of $\mathbb{F}_2^m$ by F.

12 One class of functions in odd dimension has first-order correlation immunity: the concatenation of the majority
function f in n even variables and of $f(x + 1_n)$.
The basic algebraic immunity is invariant under affine equivalence. Note that AI(F)
also equals the minimum algebraic immunity of the indicators of these preimages $F^{-1}(z)$
since, the algebraic immunity being a nondecreasing function over sets, we have for
every $z \in \mathbb{F}_2^m$

$$AI(\mathbb{F}_2^n \setminus F^{-1}(z)) \ge AI(F^{-1}(z')), \quad \forall z' \ne z.$$

$AI(F)$ quantifies the resistance to standard algebraic attacks of the stream ciphers using $F$
as a combiner or as a filter function. Indeed, the attacker can combine the output bits of the
generator in any way; in other words, the attacker can try a standard algebraic attack on any
stream cipher using the Boolean function $h \circ F$ as filter or combiner, where $h$ is any nonconstant
$m$-variable Boolean function, and such an attack is the most efficient when $h$ has Hamming
weight 1 (again because the algebraic immunity is a nondecreasing function over sets).
A second notion of algebraic immunity of vectorial functions [29, 235, 368, 392], more
relevant for S-boxes in block ciphers, has been called the graph algebraic immunity.

Definition 76 The graph algebraic immunity $AI_{gr}(F)$ of any $(n,m)$-function $F$ is the
algebraic immunity of the graph $\{(x, F(x));\ x \in \mathbb{F}_2^n\}$ of the S-box.

By definition, the graph algebraic immunity is invariant under CCZ equivalence.


A third notion, introduced in [235] and called the component algebraic immunity, seems
also natural:

Definition 77 The component algebraic immunity $AI_{comp}(F)$ of any $(n,m)$-function $F$ is
the minimal algebraic immunity of the component functions $v \cdot F$ ($v \ne 0_m$ in $\mathbb{F}_2^m$) of the
S-box.

The interest of $AI_{comp}$ is that it is meaningful for both stream ciphers and block
ciphers, and that it helps in studying the two other notions.
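The three notions can be compared on a toy example. The following is a minimal sketch, not an efficient implementation: it computes $AI(E)$ by brute-force linear algebra over $\mathbb{F}_2$ (a nonzero annihilator of degree at most $d$ exists if and only if the evaluation matrix of the monomials of degree at most $d$ on $E$ does not have full column rank). The names `gf2_rank`, `ai_set` and the small $(4,2)$-function `F` are illustrative assumptions and do not come from the text.

```python
from itertools import combinations, product

def gf2_rank(rows):
    """Rank over GF(2) of a list of rows encoded as integer bit masks."""
    basis = {}                      # leading bit position -> reduced row
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in basis:
                basis[top] = r
                break
            r ^= basis[top]
    return len(basis)

def ai_set(points, nvars):
    """AI(E): least degree of a nonzero annihilator of the set E = points."""
    for deg in range(nvars + 1):
        mons = [c for k in range(deg + 1) for c in combinations(range(nvars), k)]
        # One row per point: bit j is the value of the j-th monomial at that point.
        rows = [sum(1 << j for j, mon in enumerate(mons) if all(p[i] for i in mon))
                for p in points]
        if gf2_rank(rows) < len(mons):   # nontrivial kernel: an annihilator exists
            return deg
    raise ValueError("E is the whole space: it has no nonzero annihilator")

# Hypothetical (4,2)-function, chosen only for illustration.
n, m = 4, 2
def F(x):
    return (x[0] & x[1] ^ x[2] & x[3], x[0] & x[2] ^ x[1] ^ x[3])

domain = list(product((0, 1), repeat=n))

def dot(v, y):
    return sum(a & b for a, b in zip(v, y)) & 1

# Basic AI: minimum over the preimages F^{-1}(z).
ai_basic = min(ai_set([x for x in domain if F(x) == z], n)
               for z in product((0, 1), repeat=m))
# Component AI: minimum over v != 0 of AI(v.F), using both level sets of v.F.
ai_comp = min(ai_set([x for x in domain if dot(v, F(x)) == c], n)
              for v in product((0, 1), repeat=m) if any(v) for c in (0, 1))
# Graph AI: AI of the graph, viewed as a subset of F_2^(n+m).
ai_graph = ai_set([x + F(x) for x in domain], n + m)
print(ai_basic, ai_comp, ai_graph)
```

The approach is exponential in the number of variables and is only meant to make the definitions concrete for very small $n$ and $m$.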

9.2.1 Known bounds on algebraic immunities


Note that we have $AI(F) \le AI_{comp}(F)$, since $AI_{comp}(F)$ equals $AI(F^{-1}(H))$ for some
affine hyperplane $H$ of $\mathbb{F}_2^m$, and since $AI$ is nondecreasing; we also have $AI_{gr}(F) \le
AI_{comp}(F) + 1$, since if $g$ is a nonzero annihilator of $v \cdot F$, $v \ne 0_m$, then the product
$h(x, y) = g(x)\,(v \cdot y)$ is a nonzero annihilator of the graph of $F$, and if $g$ is a nonzero
annihilator of $v \cdot F \oplus 1$, then $h(x, y) = g(x)\,(v \cdot y) \oplus g(x)$ is a nonzero annihilator of the
graph of $F$. A few observations are deduced in [235].
It has been observed in [29] that, for any (n, m)-function F , we have

AI (F ) ≤ AIgr (F ) ≤ AI (F ) + m.

Indeed, given any minimum degree nonzero annihilator g(x, y) of the graph of F , there
exists y such that the function x → g(x, y) is not the zero function, and this function is
a nonzero annihilator of F −1 (y), which proves the left-hand side inequality. And given a
minimum degree nonzero annihilator g of F −1 (z), where z is such that AI (F −1 (z)) =
$AI(F)$, the function $g(x)\prod_{j=1}^{m}(y_j \oplus z_j \oplus 1)$ is an annihilator of algebraic degree $AI(F) + m$
of the graph of $F$; this proves the right-hand side inequality.
Denoting by $d_{n,m}$ the smallest integer such that $\sum_{i=0}^{d_{n,m}} \binom{n}{i} > 2^{n-m}$, we have

$$AI(F) \le d_{n,m} \le d_{n,m-1} \le \cdots \le d_{n,1} = \left\lceil \frac{n}{2} \right\rceil.$$

Indeed, there is at least one $z$ such that $|F^{-1}(z)| \le 2^{n-m}$ and, according to Relation (9.5),
page 330, with $f = 1_{F^{-1}(z)}$, we have $\sum_{i=0}^{AI(f)-1} \binom{n}{i} \le 2^{n-m}$ and therefore $AI(f) - 1 < d_{n,m}$.
Since $AI(F) \le AI(f)$, this proves the first inequality, originally observed in [29], and
proved tight in [500] thanks to the function that we shall introduce at page 350; the other
inequalities are straightforward.
We give in Table 9.1 taken from [235] the values of $d_{n,m}$, for $n$ ranging from 5 to 20 and
for $m$ ranging from 1 to 17.

Table 9.1 The values of $d_{n,m}$.

n\m   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  5   3   2   1   1   1   0
  6   3   2   2   1   1   1   0
  7   4   3   2   2   1   1   1   0
  8   4   3   2   2   1   1   1   1   0
  9   5   3   3   2   2   1   1   1   1   0
 10   5   4   3   3   2   2   1   1   1   1   0
 11   6   4   4   3   2   2   2   1   1   1   1   0
 12   6   5   4   3   3   2   2   2   1   1   1   1   0
 13   7   5   4   4   3   3   2   2   2   1   1   1   1   0
 14   7   6   5   4   4   3   3   2   2   2   1   1   1   1   0
 15   8   6   5   5   4   3   3   3   2   2   2   1   1   1   1   0
 16   8   7   6   5   4   4   3   3   2   2   2   1   1   1   1   1   0
 17   9   7   6   5   5   4   4   3   3   2   2   2   1   1   1   1   1
 18   9   8   7   6   5   5   4   4   3   3   2   2   2   1   1   1   1
 19  10   8   7   6   5   5   4   4   3   3   3   2   2   2   1   1   1
 20  10   8   7   7   6   5   5   4   4   3   3   3   2   2   2   1   1
Similarly, as also proved in [29], denoting by $D_{n,m}$ the smallest integer such that
$\sum_{i=0}^{D_{n,m}} \binom{n+m}{i} > 2^{n}$, we have

$$AI_{gr}(F) \le D_{n,m} \le D_{n,m-1} \le \cdots \le D_{n,1} = \left\lceil \frac{n+1}{2} \right\rceil.$$

Note that we have $D_{n,m} = d_{n+m,m}$. In [235] is given the table of the values of $D_{n,m}$, for $n$
ranging from 5 to 20 and for $m$ ranging from 1 to 17.
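The numbers $d_{n,m}$ and $D_{n,m}$ follow directly from their definitions, so the tabulated values are easy to recompute. A minimal sketch (the function names `d` and `D` are ours; the comparison is done on integers by multiplying both sides by $2^m$, so that the case $m > n$ is handled as well):

```python
from math import comb

def d(n, m):
    """Smallest integer k with sum_{i=0}^{k} C(n, i) > 2^(n-m)."""
    total, k = 0, 0
    while True:
        total += comb(n, k)
        if (total << m) > (1 << n):   # same test as total > 2^(n-m), in integers
            return k
        k += 1

def D(n, m):
    """Smallest integer k with sum_{i=0}^{k} C(n+m, i) > 2^n, i.e. d_{n+m,m}."""
    return d(n + m, m)

# Row n = 8 of the table of values of d_{n,m}, for m = 1, ..., 9.
print([d(8, m) for m in range(1, 10)])
# First values of D_{n,1}, which should equal ceil((n+1)/2).
print([D(n, 1) for n in range(5, 10)])
```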

9.2.2 Bounds on the numbers dn,m and Dn,m


We have $d_{n,m} \le n - m$ and $D_{n,m} \le n$, since $\sum_{i=0}^{n-m} \binom{n}{i} > \sum_{i=0}^{n-m} \binom{n-m}{i} = 2^{n-m}$. The bound
$d_{n,m} \le \lceil \frac{n}{2} \rceil$ is stronger than $d_{n,m} \le n - m$ if and only if $m < \lceil \frac{n-1}{2} \rceil$, and the bound
$D_{n,m} \le \lceil \frac{n+1}{2} \rceil$ is stronger than $D_{n,m} \le n$ if and only if $n \ge 3$. The inequality $D_{n,m} \le \lceil \frac{n+1}{2} \rceil$
gives $d_{n+m,m} \le \lceil \frac{n+1}{2} \rceil$ and therefore, for $n > m$: $d_{n,m} \le \lceil \frac{n-m+1}{2} \rceil$, which is stronger than
$d_{n,m} \le \lceil \frac{n}{2} \rceil$ and than $d_{n,m} \le n - m$.

Table 9.2 The values of $1 - H_2(\lambda)$.

λ              0.1    0.2    0.3    0.4
1 − H2(λ)      0.53   0.28   0.12   0.03

We know from [809, page 310] that, for any positive
number $\lambda \le 1/2$ and every positive integer $n$, we have $\sum_{i=0}^{\lambda n} \binom{n}{i} \ge \frac{2^{n H_2(\lambda)}}{\sqrt{8 \lambda n (1-\lambda)}}$. This bound
implies, for every $m$:

$$d_{n,m} \le \min\left\{ \lambda n \;/\; n H_2(\lambda) - \frac{1}{2}\big(3 + \log_2 n + \log_2 \lambda + \log_2(1-\lambda)\big) > n - m \right\}$$

(note that the term $\frac{1}{2}(3 + \log_2 n + \log_2 \lambda + \log_2(1-\lambda))$ is asymptotically negligible
with respect to $n$). Hence:

Proposition 144 [235] Let $\lambda \le 1/2$ be a positive real number. For all positive integers $n$
and $m$ such that

$$m > n\,(1 - H_2(\lambda)) + \frac{1}{2}\big(3 + \log_2 n + \log_2 \lambda + \log_2(1-\lambda)\big),$$

where $H_2(x) = -x \log_2(x) - (1-x)\log_2(1-x)$, we have $d_{n,m} \le \lambda n$.
For any two positive integers $n$ and $m$ such that

$$m\,H_2(\lambda) > n\,(1 - H_2(\lambda)) + \frac{1}{2}\big(3 + \log_2(n+m) + \log_2 \lambda + \log_2(1-\lambda)\big),$$

we have $D_{n,m} \le \lambda(n+m)$.

We give in Table 9.2 the values of 1 − H2 (λ) for λ ranging in {.1, .2, .3, .4}.
These general bounds can be improved for specific values of m.
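As a numerical illustration of the first part of Proposition 144, the sketch below evaluates the threshold on $m$ and checks the conclusion $d_{n,m} \le \lambda n$ against the exact value of $d_{n,m}$. It only tries values $\lambda = k/n$ for which $\lambda n$ is an integer, which is an assumption made here to stay within the range where the binomial estimate is classically stated; the helper names are ours.

```python
from math import comb, log2

def H2(x):
    """Binary entropy function (x = 1/2 gives 1)."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

def d(n, m):
    """Exact d_{n,m}: smallest k with sum_{i<=k} C(n, i) > 2^(n-m)."""
    total, k = 0, 0
    while True:
        total += comb(n, k)
        if (total << m) > (1 << n):
            return k
        k += 1

def m_threshold(n, lam):
    """Right-hand side of the first condition of Proposition 144."""
    return n * (1 - H2(lam)) + 0.5 * (3 + log2(n) + log2(lam) + log2(1 - lam))

# Example: n = 16 and lambda = 1/4, so that lambda * n = 4.
n, k = 16, 4
bound = m_threshold(n, k / n)
for m in range(1, n + 1):
    if m > bound:
        print(m, d(n, m), "<=", k)
```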

9.2.3 Consequences on the number of output bits and on the tightness of the bounds
$AI(F)$ can be larger than a number $k$ only if $m \le n\,(1 - H_2(k/n)) + \frac{1}{2}\big(3 + \log_2(k(1 - k/n))\big)$,
according to Proposition 144. Hence, vectorial $(n,m)$-functions can be used as combiners
or filters only if $m$ is small enough compared to $n$.
The bound AI (F ) ≤ AIgr (F ) is tight. Indeed:

Proposition 145 [235] Let $F$ be an $(n,m)$-function such that, for every $b \in \mathbb{F}_2^m$, there
exists $a \in \mathbb{F}_2^n$ such that the ordered pair $(a, b)$ is a linear structure of $F$ (i.e., $D_a F$ equals
constant function $b$). Then $AI(F) = AI_{gr}(F)$.

Proof Let $(e_1, \ldots, e_m)$ be the canonical basis of $\mathbb{F}_2^m$ and for every $i \le m$, let
$(\alpha_i, e_i)$ be a linear structure of $F$. Let $z$ be such that $AI(F) = AI(F^{-1}(z))$. We
can assume, without loss of generality up to translation, that $z = 0_m$. Let $g(x)$ be
a nonzero annihilator of algebraic degree $AI(F)$ of $F^{-1}(0_m)$. Then, let

$$h(x, y) = \sum_{b \in \mathbb{F}_2^m} \prod_{i=1}^{m} (y_i \oplus b_i \oplus 1)\; g\Big(x + \sum_{i=1}^{m} b_i \alpha_i\Big).$$

Note that $\prod_{i=1}^{m} (y_i \oplus b_i \oplus 1)$ equals 1 if and only if $y = b$; hence, for every $x \in \mathbb{F}_2^n$,
denoting by $I$ the support of the vector $F(x)$, we have $h(x, F(x)) = g(x + \sum_{i \in I} \alpha_i)$.
Since $F(x + \sum_{i \in I} \alpha_i) = F(x) + \sum_{i \in I} e_i = 0_m$, we have $x + \sum_{i \in I} \alpha_i \in F^{-1}(0_m)$ and
therefore $h(x, F(x)) = 0$ and $h$ is an annihilator of the graph of $F$.
Moreover, expanding $h(x, y)$ in the form $\sum_{J \subseteq \{1,\ldots,m\}} \big(\prod_{i \in J} y_i\big)\, \varphi_J(x)$, for every vector
$b \in \mathbb{F}_2^m$, denoting by $I$ the support of $b$, we have

$$\prod_{i=1}^{m} (y_i \oplus b_i \oplus 1) = \sum_{J \subseteq \{1,\ldots,m\} \,/\, I \subseteq J} \; \prod_{i \in J} y_i,$$

and then $\varphi_J(x) = \sum_{b \in \mathbb{F}_2^m \,/\, supp(b) \subseteq J} g\big(x + \sum_{i=1}^{m} b_i \alpha_i\big)$ is a derivative of $g$ of order $|J|$
and has an algebraic degree that is at most $d^{\circ} g - |J|$. Hence, we have $d^{\circ} h \le d^{\circ} g$
(and in fact $d^{\circ} h = d^{\circ} g$, since the part $\varphi_{\emptyset}$ independent of $y$ in $h(x, y)$ equals $g(x)$).
This implies $AI_{gr}(F) \le AI(F)$, and since we know that $AI(F) \le AI_{gr}(F)$, then
$AI(F) = AI_{gr}(F)$.

As seen above, we have two upper bounds on the graph algebraic immunity: AIgr (F ) ≤
AI (F ) + m and AIgr (F ) ≤ Dn,m . It is shown in [235] that the latter implies that the former
cannot be tight when AI (F ) > 0, m ≥ n/2 and n ≥ 3, nor when AI (F ) > 0, m ≥ n/3 and
n ≥ 25, and it is deduced that for n ≥ 2 and m ≥ n/3 we have dn,m ≤ m and for n ≥ 20
and m ≥ n/4, we have also dn,m ≤ m.
The vectorial functions studied in [273, 500] achieve the bound AI (F ) ≤ dn,m with
equality, which shows that this bound is tight for every n, m such that 1 ≤ m < n. It is not
known whether the bound AIgr (F ) ≤ Dn,m is tight too. It is shown in [29] that it is tight for
n ≤ 14.

9.2.4 Nonlinearity and higher-order nonlinearity


Lower bounds on the nonlinearity
As proved in [233], the lower bound $nl(f) \ge 2 \sum_{i=0}^{AI(f)-2} \binom{n-1}{i}$ due to Lobanov on the
nonlinearity of Boolean functions generalizes to $(n,m)$-functions as follows:

$$nl(F) \ge 2^m \sum_{i=0}^{AI(F)-2} \binom{n-1}{i},$$

where $AI(F)$ is the basic algebraic immunity of $F$. But we have seen that for large $m$,
$AI(F) - 2$ is negative. So a bound involving $AI_{gr}(F)$ is also needed. Applying Lobanov’s
bound to the component functions of $F$, we obtain

$$nl(F) \ge 2 \sum_{i=0}^{AI_{comp}(F)-2} \binom{n-1}{i}.$$

The inequality $AI_{comp}(F) \ge AI_{gr}(F) - 1$ implies then

$$nl(F) \ge 2 \sum_{i=0}^{AI_{gr}(F)-3} \binom{n-1}{i}.$$
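For $m = 1$ these bounds reduce to Lobanov's bound for Boolean functions, which can be checked on a small case. The sketch below evaluates the right-hand side and compares it with the nonlinearity, computed by brute force through the Walsh transform, of the majority function in 5 variables. The value 3 used for its algebraic immunity is an assumption taken from the fact that majority functions have optimal algebraic immunity $\lceil n/2 \rceil$; the function names are ours.

```python
from itertools import product
from math import comb

def lobanov_bound(n, ai, factor=2):
    """factor * sum_{i=0}^{ai-2} C(n-1, i); use factor = 2^m with the basic AI."""
    return factor * sum(comb(n - 1, i) for i in range(ai - 1))

def nonlinearity(f, n):
    """2^(n-1) minus half the maximum absolute value of the Walsh transform."""
    pts = list(product((0, 1), repeat=n))
    max_abs = max(
        abs(sum((-1) ** (f(x) ^ (sum(u_i & x_i for u_i, x_i in zip(u, x)) & 1))
                for x in pts))
        for u in pts)
    return 2 ** (n - 1) - max_abs // 2

def maj5(x):
    """Majority function in 5 variables."""
    return int(sum(x) >= 3)

print(nonlinearity(maj5, 5), ">=", lobanov_bound(5, 3))
```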

Lower bounds on the higher-order nonlinearities


For every positive integer $r$, the $r$-th order nonlinearity of a vectorial function $F$ is the
minimum $r$-th order nonlinearity of its component functions (recall that the $r$-th order
nonlinearity of a Boolean function equals its minimum Hamming distance to functions of
algebraic degree at most $r$). As proved in [233], the bounds known for Boolean functions
generalize to $(n,m)$-functions as follows:

$$nl_r(F) \ge 2^m \sum_{i=0}^{AI(F)-r-1} \binom{n-r}{i}$$

and

$$nl_r(F) \ge 2^{m-1} \sum_{i=0}^{AI(F)-r-1} \binom{n}{i} + 2^{m-1} \sum_{i=AI(F)-2r}^{AI(F)-r-1} \binom{n-r}{i}$$

(the first of these two bounds can be slightly improved as for Boolean functions).
Applying the bounds valid for Boolean functions to the component functions of $F$, we
have also

$$nl_r(F) \ge 2 \sum_{i=0}^{AI_{comp}(F)-r-1} \binom{n-r}{i}$$

and

$$nl_r(F) \ge \sum_{i=0}^{AI_{comp}(F)-r-1} \binom{n}{i} + \sum_{i=AI_{comp}(F)-2r}^{AI_{comp}(F)-r-1} \binom{n-r}{i}.$$

The inequality $AI_{comp}(F) \ge AI_{gr}(F) - 1$ implies then

$$nl_r(F) \ge 2 \sum_{i=0}^{AI_{gr}(F)-r-2} \binom{n-r}{i}$$

and

$$nl_r(F) \ge \sum_{i=0}^{AI_{gr}(F)-r-2} \binom{n}{i} + \sum_{i=AI_{gr}(F)-2r-1}^{AI_{gr}(F)-r-2} \binom{n-r}{i}.$$

9.2.5 Constructions of algebraic immune vectorial functions


Feng et al.’s class
In [274], the class introduced in [500] is studied further. We assume that $n \ge 2$ and $1 \le m \le n$.
For any fixed integer $s$, $0 \le s \le 2^n - 2$, $\mathbb{F}_{2^n}$ is a disjoint union of the following $2^m$ subsets:

$$S_0 = \{\alpha^l \mid s \le l \le s + 2^{n-m} - 2\} \cup \{0\}$$
$$S_j = \{\alpha^l \mid s + 2^{n-m} j - 1 \le l \le s + 2^{n-m}(j+1) - 2\};\quad 1 \le j \le 2^m - 1, \qquad (9.15)$$

where $\alpha$ is a primitive element. Each integer $j$, $0 \le j \le 2^m - 1$, has a 2-adic expansion
$j = j_0 + j_1 2 + \cdots + j_{m-1} 2^{m-1}$ ($j_0, \ldots, j_{m-1} \in \{0, 1\}$)
and corresponds to the vector $j = (j_0, \ldots, j_{m-1}) \in \mathbb{F}_2^m$. For each integer $i$, $0 \le i \le m-1$, we
define the Boolean function $f_i : \mathbb{F}_{2^n} \to \mathbb{F}_2$ by

$$f_i(x) = \begin{cases} 1, & \text{if } x \in \bigcup_{0 \le j \le 2^m - 1,\ j_i = 1} S_j \\ 0, & \text{otherwise.} \end{cases} \qquad (9.16)$$

Then for the $(n,m)$-function
$F = (f_0, \ldots, f_{m-1}) : \mathbb{F}_{2^n} \to \mathbb{F}_2^m$,
we have, for each $j = (j_0, \ldots, j_{m-1}) \in \mathbb{F}_2^m$ and $j = \sum_{i=0}^{m-1} j_i 2^i$,

$$x \in F^{-1}(j) \Leftrightarrow f_i(x) = j_i \ (0 \le i \le m-1)$$
$$\Leftrightarrow x \in \bigcup \{S_k \mid 0 \le k \le 2^m - 1,\ k_i = j_i\} \ (0 \le i \le m-1)$$
$$\Leftrightarrow x \in S_j.$$

Therefore the $(n,m)$-function $F$ can be characterized by

$$F^{-1}(j) = S_j \quad (\text{for each } j,\ 0 \le j \le 2^m - 1). \qquad (9.17)$$

It is proved in [274] by observation and calculation of the coefficients of $x^{2^n-1}$ and $x^{2^n-2}$ in
the univariate representations of the coordinate functions that:
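The partition (9.15) and the characterization (9.17) are easy to instantiate in a small field. The sketch below is only an illustration under stated assumptions: field elements are encoded as integers in a polynomial basis, $\alpha$ is the class of $x$ modulo a primitive polynomial supplied by the caller (here $x^4 + x + 1$ for $\mathbb{F}_{2^4}$), and the name `build_F` is ours. The function returns, for each field element, the integer $j$ such that the element lies in $S_j$; the coordinate $f_i(x)$ is then the $i$-th bit of $j$.

```python
def build_F(n, m, s, prim_poly):
    """Map each element of F_{2^n} to the index j of the set S_j containing it."""
    size = 1 << n
    # powers[l] = alpha^l for 0 <= l <= 2^n - 2, with alpha = x mod prim_poly
    powers = [1]
    for _ in range(size - 2):
        v = powers[-1] << 1
        if v & size:
            v ^= prim_poly
        powers.append(v)
    block = 1 << (n - m)
    F = {0: 0}                                   # the zero element belongs to S_0
    for j in range(1 << m):
        lo = s if j == 0 else s + block * j - 1
        hi = s + block * (j + 1) - 2
        for l in range(lo, hi + 1):
            F[powers[l % (size - 1)]] = j        # exponents are taken mod 2^n - 1
    return F

# n = 4, m = 2, s = 0, alpha a root of x^4 + x + 1 (bit mask 0b10011).
F = build_F(4, 2, 0, 0b10011)
sizes = [sum(1 for v in F.values() if v == j) for j in range(4)]
print(sizes)                                      # sizes of the preimages F^{-1}(j)
f0 = {x: F[x] & 1 for x in F}                     # coordinate function f_0
```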

Proposition 146 (1) For every $1 \le m \le n$, function $F$ is balanced.
(2) We have $d_{alg}(F) = n - 1$.
(3) We have $d_{min}(F) = n - 1$ if and only if $\frac{\alpha}{1+\alpha}, \big(\frac{\alpha}{1+\alpha}\big)^2, \ldots, \big(\frac{\alpha}{1+\alpha}\big)^{2^{m-1}}$ are linearly
independent over $\mathbb{F}_2$.

It is also proved in this same paper that the basic algebraic immunity of F is optimal:

Proposition 147 For every n, m such that 1 ≤ m ≤ n, we have AI (F ) = dn,m .

The proof is very similar to the Boolean case, by application of the BCH bound.
A lower bound on the (hyper-)nonlinearity of F is also proved in [274] by the use of Gauss
sums, which allow transforming the expression of the Walsh transform, and by bounding
from above some trigonometric sums with integrals:
Proposition 148 $nl(F) \ge 2^{n-1} - \frac{2^{\frac{n}{2}+m}}{\pi} \ln\left(\frac{4(2^n - 1)}{\pi}\right) - 1 \sim 2^{n-1} - \frac{\ln 2}{\pi}\, 2^{\frac{n}{2}+m} \cdot n$.

A class obtained through group decomposition


In [804], the authors constructed a class of balanced $(n,m)$-functions over $\mathbb{F}_{2^n}$ ($n$ even), with
$m \le n/2$, and with high basic algebraic immunity and optimal algebraic degree, based on
the decomposition of the multiplicative group $\mathbb{F}_{2^n}^*$ corresponding to what we called polar
representation on page 168.
10

Particular classes of Boolean functions

10.1 Symmetric functions


A function is called a symmetric Boolean function if it is invariant under the action of the
symmetric group (i.e., if its output is invariant under permutation of its input bits). Its output
depends then only on the Hamming weight of the input (and can be implemented with a
number of gates linear in the number of input variables [1117], with a reduced amount of
memory required for storing the function). So a Boolean function $f$ is symmetric if and only
if there exists a function $\overline{f}$ from $\{0, 1, \ldots, n\}$ to $\mathbb{F}_2$ such that
$f(x) = \overline{f}(w_H(x))$.
The vector $(\overline{f}(0), \ldots, \overline{f}(n))$ is sometimes called the simplified value vector of $f$.
Such functions are of some interest for cryptography, as they allow us to implement in
an efficient way nonlinear functions on large numbers of variables. Let us consider, for
example, an LFSR filtered by a 63-variable symmetric function f , whose input is the content
of an interval of 63 consecutive flip-flops of the LFSR. This device may be implemented with
a cost similar to that of a 6-variable Boolean function, thanks to a 6-bit counter calculating the
Hamming weight of the input to f (this counter is incremented if a 1 is shifted into the interval
and decremented if a 1 is shifted out). However, the pseudorandom sequence obtained this
way has a correlation with transitions (sums of consecutive bits), and a symmetric function
should not take all its inputs in a full interval. In fact, it is not yet clear whether the
trade-off between the advantage of allowing many more variables and the cryptographic
weaknesses that these symmetric functions may introduce favors the designer or the
attacker.
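The counter-based evaluation described above can be sketched as follows. This is only an illustration of the bookkeeping, not a cipher: the input is an arbitrary bit list rather than an LFSR state sequence, and the names `filtered_stream`, `width` and `value_vector` are ours. The symmetric filter is given by its simplified value vector, and the Hamming weight of the window is updated in constant time at each shift.

```python
def filtered_stream(bits, width, value_vector):
    """Apply a symmetric function to every window of `width` consecutive bits.

    value_vector[w] is the output of the function on inputs of Hamming weight w.
    """
    weight = sum(bits[:width])                 # weight of the first window
    out = [value_vector[weight]]
    for t in range(width, len(bits)):
        # one bit is shifted in (bits[t]) and one is shifted out (bits[t - width])
        weight += bits[t] - bits[t - width]
        out.append(value_vector[weight])
    return out

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
value_vector = [0, 1, 1, 0]                    # a 3-variable symmetric function
stream = filtered_stream(bits, 3, value_vector)
print(stream)
```

Consecutive windows differ by one incoming and one outgoing bit, so consecutive weights differ by at most 1; this is the structural dependence behind the correlation with transitions mentioned above.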

10.1.1 Representation
Let $r = 0, \ldots, n$ and let $1_{E_{n,r}}$ be the Boolean function whose support is the set $E_{n,r}$ of all
vectors of Hamming weight $r$ in $\mathbb{F}_2^n$. Then, according to Relation (2.23), page 49, relating
the values of the coefficients of the NNF to the values of the function, the coefficient of $x^I$
in the NNF of $1_{E_{n,r}}$ equals

$$(-1)^{|I|} \sum_{x \in \mathbb{F}_2^n;\ w_H(x) = r,\ supp(x) \subseteq I} (-1)^{w_H(x)} = (-1)^{|I|-r} \binom{|I|}{r},$$

and we have then:

$$1_{E_{n,r}}(x) = \sum_{I \subseteq \{1,\ldots,n\}} (-1)^{|I|-r} \binom{|I|}{r}\, x^I. \qquad (10.1)$$


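Any symmetric function $f$ being equal to $\bigoplus_{r=0}^{n} \overline{f}(r)\, 1_{E_{n,r}}$, it equals $\sum_{r=0}^{n} \overline{f}(r)\, 1_{E_{n,r}}$, since
the functions $1_{E_{n,r}}$ have disjoint supports. The coefficient of $x^I$ in its NNF equals then
$\sum_{r=0}^{n} \overline{f}(r) (-1)^{|I|-r} \binom{|I|}{r}$ and depends only on the size of $I$. Denoting

$$S_i(x) = \sum_{I \subseteq \{1,\ldots,n\},\ |I| = i} x^I = \binom{w_H(x)}{i} = \begin{cases} \frac{w_H(x)\,(w_H(x)-1) \cdots (w_H(x)-i+1)}{i!} & \text{if } w_H(x) \ge i \\ 0 & \text{otherwise,} \end{cases}$$

the NNF of $f$ equals then

$$f(x) = \sum_{i=0}^{n} c_i\, S_i(x), \quad \text{where } c_i = \sum_{r=0}^{n} \overline{f}(r) (-1)^{i-r} \binom{i}{r}. \qquad (10.2)$$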

According to Relation (10.2), we see by definition of $\overline{f}$ that this function coincides on
$\{0, \ldots, n\}$ with the polynomial

$$\overline{f}(z) = \sum_{i=0}^{n} c_i \binom{z}{i} = \sum_{i=0}^{n} c_i\, \frac{z\,(z-1) \cdots (z-i+1)}{i!},$$

of degree $\max\{i;\ c_i \ne 0\}$ (which is also the degree of the NNF of $f$). Note that since this
degree is at most $n$, and the values taken by this polynomial at $n + 1$ points are determined
by the values of $f$, this polynomial representation is unique and can be obtained by the
Lagrange interpolation formula.
Function $\sigma_i(x) = S_i(x)$ [mod 2] is the $i$th elementary symmetric function:

$$\sigma_i(x) = \bigoplus_{1 \le j_1 < \cdots < j_i \le n} \; \prod_{k=1}^{i} x_{j_k}.$$

According to Lucas’ theorem (see page 487 or [809, page 404]), $\sigma_i(x)$ equals 1 if and only if
the binary expansion $\sum_{l=1}^{\lceil \log_2 n \rceil} i_l 2^{l-1}$ of $i$ is covered by that of $w_H(x)$ (i.e., writing $w_H(x) =
\sum_{l=1}^{\lceil \log_2 n \rceil} j_l 2^{l-1}$, we have $i_l \le j_l$, $\forall l = 1, \ldots, \lceil \log_2 n \rceil$; we write $i \preceq w_H(x)$). Note that this
implies that $\sigma_i = \prod_{l \in \{1,\ldots,\lceil \log_2 n \rceil\},\ i_l = 1} \sigma_{2^{l-1}}$. Reducing Relation (10.2) modulo 2, we deduce from
Lucas’ theorem again that the ANF of $f$ equals:

$$f(x) = \bigoplus_{i=0}^{n} \lambda_i\, \sigma_i(x), \quad \text{where } \lambda_i = c_i \text{ [mod 2]} = \sum_{r \preceq i} \overline{f}(r) \text{ [mod 2]}. \qquad (10.3)$$

The algebraic degree of $f$ equals $\max\{i;\ \lambda_i = 1\}$ (in particular, in the case of $f = 1_{E_{n,r}}$,
we have that $\lambda_i$ equals 1 if and only if $r \preceq i$ and the algebraic degree equals $\max\{i \in
\{r, \ldots, n\};\ r \preceq i\}$).
Using that the binary Möbius transform is involutive, or using that $\sigma_i(x) = 1$ if and only
if $\binom{w_H(x)}{i}$ is odd and Lucas’ theorem again, we deduce from (10.3) that $\overline{f}(j) = \bigoplus_{i \preceq j} \lambda_i$. The
vector $(\lambda_0, \ldots, \lambda_n)$ is sometimes called the simplified ANF vector. Relation (10.3) gives the
expression of the simplified ANF vector by means of the simplified value vector, and this
relation gives the reverse expression.
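The passage between the simplified value vector, the NNF coefficients of Relation (10.2), and the simplified ANF vector of Relation (10.3) can be sketched in a few lines. The function names are ours, vectors are plain Python lists indexed by the Hamming weight (respectively by the degree), and the covering relation on binary expansions is tested with a bitwise AND.

```python
from math import comb

def covered(i, j):
    """True if the binary expansion of i is covered by that of j."""
    return (i & j) == i

def nnf_coeffs(v):
    """Coefficients c_i of (10.2) from the simplified value vector v."""
    n = len(v) - 1
    return [sum(v[r] * (-1) ** (i - r) * comb(i, r) for r in range(i + 1))
            for i in range(n + 1)]

def anf_vector(v):
    """Simplified ANF vector of (10.3): XOR of v[r] over the r covered by i."""
    n = len(v) - 1
    return [sum(v[r] for r in range(n + 1) if covered(r, i)) % 2
            for i in range(n + 1)]

def value_vector(anf):
    """Reverse relation: XOR of the ANF entries over the i covered by j."""
    n = len(anf) - 1
    return [sum(anf[i] for i in range(n + 1) if covered(i, j)) % 2
            for j in range(n + 1)]

maj3 = [0, 0, 1, 1]               # majority function in 3 variables
print(nnf_coeffs(maj3))           # NNF coefficients c_0, ..., c_3
print(anf_vector(maj3))           # simplified ANF vector
```

For the majority function in 3 variables this recovers the NNF $S_2 - 2S_3$ and the ANF $\sigma_2$.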

According to the observations above, nonzero symmetric Boolean functions are, up to
the addition of constant function 1, the component functions of the $(n,n)$-function $\Sigma(x)$
whose $i$th coordinate function is the elementary symmetric function $\sigma_i(x)$. Note that, for
every $x, y \in \mathbb{F}_2^n$, we have $w_H(x) = w_H(y)$ if and only if $\Sigma(x) = \Sigma(y)$, since the $\sigma_i$
generate by linear combinations all those symmetric Boolean functions null at input $0_n$, and
two vectors $x, y$ have the same nonzero Hamming weight if and only if every symmetric
Boolean function null at $0_n$ takes the same value at inputs $x$ and $y$. This translation of an
equality between the Hamming weights of two vectors $x$ and $y$ into the equality between
the images of $x$ and $y$ by a vectorial function is nicely simple. We have then $w_H(x) = k$
for some nonnegative $k$ if and only if, for every $i = 1, \ldots, n$, we have $\sigma_i(x) \equiv \binom{k}{i}$ [mod 2].
We have also $w_H(x) \le k$ if and only if $\sigma_i(x) = 0$ for all $i > k$ (this necessary condition is
sufficient because $\sigma_{w_H(x)}(x) = 1$).
Note that a symmetric Boolean function $f$ has algebraic degree 1 if and only if it equals
$\bigoplus_{i=1}^{n} x_i$ or $\bigoplus_{i=1}^{n} x_i \oplus 1$, that is, if the binary function $\overline{f}(r)$ equals $r$ [mod 2] or $r + 1$ [mod 2],
and that it is quadratic if and only if it equals $\bigoplus_{1 \le i < j \le n} x_i x_j$ plus a symmetric function
of algebraic degree at most 1, that is, if the function $\overline{f}(r)$ equals $\binom{r}{2}$ [mod 2] or $\binom{r}{2} + r$ [mod 2]
or $\binom{r}{2} + 1$ [mod 2] or $\binom{r}{2} + r + 1$ [mod 2]. Hence, $f$ has algebraic degree 1 if and only if $\overline{f}$
satisfies $\overline{f}(r + 1) = \overline{f}(r) \oplus 1$ and it has degree 2 if and only if $\overline{f}$ satisfies $\overline{f}(r + 2) = \overline{f}(r) \oplus 1$.
As observed in [205], the algebraic degree of a symmetric function $f$ is at most $2^t - 1$,
for some positive integer $t$ such that $2^t < n$, if and only if the sequence $(\overline{f}(r))_{r \ge 0}$ is periodic
with period $2^t$ (sufficiency is a direct consequence of (10.3) and necessity of the reverse
relation). Here again, it is not clear whether this is more an advantage for the designer of
a cryptosystem using such a symmetric function $f$ (since, to compute the image of a vector
$x$ by $f$, it is enough to compute the number of nonzero coordinates $x_1, \ldots, x_t$) or for the
attacker.
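The equivalence between bounded algebraic degree and periodicity of the simplified value vector can be checked exhaustively for a small number of variables. A minimal sketch, assuming $n = 6$ and $t \in \{1, 2\}$, with helper names of our own choosing; the algebraic degree is read off the simplified ANF vector, and the zero function is given degree $-1$ by convention.

```python
from itertools import product

def anf_vector(v):
    """Simplified ANF vector: XOR of v[r] over the r whose bits are covered by i."""
    n = len(v) - 1
    return [sum(v[r] for r in range(n + 1) if (r & i) == r) % 2
            for i in range(n + 1)]

def algebraic_degree(v):
    """Largest i with a nonzero entry in the simplified ANF vector (-1 if none)."""
    return max((i for i, bit in enumerate(anf_vector(v)) if bit), default=-1)

def is_periodic(v, period):
    """True if v[r] == v[r + period] wherever both entries are defined."""
    return all(v[r] == v[r + period] for r in range(len(v) - period))

n = 6
mismatches = [
    (t, v)
    for t in (1, 2)
    for v in product((0, 1), repeat=n + 1)
    if (algebraic_degree(v) <= 2 ** t - 1) != is_periodic(v, 2 ** t)
]
print(len(mismatches))            # number of counterexamples found
```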

10.1.2 Hamming weight


In [173] is given a closed formula for the correlation between any two symmetric Boolean
functions (and in particular the weight of a symmetric function). In [532], von zur Gathen
and Roche determined all balanced symmetric Boolean functions up to 128 variables. More
recently, it has been proved in [530] that balanced symmetric Boolean functions of fixed
algebraic degree $d > 1$ and sufficiently large number of variables are trivial. This term
means that $n$ is odd and the simplified value vector $\overline{f}$ is antisymmetric with respect to the
middle of $[0, \ldots, n]$, that is, $\overline{f}(n - i) = \overline{f}(i) \oplus 1$, $\forall i$. This same paper also shows (proving
a conjecture by Cusick) the nonexistence of trivial balanced elementary symmetric Boolean
functions except for $n = 2^{t+1} l - 1$ and $d = 2^t$, where $t$ and $l$ are any nonnegative
integers.

10.1.3 Fourier–Hadamard and Walsh transforms


For every $a \in \mathbb{F}_2^n$ and $r \in \{0, \ldots, n\}$, denoting by $\ell$ the Hamming weight of $a$, we
have

$$\widehat{1_{E_{n,r}}}(a) = \sum_{x \in \mathbb{F}_2^n;\ w_H(x) = r} (-1)^{a \cdot x} = \sum_{j=0}^{n} (-1)^j \binom{\ell}{j} \binom{n-\ell}{r-j},$$

denoting by $j$ the size of $supp(a) \cap supp(x)$. The polynomials $K_{n,r}(X) = \sum_{j=0}^{n} (-1)^j \binom{X}{j} \binom{n-X}{r-j}$ are called
Krawtchouk polynomials. They are characterized by their generating series:

$$\sum_{r=0}^{n} K_{n,r}(\ell)\, z^r = (1 - z)^{\ell} (1 + z)^{n-\ell}$$

and have nice resulting properties (see, e.g., [328, 809]). By $\mathbb{R}$-linearity, we deduce that
the value at $a$ of the Fourier–Hadamard transform of any symmetric function $\sum_{r=0}^{n} \overline{f}(r)\, 1_{E_{n,r}}$
equals $\sum_{r=0}^{n} \overline{f}(r)\, K_{n,r}(w_H(a))$.
From the Fourier–Hadamard transform, we can deduce the Walsh transform thanks to
Relation (2.32), page 55.
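The Krawtchouk expression of the Fourier–Hadamard transform can be compared with a direct computation. In the sketch below (function names ours), the inner sum of the Krawtchouk polynomial is restricted to $j \le r$, which drops only terms whose second binomial coefficient is zero; the example uses the majority function in 5 variables.

```python
from itertools import product
from math import comb

def krawtchouk(n, r, ell):
    """K_{n,r}(ell) = sum_j (-1)^j C(ell, j) C(n - ell, r - j)."""
    return sum((-1) ** j * comb(ell, j) * comb(n - ell, r - j)
               for j in range(r + 1))

def fourier_symmetric(v, weight_a):
    """Fourier-Hadamard transform at any a of Hamming weight weight_a."""
    n = len(v) - 1
    return sum(v[r] * krawtchouk(n, r, weight_a) for r in range(n + 1))

def fourier_direct(v, a):
    """Direct evaluation of sum_x f(x) (-1)^(a.x) for the symmetric f given by v."""
    n = len(v) - 1
    return sum(v[sum(x)] * (-1) ** (sum(ai & xi for ai, xi in zip(a, x)) & 1)
               for x in product((0, 1), repeat=n))

v = [0, 0, 0, 1, 1, 1]            # majority function in 5 variables
a = (1, 0, 1, 0, 0)
print(fourier_direct(v, a), fourier_symmetric(v, sum(a)))
```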
In [334], the exponential sums of symmetric Boolean functions and their asymptotic
behavior are studied further.

10.1.4 Nonlinearity
If $n$ is even, then the restriction of every symmetric function $f$ on $\mathbb{F}_2^n$ to the $\frac{n}{2}$-dimensional
flat $A = \{(x_1, \ldots, x_n) \in \mathbb{F}_2^n;\ x_{i+n/2} = x_i \oplus 1,\ \forall i \le \frac{n}{2}\}$ is constant, since all the elements
of $A$ have the same Hamming weight $n/2$. Thus, $f$ is $\frac{n}{2}$-normal (see Definition 28, page
105). But Relation (3.15), page 107, does not improve upon the covering radius bound (3.2),
page 80. The symmetric functions that achieve this bound, i.e., that are bent, were first
characterized by Savicky in [1019]: the bent symmetric functions are the four symmetric
functions of algebraic degree 2 already described above: $f_1(x) = \bigoplus_{1 \le i < j \le n} x_i x_j$, $f_2(x) =
f_1(x) \oplus 1$, $f_3(x) = f_1(x) \oplus x_1 \oplus \cdots \oplus x_n$ and $f_4(x) = f_3(x) \oplus 1$. A stronger result can be
proved in a very simple way:

Proposition 149 [566] For every positive even $n$, the $PC(2)$ $n$-variable symmetric
functions are the functions $f_1$, $f_2$, $f_3$, and $f_4$ above.

Proof Let $f$ be any $PC(2)$ $n$-variable symmetric function and let $1 \le i < j \le n$. Let
us denote by $x'$ the vector: $x' = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n)$. Since $f(x)$
is symmetric, it has the form $x_i x_j\, g(x') \oplus (x_i \oplus x_j)\, h(x') \oplus k(x')$. Let us denote by $e_{i,j}$
the vector of Hamming weight 2 whose nonzero coordinates stand at positions $i$ and $j$. The
derivative $D_{e_{i,j}} f$ equals $(x_i \oplus x_j \oplus 1)\, g(x')$ and is balanced, by hypothesis. Then $g$ must be
equal to the constant function 1 (indeed, if $g(x') = 1$ for some $x'$, then $(x_i \oplus x_j \oplus 1)\, g(x')$
equals 1 for half of the inputs $(x_i, x_j)$, and otherwise it equals 1 for none). Hence, the degree
at least 2 part of the ANF of $f$ equals $\bigoplus_{1 \le i < j \le n} x_i x_j$.

Results on the propagation criterion for symmetric functions are in [205].
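Both characterizations can be confirmed by exhaustive search for $n = 4$, where there are only $2^5$ symmetric functions. The sketch below (helper names ours) lists the simplified value vectors of the symmetric functions that are bent and of those that satisfy $PC(2)$, and compares them with the value vectors $\binom{r}{2} + a\,r + b$ [mod 2] of $f_1, \ldots, f_4$.

```python
from itertools import product
from math import comb

n = 4
pts = list(product((0, 1), repeat=n))

def is_bent(v):
    """All Walsh transform values have absolute value 2^(n/2)."""
    for a in pts:
        w = sum((-1) ** (v[sum(x)] ^ (sum(ai & xi for ai, xi in zip(a, x)) & 1))
                for x in pts)
        if abs(w) != 2 ** (n // 2):
            return False
    return True

def is_pc2(v):
    """Every derivative in a direction of Hamming weight 1 or 2 is balanced."""
    for a in pts:
        if 1 <= sum(a) <= 2:
            ones = sum(v[sum(x)] ^ v[sum(xi ^ ai for xi, ai in zip(x, a))]
                       for x in pts)
            if ones != 2 ** (n - 1):
                return False
    return True

vectors = list(product((0, 1), repeat=n + 1))
bent = {v for v in vectors if is_bent(v)}
pc2 = {v for v in vectors if is_pc2(v)}
# Value vectors of f1, ..., f4: C(r,2) + a*r + b modulo 2.
quadratic = {tuple((comb(r, 2) + a * r + b) % 2 for r in range(n + 1))
             for a in (0, 1) for b in (0, 1)}
print(sorted(bent))
```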


If $n$ is odd, then the restriction of any symmetric function $f$ to the $\frac{n+1}{2}$-dimensional flat
$A = \{(x_1, \ldots, x_n) \in \mathbb{F}_2^n;\ x_{i+\frac{n-1}{2}} = x_i \oplus 1,\ \forall i \le \frac{n-1}{2}\}$ is affine, since the Hamming weight
function $w_H$ is constant on the hyperplane of $A$ of equation $x_n = 0$ and on its complement.
Thus, $f$ is $\frac{n+1}{2}$-weakly normal. According to Relation (3.15), page 107, this implies that its
nonlinearity is upper bounded by $2^{n-1} - 2^{\frac{n-1}{2}}$. It also allows showing that the only symmetric
functions achieving this bound with equality are the same as the four functions $f_1$, $f_2$, $f_3$,
and $f_4$ above, but with $n$ odd (this has been first proved by Maitra and Sarkar [818], in a
more complex way). Indeed:

Proposition 150 [224] Let $n$ be any positive integer and let $f$ be any symmetric function
on $\mathbb{F}_2^n$. Let $l$ be any integer satisfying $0 < l \le \frac{n}{2}$. Denote by $h_l$ the symmetric Boolean
function on $n - 2l$ variables defined by $h_l(y_1, \ldots, y_{n-2l}) = f(x_1, \ldots, x_l, x_1 \oplus 1, \ldots, x_l \oplus 1,
y_1, \ldots, y_{n-2l})$, where the values of $x_1, \ldots, x_l$ are arbitrary (equivalently, $h_l$ can be defined
by $\overline{h_l}(r) = \overline{f}(r + l)$, for every $0 \le r \le n - 2l$). Then $nl(f) \le 2^{n-1} - 2^{n-l-1} + 2^l\, nl(h_l)$.

Proof Let $A = \{(x_1, \ldots, x_n) \in \mathbb{F}_2^n \mid x_{i+l} = x_i \oplus 1,\ \forall i \le l\}$. For every element $x$ of $A$, we
have $f(x) = h_l(x_{2l+1}, \ldots, x_n)$. Let us consider the restriction $g$ of $f$ to $A$ as a Boolean
function on $\mathbb{F}_2^{n-l}$, say $g(x_1, \ldots, x_l, x_{2l+1}, \ldots, x_n)$. Then, since $g(x_1, \ldots, x_l, x_{2l+1}, \ldots, x_n) =
h_l(x_{2l+1}, \ldots, x_n)$, $g$ has nonlinearity $2^l\, nl(h_l)$. According to Relation (3.15) applied with
$h_a = g$ and $k = n - l$, we have $nl(f) \le 2^{n-1} - 2^{n-l-1} + 2^l\, nl(h_l)$.

The characterizations recalled above of those symmetric functions achieving best possible
nonlinearity can be straightforwardly deduced. Moreover, if for some $0 \le l < \frac{n-1}{2}$, the
nonlinearity of an $n$-variable symmetric function $f$ is strictly larger than

$$2^{n-1} - 2^{n-l-1} + 2^l \left(2^{n-2l-1} - 2^{\frac{n-2l-1}{2}} - 1\right) = 2^{n-1} - 2^{\frac{n-1}{2}} - 2^l,$$

then, thanks to these characterizations
and to Proposition 150, the function $h_l$ must be quadratic, and $f$ satisfies $\overline{f}(r + 2) = \overline{f}(r) \oplus 1$,
for all $l \le r \le n - 2 - l$ (this property was observed in [205, theorem 6] shortly after
[224] was published, and proved slightly differently).
Further properties of the nonlinearities of symmetric functions can be found
in [205, 224].

10.1.5 Correlation immunity and resiliency


The correlation immunity of symmetric functions has been studied in [138, 887, 1135] and
their resiliency in [370, 557].
There exists a conjecture on symmetric Boolean functions and, equivalently, on functions
defined over $\{0, 1, \ldots, n\}$ and valued in $\mathbb{F}_2$: if $f$ is a nonconstant symmetric Boolean
function, then the numerical degree of $f$ (hence, the degree of the univariate polynomial
representation of $\overline{f}$) is larger than or equal to $n - 3$. It is easily shown that this
numerical degree is more than $\frac{n}{2}$ (otherwise, the polynomial $\overline{f}^2 - \overline{f}$ would have degree at
most $n$, and being null at $n + 1$ points, it would equal the null polynomial, a contradiction
with the fact that $f$ is assumed not to be constant). But the gap between $\lfloor \frac{n}{2} \rfloor + 1$ and $n - 3$
is open. According to Proposition 118, page 286, the conjecture is equivalent to saying
that there does not exist any nonaffine symmetric 3-resilient function. And proving this
conjecture is also a problem on binomial coefficients since the numerical degree of $f$ is
bounded above by $d$ if and only if, for every $k$ such that $d < k \le n$:

$$\sum_{r=0}^{k} (-1)^r \binom{k}{r}\, \overline{f}(r) = 0. \qquad (10.4)$$

The conjecture is equivalent to saying that this Relation (10.4), with $d = n - 4$, has no binary
solution $\overline{f}(0), \ldots, \overline{f}(n)$. Von zur Gathen and Roche [532] have observed that all symmetric
$n$-variable Boolean functions have numerical degrees larger than or equal to $n - 3$, for any
$n \le 128$ (they exhibited Boolean functions with numerical degree $n - 3$; see also [557]).
The same authors also observed that, if the number $m = n + 1$ is a prime, then
all nonconstant $n$-variable symmetric Boolean functions have numerical degree $n$ (and
therefore, considering the function $g(x) = f(x) \oplus x_1 \oplus \cdots \oplus x_n$ and applying Proposition
118, all nonaffine $n$-variable symmetric Boolean functions are unbalanced): indeed, $m$ being
a prime, the binomial coefficient $\binom{n}{r}$ is congruent with $\frac{(-1)(-2) \cdots (-r)}{1 \cdot 2 \cdots r} = (-1)^r$, modulo
$m$, and the sum $\sum_{r=0}^{n} (-1)^r \binom{n}{r}\, \overline{f}(r)$ is then congruent with $\sum_{r=0}^{n} \overline{f}(r)$, modulo $m$, and
Relation (10.4) with $k = n$ implies then that $\overline{f}$ must be constant.
Notice that, applying Relation (10.4) with $k = p - 1$, where $p$ is the largest prime less than
or equal to $n + 1$, shows that the numerical degree of any symmetric nonconstant Boolean
function is larger than or equal to $p - 1$ (or equivalently that no symmetric nonaffine Boolean
function is $(n - p + 1)$-resilient): otherwise, reducing (10.4) modulo $p$, we would have that
the string $\overline{f}(0), \ldots, \overline{f}(k)$ is constant, and $\overline{f}$ having univariate degree less than or equal to $k$,
the function $\overline{f}$, and thus $f$ itself, would be constant.
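The statements on the numerical degree can be checked by brute force for small $n$. In the sketch below (helper names ours), the numerical degree is computed as the largest $i$ for which the NNF coefficient $c_i = \sum_{r} (-1)^{i-r} \binom{i}{r}\, \overline{f}(r)$ is nonzero, which is Relation (10.4) read as a test; the search then covers all nonconstant simplified value vectors for $2 \le n \le 10$.

```python
from itertools import product
from math import comb

def numerical_degree(v):
    """Largest i with c_i != 0, where c_i = sum_r (-1)^(i-r) C(i, r) v[r]."""
    n = len(v) - 1
    for i in range(n, -1, -1):
        if sum(v[r] * (-1) ** (i - r) * comb(i, r) for r in range(i + 1)) != 0:
            return i
    return -1                     # only reached for the zero function

def largest_prime_upto(k):
    """Largest prime p <= k (k >= 2), by trial division."""
    for p in range(k, 1, -1):
        if all(p % q for q in range(2, int(p ** 0.5) + 1)):
            return p

def min_degree(n):
    """Minimum numerical degree over nonconstant symmetric n-variable functions."""
    return min(numerical_degree(v)
               for v in product((0, 1), repeat=n + 1)
               if len(set(v)) > 1)

for n in range(2, 11):
    print(n, min_degree(n), largest_prime_upto(n + 1) - 1)
```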
More results on the balancedness and resiliency/correlation immunity of symmetric
functions can be found in [78, 205, 887, 1014, 1135]. The resiliency order of a symmetric
function of algebraic degree $d$ cannot exceed $2^{\lfloor \log_2 d \rfloor + 1} - 2$ [205].

10.1.6 Algebraic immunity and fast algebraic immunity


We have seen in Section 3.1 that, for every $n$-variable Boolean function $f$, there exist $g \ne 0$
and $h$, both of algebraic degree at most $\lceil \frac{n}{2} \rceil$ and such that $fg = h$ (equivalently, there exist
nonzero annihilators of $f$ or of $f \oplus 1$ of algebraic degree at most $\lceil \frac{n}{2} \rceil$). The same property can
be proven when dealing with symmetric functions only: the elementary symmetric functions
of degrees at most $\lceil \frac{n}{2} \rceil$ and their products with $f$ give a family of $2\,(\lceil \frac{n}{2} \rceil + 1) > n + 1$
symmetric functions, which must be linearly dependent since they live in a vector space of
dimension $n + 1$. There exist then $g \ne 0$ and $h$ of degree at most $\lceil \frac{n}{2} \rceil$ such that $fg = h$ and
the conclusion follows (using also the proof of Proposition 25, page 91). However, given an
$n$-variable symmetric function $f$, there do not necessarily exist symmetric functions $g \ne 0$
and $h$ of algebraic degree as small as $AI(f)$ such that $fg = h$.
We have seen that the majority function, which is symmetric, has optimal algebraic
immunity. In the case $n$ is odd, it is the only symmetric function having such property,
up to the addition of a constant (see [979], which completed a partial result of [765]). In the
case $n$ is even, other symmetric functions exist (up to the addition of a constant and to the
transformation $x \mapsto \overline{x} = (x_1 \oplus 1, \ldots, x_n \oplus 1)$) with this property and all are known; more
details and more results on the algebraic immunity of symmetric functions can be found
in [127, 364, 774, 785, 941, 976, 978, 979, 1103] and the references therein. In particular, all
symmetric functions of optimal algebraic immunity in numbers of variables that are powers
of 2 are determined in [785], and it is shown in [127] that for $n = 2^j$, $2^j - 1$ and $2^j - 2$,
the elementary symmetric function $\sigma_{2^{j-1}}$ has optimal algebraic immunity, and that these
are the only cases where an elementary symmetric function can have optimal AI. In [941]
the authors show, thanks to a result of [786], that the corpus of potential annihilators of $f$
or $f \oplus 1$ that needs to be investigated to prove the optimal algebraic immunity of a given
function can be reduced in the case it is symmetric (and some necessary conditions on the
simplified value vector for symmetric functions to achieve high AI are given), and this allows
a description of optimal AI symmetric functions (and also of the suboptimal ones), whose
other parameters are also studied (none is balanced and the nonlinearity is bad).
We have seen on page 322 that, as shown in [791], no symmetric Boolean function can
be perfect algebraic immune. Large classes of symmetric functions are very vulnerable to
fast algebraic attacks despite their proven resistance against standard algebraic attacks: for
$2^m \le n \le 2^m + 2^{m-1} - 1$, for every symmetric $n$-variable function $f$ of algebraic immunity
at least $2^{m-1}$, there exists $g$ such that $1 \le d_{alg}(g) \le n - 2^m + 1$ and $d_{alg}(fg) \le n - 2^{m-1} + 1$.
Even the other cases often pose a problem, since if $d_{alg}(f) > 2^k$, where $2^k$ does not divide
$d_{alg}(f)$, then there exists $g$ such that $d_{alg}(g) \le e = d_{alg}(f)$ [mod $2^k$] and $d_{alg}(fg) \le
d_{alg}(f) - e - 1$, and the FAI of a symmetric function $f$ whose algebraic degree $d_{alg}(f)$ is
not a power of 2 is smaller than $d_{alg}(f)$.

10.1.7 The subclass of threshold functions


For every $d \le n$, we call¹ threshold function² of index $d$, and we denote by $t_{n,d}$, the
$n$-variable Boolean function whose support equals the set of vectors of Hamming weights
at least d. The majority functions are examples. The reservations we made about symmetric
functions are of course valid for threshold functions. Moreover, we shall see that threshold
functions (as many symmetric functions and as all monotone functions; see page 363) have
bad nonlinearity. They may then be improper for use in most cryptographic frameworks.
But their output is very fast to compute. They can then be used in many more variables
than more complex functions (this is the case for all symmetric functions, but still more for
threshold functions). They deserve then some attention since they may present interest in
some settings like the FLIP cryptosystem (see page 453).
The class of threshold functions has the interest of being preserved by the action of fixing
the values of some variables (fixing one variable to 0 in $t_{n,d}$ gives the function $t_{n-1,d}$, and
fixing one variable to 1 gives $t_{n-1,d-1}$). The results on them then make it possible to study their
contribution to the resistance not only against classical attacks, but also against guess and
determine attacks (see page 96). This is also true more generally for symmetric functions, but
less is known about this wider class.
¹ Our use of the term threshold function is a little more restrictive than in [914]; more investigation is then
needed.
² Not to be confused with the threshold implementation of vectorial functions, which we shall address in
Subsection 12.1.4, page 436.

Note that, for each value of d, functions $t_{n,d}$ and $t_{n,n-d+1}$ are EA equivalent:

\[ \forall x \in \mathbb{F}_2^n, \quad t_{n,n-d+1}(x) = 1 \oplus t_{n,d}(x + 1_n). \]

The majority function for n odd is balanced, but all other threshold functions are
unbalanced. 
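
The elementary properties stated so far (stability under restriction, the identity relating $t_{n,d}$ and $t_{n,n-d+1}$, and balancedness) can be checked exhaustively for small n. The following is a minimal Python sketch of such a check; the helper names (`threshold`, `restrictions_ok`, and so on) are ours and purely illustrative.

```python
from itertools import product

def threshold(n, d):
    """Truth table of t_{n,d}: maps each x in F_2^n (a tuple of bits) to 0 or 1."""
    return {x: int(sum(x) >= d) for x in product((0, 1), repeat=n)}

def restrictions_ok(n, d):
    """Fixing the last variable to 0 gives t_{n-1,d}; fixing it to 1 gives t_{n-1,d-1}."""
    t = threshold(n, d)
    fix0 = {x: t[x + (0,)] for x in product((0, 1), repeat=n - 1)}
    fix1 = {x: t[x + (1,)] for x in product((0, 1), repeat=n - 1)}
    return fix0 == threshold(n - 1, d) and fix1 == threshold(n - 1, d - 1)

def complement_identity_ok(n, d):
    """Checks t_{n,n-d+1}(x) = 1 XOR t_{n,d}(x + 1_n) for every x."""
    t, other = threshold(n, d), threshold(n, n - d + 1)
    return all(other[x] == 1 ^ t[tuple(1 - b for b in x)] for x in t)

def balanced_indices(n):
    """Indices d in 1..n for which t_{n,d} has Hamming weight 2^(n-1)."""
    return [d for d in range(1, n + 1)
            if sum(threshold(n, d).values()) == 2 ** (n - 1)]

print(all(restrictions_ok(5, d) for d in range(1, 6)))
print(all(complement_identity_ok(5, d) for d in range(1, 6)))
print(balanced_indices(5), balanced_indices(6))
```

Since all variables play the same role in a symmetric function, fixing the last variable is representative of fixing any variable.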
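We have $t_{n,d}(x) = \sum_{I \subseteq \{1,\dots,n\}} \lambda_I x^I$, where, for d > 0, $\lambda_\emptyset = t_{n,d}(0) = 0$ and, according
to Relation (2.23), page 49, for $I \neq \emptyset$:

\[
\begin{aligned}
\lambda_I &= (-1)^{|I|} \sum_{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I} (-1)^{w_H(x)}\, t_{n,d}(x)
 = (-1)^{|I|} \sum_{\substack{x \in \mathbb{F}_2^n;\ \mathrm{supp}(x) \subseteq I \\ w_H(x) \ge d}} (-1)^{w_H(x)} \\
&= (-1)^{|I|} \sum_{k=d}^{|I|} (-1)^k \binom{|I|}{k}
 = (-1)^{|I|-1} \sum_{k=0}^{d-1} (-1)^k \binom{|I|}{k}
 = (-1)^{|I|-d} \binom{|I|-1}{d-1}
\end{aligned}
\]

(using $\sum_{k=0}^{|I|} (-1)^k \binom{|I|}{k} = 0$, and the last equality being easily checked by induction on d).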
According to Lucas’ theorem (see page 487 or [809, page 404]), the coefficient of $x^I$ in the
ANF of $t_{n,d}$ equals 1 (i.e., $\lambda_I$ is odd) if and only if the binary expansion of d − 1 is covered
by (i.e., has support included in) that of |I| − 1, and the algebraic degree of $t_{n,d}$ equals then
k + 1, where k is the largest number smaller than n whose binary expansion covers that of
d − 1, that is, where k − d + 1 is the largest number smaller than n − d + 1 whose binary
expansion is disjoint from that of d − 1.
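
The Lucas criterion can be compared with the ANF obtained by a direct binary Möbius transform of the truth table. The sketch below is only an illustration for small n (the function names are ours); inputs and monomial supports are encoded as integer bit masks.

```python
from math import comb

def threshold_tt(n, d):
    """Truth table of t_{n,d}, indexed by integers whose bits are the inputs."""
    return [int(bin(x).count("1") >= d) for x in range(2 ** n)]

def anf(tt, n):
    """Binary Moebius transform: ANF coefficients indexed by the monomial support."""
    a = list(tt)
    for i in range(n):
        for x in range(2 ** n):
            if (x >> i) & 1:
                a[x] ^= a[x ^ (1 << i)]
    return a

def covers(a, b):
    """True when the binary expansion of a covers that of b."""
    return (a & b) == b

def check(n, d):
    """Compares the ANF of t_{n,d} with the Lucas criterion and the degree formula."""
    coeffs = anf(threshold_tt(n, d), n)
    for I in range(1, 2 ** n):
        size = bin(I).count("1")
        # binom(|I|-1, d-1) is odd exactly when |I|-1 covers d-1
        assert coeffs[I] == comb(size - 1, d - 1) % 2 == int(covers(size - 1, d - 1))
    degree = max(bin(I).count("1") for I in range(2 ** n) if coeffs[I])
    k = max(k for k in range(n) if covers(k, d - 1))
    return degree == k + 1

print(all(check(n, d) for n in range(1, 7) for d in range(1, n + 1)))
```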
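Moreover, according to Relation (2.61), page 66, if $u \neq 0_n$, then $W_{t_{n,d}}(u)$ equals
$2(-1)^{w_H(u)+1} \sum_{I \subseteq \{1,\dots,n\};\ \mathrm{supp}(u) \subseteq I} 2^{n-|I|} \lambda_I$, that is:

\[ W_{t_{n,d}}(u) = 2(-1)^{w_H(u)+1} \sum_{\substack{I \subseteq \{1,\dots,n\} \\ \mathrm{supp}(u) \subseteq I}} 2^{n-|I|} (-1)^{|I|-d} \binom{|I|-1}{d-1}. \]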

Recall from Relation (10.1), page 352, that the NNF of the indicator of the set $E_{n,r}$ of vectors
of Hamming weight r has $(-1)^{|I|-r}\binom{|I|}{r}$ for coefficient of $x^I$. We deduce that, for $u \neq 0_n$,

\[ W_{1_{E_{n,r}}}(u) = 2(-1)^{w_H(u)+1} \sum_{\substack{I \subseteq \{1,\dots,n\} \\ \mathrm{supp}(u) \subseteq I}} 2^{n-|I|} (-1)^{|I|-r} \binom{|I|}{r}. \]

Therefore, for every such u, the Walsh transform of function $1_{E_{n,d}}$ at $u \in \mathbb{F}_2^n$ equals the opposite
of the Walsh transform of function $t_{n+1,d+1}$ at (u, 1) (where “,” symbolizes concatenation). And since
these two functions are symmetric, this implies that the maximum absolute value of the Walsh
transform of $1_{E_{n,d}}$ equals the maximum absolute value of the Walsh transform of $t_{n+1,d+1}$ at
nonzero inputs. But the nonlinearities of the two functions are different because the nonlinearity
of $1_{E_{n,r}}$ equals its Hamming weight (since this weight is small), and hence, $|W_{1_{E_{n,r}}}|$ takes its
maximum at the zero entry. It is easily deduced that
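\[
nl(t_{n,d}) =
\begin{cases}
2^{n-1} - \binom{n-1}{(n-1)/2} & \text{if } d = \frac{n+1}{2}, \\[4pt]
\sum_{k=d}^{n} \binom{n}{k} = w_H(t_{n,d}) & \text{if } d > \frac{n+1}{2}, \\[4pt]
\sum_{k=0}^{d-1} \binom{n}{k} = 2^n - w_H(t_{n,d}) & \text{if } d < \frac{n+1}{2},
\end{cases}
\]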

since this is known from [409] in the case $d = \frac{n+1}{2}$, and for $d > \frac{n+1}{2}$, we have

\[ |W_{1_{E_{n-1,d-1}}}(u)| = 2\, \Big| \sum_{x \in E_{n-1,d-1}} (-1)^{u \cdot x} \Big| \le 2\, w_H(1_{E_{n-1,d-1}}) = 2 \binom{n-1}{d-1}, \]

for every nonzero u, and since $|W_{t_{n,d}}(0_n)| = 2^n - 2 \sum_{i=d}^{n} \binom{n}{i} = \sum_{i=n-d+1}^{d-1} \binom{n}{i}$, and using
Pascal’s identity $\binom{n}{i} = \binom{n-1}{i} + \binom{n-1}{i-1}$, we deduce that $|W_{t_{n,d}}|$ takes its maximum at the $0_n$
input, and this completes the proof in this case, and also in the last case according to the
identity $t_{n,n-d+1}(x) = 1 \oplus t_{n,d}(x + 1_n)$.
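
The three-case formula can be compared with a brute-force computation of the nonlinearity through a fast Walsh transform. This is a small sketch under our own naming, feasible only for small n.

```python
from math import comb

def threshold_tt(n, d):
    """Truth table of t_{n,d}, indexed by integers whose bits are the inputs."""
    return [int(bin(x).count("1") >= d) for x in range(2 ** n)]

def walsh(tt, n):
    """Fast Walsh transform of (-1)^f; entry u is W_f(u)."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < 2 ** n:
        for i in range(0, 2 ** n, 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def nonlinearity(tt, n):
    return 2 ** (n - 1) - max(abs(v) for v in walsh(tt, n)) // 2

def nl_threshold_formula(n, d):
    """The three cases d = (n+1)/2, d > (n+1)/2, d < (n+1)/2 of the formula above."""
    if 2 * d == n + 1:
        return 2 ** (n - 1) - comb(n - 1, (n - 1) // 2)
    if 2 * d > n + 1:
        return sum(comb(n, k) for k in range(d, n + 1))
    return sum(comb(n, k) for k in range(d))

for n in range(1, 8):
    for d in range(1, n + 1):
        assert nonlinearity(threshold_tt(n, d), n) == nl_threshold_formula(n, d)
print(nl_threshold_formula(7, 4), nl_threshold_formula(6, 4))
```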
It is also known from [305] that $AI(t_{n,d}) = \min(d, n - d + 1)$, and the vector space of
minimum algebraic degree annihilators can be determined. Indeed, applying the transformation
$x \mapsto x + 1_n$ changes $t_{n,d}$ into the indicator of the set of vectors of Hamming weight at
most n − d; the linear combinations over $\mathbb{F}_2$ of the monomials of degrees at least n − d + 1
vanish over the words of Hamming weight at most n − d and are then annihilators of this
indicator; the dimension $\sum_{i=n-d+1}^{n} \binom{n}{i}$ of this vector space of annihilators being equal to
the dimension of the vector space of all annihilators, that is, $2^n - w_H(t_{n,d})$, these linear
combinations are all the annihilators of the indicator; the annihilators of $t_{n,d}$ are obtained
from these linear combinations by the transformation $x \mapsto x + 1_n$. They can have every
algebraic degree at least n − d + 1. And the annihilators of $1 \oplus t_{n,d}$ are the linear combinations
over $\mathbb{F}_2$ of the monomials of degrees at least d. They can have every algebraic degree at least
d. Hence $AI(t_{n,d}) = \min(d, n - d + 1)$.
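
For small n, the value $\min(d, n - d + 1)$ can be confirmed by searching directly for low-degree annihilators. The sketch below uses a generic rank computation over $\mathbb{F}_2$ rather than the structural argument just given; the helper names are ours and the method is exponential in n.

```python
from itertools import combinations

def threshold_tt(n, d):
    return [int(bin(x).count("1") >= d) for x in range(2 ** n)]

def rank_gf2(rows):
    """Rank over F_2 of a list of rows encoded as integer bit masks."""
    pivots = {}  # leading bit position -> reduced row
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)

def has_annihilator(points, n, e):
    """True if a nonzero function of algebraic degree <= e vanishes on all `points`."""
    monomials = [sum(1 << i for i in c)
                 for k in range(e + 1) for c in combinations(range(n), k)]
    # Row of a point x: bit j is the value at x of the j-th monomial.
    rows = [sum(1 << j for j, m in enumerate(monomials) if (x & m) == m)
            for x in points]
    return rank_gf2(rows) < len(monomials)

def algebraic_immunity(tt, n):
    ones = [x for x in range(2 ** n) if tt[x]]       # annihilators of f vanish here
    zeros = [x for x in range(2 ** n) if not tt[x]]  # annihilators of f + 1 vanish here
    for e in range(n + 1):
        if has_annihilator(ones, n, e) or has_annihilator(zeros, n, e):
            return e

print([[algebraic_immunity(threshold_tt(n, d), n) for d in range(1, n + 1)]
       for n in range(1, 6)])
```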

10.2 Rotation symmetric, idempotent, and other similar functions


We have already encountered rotation symmetric (RS) and idempotent functions in Chapters
6 and 7 (see Definitions 59 and 60, page 248). We have seen how, through the choice of a
normal basis, the latter are related to the former (see Proposition 89, page 248). RS functions
constitute a superclass of symmetric functions, which has been investigated from the
viewpoints of bentness and correlation immunity (see, e.g., [503, 1048]). These functions,
which represent an interesting (reasonably small) corpus for computer investigation, have
also played a role in the study of nonlinearity. It could be shown in [684, 686], thanks to
such computer investigation, that the best nonlinearity of Boolean functions in odd number
n of variables is strictly larger than the quadratic bound if and only if n > 7. Indeed, a
9-variable function of nonlinearity 241 could be found (while the quadratic bound gives
240, and the covering radius bound 244), and using direct sum with quadratic functions, it
gave then 11-variable functions of nonlinearity 994 (while the quadratic bound gives 992 and
the covering radius bound 1,000), and 13-variable functions of nonlinearity 4,036 (while the
quadratic bound gives 4,032 and the covering radius bound 4,050). Later it was checked that
241 is the best nonlinearity of 9-variable rotation symmetric functions, but that 9-variable
functions whose truth tables (or equivalently ANFs) are invariant under cyclic shifts by three
steps and under inversion of the order of the input bits can reach nonlinearity 242, which led
to 11-variable functions of nonlinearity 996 and 13-variable functions of nonlinearity 4,040.
Balanced functions in 13 variables beating the quadratic bound could also be found. The
construction with RS functions does not beat the nonlinearity of the Patterson–Wiedemann
functions for 15 variables.
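
The figures quoted in this paragraph follow from simple arithmetic, which the following sketch reproduces. Two assumptions are ours: the covering radius bound is rounded down to an even integer (this matches the values 244, 1,000, and 4,050 given above), and the direct sum is taken with the 2-variable bent function $y_1 y_2$, for which the nonlinearity becomes $2^n + 2\,nl(f)$.

```python
from math import isqrt

def quadratic_bound(n):
    """2^(n-1) - 2^((n-1)/2), for odd n."""
    return 2 ** (n - 1) - 2 ** ((n - 1) // 2)

def covering_radius_bound(n):
    """Largest even integer not exceeding 2^(n-1) - 2^(n/2 - 1) (rounding is an assumption)."""
    r = isqrt(2 ** (n - 2))          # floor of 2^(n/2 - 1)
    if r * r < 2 ** (n - 2):
        r += 1                       # ceiling of 2^(n/2 - 1)
    bound = 2 ** (n - 1) - r
    return bound - bound % 2

def add_two_variables(n, nl):
    """Nonlinearity of f(x) + y1*y2 when f has n variables and nonlinearity nl."""
    return 2 ** n + 2 * nl

for n in (9, 11, 13):
    print(n, quadratic_bound(n), covering_radius_bound(n))
print(add_two_variables(9, 241), add_two_variables(11, 994))
print(add_two_variables(9, 242), add_two_variables(11, 996))
```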
Hence rotation symmetry is an interesting notion for investigating the parameters
of Boolean functions. Cryptographically speaking, the strong structure it provides may
represent a risk with respect to attacks, while rotation symmetric functions are more difficult
to use with large numbers of variables than symmetric functions (because they are slower to
compute in general).
For n = 2m even, we can consider the bivariate representation alongside the univariate
representation of idempotent functions. We can see how to obtain the univariate (resp.
multivariate) form from the bivariate form and vice versa, and exploit this correspondence to
construct more functions; this has been done in [281], and we follow this reference below.
For m odd, the situation is simplified and we place then ourselves in such case: choosing
$w \in \mathbb{F}_4 \setminus \mathbb{F}_2$, we have $w^2 = w + 1$, $w^4 = w$, and since $\frac{w^2}{w} \notin \mathbb{F}_{2^m}$, we can take $(w, w^2)$ for
a basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_{2^m}$. Any element of $\mathbb{F}_{2^n}$ is then written in the form $xw + yw^2$, where
$x, y \in \mathbb{F}_{2^m}$. Given a normal basis $(\alpha, \alpha^2, \dots, \alpha^{2^{m-1}})$ of $\mathbb{F}_{2^m}$, a natural normal basis of $\mathbb{F}_{2^n}$ is

\[ \big(\alpha w,\ \alpha^2 w^2,\ \alpha^4 w,\ \dots,\ \alpha^{2^{m-2}} w^2,\ \alpha^{2^{m-1}} w,\ \alpha w^2,\ \dots,\ \alpha^{2^{m-2}} w,\ \alpha^{2^{m-1}} w^2\big). \tag{10.5} \]

Since $(xw + yw^2)^2 = y^2 w + x^2 w^2$, the mapping $z \in \mathbb{F}_{2^n} \mapsto z^2 \in \mathbb{F}_{2^n}$ corresponds
to the mapping $(x, y) \in \mathbb{F}_{2^m}^2 \mapsto (y^2, x^2) \in \mathbb{F}_{2^m}^2$. Given a function f(x, y) in bivariate
form, the related Boolean function over $\mathbb{F}_2^n$ obtained by decomposing the input $xw + yw^2$
over the normal basis (10.5) is then RS if and only if $f(x, y) = f(y^2, x^2)$. Note that applying
this identity m times gives f(x, y) = f(y, x), and applying it m + 1 times gives
$f(x, y) = f(x^2, y^2)$; the double condition “f(x, y) = f(y, x), and $f(x, y) = f(x^2, y^2)$” is necessary
and sufficient for f being idempotent.

Definition 78 A polynomial f(z) over $\mathbb{F}_{2^n}$, n = 2m ≡ 2 (mod 4), is called a weak
idempotent if its associate bivariate expression $f(x, y) = f(xw + yw^2)$, $w \in \mathbb{F}_4 \setminus \mathbb{F}_2$,
$x, y \in \mathbb{F}_{2^m}$, satisfies $f(x, y) = f(x^2, y^2)$.

Proposition 151 For n ≡ 2 (mod 4), idempotents are those polynomials f(z) over $\mathbb{F}_{2^n}$
whose associate bivariate expression $f(x, y) = f(xw + yw^2)$, $w \in \mathbb{F}_4 \setminus \mathbb{F}_2$, satisfies
$f(x, y) = f(y^2, x^2)$. Their set is included in that of weak idempotents. An idempotent is a
weak idempotent invariant under the swap x ↔ y.

See more in [245, subsection 5.3]. The corresponding definition at the bit level is
obtained by decomposing the univariate representation over the basis (10.5) and the bivariate
representation over the basis $(\alpha, \alpha^2, \dots, \alpha^{2^{m-1}})$:

Definition 79 Let n = 2m ≡ 2 (mod 4). A Boolean function

$f(x_0, y_1, x_2, y_3, \dots, x_{n-2}, y_{n-1})$

(where each index is reduced modulo m) over $\mathbb{F}_2^n$ is weak RS if it is invariant under the
transformation $(x_j, y_j) \mapsto (x_{j+1}, y_{j+1})$.

Note the particular disposition of the indices in $f(x_0, y_1, x_2, y_3, \dots, x_{n-2}, y_{n-1})$: the index
0 for y does not come at the second position (where we have $y_1$) but at the mth position.
Since m is odd, the invariance of f under the transformation $(x_j, y_j) \mapsto (x_{j+1}, y_{j+1})$ over
(x, y) is equivalent to its invariance under $(x_j, y_j) \mapsto (x_{j+2}, y_{j+2})$. Hence:

Proposition 152 The Boolean function $f(x_0, y_1, x_2, y_3, \dots, x_{n-2}, y_{n-1})$ is weak RS if and
only if it is invariant under the square of the shift $\rho_n$.

Such a weak RS function (that some authors call a 2-RS function; see, e.g., [685]) is RS
if and only if it is invariant under the swap of x and y. A simple example of a weak RS
function is the direct sum f (x) ⊕ g(y), which is RS when f = g, where f and g are
RS functions with m variables. More generally, the indirect sum is studied in [281] (see
also [245]), with explicit examples of resulting bent idempotents. There exist also examples
of bent and semibent weak idempotents [311, 312, 699, 871].
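
The direct sum example can be made concrete for m = 3 (so n = 6). In the sketch below, which is ours and only illustrative, the input is read as $(x_0, y_1, x_2, y_0, x_1, y_2)$, following the disposition of indices of Definition 79, and the two 3-variable RS functions are arbitrary choices.

```python
from itertools import product

m = 3          # odd, so that n = 2m is congruent to 2 modulo 4
n = 2 * m

def rs_quadratic(v):   # v0v1 + v1v2 + v2v0, rotation symmetric in 3 variables
    return (v[0] & v[1]) ^ (v[1] & v[2]) ^ (v[2] & v[0])

def rs_linear(v):      # v0 + v1 + v2, rotation symmetric in 3 variables
    return v[0] ^ v[1] ^ v[2]

def split(z):
    """Even positions of z carry x, odd positions carry y, indices reduced modulo m."""
    x, y = [0] * m, [0] * m
    for p, bit in enumerate(z):
        (x if p % 2 == 0 else y)[p % m] = bit
    return x, y

def direct_sum(f, g):
    return lambda z: f(split(z)[0]) ^ g(split(z)[1])

def invariant(h, k):
    """True if h is invariant under the cyclic shift of its input by k positions."""
    return all(h(z) == h(z[k:] + z[:k]) for z in product((0, 1), repeat=n))

h_mixed = direct_sum(rs_quadratic, rs_linear)     # f different from g
h_same = direct_sum(rs_quadratic, rs_quadratic)   # f equal to g

print(invariant(h_mixed, 2), invariant(h_mixed, 1))  # weak RS, but not RS
print(invariant(h_same, 2), invariant(h_same, 1))    # weak RS and RS
```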
The secondary constructions recalled above have led to the construction of RS functions
and idempotent bent functions from near-bent RS functions seen at page 251.
The k-variate representation can be studied similarly to the bivariate representation;
see [245].
The weights of rotation symmetric functions are studied in [399]. RS functions with
optimal algebraic immunity have been constructed (see, e.g., [1015]), but these functions
never reached good nonlinearity.
In [748], the class of Matriochka symmetric functions is introduced, which are the sums
of symmetric functions whose sets of variables are different and nested.
The notion of rotation symmetry has been generalized to vectorial functions in [994]. An
(n, n)-function is RS if it commutes with the cyclic shift: $F \circ s = s \circ F$. This is equivalent
to saying that each coordinate function equals (cyclically) the previous one composed by
the cyclic shift. Identifying $\mathbb{F}_2^n$ with $\mathbb{F}_{2^n}$ thanks to a normal basis, this is equivalent to
$(F(x))^2 = F(x^2)$ and therefore to the fact that the univariate representation of F has all
its coefficients in $\mathbb{F}_2$ (using the uniqueness of such representation). Kavut [680] enumerated
all bijective rotation symmetric (6, 6)-functions with maximum nonlinearity 24, showing
that, up to affine equivalence, there are only four functions with differential uniformity 4
and algebraic degree 5.

10.3 Direct sums of monomials



Functions $f(x) = \bigoplus_{I \subseteq \{1,\dots,n\}} a_I x^I$, where ($a_I = a_J = 1$ and $I \neq J$) ⇒ ($I \cap J = \emptyset$), are
well adapted to situations where Boolean functions must be particularly simple, for instance,
when they are used with large numbers of variables and when addition and/or multiplication
are costly, like in the FLIP cryptosystem (see page 453). As for threshold functions, the
class of direct sums of monomials is preserved by the action of fixing the values of some
variables and their study addresses then also their behavior against guess and determine
attacks resulting in fixing some input values to the functions.
It is convenient to identify a direct sum of monomials whose value at $0_n$ is 0 by its direct
sum vector $[m_1, m_2, \dots, m_k]$, of length $k = d_{alg}(f)$, in which each $m_i$ is the number of
monomials of degree i (this allows us to determine uniquely the function up to permutation
of variables). We shall assume that all variables are effective, i.e., that the number of
variables equals $\sum_{i=1}^{k} i\, m_i$. The property seen in Relation (6.28), page 232, that the Walsh
transform of a direct sum equals the product of the Walsh transforms of the ingredient
functions, the Golomb–Xiao–Massey characterization of resiliency by the Walsh transform
(Theorem 5, page 87), and Relation (3.1), page 79, imply that the resiliency order of f
equals $m_1 - 1$ (with the convention that an unbalanced function has resiliency order −1)
and that its nonlinearity equals $2^{n-1} - 2^{m_1-1} \prod_{i=2}^{k} (2^i - 2)^{m_i}$. The algebraic immunity is
more complex to determine but it is shown in [306] that if $f(x_1, x_2, x_3, \dots, x_n)$ is a Boolean
function in n variables such that

\[ \forall x \in \mathbb{F}_2^{n-2}, \quad f(x, 0, 0) = f(x, 0, 1) = f(x, 1, 0), \]

then the Boolean function $f'(x_1, \dots, x_{n-1})$ defined by

\[ \forall x \in \mathbb{F}_2^{n-2}, \quad f'(x, 1) = f(x, 1, 1) \ \text{ and } \ f'(x, 0) = f(x, 0, 0) \]

satisfies that $AI(f') \le AI(f)$. Using this property and the algebraic immunity of triangular
functions (see below), the algebraic immunity of sums of monomials has been determined
in [305]:

\[ AI(f) = \min_{0 \le d \le k} \Big( d + \sum_{i=d+1}^{k} m_i \Big). \tag{10.6} \]

It is also shown in this same reference that, in some cases, the fast algebraic immunity of
such functions can be close to their algebraic immunity.
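
The parameters above depend only on the direct sum vector, so they are easy to tabulate. The following sketch (function names are ours) evaluates the three formulas and, as a sanity check, recomputes the nonlinearity by brute force from the truth table.

```python
from math import prod

def parameters(vector):
    """(n, resiliency order, nonlinearity, AI) from the direct sum vector [m_1, ..., m_k]."""
    k = len(vector)
    n = sum(i * m for i, m in enumerate(vector, start=1))
    resiliency = vector[0] - 1
    # max |W_f| = 2^{m_1} * prod_{i>=2} (2^i - 2)^{m_i}; the nonlinearity is 2^{n-1} minus half of it
    max_walsh = 2 ** vector[0] * prod((2 ** i - 2) ** m
                                      for i, m in enumerate(vector, start=1) if i >= 2)
    nl = 2 ** (n - 1) - max_walsh // 2
    ai = min(d + sum(vector[d:]) for d in range(k + 1))   # Relation (10.6)
    return n, resiliency, nl, ai

def truth_table(vector):
    """Direct sum of monomials on consecutive disjoint blocks of variables."""
    n = sum(i * m for i, m in enumerate(vector, start=1))
    monomials, pos = [], 0
    for i, m in enumerate(vector, start=1):
        for _ in range(m):
            monomials.append(((1 << i) - 1) << pos)
            pos += i
    return [sum((x & mono) == mono for mono in monomials) % 2 for x in range(2 ** n)]

def brute_force_nl(tt):
    """Nonlinearity from the fast Walsh transform of (-1)^f."""
    w, h = [1 - 2 * b for b in tt], 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return len(tt) // 2 - max(abs(v) for v in w) // 2

for vec in ([1, 1], [0, 2], [2, 1, 1]):
    print(vec, parameters(vec), brute_force_nl(truth_table(vec)))
```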

10.3.1 Triangular functions


Direct sums of monomials are called triangular functions when their direct sum vector is
the all-1 vector (that is, when they have one monomial of each degree). We assume here also
that all variables are effective. The kth triangular function equals $\bigoplus_{i=1}^{k} \prod_{j=1}^{i} x_{j+i(i-1)/2}$.
Its nonlinearity equals $2^{n-1} - \prod_{i=2}^{k} (2^i - 2)$, according to what we have seen with direct
sums of monomials, and its algebraic immunity equals k, as first observed in [279] (and used
in [839]). This property is easily shown by induction on k since we have seen at page 342
that making the direct sum of a function f and of a monomial of degree AI(f) + 1 gives a
function of algebraic immunity AI(f) + 1.
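
As a small worked instance of these formulas (our own illustration), take k = 3, so that n = 1 + 2 + 3 = 6:

```latex
T_3(x) = x_1 \oplus x_2 x_3 \oplus x_4 x_5 x_6, \qquad
nl(T_3) = 2^{5} - (2^2 - 2)(2^3 - 2) = 32 - 12 = 20, \qquad
AI(T_3) = 3 .
```

Here the value 3 coincides with the maximum $\lceil n/2 \rceil$ possible for a 6-variable function.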

10.4 Monotone functions


An n-variable Boolean function f is (increasing) monotone if, for every $x, y \in \mathbb{F}_2^n$ such
that $x \preceq y$ (i.e., such that supp(x) ⊆ supp(y); see page 32), we have f(x) ≤ f(y).
Any monomial Boolean (multivariate) function $\prod_{i \in I} x_i$ is monotone. Other examples are
threshold functions; see above.

As mentioned in [298, 249], monotone Boolean functions play a role in voting theory (a
voting scheme should be monotone), reliability theory (a system currently working should
not fail when we replace a defective component by an operative one), hypergraphs (the
stability function of a hypergraph, which takes value 1 at x when supp(x) contains at least
one edge, is monotone Boolean), and learning (monotone Boolean functions are easier
to learn). The question addressed here is whether they can also play a role with stream
ciphers (as filter functions), and our conclusion at the end of this section will be essentially
negative.
The balancedness and the algebraic immunity of monotone Boolean functions are
addressed in [298], which also recalls what their ANF is and how they can be constructed.
This reference studies their Walsh spectrum and their nonlinearity, showing that no
monotone bent n-variable function exists for n ≥ 4, and that every monotone n-variable
function f has nonlinearity at most $2^{n-1} - 2^{\frac{n-1}{2}}$ for n ≥ 5 odd. Let us show how these
results are obtained. For every $y \in \mathbb{F}_2^n$ such that f(y) = 0, we have, according to the Poisson
summation formula (2.41), page 59, applied with $a = b = 0_n$ and $E^\perp = \{x \in \mathbb{F}_2^n;\ x \preceq y\}$,
$E = \{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n\}$:

\[ \sum_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f(u) = 2^n, \]

and this implies that $\max_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} |W_f(u)| \ge 2^{w_H(y)}$, since the maximum of a sequence
cannot be smaller than its arithmetic mean. And when f(y) = 1:

\[ \sum_{u \in \mathbb{F}_2^n;\ u \preceq y} (-1)^{1_n \cdot u}\, W_f(u) = -2^n, \]

and this implies that $\max_{u \in \mathbb{F}_2^n;\ u \preceq y} |W_f(u)| \ge 2^{n - w_H(y)}$. Then:

Proposition 153 [298] For every odd n ≥ 5 and every monotone n-variable function f,
we have $nl(f) \le 2^{n-1} - 2^{(n-1)/2}$.
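
The two inequalities derived from the Poisson summation formula, and the bound of Proposition 153, can be observed numerically. The sketch below is only an experiment on a sample of monotone functions in n = 5 variables, each built as the upward closure of a few randomly chosen points; this sampling method and all names are ours.

```python
import random

def upward_closure(n, generators):
    """Monotone function equal to 1 exactly at the points covering some generator."""
    return [int(any((x & g) == g for g in generators)) for x in range(2 ** n)]

def walsh(tt):
    """Fast Walsh transform of (-1)^f."""
    w, h = [1 - 2 * b for b in tt], 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def poisson_bounds_hold(tt, n):
    w = walsh(tt)
    for y in range(2 ** n):
        wt = bin(y).count("1")
        if tt[y] == 0:
            # u runs over the vectors covered by y + 1_n, i.e. with support disjoint from supp(y)
            best = max(abs(w[u]) for u in range(2 ** n) if (u & y) == 0)
            if best < 2 ** wt:
                return False
        else:
            # u runs over the vectors covered by y
            best = max(abs(w[u]) for u in range(2 ** n) if (u & y) == u)
            if best < 2 ** (n - wt):
                return False
    return True

def nonlinearity(tt):
    return len(tt) // 2 - max(abs(v) for v in walsh(tt)) // 2

n = 5
rng = random.Random(0)
samples = [upward_closure(n, [rng.randrange(1, 2 ** n) for _ in range(rng.randrange(1, 5))])
           for _ in range(50)]
print(all(poisson_bounds_hold(tt, n) for tt in samples))
print(max(nonlinearity(tt) for tt in samples), "<=", 2 ** (n - 1) - 2 ** ((n - 1) // 2))
```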

Indeed, the observations above and Relation (3.1), page 79, imply this bound when there
exists y of Hamming weight at least $\frac{n+1}{2}$ such that f(y) = 0, or of Hamming weight at most
$\frac{n-1}{2}$ such that f(y) = 1, and the only case left is when f is the majority function, which has
nonlinearity $2^{n-1} - \binom{n-1}{(n-1)/2}$.
But no general upper bound for n even could be shown. Indeed, only the case where
f(x) differs from the majority function for at least one input x of Hamming weight different
from n/2 can be easily handled similarly. The case where f(x) coincides with the majority
function for every input x of Hamming weight different from n/2 must be handled by other
means. Then [298] only conjectured the upper bound $nl(f) \le 2^{n-1} - 2^{\frac{n}{2}}$ for n even large
enough.
This conjecture was proved in [249]. We give its proof (and this will also prove the
nonexistence of monotone bent functions). According to the observations above, we can
restrict ourselves to the case where n is even and f equals the majority function at every
input x of Hamming weight different from n/2. We can assume f different from the
strict and large majority functions, since the nonlinearity of these two functions, equal to
$2^{n-1} - \binom{n-1}{n/2}$, is smaller than $2^{n-1} - 2^{n/2}$ for n large enough. What makes the proof work is
the second-order Poisson summation formula (see Relation (2.57), page 62):

\[ \sum_{u \in E^\perp} W_f^2(u) = |E^\perp| \sum_{a \in E'} \Big( \sum_{x \in E} (-1)^{f(a+x)} \Big)^2, \tag{10.7} \]

valid for any Boolean function f and supplementary subspaces E and E′ of $\mathbb{F}_2^n$.
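For a given y of Hamming weight n/2 and such that f(y) = 0, let us take $E = \{x \in \mathbb{F}_2^n;\ x \preceq y\}$.
Then $E^\perp = \{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n\}$ is supplementary of E, and we can then take
E′ = $E^\perp$; we obtain, f being null on E since it is monotone:

\[
\begin{aligned}
\sum_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f^2(u)
&= 2^{n/2} \sum_{a \in \mathbb{F}_2^n;\ a \preceq y + 1_n} \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 \\
&= 2^{3n/2} + 2^{n/2} \sum_{a \in \mathbb{F}_2^n;\ a \preceq y + 1_n;\ a \neq 0_n} \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 .
\end{aligned}
\]

Using again that the maximum is bounded below by the mean, we deduce the inequality

\[ \max_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f^2(u) \ \ge\ 2^n + \sum_{a \preceq y + 1_n;\ a \neq 0_n} \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 . \]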

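For $a \preceq y + 1_n$, denoting $w_H(a)$ by j, if $x \preceq y$ has Hamming weight strictly less than
n/2 − j, then a + x has Hamming weight strictly less than n/2 and f(a + x) equals 0,
and if $x \preceq y$ has Hamming weight strictly larger than n/2 − j, then a + x has Hamming
weight strictly larger than n/2 and f(a + x) equals 1. If $x \preceq y$ has Hamming weight
n/2 − j, then a + x has weight n/2 and the value of f(a + x) is unknown. The value of
$\sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)}$ lies then between

\[ \sum_{i=0}^{n/2-1-j} \binom{n/2}{i} - \sum_{i=n/2+1-j}^{n/2} \binom{n/2}{i} - \binom{n/2}{n/2-j}
\quad \text{and} \quad
\sum_{i=0}^{n/2-1-j} \binom{n/2}{i} - \sum_{i=n/2+1-j}^{n/2} \binom{n/2}{i} + \binom{n/2}{n/2-j}. \]

Replacing $\binom{n/2}{i}$ by $\binom{n/2}{n/2-i}$ in the sum $\sum_{i=n/2+1-j}^{n/2} \binom{n/2}{i}$, we obtain $\sum_{i=0}^{j-1} \binom{n/2}{i}$.
Then for j < n/4, we have n/2 − 1 − j ≥ j and

\[ \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 \ge \Big( \sum_{i=j}^{n/2-1-j} \binom{n/2}{i} - \binom{n/2}{n/2-j} \Big)^2 = \Big( \sum_{i=j+1}^{n/2-1-j} \binom{n/2}{i} \Big)^2, \]

and for j > n/4, we have j − 1 ≥ n/2 − j and

\[ \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 \ge \Big( \sum_{i=n/2-j}^{j-1} \binom{n/2}{i} - \binom{n/2}{n/2-j} \Big)^2 = \Big( \sum_{i=n/2-j+1}^{j-1} \binom{n/2}{i} \Big)^2 . \]

We then deduce that

\[
\begin{aligned}
\max_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f^2(u)
&\ge 2^n + \sum_{1 \le j < n/4} \binom{n/2}{j} \Big( \sum_{i=j+1}^{n/2-1-j} \binom{n/2}{i} \Big)^2
 + \sum_{n/4 < j \le n/2} \binom{n/2}{j} \Big( \sum_{i=n/2-j+1}^{j-1} \binom{n/2}{i} \Big)^2 \\
&= 2^n + 2 \sum_{1 \le j < n/4} \binom{n/2}{j} \Big( \sum_{i=j+1}^{n/2-1-j} \binom{n/2}{i} \Big)^2
 + \Big( \sum_{i=1}^{n/2-1} \binom{n/2}{i} \Big)^2 \\
&= 2^n + 2 \sum_{1 \le j < n/4} \binom{n/2}{j} \Big( 2^{n/2} - 2 \sum_{i=0}^{j} \binom{n/2}{i} \Big)^2
 + \big( 2^{n/2} - 2 \big)^2 .
\end{aligned}
\]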
And we have $2 \sum_{1 \le j < n/4} \binom{n/2}{j} \big( 2^{n/2} - 2 \sum_{i=0}^{j} \binom{n/2}{i} \big)^2 + \big( 2^{n/2} - 2 \big)^2 \ge 3 \cdot 2^n$ for every
n ≥ 10, since the expression of n equal to

\[ 2^{-n} \Big[ 2 \sum_{1 \le j < n/4} \binom{n/2}{j} \Big( 2^{n/2} - 2 \sum_{i=0}^{j} \binom{n/2}{i} \Big)^2 + \big( 2^{n/2} - 2 \big)^2 \Big] \]

is nondecreasing and is larger than 3 for n = 10. We deduce then

Proposition 154 [249] For every even n ≥ 10 and every monotone n-variable function f,
we have $nl(f) \le 2^{n-1} - 2^{n/2}$.
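
The threshold n ≥ 10 can be seen by evaluating the expression used in the proof. The sketch below (our own helper, using exact rational arithmetic) computes it for even n; it falls below 3 for n = 8 and exceeds 3 from n = 10 on in the range tested.

```python
from fractions import Fraction
from math import comb

def ratio(n):
    """2^{-n} [ 2 sum_{1<=j<n/4} C(n/2,j) (2^{n/2} - 2 sum_{i<=j} C(n/2,i))^2 + (2^{n/2} - 2)^2 ], n even."""
    h = n // 2
    total = (2 ** h - 2) ** 2
    for j in range(1, h):
        if 4 * j < n:
            inner = 2 ** h - 2 * sum(comb(h, i) for i in range(j + 1))
            total += 2 * comb(h, j) * inner ** 2
    return Fraction(total, 2 ** n)

for n in range(6, 21, 2):
    print(n, float(ratio(n)))
```

When the ratio is at least 3, the maximum of $W_f^2$ is at least $2^n + 3 \cdot 2^n = 2^{n+2}$, which gives the bound of Proposition 154.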

Since $2^{n-1} - 2^{(n-1)/2}$ (n odd) and $2^{n-1} - 2^{n/2}$ (n even) are good nonlinearities for
Boolean functions in n variables, the bounds above do not tell us if monotone Boolean
functions can have good nonlinearity. But a stronger bound, valid for every n, can be proved
as also shown in [249]. Indeed, the inequalities $\max_{u \in \mathbb{F}_2^n} |W_f(u)| \ge 2^{w_H(y)}$ for f(y) = 0 and
$\max_{u \in \mathbb{F}_2^n} |W_f(u)| \ge 2^{n - w_H(y)}$ for f(y) = 1 can be refined by using the second-order Poisson
summation formula (10.7) again.
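– If there exist vectors of Hamming weight strictly larger than n/2 whose image by f
is 0, let then y have maximal Hamming weight (say, w) among all vectors satisfying
f(y) = 0. We have with the same arguments as above:

\[ \max_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f^2(u) \ \ge\ 2^{2w} + \sum_{a \in \mathbb{F}_2^n;\ a \preceq y + 1_n;\ a \neq 0_n} \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 . \tag{10.8} \]

For every $a \preceq y + 1_n$ (of Hamming weight j ≤ n − w), we have f(a + x) = 1 for
every $x \preceq y$ such that a + x has Hamming weight at least w + 1 (that is, for every
$x \preceq y$ of Hamming weight at least w − j + 1), and we deduce
$\sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \le 2^w - 2 \sum_{i=w-j+1}^{w} \binom{w}{i}$. Note that we have
$2^w - 2 \sum_{i=w-j+1}^{w} \binom{w}{i} \le 0$ if and only if $w - j + 1 \le \frac{w}{2}$, that is, $j \ge \frac{w}{2} + 1$. We have

\[ \sum_{a \in \mathbb{F}_2^n;\ a \neq 0_n;\ a \preceq y + 1_n} \Big( \sum_{x \in \mathbb{F}_2^n;\ x \preceq y} (-1)^{f(a+x)} \Big)^2 \ \ge\ \sum_{j = \frac{w}{2} + 1}^{n-w} \binom{n-w}{j} \Big( 2 \sum_{i=w-j+1}^{w} \binom{w}{i} - 2^w \Big)^2 . \]

We deduce then from (10.8) that

\[ \max_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f^2(u) \ \ge\ 2^{2w} + \sum_{j = \frac{w}{2} + 1}^{n-w} \binom{n-w}{j} \Big( 2^w - 2 \sum_{i=0}^{w-j} \binom{w}{i} \Big)^2 . \]

Denoting 2w = n + k (where k > 0 has the same parity as n), we have then

\[ \max_{u \in \mathbb{F}_2^n;\ u \preceq y + 1_n} W_f^2(u) \ \ge\ 2^{n+k} + \sum_{j = \lceil \frac{n+k}{4} \rceil + 1}^{\frac{n-k}{2}} \binom{\frac{n-k}{2}}{j} \Big( 2^{\frac{n+k}{2}} - 2 \sum_{i=0}^{\frac{n+k}{2} - j} \binom{\frac{n+k}{2}}{i} \Big)^2 . \]

Hence, we have

\[ nl(f) \ \le\ 2^{n-1} - \frac{1}{2} \sqrt{ 2^{n+k} + \sum_{j = \lceil \frac{n+k}{4} \rceil + 1}^{\frac{n-k}{2}} \binom{\frac{n-k}{2}}{j} \Big( 2^{\frac{n+k}{2}} - 2 \sum_{i=0}^{\frac{n+k}{2} - j} \binom{\frac{n+k}{2}}{i} \Big)^2 } . \]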

– If there exist vectors of Hamming weight smaller than n/2 and whose image by f equals
1, let y have minimal Hamming weight w such that f(y) = 1 (w < n/2). Applying the
upper bound above to the monotone function $f(x + 1_n) \oplus 1$, whose nonlinearity equals
that of f, and denoting $w' = n - w = \frac{n+k'}{2}$, where k′ > 0 has the same parity as n, we
have

\[ nl(f) \ \le\ 2^{n-1} - \frac{1}{2} \sqrt{ 2^{n+k'} + \sum_{j = \lceil \frac{n+k'}{4} \rceil + 1}^{\frac{n-k'}{2}} \binom{\frac{n-k'}{2}}{j} \Big( 2^{\frac{n+k'}{2}} - 2 \sum_{i=0}^{\frac{n+k'}{2} - j} \binom{\frac{n+k'}{2}}{i} \Big)^2 } . \]

– If none of the two cases above happens, then f coincides with the majority function
at every input x of Hamming weight different from n/2 and either (i) f is a majority
function and nl(f) equals then $2^{n-1} - \binom{n-1}{n/2}$ if n is even and $2^{n-1} - \binom{n-1}{(n-1)/2}$
if n is odd, or (ii) n is even and $nl(f) \le 2^{n-1} - \frac{1}{2}\sqrt{A}$, where A equals

\[ 2^n + 2 \sum_{1 \le j < n/4} \binom{n/2}{j} \Big( 2^{n/2} - 2 \sum_{i=0}^{j} \binom{n/2}{i} \Big)^2 + \big( 2^{n/2} - 2 \big)^2 . \]

We deduce:

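Theorem 24 [249] For every n and every monotone n-variable function f, we have
$nl(f) \le 2^{n-1} - \frac{1}{2}\sqrt{M}$, where M = min(A, B, C) if n is even and M = min(B, C) if n
is odd, with

\[ A = 2^n + 2 \sum_{1 \le j < n/4} \binom{n/2}{j} \Big( 2^{n/2} - 2 \sum_{i=0}^{j} \binom{n/2}{i} \Big)^2 + \big( 2^{n/2} - 2 \big)^2 , \]

\[ B = \min_{\substack{1 \le k \le n/2 \\ n+k \text{ even}}} \left( 2^{n+k} + \sum_{j = \lceil \frac{n+k}{4} \rceil + 1}^{\frac{n-k}{2}} \binom{\frac{n-k}{2}}{j} \Big( 2^{\frac{n+k}{2}} - 2 \sum_{i=0}^{\frac{n+k}{2} - j} \binom{\frac{n+k}{2}}{i} \Big)^2 \right), \]

and $C = \Big( 2 \binom{n-1}{\lfloor n/2 \rfloor} \Big)^2$.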

The behavior of A, B, and C when n tends to infinity is studied in [249] and shows that
min(A, B, C) is asymptotically equivalent to an expression of n at least equal to $2^{\frac{3 n \lambda_n}{2}}$ for
some λn tending to 1. Tables are given, indicating for each value of n between 4 and 31 the
value given by the upper bound of Theorem 24. These tables confirm that the nonlinearity of
monotone Boolean functions is bad (much worse than what was suggested by the upper
bounds obtained, resp. conjectured, in [298]). This shows that the rather large class of
monotone Boolean functions contains no element that could be used as a nonlinear function
in a cryptosystem.
11 Highly nonlinear vectorial functions with low differential uniformity

A large nonlinearity is one of the most important criteria for vectorial functions, valid for
all uses in stream and block ciphers. Nonlinearity is not the only parameter quantifying the
difference in behavior between a vectorial function and affine functions, but it is the most
important. According to Dib’s results [436], the average nonlinearity of vectorial functions
is not bad.
Differential uniformity has the same importance as nonlinearity but is specific to S-boxes
in block ciphers. According to Voloch’s results [1098], the average differential uniformity
of (n, n)-functions is bad, and this is probably also the case for (n, m)-functions. The
relationship between nonlinearity and differential uniformity is not completely clarified.
For instance, as seen at page 136, there exist vectorial functions with good nonlinearity
and bad differential uniformity and vice versa, but most known functions with optimal
differential uniformity have good nonlinearity. Further work is needed to understand better
this relationship. But the work done in general on the study of S-boxes (see a survey in
[94]) is significant and has had important practical applications. The design of the AES has
taken advantage of the studies (in particular by K. Nyberg) on the notions of nonlinearity
and differential uniformity. This has made it possible in the AES to use S-boxes working
on bytes (at the time, it would not have been possible to find a good 8-bit-to-8-bit S-box
by a computer search as this had been done for the 6-bit-to-4-bit S-boxes of the DES). We
recommend the book [141].
We briefly recall the main information given in Subsection 3.2.3, page 115. The
nonlinearity nl(F) of an (n, m)-function F is the minimum Hamming distance between
all component functions of F and all affine functions in n variables:

\[ nl(F) = 2^{n-1} - \frac{1}{2} \max_{v \in \mathbb{F}_2^m \setminus \{0_m\};\ u \in \mathbb{F}_2^n} |W_F(u, v)| . \]

Nonlinearity quantifies the contribution of functions to the resistance against linear attacks,
when they are used as S-boxes in block ciphers, and partly against fast correlation attacks,
when they are used as filters or combiners in stream ciphers.
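
The definition translates directly into a (slow but simple) computation for small S-boxes given as lookup tables. In the sketch below, which is ours, the example is the cube map on $\mathbb{F}_{2^3}$, with field elements encoded as integers and $X^3 + X + 1$ taken as the defining polynomial; both choices are illustrative assumptions.

```python
def gf_mul(a, b, n, poly):
    """Product of a and b in F_{2^n} = F_2[X]/(poly), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
        b >>= 1
    return r

def parity(x):
    return bin(x).count("1") % 2

def vectorial_nonlinearity(sbox, n, m):
    """nl(F) = 2^(n-1) - (1/2) max over v != 0 and all u of |W_F(u, v)|."""
    best = 0
    for v in range(1, 2 ** m):
        for u in range(2 ** n):
            w = sum((-1) ** (parity(v & sbox[x]) ^ parity(u & x)) for x in range(2 ** n))
            best = max(best, abs(w))
    return 2 ** (n - 1) - best // 2

n, poly = 3, 0b1011
cube = [gf_mul(gf_mul(x, x, n, poly), x, n, poly) for x in range(2 ** n)]
print(cube, vectorial_nonlinearity(cube, n, n))
```

The double loop costs on the order of $2^{2n+m}$ operations; a fast Walsh transform per component would be preferable beyond toy sizes.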
We have seen that the nonlinearity is a CCZ invariant. In particular, if n = m and if F is
a permutation, then F and its inverse F −1 have the same nonlinearity.
We have also seen at page 160 the relationship between the maximal possible nonlinearity
of (n, m)-functions and the possible parameters of the linear supercodes of the Reed–Muller
code of order 1. Existence and nonexistence results¹ on highly nonlinear vectorial functions
are deduced in [1099].

11.1 The covering radius bound; bent/perfect nonlinear functions


As seen at page 117, the covering radius bound is valid for every (n, m)-function:

\[ nl(F) \le 2^{n-1} - 2^{\frac{n}{2} - 1}, \tag{11.1} \]

and an (n, m)-function is called bent if it achieves the covering radius bound (11.1) with
equality.
The notion of bent vectorial function is invariant under CCZ equivalence² (since the
nonlinearity is), but we have seen at Subsection 6.4, page 269, that CCZ equivalence
coincides with EA equivalence for bent vectorial functions. We have also seen that an
(n, m)-function is bent if and only if all the component functions v · F, $v \neq 0_m$, of F
are bent and that bent (n, m)-functions exist if and only if n is even and $m \le \frac{n}{2}$. Recall also
that an (n, m)-function is bent if and only if all its derivatives $D_aF(x) = F(x) + F(x + a)$,
$a \in \mathbb{F}_2^n \setminus \{0_n\}$, are balanced, that is, “bent” and “perfect nonlinear (PN)” are equivalent. Bent
vectorial functions contribute then also to an optimal resistance to the differential attack of
those cryptosystems in which they are involved (but they are not balanced). They can be
used to design authentication schemes (or codes); see [346].
Thanks to the observations made in Subsection 2.3.7 (where we saw that the evaluation
of the multidimensional Walsh transform corresponds in fact to the evaluation of the Walsh
transform), it is a simple matter to characterize bent functions as those functions whose
squared expression of the multidimensional Walsh transform at L is the same for every L.
Note that if a bent (n, m)-function F is normal in the sense that it is null on (say) an
$\frac{n}{2}$-dimensional vector space E, then F is balanced on any translate of E. Indeed, for every
$v \neq 0_m$ in $\mathbb{F}_2^m$ and every $u \in \mathbb{F}_2^n \setminus E$, the function v · F is balanced on u + E.
We have recalled at Subsections 6.1.15 and 6.1.16 what are the known primary and
secondary constructions of bent functions.

11.2 The Sidelnikov–Chabaud–Vaudenay bound


We have seen with Theorem 6, page 118, that a better upper bound than the covering radius
bound exists for (n, n)-functions:

\[ nl(F) \le 2^{n-1} - 2^{\frac{n-1}{2}}, \]

and that the functions that achieve it with equality (for n necessarily odd) are called almost
bent (AB). There exists a bound on the algebraic degree of AB functions, similar to the
bound for bent functions:

Proposition 155 [257] Let F be any (n, n)-function (n ≥ 3, odd). If F is AB, then the
algebraic degree of F is less than or equal to (n + 1)/2.
¹ Using the linear programming bound due to Delsarte.
² But the number of bent components of general (n, m)-functions is not.

This is a direct consequence of the fact that the Walsh transform of any function v · F is
divisible by $2^{\frac{n+1}{2}}$ and of Theorem 2, page 63. The bound is tight; it is achieved with equality
for instance by the inverse of $x^3$.
Note that the divisibility plays also a role with respect to the algebraic degree of the
composition of two vectorial functions: it has been proved in [204] (as we recalled in a remark
at page 64) that, if the Walsh transform values of a vectorial function $F : \mathbb{F}_2^n \to \mathbb{F}_2^n$ are
divisible by $2^k$, then, for every vectorial function $G : \mathbb{F}_2^n \to \mathbb{F}_2^n$, the algebraic degree of the
composite function G ∘ F is at most equal to the algebraic degree of G plus n − k. This means
that using AB functions as S-boxes in block ciphers may not be a good idea (suboptimal
functions such as the multiplicative inverse function, see Chapter 11, may be better, as often in
cryptography).
Remark. There is a big gap between the best possible nonlinearity $2^{n-1} - 2^{\frac{n-1}{2}}$ of (n, n)-
functions for n odd, achieved by AB functions (see examples below), and the best-known
nonlinearity $2^{n-1} - 2^{n/2}$ of (n, n)-functions for n even, which is achieved (see below)
by the Gold APN functions, the Kasami APN functions, and the multiplicative inverse
function $x^{2^n - 2}$ (n odd). The gap could seem not so important, but it is, since what matters
for the complexity of attacks by linear approximation is not the value of nl(F) but the
value of $\frac{2^{n-1} - nl(F)}{2^{n-1}}$. Finding functions with better nonlinearity (and still more relevantly to
cryptography, with better nonlinearity and good differential uniformity) or proving that such
functions do not exist is an open question.

We recall now the definition of the differential uniformity of an (n, m)-function F (see
Definition 40, page 135):

\[ \delta_F = \max_{\substack{a \in \mathbb{F}_2^n,\ b \in \mathbb{F}_2^m \\ a \neq 0_n}} |\{ x \in \mathbb{F}_2^n;\ D_aF(x) = b \}| \]

is the maximum number of ordered pairs of distinct elements of the graph
$G_F = \{(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^m;\ y = F(x)\}$ of F whose sum equals some value
$(a, b) \in (\mathbb{F}_2^n \setminus \{0_n\}) \times \mathbb{F}_2^m$. The
smaller $\delta_F$, the better the contribution of F to the resistance to differential cryptanalysis. For
every (n, m)-function F, we have $\delta_F \ge 2^{n-m}$ (as observed by Nyberg) with equality if and
only if F is perfect nonlinear (which can exist if and only if n is even and m ≤ n/2), and
when m ≥ n, the smallest possible value of $\delta_F$ is 2, since $\delta_F$ is always even.
We have seen that the differential uniformity is a CCZ invariant (and here also, if n = m
and if F is a permutation, then F and its inverse F −1 have the same differential uniformity).
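
The definition of $\delta_F$ recalled above can be evaluated by brute force for small n. The following is a minimal sketch, not a method taken from the text: it assumes that $\mathbb{F}_{2^3}$ is represented as $\mathbb{F}_2[x]/(x^3 + x + 1)$ with elements stored as 3-bit integers, and the helper names `gf_mul` and `differential_uniformity` are hypothetical.

```python
from collections import Counter

def gf_mul(a, b, n=3, poly=0b1011):
    """Multiply a and b in F_{2^n} = F_2[x]/(poly); the default modulus is x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:          # degree reached n: reduce modulo poly
            a ^= poly
    return r

def differential_uniformity(table, n):
    """Max over a != 0 and over b of |{x : F(x) + F(x + a) = b}|, F given as a lookup table."""
    best = 0
    for a in range(1, 1 << n):
        counts = Counter(table[x] ^ table[x ^ a] for x in range(1 << n))
        best = max(best, max(counts.values()))
    return best

cube = [gf_mul(x, gf_mul(x, x)) for x in range(8)]   # x -> x^3 on F_8
identity = list(range(8))                             # a linear function, worst case
print(differential_uniformity(cube, 3), differential_uniformity(identity, 3))
```

Under these assumptions the cube function reaches the minimum value 2, while a linear function reaches the maximum $2^n$.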

11.3 Almost perfect nonlinear and almost bent functions


We have seen in Definition 41, page 137, that differentially 2-uniform (n, n)-functions are
called almost perfect nonlinear (in brief, APN) and contribute to a maximal resistance to
differential cryptanalysis.
AB functions contribute to a maximal resistance to both linear and differential cryptanal-
yses; indeed, according to the proof of the SCV bound and as observed by Chabaud and
Vaudenay:
372 Highly nonlinear vectorial functions with low differential uniformity

Proposition 156 For every n odd, AB (n, n)-functions are APN.

The converse of Proposition 156 is false in general; it is true for quadratic functions in odd
dimension [257] and in more general cases that we shall see at page 382. The implication of
Proposition 156 can be more precisely changed into a characterization of AB functions:

Proposition 157 Any vectorial function F : Fn2 → Fn2 is AB if and only if F is APN and
plateaued with single amplitude (see Definition 67, page 274).

This comes directly from Relations (3.22) and (3.25), page 118. We shall see in
Proposition 163, page 382, that if n is odd, the condition “with the same amplitude” is
in fact not necessary.
AB functions exist for every odd n ≥ 3. APN functions exist for every n ≥ 2. Function $F(x) = x^3$, $x \in \mathbb{F}_{2^n}$, is an example; others will be given below.
According to Relations (3.24) and (3.25), and to the two lines following them, APN (n, n)-functions F are characterized³ by the fact that the power sum of degree 4 of the values of their Walsh transform is minimal:
\[
\sum_{v \in \mathbb{F}_2^n,\ u \in \mathbb{F}_2^n} W_F^4(u, v) = 3 \cdot 2^{4n} - 2 \cdot 2^{3n} \tag{11.2}
\]
or equivalently, replacing $\sum_{u \in \mathbb{F}_2^n} W_F^4(u, 0_n)$ by its value $2^{4n}$:

Theorem 25 [341] Any (n, n)-function F is APN if and only if
\[
\sum_{v \in \mathbb{F}_2^n \setminus \{0_n\},\ u \in \mathbb{F}_2^n} W_F^4(u, v) = 2^{3n+1}(2^n - 1), \tag{11.3}
\]
which is the minimal possible value of this sum for all (n, n)-functions.

We have seen at page 111 that this implies that the Walsh support of APN (n, n)-functions has size at least $1 + (2^n - 1)\, 2^{n-1}$.
Using Relation (3.10), page 98, F is then APN if and only if $\sum_{v \in \mathbb{F}_2^n \setminus \{0_n\}} \mathcal{V}(v \cdot F) = 2^{2n+1}(2^n - 1)$. In fact, as observed in [910], F is APN if and only if, for every $a \in \mathbb{F}_2^n \setminus \{0_n\}$, the sum $\sum_{v \in \mathbb{F}_2^n} \mathcal{F}^2(D_a(v \cdot F)) = 2^n \left|\{(x, y) \in (\mathbb{F}_2^n)^2 ;\ D_a F(x) = D_a F(y)\}\right|$ equals $2^{2n+1}$ (i.e., is minimal), and Theorem 25 can also be referred to [910].


Using Parseval’s relation (3.23) and Relation (11.3), any (n, n)-function F is APN if and only if
\[
\sum_{v \in \mathbb{F}_2^n \setminus \{0_n\},\ u \in \mathbb{F}_2^n} W_F^2(u, v)\left(W_F^2(u, v) - 2^{n+1}\right) = 0. \tag{11.4}
\]

This characterization will have nice consequences in the sequel.
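
Relations (11.3) and (11.4) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the table `CUBE` is $x \mapsto x^3$ on $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(x^3+x+1)$ with elements written as 3-bit integers, the inner product is the usual bitwise dot product, and all function names are hypothetical.

```python
CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8 = F_2[x]/(x^3 + x + 1)
N = 3

def dot(a, b):
    """Inner product of two bit vectors packed as integers."""
    return bin(a & b).count("1") & 1

def walsh(table, u, v):
    """W_F(u, v) = sum over x of (-1)^(v.F(x) + u.x)."""
    return sum((-1) ** (dot(v, table[x]) ^ dot(u, x)) for x in range(len(table)))

def fourth_moment(table, n):
    """Left-hand side of (11.3): sum of W_F^4 over v != 0 and all u."""
    return sum(walsh(table, u, v) ** 4 for v in range(1, 1 << n) for u in range(1 << n))

def relation_11_4(table, n):
    """Left-hand side of (11.4)."""
    total = 0
    for v in range(1, 1 << n):
        for u in range(1 << n):
            w2 = walsh(table, u, v) ** 2
            total += w2 * (w2 - 2 ** (n + 1))
    return total

apn_value = 2 ** (3 * N + 1) * (2 ** N - 1)
print(fourth_moment(CUBE, N), apn_value, relation_11_4(CUBE, N))
```

For a linear function such as the identity, the fourth moment is strictly larger than `apn_value`, in line with the minimality statement of Theorem 25.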


3 This characterization is equivalent to a characterization due to Helleseth [592] in the framework of sequences.

It is easily shown, as in the proof of the SCV bound, that for every (n, n)-function, the power sum of degree 3
\[
\sum_{v \in \mathbb{F}_2^n,\ u \in \mathbb{F}_2^n} \left( \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x} \right)^3
\]
equals
\[
2^{2n} \left| \{(x, y) \in \mathbb{F}_2^{2n} ;\ F(x) + F(y) + F(x + y) = 0_n\} \right|.
\]
Applying (with $z = 0_n$) the property that, for every APN function F, the relation $F(x) + F(y) + F(z) + F(x + y + z) = 0_n$ can be achieved only when x = y or x = z or y = z, we have then, for every APN function such that $F(0_n) = 0_n$:
\[
\sum_{v \in \mathbb{F}_2^n,\ u \in \mathbb{F}_2^n} W_F^3(u, v) = 3 \cdot 2^{3n} - 2 \cdot 2^{2n}. \tag{11.5}
\]
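
The degree 3 power sum and Relation (11.5) can be illustrated in the same way. This is a sketch only, assuming the table of $x \mapsto x^3$ on $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(x^3+x+1)$ and the bitwise dot product; the names `third_moment` and `zero_sum_pairs` are hypothetical.

```python
CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8, with F(0) = 0
N = 3
SIZE = 1 << N

def dot(a, b):
    return bin(a & b).count("1") & 1

def walsh(table, u, v):
    return sum((-1) ** (dot(v, table[x]) ^ dot(u, x)) for x in range(SIZE))

def third_moment(table):
    """Sum of W_F^3(u, v) over all u and v (including v = 0)."""
    return sum(walsh(table, u, v) ** 3 for u in range(SIZE) for v in range(SIZE))

def zero_sum_pairs(table):
    """Number of (x, y) with F(x) + F(y) + F(x + y) = 0."""
    return sum(1 for x in range(SIZE) for y in range(SIZE)
               if table[x] ^ table[y] ^ table[x ^ y] == 0)

print(third_moment(CUBE), 2 ** (2 * N) * zero_sum_pairs(CUBE), 3 * 2 ** (3 * N) - 2 * 2 ** (2 * N))
```

The first two printed numbers agree for any table (this is the general identity), and the third agrees as well here because the cube function is APN and vanishes at 0.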

But this property is not characteristic (except for plateaued functions; see below) of APN functions among those (n, n)-functions such that $F(0_n) = 0_n$, since it is only characteristic of the fact that $\sum_{x \in E} F(x) \neq 0_n$ for every two-dimensional vector subspace E of $\mathbb{F}_2^n$ (which is more restrictive than for every two-dimensional flat).
As already seen at page 111, the spectral complexity of an APN function satisfies
\[
\left|\{(u, v) \in \mathbb{F}_2^n \times \mathbb{F}_2^n ;\ W_F(u, v) \neq 0\}\right| \geq \frac{2^{4n}}{3 \cdot 2^{2n} - 2^{n+1}} \approx \frac{2^{2n}}{3}.
\]
Note that for every APN function F, we have
\[
\left|\{(a, b) \in (\mathbb{F}_2^n)^2,\ a \neq b ;\ F(a) = F(b)\}\right| \leq 2 \cdot (2^n - 1)
\]
since $F(a) = F(b)$ is equivalent to $D_{a+b} F(a) = 0_n$.
Hence, we have $\left|\{(a, b) \in (\mathbb{F}_2^n)^2 ;\ F(a) = F(b)\}\right| = \sum_{z \in \mathbb{F}_2^n} |F^{-1}(z)|^2 \leq 3 \cdot 2^n - 2$ and therefore $|F^{-1}(z)| \leq \left\lfloor \sqrt{3 \cdot 2^n - 2} \right\rfloor \leq 2^{n/2+1}$, for every $z \in \mathbb{F}_2^n$.
We have seen at page 137 the different ways of expressing that a function is APN. It is
observed in [71, theorem 3] (recalled in [94] and slightly modified in [353]) that, given any
linear hyperplane H in Fn2 and any (n, n)-function F , the necessary property (for F to be
APN) that Da F is 2-to-1 when a is nonzero and belongs to H is also sufficient. Let us give
a simple proof: suppose that F is not APN, then there exists an affine plane P in Fn2 , say
P = u + E, where E is a linear plane, on which F is affine (see page 137). The direction E
of P contains at least one nonzero element a of H , because dim E + dim H > n; then Da F
is not 2-to-1, a contradiction.
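
The hyperplane criterion just proved can be tested experimentally for n = 3. The sketch below is an illustration under assumptions that are not in the text: the hyperplane H is taken to be the set of vectors whose lowest bit is 0, functions are lookup tables of 3-bit integers, and the sample of random functions and all names are hypothetical.

```python
import random
from collections import Counter

def derivative_is_2_to_1(table, a):
    """True if every value of x -> F(x) + F(x + a) is taken exactly twice."""
    counts = Counter(table[x] ^ table[x ^ a] for x in range(len(table)))
    return all(c == 2 for c in counts.values())

def is_apn(table, n):
    """Full test: all nonzero directions a."""
    return all(derivative_is_2_to_1(table, a) for a in range(1, 1 << n))

def is_apn_on_hyperplane(table, n):
    """Restricted test: only nonzero a in H = {a : lowest bit of a is 0}."""
    return all(derivative_is_2_to_1(table, a) for a in range(2, 1 << n, 2))

CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8 = F_2[x]/(x^3 + x + 1)
rng = random.Random(0)
samples = [CUBE] + [[rng.randrange(8) for _ in range(8)] for _ in range(200)]
print(all(is_apn(f, 3) == is_apn_on_hyperplane(f, 3) for f in samples))
```

The restricted test examines only $2^{n-1} - 1$ directions instead of $2^n - 1$, which roughly halves the work of an exhaustive APN check.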
We have seen at page 278 that a subclass of APN functions (and superclass of AB
quadratic permutations), called crooked functions, has been considered in [57], further
studied in [172, 410, 726], and generalized in [80, 727, 729]. There are only two known cases
of crooked functions corresponding to the original definition: Gold power AB functions and
the class of quadratic AB binomials constructed in [151, 158]. All known crooked functions
in the larger sense are quadratic APN, and we have several constructions of them. Among
the known 487 quadratic AB functions over $\mathbb{F}_{2^7}$, only Gold functions are CCZ equivalent to
permutations (among AB functions, permutations are rare). It can be proved [728] that every
power crooked function is a Gold function (see the definition below).
The maximal algebraic degree of APN functions is unknown: for n odd, it is probably n − 1 (achieved by $x^{2^n-2}$), but it is unproven that it is not n, and for n even, it is still more undetermined. All known APN functions (see pages 395 and 400) have algebraic degree at
most n − 1. It has been proved in [156], thanks to characterizations by means of derivatives
and power moments of the Walsh transform, that APN functions of algebraic degree n do
not exist for n ≥ 3 within the classes of power functions modified at input 0 (and the
nonexistence for power functions modified in one point was checked by computer for n ≤
13) and of plateaued functions modified in one point. See more in [153, 654], and in [167],
where the notion of APNness is weakened (differently from [24]).

11.3.1 Other characterizations of AB and APN functions


We have seen above the main characterizations, but others exist:

Characterization by the degrees of univariate polynomials


An (n, n)-function F, given in univariate form, is APN if and only if, for every $a \in \mathbb{F}_{2^n}^*$ and every $b \in \mathbb{F}_{2^n}$, the polynomial $\gcd(x^{2^n} + x,\ F(x) + F(x + a) + b)$ has degree at most 2 (that is, has degree 0 or 2). Indeed, $x^{2^n} + x$ splits completely over $\mathbb{F}_{2^n}$ and its roots, all simple, are all the elements of $\mathbb{F}_{2^n}$. The polynomial $P(x) = F(x) + F(x + a) + b$ has then a number of zeros in $\mathbb{F}_{2^n}$ equal to the degree of $Q(x) = \gcd(P(x),\ x^{2^n} + x)$. The degree of $Q(x)$ is 2, that is, the equation $F(x) + F(x + a) = b$ has solutions, if and only if $\gamma_F(a, b) = 1$, where $\gamma_F$ has been defined at page 229 and will be studied more in detail in Proposition 158 below.

Remark. If F is a quadratic (n, n)-function, the equation $F(x) + F(x + a) = b$ is a linear equation. It admits then at most two solutions for every nonzero a and every b if and only if the related homogeneous equation $F(x) + F(x + a) + F(0_n) + F(a) = 0_n$ admits at most two solutions for every nonzero a. We shall see that this generalizes to plateaued functions. In the case of a quadratic function, F is APN if and only if the associated bilinear symmetric (2n, n)-function $\beta_F(x, y) = F(0_n) + F(x) + F(y) + F(x + y)$ never vanishes when x and y are $\mathbb{F}_2$-linearly independent vectors of $\mathbb{F}_2^n$. For functions of higher degrees, the fact that $\beta_F(x, y)$ (which is no longer bilinear) never vanishes when x and y are linearly independent is only necessary for APNness (sufficient for plateaued functions).

Characterization by the ANF


By definition, an (n, n)-function is APN if and only if, for every nonzero $a \in \mathbb{F}_2^n$,
\[
\delta_0\big(F(x) + F(x + a) + F(y) + F(y + a)\big) \oplus \delta_0(x + y) \oplus \delta_0(x + y + a) \equiv 0
\]
(where $\equiv 0$ means “equals the zero function”), where $\delta_0(z) = \prod_{i=1}^{n} (z_i \oplus 1)$ is the Dirac (or Kronecker) symbol. Indeed, this equation expresses that $F(x) + F(x + a) = F(y) + F(y + a)$ if and only if x = y or x = y + a. Equivalently, denoting by $H_a$ any linear hyperplane excluding a, function $D_a F$ is injective on $H_a$, that is:
\[
1_{H_a}(x)\, 1_{H_a}(y) \Big[ \delta_0\big(F(x) + F(x + a) + F(y) + F(y + a)\big) \oplus \delta_0(x + y) \Big] \equiv 0.
\]
These identities, when considered as multivariate polynomial equalities, need to be viewed in $\mathbb{F}_2[x, y]/(x_i^2 + x_i,\ y_i^2 + y_i ;\ i = 1, \dots, n)$.
They can also be considered as univariate identities over $\mathbb{F}_{2^n}$, where $\delta_0(z) = 1 + z^{2^n-1}$, and they need then to be reduced modulo $x^{2^n} + x$ and modulo $y^{2^n} + y$ before being checked as identically zero.

Characterization by the ANFs of affine equivalent functions


A necessary condition dealing with quadratic terms in the ANF of any APN function has been observed in [71]. Given any APN function F (quadratic or not), every quadratic term $x_i x_j$ ($1 \leq i < j \leq n$) must appear with a nonnull coefficient in the algebraic normal form of F. Indeed, we know that the coefficient of any monomial $\prod_{i \in I} x_i$ in the ANF of F equals $a_I = \sum_{x \in \mathbb{F}_2^n ;\ supp(x) \subseteq I} F(x)$ (this sum being calculated in $\mathbb{F}_2^n$). Applied for instance to I = {n − 1, n}, this gives $a_I = F(0, \dots, 0, 0, 0) + F(0, \dots, 0, 0, 1) + F(0, \dots, 0, 1, 0) + F(0, \dots, 0, 1, 1)$, and F being APN, this vector cannot be null. Note that, since the notion of almost perfect nonlinearity is affine invariant (see below), this condition must be satisfied by all of the functions $L' \circ F \circ L$, where $L$ and $L'$ are affine automorphisms of $\mathbb{F}_2^n$. Extended this way (i.e., writing that all degree 2 terms have nonnull coefficients in the ANF of every affinely equivalent function), the condition becomes necessary and sufficient (indeed, for every distinct x, y, z in $\mathbb{F}_2^n$, there exists an affine automorphism L of $\mathbb{F}_2^n$ such that $L(0, \dots, 0, 0, 0) = x$, $L(0, \dots, 0, 1, 0) = y$ and $L(0, \dots, 0, 0, 1) = z$; so the condition tells that $\sum_{x \in P} F(x)$ is nonzero for every two-dimensional affine space P).
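
The necessary condition on the quadratic terms of the ANF can be computed directly from the formula for $a_I$. The following sketch makes assumptions that are not in the text: F is a lookup table on 3-bit integers, bit i of an integer plays the role of the coordinate $x_{i+1}$, and the helper names are hypothetical.

```python
from itertools import combinations

def anf_coefficient(table, support_mask):
    """Coefficient a_I of prod_{i in I} x_i: XOR of F(x) over all x with supp(x) inside I."""
    coeff = 0
    x = support_mask
    while True:                       # enumerate every submask of support_mask, including 0
        coeff ^= table[x]
        if x == 0:
            break
        x = (x - 1) & support_mask
    return coeff

def quadratic_coefficients(table, n):
    """Map each pair (i, j), i < j, to the vector coefficient of x_i x_j."""
    return {(i, j): anf_coefficient(table, (1 << i) | (1 << j))
            for i, j in combinations(range(n), 2)}

CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8 = F_2[x]/(x^3 + x + 1)
coeffs = quadratic_coefficients(CUBE, 3)
print(coeffs, all(c != 0 for c in coeffs.values()))
```

This checks the condition for F itself only; the necessary and sufficient version requires the same test for every affinely equivalent function.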

Characterizations by the Hamming weight and the bentness of associated Boolean functions

The properties of APNness and ABness can be translated in terms of Boolean functions, as
observed in [257] and already encountered at page 229:

Proposition 158 Let F be any (n, n)-function. For every $a, b \in \mathbb{F}_2^n$, let $\gamma_F(a, b)$ equal 1 if the equation $F(x) + F(x + a) = b$ admits solutions, with $a \neq 0_n$. Otherwise, let $\gamma_F(a, b)$ be null. Then:
1. F is APN if and only if $\gamma_F$ has Hamming weight $2^{2n-1} - 2^{n-1}$, and we have then, for every $u, v \in \mathbb{F}_2^n$:
\[
W_{\gamma_F}(u, v) = \begin{cases} 2^n & \text{if } (u, v) = (0_n, 0_n) \\ 2^n - W_F^2(u, v) & \text{otherwise.} \end{cases}
\]
2. F is AB if and only if $\gamma_F$ is bent. The dual of $\gamma_F$ is then the indicator of the Walsh support of F, deprived of $(0_n, 0_n)$.

Proof
1. If F is APN, then for every $a \neq 0_n$, the mapping $x \mapsto F(x) + F(x + a)$ is 2-to-1 (that is, the size of the preimage of any vector equals 0 or 2). Hence, $\gamma_F$ has Hamming weight $2^{2n-1} - 2^{n-1}$. The converse is also straightforward.
We assume now that F is APN. We have $W_{\gamma_F}(0_n, 0_n) = 2^{2n} - 2 w_H(\gamma_F) = 2^n$. For $(u, v) \neq (0_n, 0_n)$, we have
\[
\begin{aligned}
W_{\gamma_F}(u, v) = -2\, \widehat{\gamma_F}(u, v) &= - \sum_{a \neq 0_n,\ x \in \mathbb{F}_2^n} (-1)^{u \cdot a \,\oplus\, v \cdot (F(x) + F(x + a))} \\
&= 2^n - \sum_{a, x \in \mathbb{F}_2^n} (-1)^{u \cdot a \,\oplus\, v \cdot (F(x) + F(x + a))} \\
&= 2^n - \sum_{x, y \in \mathbb{F}_2^n} (-1)^{u \cdot (x + y) \,\oplus\, v \cdot (F(x) + F(y))} = 2^n - W_F^2(u, v).
\end{aligned}
\]
2. We deduce that F is AB if and only if $W_{\gamma_F}(u, v) = \pm 2^n$ for every $(u, v) \in \mathbb{F}_2^n \times \mathbb{F}_2^n$, i.e., $\gamma_F$ is bent. Then for every $(u, v) \neq (0_n, 0_n)$, we have $\widetilde{\gamma_F}(u, v) = 0$, that is, $W_{\gamma_F}(u, v) = 2^n$ if and only if $W_F(u, v) = 0$. Hence, the dual of $\gamma_F$ is the indicator of the Walsh support of F, deprived of $(0_n, 0_n)$.
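
Proposition 158 can be illustrated numerically. The sketch below assumes the table of $x \mapsto x^3$ on $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(x^3+x+1)$ (an AB function for n = 3) and the bitwise dot product; the helper names are hypothetical.

```python
CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8
N = 3
SIZE = 1 << N

def dot(a, b):
    return bin(a & b).count("1") & 1

def gamma(table):
    """gamma_F(a, b) = 1 iff a != 0 and F(x) + F(x + a) = b has a solution."""
    g = [[0] * SIZE for _ in range(SIZE)]
    for a in range(1, SIZE):
        for x in range(SIZE):
            g[a][table[x] ^ table[x ^ a]] = 1
    return g

def walsh_F(table, u, v):
    return sum((-1) ** (dot(v, table[x]) ^ dot(u, x)) for x in range(SIZE))

def walsh_gamma(g, u, v):
    """Walsh transform of the 2n-variable Boolean function gamma_F."""
    return sum((-1) ** (g[a][b] ^ dot(u, a) ^ dot(v, b))
               for a in range(SIZE) for b in range(SIZE))

g = gamma(CUBE)
weight = sum(map(sum, g))
spectrum = {(u, v): walsh_gamma(g, u, v) for u in range(SIZE) for v in range(SIZE)}
print(weight, sorted(set(spectrum.values())))
```

The weight matches $2^{2n-1} - 2^{n-1}$ and every Walsh value of $\gamma_F$ has absolute value $2^n$, so $\gamma_F$ is bent in this example.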

Denoting by $\mathcal{L} = (L_1, L_2)$ an affine automorphism mapping the graph of F to the graph of G, we have $\gamma_F = \gamma_G \circ L$, where $L$ is the linear automorphism such that $\mathcal{L} = L + cst$. Indeed, we have $G = F_2 \circ F_1^{-1}$, where $F_1(x) = L_1(x, F(x))$ and $F_2(x) = L_2(x, F(x))$; the value $\gamma_G(a, b)$ equals 1 if and only if $a \neq 0_n$ and there exists $(x, y)$ in $\mathbb{F}_2^n \times \mathbb{F}_2^n$ such that $F_1(x) + F_1(y) = a$ and $F_2(x) + F_2(y) = b$, that is, $\mathcal{L}(x, F(x)) + \mathcal{L}(y, F(y)) = L(x + y, F(x) + F(y)) = (a, b)$. Hence, $\gamma_G \circ L(a, b) = 1$ if and only if $\gamma_F(a, b) = 1$. Note that different functions may have the same $\gamma_F$; see in [561] a study when the function $\gamma_F$ is the one associated to Gold functions. The linear equivalence between functions $\gamma_F$ could potentially lead to an equivalence notion strictly more general than CCZ equivalence; this needs to be studied. It is observed in this same reference that if two functions $F$, $F'$ are such that $\gamma_F = \gamma_{F'}$, then for any function $G$ taken EA equivalent to $F$, there exists $G'$, which is EA equivalent to $F'$ and such that $\gamma_G = \gamma_{G'}$. In [109], it is observed that if two functions $F$, $F'$ have the same DDT, then for any function $G$ taken CCZ equivalent to $F$, there exists $G'$, which is CCZ equivalent to $F'$ and such that $G$ and $G'$ have the same DDT (and the same is true with EA instead of CCZ); it is also shown that, for any APN permutation $F$ and any pair $\{a, a'\}$ of distinct nonzero elements, the functions $\gamma_F(a, x)$ and $\gamma_F(a', x)$ are different. It is conjectured in this same reference that two permutations $F$ and $G$ having such property and such that $\gamma_F = \gamma_G$ (i.e., with the same DDT) are such that $G(x) = F(x + a) + b$. A guess-and-determine algorithm for reconstructing an S-box from its DDT is given, which is outperformed by an algorithm from [489].
The γF functions associated to some AB functions are addressed at page 229 and those
associated to some of the known APN functions are determined in [152, 257], (for some
other cases, it is an open problem).

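Remark. Let F be APN. According to Relation (3.3), page 82, we have
\[
nl(F) = \min_{v \in \mathbb{F}_2^n,\ v \neq 0_n} nl(v \cdot F) \geq 2^{n-2} - \frac{1}{4} \max_{v \neq 0_n} \min_{e \neq 0_n} \left|\mathcal{F}(v \cdot D_e F)\right| = 2^{n-2} - \frac{1}{2} \max_{v \neq 0_n} \min_{e \neq 0_n} \Big| \sum_{b \in \mathbb{F}_2^n} \gamma_F(e, b)(-1)^{v \cdot b} \Big|.
\]
We obtain then
\[
nl(F) \geq 2^{n-2} - \frac{1}{2} \max_{v \neq 0_n} \min_{e \neq 0_n} \left|\widehat{\gamma_{F,e}}(v)\right| \geq 2^{n-2} - \frac{1}{2} \min_{e \neq 0_n} \max_{v \neq 0_n} \left|\widehat{\gamma_{F,e}}(v)\right| = \max_{e \neq 0_n} nl(\gamma_{F,e}) - 2^{n-2},
\]
where $\gamma_{F,e}(b) = \gamma_F(e, b)$. These lower bounds are not efficient for highly nonlinear functions like AB functions, since they are below $2^{n-2}$ which is much smaller than $2^{n-1} - 2^{\frac{n-1}{2}}$, but since little is known on the nonlinearity of APN non-AB functions, they are worth mentioning.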

Characterizations by the numbers of solutions of systems of equations


There exists a characterization of AB functions by van Dam and Fon-Der-Flaass in [410] similar to the characterization of APN functions by the fact that, for every $(a, b) \neq (0_n, 0_n)$, the system
\[
\begin{cases} x + y = a \\ F(x) + F(y) = b \end{cases}
\]
admits zero or two solutions:

Proposition 159 Any (n, n)-function F is AB if and only if the system
\[
\begin{cases} x + y + z = a \\ F(x) + F(y) + F(z) = b \end{cases} \tag{11.6}
\]
admits $3 \cdot 2^n - 2$ solutions if $b = F(a)$ and $2^n - 2$ solutions otherwise.

Indeed, F is AB if and only if, for every $v \in \mathbb{F}_2^n \setminus \{0_n\}$ and every $u \in \mathbb{F}_2^n$, we have $\left(\sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x}\right)^3 = 2^{n+1} \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x}$, and we know that two pseudo-Boolean functions are equal to each other if and only if their Fourier–Hadamard transforms are equal. The value at (a, b) of the Fourier–Hadamard transform of the function of (u, v) equal to $\left(\sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x}\right)^3$ if $v \neq 0_n$, and to 0 otherwise equals
\[
\sum_{u \in \mathbb{F}_2^n,\ v \in \mathbb{F}_2^n} \left( \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x} \right)^3 (-1)^{a \cdot u \oplus b \cdot v} - 2^{3n} = 2^{2n} \left| \left\{ (x, y, z) \in \mathbb{F}_2^{3n} ;\ \begin{array}{l} x + y + z = a \\ F(x) + F(y) + F(z) = b \end{array} \right\} \right| - 2^{3n},
\]
and the value of the Fourier–Hadamard transform of the function that is equal to $2^{n+1} \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x}$ if $v \neq 0_n$, and to 0 otherwise equals
\[
2^{3n+1} \left| \left\{ x \in \mathbb{F}_2^n ;\ \begin{array}{l} x = a \\ F(x) = b \end{array} \right\} \right| - 2^{2n+1}.
\]
This proves the result. Note that $3 \cdot 2^n - 2$ is the number of triples (x, x, a), (x, a, x) and (a, x, x), where x ranges over $\mathbb{F}_2^n$. Hence the condition when $F(a) = b$ means that these particular triples are the only solutions of the system (11.6). This is equivalent to saying that F is APN, and we can replace the first condition of van Dam and Fon-Der-Flaass by “F is APN.” Denoting $c = F(a) + b$, we have then

Corollary 27 Let n be any positive integer and F any APN (n, n)-function. Then F is AB if and only if, for every $c \neq 0_n$ and every a in $\mathbb{F}_2^n$, the equation $F(x) + F(y) + F(a) + F(x + y + a) = c$ has $2^n - 2$ solutions.
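
The solution counts of Proposition 159 can be verified by exhaustive search for n = 3. This is a sketch under the assumption that `CUBE` is the table of $x \mapsto x^3$ on $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(x^3+x+1)$; `count_triples` is a hypothetical helper that uses z = a + x + y to avoid a third loop.

```python
CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8 (an AB function for n = 3)
SIZE = 8

def count_triples(table, a, b):
    """Number of (x, y, z) with x + y + z = a and F(x) + F(y) + F(z) = b."""
    return sum(1 for x in range(SIZE) for y in range(SIZE)
               if table[x] ^ table[y] ^ table[x ^ y ^ a] == b)

counts = {(a, b): count_triples(CUBE, a, b) for a in range(SIZE) for b in range(SIZE)}
on_graph = {counts[a, b] for a in range(SIZE) for b in range(SIZE) if b == CUBE[a]}
off_graph = {counts[a, b] for a in range(SIZE) for b in range(SIZE) if b != CUBE[a]}
print(on_graph, off_graph)
```

With n = 3 the two predicted values are $3 \cdot 2^3 - 2 = 22$ and $2^3 - 2 = 6$.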

Let us denote by $\mathcal{A}_2$ the set of two-dimensional flats of $\mathbb{F}_2^n$ and by F the mapping $A \in \mathcal{A}_2 \mapsto \sum_{x \in A} F(x) \in \mathbb{F}_2^n$. Corollary 27 is equivalent to saying that an APN function is AB if and only if, for every $a \in \mathbb{F}_2^n$, the restriction of F to those flats that contain a is a $\frac{2^{n-1}-1}{3}$-to-1 function (indeed, there are six different ways of ordering the three elements other than a in such a flat). Note that the number of two-dimensional flats of $\mathbb{F}_2^n$ containing a equals $\frac{(2^n-1)(2^n-2)}{(2^2-1)(2^2-2)} = (2^n - 1)\, \frac{2^{n-1}-1}{3}$ and the size of $\mathbb{F}_2^n \setminus \{0_n\}$ equals $2^n - 1$. We have then:

Corollary 28 Any (n, n)-function F is APN if and only if F is valued in $\mathbb{F}_2^n \setminus \{0_n\}$, and F is AB if and only if, additionally, for every $a \in \mathbb{F}_2^n$, the restriction of $F : \mathcal{A}_2 \to \mathbb{F}_2^n \setminus \{0_n\}$ to those flats that contain a is balanced (that is, has uniform output).

Note that, for every APN function F and any two distinct vectors $a$ and $a'$, the restriction of F to those flats that contain $a$ and $a'$ is injective, since for two such distinct flats $A = \{a, a', x, x + a + a'\}$ and $A' = \{a, a', x', x' + a + a'\}$, we have $F(A) + F(A') = F(x) + F(x + a + a') + F(x') + F(x' + a + a') = F(\{x, x + a + a', x', x' + a + a'\}) \neq 0_n$. But this restriction of F cannot be surjective since the number of flats containing $a$ and $a'$ equals $2^{n-1} - 1$, which is less than $2^n - 1$.

Remark. Other characterizations can be derived with the same method as in Proposition 159’s proof. For instance, F is AB if and only if, for every $v \in \mathbb{F}_2^n \setminus \{0_n\}$, $u \in \mathbb{F}_2^n$, we have
\[
\left( \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x} \right)^4 = 2^{n+1} \left( \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x} \right)^2.
\]
By applying again the Fourier–Hadamard transform and dividing by $2^{2n}$, we deduce that F is AB if and only if, for every (a, b) in $(\mathbb{F}_2^n)^2$, we have
\[
\left| \left\{ (x, y, z, t) \in \mathbb{F}_2^{4n} ;\ \begin{array}{l} x + y + z + t = a \\ F(x) + F(y) + F(z) + F(t) = b \end{array} \right\} \right| - 2^{2n} = 2^{n+1} \left| \left\{ (x, y) \in \mathbb{F}_2^{2n} ;\ \begin{array}{l} x + y = a \\ F(x) + F(y) = b \end{array} \right\} \right| - 2^{n+1}.
\]
Hence, F is AB if and only if the system
\[
\begin{cases} x + y + z + t = a \\ F(x) + F(y) + F(z) + F(t) = b \end{cases}
\]
admits $3 \cdot 2^{2n} - 2^{n+1}$ solutions if $a = b = 0_n$ (this is equivalent to saying that F is APN), $2^{2n} - 2^{n+1}$ solutions if $a = 0_n$ and $b \neq 0_n$ (note that this condition corresponds to adding all the conditions of Corollary 27 with c fixed to b and with a ranging over $\mathbb{F}_2^n$), and $2^{2n} + 2^{n+2} \gamma_F(a, b) - 2^{n+1}$ solutions if $a \neq 0_n$ (indeed, F is APN; note that this gives a new property of AB functions).

Characterization of APN functions by the minimum distance of related codes, and of AB functions by the weight distribution of these codes

A relationship has been observed in [641] (not exactly in terms of APNness since this notion was not known by the authors) and developed further in [257] (see also [642, 1099]) between the properties, for an (n, n)-function, of being APN or AB and properties of related codes. This is the reason that APN functions are generalizations of the cube (n, n)-function $x^3$, whose related code is the 2-error correcting BCH code (see page 10):

Proposition 160 [257] Let F be any function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$ such that F(0) = 0. Let H be the matrix
\[
\begin{bmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n-2} \\ F(1) & F(\alpha) & F(\alpha^2) & \cdots & F(\alpha^{2^n-2}) \end{bmatrix},
\]
where $\alpha$ is a primitive element of $\mathbb{F}_{2^n}$, where each symbol stands for the column of its coordinates with respect to a basis of the $\mathbb{F}_2$-vector space $\mathbb{F}_{2^n}$, and where only linearly independent rows are kept. Let $C_F$ be the linear code admitting H for parity check matrix. Then F is APN if and only if $C_F$ has minimum distance 5, and F is AB if and only if $C_F^\perp$ (admitting H for generator matrix) has Hamming weights 0, $2^{n-1} - 2^{\frac{n-1}{2}}$, $2^{n-1}$, and $2^{n-1} + 2^{\frac{n-1}{2}}$ (equivalently, has nonzero Hamming weights between $2^{n-1} - 2^{\frac{n-1}{2}}$ and $2^{n-1} + 2^{\frac{n-1}{2}}$).

Proof Since H contains no zero column, $C_F$ has no codeword of Hamming weight 1, and since all columns of H are distinct vectors, $C_F$ has no codeword of Hamming weight 2. Hence⁴, $C_F$ has minimum distance at least 3. This minimum distance is also at most 5, since otherwise, a $[2^n - 2, k, d \geq 5]$ code with $k \geq 2^n - 1 - 2n$ would exist by puncturing, and we know from [482] that this is impossible. The fact that $C_F$ has no codeword of weight 3 or 4 is by definition equivalent to the APNness of F, since a vector $(c_0, c_1, \dots, c_{2^n-2}) \in \mathbb{F}_2^{2^n-1}$ is a codeword if and only if
\[
\begin{cases} \sum_{i=0}^{2^n-2} c_i \alpha^i = 0 \\ \sum_{i=0}^{2^n-2} c_i F(\alpha^i) = 0. \end{cases}
\]
The nonexistence of codewords of Hamming weight 3 is then equivalent to the fact that $\sum_{x \in E} F(x) \neq 0$ for every two-dimensional vector subspace E of $\mathbb{F}_{2^n}$ and the nonexistence of codewords of Hamming weight 4 is equivalent to the fact that $\sum_{x \in A} F(x) \neq 0$ for every two-dimensional flat A not containing 0. The characterization of ABness through the weights of $C_F^\perp$ comes directly from the characterization of AB functions by their Walsh transform values, respectively by their nonlinearity, and from the fact that the Hamming weight of the Boolean function $v \cdot F(x) \oplus u \cdot x$ equals $2^{n-1} - \frac{1}{2} W_F(u, v)$.
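
The first part of Proposition 160 can be illustrated for n = 5 by searching for the smallest set of columns of H that sums to zero. The sketch below is an assumption-laden illustration: $\mathbb{F}_{2^5}$ is represented as $\mathbb{F}_2[x]/(x^5 + x^2 + 1)$, the columns are listed in the order of the integers 1, ..., 31 rather than the powers of $\alpha$ (which only permutes coordinates), and the helper names are hypothetical.

```python
from functools import reduce
from itertools import combinations
from operator import xor

N = 5
POLY = 0b100101          # x^5 + x^2 + 1, an assumed choice of irreducible polynomial

def gf_mul(a, b):
    """Multiplication in F_32 = F_2[x]/(POLY)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def columns(F):
    """Columns (x, F(x)) of H for nonzero x, each packed into one 2N-bit integer."""
    return [x | (F(x) << N) for x in range(1, 1 << N)]

def minimum_distance(cols, max_weight=5):
    """Smallest number of columns XORing to zero, searched up to max_weight; None if not found."""
    for w in range(1, max_weight + 1):
        if any(reduce(xor, combo) == 0 for combo in combinations(cols, w)):
            return w
    return None

def cube(x):
    return gf_mul(x, gf_mul(x, x))

print(minimum_distance(columns(cube)))
```

For a linear F, three columns indexed by x, y and x + y already sum to zero, so the same search stops at weight 3.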

Remark.
1. If F is APN and n > 2, then $C_F$ has dimension $2^n - 1 - 2n$ exactly (i.e., all the rows in the matrix
\[
H = \begin{bmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n-2} \\ F(1) & F(\alpha) & F(\alpha^2) & \cdots & F(\alpha^{2^n-2}) \end{bmatrix}
\]
are linearly independent), since according to [482] again, $[2^n - 1, 2^n - 2n, 5]$ codes do not exist. A direct proof of the fact that $C_F^\perp$ has indeed dimension 2n is given by Dillon in [447]. This property of $C_F^\perp$ is equivalent to the fact that F has nonzero nonlinearity. Dillon uses Relation (11.2), page 372, and observes that if $v_0 \cdot F$ is affine for some $v_0 \neq 0$, then
\[
\sum_{\substack{u, v \in \mathbb{F}_{2^n} \\ v \notin \{0, v_0\}}} W_F^4(u, v) = (2^n - 2) \cdot 2^{3n},
\]
which means that all component functions of F except $v_0 \cdot F$ are bent. This allows building a bent (n, n − 1)-function, a contradiction with Nyberg’s result (Proposition 104, page 269). A slightly different proof (also using Nyberg’s result) was known earlier; see Proposition 161 below.
2. Any subcode of dimension $2^n - 1 - 2n$ of the $[2^n - 1, 2^n - 1 - n, 3]$ Hamming code is a code $C_F$ for some function F.
3. Proposition 160 assumes that F(0) = 0. If we want to express the APNness of any (n, n)-function, another matrix can be considered as in [135]: the $(2n + 1) \times 2^n$

4 We can also say that CF is a subcode of the Hamming code (see page 8).
matrix
\[
\begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 0 & 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n-2} \\ F(0) & F(1) & F(\alpha) & F(\alpha^2) & \cdots & F(\alpha^{2^n-2}) \end{bmatrix}.
\]
Then F is APN if and only if the code $\widetilde{C}_F$ admitting this parity check matrix has parameters $[2^n, 2^n - 1 - 2n, 6]$. To prove this, note first that this code does not change if we add a constant to F (contrary to $C_F$). Hence, by adding the constant F(0), we can assume that F(0) = 0. Then the code $\widetilde{C}_F$ is the extended code of $C_F$ (obtained by adding to each codeword of $C_F$ a first coordinate equal to the sum modulo 2 of its coordinates). Since F(0) = 0, we can apply Proposition 160, and it is clear that $C_F$ is a $[2^n - 1, 2^n - 1 - 2n, 5]$ code if and only if $\widetilde{C}_F$ is a $[2^n, 2^n - 1 - 2n, 6]$ code, since we know from [482] that $C_F$ cannot have minimum distance larger than 5.
Note that Proposition 156, page 371, means that if $\widetilde{C}_F^{\perp}$ has the highest possible minimum distance $2^{n-1} - 2^{\frac{n-1}{2}}$, then $\widetilde{C}_F$ has minimum distance at least 6.
4. As observed in [135], given two (n, n)-functions F and G such that F(0) = G(0) = 0, there exists a linear automorphism⁵ that maps $G_F$ to $G_G$ if and only if the codes $C_F$ and $C_G$ are equivalent (that is, are equal up to some permutation of the coordinates of their codewords). Indeed, the graph $G_F$ of F equals the (unordered) set of columns in the parity check matrix of the code $C_F$, plus an additional point equal to the all-zero vector. Hence, the existence of a linear automorphism that maps $G_F$ onto $G_G$ is equivalent to the fact that the parity check matrices⁶ of the codes $C_F$ and $C_G$ are equal up to multiplication (on the left) by an invertible matrix and to permutation of the columns. Since two codes with given parity check matrices are equal if and only if these matrices are equal up to multiplication on the left by an invertible matrix, this completes the proof.
Similarly, two functions F and G taking any values at 0 are CCZ equivalent if and only if the codes $\widetilde{C}_F$ and $\widetilde{C}_G$ are equivalent.
5. For every (n, n)-function F such that F(0) = 0, the two first power moments of $W_F$ are known: we have $\sum_{u, v \in \mathbb{F}_{2^n}} W_F(u, v) = 2^n \sum_{v \in \mathbb{F}_{2^n}} (-1)^{v \cdot F(0)} = 2^{2n}$, and $\sum_{u, v \in \mathbb{F}_{2^n}} W_F^2(u, v) = 2^{3n}$ (the former equality is given by the inverse Walsh transform formula (2.43), page 59, and the latter is given by the Parseval relation (2.48), page 61). If F is APN, then we have also the two next power moments: Relations (11.2) and (11.5), page 373. In the case F is AB, this makes it possible to determine the value distribution of $W_F$ and therefore the weight distribution of $C_F^\perp$ uniquely.⁷ Indeed, there are only three nonzero weights, which are known, and we need then only to determine the three numbers of codewords of each weight; the four equations obtained, which are linear in these numbers, make this possible. There are one codeword of null Hamming weight, $(2^n - 1)(2^{n-2} + 2^{\frac{n-3}{2}})$ codewords of Hamming weight $2^{n-1} - 2^{\frac{n-1}{2}}$, $(2^n - 1)(2^{n-2} - 2^{\frac{n-3}{2}})$ codewords of Hamming weight $2^{n-1} + 2^{\frac{n-1}{2}}$, and $(2^n - 1)(2^{n-1} + 1)$ codewords of Hamming weight $2^{n-1}$. See more details in [257] (the calculations are made there equivalently with the

5 Note that this is a subcase of CCZ equivalence – in fact, a strict subcase as shown in [135].
6 This is true also for the generator matrices of the codes.
7 The determination of such weight distribution is not known so often (when the code does not contain the
all-one vector) since determining the Walsh value distribution of the function is much more difficult in general
than determining the absolute value distribution, which for an AB function is easily deduced from the single
Parseval’s relation.

Pless power moment equalities of [958]). We shall see that function $x^3$ over $\mathbb{F}_{2^n}$ is an AB function (for n odd). The code $C_F^\perp$ corresponding to this function is an important code: the dual of the 2-error-correcting BCH code of length $2^n - 1$.
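
The weight distribution stated in item 5 can be compared with an exhaustive enumeration for n = 3. The sketch below assumes the table of $x \mapsto x^3$ on $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(x^3+x+1)$ and replaces the trace form by the bitwise dot product, which does not change the set of codewords; the function names are hypothetical.

```python
from collections import Counter

CUBE = [0, 1, 3, 4, 5, 6, 7, 2]   # assumed table of x -> x^3 on F_8, with F(0) = 0
N = 3

def dot(a, b):
    return bin(a & b).count("1") & 1

def dual_code_weights(table, n):
    """Weights of the codewords (v.F(x) + u.x), indexed by nonzero x, over all (u, v)."""
    return Counter(
        sum(dot(v, table[x]) ^ dot(u, x) for x in range(1, 1 << n))
        for u in range(1 << n) for v in range(1 << n))

def ab_weight_distribution(n):
    """Distribution stated in the text for AB functions, n odd."""
    s = 2 ** ((n - 1) // 2)            # 2^((n-1)/2); s // 2 is 2^((n-3)/2) for n >= 3
    return {0: 1,
            2 ** (n - 1) - s: (2 ** n - 1) * (2 ** (n - 2) + s // 2),
            2 ** (n - 1): (2 ** n - 1) * (2 ** (n - 1) + 1),
            2 ** (n - 1) + s: (2 ** n - 1) * (2 ** (n - 2) - s // 2)}

print(dict(dual_code_weights(CUBE, N)) == ab_weight_distribution(N))
```

For n = 3 the formula gives 21, 35 and 7 codewords of weights 2, 4 and 6, that is, the even-weight code of length 7.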

We have seen that if F is APN on $\mathbb{F}_{2^n}$, n > 2, and F(0) = 0, the code $C_F^\perp$ has dimension 2n. Equivalently, the code whose generator matrix equals $\begin{bmatrix} F(1) & F(\alpha) & F(\alpha^2) & \cdots & F(\alpha^{2^n-2}) \end{bmatrix}$, and which can therefore be seen as the code $\{tr_n(vF(x)) ;\ v \in \mathbb{F}_{2^n}\}$, has dimension n and intersects the simplex code $\{tr_n(ux) ;\ u \in \mathbb{F}_{2^n}\}$ of generator matrix $\begin{bmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^n-2} \end{bmatrix}$ only in the null vector. This can be proved directly:

Proposition 161 [237] Let F be APN over $\mathbb{F}_2^n$ with n > 2. Then nl(F) cannot be null and, assuming that $F(0_n) = 0_n$, the code $C_F^\perp$ has dimension 2n.

Proof Suppose there exists $v \neq 0_n$ such that $v \cdot F$ is affine. Without loss of generality (by composing F with an appropriate linear automorphism and adding an affine function to F), we can assume that v = (0, . . . , 0, 1) and that $v \cdot F$ is null. Then, every derivative of F is 2-to-1 and has null last coordinate. Hence, for every $a \neq 0_n$ and every b, the equation $D_a F(x) = b$ has no solution if $b_n = 1$ and it has two solutions if $b_n = 0$. The (n, n − 1)-function obtained by erasing the last coordinate of F(x) has therefore balanced derivatives; hence it is a bent (n, n − 1)-function, a contradiction with Nyberg’s result (Proposition 104, page 269), since $n - 1 > \frac{n}{2}$. The last sentence in the statement is straightforward.

For n = 2, the nonlinearity can be null; example: function $(x_1, x_2) \mapsto (x_1 x_2, 0)$.

Remark. As observed at page 136, the nonlinearity and the differential uniformity of general functions do not seem correlated. However, Proposition 161 shows that, for APN functions, a null nonlinearity is impossible. Moreover, all known APN functions have a rather good nonlinearity (probably at least $2^{n-1} - 2^{\frac{3n}{5}-1} - 2^{\frac{2n}{5}-1}$, but this has to be confirmed since the nonlinearity of the Dobbertin function is unknown except for small values of n). The question of knowing whether it is because we know too few APN functions or because there is some correlation in the case of such optimal functions seems wide open.

J. Dillon (private communication) observed that the property of Proposition 161 implies that, for every nonzero $c \in \mathbb{F}_{2^n}$, the equation $F(x) + F(y) + F(z) + F(x + y + z) = c$ must have a solution (that is, the function F introduced after Corollary 27 is onto $\mathbb{F}_2^n \setminus \{0_n\}$; we have seen this for AB functions since we saw that this function is balanced, but it is new for APN functions). Indeed, otherwise, for every Boolean function g(x), the function $F(x) + c\, g(x)$ would be APN. But this is contradictory with Proposition 161 if we take $g(x) = v_0 \cdot F(x)$ (that is, $g(x) = tr_n(v_0 F(x))$ if we have identified $\mathbb{F}_2^n$ with the field $\mathbb{F}_{2^n}$) with $v_0 \notin c^\perp$, since we have then $v_0 \cdot [F(x) + c\, g(x)] = v_0 \cdot F(x) \oplus g(x)\,(v_0 \cdot c) = 0$.

Characterization of AB functions by uniformly packed codes


Proposition 162 [257] Let F be any (n, n)-function, n odd. Then F is AB if and only if CF
is a uniformly packed code (see Definition 2, page 10) of length N = 2n − 1 with minimum
distance 5 and covering radius 3.

It is deduced in [257, Corollary 3] that an APN function is AB if and only if $\{W_F(u, v) ;\ u, v \in \mathbb{F}_2^n,\ v \neq 0_n\}$ has three values.

Characterization of AB functions, among APN functions, by the divisibility of their Walsh transform values (n odd); consequence for plateaued functions

We have seen that all AB functions are APN. The converse is false, in general. But if n is odd and if F is APN, then, as shown in [186, 195], there exists a nice necessary and sufficient condition, for F being AB: the weights of $C_F^\perp$ are all divisible by $2^{\frac{n-1}{2}}$ (see also [196], where the divisibilities for several types of such codes are calculated, where tables of exact divisibilities are computed, and where proofs are given that a great deal of power functions are not AB). In other words and slightly more generally:

Proposition 163 Let F be an APN (n, n)-function. Then F is AB if and only if all the
values WF (u, v) of the Walsh spectrum of F are divisible by 2 2  .
n+1

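Proof The condition is clearly necessary (with n necessarily odd). Conversely, assume
that F is APN and that all the values $W_F(u, v) = \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x}$ are divisible by
$2^{\lceil \frac{n+1}{2} \rceil}$. Writing $W_F^2(u, v) = 2^{n+1} \lambda_{u,v}$, where all $\lambda_{u,v}$'s are integers, Relation (11.4), page
372, implies then

\[ \sum_{v \in {\mathbb{F}_2^n}^*,\ u \in \mathbb{F}_2^n} (\lambda_{u,v}^2 - \lambda_{u,v}) = 0, \qquad (11.7) \]

and since all the integers $\lambda_{u,v}^2 - \lambda_{u,v}$ are nonnegative ($\lambda_{u,v}$ being an integer), we deduce that
$\lambda_{u,v}^2 = \lambda_{u,v}$ for every $v \in {\mathbb{F}_2^n}^*$, $u \in \mathbb{F}_2^n$, i.e. $\lambda_{u,v} \in \{0, 1\}$.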
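
The divisibility criterion of Proposition 163 can be checked numerically on a small case. The following is a minimal sketch, not taken from the text: it assumes the field $\mathbb{F}_{2^5}$ is built with the modulus $X^5 + X^2 + 1$, represents elements as bit masks, and uses the bitwise dot product as inner product (the set of component functions, hence the set of Walsh values, is the same as with the trace form). The helper names `gf_mul` and `walsh_values` are illustrative.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in F_{2^n} = F_2[X]/(poly); elements are bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
    return r


def walsh_values(table, n):
    """Return the set of values W_F(u, v), v != 0, for the lookup table of F."""
    size = 1 << n
    values = set()
    for v in range(1, size):
        for u in range(size):
            total = 0
            for x in range(size):
                # exponent v.F(x) XOR u.x, computed as a parity of bits
                exponent = bin((v & table[x]) ^ (u & x)).count("1") & 1
                total += -1 if exponent else 1
            values.add(total)
    return values


N = 5
POLY = 0b100101  # X^5 + X^2 + 1 (assumed choice of irreducible modulus)
cube = [gf_mul(x, gf_mul(x, x, N, POLY), N, POLY) for x in range(1 << N)]
spectrum = walsh_values(cube, N)
step = 1 << ((N + 2) // 2)  # 2^ceil((n+1)/2)
print(sorted(spectrum), all(w % step == 0 for w in spectrum))
```

For the Gold function $x^3$ with n = 5, every Walsh value should be a multiple of $2^3$, and the spectrum should reduce to three values, in agreement with the AB property.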

Proposition 163 shows that if n is odd and an APN (n, n)-function F is plateaued,
or more generally if $F = F_1 \circ F_2^{-1}$, where $F_2$ is a permutation and the linear combinations
of the coordinate functions of $F_1$ and $F_2$ are plateaued, then F is AB, since
$\sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x} = \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F_1(x) \oplus u \cdot F_2(x)}$ is divisible by $2^{\frac{n+1}{2}}$.
This makes it possible to deduce easily the AB property of Gold and Kasami functions
(see their definitions below) from their APN property, since the Gold AB functions are
quadratic and the Kasami AB functions are equal, when n is odd, to $F_1 \circ F_2^{-1}$ where $F_1(x) =
x^{2^{3i}+1}$ and $F_2(x) = x^{2^i+1}$ are quadratic.⁸

Proposition 163 also allows us to characterize AB functions among APN power functions,
thanks to Proposition 21, page 73. Sufficient conditions for power functions not to be AB
are given in [194].

Complementary observation on APN functions for n even


If F is APN, then there must exist $v \in \mathbb{F}_2^n \setminus \{0_n\}$, $u \in \mathbb{F}_2^n$ such that $W_F(u, v)$ is not
divisible by $2^{\frac{n}{2}+1}$. Indeed, suppose that all the Walsh values of F have such divisibility,
8 The component functions of Kasami APN functions are plateaued for every n even too. This has been proved
in [448, theorem 11] when n is not divisible by 6 and for every n even in [1142].
11.3 Almost perfect nonlinear and almost bent functions 383

then writing again $W_F^2(u, v) = 2^{n+1} \lambda_{u,v}$, we have Relation (11.7), in which each nonzero
$\lambda_{u,v}$, being now even, satisfies $\lambda_{u,v}^2 - \lambda_{u,v} > 0$. All the values $\lambda_{u,v}^2 - \lambda_{u,v}$ are then
nonnegative integers and (for each $v \neq 0_n$) at least one value is strictly positive, a
contradiction.
If all the Walsh values of F are divisible by $2^{\frac{n}{2}}$ (e.g., if F is plateaued), then we deduce
that there must exist $v \in \mathbb{F}_2^n \setminus \{0_n\}$, $u \in \mathbb{F}_2^n$ such that $W_F(u, v) \equiv 2^{\frac{n}{2}} \ [\mathrm{mod}\ 2^{\frac{n}{2}+1}]$. It is also
shown in [24] that for every APN, or more generally weakly APN, permutation F (whose
derivatives at nonzero directions take strictly more than $2^{n-2}$ distinct values), at most $\frac{2^n-1}{3}$
component functions of F can be partially-bent (and, in particular, F cannot then be strongly
plateaued); indeed, each partially-bent component function of a permutation has a linear
kernel of dimension at least 2 (bent functions being not balanced), and has then at least
three constant derivatives at nonzero directions, and if there were $t > \frac{2^n-1}{3}$ partially-bent
component functions of F, since $3t > |\mathbb{F}_2^n \setminus \{0_n\}|$, there would exist $a \neq 0_n$ and two distinct
nonzero elements $v_1, v_2$ of $\mathbb{F}_2^n$ such that $v_1 \cdot D_aF$ and $v_2 \cdot D_aF$ are constant, a contradiction
since $\{D_aF(x);\ x \in \mathbb{F}_2^n\}$ would then have at most $2^{n-2}$ elements.
More can be said in the case of APN plateaued functions; see page 391.

APN functions and finite geometry


We refer the reader to [430] and the references therein for the relations between APN
functions and dimensional dual hyperovals or bilinear dimensional dual hyperovals. Other
relations with finite geometry are shown in [895].

11.3.2 The particular case of power functions


Identifying $\mathbb{F}_2^n$ with the field $\mathbb{F}_{2^n}$ (in which we can take $x \cdot y = \mathrm{tr}_n(xy)$ for inner product)
allows considering those (n, n)-functions of the form $F(x) = x^d$, $d \in \mathbb{Z}/(2^n - 1)\mathbb{Z}$, called
power (n, n)-functions (and sometimes, monomial vectorial functions). If such F is APN,
then d is called an APN exponent over $\mathbb{F}_{2^n}$.
Note that if d is an APN exponent over $\mathbb{F}_{2^n}$ and r divides n, then $d \ [\mathrm{mod}\ (2^r - 1)]$ is an
APN exponent over $\mathbb{F}_{2^r}$ (in particular, it cannot be a power of 2 if $r \ge 2$); more generally, if
r divides n and F(x) is an APN polynomial function over $\mathbb{F}_{2^n}$ with coefficients in $\mathbb{F}_{2^r}$, then
F is APN over $\mathbb{F}_{2^r}$.

Relation between AB power functions and sequences


There is a close relationship between the nonlinearity of power functions and sequences
used for radars and for spread-spectrum communications. Recall that a binary sequence
that can be generated by an LFSR, or equivalently that satisfies a linear recurrence relation
$s_i = a_1 s_{i-1} \oplus \cdots \oplus a_n s_{i-n}$, is called an m-sequence or a maximum length sequence if
its period equals $2^n - 1$, which is a maximum. Such a sequence has the form $\mathrm{tr}_n(\lambda \alpha^i)$,
where $\lambda \in \mathbb{F}_{2^n}$ and $\alpha$ is some primitive element of $\mathbb{F}_{2^n}$. Consequently, its autocorrelation
values $\sum_{i=0}^{2^n-2} (-1)^{s_i \oplus s_{i+t}}$ ($1 \le t \le 2^n - 2$) are all equal to −1, that is, are optimal. This is
useful for radars and for code division multiple access (CDMA) in telecommunications,
since it allows sending a signal easily distinguished from any time-shifted version of
itself. Finding a highly nonlinear power function (in particular, an AB power function)
$x^d$ on the field $\mathbb{F}_{2^n}$ makes it possible to have a d-decimation⁹ $s'_i = \mathrm{tr}_n(\lambda \alpha^{di})$ of the
sequence, whose cross-correlation values
$\sum_{i=0}^{2^n-2} (-1)^{s_i \oplus s'_{i+t}} = \sum_{x \in \mathbb{F}_{2^n}^*} (-1)^{\mathrm{tr}_n(\lambda(x^d + \alpha^{-t} x))}$
($0 \le t \le 2^n - 2$) with the sequence $s_i$ have small (minimum) overall magnitude¹⁰ [551,
552, 598]. In the case of an AB function, we speak of a preferred cross-correlation; see,
e.g., [174, 598]. The exponents of AB power functions have then been investigated as the
decimations with preferred cross-correlation by the researchers on sequences (those whose
names have been given to special classes of sequences and will be used for naming the
corresponding classes of AB functions, and also S. Golomb [550] who has been one of
the main initiators of the theory of sequences; see [555]). They proved the preferred cross-
correlation in some cases and made conjectures for others. Hence, when the notion of AB
function was invented by Chabaud and Vaudenay, some work had been already done for
searching such functions. A survey on cross-correlation distributions is given in [593]. See
also [1127].
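
The correlation properties recalled above can be observed on a short sequence. The sketch below is illustrative only: it assumes n = 5, the recurrence $s_i = s_{i-3} \oplus s_{i-5}$ (characteristic polynomial $X^5 + X^2 + 1$, which is primitive), and the Gold decimation d = 3. The names `lfsr_sequence` and `correlation` are hypothetical helpers, not notation from the text.

```python
def lfsr_sequence(lags, seed, length):
    """Bits s_i with s_i = XOR of s_{i-j} for j in lags; seed gives the first bits."""
    seq = list(seed)
    while len(seq) < length:
        bit = 0
        for j in lags:
            bit ^= seq[-j]
        seq.append(bit)
    return seq[:length]


def correlation(s, t, shift):
    """Periodic correlation sum_i (-1)^(s_i XOR t_{i+shift})."""
    period = len(s)
    return sum(1 - 2 * (s[i] ^ t[(i + shift) % period]) for i in range(period))


N = 5
PERIOD = (1 << N) - 1
# One full period of the m-sequence defined by s_i = s_{i-3} XOR s_{i-5}
s = lfsr_sequence((3, 5), (1, 0, 0, 0, 0), PERIOD)
auto = {correlation(s, s, t) for t in range(1, PERIOD)}

d = 3  # Gold exponent 2^1 + 1, coprime with 2^5 - 1
decimated = [s[(d * i) % PERIOD] for i in range(PERIOD)]
cross = {correlation(s, decimated, t) for t in range(PERIOD)}
print(auto, sorted(cross))
```

The out-of-phase autocorrelation should be constantly −1, and the cross-correlation with the decimated sequence should take its values in $\{-1,\ -1 \pm 2^{\frac{n+1}{2}}\}$, which is the preferred three-valued behaviour associated with an AB exponent.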

Simplification of the checking of APNness


When F is a power function, it is enough to check the APN property for $a = 1 \in \mathbb{F}_{2^n}$, since
for $a \neq 0$, changing x into ax in the equation $F(x) + F(x+a) = b$ gives $F(x) + F(x+1) = \frac{b}{F(a)}$.
Hence, according to what we saw on the characterization by the ANF at page 374,
$F(x) = x^d$ is APN if and only if

\[ \delta_0\big(x^d + (x+1)^d + y^d + (y+1)^d\big) + \delta_0(x+y) + \delta_0(x+y+1) \quad [\mathrm{mod}\ x^{2^n} + x,\ y^{2^n} + y] \]

equals the zero function, where $\delta_0(z) = 1 + z^{2^n-1}$, or equivalently

\[ 1_H(x)\, 1_H(y) \Big[ \big(x^d + (x+1)^d + y^d + (y+1)^d\big)^{2^n-1} + (x+y)^{2^n-1} \Big], \]

similarly reduced, equals the zero function, where H is a linear hyperplane excluding 1.
Moreover, checking the AB property $\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(vF(x)+ux)} \in \{0, \pm 2^{\frac{n+1}{2}}\}$, for every
$u, v \in \mathbb{F}_{2^n}$, $v \neq 0$, is enough for u = 0 and u = 1 (and every $v \neq 0$), since changing
x into $\frac{x}{u}$ (if $u \neq 0$) in this sum gives $\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v' F(x)+x)}$, for some $v' \neq 0$. If F
is a permutation, then checking the AB property is also enough for v = 1 and every u,
since changing x into $\frac{x}{F^{-1}(v)}$ in this sum gives $\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n\left(F(x) + \frac{u}{F^{-1}(v)} x\right)}$. And in the
characterization of Proposition 159, page 376, if $a \neq 0$, then it can be reduced similarly
to a = 1, and if a = 0, we can assume that $z \neq 0$ and replace x by xz and y by yz,
we get $(x^d + y^d + 1) z^d = b$, which has the same number of solutions for every nonzero
b since F is a permutation; the characterization of ABness reduces then to: the equation
$x^d + y^d + (x + y + 1)^d = b$ has $2^n - 2$ solutions for every $b \neq 1$. Then a power permutation
F in odd dimension is AB if and only if the function $(x, y) \mapsto x^d + y^d + (x + y + 1)^d$ is
$(2^n - 2)$-to-1 from $\{(x, y) \in (\mathbb{F}_{2^n} \setminus \{1\})^2;\ x \neq y\}$ to $\mathbb{F}_{2^n} \setminus \{1\}$ (the fact that this never
takes value 1 is equivalent to F APN).
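
The reduction to the single direction a = 1 is easy to exercise by brute force. The following sketch assumes $\mathbb{F}_{2^4}$ built with the modulus $X^4 + X + 1$ and elements stored as bit masks; `is_apn_power` (direction a = 1 only) and `is_apn_full` (all nonzero directions) are illustrative names, and the two are expected to agree on every power function.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in F_{2^n} = F_2[X]/(poly); elements are bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
    return r


def gf_pow(a, e, n, poly):
    """Compute a^e in F_{2^n} by square-and-multiply (e >= 1)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n, poly)
        a = gf_mul(a, a, n, poly)
        e >>= 1
    return r


def max_derivative_count(table, a):
    """Largest number of solutions x of F(x) + F(x + a) = b over all b."""
    counts = {}
    for x in range(len(table)):
        b = table[x] ^ table[x ^ a]
        counts[b] = counts.get(b, 0) + 1
    return max(counts.values())


def is_apn_power(d, n, poly):
    """APN test for x^d using only the direction a = 1."""
    table = [gf_pow(x, d, n, poly) for x in range(1 << n)]
    return max_derivative_count(table, 1) == 2


def is_apn_full(d, n, poly):
    """APN test for x^d using every nonzero direction a."""
    table = [gf_pow(x, d, n, poly) for x in range(1 << n)]
    return all(max_derivative_count(table, a) == 2 for a in range(1, 1 << n))


N, POLY = 4, 0b10011  # X^4 + X + 1 (assumed modulus)
print([d for d in range(1, 15) if is_apn_power(d, N, POLY)])
```

The shortcut divides the work by roughly $2^n - 1$, which is why exhaustive searches for APN exponents are feasible at all for moderately large n.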

9 Another m-sequence if d is coprime with $2^n - 1$.

10 This makes it possible, in code division multiple access, to give different signals to different users.
Additional information on bijectivity


It was proved in [257] that, when n is even, no APN function exists in a class of permutations
including power permutations, which we describe now. Let $k = \frac{2^n-1}{3}$ (which is an integer,
since n is even) and let $\alpha$ be a primitive element of the field $\mathbb{F}_{2^n}$. Then $\beta = \alpha^k$ is a primitive
element of $\mathbb{F}_4$. Hence, $\beta^2 + \beta + 1 = 0$. For every j, the element $(\beta + 1)^j + \beta^j = \beta^{2j} + \beta^j$
equals 1 if j is coprime with 3 (since $\beta^j$ is then also a primitive element of $\mathbb{F}_4$), and is
null otherwise. Let $F(x) = \sum_{j=0}^{2^n-1} \delta_j x^j$ ($\delta_j \in \mathbb{F}_{2^n}$) be an (n, n)-function. According to
the observations above, $\beta$ and $\beta + 1$ are solutions of the equation $F(x) + F(x+1) =
\sum_{\gcd(j,3)=1} \delta_j$. Also, the equation $F(x) + F(x+1) = \sum_{j=1}^{2^n-1} \delta_j$ admits 0 and 1 for solutions.
Thus:

Proposition 164 Let n be even and let $F(x) = \sum_{j=0}^{2^n-1} \delta_j x^j$ be any APN (n, n)-function,
then $\sum_{j=1}^{\frac{2^n-1}{3}} \delta_{3j} \neq 0$. If F is a power function, $F(x) = x^d$, then 3 divides d and F cannot
be a permutation.

H. Dobbertin gives in [469] a result valid only for power functions but slightly more
precise, and he completes it in the case that n is odd:

Proposition 165 If a power function $F(x) = x^d$ over $\mathbb{F}_{2^n}$ is APN, then for every $x \in \mathbb{F}_{2^n}$,
we have $x^d = 1$ if and only if $x^3 = 1$, that is, $F^{-1}(1) = \mathbb{F}_4 \cap \mathbb{F}_{2^n}^*$. If n is odd, then
$\gcd(d, 2^n - 1)$ equals 1 and, if n is even, then $\gcd(d, 2^n - 1)$ equals 3. Consequently, APN
power functions are permutations if n is odd, and are three-to-one over $\mathbb{F}_{2^n}^*$ if n is even.

Proof Let $x \neq 1$ be such that $x^d = 1$. There is a (unique) y in $\mathbb{F}_{2^n}$, $y \neq 0, 1$, such that
$x = (y + 1)/y$. The equality $x^d = 1$ implies $(y + 1)^d + y^d = 0 = (y^2 + 1)^d + (y^2)^d$. By the APN
property and since $y^2 \neq y$ because $x \neq 1$, we conclude $y^2 + y + 1 = 0$. Thus, y, and therefore
x, are in $\mathbb{F}_4$ and $x^3 = 1$. Conversely, if $x \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$ is such that $x^3 = 1$, then 3 divides
$2^n - 1$ and n must be even. Moreover, d must also be divisible by 3 (indeed, otherwise, the
restriction of $x^d$ to $\mathbb{F}_4$ would coincide with the function $x^{\gcd(d,3)} = x$ and would be therefore
linear, a contradiction). Hence, we have $x^d = 1$. The rest is straightforward.
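
The gcd statement of Proposition 165 can be sanity-checked on the Gold exponents $d = 2^i + 1$, $\gcd(i, n) = 1$, which the text lists among the APN exponents. This is only a numerical illustration; `gold_exponents` is a hypothetical helper name.

```python
from math import gcd


def gold_exponents(n):
    """Exponents 2^i + 1 with 1 <= i < n and gcd(i, n) = 1."""
    return [(1 << i) + 1 for i in range(1, n) if gcd(i, n) == 1]


def gcds_with_group_order(n):
    """gcd(d, 2^n - 1) for every Gold exponent d over F_{2^n}."""
    return [gcd(d, (1 << n) - 1) for d in gold_exponents(n)]


for n in range(3, 11):
    expected = 1 if n % 2 else 3
    print(n, gcds_with_group_order(n), "expected", expected)
```

A candidate exponent failing this gcd condition can be discarded without any differential computation, which is the first filter used in exhaustive searches.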

Note that for n even, $3 \mid d$ can be proved directly: if $3 \nmid d$, then $x^d$ being APN on $\mathbb{F}_{2^n}$,
$x^{d \ (\mathrm{mod}\ 3)}$ is APN on $\mathbb{F}_4$, a contradiction since $d \ (\mathrm{mod}\ 3) \in \{1, 2\}$.
In [62], it is similarly observed that, if all the coefficients in the univariate representation
of an APN function F(x) belong to a subfield $\mathbb{F}_{2^r}$ of $\mathbb{F}_{2^n}$, then the equality $D_aF(x) = b$ for
some $a, b \in \mathbb{F}_{2^r}$ and $x \in \mathbb{F}_{2^n} \setminus \mathbb{F}_{2^r}$ implies $x^{2^r} = x + a$.
In [406], it is shown that, for any n, if an (n, n)-function F fixes $0_n$ and is such that, for
every nonzero $u \in \mathbb{F}_2^n$, the preimage $F^{-1}(u)$ either is empty or equals a set of three distinct
nonzero elements of the form $\{a, b, a + b\}$ (i.e., is a two-dimensional $\mathbb{F}_2$-linear space less
$0_n$),¹¹ then F is APN if and only if

\[ \left. \begin{array}{l} F(x) \neq F(y) \\ F(z) \notin \{F(x), F(y), F(x+y)\} \end{array} \right\} \Longrightarrow F(x) + F(y) + F(z) + F(x+y+z) \neq 0_n. \]

11 This needs that n be even, and it happens then with any APN power function, and also with other functions
like $x^3 + \mathrm{tr}_n(x^9)$.
Indeed, this condition is necessary since "$F(x) \neq F(y)$ and $F(z) \notin \{F(x), F(y), F(x+y)\}$" implies that
$\{x, y, z, x + y + z\}$ is a two-dimensional flat and the restriction of an APN function to any
two-dimensional flat must not sum up to $0_n$; this condition is also sufficient since, if four
distinct elements of $\mathbb{F}_2^n$ have null sum as well as their images, then the condition being
assumed satisfied, either these four elements come by pairs with the same image in each
pair, and this is impossible since if for instance, $F(x) = F(y)$ and $F(z) = F(x + y + z)$,
then because of the assumption on the preimages, we have $F(z) = F(x + y) = F(x) =
F(y) = F(x + y + z)$, which is impossible since $F^{-1}(F(z))$ has only three elements, or
they are such that $F(z) = F(x + y)$ and the same happens since we have then $F(z) =
F(x + y) = F(x + y + z) = F(x) = F(y)$. Note that a sufficient condition for F to be APN
is that $F(\mathbb{F}_2^n)$ is a Sidon set (see Definition 80, page 388), but it is shown in [406] that such
sets of size $\frac{2^n-1}{3} + 1$ do not exist for $n \ge 6$ even.

Nonlinearity
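An upper bound valid not only for APN functions but restricted to power functions is proved
in [189]:

Proposition 166 For every n even, if a power function $F(x) = x^d$ on $\mathbb{F}_{2^n}$ is not a
permutation (i.e., if $\gcd(d, 2^n - 1) > 1$), then the nonlinearity of F is bounded above by
$2^{n-1} - 2^{\frac{n}{2}}$.

Proof Let $d_0 = \gcd(d, 2^n - 1)$; for every $v \in \mathbb{F}_{2^n}$, the sum $\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^d)}$ equals
$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^{d_0})}$; hence, $\sum_{v \in \mathbb{F}_{2^n}} \left( \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^d)} \right)^2$ is equal to
$2^n\, |\{(x, y),\ x, y \in \mathbb{F}_{2^n},\ x^{d_0} = y^{d_0}\}|$. The number of elements in the image of $\mathbb{F}_{2^n}^*$ by the mapping
$x \mapsto x^{d_0}$ is $(2^n - 1)/d_0$ and every element of this image has $d_0$ preimages. Hence,
$\sum_{v \in \mathbb{F}_{2^n}^*} \left( \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^d)} \right)^2$ equals $2^n [(2^n - 1) d_0 + 1] - 2^{2n} = 2^n (2^n - 1)(d_0 - 1)$
and $\max_{v \in \mathbb{F}_{2^n}^*} \left( \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^d)} \right)^2 \ge 2^n (d_0 - 1)$. The proof is completed by using that the
values of $\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^3)}$ are known.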

Reference [189] also studies the case of equality. The possible values of the sum
$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{tr}_n(v x^d)}$ are determined in [62] for APN power functions for
n even.
It happens that all known APN power functions have rather good nonlinearity. To clarify
the situation for general power APN functions, we need to show lower bounds on their
nonlinearity and/or to find such functions with lower nonlinearity. The next bound is shown
in [250] (the proof below is from this reference):

Proposition 167 Let F be any APN power function. Then, if n is odd, we have $nl(F) \ge
2^{n-1} - 2^{\frac{3n-3}{4}}$ and if n is even, we have $nl(F) \ge 2^{n-1} - 2^{\frac{3n-2}{4}}$.
Proof If n is odd then, for every $v \neq 0$, the sum $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, v)$ is independent¹²
of the choice of v and, according to the characterization of APN functions by the
fourth moment of Walsh transform, equals then $2^{3n+1}$. Hence, we have $W_F^4(u, v) \le
2^{3n+1}$ for every u and the result follows from (3.21), page 117. If n is even, then,
since according to Proposition 165, the value of $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, v)$ does not change
when v is multiplied by a nonzero cube, it takes, when v ranges over $\mathbb{F}_{2^n}^*$, $\frac{2^n-1}{3}$ times
the value $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, 1)$, $\frac{2^n-1}{3}$ times the value $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, \alpha)$, and $\frac{2^n-1}{3}$ times
the value $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, \alpha^2)$ ($\alpha$ primitive in $\mathbb{F}_{2^n}$). Hence we have

\[ \sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, 1) + \sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, \alpha) + \sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, \alpha^2) = 3 \cdot 2^{3n+1}. \]

We have, by the Cauchy–Schwarz inequality, that
$\sum_{u \in \mathbb{F}_2^n} W_F^4(u, v) \ge \frac{\left( \sum_{u \in \mathbb{F}_2^n} W_F^2(u, v) \right)^2}{2^n} = 2^{3n}$ for $v \neq 0$. Hence, we
have by complementation that each of the sums $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, 1)$, $\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, \alpha)$ and
$\sum_{u \in \mathbb{F}_{2^n}} W_F^4(u, \alpha^2)$ is bounded above by $3 \cdot 2^{3n+1} - 2 \cdot 2^{3n} = 2^{3n+2}$. We have then
$W_F^4(u, v) \le 2^{3n+2}$ for every u, v such that $v \neq 0$ and Relation (3.21) completes the
proof.
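
For the reader's convenience, the last step of each case can be written out. The display below is a sketch which assumes that Relation (3.21) is the usual expression of the nonlinearity by means of the Walsh transform, $nl(F) = 2^{n-1} - \frac{1}{2} \max_{u,\ v \neq 0} |W_F(u, v)|$; that relation is not reproduced in this section.

```latex
\begin{align*}
n \text{ odd:}\quad  & |W_F(u,v)| \le 2^{\frac{3n+1}{4}}
  \;\Longrightarrow\; nl(F) \ge 2^{n-1} - \tfrac{1}{2}\, 2^{\frac{3n+1}{4}}
  = 2^{n-1} - 2^{\frac{3n-3}{4}},\\
n \text{ even:}\quad & |W_F(u,v)| \le 2^{\frac{3n+2}{4}}
  \;\Longrightarrow\; nl(F) \ge 2^{n-1} - \tfrac{1}{2}\, 2^{\frac{3n+2}{4}}
  = 2^{n-1} - 2^{\frac{3n-2}{4}}.
\end{align*}
```

For instance, for n = 5 the bound reads $nl(F) \ge 16 - 8 = 8$, whereas an AB function has nonlinearity $2^{n-1} - 2^{\frac{n-1}{2}} = 12$.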

The bound of Proposition 167 has been extended in [356] to differential uniform power
functions but only for permutations. For explaining the good nonlinearity of known APN
functions, there remains to tackle that of quadratic functions. Less is known for them; see
observations in [250].

Relation with cyclic codes


If F is a power function, then the linear codes $C_F$ and $C_F^\perp$ viewed in Proposition 160, page
378, are cyclic codes (see [257], where several results are given in this framework). Indeed,
$(c_0, \ldots, c_{2^n-2})$ belongs to $C_F$ if and only if $c_0 + c_1 \alpha + \cdots + c_{2^n-2}\, \alpha^{2^n-2} = 0$ and
$c_0 + c_1 \alpha^d + \cdots + c_{2^n-2}\, \alpha^{(2^n-2)d} = 0$; this implies (by multiplying these equations by $\alpha$ and $\alpha^d$,
respectively) $c_{2^n-2} + c_0 \alpha + \cdots + c_{2^n-3}\, \alpha^{2^n-2} = 0$ and $c_{2^n-2} + c_0 \alpha^d + \cdots + c_{2^n-3}\, \alpha^{(2^n-2)d} = 0$.
The BCH bound (see page 13) shows in the case $F(x) = x^3$ that $C_F$ has minimum
distance (at least) 5 (i.e., that F is APN) and (in an original but rather complex way) that
the function $x^{2^{\frac{n-1}{2}}+1}$, n odd, is AB: by definition, the defining set I of $C_F$ (see page 12)
equals the union of the cyclotomic classes of 1 and $2^{\frac{n-1}{2}} + 1$, that is,
$I = \{1, 2, \ldots, 2^{n-1}\} \cup \{2^{\frac{n-1}{2}} + 1,\ 2^{\frac{n+1}{2}} + 2,\ \ldots,\ 2^{n-1} + 2^{\frac{n-1}{2}},\ 2^{\frac{n+1}{2}} + 1,\ 2^{\frac{n+3}{2}} + 2,\ \ldots,\ 2^{n-1} + 2^{\frac{n-3}{2}}\}$.
Since there is no element equal to $2^{n-1} + 2^{\frac{n-1}{2}} + 1, \ldots, 2^n - 1$ in I, the defining set
$\mathbb{Z}/(2^n - 1)\mathbb{Z} \setminus \{-i;\ i \in I\}$ of $C_F^\perp$ contains a string of length $2^{n-1} - 2^{\frac{n-1}{2}} - 1$. Hence the nonzero codewords of this
code have Hamming weights larger than or equal to $2^{n-1} - 2^{\frac{n-1}{2}}$. This is not sufficient for
concluding that the function is AB, but we can apply the previous reasoning to the cyclic
code $C_F^\perp \cup (1_{2^n-1} + C_F^\perp)$: the defining set of the dual of this code being equal to that of $C_F$,
plus 0, the defining set of the code equals that of $C_F^\perp$ less 0, which gives a string of length
$2^{n-1} - 2^{\frac{n-1}{2}} - 2$ instead of $2^{n-1} - 2^{\frac{n-1}{2}} - 1$. Hence the complements of the codewords of
$C_F^\perp$ have Hamming weights at least $2^{n-1} - 2^{\frac{n-1}{2}} - 1$ and the codewords of $C_F^\perp$ have then
Hamming weights at most $2^{n-1} + 2^{\frac{n-1}{2}}$.

12 Such a property will be called CAPNness at page 390 and implies the former inequality.

The powerful McEliece theorem (see, e.g., [809]) that we recalled in Section 4.1 (page
151) gives the exact divisibility of the codewords of cyclic codes. Translated in terms of
vectorial functions, it says that if d is relatively prime to $2^n - 1$, the exponent $e_d$ of the
greatest power of 2 dividing all the Walsh coefficients of the power function $x^d$ is given by
$e_d = \min\{w_2(t_0) + w_2(t_1),\ 1 \le t_0, t_1 < 2^n - 1;\ t_0 + t_1 d \equiv 0 \ [\mathrm{mod}\ 2^n - 1]\}$. It can be used
in relationship with Proposition 163. This has led in [195] to the proof, by Canteaut et al.,
of a decades-old conjecture due to Welch.
Note finally that, if F is a power function, then the Boolean function $\gamma_F$ seen in Proposition
158 is within the framework of Dobbertin's triple construction [466].
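
The formula for $e_d$ lends itself to a direct computation. The sketch below is a brute-force illustration for small n (the function name `walsh_divisibility_exponent` is hypothetical); it takes $w_2$ to be the binary weight, as in the text, and assumes d is coprime with $2^n - 1$.

```python
def walsh_divisibility_exponent(d, n):
    """Minimum of w2(t0) + w2(t1) over 1 <= t0, t1 < 2^n - 1 with t0 + t1*d = 0 mod 2^n - 1."""
    modulus = (1 << n) - 1
    best = None
    for t1 in range(1, modulus):
        t0 = (-t1 * d) % modulus  # the unique residue solving the congruence
        if t0 == 0:
            continue  # excluded by the constraint 1 <= t0
        weight = bin(t0).count("1") + bin(t1).count("1")
        if best is None or weight < best:
            best = weight
    return best


n = 5
for d in (3, 5, 30):
    print(d, walsh_divisibility_exponent(d, n))
```

Combined with Proposition 163, an APN exponent d coprime with $2^n - 1$ (n odd) is AB exactly when $e_d \ge \frac{n+1}{2}$. For n = 5 this should separate the Gold exponents 3 and 5 from the exponent 30 of the inverse function.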

Relation with the notions of Sidon sets and sum-free sets


In [316], it is observed that APN exponents have a property involving two well-known notions
in additive combinatorics. We refer the reader to this paper and to the references therein for
complements.

Definition 80 A subset of Fn2 is a Sidon set if it does not contain four distinct elements
whose sum is null.

This notion due to Sidon is preserved by affine equivalence and by decreasing inclusion.
Denoting by $P_S$ the set of pairs in S, it is equivalent to saying that $\{x, y\} \in P_S \mapsto x + y$ is
one-to-one. The size |S| is then such that $\binom{|S|}{2} \le 2^n - 1$.

Note that an (n, n)-function F is APN if and only if its graph $G_F = \{(x, F(x));\ x \in \mathbb{F}_2^n\}$
is a Sidon set in $((\mathbb{F}_2^n)^2, +)$, since saying that, given four distinct elements x, y, z, t of $\mathbb{F}_2^n$, if
$x + y + z + t = 0_n$ then $F(x) + F(y) + F(z) + F(t) \neq 0_n$, is equivalent to saying that, given
four distinct elements x, y, z, t, we have $x + y + z + t \neq 0_n$ or $F(x) + F(y) + F(z) + F(t) \neq 0_n$.

Definition 81 A subset S of Fn2 is called a sum-free set if it does not contain elements
x, y, z such that x + y = z (i.e., if S ∩ (S + S) = ∅).

This notion due to Erdös is preserved by linear equivalence and by decreasing inclusion.
The size |S| is then smaller than or equal to $2^{n-1}$, because $|S + S| \ge |S|$, and if $|S| > 2^{n-1}$,
then the two sets S + S and S have intersection. Note that S cannot contain $0_n$. A basic
example of a sum-free set (with maximum size) is the complement of a linear hyperplane.
The size |S| of a sum-free Sidon set satisfies $\frac{|S|\,(|S|+1)}{2} \le 2^n - 1$, since $S \cup \{0_n\}$ is then a
Sidon set.
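
Both notions are straightforward to test on small subsets of $\mathbb{F}_2^n$. In the sketch below, vectors are encoded as integers and addition is bitwise XOR; the function names `is_sidon` and `is_sum_free` are illustrative.

```python
from itertools import combinations


def is_sidon(subset):
    """True if the sums x + y of unordered pairs of distinct elements are pairwise distinct."""
    sums = [x ^ y for x, y in combinations(set(subset), 2)]
    return len(sums) == len(set(sums))


def is_sum_free(subset):
    """True if S and S + S are disjoint (this forces 0 not in S)."""
    s = set(subset)
    return all((x ^ y) not in s for x in s for y in s)


n = 4
hyperplane_complement = [x for x in range(1 << n) if x & 1]  # vectors with first bit 1
basis = [1, 2, 4, 8]
print(is_sum_free(hyperplane_complement), is_sidon(hyperplane_complement))
print(is_sidon(basis), is_sum_free(basis), is_sidon([1, 2, 4, 7]))
```

The complement of a hyperplane reaches the size bound $2^{n-1}$ for sum-free sets but is far too large to be a Sidon set, while a basis is both Sidon and sum-free and satisfies $\frac{|S|(|S|+1)}{2} \le 2^n - 1$.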

Proposition 168 [316] For every positive integers n and d and for every $j \in \mathbb{Z}/n\mathbb{Z}$, let
$e_j = \gcd(d - 2^j, 2^n - 1) \in \mathbb{Z}/(2^n - 1)\mathbb{Z}$, and let $G_{e_j}$ be the multiplicative subgroup
$\{x \in \mathbb{F}_{2^n}^*;\ x^{d-2^j} = 1\} = \{x \in \mathbb{F}_{2^n}^*;\ x^{e_j} = 1\}$ of order $e_j$. If d is an APN exponent over $\mathbb{F}_{2^n}$,
then, for every $j \in \mathbb{Z}/n\mathbb{Z}$, $G_{e_j}$ is a Sidon sum-free set in $\mathbb{F}_{2^n}$.


Proof For every $x \in G_{e_j} \setminus \{1\}$, let $s = \frac{x}{x+1}$. Then $x = \frac{s}{s+1}$, and $x^{d-2^j} = 1$ implies
$s^{d-2^j} + (s + 1)^{d-2^j} = 0$, which implies after multiplication by $s^{2^j} + 1 = (s + 1)^{2^j}$ that
$s^d + (s + 1)^d = s^{d-2^j} = \frac{1}{(x+1)^{d-2^j}}$. Note that if $s = \frac{x}{x+1}$ and $s' = \frac{x'}{x'+1}$, with $x \neq 1$ and
$x' \neq 1$, then we have $s = s'$ if and only if $x = x'$ and $s = s' + 1$ if and only if $x' = x^{-1}$.
Suppose that $G_{e_j}$ is not a Sidon set, then let x, y, z, t be distinct elements of $G_{e_j}$ such that
$x + y = z + t$. Making the changes of variables $x \to xt$, $y \to yt$, $z \to zt$ and dividing
the equality by t, we obtain distinct elements x, y, z of $G_{e_j} \setminus \{1\}$ such that $x + y + z = 1$.
Making now the change of variable $y \to zy$, we obtain elements x, y, z in $G_{e_j} \setminus \{1\}$ such that
$x + 1 = z(y + 1)$, $x \neq y$ and $x \neq y^{-1}$. We have then $\frac{1}{(x+1)^{d-2^j}} = \frac{1}{(y+1)^{d-2^j}}$
and $\frac{x}{x+1} \neq \frac{y}{y+1}$, $\frac{x}{x+1} \neq \frac{y}{y+1} + 1$, a contradiction with the APNness of F.
Suppose that $G_{e_j}$ is not sum-free, then $G_{e_j} \cap (G_{e_j} + 1) \neq \emptyset$. Let $x \in G_{e_j} \cap (G_{e_j} + 1)$ and
$s = \frac{x}{x+1}$; we have then $\frac{1}{(x+1)^{d-2^j}} = 1$ and $s^d + (s+1)^d = 1$ and the equation $z^d + (z+1)^d = 1$
has four solutions 0, 1, s, and s + 1 in $\mathbb{F}_{2^n}$, a contradiction.

Remark. Denoting $e = \gcd(d, 2^n - 1)$, we have that $G_e$ itself is a Sidon set since, as
recalled above, we have e = 1 if n is odd and e = 3 if n is even, and $G_1 = \{1\}$, $G_3 = \mathbb{F}_4^*$ are
Sidon sets (since they do not contain four distinct elements). But $G_e$ is a sum-free set only
for n odd, since $\mathbb{F}_4^*$ is not sum-free.

A geometric characterization of the fact that some integer d coprime with $2^n - 1$ is an
APN exponent over $\mathbb{F}_{2^n}$ for n odd by means of the Singer set $S_d = \{x \in \mathbb{F}_{2^n};\ \mathrm{tr}_n(x^d) = 1\}$
is also given in [252].

Alternative characterization of APN exponents, relation with the Dickson polynomials
When x ranges over $\mathbb{F}_{2^n} \setminus \{1\}$, $s = \frac{x}{x+1}$ ranges over $\mathbb{F}_{2^n} \setminus \{1\}$ and $s^d + (s+1)^d = \frac{x^d+1}{(x+1)^d}$. Then,
considering separately the equation $s^d + (s+1)^d = 1$ and the equations $s^d + (s+1)^d = b \neq 1$,
we have directly:

Proposition 169 [316] Let n be any positive integer, then a power function $F(x) = x^d$
over $\mathbb{F}_{2^n}$ is APN if and only if the function $x \mapsto \frac{x^d+1}{(x+1)^d}$ is 2-to-1 from $\mathbb{F}_{2^n} \setminus \mathbb{F}_2$ to $\mathbb{F}_{2^n} \setminus \{1\}$.
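
The criterion of Proposition 169 can be tried out directly. The sketch below assumes $\mathbb{F}_{2^4}$ with modulus $X^4 + X + 1$ and computes inverses as $y^{2^n-2}$; `is_two_to_one` is an illustrative name and reads "2-to-1" as: the value 1 is never taken and every value taken has exactly two preimages.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in F_{2^n} = F_2[X]/(poly); elements are bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
    return r


def gf_pow(a, e, n, poly):
    """Compute a^e in F_{2^n} by square-and-multiply (e >= 1)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n, poly)
        a = gf_mul(a, a, n, poly)
        e >>= 1
    return r


def is_two_to_one(d, n, poly):
    """Check that x -> (x^d + 1)/(x + 1)^d is 2-to-1 on F_{2^n} minus F_2 and avoids 1."""
    counts = {}
    for x in range(2, 1 << n):  # the integers 0 and 1 encode the elements of F_2
        numerator = gf_pow(x, d, n, poly) ^ 1
        denominator = gf_pow(x ^ 1, d, n, poly)
        inverse = gf_pow(denominator, (1 << n) - 2, n, poly)
        value = gf_mul(numerator, inverse, n, poly)
        counts[value] = counts.get(value, 0) + 1
    return 1 not in counts and all(c == 2 for c in counts.values())


N, POLY = 4, 0b10011  # X^4 + X + 1 (assumed modulus)
print([d for d in range(1, 15) if is_two_to_one(d, N, POLY)])
```

For d = 3 the map simplifies to $1 + \frac{1}{x + x^{-1}}$, which makes the 2-to-1 property (x and $x^{-1}$ share an image) visible by hand.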

By definition, we have $x^d + (x + 1)^d = \phi_d(x^2 + x)$ where $\phi_d$ is the reversed Dickson
polynomial, that is, $\phi_d(X) = D_d(1, X)$, where $D_d$ is classically defined by $D_d(X +
Y, XY) = X^d + Y^d$; see [628] and [890, page 227]. Then $F(x) = x^d$ is APN if and only if
function $\phi_d$ is injective over the hyperplane $H = \{x^2 + x;\ x \in \mathbb{F}_{2^n}\} = \{y \in \mathbb{F}_{2^n};\ \mathrm{tr}_n(y) = 0\}$.
Moreover, if $x^d$ is APN over $\mathbb{F}_{2^n}$ for n even, then $\phi_d$ is a permutation polynomial of $\mathbb{F}_{2^{n/2}}$,
which in turn implies that $x^d$ is APN over $\mathbb{F}_{2^{n/2}}$, see [890, theorem 8.1.97, page 227].
We also have $\frac{x^d+1}{(x+1)^d} = \psi_d(x + x^{-1})$, where $(\psi_d(X))^2 = \frac{D_d(X,1)}{X^d}$, where $D_d(X, 1)$ is
the Dickson polynomial [890] since $\left( \frac{x^d+1}{(x+1)^d} \right)^2 = \frac{x^d + x^{-d}}{(x + x^{-1})^d}$. According to Proposition 169,
function F is then APN if and only if $\psi_d$ is injective from $\{x + x^{-1};\ x \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2\} = \{y \in
\mathbb{F}_{2^n}^*;\ \mathrm{tr}_n(y^{-1}) = 0\}$ to $\mathbb{F}_{2^n} \setminus \{1\}$. Note that $\frac{D_d(y^{-1}, 1)}{(y^{-1})^d} = y^d D_d(y^{-1}, 1)$ is the value at y of the
reciprocal polynomial of $D_d(X, 1)$. Hence:

Proposition 170 [316] For every positive integer n and d, function $F(x) = x^d$ is APN if
and only if the reciprocal polynomial $X^d D_d(X^{-1}, 1)$ of the Dickson polynomial
$D_d(X, 1)$ is injective and does not take value 1 over $H = \{y \in \mathbb{F}_{2^n}^*;\ \mathrm{tr}_n(y) = 0\}$.

And it has been proved in [316] that for every positive integer d, the reversed Dickson
polynomial of index 2d and the reciprocal of Dickson polynomial of index d are equal.
In fact, as observed with X.-D. Hou, for any characteristic, we have $X^d D_d(\frac{1}{X} - 2, 1) =
D_{2d}(1, X)$.

Search for APN exponents


Dobbertin and Canteaut have independently determined all APN exponents for n ≤ 26,
and Leander–Langevin did the same up to n = 33 for AB exponents in [753]; all belong
to the classical classes of APN exponents that we shall list in Subsection 11.4, page 394.
Edel checked the same for n ≤ 34 and n = 36, 38, 40, 42. The main idea for his computer
investigation was to consider all the elements in $\mathbb{Z}/(2^n - 1)\mathbb{Z}$, discard all those which are
not coprime with $2^n - 1$ for n odd and do not have gcd equal to 3 with $2^n - 1$ for n even, and
all the remaining exponents whose reduction mod $2^r - 1$ is not an APN exponent in $\mathbb{F}_{2^r}$ for
some divisor r of n. Then, checking APNness was made for one member of each remaining
cyclotomic class of 2 modulo $2^n - 1$ only since $x^d$ and $x^{2d}$ are linearly equivalent. No
unclassified APN exponent could be found. A new search has been made in [316] in which
were also discarded all those exponents d that were known not to satisfy Proposition 168,
thanks to a work on the Sidon and sum-free multiplicative subgroups of $\mathbb{F}_{2^n}^*$ made in [315],
which shows in particular that $G_e = \{x \in \mathbb{F}_{2^n}^* \mid x^e = 1\}$ is a Sidon set (resp. a sum-free
set) if and only if, for every $u \in \mathbb{F}_{2^n}^*$ (resp. for u = 1), the polynomial $\gcd(X^e + 1, (X +
1)^e + u)$ has at most two zeros in $\mathbb{F}_{2^n}$ (resp. has no zero¹³). The condition for sum-free case is
equivalent to saying that $\gcd(X^e + 1, (X+1)^e + 1, X^{2^n} + X)$, that is, $\gcd(X^e + 1, (X+1)^e + 1)$
since $X^e + 1$ divides $X^{2^n-1} + 1$, equals 1, and this can be handled without computing in
the field $\mathbb{F}_{2^n}$ (which needs huge computational power for large values of n) since all the
coefficients playing a role in the Euclidean algorithm belong to $\mathbb{F}_2$. Unfortunately, this did
not discard enough additional APN candidates to allow us to find new APN exponents.
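
The sum-free test just described only needs the Euclidean algorithm in $\mathbb{F}_2[X]$. A minimal sketch follows, with polynomials encoded as integers (bit i is the coefficient of $X^i$); the function names are illustrative, and e is assumed to divide $2^n - 1$ for the n of interest so that $G_e$ has order e.

```python
def poly_mul(a, b):
    """Carry-less product of two polynomials over F_2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def poly_mod(a, m):
    """Remainder of a modulo the nonzero polynomial m over F_2."""
    dm = m.bit_length()
    while a.bit_length() >= dm:
        a ^= m << (a.bit_length() - dm)
    return a


def poly_gcd(a, b):
    """Euclidean algorithm in F_2[X]."""
    while b:
        a, b = b, poly_mod(a, b)
    return a


def sum_free_gcd(e):
    """gcd(X^e + 1, (X + 1)^e + 1); G_e is sum-free exactly when this equals 1."""
    x_e_plus_1 = (1 << e) | 1
    power = 1
    for _ in range(e):
        power = poly_mul(power, 0b11)  # multiply by X + 1
    return poly_gcd(x_e_plus_1, power ^ 1)


for e in (3, 5, 7):
    print(e, bin(sum_free_gcd(e)))
```

For e = 3 the gcd should be $X^2 + X + 1$, reflecting that $\mathbb{F}_4^*$ is not sum-free, whereas for e = 5 it should be 1, so the group of fifth roots of unity in $\mathbb{F}_{16}$ is sum-free.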

11.3.3 Componentwise APNness (CAPNness)


Chabaud–Vaudenay’s characterization of APN functions by the fourth moment of the Walsh
transform (see Relation (11.2), page 372) leads to a notion called componentwise APNness
(CAPNness) in [251], stronger than APNness, in which the value on the left-hand side
of (11.2) is the same for every component function:
13 This condition can be compared to the condition that $X^d + (X + 1)^d + 1$ has no zero in $\mathbb{F}_{2^n} \setminus \mathbb{F}_2$, which
expresses (as it can be easily checked) that the cyclic code $C_F$ (see Proposition 160, page 378, and see page
387) has no codeword of Hamming weight 3.
Definition 82 Let n be any positive integer and F any (n, n)-function. We call F
componentwise APN (CAPN) if, given any nonzero v in $\mathbb{F}_2^n$, its Walsh transform satisfies
the equality:

\[ \sum_{u \in \mathbb{F}_2^n} W_F^4(u, v) = 2^{3n+1}. \qquad (11.8) \]

Using Relation 3.10, page 98, F is CAPN if and only if $V(v \cdot F) = 2^{2n+1}$ for every
$v \neq 0_n$. This EA-invariant notion had been first studied by Berger et al. in [62] without
this specific name being introduced by these authors. They had observed that AB functions
and power APN permutations have this property (for straightforward reasons); in particular,
all known APN functions in odd dimension are CAPN. They had stated an open question
on the existence or nonexistence of such functions for n even. The nonexistence has been
proved in [251], by showing that F is CAPN if and only if, for every $w \neq 0_n$, the set
$\{(x, y, z) \in (\mathbb{F}_2^n)^3;\ F(x) + F(y) + F(z) + F(x + y + z) = w\}$ has size $2^{2n} - 2^{n+1}$ and
observing that this size is divisible by 3.

11.3.4 Plateaued APN functions


In the case n odd, we have seen in Proposition 163, page 382, that plateaued APN
n-variable functions are almost bent.
In the case n even, we have seen at page 382 that for any plateaued APN n-variable
function F, there must exist $v \in {\mathbb{F}_2^n}^*$ such that the Boolean function $v \cdot F$ is bent.
Note that this implies that F cannot be a permutation, according to Proposition 35,
page 112, and since a bent Boolean function is never balanced. This was first observed
in [62].
When F is plateaued and APN, the numbers $\lambda_{u,v}$ involved in Relation (11.7), page 382,
can be divided into two categories (since we know that the amplitude of a plateaued Boolean
function equals $2^j$ with $j \ge \frac{n}{2}$): those such that the function $v \cdot F$ is bent (for each such v,
we have $\lambda_{u,v} = 2^{2j-n-1} = 1/2$ for every u and therefore $\sum_{u \in \mathbb{F}_2^n} (\lambda_{u,v}^2 - \lambda_{u,v}) = -2^{n-2}$);
and those such that $v \cdot F$ is not bent (then $\lambda_{u,v} \in \{0, 2^i\}$ for some $i \ge 1$ depending on
v, and therefore $\lambda_{u,v}^2 = 2^i \lambda_{u,v}$, and we have, thanks to Parseval's relation applied to the
Boolean function $v \cdot F$, $\sum_{u \in \mathbb{F}_2^n} (\lambda_{u,v}^2 - \lambda_{u,v}) = (2^i - 1) \sum_{u \in \mathbb{F}_2^n} \lambda_{u,v} = (2^i - 1) \frac{2^{2n}}{2^{n+1}} =
(2^i - 1)\, 2^{n-1} \ge 2^{n-1}$). Equation (11.7) implies then that the number B of those v such that
$v \cdot F$ is bent satisfies $-B\, 2^{n-2} + (2^n - 1 - B)\, 2^{n-1} \le 0$, which implies that the number of
bent functions among the functions $v \cdot F$ is at least $\frac{2}{3}(2^n - 1)$ (this has been first observed in
[910] for APN functions with partially-bent components, Nyberg generalizing a result given
without a complete proof in [1028] for quadratic functions, and in [62] for plateaued APN
functions).
This bound is achieved with equality by the Gold APN functions $F(x) = x^{2^i+1}$,
$\gcd(i, n) = 1$ (see page 399). Indeed, we saw at page 206 that the function $\mathrm{tr}_n(vF(x))$
is bent if and only if v is not the third power of an element of $\mathbb{F}_{2^n}$.
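
The count of bent components can be confirmed on the smallest interesting case. The sketch below assumes $\mathbb{F}_{2^4}$ with modulus $X^4 + X + 1$ and uses the bitwise dot product to enumerate the component functions of $x^3$ (the same set of functions as the $\mathrm{tr}_n(vF(x))$, $v \neq 0$, in a different indexing); `count_bent_components` is an illustrative name.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in F_{2^n} = F_2[X]/(poly); elements are bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly
    return r


def count_bent_components(table, n):
    """Number of nonzero v such that all W_F(u, v) have absolute value 2^(n/2)."""
    size = 1 << n
    target = 1 << (n // 2)
    bent = 0
    for v in range(1, size):
        is_bent = True
        for u in range(size):
            total = sum(
                1 - 2 * (bin((v & table[x]) ^ (u & x)).count("1") & 1)
                for x in range(size)
            )
            if abs(total) != target:
                is_bent = False
                break
        bent += is_bent
    return bent


N, POLY = 4, 0b10011  # X^4 + X + 1 (assumed modulus)
cube = [gf_mul(x, gf_mul(x, x, N, POLY), N, POLY) for x in range(1 << N)]
print(count_bent_components(cube, N), 2 * ((1 << N) - 1) // 3)
```

Both printed numbers should coincide, since the non-cubes of $\mathbb{F}_{16}^*$ number $15 - 5 = 10 = \frac{2}{3}(2^4 - 1)$.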
Note that, given an APN plateaued function F, saying that the number of bent functions
among the functions $\mathrm{tr}_n(vF(x))$ equals $\frac{2}{3}(2^n - 1)$ is equivalent to saying, according to the
observations above, that there is no v such that $\lambda_{u,v} = \pm 2^i$ with i > 1, that is, F has
nonlinearity $2^{n-1} - 2^{\frac{n}{2}}$ and it is also equivalent to saying that F has the same extended
Walsh spectrum as the Gold functions.
The fact that an APN function F has same extended Walsh spectrum as the Gold functions
can be characterized by using a similar method as for proving Corollary 27, page 377:
this situation happens if and only if, for every $v \in \mathbb{F}_2^n \setminus \{0_n\}$ and every $u \in \mathbb{F}_2^n$, we have
$W_F(u, v) \in \{0, \pm 2^{\frac{n}{2}}, \pm 2^{\frac{n+2}{2}}\}$ (where $W_F(u, v) = \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus u \cdot x}$), that is,

\[ W_F(u, v) \left( W_F^2(u, v) - 2^{n+2} \right) \left( W_F^2(u, v) - 2^n \right) = 0, \]

or equivalently $W_F^5(u, v) - 5 \cdot 2^n\, W_F^3(u, v) + 2^{2n+2}\, W_F(u, v) = 0$. Applying the Fourier–
Hadamard transform and dividing by $2^{2n}$, this is equivalent to the fact that

\[ \left( \left| \left\{ (x_1, \ldots, x_5) \in (\mathbb{F}_2^n)^5;\ \textstyle\sum_{i=1}^{5} x_i = a,\ \sum_{i=1}^{5} F(x_i) = b \right\} \right| - 2^{3n} \right)
 - 5 \cdot 2^n \left( \left| \left\{ (x_1, x_2, x_3) \in (\mathbb{F}_2^n)^3;\ \textstyle\sum_{i=1}^{3} x_i = a,\ \sum_{i=1}^{3} F(x_i) = b \right\} \right| - 2^n \right)
 + 2^{2n+2} \left( \left| \left\{ x \in \mathbb{F}_2^n;\ x = a,\ F(x) = b \right\} \right| - 2^{-n} \right) = 0 \]

for every $a, b \in \mathbb{F}_2^n$. A necessary condition is (taking b = F(a) and using that F is APN)
that, for every $a \in \mathbb{F}_2^n$ and b = F(a), we have

\[ \left| \left\{ (x_1, \ldots, x_5) \in (\mathbb{F}_2^n)^5;\ \textstyle\sum_{i=1}^{5} x_i = a,\ \sum_{i=1}^{5} F(x_i) = b \right\} \right|
 = 2^{3n} + 5 \cdot 2^n (3 \cdot 2^n - 2 - 2^n) - 2^{2n+2} (1 - 2^{-n}) = 2^{3n} + 3 \cdot 2^{2n+1} - 3 \cdot 2^{n+1}. \]


There exist APN quadratic functions whose Walsh spectra are different from those of the Gold
functions. K. Browning et al. [135] exhibit such a function in six variables: $F(x) = x^3 +
\alpha^{11} x^5 + \alpha^{13} x^9 + x^{17} + \alpha^{11} x^{33} + x^{48}$ ($\alpha$ primitive), for which 46 functions $\mathrm{tr}_6(vF(x))$ are
bent, 16 are plateaued with amplitude 16, and one is plateaued with amplitude 32. For n = 8,
among the 8,179 quadratic APN functions identified in [1145], there are 487 functions with
the spectrum $\{*\ (-64)^{6}, (-32)^{2240}, (-16)^{20880}, 0^{15600}, 16^{23664}, 32^{2880}, 64^{10}\ *\}$ and 12 functions
with the spectrum $\{*\ (-64)^{12}, (-32)^{2100}, (-16)^{21360}, 0^{14880}, 16^{24208}, 32^{2700}, 64^{20}\ *\}$ (the exponents
denoting multiplicities), and the rest have a Gold-like Walsh spectrum [655].

For all n, characterizations of APN functions among plateaued vectorial functions


Thanks to the characterizations of plateaued functions recalled in Subsection 6.5, page 274,
we shall see that all the main results known for quadratic APN functions generalize to
plateaued APN functions, simplifying the study of the APNness of (n, n)-functions when
they are known to be plateaued.14
14 In [247], characterizations of plateaued functions among APN functions are also given.
In particular, it is much used in papers on APN functions that, if a function $F$ is quadratic,
then given $a \neq 0_n$, the property that all equations $F(x) + F(x + a) = v$ (which are then linear
equations) have at most two solutions is equivalent (as we saw already) to the fact that the
single homogeneous equation $F(x) + F(x + a) = F(0_n) + F(a)$ has exactly two solutions.
Proving APNness then amounts to proving that, for every $a \neq 0_n$, this equation has $0_n$ and
$a$ as its only solutions. This is probably the main reason why many results on APN functions
[80, 147, 151, 157, 158, 160, 239, 283, 1118, 1145] were found for quadratic functions. The
property above generalizes to all plateaued functions:

Proposition 171 [247] Any plateaued $(n, n)$-function $F$ is APN if and only if, for every
$a \neq 0_n$ in $\mathbb{F}_2^n$, the equation $F(x) + F(x + a) = F(0_n) + F(a)$ has the two solutions $0_n$ and
$a$ only.

Indeed, for every $v \in \mathbb{F}_2^n$, the size $|\{(a, b) \in (\mathbb{F}_2^n)^2;\ F(x) + F(x + a) + F(x + b) + F(x + a + b) = v\}|$ does not depend on $x \in \mathbb{F}_2^n$, according to Theorem 18, page 276, and we can
restrict ourselves in this characterization to $a$ and $b$ linearly independent. Function $F$ is then
APN if and only if, for such $a, b$, this size is null for $v = 0_n$. This completes the proof by
taking $x = 0_n$, and fixing $a \neq 0_n$.
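The criterion of Proposition 171 can be tried on small quadratic (hence plateaued) functions. The following is a minimal sketch, not taken from the source: the helper names (`make_mul`, `gold_table`, `homogeneous_criterion`, `differential_uniformity`) and the choice of the irreducible polynomials $X^5+X^2+1$ and $X^4+X+1$ are assumptions made for illustration. It compares the single homogeneous equation test with a direct computation of the differential uniformity.

```python
def make_mul(n, poly):
    """Multiplication in F_{2^n} = F_2[X]/(poly); poly includes the X^n term."""
    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if (a >> n) & 1:
                a ^= poly
        return r
    return mul


def gold_table(n, poly, i):
    """Lookup table of x -> x^(2^i + 1) on F_{2^n}."""
    mul = make_mul(n, poly)
    table = []
    for x in range(1 << n):
        y = x
        for _ in range(i):
            y = mul(y, y)          # after the loop, y = x^(2^i)
        table.append(mul(y, x))
    return table


def differential_uniformity(F):
    """Largest number of solutions of F(x) + F(x + a) = b over a != 0 and all b."""
    size, worst = len(F), 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[F[x] ^ F[x ^ a]] += 1
        worst = max(worst, max(counts))
    return worst


def homogeneous_criterion(F):
    """True iff, for every a != 0, F(x)+F(x+a) = F(0)+F(a) has only the solutions 0 and a."""
    size = len(F)
    return all(
        {x for x in range(size) if F[x] ^ F[x ^ a] == F[0] ^ F[a]} == {0, a}
        for a in range(1, size)
    )


gold_5 = gold_table(5, 0b100101, 1)     # x^3 on F_{2^5}, gcd(1, 5) = 1
x5_on_f16 = gold_table(4, 0b10011, 2)   # x^5 on F_{2^4}, gcd(2, 4) = 2
for name, F in (("x^3 on F_32", gold_5), ("x^5 on F_16", x5_on_f16)):
    print(name, homogeneous_criterion(F), differential_uniformity(F))
```

For these two quadratic functions the two tests agree: the first passes both, the second fails both.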
Another particularity of plateaued functions, extending that of quadratic functions, is the
sufficiency for APNness of the necessary condition (11.5), page 373:

Proposition 172 [247] Let $F$ be any plateaued $(n, n)$-function. Assume that $F(0_n) = 0_n$.
Then $F$ is APN if and only if the set $\{(x, a) \in (\mathbb{F}_2^n)^2 \mid F(x) + F(x + a) + F(a) = 0_n\}$ has
size $3 \cdot 2^n - 2$. Equivalently:
$$\sum_{u,v\in\mathbb{F}_2^n,\ v\neq 0_n} W_F^3(u, v) = 2^{2n+1}(2^n - 1).$$

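Indeed, each equation $F(x) + F(x + a) = F(a)$, $a \neq 0_n$, has at least $a$ and $0_n$ for solutions;
Proposition 171 shows then the first assertion; and we have
$$\sum_{(u,v)\in(\mathbb{F}_2^n)^2} W_F^3(u, v) = 2^{2n}\,\left|\{(x, a) \in (\mathbb{F}_2^n)^2 \mid F(x) + F(x + a) + F(a) = 0_n\}\right|.$$
Since $W_F(u, 0_n)$ equals $2^n$ for $u = 0_n$ and is null otherwise, removing the contribution $2^{3n}$ of $v = 0_n$ from $2^{2n}(3 \cdot 2^n - 2)$ gives $2^{2n+1}(2^n - 1)$.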
See in [247] several inequalities by means of power moments of the Walsh transform,
valid for all vectorial functions, and achieved with equality by APN functions only.
The case of unbalanced component functions: Theorem 19, page 279, implies:

Proposition 173 [247] Let $F$ be any plateaued $(n, n)$-function having all its component
functions unbalanced, then
$$\left|\{(a, b) \in (\mathbb{F}_2^n)^2,\ a \neq b;\ F(a) = F(b)\}\right| \geq 2 \cdot (2^n - 1), \qquad (11.9)$$
with equality if and only if $F$ is APN.
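Inequality (11.9) can be observed numerically on two quadratic power functions over $\mathbb{F}_{2^4}$. This is an illustrative sketch only: the helper names and the polynomial $X^4+X+1$ are assumptions, and component functions are taken with respect to the usual dot product on bit vectors rather than the trace form (this gives the same set of component functions).

```python
from collections import Counter


def make_mul(n, poly):
    """Multiplication in F_{2^n} = F_2[X]/(poly); poly includes the X^n term."""
    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if (a >> n) & 1:
                a ^= poly
        return r
    return mul


def power_table(n, poly, d):
    """Lookup table of x -> x^d on F_{2^n} (d >= 1, naive repeated multiplication)."""
    mul = make_mul(n, poly)
    table = []
    for x in range(1 << n):
        y = 1
        for _ in range(d):
            y = mul(y, x)
        table.append(y)
    return table


def collision_count(F):
    """Number of ordered pairs (a, b), a != b, with F(a) = F(b)."""
    return sum(c * (c - 1) for c in Counter(F).values())


def all_components_unbalanced(F, n):
    """True iff sum_x (-1)^(v.F(x)) is nonzero for every nonzero v."""
    size = 1 << n
    for v in range(1, size):
        total = sum(-1 if bin(v & F[x]).count("1") & 1 else 1 for x in range(size))
        if total == 0:
            return False
    return True


N, POLY = 4, 0b10011
cube = power_table(N, POLY, 3)    # Gold function x^3, APN on F_16
fifth = power_table(N, POLY, 5)   # x^5, quadratic but not APN on F_16
bound = 2 * (2**N - 1)
print(collision_count(cube), collision_count(fifth), bound)
```

In this sketch the APN function $x^3$ meets the bound $2(2^n-1)=30$ exactly, while $x^5$ exceeds it.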


394 Highly nonlinear vectorial functions with low differential uniformity

Hence, the APNness of plateaued $(n, n)$-functions with unbalanced component functions
depends only on their value distribution (for instance, any plateaued $(n, n)$-function, $n$
even, having similar value distribution as APN power functions, is APN and, since
$\sum_{a,b\in\mathbb{F}_2^n} (-1)^{v\cdot D_aD_bF(x)} = W_F^2(0_n, v)$, has the same extended Walsh spectrum as the APN
Gold functions). The case of power functions simplifies further [247].
We have $\left|\{(a, b) \in (\mathbb{F}_2^n)^2,\ a \neq b;\ F(a) = F(b)\}\right| = \sum_{a\in\mathbb{F}_2^n;\ a\neq 0_n} \left|(D_aF)^{-1}(0_n)\right|$ and this
is the parameter $Nb_F$ of page 114. Each set $(D_aF)^{-1}(0_n)$ has then size exactly 2. Any
function $F$ having this latter property is called zero-difference 2-balanced;$^{15}$ see [451, 462].
The zero-difference 2-balancedness of some classes of quadratic APN functions seen in
[283] is a corollary of Proposition 173 since the functions in these classes have unbalanced
components. Note, as observed in [283], that for every $\delta$, all quadratic zero-difference $\delta$-balanced
functions are differentially $\delta$-uniform.

11.4 The known infinite classes of AB functions


We begin with AB functions, because when dealing subsequently with APN functions,
we shall just complete the list, and also for historical reasons, since AB functions were
considered first (under different names in the domain of sequences, as seen at page 383). All
the functions in this subsection and the next one are viewed within the structure of the finite
field F2n , n odd; that of semifield has been used in [77, 896]; we refer the reader to these
papers for more details.

11.4.1 Power AB functions


The first known examples of AB functions have been power functions $x \mapsto x^d$ on the field
$\mathbb{F}_{2^n}$ ($n$ odd) for reasons also explained at page 383. The exponents $d$ of these power functions
are (1) those given below (and summarized in Table 11.1), whose largest classes are the two
first and (2) the inverses modulo $2^n - 1$ of these values. These inverses have been studied
in [731, 908].
• $d = 2^i + 1$ with $\gcd(i, n) = 1$ and $1 \leq i \leq \frac{n-1}{2}$ (proved by Gold, see [540, 908]). The
condition $1 \leq i \leq \frac{n-1}{2}$ (here and below) is not necessary but we mention it because the
other values of $i$ give EA equivalent functions. These power functions are called Gold
AB functions.
• $d = 2^{2i} - 2^i + 1 = \frac{2^{3i}+1}{2^i+1}$ with $\gcd(i, n) = 1$ and $2 \leq i \leq \frac{n-1}{2}$ (we exclude $i = 1$
since then the function is the cube function, that is a Gold function). The AB property
of this function is equivalent to a result historically due to Welch, but never published
by him, and is a particular case of a result of Kasami [669]; see other proofs in [470]
and [443]. These power functions are called Kasami AB functions (some authors call
them Kasami–Welch functions). Note that, denoting by $G_i(x)$ the Gold AB function
$x^{2^i+1}$ over $\mathbb{F}_{2^n}$, and by $L(x)$ the linear function $x^{2^{2i}} + x$, Kasami function $K_i(x)$ not
only equals $G_{3i} \circ G_i^{-1}$ but also equals $G_i \circ L \circ G_i^{-1}(x) + x^{2^{2i}} + x^{2^i} + x$ (and is therefore
EA equivalent to $G_i \circ L \circ G_i^{-1}$); more generally, for every nonzero $\mu \in \mathbb{F}_{2^n}$, denoting
$L_\mu(x) = x^{2^{2i}} + \mu x$, function $\mu K_i(x)$ equals $G_i \circ L_\mu \circ G_i^{-1}(x) + x^{2^{2i}} + \mu^{2^i} x^{2^i} + \mu^{2^i+1} x$;
indeed,
$$G_i \circ L_\mu \circ G_i^{-1}(x) = \left(x^{\frac{2^{2i}}{2^i+1}} + \mu x^{\frac{1}{2^i+1}}\right)^{2^i+1} = x^{\frac{2^{3i}+2^{2i}}{2^i+1}} + \mu x^{\frac{2^{3i}+1}{2^i+1}} + \mu^{2^i} x^{\frac{2^{2i}+2^i}{2^i+1}} + \mu^{2^i+1} x = x^{2^{2i}} + \mu K_i(x) + \mu^{2^i} x^{2^i} + \mu^{2^i+1} x.$$
More is observed in [144].
• $d = 2^{(n-1)/2} + 3$ (conjectured by Welch and proved by Canteaut, Charpin, and Dobbertin;
see [195, 196, 471]). These functions are called Welch functions.
• $d = 2^{(n-1)/2} + 2^{(n-1)/4} - 1$ if $n \equiv 1 \pmod 4$, $d = 2^{(n-1)/2} + 2^{(3n-1)/4} - 1$ if $n \equiv 3 \pmod 4$
(conjectured by Niho, proved by Hollmann and Xiang, after the work by
Dobbertin; see [472, 608]). These functions are called Niho functions.

15 Such ZDB functions have, however, more applications when they are over cyclic groups.

Table 11.1 Known AB exponents on $\mathbb{F}_{2^n}$ ($n$ odd) up to equivalence and to inversion.

Functions   Exponents d                             Conditions
Gold        $2^i + 1$                               $\gcd(i, n) = 1$
Kasami      $2^{2i} - 2^i + 1$                      $\gcd(i, n) = 1$
Welch       $2^t + 3$                               $n = 2t + 1$
Niho        $2^t + 2^{t/2} - 1$, $t$ even           $n = 2t + 1$
            $2^t + 2^{(3t+1)/2} - 1$, $t$ odd
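The exponents of Table 11.1 can be checked by brute force in small odd dimensions. The sketch below is an illustration under stated assumptions: the field polynomials $X^5+X^2+1$ and $X^7+X+1$, the helper names, and the use of the bitwise dot product in place of the trace form are all choices made here (the set of absolute Walsh values does not depend on the nondegenerate bilinear form used). A function is declared AB when every $W_F(u,v)$ with $v \neq 0$ lies in $\{0, \pm 2^{(n+1)/2}\}$.

```python
from math import gcd

POLYS = {5: 0b100101, 7: 0b10000011}  # X^5+X^2+1 and X^7+X+1 (assumed choices)


def make_power(n, poly):
    """Return x, e -> x^e in F_{2^n} = F_2[X]/(poly)."""
    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if (a >> n) & 1:
                a ^= poly
        return r

    def power(x, e):
        r = 1
        while e:
            if e & 1:
                r = mul(r, x)
            x = mul(x, x)
            e >>= 1
        return r
    return power


def fwht(values):
    """Fast Walsh-Hadamard transform: out[u] = sum_x values[x] * (-1)^(u.x)."""
    vals, h = list(values), 1
    while h < len(vals):
        for start in range(0, len(vals), 2 * h):
            for j in range(start, start + h):
                x, y = vals[j], vals[j + h]
                vals[j], vals[j + h] = x + y, x - y
        h *= 2
    return vals


def is_almost_bent(F, n):
    """True iff all W_F(u, v), v != 0, lie in {0, +-2^((n+1)/2)} (n odd)."""
    allowed = {0, 1 << ((n + 1) // 2)}
    for v in range(1, 1 << n):
        signs = [-1 if bin(v & y).count("1") & 1 else 1 for y in F]
        if any(abs(w) not in allowed for w in fwht(signs)):
            return False
    return True


def known_ab_exponents(n):
    """Exponents of Table 11.1 for odd n (not reduced up to equivalence)."""
    t = (n - 1) // 2
    exps = {}
    for i in range(1, t + 1):
        if gcd(i, n) == 1:
            exps[f"Gold i={i}"] = 2**i + 1
            if i >= 2:
                exps[f"Kasami i={i}"] = 2**(2 * i) - 2**i + 1
    exps["Welch"] = 2**t + 3
    exps["Niho"] = 2**t + (2**(t // 2) if t % 2 == 0 else 2**((3 * t + 1) // 2)) - 1
    return exps


def ab_report(n):
    """Map each exponent name of Table 11.1 to the outcome of the AB test."""
    power = make_power(n, POLYS[n])
    return {name: is_almost_bent([power(x, d) for x in range(1 << n)], n)
            for name, d in known_ab_exponents(n).items()}


print(known_ab_exponents(5), ab_report(5))
```

As a contrast, the same test applied to the inverse function $x^{2^n-2}$ for $n = 5$ fails, consistent with the inverse function not being AB.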

The almost bentness can be proved in two steps: (1) prove the almost perfect nonlinearity;
the noneasy cases (Kasami, Welch, and Niho) can be treated by Dobbertin’s general method
introduced in [472] and further developed in [474], called the multivariate method (see the
end of the Appendix for an example of this method); (2) prove then ABness by using
Proposition 163, page 382, and McEliece’s theorem, page 156, in the cases of the Welch
and Niho functions. The global proofs of ABness are not easy except in the case of Gold
functions, and too long to be included here.
The direct proof that the Gold function above is AB is easy by using the properties
of quadratic functions. Since it is a power permutation, we can restrict the study of the
Walsh transform to the component function $tr_n(x^{2^i+1})$. The linear kernel of this component
function $\{x \in \mathbb{F}_{2^n};\ tr_n(x^{2^i} y + x y^{2^i}) = tr_n((x^{2^i} + x^{2^{n-i}})\, y) = 0,\ \forall y \in \mathbb{F}_{2^n}\}$ has the equation
$x^{2^{2i}} + x = 0$, and equals then $\mathbb{F}_2$, since $\gcd(2^{2i} - 1, 2^n - 1) = 1$. Function $tr_n(x^{2^i+1} + ax)$ is
constant on $\mathbb{F}_2$ if and only if $tr_n(a) = 1$. The value at $a$ of the Walsh transform equals then
$\pm 2^{\frac{n+1}{2}}$ if $tr_n(a) = 1$ and is null otherwise. This proves ABness. The support of $W_F(u, v)$
has equation $tr_n\!\left(\frac{u}{v^{1/(2^i+1)}}\right) = 1$. The Walsh transform sign is studied in [734].
The inverse of $x^{2^i+1}$ is $x^d$, where $d = \sum_{k=0}^{\frac{n-1}{2}} 2^{2ik}$, and $x^d$ has therefore the algebraic
degree $\frac{n+1}{2}$ [908] (hence, the bound of Proposition 155, page 370, is tight).
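Both claims above lend themselves to a quick numerical check. The following sketch (helper names and the polynomial $X^5+X^2+1$ are assumptions for illustration) computes the Walsh transform of $tr_n(x^{2^i+1})$ on $\mathbb{F}_{2^5}$ with the trace inner product, and evaluates the inverse exponent $d = \sum_k 2^{2ik}$ modulo $2^n - 1$ together with its binary weight, which is the algebraic degree of $x^d$.

```python
N, POLY = 5, 0b100101  # F_{2^5} = F_2[X]/(X^5 + X^2 + 1), an assumed choice


def mul(a, b):
    """Multiplication in F_{2^N}."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def frobenius(x, k):
    """x^(2^k), obtained by k successive squarings."""
    for _ in range(k):
        x = mul(x, x)
    return x


def trace(x):
    """Absolute trace x + x^2 + ... + x^(2^(N-1)); the result is 0 or 1."""
    t, y = 0, x
    for _ in range(N):
        t ^= y
        y = mul(y, y)
    return t


def gold_component_walsh(i):
    """W(a) = sum_x (-1)^(tr(x^(2^i+1) + a x)) for every a in F_{2^N}."""
    size = 1 << N
    return [
        sum((-1) ** trace(mul(frobenius(x, i), x) ^ mul(a, x)) for x in range(size))
        for a in range(size)
    ]


def gold_inverse_exponent(n, i):
    """d = sum_{k=0}^{(n-1)/2} 2^(2ik), with exponents of 2 reduced modulo n."""
    return sum(1 << ((2 * i * k) % n) for k in range((n - 1) // 2 + 1))


walsh = gold_component_walsh(1)
d = gold_inverse_exponent(5, 1)
print(sorted(set(walsh)), d, bin(d).count("1"))
```

In this instance the transform is nonzero exactly at the points $a$ with $tr_n(a) = 1$, where its absolute value is $2^{(n+1)/2} = 8$, and $d(2^i+1) \equiv 1 \pmod{2^n-1}$ with $d$ of binary weight $(n+1)/2$.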
It has been proved in [443, theorem 7] and [448, theorem 15] that, if $3i$ is congruent with
1 mod $n$, then the Walsh support of the Kasami Boolean function $tr_n(x^{2^{2i}-2^i+1})$ equals$^{16}$ the
support of the Gold Boolean function $tr_n(x^{2^i+1})$ (i.e., the set $\{x \in \mathbb{F}_{2^n};\ tr_n(x^{2^i+1}) = 1\}$).
The Walsh support of the Kasami functions is also determined in [742] when $5i \equiv 1$ [mod $n$]
(it is more complex). The knowledge of the Walsh support gives the absolute value (but not
the sign) of the Walsh transform of the Kasami function, this function being a permutation.
It has been shown in [548, 734] that, for every AB power function $x^d$ over $\mathbb{F}_{2^n}$ whose
restriction to any subfield of $\mathbb{F}_{2^n}$ is also AB, the value $\sum_{x\in\mathbb{F}_{2^n}} (-1)^{tr_n(x^d+x)}$ equals $2^{\frac{n+1}{2}}$
if $n \equiv \pm 1$ [mod 8] and $-2^{\frac{n+1}{2}}$ if $n \equiv \pm 3$ [mod 8].

16 For $n$ even, it equals the set $\{x \in \mathbb{F}_{2^n};\ tr_2^n(x^{2^i+1}) = 0\}$, where $tr_2^n$ is the trace function from $\mathbb{F}_{2^n}$ to the field
$\mathbb{F}_{2^2}$: $tr_2^n(x) = x + x^4 + x^{4^2} + \cdots + x^{4^{\frac{n}{2}-1}}$.
Note that the knowledge of the support of the Walsh transform also gives information
on autocorrelation: according to the Wiener–Khintchine formula, the Fourier–Hadamard
transform of function $a \mapsto \mathcal{F}(D_a f) = \sum_{x\in\mathbb{F}_2^n} (-1)^{D_a f(x)}$ equals the square of the Walsh
transform of $f$. In the case that $3i$ is congruent with 1 mod $n$ for instance, since the value
at $b$ of the square of the Walsh transform of $f$ equals $2^{n+1}\, tr_n(b^{2^i+1})$, then by applying the
inverse Fourier–Hadamard transform (that is, by applying the Fourier–Hadamard transform
again and dividing by $2^n$), $\mathcal{F}(D_a f)$ equals twice the Fourier–Hadamard transform of the
function $tr_n(x^{2^i+1})$. We deduce that, except at the zero vector, $\mathcal{F}(D_a f)$ equals the opposite
of the Walsh transform of the function $tr_n(x^{2^i+1})$.
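The Wiener–Khintchine relation used above holds for every Boolean function and is easy to confirm on a small truth table. In this sketch the truth table is an arbitrary 4-variable example chosen for illustration (the constant `0x6A3C` and all helper names are assumptions, not taken from the source).

```python
def fwht(values):
    """Fast Walsh-Hadamard transform: out[u] = sum_x values[x] * (-1)^(u.x)."""
    vals, h = list(values), 1
    while h < len(vals):
        for start in range(0, len(vals), 2 * h):
            for j in range(start, start + h):
                x, y = vals[j], vals[j + h]
                vals[j], vals[j + h] = x + y, x - y
        h *= 2
    return vals


def walsh_transform(f):
    """Walsh transform of the Boolean function given by its truth table f."""
    return fwht([(-1) ** bit for bit in f])


def autocorrelation(f):
    """a -> sum_x (-1)^(f(x) + f(x + a)), i.e. the function a -> F(D_a f)."""
    size = len(f)
    return [sum((-1) ** (f[x] ^ f[x ^ a]) for x in range(size)) for a in range(size)]


# Arbitrary 4-variable Boolean function: bit x of the constant is f(x).
f = [(0x6A3C >> x) & 1 for x in range(16)]

lhs = fwht(autocorrelation(f))             # Fourier-Hadamard transform of a -> F(D_a f)
rhs = [w * w for w in walsh_transform(f)]  # square of the Walsh transform of f
print(lhs == rhs)
```

Applying `fwht` once more to `rhs` and dividing by $2^n$ recovers the autocorrelation, which is the inversion step used in the paragraph above.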

It is proved in [429] (see also [159, 1140]) that power functions are CCZ equivalent if
and only if their exponents or their inverses are in the same cyclotomic coset. The algebraic
degrees of functions in Table 11.1 show their pairwise CCZ inequivalence in general.
It was conjectured by Hans Dobbertin that the list of power AB functions is complete. No
counterexample to this conjecture has been found (see page 390).
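The cyclotomic-coset criterion quoted above from [429] is purely arithmetic, so it can be sketched in a few lines. The function names below are hypothetical; the test implements the criterion exactly as stated in the text (the exponents, or the inverse of one of them modulo $2^n - 1$ when it exists, lie in the same cyclotomic coset of 2).

```python
from math import gcd


def cyclotomic_coset(d, n):
    """Cyclotomic coset of 2 modulo 2^n - 1 containing d."""
    modulus = (1 << n) - 1
    coset, e = set(), d % modulus
    while e not in coset:
        coset.add(e)
        e = (2 * e) % modulus
    return coset


def equivalent_power_exponents(d1, d2, n):
    """Criterion of the text: d2 lies in the coset of d1 or of d1^(-1) mod 2^n - 1."""
    modulus = (1 << n) - 1
    if d2 % modulus in cyclotomic_coset(d1, n):
        return True
    if gcd(d1, modulus) == 1:
        inverse = pow(d1, -1, modulus)   # modular inverse, Python 3.8+
        return d2 % modulus in cyclotomic_coset(inverse, n)
    return False


# n = 5: Gold exponents 3 and 5, Kasami exponent 13, Welch exponent 7.
for pair in ((3, 13), (3, 5), (5, 7)):
    print(pair, equivalent_power_exponents(*pair, 5))
```

For $n = 5$ this reports that the Kasami exponent 13 falls in the coset of $3^{-1} = 21$, in line with the exceptional case $(n, s, r) = (5, 1, 2)$ of Proposition 177 (ii) later in this section.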

11.4.2 Nonpower AB functions


It had been conjectured in [257] that all AB functions are equivalent to power functions (and
then to permutations). This conjecture has been disproved, in a first step by exhibiting in
[163] AB functions that are EA inequivalent to any power function and to any permutation,
but that are by construction CCZ equivalent to the Gold function x → x 3 , and in a second
step by finding AB functions that are CCZ inequivalent to power functions (at least for some
values of n) [158]. Note that an easy case where a function is provably EA inequivalent to
power functions is when a component function trn (vF ) has algebraic degree larger than 1
and different from the algebraic degree of F [163].

AB functions CCZ equivalent to power functions


To construct APN and AB $(n, n)$-functions from known ones by using CCZ
equivalence, one needs, given such a function $F$, to find an affine permutation $L$ of $\mathbb{F}_2^n \times \mathbb{F}_2^n$
such that, denoting $L(x, y) = (L_1(x, y), L_2(x, y))$, where $L_1(x, y), L_2(x, y) \in \mathbb{F}_2^n$, the
function $F_1(x) = L_1(x, F(x))$ is a permutation. This is a necessary and sufficient condition
for the image of the graph of $F$ by $L$ to be the graph of a function. Two cases of such $L$ were
found in [162, 163] for the function $F(x) = x^{2^i+1}$, where $\gcd(i, n) = 1$, giving new classes of
AB functions:
• The function $F(x) = x^{2^i+1} + (x^{2^i} + x)\, tr_n(x^{2^i+1} + x)$, where $n > 3$ is odd and
$\gcd(n, i) = 1$, is AB. It is provably EA inequivalent to any power function [162, 163]
and it is EA inequivalent to any permutation [163, 771], which disproved the conjecture
above.
• For $n$ odd, $m \mid n$, $m \neq n$ and $\gcd(n, i) = 1$, the $(n, n)$-function
$$x^{2^i+1} + tr_m^n(x^{2^i+1}) + x^{2^i}\, tr_m^n(x) + x\, tr_m^n(x)^{2^i} + \left[tr_m^n(x)^{2^i+1} + tr_m^n(x^{2^i+1}) + tr_m^n(x)\right]^{\frac{1}{2^i+1}} \left(x^{2^i} + tr_m^n(x)^{2^i} + 1\right) + \left[tr_m^n(x)^{2^i+1} + tr_m^n(x^{2^i+1}) + tr_m^n(x)\right]^{\frac{2^i}{2^i+1}} \left(x + tr_m^n(x)\right),$$
where $tr_m^n$ denotes the trace function $tr_m^n(x) = \sum_{i=0}^{n/m-1} x^{2^{mi}}$ from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^m}$, is an AB
function of algebraic degree $m + 2$, which is provably EA inequivalent to any power
function; the question of knowing whether it is EA inequivalent to any permutation is
open.
It would be good to find similarly classes of AB functions by using CCZ equivalence
with Kasami (resp. Welch, Niho) functions. For $n$ odd, the Kasami function $x^{4^k-2^k+1}$
equals $F_2 \circ F_1^{-1}(x)$, where $F_1(x)$ and $F_2(x)$ are respectively the Gold functions $x^{2^k+1}$
and $x^{2^{3k}+1}$. Hence, the first step would be to investigate permutations of the form
$L_1(x^{2^k+1}) + L_2(x^{2^{3k}+1})$, that is, to find $L_{11}$ and $L_{21}$ linear such that for every $u \neq 0$
and every $x$, we have $L_{11}(x^{2^k} u + x u^{2^k} + u^{2^k+1}) + L_{21}(x^{2^{3k}} u + x u^{2^{3k}} + u^{2^{3k}+1}) \neq 0$.
However, it is conjectured in [145] that for a non-Gold power APN (or AB) function,
CCZ equivalence coincides with EA equivalence together with inverse transformation,
and it is proven (with the help of a check by computer) that this conjecture is true for
$n \leq 8$.
• The AB functions constructed in [162, 163] cannot be obtained from power functions
by applying only EA equivalence and inverse transformation, but Budaghyan shows in
[140] that some AB functions EA inequivalent to power functions can be constructed
by only applying EA equivalence and inverse transformation to power AB functions, for
instance the function $\left(x^{\frac{1}{2^i+1}} + tr_3^n(x + x^{2^{2i}})\right)^{-1}$.
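The first class above can be tested in the smallest admissible dimension. The sketch below builds $F(x) = x^{2^i+1} + (x^{2^i} + x)\,tr_n(x^{2^i+1} + x)$ for $n = 5$, $i = 1$ and checks almost bentness by brute force; the field polynomial $X^5+X^2+1$, the helper names, and the use of the bitwise dot product for the Walsh transform are assumptions made for illustration.

```python
N, POLY = 5, 0b100101  # F_{2^5} = F_2[X]/(X^5 + X^2 + 1), an assumed choice


def mul(a, b):
    """Multiplication in F_{2^N}."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def trace(x):
    """Absolute trace from F_{2^N} to F_2; the result is 0 or 1."""
    t, y = 0, x
    for _ in range(N):
        t ^= y
        y = mul(y, y)
    return t


def ccz_gold_variant(x, i=1):
    """x^(2^i+1) + (x^(2^i) + x) * tr(x^(2^i+1) + x)."""
    frob = x
    for _ in range(i):
        frob = mul(frob, frob)            # frob = x^(2^i)
    gold = mul(frob, x)
    return gold ^ ((frob ^ x) if trace(gold ^ x) else 0)


def walsh_abs_values(F):
    """Set of |W_F(u, v)| over all u and all v != 0 (dot-product form)."""
    size, values = len(F), set()
    for v in range(1, size):
        for u in range(size):
            total = sum(
                -1 if bin((v & F[x]) ^ (u & x)).count("1") & 1 else 1
                for x in range(size)
            )
            values.add(abs(total))
    return values


def differential_uniformity(F):
    """Largest number of solutions of F(x) + F(x + a) = b over a != 0 and all b."""
    size, worst = len(F), 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[F[x] ^ F[x ^ a]] += 1
        worst = max(worst, max(counts))
    return worst


table = [ccz_gold_variant(x) for x in range(1 << N)]
print(differential_uniformity(table), sorted(walsh_abs_values(table)))
```

In this instance the Walsh values are those of an AB function, $\{0, \pm 2^{(n+1)/2}\}$, as expected from CCZ equivalence with the Gold function.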

AB functions CCZ inequivalent to power functions


The problems of (in)existence of AB functions CCZ inequivalent to power functions and
of quadratic APN functions EA inequivalent to Gold functions remained open after finding
the two classes above. A paper by Edel et al. [493] introduced two quadratic APN functions
from $\mathbb{F}_{2^{10}}$ (resp. $\mathbb{F}_{2^{12}}$) to itself. The first one, $x^3 + \alpha x^{36}$, was proved CCZ inequivalent to
power functions.
These two (quadratic) APN functions were sporadic, and this left open the question of
knowing whether a whole infinite class of APN functions not CCZ equivalent to power
functions could be exhibited. Moreover, the question of existence of such AB functions was
still open.
• The new following class of binomial AB functions for n divisible by 3 was found in
[151, 158] by Budaghyan, the author, Felke, and Leander:

Proposition 174 Let $s$ and $k$ be positive integers with $\gcd(s, 3k) = 1$ and $t \in \{1, 2\}$,
$i = 3 - t$. Let $d = 2^{ik} + 2^{tk+s} - (2^s + 1)$, $g_1 = \gcd(2^{3k} - 1, d/(2^k - 1))$ and $g_2 =
\gcd(2^k - 1, d/(2^k - 1))$. If $g_1 \neq g_2$, then the function
$$F : \mathbb{F}_{2^{3k}} \to \mathbb{F}_{2^{3k}}, \qquad x \mapsto x^{2^s+1} + \alpha^{2^k-1} x^{2^{ik}+2^{tk+s}},$$
where $\alpha$ is primitive in $\mathbb{F}_{2^{3k}}$, is AB when $k$ is odd and APN when $k$ is even.

It could be proved (mathematically) in [151, 158] that some of these functions


are EA inequivalent to power functions and CCZ inequivalent to some AB power
functions, deducing that they are CCZ inequivalent to all power functions for some
values of n:

Proposition 175 Let $s$ and $k \geq 4$ be positive integers such that $s \leq 3k - 1$,
$\gcd(k, 3) = \gcd(s, 3k) = 1$, and $i = sk$ [mod 3], $t = 2i$ [mod 3], $n = 3k$. If $a \in \mathbb{F}_{2^n}$ has
the order $2^{2k} + 2^k + 1$, then the function $F(x) = x^{2^s+1} + a x^{2^{ik}+2^{tk+s}}$ is an AB permutation
on $\mathbb{F}_{2^n}$ when $n$ is odd and is APN when $n$ is even. It is EA inequivalent to power functions
and CCZ inequivalent to Gold and Kasami mappings as shown by a computer-free
proof.

This class was the first infinite family of APN and AB functions CCZ inequivalent to
power functions, disproving a conjecture from [257] on the nonexistence of quadratic AB
functions inequivalent to Gold functions. This class has been generalized in [116, 118]
(see page 405), with Walsh spectra determined in [117].
• It has been shown by Budaghyan et al. in [160] that:

Proposition 176 For every odd positive integer $n$, the function $x^3 + tr_n(x^9)$ is AB on $\mathbb{F}_{2^n}$
(and it is APN for $n$ even).

This function is one of the only examples$^{17}$ with $x^3$ of a function AB for any $n$
odd. It is CCZ inequivalent to any Gold, inverse and Dobbertin functions on $\mathbb{F}_{2^n}$ if
$n \geq 7$ and EA inequivalent to power functions [160]. It has been extended in the
same reference into the AB function $x^3 + a^{-1} tr_n(a^3 x^9)$ (which is CCZ inequivalent
to all power functions according to [1140], since it has been proved that it is not
equivalent to Gold), and in [161] for $n$ divisible by 3 and odd into $x^3 + a^{-1} tr_3^n(a^3 x^9 +
a^6 x^{18})$ and $x^3 + a^{-1} tr_3^n(a^6 x^{18} + a^{12} x^{36})$. Coefficient $a$ for all three functions can be
reduced to $a = 1$ up to equivalence.$^{18}$ The principle of adding a Boolean function
to an APN function has been generalized into the so-called switching method (see
page 407).
• The eighth entry of Table 11.4, page 407 (displaying the known classes of quadratic
APN polynomials CCZ inequivalent to power functions) is potentially an infinite class,
and has been found in [142] by applying to the cube function x 3 the so-called isotopic
17 If we do not take into account that n is present in the definition of trn .
18 The situation is different for n even, with two different functions for a = 1 and a primitive.

shift $F \mapsto F_L(x) = F(x + L(x)) - F(x) - F(L(x))$ (where $L$ is linear), adapted from
an equivalence notion originally defined by Albert in the study of presemifields in odd
characteristic.$^{19}$
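Proposition 176 can be illustrated in one odd and one even dimension. The following sketch is an assumption-laden illustration (field polynomials $X^5+X^2+1$ and $X^6+X+1$, helper names, and the dot-product Walsh transform are choices made here): it tabulates $x^3 + tr_n(x^9)$ and checks APNness for $n = 5, 6$ and almost bentness for $n = 5$.

```python
POLYS = {5: 0b100101, 6: 0b1000011}  # X^5+X^2+1 and X^6+X+1 (assumed choices)


def make_mul(n):
    """Multiplication in F_{2^n} = F_2[X]/(POLYS[n])."""
    poly = POLYS[n]

    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if (a >> n) & 1:
                a ^= poly
        return r
    return mul


def cube_plus_trace_table(n):
    """Lookup table of x -> x^3 + tr_n(x^9) on F_{2^n}."""
    mul = make_mul(n)
    table = []
    for x in range(1 << n):
        cube = mul(mul(x, x), x)
        ninth = mul(mul(cube, cube), cube)
        t, y = 0, ninth
        for _ in range(n):             # absolute trace of x^9, equal to 0 or 1
            t ^= y
            y = mul(y, y)
        table.append(cube ^ t)
    return table


def differential_uniformity(F):
    """Largest number of solutions of F(x) + F(x + a) = b over a != 0 and all b."""
    size, worst = len(F), 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[F[x] ^ F[x ^ a]] += 1
        worst = max(worst, max(counts))
    return worst


def walsh_abs_values(F):
    """Set of |W_F(u, v)| over all u and all v != 0 (dot-product form)."""
    size, values = len(F), set()
    for v in range(1, size):
        for u in range(size):
            total = sum(
                -1 if bin((v & F[x]) ^ (u & x)).count("1") & 1 else 1
                for x in range(size)
            )
            values.add(abs(total))
    return values


f5, f6 = cube_plus_trace_table(5), cube_plus_trace_table(6)
print(differential_uniformity(f5), differential_uniformity(f6), sorted(walsh_abs_values(f5)))
```

The brute-force Walsh computation is only run for $n = 5$ to keep the running time small; a fast Walsh–Hadamard transform would be preferable in larger dimensions.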

An open question is to find infinite classes of AB functions CCZ inequivalent to power


functions and to quadratic functions. Actually, the very existence of such functions is an
open problem too. A former question on the existence of AB functions CCZ inequivalent to
permutations has been solved: for n = 7, all 484 quadratic AB functions found in [1145] are
CCZ inequivalent to permutations (Yuyin Yu, private communication, 2018). At the moment,
the only known AB functions CCZ equivalent to permutations are the power AB functions
and the binomials of Proposition 174.

11.5 The known infinite classes of APN functions


We list below the known infinite classes of APN functions (those which are not already seen
as AB functions).

11.5.1 Sporadic APN (and AB) functions


In four and five variables, all APN functions are known (they are classified under EA and
CCZ equivalences by Brinkmann and Leander in [134]; for n = 4, there are two EA
equivalence classes, one of which is not EA equivalent to a power function, and there is
one CCZ equivalence class; for n = 5, there are seven EA equivalence classes, two of which
are not EA equivalent to any power function, and all APN functions are CCZ equivalent to
one of three power functions). In six to eight variables, known APN functions lying outside
the known infinite classes are listed by Browning et al. and by Yu et al. in [135, 1145] (see
a few more, some of which in more variables, in [493, 494, 1118]). We refer the reader to
these papers for the tables they contain, which are useful when trying to state conjectures on
APN functions and for having precise knowledge of all known APN functions. For n = 6,
the classification of quadratic APN functions is complete: 13 quadratic APN functions are
given in [135] and, as proven in [492], up to CCZ equivalence these are the only quadratic
APN functions. Only one nonquadratic APN function is known outside the infinite classes
up to CCZ equivalence (it is, for n = 6, the Brinkmann–Leander–Edel–Pott function [494],
see page 407). In [1145], by establishing a correspondence between quadratic APN functions
and those n×n matrices over F2n which are symmetric with only zeros on the main diagonal
and such that every nonzero linear combination of the rows has rank n − 1, it is shown that
there are at least 490 CCZ inequivalent APN (7, 7)-functions (487 of which are quadratic)
and at least 8,180 for n = 8 (8,179 quadratic). For n odd, all power APN functions and the
known APN binomials (see Proposition 174) are permutations. For n even, the only known
APN permutation is constructed in [136] for n = 6. The existence of APN permutations for
even n ≥ 8 is an open problem.

19 All quadratic APN (6,6)-functions can be obtained from $x^3$ by isotopic shift; an extension in [143], where
instead of $xL(x)^{2^i} + x^{2^i}L(x)$, given by the isotopic shift of $x^{2^i+1}$, is taken $xL_1(x)^{2^i} + x^{2^i}L_2(x)$, with $L_1$ and
$L_2$ linear, leads to 15 new APN (9,9)-functions.



11.5.2 Power APN functions


As in the case of AB functions, the first known APN functions have been power functions
$x \mapsto x^d$ over $\mathbb{F}_{2^n}$; the exponents $d$ of these power functions are those given below (and
summarized in Table 11.2) and their inverses (when $n$ is odd); we do not repeat below the
exponents of AB functions, but we do in Table 11.2:
• $d = 2^i + 1$ with $\gcd(i, n) = 1$, $n$ even and $1 \leq i \leq \frac{n-2}{2}$ (Gold APN functions; see
[540, 908]). The proof that these functions are APN (whatever is the parity of $n$) is easy:
the equality $F(x) + F(x + 1) = F(y) + F(y + 1)$ is equivalent to $(x + y)^{2^i} = (x + y)$, and
thus implies that $x + y = 0$ or $x + y = 1$, since $i$ and $n$ are coprime. Hence, any equation
$F(x) + F(x + 1) = b$ admits at most two solutions. Gold functions being quadratic are
plateaued.
• $d = 2^{2i} - 2^i + 1$ with $\gcd(i, n) = 1$, $n$ even and $2 \leq i \leq \frac{n-2}{2}$ (Kasami APN functions; see
[641], see also [468]). The proof that such a function is APN is difficult. It comes down
to showing that the restriction to the hyperplane of equation $tr_n(x) = 0$ of a function
$\varphi$ such that $F(x) + F(x + 1) = \varphi(x^2 + x)$ (which exists since $F(x) + F(x + 1)$ is
invariant by translation by 1) is injective; Dobbertin shows a close connection between
$\varphi$ and the polynomial $P(x) = \frac{(tr_i(x))^{2^i+1}}{x^{2^i}}$, called Müller–Cohen–Matthews polynomial
(MCM polynomial) [380] and proves that $\varphi$ is a bijection in [468, 474]. Kasami APN
functions are plateaued as proved when 3 does not divide $n$ in [448] and for every even
$n$ in [1142].
• $d = 2^n - 2$, $n$ odd. The corresponding so-called multiplicative inverse permutation (or
simply inverse function) $x \mapsto F(x) = x^{2^n-2}$ (which equals $\frac{1}{x}$ if $x \neq 0$, and 0 otherwise)
is APN [71, 908]. Indeed, the equation $x^{2^n-2} + (x + 1)^{2^n-2} = b$ (that we can take with
$b \neq 0$, since the inverse function is a permutation) admits 0 and 1 for solutions if and
only if $b = 1$; and it (also) admits (two) solutions different from 0 and 1 if and only if
there exists $x \neq 0, 1$ such that $\frac{1}{x} + \frac{1}{x+1} = b$, that is, $x^2 + x = \frac{1}{b}$. It is well known that
such existence is equivalent to the fact that $tr_n\!\left(\frac{1}{b}\right) = 0$ (since 0 and 1 do not satisfy this
latter equation). Hence, $F$ is APN if and only if $tr_n(1) = 1$, that is, if $n$ is odd.
Consequently, the functions $x \mapsto x^{2^n-2^i-1}$, which are linearly equivalent to $F$
(through the linear isomorphism $x \mapsto x^{2^i}$), are also APN, if $n$ is odd.
If $n$ is even, then the equation $x^{2^n-2} + (x + 1)^{2^n-2} = b$ admits at most two solutions
if $b \neq 1$ and admits four solutions (the elements of $\mathbb{F}_4$) if $b = 1$, which means that $F$
opposes a good (but not optimal) resistance against differential cryptanalysis.
The inverse function is not plateaued, since we have seen that the set of values of its
Walsh spectrum equals the set of all integers $s \equiv 0$ [mod 4] in the range $[-2^{\frac{n}{2}+1} + 1;\ 2^{\frac{n}{2}+1} + 1]$.
The values of $V(v \cdot F)$ are calculated in [351]. A connection between the
differential properties of function $x^3$ and of the multiplicative inverse function (and more
generally between functions $x^{2^t-1}$ and $x^{2^{n-t+1}-1}$, using that $x^{2^t-1} + (x + 1)^{2^t-1} + 1 =
\frac{(x^{2^{t-1}}+x)^2}{x^2+x}$) is shown in [93], with a focus on $t = 3$.
In [757], it is proved that any function $F(x) = x^{-1} + G(x)$, where $G$ is any non-affine
polynomial, is APN on at most a finite number of fields $\mathbb{F}_{2^n}$.
• $d = 2^{\frac{4n}{5}} + 2^{\frac{3n}{5}} + 2^{\frac{2n}{5}} + 2^{\frac{n}{5}} - 1$, with $n$ divisible by 5 (Dobbertin function; see [473]). It has
been shown by Canteaut et al. [196] that this function cannot be AB: they showed that $C_F^{\perp}$
contains words whose Hamming weights are not divisible by $2^{\frac{n-1}{2}}$ (the Walsh spectrum
values of $F$ are divisible by $2^{\frac{2n}{5}}$ but not all by $2^{\frac{2n}{5}+1}$). The proof that the Dobbertin
function is APN is also difficult and comes down as well to showing that some mapping is
a permutation. Neither the nonlinearity nor the Walsh spectrum of the Dobbertin function
is known. The Dobbertin function is not plateaued as seen in [473].

Table 11.2 Known APN exponents up to equivalence (any n) and up to inversion (n odd).

Functions   Exponents d                                   Conditions
Gold        $2^i + 1$                                     $\gcd(i, n) = 1$
Kasami      $2^{2i} - 2^i + 1$                            $\gcd(i, n) = 1$
Welch       $2^t + 3$                                     $n = 2t + 1$
Niho        $2^t + 2^{t/2} - 1$, $t$ even                 $n = 2t + 1$
            $2^t + 2^{(3t+1)/2} - 1$, $t$ odd
Inverse     $2^{2t} - 1$                                  $n = 2t + 1$
Dobbertin   $2^{4t} + 2^{3t} + 2^{2t} + 2^t - 1$          $n = 5t$
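The differential behaviour of the inverse function described above (two solutions at most for $n$ odd, four solutions for $b = 1$ when $n$ is even) can be observed directly. This is a minimal sketch with assumed helper names and assumed field polynomials $X^5+X^2+1$ and $X^6+X+1$; it also covers the exponents $2^n - 2^i - 1$ for $n = 5$, the first of which ($i = 1$, $d = 29$) coincides with the Dobbertin exponent for $n = 5$.

```python
POLYS = {5: 0b100101, 6: 0b1000011}  # X^5+X^2+1 and X^6+X+1 (assumed choices)


def power_table(n, d):
    """Lookup table of x -> x^d on F_{2^n} = F_2[X]/(POLYS[n])."""
    poly = POLYS[n]

    def mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if (a >> n) & 1:
                a ^= poly
        return r

    table = []
    for x in range(1 << n):
        result, base, e = 1, x, d
        while e:                       # square-and-multiply
            if e & 1:
                result = mul(result, base)
            base = mul(base, base)
            e >>= 1
        table.append(result)
    return table


def differential_uniformity(F):
    """Largest number of solutions of F(x) + F(x + a) = b over a != 0 and all b."""
    size, worst = len(F), 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[F[x] ^ F[x ^ a]] += 1
        worst = max(worst, max(counts))
    return worst


def solutions_for_a_equal_1(F, b):
    """Number of x with F(x) + F(x + 1) = b."""
    return sum(1 for x in range(len(F)) if F[x] ^ F[x ^ 1] == b)


inverse_5 = power_table(5, 2**5 - 2)
inverse_6 = power_table(6, 2**6 - 2)
print(differential_uniformity(inverse_5), differential_uniformity(inverse_6))
print(solutions_for_a_equal_1(inverse_6, 1))
```

A usage note: the same `differential_uniformity` helper applies to any lookup table, so other exponents of Table 11.2 can be tried by changing the second argument of `power_table`.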

Nonlinearity
For $n$ even, the Gold, Kasami, and inverse functions have the best-known nonlinearity $2^{n-1} - 2^{\frac{n}{2}}$
[540, 669] (knowing whether there exist $(n, n)$-functions with nonlinearity strictly larger
than this value when $n$ is even is an open question). This is easily shown in the first case
by using the properties of quadratic functions; it has been proved by Kasami in the second
case, and it was first shown in [333] in the last case. Dobbertin functions have worse
nonlinearity.
The inverse function $x \mapsto x^{2^n-2} = x^{-1}$ has been chosen for the S-boxes of the AES
with $n = 8$ because of its bijectivity, good nonlinearity, good differential uniformity (which
is suboptimal:$^{20}$ equal to 4), highest possible algebraic degree $n - 1$, nonplateauedness,
simplicity, etc. The computation of its output can be adapted to the device on which it
is done thanks to the fact that 8 is a power of 2 and $x^{-1}$ can then be computed by
decomposition over subfields. An example of such decomposition is as follows: we can write
$x^{-1} = x^{2^{n/2}} (x^{2^{n/2}+1})^{-1}$; we have then a product between $x^{2^{n/2}}$, which is a linear function
over $\mathbb{F}_{2^n}$, and the inverse of $x^{2^{n/2}+1}$, which lives in the subfield $\mathbb{F}_{2^{n/2}}$. This method can
be iterated; note that over $\mathbb{F}_{2^2}$, the inverse function equals $x^2$ and is then linear. This allows
minimizing the number of nonlinear multiplications needed for computing $x^{-1}$, which plays
a role with respect to countermeasures against side-channel attacks (see page 433).
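The subfield decomposition just described can be sketched for $n = 8$. The code below is an illustration, not the way any particular AES implementation works: it assumes the AES reduction polynomial $X^8+X^4+X^3+X+1$ and hypothetical helper names, and iterates the identity $x^{-1} = x^{2^{n/2}} (x^{2^{n/2}+1})^{-1}$ down to $\mathbb{F}_{2^2}$, where inversion is the squaring map.

```python
POLY = 0x11B  # X^8 + X^4 + X^3 + X + 1, the AES reduction polynomial


def mul(a, b):
    """Multiplication in F_{2^8} = F_2[X]/(POLY)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> 8) & 1:
            a ^= POLY
    return r


def frobenius(x, k):
    """x^(2^k): k squarings, a map that is linear over F_2."""
    for _ in range(k):
        x = mul(x, x)
    return x


def tower_inverse(x):
    """Inverse in F_{2^8} (with 0 -> 0) through the subfields F_{2^4} and F_{2^2}."""
    y = mul(frobenius(x, 4), x)          # y = x^17 lies in F_{2^4}
    z = mul(frobenius(y, 2), y)          # z = y^5 lies in F_{2^2}
    z_inv = frobenius(z, 1)              # inversion in F_{2^2} is squaring
    y_inv = mul(frobenius(y, 2), z_inv)  # y^-1 = y^4 * (y^5)^-1
    return mul(frobenius(x, 4), y_inv)   # x^-1 = x^16 * (x^17)^-1


def power(x, e):
    """Plain square-and-multiply, used only as a reference."""
    result = 1
    while e:
        if e & 1:
            result = mul(result, x)
        x = mul(x, x)
        e >>= 1
    return result


print(hex(tower_inverse(0x53)), hex(power(0x53, 254)))
```

In this sketch, `tower_inverse` calls `mul` with two non-fixed operands four times; every other operation is a Frobenius power and hence linear over $\mathbb{F}_2$.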
For $n$ odd, Gold and Kasami functions are AB. The nonlinearity of inverse function equals
the highest even number bounded above by $2^{n-1} - 2^{\frac{n}{2}}$, as also shown in [333] (this result
has drawn K. Nyberg to focus on the inverse function in [908], which contributed to the
invention of AES). Lachaud and Wolfmann proved (as we already mentioned at page 215)
in [733] that the set of values of its Walsh spectrum equals the set of all integers $s \equiv 0$ [mod
4] in the range $[-2^{\frac{n}{2}+1} + 1;\ 2^{\frac{n}{2}+1} + 1]$, whatever is the parity of $n$; see more in [601]. See
[196] for a list of all known permutations with best-known nonlinearity. See also [467].

20 We speak here of the sub-S-boxes; the global AES S-box is the concatenation of 16 differentially 4-uniform
functions and has then differential uniformity $4 \cdot 2^{15\cdot 8} = 2^{122}$.

Inequivalence between functions


It is shown in [159] that distinct Gold functions are CCZ inequivalent. We have seen at page
281 that two power functions are CCZ equivalent if and only if they are EA equivalent or
one of them is EA equivalent to the inverse of the other. Hence, given the algebraic degrees
of Kasami functions, any two distinct functions taken among Gold and Kasami functions21
are CCZ inequivalent (which was already shown in [159] when one function is Gold and the
other is Kasami). The CCZ inequivalence between any $n$-variable
Gold function and any Niho function for $n \geq 9$ is shown in [1140]. It is also shown in [247] that any plateaued
function in even dimension that is CCZ equivalent to a Gold or Kasami APN function is
necessarily EA equivalent to it. This result is revisited in [1141, corollary 1] after a study of
the more general framework of plateaued APN functions.
Inverse and Dobbertin functions are CCZ inequivalent to all other known APN functions
and between them because of their peculiar Walsh spectra, as also first observed in [159].
The situation is summarized:

Proposition 177 [1140, proposition 2]
(i) The Gold functions $x^{2^s+1}$ and $x^{2^t+1}$ on $\mathbb{F}_{2^n}$; $s, t < n/2$, are CCZ equivalent if and
only if $s = t$.
(ii) The Gold function $x^{2^s+1}$ and the Kasami function $x^{4^r-2^r+1}$ on $\mathbb{F}_{2^n}$; $s, r < n/2$, are
CCZ equivalent if and only if either $s = r = 1$ or $(n, s, r) = (5, 1, 2)$.
(iii) On $\mathbb{F}_{2^n}$ with $n$ odd and $n \geq 9$, the Gold function $x^{2^s+1}$ and the Welch function are
always CCZ inequivalent.
(iv) On $\mathbb{F}_{2^n}$ with $n$ odd and $n \geq 9$, the Gold function $x^{2^s+1}$ and the Niho function are
always CCZ inequivalent.
(v) The Kasami functions $x^{4^r-2^r+1}$ and $x^{4^s-2^s+1}$ on $\mathbb{F}_{2^n}$; $r, s < n/2$, are CCZ equivalent
if and only if $r = s$.
(vi) On $\mathbb{F}_{2^n}$ with $n$ odd and $n \geq 9$, the Kasami function $x^{4^r-2^r+1}$ and the Welch function
are always CCZ inequivalent.
(vii) On $\mathbb{F}_{2^n}$ with $n$ odd and $n \geq 9$, the Kasami function $x^{4^r-2^r+1}$ and the Niho function
are always CCZ inequivalent.
(viii) On $\mathbb{F}_{2^n}$ with $n$ odd and $n \geq 11$, the Welch function and the Niho function are always
CCZ inequivalent.

It is proven in [134] that there exists no APN function CCZ inequivalent to power
mappings on $\mathbb{F}_{2^n}$ for $n \leq 5$. See also [494, table 3], where the so-called switching classes
related to the switching construction described at page 407 are investigated for $n = 5$; there
are three classes with representatives $x^3$, $x^5$ and $x^{-1}$; in general, a switching class containing
an APN function $F$ is not included in the CCZ equivalence class containing $F$, but in the
case of $\mathbb{F}_{2^5}$, they are the same. This fact is given as a comment in the second paragraph after
remark 2 in this same paper [494], where it is also indicated that, in the case $n = 8$, the EA
switching class of the Gold function $x^3$ contains 17 CCZ inequivalent functions.

21 A Gold function with $i = 1$ and a Kasami function with $i = 1$ as well are not distinct.

Remark. Proving the CCZ inequivalence between two functions is mathematically (and
also computationally) difficult, unless some CCZ invariant parameters can be proved
different for the two functions. Examples of direct proofs of CCZ inequivalence using only
the definition can be found in [158, 159, 160].

Examples$^{22}$ of CCZ invariant parameters are the following (see [135] and [494] where
they are introduced and used, as well as group algebra interpretations):
• The extended Walsh spectrum.
• The equivalence class of the code $\widetilde{C}_F$ defined at page 379 (under the relation of
equivalence of codes), and all the invariants related to this code (the weight enumerator of
$\widetilde{C}_F$, the weight enumerator of its dual, which however corresponds to the extended Walsh spectrum
of the function, the automorphism group, etc., which coincide with some of the invariants
below).
• The $\Gamma$-rank: let $\mathcal{G} = \mathbb{F}_2[\mathbb{F}_2^n \times \mathbb{F}_2^n]$ be the group algebra of $\mathbb{F}_2^n \times \mathbb{F}_2^n$ over $\mathbb{F}_2$, consisting of
the formal sums $\sum_{g\in\mathbb{F}_2^n\times\mathbb{F}_2^n} a_g\, g$, where $a_g \in \mathbb{F}_2$. If $S$ is a subset of $\mathbb{F}_2^n \times \mathbb{F}_2^n$, then it can
be identified with the element $\sum_{s\in S} s$ of $\mathcal{G}$. The dimension of the ideal of $\mathcal{G}$ generated
by the graph $G_F = \{(x, F(x));\ x \in \mathbb{F}_2^n\}$ of $F$ is called the $\Gamma$-rank of $F$. The $\Gamma$-rank
equals (see [494]) the rank of the matrix $M_{G_F}$ whose term indexed by $(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^n$
and by $(a, b) \in \mathbb{F}_2^n \times \mathbb{F}_2^n$ equals 1 if $(x, y) \in (a, b) + G_F$ and equals 0 otherwise.
• The $\Delta$-rank, that is, the dimension of the ideal of $\mathcal{G}$ generated by the set $D_F =
\{(a, F(x) + F(x + a));\ a, x \in \mathbb{F}_2^n;\ a \neq 0\}$ (according to Proposition 158, this set has size
$2^{2n-1} - 2^{n-1}$ and is a difference set when $F$ is AB). The $\Delta$-rank equals the rank of the
matrix $M_{D_F}$ whose term indexed by $(x, y)$ and by $(a, b)$ equals 1 if $(x, y) \in (a, b) + D_F$
and equals 0 otherwise.
• The order of the automorphism group of the design $dev(G_F)$, whose points are the
elements of $\mathbb{F}_2^n \times \mathbb{F}_2^n$ and whose blocks are the sets $(a, b) + G_F$ (and whose incidence
matrix is $M_{G_F}$); this is the group of all those permutations on $\mathbb{F}_2^n \times \mathbb{F}_2^n$ that map every
such block to a block.
• The order of the automorphism group of the design $dev(D_F)$, whose points are the
elements of $\mathbb{F}_2^n \times \mathbb{F}_2^n$ and whose blocks are the sets $(a, b) + D_F$ (and whose incidence
matrix is $M_{D_F}$).
• The order of the automorphism group $\mathcal{M}(G_F)$ of the so-called multipliers of $G_F$, that is,
the permutations $\pi$ of $\mathbb{F}_2^n \times \mathbb{F}_2^n$ such that $\pi(G_F)$ is a translate $(a, b) + G_F$ of $G_F$. This order
is easier to compute, and it makes it possible in some cases to prove CCZ inequivalence
easily. As observed in [135], $\mathcal{M}(G_F)$ is the automorphism group of the code $\widetilde{C}_F$.
• The order of the automorphism group $\mathcal{M}(D_F)$.
• A CCZ-invariant lower bound on the minimum distance to other APN functions [153].

CCZ equivalence does not preserve crookedness nor the algebraic degree.
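As an illustration of how such an invariant can be evaluated on small instances, the following minimal sketch computes the Γ-rank directly from the matrix description given above, by Gaussian elimination over $\mathbb{F}_2$. The field modulus $X^4 + X + 1$, the integer encoding of field elements and all function names are assumptions of this example, not taken from [494]. It compares $x^3$ on $\mathbb{F}_{2^4}$ with the EA equivalent function $x^3 + x$ and with the function $x^3 + (x^2 + x + 1)\,\mathrm{tr}_n(x^3)$, which is CCZ equivalent to $x^3$ (see Subsection 11.5.3).

```python
def gf_mul(a, b, n, poly):
    """Product of a and b in F_{2^n} = F_2[X]/(poly); elements are encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r

def gf_trace(x, n, poly):
    """Absolute trace x + x^2 + ... + x^(2^(n-1)), returned as 0 or 1."""
    t = 0
    for _ in range(n):
        t ^= x
        x = gf_mul(x, x, n, poly)
    return t

def rank_gf2(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot            # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def gamma_rank(table, n):
    """Rank of M_{G_F}: the entry ((x, y), (a, b)) is 1 iff (x, y) lies in (a, b) + G_F."""
    size = 1 << n
    rows = []
    for x in range(size):
        for y in range(size):
            row = 0
            for a in range(size):
                b = y ^ table[x ^ a]        # (x, y) = (a, b) + (x + a, F(x + a))
                row |= 1 << (a | (b << n))
            rows.append(row)
    return rank_gf2(rows)

N, POLY = 4, 0b10011                        # F_16 = F_2[X]/(X^4 + X + 1) (assumed modulus)
cube = [gf_mul(x, gf_mul(x, x, N, POLY), N, POLY) for x in range(1 << N)]
ea_equiv = [cube[x] ^ x for x in range(1 << N)]                      # x^3 + x
ccz_equiv = [cube[x] ^ (gf_mul(x, x, N, POLY) ^ x ^ 1) * gf_trace(cube[x], N, POLY)
             for x in range(1 << N)]        # x^3 + (x^2 + x + 1) tr_n(x^3)
identity = list(range(1 << N))              # a linear function, for comparison

ranks = {name: gamma_rank(t, N) for name, t in
         [("x^3", cube), ("x^3+x", ea_equiv), ("ccz", ccz_equiv), ("x", identity)]}
print(ranks)
```

The three equivalent functions must yield the same rank, since a linear permutation of $\mathbb{F}_2^n \times \mathbb{F}_2^n$ only permutes the rows and columns of $M_{G_F}$; the brute-force matrix has $2^{2n}$ rows, so this approach is only practical for very small $n$.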

22 There are other CCZ invariants known; we describe all those efficient for APN functions.

Exceptional exponents
The exponents $d$ such that the function $x^d$ is APN on infinitely many extensions $\mathbb{F}_{2^n}$ of $\mathbb{F}_2$
are called exceptional (see [135, 444, 642]). We have seen above that a power function
$x^d$ is APN if and only if the function $x^d + (x + 1)^d + 1$ (we write “+1” so that 0 is
a root, which simplifies presentation) is 2-to-1 and that, for every $(n, n)$-function $F$ over
$\mathbb{F}_{2^n}$, there exists a polynomial $P$ such that $F(x) + F(x + 1) + F(1) = P(x + x^2)$.
In each case of the Gold and Kasami functions, one of these polynomials $P$ is an
exceptional polynomial (i.e., is a permutation over infinitely many fields $\mathbb{F}_{2^n}$); from there
comes the term. In the case of the Gold function $x^{2^i+1}$, we have
$P(x) = x + x^2 + x^{2^2} + \cdots + x^{2^{i-1}}$,
which is a linear function over the algebraic closure of $\mathbb{F}_2$ having kernel
$\{x \in \mathbb{F}_{2^i};\ \mathrm{tr}_i(x) = 0\}$ and is therefore a permutation over $\mathbb{F}_{2^n}$ for every $n$ coprime with
$i$. In the case of Kasami exponents, the polynomial is related to the MCM polynomials;
see page 400. It had been conjectured in [642] that Gold and Kasami exponents are the
only exceptional exponents. This conjecture was proved in [603]. It has been shown
in [40] that if the degree of a function given in univariate representation is not divisible
by 4, and if this degree is not a Gold or a Kasami exponent, and if the polynomial
contains a term of odd degree, then the function cannot be APN over infinitely many
extensions of $\mathbb{F}_2$. See more on exceptional APN functions in [418, 419] and the references
therein.
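The two facts recalled at the beginning of this paragraph can be checked numerically on a small field. The sketch below is only an illustration: the choice of $\mathbb{F}_{2^5}$ with modulus $X^5 + X^2 + 1$ and all function names are assumptions of this example. It tests, for every exponent $1 \le d \le 30$, that $x^d$ is APN exactly when $x^d + (x+1)^d + 1$ is 2-to-1, and it verifies the identity $x^{2^i+1} + (x+1)^{2^i+1} + 1 = P(x + x^2)$ with $P(x) = x + x^2 + \cdots + x^{2^{i-1}}$ for the Gold exponents.

```python
N, POLY = 5, 0b100101                   # F_32 = F_2[X]/(X^5 + X^2 + 1) (assumed modulus)
SIZE = 1 << N

def gf_mul(a, b):
    """Product in F_{2^N}, elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def gf_pow(a, e):
    """a^e by square and multiply."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def is_apn(table):
    """F is APN iff every derivative x -> F(x) + F(x + a), a != 0, takes 2^(n-1) values."""
    return all(len({table[x] ^ table[x ^ a] for x in range(SIZE)}) == SIZE // 2
               for a in range(1, SIZE))

def is_two_to_one(values):
    return all(values.count(v) == 2 for v in set(values))

def delta(d):
    """Value table of x^d + (x + 1)^d + 1."""
    return [gf_pow(x, d) ^ gf_pow(x ^ 1, d) ^ 1 for x in range(SIZE)]

def gold_P(y, i):
    """P(y) = y + y^2 + ... + y^(2^(i-1))."""
    acc = 0
    for _ in range(i):
        acc ^= y
        y = gf_mul(y, y)
    return acc

# APN criterion for power functions, tested on every exponent 1 <= d <= 30.
agree = all(is_apn([gf_pow(x, d) for x in range(SIZE)]) == is_two_to_one(delta(d))
            for d in range(1, SIZE - 1))

# Gold case d = 2^i + 1: x^d + (x + 1)^d + 1 = P(x + x^2).
gold_ok = all(delta(2**i + 1)[x] == gold_P(x ^ gf_mul(x, x), i)
              for i in (1, 2, 3, 4) for x in range(SIZE))
print(agree, gold_ok)
```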

11.5.3 Nonpower APN functions


As for AB functions, it had been wrongly conjectured that all APN functions were EA
equivalent to power functions.

APN functions CCZ equivalent to power functions


Using also the notion of CCZ equivalence, two more infinite classes of APN functions
have been introduced by Budaghyan, the author, and Pott in [162, 163], which disprove
the conjecture above:
• The function $F(x) = x^{2^i+1} + (x^{2^i} + x + 1)\,\mathrm{tr}_n(x^{2^i+1})$, where $n \ge 4$ is even and
$\gcd(n, i) = 1$, which is EA inequivalent to any power function.
• For $n$ even and divisible by 3, the function $F(x)$ equal to
$[x + \mathrm{tr}_{n/3}(x^{2(2^i+1)} + x^{4(2^i+1)}) + \mathrm{tr}_n(x)\,\mathrm{tr}_{n/3}(x^{2^i+1} + x^{2^{2i}(2^i+1)})]^{2^i+1}$,
where $\gcd(n, i) = 1$, is APN and is EA inequivalent to any known APN function.


We display in Table 11.3 the APN functions found this way. Finding classes of
APN functions by using CCZ equivalence with Kasami (resp. Welch, Niho, Dobbertin,
inverse) functions is an open problem.
In [1166], some observations about the fact that, starting from a quadratic APN
function, it is possible to obtain a CCZ-equivalent function that is EA inequivalent to
any quadratic function are made.
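The first class above can be tested on small fields. The following sketch (field moduli, helper names and the choice $i = 1$ are assumptions of this example) builds the value table of $x^{2^i+1} + (x^{2^i} + x + 1)\,\mathrm{tr}_n(x^{2^i+1})$ for $n = 4$ and $n = 6$, checks APNness by exhaustive search, and computes the algebraic degree from the algebraic normal forms of the coordinate functions.

```python
def gf_mul(a, b, n, poly):
    """Product in F_{2^n} = F_2[X]/(poly), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r

def gf_trace(x, n, poly):
    """Absolute trace, returned as 0 or 1."""
    t = 0
    for _ in range(n):
        t ^= x
        x = gf_mul(x, x, n, poly)
    return t

def bcp_function(n, poly, i):
    """Value table of x^(2^i+1) + (x^(2^i) + x + 1) tr_n(x^(2^i+1)) on F_{2^n}."""
    table = []
    for x in range(1 << n):
        frob = x
        for _ in range(i):
            frob = gf_mul(frob, frob, n, poly)      # x^(2^i)
        gold = gf_mul(frob, x, n, poly)             # x^(2^i + 1)
        table.append(gold ^ (frob ^ x ^ 1) * gf_trace(gold, n, poly))
    return table

def is_apn(table):
    size = len(table)
    return all(len({table[x] ^ table[x ^ a] for x in range(size)}) == size // 2
               for a in range(1, size))

def algebraic_degree(table, n):
    """Maximum degree of the algebraic normal forms of the n coordinate functions."""
    deg = 0
    for bit in range(n):
        anf = [(v >> bit) & 1 for v in table]
        step = 1
        while step < len(anf):                      # binary Moebius transform
            for k in range(len(anf)):
                if k & step:
                    anf[k] ^= anf[k ^ step]
            step <<= 1
        deg = max([deg] + [bin(k).count("1") for k, c in enumerate(anf) if c])
    return deg

results = {}
for n, poly in [(4, 0b10011), (6, 0b1000011)]:      # X^4 + X + 1 and X^6 + X + 1
    t = bcp_function(n, poly, 1)
    results[n] = (is_apn(t), algebraic_degree(t, n))
print(results)
```

Since the algebraic degree is an EA invariant and Gold functions are quadratic, a degree different from 2 already shows EA inequivalence to the Gold function from which the example is derived.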

Table 11.3 Some APN functions CCZ equivalent to Gold functions and EA inequivalent to
power functions on $\mathbb{F}_{2^n}$ (constructed in [162, 163]).

Functions | Conditions | $d_{alg}$

$x^{2^i+1} + (x^{2^i} + x + \mathrm{tr}_n(1) + 1)\,\mathrm{tr}_n(x^{2^i+1} + x\,\mathrm{tr}_n(1))$ | $n \ge 4$, $\gcd(i, n) = 1$ | 3

$[x + \mathrm{tr}_{n/3}(x^{2(2^i+1)} + x^{4(2^i+1)}) + \mathrm{tr}_n(x)\,\mathrm{tr}_{n/3}(x^{2^i+1} + x^{2^{2i}(2^i+1)})]^{2^i+1}$ | $6 \mid n$, $\gcd(i, n) = 1$ | 4

$x^{2^i+1} + \mathrm{tr}^n_m(x^{2^i+1}) + x^{2^i}\,\mathrm{tr}^n_m(x) + x\,\mathrm{tr}^n_m(x)^{2^i}$
$+ [\mathrm{tr}^n_m(x)^{2^i+1} + \mathrm{tr}^n_m(x^{2^i+1}) + \mathrm{tr}^n_m(x)]^{\frac{1}{2^i+1}}\,(x^{2^i} + \mathrm{tr}^n_m(x)^{2^i} + 1)$
$+ [\mathrm{tr}^n_m(x)^{2^i+1} + \mathrm{tr}^n_m(x^{2^i+1}) + \mathrm{tr}^n_m(x)]^{\frac{2^i}{2^i+1}}\,(x + \mathrm{tr}^n_m(x))$ | $m \neq n$, $n$ odd, $m \mid n$, $\gcd(i, n) = 1$ | $m + 2$

APN functions CCZ inequivalent to power functions


• As recalled at page 397, two quadratic APN functions from $\mathbb{F}_{2^{10}}$ (resp. $\mathbb{F}_{2^{12}}$) to itself have
been introduced in [493]. The first one, $F(x) = x^3 + ux^{36}$, where $u \in \mathbb{F}_4 \setminus \mathbb{F}_2$, was proved
to be CCZ inequivalent to any power function by computing its Δ-rank. Proposition
174, page 397, which gives binomial AB functions when $n$ is odd, gives binomial APN
functions when $n$ is even, which generalize the second function: $F(x) = x^3 + \alpha^{15} x^{528}$,
where $\alpha$ is a primitive element of $\mathbb{F}_{2^{12}}$. Some of them can be proven CCZ inequivalent
to Gold and Kasami mappings, as seen in Proposition 175, and therefore, they are CCZ
inequivalent to all power mappings due to a result of [1140] (that if a quadratic function
is CCZ equivalent to a power function, then it is EA equivalent to a Gold map). A similar
class but with $n$ divisible by 4 was later given in [157]. A common framework exists for
the class of Proposition 175 and this new class:

Proposition 178 Let $n = tk$ be a positive integer, with $t \in \{3, 4\}$, and $s$ be such that $t, s, k$
are pairwise coprime and such that $t$ is a divisor of $k + s$. Let $\alpha$ be a primitive element of
$\mathbb{F}_{2^n}$ and $w = \alpha^e$, where $e$ is a multiple of $2^k - 1$, coprime with $2^t - 1$. Then the following
function is APN:
$F(x) = x^{2^s+1} + w\,x^{2^{k+s}+2^{k(t-1)}}$.

For n ≥ 12, these functions are EA inequivalent to power functions and CCZ inequivalent
to Gold and Kasami mappings as shown by a computer-free proof [158]. This implies that
they are CCZ inequivalent to all power mappings [1140].
• Proposition 175, page 398, has been generalized23 in [118] for $n$ divisible by 3 by
Bracken et al. into quadrinomial APN functions:
$F(x) = u^{2^k} x^{2^{2k}+2^{k+s}} + u\,x^{2^s+1} + v\,x^{2^{2k}+1} + w\,u^{2^k+1} x^{2^{k+s}+2^s}$ (11.10)

23 Note that Proposition 174 covers a larger class of APN functions than Proposition 175.

is APN on $\mathbb{F}_{2^{3k}}$, when $3 \mid k + s$, $(s, 3k) = (3, k) = 1$ and $u$ is primitive in $\mathbb{F}_{2^{3k}}$, $v \neq
w^{-1} \in \mathbb{F}_{2^k}$. It contains the trinomials introduced in [116]. The Walsh spectrum has been
determined in [123]. The general class of Proposition 178 is not generalized for the $n$
divisible by 4 case.
• The same paper [118] obtained multinomial APN functions for $n \equiv 2 \pmod 4$:
$F(x) = b\,x^{2^s+1} + b^{2^k} x^{2^{k+s}+2^k} + c\,x^{2^k+1} + \sum_{i=1}^{k-1} r_i\, x^{2^{i+k}+2^i}$, (11.11)
where $k, s$ are odd and coprime, $b \in \mathbb{F}_{2^{2k}}$ is not a cube, and $c \in \mathbb{F}_{2^{2k}} \setminus \mathbb{F}_{2^k}$, $r_i \in \mathbb{F}_{2^k}$, is
APN on $\mathbb{F}_{2^{2k}}$. Recently, in [146], it has been proved that these functions (11.11) are EA
equivalent to the functions of Proposition 181 below, which is itself generalized by the
class in [239]; see below.
• The construction of AB functions of Proposition 176, page 398, works for APN
functions: for any positive integer $n$, function $x^3 + \mathrm{tr}_n(x^9)$ is APN on $\mathbb{F}_{2^n}$. It is shown
in [160] that, if $F$ is an APN quadratic $(n, n)$-function and $f$ is a quadratic Boolean function
such that, for every $a \in \mathbb{F}^*_{2^n}$, there exists a linear Boolean function $\ell_a$ satisfying
$\beta_f(x, a) = f(x + a) + f(x) + f(a) + f(0) = \ell_a(\beta_F(x, a))$, then $F(x) + f(x)$ is
APN provided that, if $\beta_F(x, a) = 1$ for some $x \in \mathbb{F}_{2^n}$, then $\ell_a(1) = 0$.
Function $x^3 + \mathrm{tr}_n(x^9)$ is CCZ inequivalent to any Gold function on $\mathbb{F}_{2^n}$ if $n \ge 7$, as
proved in [160], and therefore it is CCZ inequivalent to any power function [1140]. The
extended Walsh spectrum of this function is the same as for the Gold functions as shown
in [114].
The approach that has led to function $x^3 + \mathrm{tr}_n(x^9)$ has been generalized (as for AB
functions) and new functions have been deduced: $x^3 + a^{-1}\mathrm{tr}_n(a^3 x^9)$ in [160], and for $n$
divisible by 3, $x^3 + a^{-1}\mathrm{tr}^n_3(a^3 x^9 + a^6 x^{18})$ and $x^3 + a^{-1}\mathrm{tr}^n_3(a^6 x^{18} + a^{12} x^{36})$ in [161]
(for $n$ even, each such function defines two CCZ inequivalent functions, one for $a = 1$
and one for any $a \neq 1$). Their APNness is proved in this latter reference by showing for
$n$ even that if $L$ is linear and $F(x) = x + L(x^3)$ is a permutation24 over $\mathbb{F}_{2^n}$, then $F(x^3)$
is APN, and by showing a more complex result for $n$ odd. The Walsh spectra of the
functions above are determined in [114, 166] (they are Gold-like Walsh spectra). When
$F$ is a Gold function, all possible APN mappings $F(x) + f(x)$, where $f$ is a Boolean
function, have been computed up to dimension 15. The only possibilities different from
$x^3 + \mathrm{tr}_n(x^9)$ are, for $n = 5$, the function $x^5 + \mathrm{tr}_n(x^3)$ (CCZ equivalent to Gold functions)
and, for $n = 8$, the function $x^9 + \mathrm{tr}_n(x^3)$ (CCZ inequivalent to power functions); see
more in [1096]. In Table 11.4 are displayed all the known classes of APN functions CCZ
inequivalent to power functions. It is shown in [161]:

Proposition 179 Let $n$ be any positive integer and $K$ some field extension of $\mathbb{F}_{2^n}$. Let $L$ be
an $\mathbb{F}_2$-linear mapping from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$ extended to an $\mathbb{F}_2$-linear mapping from $K$ to $K$. Let
$E$ be a coset in $K$ of a vector space containing $L(\mathbb{F}_{2^n})$. Assume that $F(x) = x + L(x^3)$ is
injective on $E$ and that the set $\{x^2 + x + 1;\ x \in E\}$ contains the set of elements $y$ of $\mathbb{F}_{2^n}$ such
that $\mathrm{tr}_n(y) = 0$. Then $F(x^3)$ is APN over $\mathbb{F}_{2^n}$.

24 Sufficient conditions are given in [161] for that.



Table 11.4 Known classes of quadratic APN polynomials CCZ inequivalent to power functions
on $\mathbb{F}_{2^n}$.

Functions | Conditions | Proven

$x^{2^s+1} + \alpha^{2^k-1} x^{2^{ik}+2^{tk+s}}$ | $n = pk$, $\gcd(k, p) = \gcd(s, pk) = 1$, $p \in \{3, 4\}$, $i = sk \bmod p$, $t = p - i$, $n \ge 12$, $\alpha$ primitive in $\mathbb{F}^*_{2^n}$ | Prop. 174, 175, 178; [158]

$a\,x^{2^i(q+1)} + x^{2^i+1} + x^{q(2^i+1)} + x^{q+1} + c\,x^{2^i q+1} + c^q x^{2^i+q}$ | $q = 2^m$, $n = 2m$, $\gcd(i, m) = 1$, $c \in \mathbb{F}_{2^n}$, $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_q$, $X^{2^i+1} + cX^{2^i} + c^q X + 1$ has no zero in $\mathbb{F}_{2^n}$ s.t. $x^{q+1} = 1$ | Prop. 181 [147]

$x^3 + a^{-1}\mathrm{tr}_n(a^3 x^9)$ | $a \neq 0$ | [160]

$x^3 + a^{-1}\mathrm{tr}^n_3(a^3 x^9 + a^6 x^{18})$ | $3 \mid n$, $a \neq 0$ | [161]

$x^3 + a^{-1}\mathrm{tr}^n_3(a^6 x^{18} + a^{12} x^{36})$ | $3 \mid n$, $a \neq 0$ | [161]

$\alpha\,x^{2^s+1} + \alpha^{2^k} x^{2^{2k}+2^{k+s}} + v\,x^{2^{2k}+1} + w\,\alpha^{2^k+1} x^{2^s+2^{k+s}}$ | $n = 3k$, $\gcd(k, 3) = \gcd(s, 3k) = 1$, $v, w \in \mathbb{F}_{2^k}$, $vw \neq 1$, $3 \mid (k + s)$, $\alpha$ primitive in $\mathbb{F}^*_{2^n}$ | see page 405 [116]

$(x + x^{2^m})^{2^k+1} + \beta(\alpha x + \alpha^{2^m} x^{2^m})^{(2^k+1)2^i} + \alpha(x + x^{2^m})(\alpha x + \alpha^{2^m} x^{2^m})$ | $n = 2m$, $m \ge 2$ even, $\gcd(k, m) = 1$ and $i \ge 2$ even, $\alpha$ primitive in $\mathbb{F}^*_{2^n}$, $\beta \in \mathbb{F}_{2^m}$ not a cube | [1182]

$a^2 x^{2^{2m+1}+1} + b^2 x^{2^{m+1}+1} + a\,x^{2^{2m}+2} + b\,x^{2^m+2} + (c^2 + c)x^3$ | $n = 3m$, $m$ odd; $U$ subgroup of $\mathbb{F}^*_{2^n}$ of order $2^{2m} + 2^m + 1$, $L(x) = a x^{2^{2m}} + b x^{2^m} + cx \in \mathbb{F}_{2^n}[x]$, s.t. $\forall v, t \in U$, $L(v) \notin \{0, v\}$ and $v^2 L(t) + tL(v)^2 \neq 0 \Rightarrow \frac{t^2 L(v) + vL(t)^2}{v^2 L(t) + tL(v)^2} \notin \mathbb{F}_{2^m}$ | [142]

$\alpha(\alpha^q x + \alpha x^q)(x^q + x) + (\alpha^q x + \alpha x^q)^{2^{2i}+2^{3i}} + a(\alpha^q x + \alpha x^q)^{2^{2i}}(x^q + x)^{2^i} + b(x^q + x)^{2^i+1}$ | $q = 2^m$, $n = 2m$, $\gcd(i, m) = 1$, $X^{2^i+1} + aX + b$ has no zero in $\mathbb{F}_{2^m}$ | [1077]

$x^3 + \beta(x^{2^i+1})^{2^k} + \beta^2 x^{3 \cdot 2^m} + (x^{2^{i+m}+2^m})^{2^k}$ | $n = 2m$; $m$ odd; $3 \nmid m$; $i = m - 2$ or $i = (m - 2)^{-1} \pmod n$; $\beta$ primitive in $\mathbb{F}_4$ | [165]

Note that if the output of $x^3 + \mathrm{tr}_n(x^9)$ is decomposed over an $\mathbb{F}_2$-basis of $\mathbb{F}_{2^n}$ that contains
element 1, function $x^3 + \mathrm{tr}_n(x^9)$ differs from $x^3$ for only one coordinate function. This is
an example of the idea originally due to Dillon (after [160]) and developed in [494] of the
switching construction: starting with a known APN function, one of the coordinate functions
is changed; this gives a function which is differentially 4-uniform and, in some rare cases,
APN. In general, each change of a coordinate function in an S-box can at most multiply
its differential uniformity $\delta$ by 2. Indeed, changing, for instance, the last coordinate function
and denoting by $F'$ the function obtained from $F$ by erasing the last coordinate, the equation
$F'(x) + F'(x + a) = b'$ corresponds to $F(x) + F(x + a) = (b', 0)$ or $F(x) + F(x +
a) = (b', 1)$, and the differential uniformity of any function obtained by changing the last
coordinate function is then at most $2\delta$ (but this change can lower the nonlinearity to 0).
This has led in [494] to an APN (6,6)-function CCZ inequivalent to power functions and
to quadratic functions (the only known, currently), which had been already found in [134]
but missed as a non quadratic function; we shall call it the Brinkmann–Leander–Edel–Pott
function; it equals, given $\alpha$ primitive:
$x^3 + \alpha^{17}(x^{17} + x^{18} + x^{20} + x^{24}) + \mathrm{tr}_2(x^{21}) + \mathrm{tr}_3(\alpha^{18} x^9)$
$+ \alpha^{14}\,\mathrm{tr}_6(\alpha^{52} x^3 + \alpha^6 x^5 + \alpha^{19} x^7 + \alpha^{28} x^{11} + \alpha^2 x^{13})$.
On the basis of a generalized switching construction, Göloğlu has proposed in [547] the
function $x^{2^k+1} + (\mathrm{tr}^n_m(x))^{2^k+1}$, where $\gcd(k, n) = 1$ and $n = 2m$, $m$ even, but this function
was proved affine equivalent to the Gold function in [166].
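The switching argument above lends itself to a small experiment. The sketch below is an illustration only: the field moduli, the helper names and the use of a seeded pseudorandom Boolean function are assumptions of this example. It checks by exhaustive search that $x^3 + \mathrm{tr}_n(x^9)$ is APN for $n = 5, 6, 7$, and that replacing the coordinate of $x^3$ along the basis element 1 by an arbitrary Boolean function never produces a differential uniformity larger than $2 \cdot 2 = 4$.

```python
import random

def gf_mul(a, b, n, poly):
    """Product in F_{2^n} = F_2[X]/(poly), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r

def gf_pow(a, e, n, poly):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n, poly)
        a = gf_mul(a, a, n, poly)
        e >>= 1
    return r

def gf_trace(x, n, poly):
    """Absolute trace, returned as 0 or 1."""
    t = 0
    for _ in range(n):
        t ^= x
        x = gf_mul(x, x, n, poly)
    return t

def diff_uniformity(table):
    """Largest number of solutions x of F(x) + F(x + a) = b over a != 0 and all b."""
    size, worst = len(table), 0
    for a in range(1, size):
        counts = {}
        for x in range(size):
            d = table[x] ^ table[x ^ a]
            counts[d] = counts.get(d, 0) + 1
        worst = max(worst, max(counts.values()))
    return worst

# Assumed moduli: X^5 + X^2 + 1, X^6 + X + 1, X^7 + X + 1.
FIELDS = {5: 0b100101, 6: 0b1000011, 7: 0b10000011}

def switched_cube(n):
    """Value table of x^3 + tr_n(x^9) on F_{2^n}."""
    poly = FIELDS[n]
    return [gf_pow(x, 3, n, poly) ^ gf_trace(gf_pow(x, 9, n, poly), n, poly)
            for x in range(1 << n)]

apn_results = {n: diff_uniformity(switched_cube(n)) for n in FIELDS}

# Switch the coordinate along the basis element 1 (bit 0) by random Boolean functions.
rng = random.Random(0)
n, poly = 6, FIELDS[6]
cube = [gf_pow(x, 3, n, poly) for x in range(1 << n)]
random_switch = [diff_uniformity([cube[x] ^ rng.getrandbits(1) for x in range(1 << n)])
                 for _ in range(20)]
print(apn_results, max(random_switch))
```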


• An idea of J. Dillon [445] was that $(n, n)$-functions (over $\mathbb{F}_{2^n}$) of the form
$F(x) = x(Ax^2 + Bx^q + Cx^{2q}) + x^2(Dx^q + Ex^{2q}) + Gx^{3q}$,
where $q = 2^{n/2}$, $n$ even, have good chances to be differentially 4-uniform. Such $F$ is
quadratic. For $a \in \mathbb{F}^*_{2^n}$, we consider then the equation $G_1 := F(x + a) + F(x) + F(a) =
a_1 x + a_2 x^2 + a_3 x^q + a_4 x^{2q} = 0$, where $a_1, \ldots, a_4 \in \mathbb{F}_{2^n}$. We deduce $G_2 := a_2^q G_1 +
a_4 G_1^q = b_1 x + b_2 x^2 + b_3 x^q = 0$, $G_3 := b_3^2 G_1 + a_3 b_3 G_2 + a_4 G_2^2 = c_1 x + c_2 x^2 + c_3 x^4 = 0$.
If either $c_1$, $c_2$, or $c_3$ is nonzero, then $F$ is differentially 4-uniform and possibly APN.


This idea was applied to more general functions by Budaghyan and the author in [147];
this resulted in trinomial APN functions:

Proposition 180 Let $n$ be even and let $\gcd(i, \frac{n}{2}) = 1$. Set $q = 2^{n/2}$ and let $c, b \in \mathbb{F}_{2^n}$ be
such that $c^{q+1} = 1$, $c \notin \{\lambda^{(2^i+1)(q-1)},\ \lambda \in \mathbb{F}_{2^n}\}$, $cb^q + b \neq 0$. Then
$F(x) = x^{2^{2i}+2^i} + b\,x^{q+1} + c\,x^{q(2^{2i}+2^i)}$
is APN on $\mathbb{F}_{2^n}$. Such vectors $b, c$ do exist if and only if $\gcd(2^i + 1, q + 1) \neq 1$. For $\frac{n}{2}$ odd,
this is equivalent to saying that $i$ is odd.
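Proposition 180 can be checked exhaustively in the smallest interesting case. The following sketch is only an illustration under stated assumptions: $n = 6$, $i = 1$, the modulus $X^6 + X + 1$ and all names are choices of this example. It enumerates all pairs $(b, c)$ satisfying the three conditions and tests the resulting trinomials $x^6 + b\,x^9 + c\,x^{48}$ for APNness.

```python
N, POLY = 6, 0b1000011                  # F_64 = F_2[X]/(X^6 + X + 1) (assumed modulus)
SIZE, Q, I = 1 << N, 1 << (N // 2), 1   # q = 2^(n/2) = 8; i = 1 is coprime with n/2 = 3

def gf_mul(a, b):
    """Product in F_{2^N}, elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def is_apn(table):
    return all(len({table[x] ^ table[x ^ a] for x in range(SIZE)}) == SIZE // 2
               for a in range(1, SIZE))

E1 = 2 ** (2 * I) + 2 ** I              # exponent 2^(2i) + 2^i
excluded_c = {gf_pow(lam, (2 ** I + 1) * (Q - 1)) for lam in range(SIZE)}
pow_e1 = [gf_pow(x, E1) for x in range(SIZE)]        # x^(2^(2i) + 2^i)
pow_norm = [gf_pow(x, Q + 1) for x in range(SIZE)]   # x^(q + 1)
pow_e1q = [gf_pow(x, Q * E1) for x in range(SIZE)]   # x^(q (2^(2i) + 2^i))

# All (b, c) with c^(q+1) = 1, c outside the excluded set, and c b^q + b != 0.
pairs = [(b, c) for c in range(1, SIZE)
         if gf_pow(c, Q + 1) == 1 and c not in excluded_c
         for b in range(SIZE)
         if (gf_mul(c, gf_pow(b, Q)) ^ b) != 0]

all_apn = all(is_apn([pow_e1[x] ^ gf_mul(b, pow_norm[x]) ^ gf_mul(c, pow_e1q[x])
                      for x in range(SIZE)])
              for b, c in pairs)
print(len(pairs), all_apn)
```

Here $\gcd(2^i + 1, q + 1) = \gcd(3, 9) = 3 \neq 1$, so admissible pairs are expected to exist, in agreement with the last statement of the proposition.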

The extended Walsh spectrum of these functions is the same as that of the Gold functions
[1181]. But it has been recently proved in [146] that these functions are EA equivalent to the
functions of the next proposition.
• The method also resulted in a class of hexanomial APN functions:

Proposition 181 [147] Let $n$ be even and $i$ be coprime with $\frac{n}{2}$. Set $q = 2^{n/2}$ and let $c \in \mathbb{F}_{2^n}$
and $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_q$. If the polynomial $X^{2^i+1} + cX^{2^i} + c^q X + 1$ has no zero $x \in \mathbb{F}_{2^n}$ such that
$x^{q+1} = 1$ (in particular if it is irreducible over $\mathbb{F}_{2^n}$), then the following function is APN on
$\mathbb{F}_{2^n}$:
$F(x) = x(x^{2^i} + x^q + c\,x^{2^i q}) + x^{2^i}(c^q x^q + a\,x^{2^i q}) + x^{(2^i+1)q}$.
The condition was shown achievable by computer investigation, then mathematically
in [122] and [97]; finally in [547], all the polynomials satisfying it have been characterized,
constructed, and counted. This class was generalized (up to CCZ equivalence) in [239]; see
below (the question whether this generalizing bivariate construction gives new functions up
to CCZ equivalence is open). It was checked with a computer that some of the APN functions
provided by Proposition 181 are CCZ inequivalent to power functions for $n = 6, 8, 10$. It
remains open to prove the same property for every even $n \ge 12$.
Cases where the hypothesis of Proposition 181 is satisfied were exhibited in [97, 122,
980]. The polynomials $X^{2^i+1} + cX^{2^i} + c^q X + 1$ are directly related to the polynomials
$X^{2^i+1} + X + a$. In [122], the coefficients $a \in \mathbb{F}^*_{2^n}$, such that this latter polynomial has no
zero in $\mathbb{F}_{2^n}$ when $\gcd(i, n) = 1$ and $n$ is even, are characterized. In particular, for $i = 1$, the
polynomial $X^3 + X + a$ has no zero (i.e., is irreducible) if and only if $a = u + u^{-1}$ where $u$
is not a cube in $\mathbb{F}_{2^n}$. Note that $X^3 + X$ is the Dickson polynomial $D_3$ (seen at page 389).
As shown in [146], this hexanomial construction is more general than those constructions
of Proposition 180 and those of (11.11) (strictly more general, because it is also defined for
$n$ even divisible by 4 while these are not).
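The characterization recalled above for $i = 1$ is easy to confirm by exhaustive search for small even $n$. The sketch below is an illustration only; the field moduli and the helper names are assumptions of this example. It compares the set of nonzero $a$ for which $X^3 + X + a$ has no zero in $\mathbb{F}_{2^n}$ with the set of values $u + u^{-1}$, $u$ ranging over the non-cubes of $\mathbb{F}^*_{2^n}$.

```python
def gf_mul(a, b, n, poly):
    """Product in F_{2^n} = F_2[X]/(poly), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r

def gf_pow(a, e, n, poly):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n, poly)
        a = gf_mul(a, a, n, poly)
        e >>= 1
    return r

def compare(n, poly):
    """Return (set of a != 0 with no zero of X^3 + X + a, set of u + 1/u for non-cubes u)."""
    size = 1 << n
    cube = [gf_pow(x, 3, n, poly) for x in range(size)]
    cubes = set(cube[1:])
    no_root = {a for a in range(1, size)
               if all(cube[x] ^ x ^ a for x in range(size))}
    from_non_cubes = {u ^ gf_pow(u, size - 2, n, poly)      # u + u^(-1)
                      for u in range(1, size) if u not in cubes}
    return no_root, from_non_cubes

# Assumed moduli: X^4 + X + 1 for n = 4 and X^6 + X + 1 for n = 6.
outcome = {n: compare(n, poly) for n, poly in [(4, 0b10011), (6, 0b1000011)]}
for n, (no_root, from_non_cubes) in outcome.items():
    print(n, len(no_root), no_root == from_non_cubes)
```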
• A method has been introduced by the author in [239] for constructing APN functions
from bent functions. Let $B$ be a bent $(n, \frac{n}{2})$-function and let $G$ be a function from
$\mathbb{F}_2^n$ to $\mathbb{F}_2^{n/2}$. Let $F : x \in \mathbb{F}_2^n \mapsto (B(x), G(x)) \in \mathbb{F}_2^{n/2} \times \mathbb{F}_2^{n/2}$. Then $F$ is APN if and
only if, for every nonzero $a \in \mathbb{F}_2^n$, and for every $c, d \in \mathbb{F}_2^{n/2}$, the system of equations
$B(x) + B(x + a) = c$, $G(x) + G(x + a) = d$
has zero or two solutions. Since $B$ is bent, the number of
solutions of the first equation always equals $2^{n/2}$ and such regularity can help. Functions
EA equivalent to the Brinkmann–Leander–Edel–Pott function can be obtained in the
form $(B(x, y), G(x, y))$ with $B(x, y) = sx^3 + ty^3 + ux^2 y + vxy^2$.
Taking $B$ equal to the Maiorana–McFarland function $B(x, y) = xy$ on $\mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$,
where $xy$ is the product of $x$ and $y$ in the field $\mathbb{F}_{2^{n/2}}$, and writing then $(a, b)$ with $a, b \in
\mathbb{F}_{2^{n/2}}$ instead of $a \in \mathbb{F}_2^n$, the system of equations above becomes, after changing $c$ into $c +
ab$:
$bx + ay = c$, $G(x, y) + G(x + a, y + b) = d$.
Then, by considering separately the cases
$a = 0$, $b \neq 0$ and $a \neq 0$, $F$ is APN if and only if:
1. For every $c \in \mathbb{F}_{2^{n/2}}$, the function $y \in \mathbb{F}_{2^{n/2}} \mapsto G(c, y)$ is APN.
2. For every $b, c \in \mathbb{F}_{2^{n/2}}$, the function $x \in \mathbb{F}_{2^{n/2}} \mapsto G(x, bx + c)$ is APN (this is easily
seen by replacing $b$ and $c$ respectively by $ab$ and $ac$ in the latter system).
This leads to the following bivariate APN functions:

Proposition 182 [239] Let $n$ be any even integer; let $i, j$ be integers such that
$\gcd(n/2, i - j) = 1$, and let $s \neq 0$, $t \neq 0$, $u$ and $v$ be elements of $\mathbb{F}_{2^{n/2}}$. Set
$G(x, y) = s\,x^{2^i+2^j} + u\,x^{2^i} y^{2^j} + v\,x^{2^j} y^{2^i} + t\,y^{2^i+2^j}$. Then the function
$F : (x, y) \in \mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}} \mapsto (xy, G(x, y)) \in \mathbb{F}_{2^{n/2}} \times \mathbb{F}_{2^{n/2}}$
is APN if and only if $G(x, 1) = s\,x^{2^i+2^j} + u\,x^{2^i} + v\,x^{2^j} + t$ has no zero in $\mathbb{F}_{2^{n/2}}$.
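The equivalence stated in Proposition 182 can be tested exhaustively in a small case. The sketch below is an illustration under assumptions of this example: $n = 6$ (so the half field is $\mathbb{F}_8$ with modulus $X^3 + X + 1$), $i = 1$, $j = 0$, the normalization $s = 1$, and pairs $(x, y)$ packed into a single integer. For every $t \neq 0$, $u$, $v$ it compares the APNness of $(x, y) \mapsto (xy, G(x, y))$ with the absence of zeros of $G(x, 1)$.

```python
M, POLY = 3, 0b1011                     # F_8 = F_2[X]/(X^3 + X + 1) (assumed), so n = 6
Q = 1 << M

def gf_mul(a, b):
    """Product in F_{2^M}, elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def is_apn(table):
    size = len(table)
    return all(len({table[z] ^ table[z ^ a] for z in range(size)}) == size // 2
               for a in range(1, size))

def G(x, y, s, t, u, v):
    """G(x, y) = s x^3 + u x^2 y + v x y^2 + t y^3, i.e. the case i = 1, j = 0."""
    x2, y2 = gf_mul(x, x), gf_mul(y, y)
    return (gf_mul(s, gf_mul(x2, x)) ^ gf_mul(u, gf_mul(x2, y))
            ^ gf_mul(v, gf_mul(x, y2)) ^ gf_mul(t, gf_mul(y2, y)))

def F_table(s, t, u, v):
    """(x, y) -> (xy, G(x, y)); input packed as x + 8*y, output as xy + 8*G(x, y)."""
    return [gf_mul(x, y) | (G(x, y, s, t, u, v) << M)
            for y in range(Q) for x in range(Q)]

s = 1
results = []
for t in range(1, Q):
    for u in range(Q):
        for v in range(Q):
            no_zero = all(G(x, 1, s, t, u, v) for x in range(Q))
            results.append((is_apn(F_table(s, t, u, v)), no_zero))
print(sum(apn for apn, _ in results), "APN functions among", len(results))
```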

For $j = 0$ and $s = 1$ (as in Proposition 181), such polynomials are called “projective”
by some authors. In [239], examples where the condition of Proposition 182 is satisfied
are investigated and it is shown that Propositions 180 and 181 and the APN functions
of Bracken et al. recalled above are cases of application of Proposition 182 (note that
the bivariate function $(x, y) \in \mathbb{F}_{2^{n/2}}^2 \mapsto xy$ is EA equivalent to the univariate function
$x \in \mathbb{F}_{2^n} \mapsto x^{2^{n/2}+1}$); see more in [243], where the univariate representation of these
functions is investigated and generalized in several ways. Note that it was mentioned (but
unpublished) by Göloğlu, Krasnayova, and Lisoněk at the conference Fq13, in 2017, that
any APN function of the particular form $x^3 + a\,x^{3 \cdot 2^{n/2}} + b\,x^{2 \cdot 2^{n/2}+1} + c\,x^{2+2^{n/2}}$, $a, b, c \in \mathbb{F}_{2^{n/2}}$,
is equivalent to $x^3$ or to $x^{2^{n/2-1}+1}$ or when $n = 6$ to the so-called Kim function (see the
definition of this function at page 411).
The construction of function $(x, y) \mapsto (xy,\ x^{2^i+1} + a\,y^{2^j(2^i+1)})$, where $a \neq 0$ is
assumed impossible to be written in the form $b^{2^i+1}(c + c^{2^i})^{1-2^j}$, has been proposed
by Zhou and Pott in [1182]. It is similar to the one of Proposition 182 but different. It
has been generalized in [243]:

Proposition 183 Let $n$ be any even integer; let $i$ be any integer coprime with $m = n/2$,
and let $P$, $Q$, $R$, and $S$ be linear homomorphisms of $\mathbb{F}_{2^m}$. Set $G(x, y) = P(x^{2^i+1}) +
Q(x^{2^i} y) + R(x y^{2^i}) + S(y^{2^i+1})$. Then the function
$F : (x, y) \in \mathbb{F}_{2^m} \times \mathbb{F}_{2^m} \mapsto (xy, G(x, y)) \in \mathbb{F}_{2^m} \times \mathbb{F}_{2^m}$
is APN if and only if, for every $a$ and $b$ in $\mathbb{F}_{2^m}$ such that $(a, b) \neq (0, 0)$, the linear
function $T_{a,b}(y) := P(a^{2^i+1} y) + Q(a^{2^i} b y) + R(a b^{2^i} y) + S(b^{2^i+1} y)$ satisfies
– If $m$ is odd, then $T_{a,b} : \mathbb{F}_{2^m} \to \mathbb{F}_{2^m}$ is bijective.
– If $m$ is even, then $(\ker T_{a,b}) \cap \{u^{2^i+1}(t^{2^i} + t);\ u \in \mathbb{F}^*_{2^m},\ t \in \mathbb{F}_{2^m}\} = \{0\}$.

• A new class has been found by Zhou and Pott in [1182]; see Table 11.4.
• Very recently, the penultimate entry in Table 11.4 was found in [1077]; it enters
in the framework of the proposition above.
• Still more recently, the last entry in the table was found in [165].
Note that each class of functions can be described in many different ways due to
equivalences. The reader must then not be surprised if there are small differences
between the representations given in Table 11.4 and in the body of the section. The
descriptions in the table are in some cases for subclasses, for which inequivalences
between the different entries could be shown. This table as well as other tables with
data on APN functions is periodically renewed at https://boolean.h.uib.no/mediawiki/index.php/Tables.

An APN permutation and the big open APN problem


The APN functions listed above for $n$ even are not permutations. This is problematic
since for implementation reasons (see, e.g., page 401), $n$ even is preferred. Block ciphers
using bijective APN (7, 7)-functions and (9, 9)-functions as S-boxes exist, such as the
MISTY block cipher [830] and its variant KASUMI [687], but have drawbacks. Most block
ciphers use then differentially 4-uniform permutations in even dimension. The question
(called the big open APN problem by Dillon) of knowing whether there exist APN
permutations when $n$ is even (which would allow simplifying the structure) was wide
open (as first mentioned in [910] and answered negatively for $n = 4$ in [624] thanks to
a computer investigation and in [176] mathematically) until Browning et al. exhibited in
[136] an APN permutation (of algebraic degree $n - 2$ and nonlinearity $2^{n-1} - 2^{n/2}$) in $n = 6$
variables (used later in the cryptosystem Fides [84], which has been subsequently broken
due to its weaknesses in the linear component). This permutation is CCZ equivalent to the
so-called Kim function $x^3 + x^{10} + \alpha x^{24}$ (given in [135]), whose associated code $C_F$ (see
Proposition 160) is therefore a double simplex code (see page 10). It is EA equivalent
to an involution and is studied further in [199, 944] where the butterfly construction is
introduced. This construction works with concatenations of bivariate functions $R(x, y)$ over
$\mathbb{F}_{2^{n/2}}$ ($n/2$ odd), which are viewed as $R_y(x)$ and are such that $R_y$ is bijective for every $y$. The
resulting butterflies have two CCZ equivalent representations, one of which, called closed
butterfly, can be taken quadratic (and may not be bijective) and has the form $(R_y(x), R_x(y))$,
while the other, called open butterfly, is the involution of the form $(R_{R_y^{-1}(x)}(y),\ R_y^{-1}(x))$.
This construction includes the APN permutation of [136], but unfortunately it is shown in
[199, 943] that it does not allow obtaining APN permutations in more than six variables.
The butterfly construction gives differentially 4-uniform involutions; see page 421.
The question of existence of APN permutations in even dimension $n$ remains open for
$n \ge 8$. There exist nonexistence results within the following classes:
• Plateaued functions (when APN, they have bent components; see page 391).
• A class of functions including power functions (see page 383).
• Functions whose univariate representation coefficients lie in $\mathbb{F}_{2^{n/2}}$, or in $\mathbb{F}_{2^4}$ for $n$ divisible
by 4 [624].
• Functions whose univariate representation coefficients satisfy $\sum_{i=0}^{\frac{2^n-1}{3}} a_{3i} = 0$ [184].
• Functions having at least one partially-bent component; it is indeed proved in [176]
(starting from the same idea as in [24], see page 383) that no component function of an
APN permutation can be partially-bent (this improves upon several previous results on
the components of APN permutations): $n$ being even, the linear kernel of such balanced
partially-bent component function $v \cdot F$ would have dimension at least 2, and since for
every $a, b$, we have $D_a(v \cdot F)(x) \oplus D_b(v \cdot F)(x) = D_{a+b}(v \cdot F)(x + a)$, there would
exist $a \neq 0$ such that $D_a(v \cdot F) = 0$; since $\mathrm{Im}(D_a F)$ has $2^{n-1}$ elements because $F$
is APN and is then contained in the hyperplane $\{y;\ v \cdot y = 0\}$, which also has $2^{n-1}$ elements,
$\mathrm{Im}(D_a F)$ would include 0, a contradiction with the bijectivity of $F$.

Finding infinite classes of APN functions CCZ inequivalent to power functions and to
quadratic functions is an open problem too.

11.5.4 The extended Walsh spectra of known APN functions


For $n$ odd, the known APN functions have three possible spectra (all satisfying $V(v \cdot F) =
2^{2n+1}$ for every $v \neq 0$); see, e.g., [62]:
• The spectrum of the AB functions, which gives a nonlinearity of $2^{n-1} - 2^{\frac{n-1}{2}}$.
• The spectrum of the inverse function, which takes any value divisible by 4 in
$[-2^{\frac{n}{2}+1} + 1;\ 2^{\frac{n}{2}+1} + 1]$ and gives a nonlinearity close to $2^{n-1} - 2^{\frac{n}{2}}$.
• The spectrum of the Dobbertin function which is more complex (it is divisible by $2^{n/5}$
and not divisible by $2^{2n/5+1}$ [196]); its nonlinearity seems to be bounded below by
approximately $2^{n-1} - 2^{3n/5-1} - 2^{2n/5-1}$ (maybe equal), but this has to be proven
(or disproven).

For n even, the spectra may be more diverse:


• The Gold functions (and all known infinite classes of quadratic APN functions [117, 123,
1061]), whose component functions are bent for two thirds of them and have nonlinearity
$2^{n-1} - 2^{\frac{n}{2}}$ for the remaining third; the Kasami functions, which have the same extended
spectra.
• The Dobbertin function (same observation as above).
• As soon as $n \ge 6$, we find (quadratic) APN functions with different spectra (e.g., $x^3 +
\alpha^{11} x^5 + \alpha^{13} x^9 + x^{17} + \alpha^{11} x^{33} + x^{48}$, for $n = 6$, with a seven-valued Walsh spectrum
found by Dillon).

The nonlinearities seem also bounded below by approximately $2^{n-1} - 2^{3n/5-1} - 2^{2n/5-1}$
(but this has to be proven . . . or disproven too). Note that the question of classifying APN
functions is open even when restricting ourselves to quadratic APN functions in more than
six variables (even classifying their Walsh spectra is open for even numbers of variables).
There is only one known example of quadratic APN function (with $n = 6$) having non-Gold-
like nonlinearity; see [445].
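The Gold-like spectra described in this subsection can be observed directly on $x^3$. The following sketch is only an illustration: the field moduli, the use of the usual dot product on integer encodings instead of the trace form (which reindexes the components and the Walsh values without changing their distribution), and all names are assumptions of this example. It computes, for each nonzero $v$, the absolute Walsh values of the component $v \cdot F$, for $n = 5$ and $n = 6$.

```python
def gf_mul(a, b, n, poly):
    """Product in F_{2^n} = F_2[X]/(poly), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r

def component_spectra(table, n):
    """For each v != 0, the list of |W_F(u, v)| over all u (dot-product convention)."""
    size = 1 << n
    par = [bin(k).count("1") & 1 for k in range(size)]       # parity lookup table
    spectra = {}
    for v in range(1, size):
        signs = [1 - 2 * par[v & y] for y in table]          # (-1)^(v . F(x))
        spectra[v] = [abs(sum(sg * (1 - 2 * par[u & x]) for x, sg in enumerate(signs)))
                      for u in range(size)]
    return spectra

def cube_table(n, poly):
    return [gf_mul(x, gf_mul(x, x, n, poly), n, poly) for x in range(1 << n)]

odd = component_spectra(cube_table(5, 0b100101), 5)          # x^3 on F_32 (AB)
even = component_spectra(cube_table(6, 0b1000011), 6)        # x^3 on F_64

values_odd = {w for spec in odd.values() for w in spec}
bent = [v for v, spec in even.items() if set(spec) == {8}]   # |W| = 2^(n/2) everywhere
others = {w for v, spec in even.items() if v not in bent for w in spec}
print(values_odd, len(bent), others)
```

For $n = 6$ the bent components are counted among the 63 nonzero values of $v$; the remaining components have maximal absolute Walsh value $2^{n/2+1}$, which corresponds to the nonlinearity $2^{n-1} - 2^{n/2}$.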

11.5.5 Conclusion on known APN functions


As we can see, very few functions usable as S-boxes have emerged so far. The only known
APN permutations are in odd dimension or in dimension 6, which is not convenient for
implementation. Besides, Gold functions, all the other found quadratic functions, and the
Welch functions have too low algebraic degrees for being widely chosen for the design
of new S-boxes. The Kasami functions themselves seem too closely related to quadratic
functions. The inverse function has many very nice properties: large Walsh spectrum and
good nonlinearity, differential uniformity of order at most 4, and fast implementation. But
differential uniformity 2 in a dimension equal to a power of 2 would be better, and the
inverse function has a potential weakness against algebraic attacks, which did not lead yet
to efficient attacks, but may in the future. So further studies on APN permutations seem
essential for the future designs of SP networks.

11.6 Differentially uniform functions


11.6.1 Characterizations by the Walsh transform
We have seen that APN functions are nicely characterized by their Walsh transform through
Relation (11.2), page 372. It is shown in [250] that other characterizations by the Walsh
transform exist for APN functions and that more generally, for each value of $\delta$, several
(in fact, an infinity of) characterizations by the Walsh transform of differentially $\delta$-uniform
functions exist. We follow here the presentation of [252]. Denoting for every $a, b \in \mathbb{F}_2^n$ and
every $(n, m)$-function $F$ by $N_F(b, a)$ the size of the set $\{x \in \mathbb{F}_2^n;\ D_a F(x) = D_a F(b)\}$, we
have that $F$ is differentially $\delta$-uniform if and only if, for every $a \neq 0_n$ in $\mathbb{F}_2^n$ and every $b \in
\mathbb{F}_2^n$, we have $N_F(b, a) \in \{2, 4, \ldots, \delta\}$. For every polynomial $\phi_\delta(X) = \sum_{j \ge 0} A_j X^j \in \mathbb{R}[X]$
such that $\phi_\delta(u) = 0$ for $u = 2, 4, \ldots, \delta$ and $\phi_\delta(u) > 0$ for every even $u \in \{\delta + 2, \ldots, 2^n\}$,
we have then for every $(n, m)$-function $F$ that

$\sum_{j \ge 0} A_j \sum_{a, b \in \mathbb{F}_2^n,\ a \neq 0_n} (N_F(b, a))^j = \sum_{a, b \in \mathbb{F}_2^n,\ a \neq 0_n}\ \sum_{j \ge 0} A_j (N_F(b, a))^j \ge 0$,

and that $F$ is differentially $\delta$-uniform if and only if this inequality is an equality.
The sum $\sum_{a, b \in \mathbb{F}_2^n} (N_F(b, a))^j$ is easily expressed by means of the Walsh transform of $F$:

Lemma 12 Let $F$ be any $(n, m)$-function. We have for $j \ge 1$:

$\sum_{a, b \in \mathbb{F}_2^n} (N_F(b, a))^j = \sum_{a, b \in \mathbb{F}_2^n,\ a \neq 0_n} (N_F(b, a))^j + 2^{n(j+1)} = 2^{-j(m+n)} \sum_{u_1, \ldots, u_j \in \mathbb{F}_2^n;\ v_1, \ldots, v_j \in \mathbb{F}_2^m} W_F^2\Big(\sum_{i=1}^{j} u_i, \sum_{i=1}^{j} v_i\Big) \prod_{i=1}^{j} W_F^2(u_i, v_i)$. (11.12)

This technical lemma is proved in [250] by raising at the $j$th power the equality
$N_F(b, a) = 2^{-m} \sum_{x \in \mathbb{F}_2^n,\ v \in \mathbb{F}_2^m} (-1)^{v \cdot (D_a F(x) + D_a F(b))}$, obtaining
$(N_F(b, a))^j = 2^{-jm} \sum_{x_i \in \mathbb{F}_2^n,\ v_i \in \mathbb{F}_2^m,\ i = 1, \ldots, j} (-1)^{\sum_{i=1}^{j} v_i \cdot (F(x_i) + F(x_i + a) + F(b) + F(b + a))}$,
and using that $y_i = x_i + a$
if and only if $\sum_{u_i \in \mathbb{F}_2^n} (-1)^{u_i \cdot (x_i + y_i + a)} = 2^n$ (idem for $c = b + a$). We deduce from Lemma
12:

Theorem 26 [250] Let $n$, $m$, and $\delta$ be positive integers, with $\delta$ even, and let $F$ be any
$(n, m)$-function. Let

$\phi_\delta(X) = \sum_{j \ge 0} A_j X^j \in \mathbb{R}[X]$

be any polynomial such that $\phi_\delta(u) = 0$ for $u = 2, 4, \ldots, \delta$ and $\phi_\delta(u) > 0$ for every even
$u \in \{\delta + 2, \ldots, 2^n\}$. Then we have

$2^n(2^n - 1)A_0 + \sum_{j \ge 1} 2^{-j(n+m)} A_j \Big( (W_F^2)^{\otimes(j+1)}(0_n, 0_m) - 2^{(2j+1)n + jm} \Big) \ge 0$, (11.13)

where $(W_F^2)^{\otimes(j+1)}$ is the $(j + 1)$-th order convolutional product of $W_F^2$:

$(W_F^2)^{\otimes(j+1)}(0_n, 0_m) = \sum_{u_1, \ldots, u_j \in \mathbb{F}_2^n;\ v_1, \ldots, v_j \in \mathbb{F}_2^m} W_F^2\Big(\sum_{i=1}^{j} u_i, \sum_{i=1}^{j} v_i\Big) \prod_{i=1}^{j} W_F^2(u_i, v_i)$.

Moreover, this inequality is an equality if and only if $F$ is differentially $\delta$-uniform.
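Relation (11.12) and the simplest case of the characterization (the polynomial $X - 2$, for $\delta = 2$) can be checked numerically on small functions. The sketch below is only an illustration: the parameters $n = m = 3$, the modulus $X^3 + X + 1$, the dot-product convention $W_F(u, v) = \sum_x (-1)^{v \cdot F(x) + u \cdot x}$ on integer encodings, the seeded random functions and all names are assumptions of this example.

```python
import random

N = M = 3                                   # an (n, m)-function with n = m = 3
SIZE = 1 << N

def gf_mul(a, b, n=3, poly=0b1011):
    """Product in F_8 = F_2[X]/(X^3 + X + 1) (assumed modulus)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r

def parity(x):
    return bin(x).count("1") & 1

def walsh_squares(F):
    """W[u][v] = W_F(u, v)^2."""
    return [[sum(1 - 2 * parity((v & F[x]) ^ (u & x)) for x in range(SIZE)) ** 2
             for v in range(1 << M)] for u in range(SIZE)]

def N_F(F, b, a):
    """Number of x such that D_a F(x) = D_a F(b)."""
    target = F[b] ^ F[b ^ a]
    return sum((F[x] ^ F[x ^ a]) == target for x in range(SIZE))

def lemma12_holds(F, j):
    """Compare both sides of (11.12) for j = 1 or j = 2, without any division."""
    lhs = sum(N_F(F, b, a) ** j for a in range(SIZE) for b in range(SIZE))
    W = walsh_squares(F)
    idx = [(u, v) for u in range(SIZE) for v in range(1 << M)]
    if j == 1:
        rhs = sum(W[u][v] ** 2 for u, v in idx)
    else:
        rhs = sum(W[u1 ^ u2][v1 ^ v2] * W[u1][v1] * W[u2][v2]
                  for u1, v1 in idx for u2, v2 in idx)
    return rhs == lhs * 2 ** (j * (M + N))

def phi2_sum(F):
    """Sum of N_F(b, a) - 2 over a != 0 and all b; it vanishes exactly for APN functions."""
    return sum(N_F(F, b, a) - 2 for a in range(1, SIZE) for b in range(SIZE))

def differential_uniformity(F):
    return max(sum((F[x] ^ F[x ^ a]) == c for x in range(SIZE))
               for a in range(1, SIZE) for c in range(1 << M))

rng = random.Random(1)
cube = [gf_mul(x, gf_mul(x, x)) for x in range(SIZE)]
samples = [cube] + [[rng.randrange(1 << M) for _ in range(SIZE)] for _ in range(5)]
for F in samples:
    print(differential_uniformity(F), phi2_sum(F),
          lemma12_holds(F, 1), lemma12_holds(F, 2))
```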



Relation (11.2), page 372, can then be deduced from Theorem 26 by choosing $\phi_2(X) =
X - 2$. Theorem 26 gives other interesting characterizations. We shall give one for each case
$\delta = 2$ and $\delta = 4$, but more can be found in [250]. Taking $\phi_2(X) = (X - 2)(X - 4)$, we
obtain:

Corollary 29 [250] Every $(n, n)$-function $F$ is APN if and only if

$\sum_{u_1, u_2 \in \mathbb{F}_2^n;\ v_1, v_2 \in \mathbb{F}_2^n;\ v_1 \neq 0_n,\ v_2 \neq 0_n,\ v_1 \neq v_2} W_F^2(u_1, v_1)\, W_F^2(u_2, v_2)\, W_F^2(u_1 + u_2, v_1 + v_2) = 2^{5n}(2^n - 1)(2^n - 2)$. (11.14)

Moreover, every $(n, n)$-function satisfies a version of (11.14) with “≥” in the place of
“=,” but to show this, Theorem 26 must be applied with $\phi_4(X) = (X - 2)(X - 4)$ and also
with $\phi_2(X) = X - 2$ (see [250]).
Applying Theorem 26 with $\phi_4(X) = (X - 2)(X - 4)$ when $m = n - 1$ gives:

Corollary 30 [250] Every $(n, n - 1)$-function $F$ is differentially 4-uniform if and only if

$\sum_{u_1, u_2 \in \mathbb{F}_2^n;\ v_1, v_2 \in \mathbb{F}_2^{n-1};\ v_1 \neq 0_{n-1},\ v_2 \neq 0_{n-1},\ v_1 \neq v_2} W_F^2(u_1, v_1)\, W_F^2(u_2, v_2)\, W_F^2(u_1 + u_2, v_1 + v_2) = 2^{5n}(2^{n-1} - 1)(2^{n-1} - 2)$. (11.15)

And every $(n, n - 1)$-function satisfies a version of (11.15) with “≥” in the place of “=.”
Note the similarity between these two corollaries, which shows that the two optimal
notions of APN $(n, n)$-function and differentially 4-uniform $(n, n - 1)$-function are close.
In [252], Theorem 26 is generalized into a characterization of all the criteria on vectorial
functions dealing with the numbers of solutions of equations of the form $\sum_{i \in I} F(x + u_{i,a}) +
L_a(x) + u_a = 0_m$, with $L_a$ linear. In particular, injective functions are characterized this
way. A characterization of o-polynomials originally given in [314] can also be obtained
by this generalization. And a generalization to differentially $\delta$-uniform functions of a
characterization by Nyberg of APN functions by means of the Walsh transforms of their
derivatives is also derived.

11.6.2 Componentwise Walsh uniformity (CWU)


We have seen at page 390 that the characterization of APNness by Relation (11.2) leads to
a stronger notion called CAPNness, in which the relation is satisfied by each component
function. The characterizations of APN (n, n)-functions and differentially 4-uniform (n, n −
1)-functions by Relations (11.14) and (11.15) in Corollaries 29 and 30 lead similarly to the
following EA invariant notion introduced in [251]: we call componentwise Walsh uniform
(CWU) those functions $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$, with m ∈ {n − 1, n}, which satisfy (11.14),
respectively (11.15), for each pair of component functions.

Definition 83 [251] An (n, m)-function F with m ∈ {n − 1, n} is called CWU if, for all
distinct nonzero $v_1, v_2 \in \mathbb{F}_2^m$, we have

\[
\sum_{u_1,u_2\in\mathbb{F}_2^n} W_F^2(u_1,v_1)\,W_F^2(u_2,v_2)\,W_F^2(u_1+u_2,v_1+v_2) = 2^{5n}.
\]

If m = n, then CWUness implies APNness. The converse is not true in general, but we
have the following result (we refer the reader to [251] for the proof):

Proposition 184 Any crooked function (in particular, any quadratic APN (n, n)-function)
is CWU.
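
Definition 83 and Proposition 184 can be checked numerically on a very small case. The following is only a sketch under our own assumptions: F_8 is represented by integers 0..7 with the irreducible polynomial x^3 + x + 1, the inner product is the usual dot product on bit vectors, and the helper names (gf_mul, walsh_squared, cwu_sums) are hypothetical. It evaluates the sum of Definition 83 for the quadratic APN function x^3 and the global sum of Relation (11.14).

```python
# Sketch: CWU check for the Gold function x^3 over F_8 (all names are illustrative).
N = 3
MOD = 0b1011  # x^3 + x + 1, an irreducible polynomial chosen for this example


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:  # degree reached N: reduce modulo MOD
            a ^= MOD
    return r


def dot(a, b):
    """Usual inner product of the bit vectors of a and b, in {0, 1}."""
    return bin(a & b).count("1") & 1


def walsh_squared(table):
    """w2[u][v] = W_F(u, v)^2 with W_F(u, v) = sum_x (-1)^(v.F(x) + u.x)."""
    size = len(table)
    return [[sum((-1) ** (dot(v, table[x]) ^ dot(u, x)) for x in range(size)) ** 2
             for v in range(size)] for u in range(size)]


def cwu_sums(table):
    """Map each pair of distinct nonzero (v1, v2) to the sum of Definition 83."""
    size = len(table)
    w2 = walsh_squared(table)
    return {(v1, v2): sum(w2[u1][v1] * w2[u2][v2] * w2[u1 ^ u2][v1 ^ v2]
                          for u1 in range(size) for u2 in range(size))
            for v1 in range(1, size) for v2 in range(1, size) if v1 != v2}


cube = [gf_mul(x, gf_mul(x, x)) for x in range(1 << N)]  # Gold function x^3
sums = cwu_sums(cube)
is_cwu = all(s == 2 ** (5 * N) for s in sums.values())
total = sum(sums.values())
print("x^3 is CWU over F_8:", is_cwu)
print("Relation (11.14) holds:", total == 2 ** (5 * N) * (2 ** N - 1) * (2 ** N - 2))
```

The brute-force cost grows like 2^{4n}, so this is meant for illustration in small dimension only.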

An investigation of CWU functions among all known nonquadratic APN power functions
was made in [251] for n ≤ 11. Two potential infinite classes of nonquadratic CWU power
functions arose:
• All Kasami APN functions (n odd or even)
• The inverse of the Gold APN permutation $x^{2^{(n-1)/2}+1}$ (n odd)

None of the other known classes of nonquadratic APN power functions is made of CWU
functions only. We have:

Proposition 185 [251] The compositional inverse of the Gold APN permutation $x^{2^{(n-1)/2}+1}$
is CWU for every odd n.

The proof is rather long, so we refer to [251] for it. Finding a proof (after confirmation of
the investigation results) of the same property for Kasami functions is an open problem (see
some observations in [251] and more in [252]).

11.6.3 Cyclic difference sets, cyclic-additive difference sets, and the CWU property
We have seen the notion of additive difference set in $\mathbb{F}_2^n$ when dealing with bent functions at
page 196: every nonzero element can be written in the same number of ways as the difference
x − y (that is, x + y) between two elements of the set. This notion exists for every group
structure. When this group structure is that of $\mathbb{F}_{2^n}^*$, we speak of cyclic difference set, since
$\mathbb{F}_{2^n}^*$ is cyclic (and x − y has to be replaced by x/y). A particular case that plays a role with
APN functions (see [448]) is the following:

Definition 84 A subset Δ of size $2^{n-1}$ of the multiplicative group $\mathbb{F}_{2^n}^*$ is called a cyclic
difference set with Singer parameters if, for all distinct $v_1, v_2$ in $\mathbb{F}_{2^n}^*$, we have
$|\{(x,y)\in\Delta^2;\ v_1x+v_2y=0\}| = 2^{n-2}$.

Equivalently, the symmetric difference between Δ and aΔ has size $2^{n-1}$ for every
$a \in \mathbb{F}_{2^n}^* \setminus \{1\}$, that is, $\sum_{x\in\mathbb{F}_{2^n}^*} (-1)^{1_\Delta(x)\oplus 1_\Delta(ax)} = -1$ (i.e., the sequence
$s_i = 1_\Delta(\alpha^i)$, where α is primitive, has ideal autocorrelation). Any Singer set (already seen at page 389)
$S_d = \{x \in \mathbb{F}_{2^n};\ \mathrm{tr}_n(x^d) = 1\}$, $\gcd(d, 2^n-1) = 1$, has such a property since $x \mapsto x^d$
is a (multiplicative) group automorphism and the intersection between two distinct affine
hyperplanes has dimension n − 2. Maschietti [825] (see also [443]) proves that, for every
d coprime with $2^n-1$ and such that the mapping $x \mapsto x + x^d$ is 2-to-1 over $\mathbb{F}_{2^n}$, the
complement of the image of this mapping is a cyclic difference set with Singer parameters.
It is proved in [443, 448] that, for every APN Kasami function $F(x) = x^{4^k-2^k+1}$ over $\mathbb{F}_{2^n}$,
n = 3k ± 1, the set $\{F(x)+F(x+1);\ x\in\mathbb{F}_{2^n}\}$ (or its complement if it contains 0) is a cyclic
difference set with Singer parameters (note that $x \mapsto F(x)+F(x+1)$ is also 2-to-1) and in
[448] that the complement of its translation by 1, that is, of $\Delta_F = \{F(x)+F(x+1)+1;\ x\in\mathbb{F}_{2^n}\}$,
is a cyclic difference set with Singer parameters (the proof is deduced from an elegant
calculation of the Fourier transform of the indicator of the set $D_F = \{x^{1/(2^i+1)};\ x\in\Delta_F\}$),
under the weaker condition that gcd(k, n) = 1 (footnote 25). Known facts are summarized with their
proofs, and a few new observations are made in [252].
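
Definition 84 and the ideal autocorrelation property can be tested by exhaustive search on a small field. The sketch below assumes F_16 built from the irreducible polynomial x^4 + x + 1 and uses hypothetical helper names (gf_mul, gf_pow, trace, singer_set); it checks the Singer sets S_1 and S_7, both exponents being coprime with 15.

```python
# Sketch: Singer sets in F_16 as cyclic difference sets (all names are illustrative).
N = 4
MOD = 0b10011  # x^4 + x + 1, an irreducible polynomial chosen for this example


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in F_{2^N}."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(x):
    """Absolute trace x + x^2 + ... + x^(2^(N-1)); the result is 0 or 1."""
    t, y = 0, x
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t


def singer_set(d):
    """S_d = {x ; tr(x^d) = 1}."""
    return {x for x in range(1, 1 << N) if trace(gf_pow(x, d)) == 1}


def is_cyclic_difference_set(delta):
    """Definition 84: 2^(N-2) pairs (x, y) in delta^2 with v1*x = v2*y, for all v1 != v2."""
    if len(delta) != 2 ** (N - 1):
        return False
    for v1 in range(1, 1 << N):
        for v2 in range(1, 1 << N):
            if v1 == v2:
                continue
            count = sum(1 for x in delta for y in delta
                        if gf_mul(v1, x) == gf_mul(v2, y))
            if count != 2 ** (N - 2):
                return False
    return True


def autocorrelation(delta, a):
    """Sum over nonzero x of (-1)^(1_delta(x) xor 1_delta(a*x))."""
    return sum((-1) ** ((x in delta) ^ (gf_mul(a, x) in delta))
               for x in range(1, 1 << N))


for d in (1, 7):
    s = singer_set(d)
    print(d, is_cyclic_difference_set(s),
          {autocorrelation(s, a) for a in range(2, 1 << N)})
```

In characteristic 2 the equation v1 x + v2 y = 0 is the same as v1 x = v2 y, which is the form tested in the inner loop.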
A relationship is shown in [251] between CWU power permutations and a new notion
similar to the cyclic difference set property.

Definition 85 A set $\Delta \subseteq \mathbb{F}_{2^n}$ is called a cyclic-additive difference set if, for every distinct
nonzero $v_1, v_2$ in $\mathbb{F}_{2^n}$, we have:
$|\{(x,y,z)\in\Delta^3;\ v_1x+v_2y+(v_1+v_2)z=0\}| = 2^{2n-3}$.

Every power function F is APN if and only if the set $\{F(x)+F(x+1)+1;\ x\in\mathbb{F}_{2^n}\}$ has
size $2^{n-1}$.

Proposition 186 Let F be any power APN permutation. Then, F is CWU if and only if the
set $\{F(x)+F(x+1)+1;\ x\in\mathbb{F}_{2^n}\}$ (or equivalently its complement) is a cyclic-additive
difference set.
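
As an illustration of Definition 85 and Proposition 186, the sketch below takes the Gold permutation x^3 over F_8 (quadratic APN, hence CWU by Proposition 184), builds the set {F(x) + F(x + 1) + 1}, and counts the triples of Definition 85 by brute force. The field representation (x^3 + x + 1) and the helper names are assumptions made for this example.

```python
# Sketch: cyclic-additive difference set attached to x^3 over F_8 (illustrative names).
N = 3
MOD = 0b1011  # x^3 + x + 1, an irreducible polynomial chosen for this example


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def delta_set(table):
    """The set {F(x) + F(x + 1) + 1 ; x in F_{2^N}}."""
    return {table[x] ^ table[x ^ 1] ^ 1 for x in range(1 << N)}


def is_cyclic_additive(delta):
    """Definition 85: 2^(2N-3) triples with v1*x + v2*y + (v1+v2)*z = 0."""
    for v1 in range(1, 1 << N):
        for v2 in range(1, 1 << N):
            if v1 == v2:
                continue
            count = sum(1 for x in delta for y in delta for z in delta
                        if (gf_mul(v1, x) ^ gf_mul(v2, y) ^ gf_mul(v1 ^ v2, z)) == 0)
            if count != 2 ** (2 * N - 3):
                return False
    return True


cube = [gf_mul(x, gf_mul(x, x)) for x in range(1 << N)]  # Gold permutation x^3
delta = delta_set(cube)
print("size of the set:", len(delta))        # 2^(N-1) exactly when the power function is APN
print("cyclic-additive:", is_cyclic_additive(delta))
```

For the identity function, which is not APN, the same set collapses to a single element, in line with the size criterion stated above.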

There are differences between cyclic and cyclic-additive difference sets:

• The notion of cyclic difference set is invariant under raising the elements to a power
coprime with $2^n-1$, and if two sets Δ and Δ′ are such that $W_{1_\Delta}(a) = W_{1_{\Delta'}}(a^k)$, where
k is coprime with $2^n-1$, then Δ is a cyclic difference set if and only if Δ′ is one, while
these properties are not true for cyclic-additive difference sets.
• The notion of cyclic-additive difference set is invariant under translation $x \mapsto x + a$
while that of cyclic difference set is not (see [448]).

It seems impossible to deduce the cyclic-additive property of $\Delta_F$ in the case of Kasami
APN permutations from the fact proved by Dillon and Dobbertin in [448] that the Fourier
transform of the indicator of the set $D_F = \{x^{1/(2^i+1)};\ x\in\Delta_F\}$ takes at any input $a\in\mathbb{F}_{2^n}$ the
same value as the Walsh transform of the Boolean function $\mathrm{tr}_n(x^3)$ at $a^{(2^i+1)/3}$.

25 For Gold APN functions, we have the same, but the set is the classical Singer set $S_1$.

11.6.4 The known differentially 4-uniform (n, n)-permutations, n even


For computational reasons (explained at page 401 for the inverse function but valid in a
more general context), (n, n)-functions are better used as S-boxes when n is even, the best
being when n is a power of 2. In practice, we have most often n = 4 (for lightweight
cryptosystems, to be implemented, for instance, on cheaper smart cards) and n = 8 (for
cryptosystems implemented on more powerful devices), since n = 16 seems still too
large for current computational means. We have seen that only one APN permutation,
in six variables, is known. It is then important to find as many differentially 4-uniform
permutations as possible in even dimension.
Note that if these permutations are involutions, this allows reducing further the complexity
of the algorithm, since the same implemented function can then be used for encryption and
decryption. Several block ciphers such as AES, Khazad, Anubis, or PRINCE use involutive
functions (up to affine equivalence) in their S-boxes. Note that, as already mentioned at
page 411 and shown in [136], the 6-variable permutation exhibited in this reference is EA
equivalent to an involution.
The smallest differential uniformity and largest nonlinearity achievable by a (4, 4)-
permutation are respectively 4 and 4 [183, 756]. Up to affine equivalence, there are 16
classes of such permutations. All have algebraic degree 3 and are also optimal against
algebraic attacks. Half have a component function of algebraic degree 2, which should be
avoided, and half have all their component functions cubic. There are six CCZ equivalence
classes.
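
The two optimal values quoted above for (4, 4)-permutations can be observed on the inverse function over F_16. The sketch below is a brute-force check under our own assumptions (field built from x^4 + x + 1, dot product as inner product, hypothetical helper names); it is not one of the 16 affine class representatives of [183, 756], only an example reaching both values.

```python
# Sketch: differential uniformity and nonlinearity of x^(2^4 - 2) over F_16.
N = 4
MOD = 0b10011  # x^4 + x + 1, an irreducible polynomial chosen for this example


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in F_{2^N}."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def dot(a, b):
    """Usual inner product of the bit vectors of a and b, in {0, 1}."""
    return bin(a & b).count("1") & 1


def differential_uniformity(table):
    """Largest number of solutions x of F(x) + F(x + a) = b over a != 0 and all b."""
    size = len(table)
    best = 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x] ^ table[x ^ a]] += 1
        best = max(best, max(counts))
    return best


def nonlinearity(table):
    """2^(N-1) minus half the largest |W_F(u, v)| over all u and all v != 0."""
    size = len(table)
    max_abs = max(abs(sum((-1) ** (dot(v, table[x]) ^ dot(u, x)) for x in range(size)))
                  for u in range(size) for v in range(1, size))
    return size // 2 - max_abs // 2


inverse = [gf_pow(x, 2 ** N - 2) for x in range(1 << N)]  # maps 0 to 0
print("permutation:", sorted(inverse) == list(range(1 << N)))
print("differential uniformity:", differential_uniformity(inverse))
print("nonlinearity:", nonlinearity(inverse))
```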
The smallest differential uniformity and largest nonlinearity achievable by an (8, 8)-
permutation are respectively 4 and 112 (achieved by Gold, Kasami, and inverse functions).
We describe now the known infinite classes of differentially 4-uniform (n, n)-functions.
We begin with the functions obtained by primary constructions, starting with power
functions:
• The inverse function $x^{2^n-2}$ (the only known involutive differentially 4-uniform power
(n, n)-permutation; see [521]) for n even first proposed in [908] is used (composed by
an affine permutation) for the S-box of the AES (footnote 26) with n = 8. This class of functions has
best-known nonlinearity $2^{n-1} - 2^{n/2}$ and has maximum algebraic degree n − 1. It is the
worst possible against algebraic attacks (which are not efficient on the AES, but some
risk exists that they will be improved as they were for stream ciphers) since if we denote
$y = x^{-1}$, then we have the bilinear relation $x^2y = x$.
• The Gold functions $x^{2^i+1}$, where gcd(i, n) = 2, are differentially 4-uniform and they are
bijective (i.e., $\gcd(2^i+1, 2^n-1) = 1$) if and only if n ≡ 2 [mod 4] since, $2^i-1$ and $2^i+1$
being coprime, we have
$\gcd(2^i+1, 2^n-1) = \frac{\gcd(2^{2i}-1,\,2^n-1)}{\gcd(2^i-1,\,2^n-1)} = \frac{2^{\gcd(2i,n)}-1}{2^{\gcd(i,n)}-1}$
(but n is then not a power of 2 and these functions are quadratic). They have best-known nonlinearity.
Gold functions are never involutive.

26 Often represented by a double-entry look-up table with 16 rows and 16 columns, whose indices belong to $\mathbb{F}_2^4$
(and can be written in hexadecimal from 0 to f), which provides the $16^2 = 256$ entries (which, when
represented in hexadecimal, belong to {00, . . . , ff}).

• The Kasami functions $x^{2^{2i}-2^i+1}$ such that n ≡ 2 [mod 4] and gcd(i, n) = 2 are
differentially 4-uniform as proved in [669] (see also [604]) and bijective as well, since
$\gcd(2^i+1, 2^n-1) = 1$ and since $2^{2i}-2^i+1 = \frac{2^{3i}+1}{2^i+1}$ implies
$\gcd(2^{2i}-2^i+1,\,2^n-1) = \gcd(2^{3i}+1,\,2^n-1) = \frac{\gcd(2^{6i}-1,\,2^n-1)}{\gcd(2^{3i}-1,\,2^n-1)} = \frac{2^{\gcd(6i,n)}-1}{2^{\gcd(3i,n)}-1} = 1$
since $2^{3i}-1$ and $2^{3i}+1$ are
coprime. They are not quadratic but they have the same Walsh spectrum as the Gold
functions (thus, with best known nonlinearity). They are in fact rather closely related
to quadratic functions, since they have the form $F = R_1 \circ R_2^{-1}$, where $R_1$ and $R_2$ are
quadratic permutations, which has some similarity with a function CCZ equivalent to
a quadratic function. There is a threat that this could be used in a modified version of
the higher-order differential attack (but adapting the attack to such kind of function is
an open problem). This class of functions never reaches the maximum algebraic degree
n − 1 (but this is not really a problem). Kasami functions are never involutive.

Remark. While all APN permutations have, by definition of APNness, the same differ-
ential spectrum, this is different for differentially 4-uniform permutations. For instance, the
inverse function and the Gold and Kasami functions have different differential spectra (the
inverse function has a better differential spectrum, in which value 4 is obtained less often).
More on power functions can be found in [92].
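
The bijectivity conditions given above for the Gold and Kasami exponents reduce to gcd computations, which can be tabulated directly. The sketch below uses only Python integers and math.gcd; the function names and the range of n are our own choices for illustration.

```python
# Sketch: gcd conditions for Gold and Kasami exponents when gcd(i, n) = 2.
from math import gcd


def gold_gcd(i, n):
    """gcd(2^i + 1, 2^n - 1); the Gold function is bijective when this equals 1."""
    return gcd(2 ** i + 1, 2 ** n - 1)


def gold_gcd_formula(i, n):
    """Closed form (2^gcd(2i,n) - 1) / (2^gcd(i,n) - 1) from the text."""
    return (2 ** gcd(2 * i, n) - 1) // (2 ** gcd(i, n) - 1)


def kasami_gcd(i, n):
    """gcd(2^(2i) - 2^i + 1, 2^n - 1); the Kasami function is bijective when this equals 1."""
    return gcd(2 ** (2 * i) - 2 ** i + 1, 2 ** n - 1)


def admissible(n_max):
    """All (i, n) with n even, 4 <= n <= n_max, 1 <= i < n and gcd(i, n) = 2."""
    return [(i, n) for n in range(4, n_max + 1, 2)
            for i in range(1, n) if gcd(i, n) == 2]


for i, n in admissible(12):
    print(f"n={n:2d} i={i:2d} n mod 4 = {n % 4}  "
          f"Gold gcd = {gold_gcd(i, n)}  Kasami gcd = {kasami_gcd(i, n)}")
```

In the printed table the Gold gcd equals 1 exactly on the rows with n ≡ 2 [mod 4], and it equals 5 on the rows with n divisible by 4.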
• The function $x^{2^{n/2}+2^{n/4}+1}$ introduced by Dobbertin [467] and shown by Bracken and
Leander to be differentially 4-uniform [120] has best-known nonlinearity $2^{n-1} - 2^{n/2}$
as well. It is bijective (but not involutive) if n is divisible by 4 but not by 8; in this case,
n is not a power of 2; the function has algebraic degree 3, which is rather low.
• The APN binomials of Proposition 175 share many properties of Gold power functions.
In particular, relaxing conditions on involved parameters leads to differentially $2^t$-uniform
permutations [121]. Let n = 3k and t be a divisor of k, where 3 ∤ k and k/t
is odd. Let s be an integer such that gcd(3k, s) = t and 3 | (k + s). Then the functions
$\alpha x^{2^s+1} + \alpha^{2^k} x^{2^{-k}+2^{k+s}}$, where α is a primitive element of $\mathbb{F}_{2^n}$, are differentially
$2^t$-uniform and bijective. This class of functions has nonlinearity $2^{n-1} - 2^{(n+t-2)/2}$ if n + t
is even and $2^{n-1} - 2^{(n+t-3)/2}$ if n + t is odd. A conjecture of [121] that quadratic
quadrinomial APN functions (11.10) allow similar generalization to differentially $2^t$-uniform
permutations is disproved in [983].
It is shown in [356], among other results, that for n even, any quadratic differentially
4-uniform permutation has all its $\delta_F(a, b)$ values in {0, 4} for a ≠ 0, has nonlinearity
$2^{n-1} - 2^{n/2}$, and is plateaued with single amplitude.
• The author proposed in [241], for constructing differentially 4-uniform permutations, to
use the structure of the field $\mathbb{F}_{2^{n+1}}$ instead of that of $\mathbb{F}_{2^n}$ (of course, $\mathbb{F}_{2^{n+r}}$ could be also
tried with r ≥ 2). The idea consists in finding an (N, N)-function, where N = n + 1,
whose restriction to an affine hyperplane of $\mathbb{F}_{2^N}$ has for image set an affine hyperplane.
This restriction provides then an (n, n)-permutation since any affine hyperplane of $\mathbb{F}_{2^N}$
is affinely equivalent to $\mathbb{F}_2^n$. Let the affine hyperplane be A = u + E, where E is a
linear hyperplane, the restriction to A is differentially 4-uniform if the restriction to A
of any derivative in a nonzero direction belonging to E is 2-to-1. An example given in
[241] is with the Dickson polynomials $D_k$, seen at page 389. Recall that every element
of $\mathbb{F}_{2^N}^*$ can be expressed uniquely in the form $h + \frac{1}{h}$ where $h \in \mathbb{F}_{2^{2N}}^*$; more precisely,
$h \in \mathbb{F}_{2^N}^* \cup U$, where $U = \{x^{2^N-1};\ x \in \mathbb{F}_{2^{2N}}^*\}$ is the multiplicative subgroup of $\mathbb{F}_{2^{2N}}^*$
of order $2^N + 1$. Then $D_k(h + \frac{1}{h}, 1) = h^k + \frac{1}{h^k}$ by definition. Moreover, the image
of $\mathbb{F}_{2^N}$ by this function equals $\{\frac{1}{x};\ x \in \mathbb{F}_{2^N},\ \mathrm{tr}_N(x) = 0\}$ (with the usual convention
$\frac{1}{0} = 0$) and the image of U \ {1} equals $\{\frac{1}{x};\ x \in \mathbb{F}_{2^N},\ \mathrm{tr}_N(x) = 1\}$. If k is coprime
with $2^N + 1$, then the mapping $h \mapsto h^k$ is a permutation of U \ {1} and induces then
a permutation of $\{\frac{1}{x};\ x \in \mathbb{F}_{2^N},\ \mathrm{tr}_N(x) = 1\}$, whose expression coincides with $D_k(x, 1)$
on this set. The function $\frac{1}{D_k(\frac{1}{x}, 1)}$ is then a permutation of the hyperplane H of $\mathbb{F}_{2^N}$ of
equation $\mathrm{tr}_N(x) = 1$. For k = 3, N must be even, and we have $D_3(x, 1) = x^3 + x$ and
$\frac{1}{\frac{1}{x^3} + \frac{1}{x}} = \frac{x^3}{x^2+1} = x + \frac{1}{x+1} + \frac{1}{x^2+1}$, which is EA equivalent to $\frac{1}{x} + \frac{1}{x^2}$ and is differentially 4-
uniform over H. But the argument given in [241] for this last property is not correct: it is
written that the function $x \mapsto x + x^2$ is 2-to-1 which is true, and that the inverse function
is APN, which is false since N is even. The correct argument is that H excluding 0, the
equation $x^{2^n-2} + (x + a)^{2^n-2} = b$ (where a ≠ 0 and therefore b ≠ 0 since the inverse
function is a permutation) is equivalent to $\frac{1}{x} + \frac{1}{x+a} = b$, that is, $x^2 + ax = \frac{a}{b}$ and has
then at most two solutions.
This differentially 4-uniform permutation is in odd dimension. We complete here the
study by addressing the case n even: if k is coprime with $2^N - 1$, then the mapping $h \mapsto h^k$
is a permutation of $\mathbb{F}_{2^N}$ and induces then a permutation of $\{\frac{1}{x};\ x \in \mathbb{F}_{2^N},\ \mathrm{tr}_N(x) = 0\}$,
whose expression coincides with $D_k(x, 1)$ on this set. The function
$\frac{1}{\frac{1}{x^3} + \frac{1}{x}} = x + \frac{1}{x+1} + \frac{1}{x^2+1}$
restricted now to the hyperplane of equation $\mathrm{tr}_N(x) = 0$ is bijective. It is
differentially 4-uniform since N being odd, the inverse function is APN. And it is shown
in [241] that the nonlinearity is at least $2^{n-1} - 2^{n/2+1}$ (not optimal) and the algebraic
degree equals n − 1. A similar proposal in even dimension is given in [1107].



In [772], Li and Wang have used again the idea of working with functions F over $\mathbb{F}_{2^N}$
with N = n + 1, and of taking their restrictions to hyperplanes. They need F to be a
quadratic APN permutation over $\mathbb{F}_{2^N}$, and they take N odd so that n is even (then this
permutation is AB, but this is not used). They consider for every nonzero $u \in \mathbb{F}_{2^N}$ the
linear function $L_u(x) = F(x + u) + F(x) + F(u) + F(0)$, whose range $H_u$ is a linear
hyperplane (since F is APN) such that $F(u) + F(0) \notin H_u$ (since F is bijective). Then
the restriction of $L_u \circ F^{-1}$ to $H_u$ is injective because $L_u \circ F^{-1}(x_1) = L_u \circ F^{-1}(x_2)$ and
$x_1 \neq x_2$ imply $F^{-1}(x_1) + F^{-1}(x_2) = u$, since F is APN, and therefore, $L_u \circ F^{-1}(x_2) =
F(F^{-1}(x_2) + u) + F(F^{-1}(x_2)) + F(u) + F(0) = x_1 + x_2 + F(u) + F(0)$, and the relation
$x_1 + x_2 = F(u) + F(0) + L_u(F^{-1}(x_2)) \in (F(u) + F(0) + H_u) \cap H_u = \emptyset$ is impossible.
This permutation is differentially 4-uniform by construction since $L_u$ is 2-to-1 and $F^{-1}$
is APN; it is then a differentially 4-uniform permutation of $H_u$. They obtained with
F equal to Gold functions three classes of differentially 4-uniform bijections in even
dimension with best-known nonlinearity $2^{n-1} - 2^{n/2}$ and algebraic degree n/2 + 1.
It seems difficult to prove that any of the functions presented in this paragraph is CCZ
inequivalent to power functions and to quadratic functions (but we see no reason why
such equivalence could happen since the field structures of $\mathbb{F}_{2^n}$ and $\mathbb{F}_{2^N}$ are independent
of each other).
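
The restriction argument of Li and Wang recalled above can be followed step by step on a toy instance. The sketch below assumes N = 5 (so n = 4), the Gold APN permutation F(x) = x^3 over F_32 built from x^5 + x^2 + 1, and the direction u = 1; these choices and all helper names are ours, and the code only checks the properties stated in the text (hyperplane range, permutation of H_u, differential uniformity at most 4).

```python
# Sketch: the restriction of L_u o F^(-1) to H_u for F(x) = x^3 over F_32.
from collections import Counter

N = 5
MOD = 0b100101  # x^5 + x^2 + 1, an irreducible polynomial chosen for this example
SIZE = 1 << N


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


F = [gf_mul(x, gf_mul(x, x)) for x in range(SIZE)]  # quadratic APN permutation
F_inv = [0] * SIZE
for x, y in enumerate(F):
    F_inv[y] = x

u = 1  # any nonzero direction could be used; 1 is an arbitrary choice


def L_u(x):
    """Linear function F(x + u) + F(x) + F(u) + F(0)."""
    return F[x ^ u] ^ F[x] ^ F[u] ^ F[0]


H_u = sorted({L_u(x) for x in range(SIZE)})     # range of L_u, a linear hyperplane
G = {y: L_u(F_inv[y]) for y in H_u}             # restriction of L_u o F^(-1) to H_u
is_permutation = sorted(G.values()) == H_u

# Differences are taken inside H_u, which is closed under addition (XOR).
du = max(max(Counter(G[x] ^ G[x ^ a] for x in H_u).values())
         for a in H_u if a != 0)

print("size of H_u:", len(H_u))
print("F(u) + F(0) outside H_u:", (F[u] ^ F[0]) not in H_u)
print("permutation of H_u:", is_permutation)
print("differential uniformity on H_u:", du)
```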

We continue now with permutations obtained by modifications of known differentially
4-uniform bijections.
• Qu et al. [982] proposed two classes of differentially 4-uniform bijections in even
dimension. These functions were obtained through the switching construction, by adding
Boolean functions to the inverse function. The first class has the form
$x^{2^n-2} + \mathrm{tr}_n\big(x^2(x+1)^{2^n-2}\big)$. It has optimal algebraic degree n − 1 and nonlinearity larger than
$2^{n-1} - 2^{n/2+1} - 2$. The second class has the form
$x^{2^n-2} + \mathrm{tr}_n\big(x^{(2^n-2)d} + (x^{2^n-2}+1)^d\big)$, where
$d = 3(2^t + 1)$, 2 ≤ t ≤ n/2 − 1. It has algebraic degree n − 1 as well and nonlinearity
at least $2^{n-2} - 2^{n/2-1} - 1$. The authors did not study whether their functions are CCZ
inequivalent to the inverse function, but this can be checked for even n = 6, . . . , 12 with a
computer. A generalized method for constructing differentially 4-uniform permutations
in even dimension is presented in the same reference (by determining conditions for
their differential uniformity), which includes the former two classes of functions, and
produces many CCZ inequivalent differentially 4-uniform bijections in even dimension
(the authors could show that the number of CCZ inequivalent differentially 4-uniform
permutations over $\mathbb{F}_{2^{2k}}$ grows exponentially when k increases). Zha et al. in [1151],
Qu et al. in [981] (who made more systematic the approach of [982] and obtained more
functions), Peng and Tan in [938], and Chen et al. in [362] proposed more functions
of the form $x^{2^n-2} + g(x)$, where g is a Boolean function (thanks to more precise
conditions in the latter reference). In [1146], the authors also built differentially 4-
uniform permutations by swapping two values of the inverse function (it is observed that
a function $I_{(u,v)}$ over $\mathbb{F}_{2^{2m}}$, obtained from the inverse power function by swapping its
values at two different points u ≠ 0 and v ≠ 0, is a differentially 4-uniform permutation
if and only if $\mathrm{tr}_{2m}(uv^{-1})\,\mathrm{tr}(u^{-1}v) = 1$).
• The author, Tang, Tang, and Liao proposed in [325] the following construction, for n ≥ 6
even, of differentially 4-uniform (n, n)-permutations of algebraic degree n − 1:

\[
(x_1, \dots, x_{n-1}, x_n) \mapsto
\begin{cases}
(1/x',\ f(x')), & \text{if } x_n = 0\\
(c/x',\ f(x'/c) + 1), & \text{if } x_n = 1,
\end{cases}
\]
where $c \in \mathbb{F}_{2^{n-1}} \setminus \mathbb{F}_2$ is such that $\mathrm{tr}_{n-1}(c) = \mathrm{tr}_{n-1}(1/c) = 1$, $x' \in \mathbb{F}_{2^{n-1}}$ is identified
with $(x_1, \dots, x_{n-1}) \in \mathbb{F}_2^{n-1}$ and f is an arbitrary Boolean function defined on $\mathbb{F}_{2^{n-1}}$.
It is shown in [363] that the particular functions corresponding to $f(x') = \mathrm{tr}_{n-1}\big(\frac{1}{x'+1}\big)$
have high nonlinearity and are CCZ inequivalent to all known differentially 4-uniform
power permutations and to quadratic functions. It is also shown that the functions in the
general class are CCZ inequivalent to the inverse function, and for n = 2k, k = 4, . . . , 7,
to the sums of the inverse function and of Boolean functions.
• Zha et al. [1150] presented two classes of differentially 4-uniform bijections by applying
affine transformations to the inverse function on some subfields. Their functions have
maximum algebraic degree n−1. The lower bounds on the nonlinearity of these functions
would need to be worked further. In [1130], another infinite family of differentially
4-uniform permutations with the same “piecewise” method but starting from Gold
functions is provided. Inequivalence to known functions needs to be checked.
Peng and Tan in [939], Peng et al. in [940], and Xu and Qu in [1132] presented similar
transformations.

• Tang et al. [1069], for any even n ≥ 6, introduced a class of subsets U of $\mathbb{F}_{2^n}$ such that
the function equal to $(x + 1)^{2^n-2}$ if x ∈ U and to $x^{2^n-2}$ otherwise gives a differentially
4-uniform permutation. For every even n, at least $2^{2^{n-3} - 2^{n/2-2}}$ different such sets U are
designed. For every even n ≥ 12, it is proved that if the size of U is such that 0 < |U| <
$(2^{n-1} - 2^{n/2})/3 - 2$, then the functions are CCZ inequivalent to known differentially
4-uniform power functions and to quadratic functions. A table of comparison with these
other functions is given.
• Li et al. in [773] modified the inverse function by cyclically shifting the images of the
function over some subset $\{\alpha_0, \alpha_1, \dots, \alpha_m\}$ of $\mathbb{F}_{2^n}$. Fu and Feng [521] proposed new
families with such cycles of length 3.
• Perrin et al. have introduced in [944] the interesting butterfly construction, already seen at
page 411, generalized by Canteaut et al. in [199]. It is shown in [523, 944] and [199] that
the resulting function is differentially 4-uniform with (best-known) nonlinearity $2^{n-1} - 2^{n/2}$
when, respectively, $R_y(x) = (x + ay)^3 + y^3$, with $a \in \mathbb{F}_{2^{n/2}}^*$,
$R_y(x) = (x + ay)^{2^i+1} + y^{2^i+1}$, with $a \in \mathbb{F}_{2^{n/2}}^*$, gcd(i, n) = 1, and
$R_y(x) = (x + ay)^3 + by^3$, with $a, b \in \mathbb{F}_{2^{n/2}}^*$,
$b \neq (1 + a)^3$. More differentially 4-uniform permutations with nonlinearity $2^{n-1} - 2^{n/2}$
are obtained with the butterfly construction in [523].

Fu and Feng studied in [521] if some functions among those recalled in the present
subsection could be involutions. They obtained the following involutive differentially 4-
uniform permutations:
• Functions of the form $x^{2^n-2} + 1_U(x)$, or equal to $(x + 1)^{2^n-2}$ if x ∈ U and to $x^{2^n-2}$ otherwise,
where $U = \mathbb{F}_4$ and, in the case n ≡ 2 [mod 4], $U = \mathbb{F}_2$ or $U = \mathbb{F}_4 \setminus \mathbb{F}_2$
• The Peng and Tan function [939]:
$F(x) = \begin{cases} b(x + 1)^{2^n-2} + a & \text{if } x \in \mathbb{F}_{2^t},\\ x^{2^n-2} & \text{otherwise,} \end{cases}$
where t divides n and a = b = 1, or a = 0, b = 1, t even, or a = 0, b = 1, t = 1, 3, n/2 odd, or
a = b ∈ $\mathbb{F}_4 \setminus \mathbb{F}_2$, t = 2, n/2 odd, or a = 1, $\mathrm{tr}_n(b^{-1}) = 1$ and n/t odd
• The Peng et al. function [940]:
$F(x) = \begin{cases} (cx)^{2^n-2} & \text{if } x \in U,\\ x^{2^n-2} & \text{otherwise,} \end{cases}$
where $U = \bigcup_{g \in G} g\Gamma$,
where Γ is the cyclic multiplicative group generated by γ and $\{g^{-1}, g \in G\} = G$ and
trn (γ ) = trn (γ −1 ) and trn γ


g l g  −l
= 1 for every g, g  ∈ G and every l (with l
g
γ +gγ
not divisible by || if g = g  )
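• The particular case given by Fu and Feng [521] of the Li et al. function [773],
modifying the inverse function by cyclically shifting a triple $\{\alpha_0, \alpha_1, \alpha_2\}$ of $\mathbb{F}_{2^n}^*$, where
$(\alpha_0, \alpha_1, \alpha_2) = (0, \gamma, \gamma^{-1})$ with $\gamma \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, $\mathrm{tr}_n(\gamma) = \mathrm{tr}_n\big(\frac{1}{\gamma+1}\big) = 1$, or
$(\alpha_0, \alpha_1, \alpha_2) = (1, \gamma, \gamma^{-1})$ with $\gamma \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$,
$\mathrm{tr}_n(\gamma) = \mathrm{tr}_n\big(\frac{1}{\gamma}\big) = \mathrm{tr}_n\big(\frac{1}{\gamma+1}\big) = 1$, $\mathrm{tr}_n\big(\frac{1}{(\gamma+1)^3}\big) = 0$
• The functions
$F(x_1, \dots, x_{n-1}, x_n) = \begin{cases} (1/x',\ f(x')), & \text{if } x_n = 0\\ (c/x',\ f(x'/c) + 1), & \text{if } x_n = 1 \end{cases}$
from [325], where supp(f) = ∅ or = {0}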

It is observed in [967] that the indicators of the graphs of all the known differentially 4-
uniform (n, n)-permutations (n even) have algebraic immunity 2. In [1039], the differential

uniformity of the composition of two functions is studied, and new differentially 4-uniform
permutations from known ones are constructed.

11.6.5 Other differentially 4-uniform (n, n)-functions


There are differentially 4-uniform functions in odd dimension, which we do not list here
since they are less interesting, practically. There are also differentially 4-uniform functions
in even dimension that are not permutations. These can be obtained from APN functions by
adding a Boolean function, or composing them (on the right or on the left) by 2-to-1 affine
functions. Differentially 4-uniform functions that are faster and less costly to compute can
be obtained by concatenating the outputs of a bent function and of another function [239]:
1. The function $(x, y) \mapsto (xy,\ (x^3 + w)(y^3 + w'))$, where w, w′ and w/w′ belong to
$\mathbb{F}_{2^{n/2}} \setminus \{x^3,\ x \in \mathbb{F}_{2^{n/2}}\}$, with n/2 even.
2. The function $(x, y) \mapsto (xy,\ x^3(y^2 + y + 1) + y^3)$, with n/2 odd.
3. The function $F : X \in \mathbb{F}_{2^n} \mapsto \big(X^{2^{n/2}+1},\ (X^{2^{n/2}+1})^3 + wX^3 + (wX^3)^{2^{n/2}}\big)$.

Such functions are not bijective but, because of their low implementation complexity, have
an advantage when we wish to protect the cryptosystem against side-channel attacks (see
Section 12.1, page 425). For instance, Function 1 above has been used in the cryptosystem
PICARO [957]. See more in [239]. Other examples of differentially 4-uniform functions are
the function $ax^{2^{2s}+1} + bx^{2^s+1} + cx^{2^{2s}+2^s}$ such that gcd(s, n) = 1 as shown in [115], the
function $x^{2^{n-1}-1} + ax^5$ (n odd, $a \in \mathbb{F}_{2^n}$) as shown in [120], and several classes given in [243].
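
The remark made at the beginning of this subsection, that composing an APN function with a 2-to-1 affine function gives a differentially 4-uniform function, is easy to observe on a small case. The sketch below assumes F_16 built from x^4 + x + 1, the APN Gold function x^3, and the 2-to-1 linear function L(y) = y^2 + y applied on the left; these choices and the helper names are ours.

```python
# Sketch: L o F with F = x^3 (APN over F_16) and L(y) = y^2 + y (2-to-1, linear).
from collections import Counter

N = 4
MOD = 0b10011  # x^4 + x + 1, an irreducible polynomial chosen for this example


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def differential_uniformity(table):
    """Largest number of solutions x of F(x) + F(x + a) = b over a != 0 and all b."""
    size = len(table)
    return max(max(Counter(table[x] ^ table[x ^ a] for x in range(size)).values())
               for a in range(1, size))


cube = [gf_mul(x, gf_mul(x, x)) for x in range(1 << N)]   # APN, not bijective for N = 4
composed = [gf_mul(y, y) ^ y for y in cube]               # L(F(x)) with L(y) = y^2 + y

print("differential uniformity of x^3:", differential_uniformity(cube))
print("differential uniformity of L o F:", differential_uniformity(composed))
print("L o F bijective:", len(set(composed)) == 1 << N)
```

Each equation L(D_aF(x)) = b splits into two equations D_aF(x) = c and D_aF(x) = c + 1, each with at most two solutions, which is why the value cannot exceed 4.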

Some constructions of differentially 4-uniform functions have been given in [896], in
connection with the structure of a commutative semifield already seen in Chapter 6. A
semifield is a finite algebraic structure (E, +, ◦) such that (1) (E, +) is an Abelian group,
(2) the operation ◦ is distributive on the left and on the right with respect to +, (3) there
is no nonzero divisor of 0 in E, and (4) E contains an identity element with respect to ◦.
This structure has been very useful for constructing planar functions in odd characteristic. In
characteristic 2, it may lead to new APN functions and differentially 4-uniform permutations
by considering, for instance, the function (x ◦ x) ◦ x in a classical semifield (there are two
classes of them, whose underlying Abelian group is the additive group of $\mathbb{F}_{2^n}$: the Albert
semifields, in which the multiplication is $x \circ y = xy + \beta(xy)^\sigma$, where $x \mapsto x^\sigma$ is an
automorphism of the field $\mathbb{F}_{2^n}$, which is not a generator and $\beta \in \{x^{\sigma+1};\ x \in \mathbb{F}_{2^n}\}$, and the
Knuth semifield where the multiplication is $x \circ y = xy + (x\,\mathrm{tr}(y) + y\,\mathrm{tr}(x))^2$, where tr is a
trace function from $\mathbb{F}_{2^n}$ to a suitable subfield).

11.6.6 Other differentially uniform (n, n)-functions


Some results have been found on differential uniformities 6 and 8 for (n, n)-functions;
see [93, 95, 1129] and the references therein. In [1160], two methods are proposed for
constructing balanced (n, m)-functions (with m < n unfortunately) with nonlinearity strictly
larger than $2^{n-1} - 2^{n/2}$ and with “good” other parameters (footnote 27).
27 An inappropriate comparison is made in this paper with a permutation – the one used as S-box in the AES –
and with the S-box used in PICARO (designed to resist side-channel attacks, and therefore a little weaker with
respect to the other features).

11.6.7 On the best differential uniformity of (n, m)-functions


When m < n, (n, m)-functions cannot be used in substitution–permutation networks but
they can be used in Feistel ciphers, like in the DES cipher which has eight S-boxes each
mapping 6 bits to 4 bits. When n is even and m ≤ n/2, these functions can be bent (i.e., PN),
which allows them to offer optimal resistance against differential and linear attacks, but
they are then not balanced and the number of their output bits is small. When n/2 < m < n,
little theoretical work has been done on differentially uniform (n, m)-functions. We know
that the differential uniformity of such functions is bounded below by $2^{n-m} + 2$. We call
this bound Nyberg’s bound. Characterizing the pairs (n, m) for which this bound is tight is
an open question.
• In the case m = n−1, Nyberg’s bound is tight. There is indeed a simple way of designing
differentially 4-uniform (n, n − 1)-functions: any function of the form L ◦ F , where
F is an APN (n, n)-function and L is a surjective affine (n, n − 1)-function, is indeed
differentially 4-uniform. Using such an S-box in a Feistel cipher can be seen as using the
APN function itself.
In [255] an alternate way to construct differentially 4-uniform (n, n − 1)-functions
by defining their look-up table (LUT) as the concatenation of the LUT of two APN
(n − 1, n − 1)-functions is studied; the corresponding function $S(x, x_n) = x_n F(x) +
(1 + x_n)G(x)$ is a differentially 4-uniform (n, n − 1)-function if and only if, for every
$a \in \mathbb{F}_2^{n-1}$, the function F(x) + G(x + a) is at most 2-to-1 (i.e., each value in the image
set has at most two corresponding preimages).
The particular case where the two APN functions differ by an affine function provides,
when one of these functions is a Gold function, the family of quadratic differentially
4-uniform (n, n − 1)-functions $(x, x_n) \mapsto x^{2^i+1} + x_n x$, where $x \in \mathbb{F}_{2^{n-1}}$ and $x_n \in \mathbb{F}_2$
with gcd(i, n − 1) = 1, whose Walsh transform and nonlinearity are studied, as well as
the CCZ inequivalence to all functions of the form L ◦ F above.
• In [259], (n, m)-functions achieving Nyberg’s bound with equality are studied in the
(Maiorana–McFarland) form F(x, z) = I(x)φ(z), where I(x) is the (m, m)-inverse
function (footnote 28) and φ(z) is an (n − m, m)-function. An infinite family of differentially
$(2^{m-1} + 2)$-uniform (2m − 1, m)-functions with m ≥ 3 is designed (which also have
high nonlinearity and not too low algebraic degree). Hence, Nyberg’s bound is tight for
m = (n + 1)/2, n ≥ 5 odd.
Differentially 4-uniform (m + 1, m)-functions in this form are also designed, and
a method is proposed to construct infinite families of (m + k, m)-functions with low
differential uniformity, leading to an infinite family of (2m − 2, m)-functions with
$\delta \le 2^{m-1} - 2^{m-6} + 2$ for any m ≥ 8. But this does not provide functions achieving Nyberg’s
bound with equality and the existence of such (n, m)-functions for n/2 + 1 ≤ m ≤ n − 2
is open.
In fact, it is even an open problem to determine whether there exist differentially δ-
uniform (n, n − k) functions with k ≥ 2, k significantly smaller than n/2, $\delta < 2^{k+1}$, and
n > 5 ($\delta = 2^{k+1}$ is easily reached with functions L ◦ F where F is an APN (n, n)-
function and L is an affine surjective (n, n − k)-function). In particular, the existence of
28 The only function that returned positive results when we made a computer investigation.

differentially 6-uniform (n, n − 2)-functions for n > 5 is an open question (differentially
6-uniform (5, 3)-functions are known [255]). In [15], Alsalami built more differentially
4-uniform (n, n − 1)-functions and differentially 8-uniform (n, n − 2)-functions.
In [949], several evolutionary algorithms and problem sizes were explored in order
to find such functions. The results of this investigation show that the problem, which is
easy in dimensions 4 and 5, is very difficult for larger n.
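
The two simple ways of reaching Nyberg’s bound for m = n − 1 described in the first item above can be checked on toy sizes. The sketch below assumes F_8 built from x^3 + x + 1 and the APN Gold function x^3; it builds a (3, 2)-function L ◦ F by dropping one output coordinate, and the (4, 3)-function (x, x_n) ↦ x^3 + x_n x with the input encoded as x + 8 x_n. The encoding, the choice of L, and the helper names are ours.

```python
# Sketch: two (n, n-1)-functions meeting Nyberg's bound 2^(n-m) + 2 = 4.
from collections import Counter

N = 3
MOD = 0b1011  # x^3 + x + 1, an irreducible polynomial chosen for this example


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} = F_2[x]/(MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def differential_uniformity(table):
    """Largest number of solutions x of S(x) + S(x + a) = b over a != 0 and all b."""
    size = len(table)
    return max(max(Counter(table[x] ^ table[x ^ a] for x in range(size)).values())
               for a in range(1, size))


def nyberg_bound(n, m):
    """Lower bound 2^(n-m) + 2 quoted in the text for n/2 < m < n."""
    return 2 ** (n - m) + 2


cube = [gf_mul(x, gf_mul(x, x)) for x in range(1 << N)]    # APN (3, 3)-function

# L o F with L the surjective linear map that forgets the top output coordinate.
projected = [y & 0b011 for y in cube]                      # a (3, 2)-function

# (x, x_n) -> x^3 + x_n * x; the input z encodes x in its low 3 bits and x_n in bit 3.
concatenated = [cube[z & 0b111] ^ ((z >> 3) * (z & 0b111)) for z in range(1 << (N + 1))]

print("L o F:", differential_uniformity(projected), "bound:", nyberg_bound(3, 2))
print("x^3 + x_n x:", differential_uniformity(concatenated), "bound:", nyberg_bound(4, 3))
```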
12 Recent uses of Boolean and vectorial functions and related problems

Many mathematical problems in computer science result in questions regarding Boolean
functions (or vectorial functions). Cryptography has been no exception since the 1950s,
and new roles of Boolean functions still emerge nowadays. In this chapter, we give several
examples of recent problems in cryptography that result in new questions about Boolean
functions, vectorial functions, and related codes, or that renew interest in some known
notions.

12.1 Physical attacks and related problems on functions and codes


Until the 1990s, cryptographers implicitly considered the black box attacker model only,
in which the cryptanalyst has access to ciphertexts (in the ciphertext-only attacker model)
or to plaintext–ciphertext pairs (in the known-plaintext and the chosen-plaintext models),
but has no information beyond input/output. This was realistic when the ciphers were
run only on computers, all the more if these were protected (by a Faraday cage, for
instance). But nowadays, cryptographic algorithms are run often on mobile devices, on
smart cards (which include a part of hardware and work with software implementations),
or on light hardware devices (e.g., field-programmable gate arrays [FPGA], application-
specific integrated circuits [ASIC]). Side-channel information (through the running-time,
power consumption, electromagnetic emanations, etc.) is then accessible.
The side-channel attacks (SCA) (see [712, 713, 984]) on the implementations of block
ciphers (footnote 1) in such embedded systems (see [823]) take advantage of this additional information
obtained through the physical environment. They are able to treat this information for
extracting the secret parameters of the algorithm and are in practice extremely powerful.
They assume an attacker model different from classical attacks: the gray box attacker model,
in which the adversary has also access to leakage. This additional information is all the
more usable on block ciphers, which are iterative: each round involves diffusion layers and
substitution layers; both kinds are necessary for security, and the diffusion needs several
rounds before being effective; the SCA can then be very efficient by attacking the first round
(while in the black box model, only the global cipher is attackable) or the last round (the
first round of the reverse cipher); see the survey [320].
The exploited leakage is a measurable quantity (in the case of a so-called monovariate
attack or univariate attack on a single leakage²) depending on the data manipulated by the
1 SCA also exist on asymmetric ciphers, but this is out of the scope of this book, and they have not been as
developed for stream ciphers as they were for block ciphers.
2 Multivariate attacks are more difficult to perform in practice.


algorithm (the key is mixed with the data, and any leakage that depends on this data can
be used as an oracle). The important data for SCA are the values of the so-called sensitive
variables of the algorithm. These are variables whose values are in general stored in registers
and which depend on the (varying) input to the algorithm (assumed known by the attacker),
and on the (constant) secret key (or better for the attacker, on a part of the secret key, since
this allows for a divide-and-conquer approach, where the key is recovered byte by byte, a
customary case in block ciphers being when the cipher computes the sum of a public binary
vector and of a subsequence of the round key). The length n of such a variable is a number
depending on the cipher (4 if the cipher works on nibbles, 8 if it works on bytes, 16 if it
works on words, etc.).
The attacker records, for instance, the emanations emitted by the register on which the
values of the sensitive variable are stored, which can be approximated as a real-valued
function L of the sensitive variable (the register is a micrometric object, whose contents
cannot, in general, be measured directly). For instance, in the so-called Hamming weight
leakage model, L(Z) equals the Hamming weight of Z; in the Hamming distance leakage
model, L(Z) is the Hamming distance between two consecutive values of the register where
Z is stored; in more general linear leakage models, L(Z) equals a linear combination
with real coefficients of the bits of Z (we speak then of a static linear leakage model,
needed to ensure that the leakages corresponding to different shares are independent),
or of the differences between the bits at two consecutive states of the register. In what
the attacker records, an inevitable noise N is added to L; this noise is generally viewed as a white
Gaussian variable, due to the activity in the device around the register (an attacker can
only measure an aggregated function of each computing element’s leakage, such as the
total current drawn by the circuit) and depending on the choice of the leakage model (a
good choice minimizes the noise). The part independent of the noise in the leak is called
the deterministic leak. The attacker tests exhaustively all the possible values of the key bits
involved in the sensitive variable, computing for each choice the corresponding modeled
leakage value, the correct key values being those that maximize statistically (for a series
of runs with the same key and different plaintexts) the dependency between the modeled
leakage (which depends on the tested key value and the plaintext, and also on the leakage
model chosen for the attack) and the measured leakage. This dependency can be evaluated
by different statistical methods, leading to different SCA. For instance, in differential power
analysis (DPA) [713] or more general differential analyses, the statistical distinguisher is
the difference of means between the two cases (among all plaintexts used) where the leak is
larger, resp. smaller, than some fixed value. If the guessed key is correct, then the modeled
difference should be close to a nonzero constant, while if it is wrong, it should be close
to zero (the two means then measuring the same random variable). In correlation power
analysis (CPA) or more general correlation analyses, the statistical distinguisher is Pearson’s
(linear) correlation coefficient (which is more complex to evaluate but more efficient),
equal to the covariance between the two values, divided by the product of their standard
deviations.
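The exhaustive key test described above can be sketched on simulated traces. This is only an illustration under stated assumptions: the S-box is a seeded random permutation (a stand-in, not the AES S-box), the leakage follows the Hamming weight model with additive Gaussian noise, and the names `simulate_traces` and `cpa_best_key` are ours.

```python
import math
import random

# Hypothetical 8-bit S-box: a seeded random permutation standing in for a real one.
rng = random.Random(2020)
SBOX = list(range(256))
rng.shuffle(SBOX)

HW = [bin(v).count("1") for v in range(256)]  # Hamming weight table


def pearson(xs, ys):
    """Pearson coefficient: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def simulate_traces(key, num_traces, sigma):
    """Leak L(Z) + N, with Z = SBOX[plaintext xor key] and Gaussian noise N."""
    plaintexts = [rng.randrange(256) for _ in range(num_traces)]
    traces = [HW[SBOX[p ^ key]] + rng.gauss(0.0, sigma) for p in plaintexts]
    return plaintexts, traces


def cpa_best_key(plaintexts, traces):
    """Try every key byte; keep the one whose modeled leakage correlates best."""
    scores = []
    for guess in range(256):
        model = [HW[SBOX[p ^ guess]] for p in plaintexts]
        scores.append(abs(pearson(model, traces)))
    return max(range(256), key=scores.__getitem__)


SECRET_KEY = 0x3C  # arbitrary key byte chosen for the simulation
plaintexts, traces = simulate_traces(SECRET_KEY, num_traces=1000, sigma=1.0)
recovered_key = cpa_best_key(plaintexts, traces)
print(f"recovered key byte: {recovered_key:#04x}")
```

Only one key byte is guessed at a time, which is the divide-and-conquer approach mentioned above: 256 candidates are tested instead of the whole key space.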
The attacker starts with a first-order attack, in which the leakage is handled as is. It can
be proved that this first-order attack is successful if the conditional expectation E(L|Z = z)
depends on z. If it does not, then the attacker can try successively a second-order attack,
which mixes the observations of two leakages (and if these two leakages are the same, the
attacker takes then the square of a single leakage³ in a so-called zero-offset CPA [1100];
we shall consider only this case in the sequel, to simplify the presentation), a third-order
attack, etc., increasing the order of the attack until it is successful. The complexity of such
a higher-order side-channel attack (HO-SCA) [381, 803, 879, 920, 1047] depends then on the
smallest value of the order j such that the conditional expectation $E(L^j \mid Z = z)$ depends
on z; see [264]. It is shown in this latter reference that the complexity of the attack (in time
and in the number of measuring events needed, called traces; see below) is
exponential in the order, essentially because the noise associated with $L^j$ is exponential in j.
Relative to the noise, the leaked information decreases exponentially with the order j; it is
proportional to $V^{-j}$, where V is the variance of the noise N. This is where the choice of the
leakage model plays a role: a bad choice will increase the variance of the noise.
SCA are mainly statistical attacks, and the measures are made several times, each
time providing a so-called leakage trace. Usually, traces are assumed independent and
identically distributed.⁴ The measure quantifies, as we saw above, the running time, the
power consumption, the electromagnetic radiations of the cryptographic computation, or
even the photonic emission. Depending on the execution platform, the part of the leakage
due to one bit can be modeled according to its activity (the leakage is observed when the
bit changes values; this is the case of complementary metal-oxide-semiconductor [CMOS]
technology) or its value (the leakage differs according to the bit’s state; this can be viewed
as a particular case of the former case). If every bit of a sensitive variable leaks an identical
amount, irrespective of its neighbors, we are in the so-called Hamming distance (resp.
weight) leakage model (see more in [262]). The measure is inevitably imprecise and noisy
as we saw above with HO-SCA, but if the cryptosystem is not protected against SCA, the
resulting attack can be devastating (an unprotected AES can be attacked in a few seconds
with a few traces while its security against classical attacks is still nowadays of 128 bits,
which corresponds to a huge amount of computation time, even for thousands of computers
in parallel). In particular, continuous side-channel attacks in which the adversary gets
information at each invocation of the cryptosystem are especially threatening [713].
SCA are not the only threats on block ciphers, since fault injection attacks (FIA)
can also be performed, which aim at extracting the secret key when the algorithm is
running over some device, by injecting some fault in the computation, so as to obtain
exploitable differences at the output. For instance, differential fault analysis (DFA) attacks,
first proposed by Biham and Shamir [83], use information obtained from an incorrectly
functioning implementation of an algorithm to derive the secret information. The AES can
be attacked this way (see, e.g., [91]) as well as stream ciphers [606]. These attacks can be
noninvasive and perturb internal data (for example, with electromagnetic impulses), without
damaging the system, and leaving then no evidence that they have been perpetrated.
Masking. The implementations of cryptosystems need to include countermeasures to
physical attacks (SCA and FIA). A sound approach against SCA is to use a secret sharing
3 This case is more frequent in hardware; two distinct leakages are more exploitable in software because it is
easier to determine the exact distinct timings of two leakages than to distinguish them when they happen in
parallel; note, however, that the improved capabilities of modern microprocessors more and more allow
parallel software computing.
4 Adaptive adversarial strategies seldom confer a significant advantage compared to nonadaptive
strategies.

scheme (see page 145), often called masking in the context of side-channel attacks.⁵ This
method, which aims at amplifying the impact of the noise in the adversary’s observations
and at randomizing the secret-dependent internal values of the algorithm from one execution
to another, is efficient both for implementations in smart cards and FPGA or ASIC (in
the former case, the shares are usually manipulated in serial, while in the latter, they are
manipulated in parallel). This approach consists, for a given masking order d, in splitting
each sensitive variable⁶ Z of the implementation into d + 1 shares $M_0, \dots, M_d$ such that
Z can be recovered from these shares, but no information can be recovered from fewer
than d + 1 shares, i.e., Z is a deterministic function of all the $M_i$, but is independent of
$(M_i)_{i \in I}$ if $|I| \le d$. The simplest way (called Boolean masking) of achieving this is to draw
$M_1, \dots, M_d$ at random from the space in which Z lives (the $M_i$ are then called masks
and are redrawn fresh at every encryption) and to take $M_0$ such that $M_0 + \cdots + M_d$
equals Z, where + is a relevant group operation (in practice, the bitwise XOR). The
masks change at every computation. This countermeasure allows resisting the SCA of
order d. For instance, for d = 1 and if the leakage is the Hamming weight wH , then
instead of having traces corresponding to wH (Z), the attacker will have traces corresponding
to wH (Z + M, M) = wH (Z + M) + wH (M) (note that the individual leak from any of the
two shares is useless since it does not give information, being individually random); we
assume here that the attacker cannot separate the two leaks (which is more difficult with
hardware than with smart cards, as we explained above); if he or she can, the designer needs
to take d larger. It can be checked that the first-order attack is then no longer successful. It
has been also proposed (see [562, 974]) to use Shamir’s (, d + 1) secret sharing scheme
(see page 145) rather than Boolean masking (for which the information on the shared data is
relatively easy to rebuild from the observed shares, which simplifies the task of the attacker).
The advantages of such a masking method are studied in [340], where it is shown that it may
be more advantageous for the attacker (in terms of attack complexity) to observe strictly
more than d + 1 shares (while it could seem natural that observing strictly more than d + 1
shares is inappropriate for the attacker since it provides more noise), thanks to the existence
of so-called linear exact repairing codes (which allow reconstruction from less information
than Lagrange’s interpolation, thanks to polynomial interpolation formulae that optimize the
amount of information which needs to be extracted), and that the choice of the public points
(the αi at page 145) has an impact on the countermeasure strength.
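The effect of first-order Boolean masking on the moments of the leakage can be checked exhaustively on a small example. The sketch below assumes nibbles (n = 4), a uniform mask, and the Hamming weight leakage $w_H(Z + M) + w_H(M)$ considered above; the helper name `leakage_moment` is ours.

```python
from fractions import Fraction

N_BITS = 4  # nibble-sized sensitive variable, small enough to enumerate


def hamming_weight(v):
    return bin(v).count("1")


def leakage_moment(z, order):
    """Exact E[L^order | Z = z] for L = wH(z xor m) + wH(m), mask m uniform."""
    total = sum((hamming_weight(z ^ m) + hamming_weight(m)) ** order
                for m in range(1 << N_BITS))
    return Fraction(total, 1 << N_BITS)


first_moments = {z: leakage_moment(z, 1) for z in range(1 << N_BITS)}
second_moments = {z: leakage_moment(z, 2) for z in range(1 << N_BITS)}

# The first moment does not depend on z: a first-order attack learns nothing.
print("distinct first moments: ", sorted(set(first_moments.values())))
# The second moment depends on z: a second-order attack can succeed.
print("distinct second moments:", sorted(set(second_moments.values())))
```

In this model the second moment only depends on the Hamming weight of z, which is why a zero-offset (squared leakage) attack applies when d = 1.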
Security. We see with HO-SCA that, since the complexity of mounting a successful side-
channel attack increases exponentially with the order of the attack, then when applied against
a masked implementation, it grows exponentially with the masking order. Hence, it is always
possible, theoretically, to protect a cryptosystem against SCA by masking, but in practice this
requires changing in the algorithm every function $x \mapsto F(x)$ (that we shall assume to be
an (n, n)-function to simplify; the general case is similar) into a function⁷
$(m_0, \dots, m_d) \mapsto (m'_0, \dots, m'_d)$ such that, if $m_0, \dots, m_d$ are shares of x, then
$m'_0, \dots, m'_d$ are shares of F(x) (we shall say that such a function
$(m_0, \dots, m_d) \mapsto (m'_0, \dots, m'_d)$ is the masked version of
function F), and such that the d-th order security property is satisfied. The latter property,
which is equivalent to the probing security model introduced in [637], states that every tuple
5 Other methods exist: threshold implementations and multiparty computation; see below.
6 We denote random variables by capital letters.
7 We denote set or space elements by lower case letters.

of d or fewer intermediate variables is independent of the secret parameters of the algorithm.⁸
When satisfied, it guarantees that no attacker able to learn at most d intermediate results
(called probes) of a computation can succeed in an attack of order lower than or equal to
d. This model, which is a simplified version of the behavior of a device in the real world
(in which physical leakages reveal some information on the whole computation), makes it
possible, thanks to its simplicity, to build efficient compilers transforming (at a cost that is quadratic
in d) any circuit into a secure one in the probing model (see a survey in the introduction of
[49]). A more realistic and more complex model was proposed in [880], and improved in
[973] into the noisy leakage model, which was studied further and improved in [487], where
the so-called statistical distance was introduced, allowing one to show that constructions
proved secure in the probing security model are also secure in the noisy leakage model,
provided that the probing order is a large enough function of the noisy leakage order. A last
improvement can be found in [564].
An a priori weaker notion of d-th order resistance has been introduced in [897] to
characterize the security of parallel implementations, for which higher-order probing
security can never be achieved because all shares are treated within one single cycle. It
is called the bounded moment security model and has been studied in several papers (see,
e.g., [49]). A masking scheme is secure at order d in this model if no moment of degree d in
the intermediate variables depends on the secret. It is more appropriate for hardware. Indeed,
the appropriate model and, hence, the kind of masking scheme to be applied depends on the
capabilities of the execution platform: embedded software devices such as smart cards can
execute operations sequentially, but need to rely on smaller memories (which are constrained
resources); therefore, functions such as S-boxes are preferentially recomputed [972, section
2.1], while FPGAs are able to execute several operations in parallel, and can leverage
on large memory blocks (called Block Random Access Memory [BRAM]); in such a
context, masked functions can be simply tabulated, i.e., computed in one clock cycle (such
strategy is also referred to as Global Look-Up Table [972, section 3.2]). Therefore, masked
computations in smart cards require end-to-end security, whereas masked computations in
FPGAs can resort to large tabulated functions where only data representation (i.e., tables
input and output) shall be secured. Note, however, that if the memory requirement appears too
large, we can change the algorithm so that, instead of working on $\mathbb{F}_{2^n}$, it works in $\mathbb{F}_{2^{n/k}}$
for some k which provides a time–memory trade-off.
Reductions between the leakage security models seen above are studied in [49] (it is
proved in particular that probing security for a serial implementation implies bounded
moment security for its parallel counterpart, and that simple refreshing algorithms with
linear complexity that are not secure in the continuous probing model are secure in the
continuous bounded moment model). When probing and bounded moment security models
are considered at the bit level, then they are equivalent [577]. Note that there also exists a
parameter quantifying the resistance of S-boxes to DPA, called the (modified) transparency
order [343]; we shall not address it here.
Until recently, no method was known for securely composing masked (elementary)
functions ensuring d-probing security with a (tight) number d + 1 of shares. This problem
8 Note that when the algorithm handles vectors in $\mathbb{F}_2^n$, there are different ways of interpreting the definition,
according to whether it refers to intermediate variables as vectors or as individual bits; we shall then specify
bit-probing security when needed.

has been solved in [48] thanks to the introduction of the security notions of t-(strong)
noninterference (S)NI (any set of at most t intermediate variables can be perfectly simulated
with at most t shares of each input, and in the case of strong NI, at most $t - t_{out}$ shares, where
$t_{out}$ is the number of output variables among the t ones), and optimized in [55] (where it is
shown that some masked S-boxes may be composed without refreshing).
Security at order one against SCA is nowadays considered insufficient in both models
for most practical operational environments. Detecting a single fault is also insufficient.
Second-order resistance to both side-channel analysis and fault injection (in a
“mask then encode” procedure, which is more efficient than “encode then mask” in terms of
variable size growth, but care must be taken on the way the redundancy is applied) may be
sufficient⁹ (but without a security margin taking into account future improvements of SCA).

Masking functions. If a function F is linear (like diffusion layers – MixColumns and
ShiftRows in AES), then we can take $m'_i = F(m_i)$ for designing its masked version. A
little more generally, if a function is affine (like the round key addition – AddRoundKey in
AES), it can be masked at no extra cost.
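A small exhaustive check illustrates why linear and affine functions are masked share by share, while a nonlinear function is not. The sketch assumes two shares of a byte; the maps `rotl8` and `affine` are arbitrary illustrative choices (not AES layers), and `cube` is the power function $x^3$ in the AES field.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (the AES polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def rotl8(x):
    """Rotate a byte left by one bit: an F2-linear map (illustrative choice)."""
    return ((x << 1) | (x >> 7)) & 0xFF


def affine(x):
    """The linear map above plus a constant (illustrative choice)."""
    return rotl8(x) ^ 0x63


def cube(x):
    """x -> x^3 in GF(2^8): not affine."""
    return gf_mul(gf_mul(x, x), x)


def masks_sharewise(first_share_map, other_share_map, target):
    """For all sharings (m0, m1), do the mapped shares form a sharing of target(x)?"""
    t0 = [first_share_map(v) for v in range(256)]
    t1 = [other_share_map(v) for v in range(256)]
    tt = [target(v) for v in range(256)]
    return all(t0[m0] ^ t1[m1] == tt[m0 ^ m1]
               for m0 in range(256) for m1 in range(256))


linear_ok = masks_sharewise(rotl8, rotl8, rotl8)    # m'_i = F(m_i)
affine_ok = masks_sharewise(affine, rotl8, affine)  # constant added to one share only
cube_ok = masks_sharewise(cube, cube, cube)         # cross terms are missing
print(linear_ok, affine_ok, cube_ok)
```

For the affine map, the constant is added to a single share and the linear part is applied to the others; for the cube, the products between distinct shares are lost, which is what a masked multiplication has to restore.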
If a function F is not affine (which is the case of a substitution layer – SubBytes in AES),
then we can design its masked version as follows: assuming that the input to F lives in $\mathbb{F}_{2^n}$
(which is always possible since we assume it lives in a vector space over $\mathbb{F}_2$, and $\mathbb{F}_{2^n}$ is an
n-dimensional vector space over $\mathbb{F}_2$), it is a univariate polynomial function (see page 41) and
its computation can be decomposed into a sequence of additions and multiplications in the
field. The operations of addition, scalar multiplication, and squaring being linear functions,
they can be masked at no extra cost (see above). For masking multiplication, there is a
method called the ISW algorithm (ISW stands for Ishai–Sahai–Wagner), which is introduced
in [637] for the case of $\mathbb{F}_2$ and generalized to $\mathbb{F}_{2^n}$ in [996].

Algorithm 3: Higher-order masking scheme ISW for multiplication.

Input : sharings (a_0, a_1, ..., a_d) and (b_0, b_1, ..., b_d) of a and b in F_{2^n}
Output: a sharing (c_0, c_1, ..., c_d) of c = a × b

Randomly generate d(d + 1)/2 elements r_{i,j} ∈ F_{2^n} indexed such that 0 ≤ i < j ≤ d
for i = 0 to d do
    for j = i + 1 to d do
        r_{j,i} ← (r_{i,j} + a_i × b_j) + a_j × b_i
    end
end
for i = 0 to d do
    c_i ← a_i × b_i
    for j = 0 to d, j ≠ i do
        c_i ← c_i + r_{i,j}
    end
end
return (c_0, c_1, ..., c_d)
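Algorithm 3 can be transcribed as follows. This is a minimal functional sketch, assuming n = 8 with the AES reduction polynomial and randomness drawn from Python's `secrets` module; the helper names `share`, `unshare`, and `isw_multiply` are ours, and no claim is made that this Python code is itself protected against side channels.

```python
import secrets
from functools import reduce
from operator import xor


def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (the AES polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def share(value, d):
    """Split value into d + 1 Boolean shares (d fresh random masks)."""
    masks = [secrets.randbelow(256) for _ in range(d)]
    return [reduce(xor, masks, value)] + masks


def unshare(shares):
    """Recombine shares by XOR."""
    return reduce(xor, shares, 0)


def isw_multiply(a_shares, b_shares):
    """Return a sharing of a * b from sharings of a and b, following Algorithm 3."""
    d = len(a_shares) - 1
    r = [[0] * (d + 1) for _ in range(d + 1)]
    for i in range(d + 1):
        for j in range(i + 1, d + 1):
            r[i][j] = secrets.randbelow(256)  # fresh random r_{i,j}
            # Same parenthesization as the algorithm: (r_{i,j} + a_i b_j) + a_j b_i
            r[j][i] = (r[i][j] ^ gf_mul(a_shares[i], b_shares[j])) \
                ^ gf_mul(a_shares[j], b_shares[i])
    c_shares = []
    for i in range(d + 1):
        c_i = gf_mul(a_shares[i], b_shares[i])
        for j in range(d + 1):
            if j != i:
                c_i ^= r[i][j]
        c_shares.append(c_i)
    return c_shares


a, b, d = 0x57, 0x83, 3
c_shares = isw_multiply(share(a, d), share(b, d))
print(unshare(c_shares) == gf_mul(a, b))
```

The two nested loops make the quadratic cost in d visible: d(d + 1)/2 random elements are drawn and $(d + 1)^2$ field multiplications are performed.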

9 Some palliative countermeasures may then be needed, consisting for instance in desynchronization, random
interrupts or dummy operations; such palliative countermeasures are not sufficient on their own.

The time complexity and the amount of random data that needs to be generated for the
ISW algorithm are both quadratic in d (see more in [320]).
Other methods¹⁰ exist, as in [974] (they are surveyed in [54]; see also [563]), which
we do not detail since they do not pose, so far, new questions on Boolean and vectorial
functions.
We see that the designer of the block cipher implementation has some advantage over
the attacker, because increasing d raises exponentially the complexity of the attack and only
quadratically the complexity of the countermeasure. However, countermeasures are costly
in terms of running time and program executable file size (in software applications) or of
implementation area (in hardware applications). For example, in software with 8-bit AVR
architecture, an AES without masking runs in a few hundred to a few thousand cycles,
while with masking it already needs about 40,000 cycles at first order; moreover, the
program executable file size is also increased because of the need to mask S-boxes (see
https://github.com/ANSSI-FR/secAES-ATmega8515/). In hardware, the implementation
area is roughly tripled. The cost overhead may be too high for real-world products, all the
more when the order of probing security is larger than 1. But the implementation (including
masking) must be efficient today, while the SCA can be performed in the future.
We need then to minimize the implementation and memory complexities of the counter-
measures. This is where Boolean and vectorial functions can play a role.

12.1.1 A new role of correlation immunity and of the dual distance of codes related
to side-channel attack countermeasures
Correlation immune Boolean functions (see Definition 21, page 86) allow reducing in two
possible ways the cost overhead due to masking, while keeping the same resistance to dth-
order SCA in the bounded moment security model (and possibly the same order of probing
security, but this depends on the implementation) when the leak is a linear combination over
the reals of the bits of the sensitive variable, added with a Gaussian noise (this assumption
on the leak is rather realistic and the assumption on the noise is almost always the case in
literature):
• By applying a method called leakage squeezing, which allows achieving with one single
mask the same protection of registers against higher-order SCA as with d ones, where
d is an integer strictly larger than 1 that we shall define. This method, which allows
making optimal the representation of the shares and maximizes the resistance order
against high-order side-channel attacks, has been introduced in [811] and further studied
in [263, 810] (an extremely close countermeasure has been introduced independently
in [132]). It uses a bijective vectorial function F that is applied to modify the mask
(this is a reason why the method is better adapted to hardware since we need to apply
F −1 at some point, and in hardware this can be made more easily as we explained, but
millions of smart cards built by industry nowadays include leakage squeezing as well).
The pair (M0 , M1 ) such that M0 + M1 = Z is not processed as is in the device, but
in the form of (M0 , F (M1 )). The condition for achieving resistance to d-th order SCA
in the bounded moment security model is proved in [263, 810, 811]: assuming that the
10 One (the threshold implementation) will be seen in Subsection 12.1.4, page 436; its complexity is higher while
it addresses a more difficult situation than the ISW algorithm.

leakage model is a pseudo-Boolean function of numerical degree (see Definition 13,
page 48) at most d (which is the case of the dth power of a degree 1 leakage), it is
that the graph indicator of F, that is, the 2n-variable Boolean function whose support
equals the graph {(x, y); y = F(x)} of F, is d-th order correlation immune. Such a
graph is a complementary information set code (CIS) in the sense that it admits (at least)
two information sets (see page 314) that are complementary of each other; see [280].
The condition that the indicator of this CIS code is d-th order correlation immune is
equivalent to saying that the dual distance of this code is at least d + 1 (according
to Corollary 6, page 88), which is coherent with what was observed by Massey [827]
already in 1993. For instance, a rate 1/2 [16, 8, 5] linear code can be used. But it is shown in
[810] that there exists a nonlinear code that achieves better: it is the Nordstrom–Robinson
code of parameters (16, 256, 6). A comprehensive study of CIS codes has been made in
[280], and it is shown in [264] that the mutual information between the sensitive data and
the leakage vanishes exponentially with the noise variance, at a rate that is proportional
to the dual distance.
The method of leakage squeezing has been later generalized in [262] to several masks.
Compared to first-order leakage squeezing, second-order leakage squeezing is more
efficient, since it increments by one unit at least the resistance against high-order
attacks, with an appropriate (a priori different) code. In fact, it improves it more, since
better improvements have been realized by relevant choices of squeezing bijections.
But the optimal solutions are more difficult to find than in the case with one mask.
When the masking is applied on bytes (as in AES), optimal leakage squeezing with
one mask resists HO-SCA of orders up to 5 (with the Nordstrom–Robinson code),
and with two masks, resistance against HO-SCA of order 7 is provided. The study of
the corresponding higher-order CIS codes has been made in [277]. A rate 1/3 [24, 8, 8]
linear code (maximal minimal distance) with three disjoint information sets fulfills the
conditions.
• An alternative way of resisting higher-order SCA with one single mask consists in
avoiding processing the mask at all: for every sensitive variable Z that is the input to
some box S in the block cipher, Z is replaced by Z + M, where M is drawn at random,
and Z + M is the input to a “masked” box SM whose output is a masked version of S(Z)
(and the process of masking continues similarly during the whole implementation, only
the very last step being eventually unmasked to give the result). This method is called
rotating S-box masking (RSM) [898]. It needs, for each box S in the cipher, to implement
a look-up table for each masked box SM . This is particularly well adapted to hardware:
all S-boxes are then addressed in parallel, for a better throughput; the attacker is not
able to know which S-box is addressed for a given value of M; he/she is only able to
identify that the S-boxes have been looked up, but the order in which they are queried is
indistinguishable from his/her standpoint; he/she is limited to collecting an aggregated
function of all S-boxes. This being said, many smart cards implement RSM nowadays as
well, still more than leakage squeezing.
To reduce the cost, M is not drawn at random in the whole set of binary vectors of
the same length as Z, but in a smaller set of such vectors, say E. The condition for
achieving resistance to d-th order SCA in the bounded moment security model is that
the indicator function 1E is a d-th order correlation immune function, i.e., that E viewed
as a code has dual distance at least d + 1 [74, 286]. This is because, for any j ≤ d, the
mean of $w_H^j(Z + M)$ when Z has some fixed value z equals
$\frac{1}{|E|} \sum_{m \in E} w_H^j(z + m) = \frac{1}{|E|} \sum_{m \in \mathbb{F}_2^n} 1_E(m)\, w_H^j(z + m) = \frac{1}{|E|} \left(1_E \otimes w_H^j\right)(z) = \frac{1}{2^n |E|}\, \widehat{\widehat{1_E} \times \widehat{w_H^j}}\,(z)$,
according to Relation (2.45), page 60, and this mean is independent of z if and only if
$\widehat{1_E}(a) \times \widehat{w_H^j}(a) = 0$ for every $a \neq 0_n$, while we know that $w_H^j$, which has
numerical degree j, satisfies that $\widehat{w_H^j}(a) = 0$ if and only if $w_H(a) > j$.
Given d, we wish to choose this d-th order correlation immune function 1E with lowest
possible (nonzero) weight, since the size of the overhead due to the masked look-up
tables is proportional to the size of the set.¹¹
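The RSM condition can be verified by brute force on a small mask set. In the sketch below, E is the [8, 4, 4] self-dual code, generated here by the rows of $[I \mid J + I]$ (one possible generator matrix, chosen by us); its dual distance is computed from the Fourier transform of $1_E$, and the moments of $w_H(z + M)$ are compared over all z. The function names are ours.

```python
N_BITS = 8
# Rows of [I | J + I]: one generator matrix of the [8, 4, 4] self-dual code.
GENERATOR_ROWS = [0b10000111, 0b01001011, 0b00101101, 0b00011110]


def hamming_weight(v):
    return bin(v).count("1")


def linear_span(rows):
    """All F2-linear combinations of the given rows."""
    words = {0}
    for row in rows:
        words |= {w ^ row for w in words}
    return sorted(words)


MASK_SET = linear_span(GENERATOR_ROWS)  # the set E in which the mask M is drawn


def dual_distance(code, n_bits):
    """Least weight of a nonzero a where the Fourier transform of 1_E is nonzero."""
    weights = []
    for a in range(1, 1 << n_bits):
        fourier = sum(-1 if hamming_weight(a & m) & 1 else 1 for m in code)
        if fourier != 0:
            weights.append(hamming_weight(a))
    return min(weights) if weights else None


def moment_is_constant(code, n_bits, order):
    """Is the sum over m in E of wH(z xor m)^order the same for every z?"""
    sums = {sum(hamming_weight(z ^ m) ** order for m in code)
            for z in range(1 << n_bits)}
    return len(sums) == 1


print("mask set size:", len(MASK_SET))
print("dual distance:", dual_distance(MASK_SET, N_BITS))
for order in range(1, 5):
    print("order", order, "moment independent of z:",
          moment_is_constant(MASK_SET, N_BITS, order))
```

With 16 masks instead of 256, the moments of orders 1 to 3 do not depend on z, in accordance with a dual distance of 4; the fourth moment does.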

In [289], it is shown that the security notion (at bit level, i.e., in F2 ) corresponds in these two
cases to d-probing and d-th order bounded moment security models.
Since leakage squeezing and RSM need correlation immune functions of low weight (with
a particular shape in the case of leakage squeezing, since the function must then be the graph
indicator of a permutation; see more in [244]), a new problem on Boolean
functions has arisen, which we began to address at pages 303 and following (further work is needed).
Most of the numerous studies made (mostly in the 1990s) on correlation immune functions
in the framework of stream ciphers (see page 86) dealt with resilient (balanced) functions
and do not apply to low-weight correlation immune functions.
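Leakage squeezing with one mask can be illustrated in the same exhaustive way. The sketch assumes nibbles, the shares $(M_0, F(M_1))$ with $M_0 + M_1 = Z$, and a leakage equal to the sum of the Hamming weights of the two shares; the bijection `squeeze` is our illustrative choice, whose graph is the [8, 4, 4] code generated by $[I \mid J + I]$.

```python
N_BITS = 4


def hamming_weight(v):
    return bin(v).count("1")


def squeeze(m):
    """Illustrative bijection on nibbles: complement m when its weight is odd.

    It is F2-linear and its graph {(m, squeeze(m))} is an [8, 4, 4] code.
    """
    return m ^ 0xF if hamming_weight(m) & 1 else m


def identity(m):
    """No squeezing: plain first-order Boolean masking."""
    return m


def moment_is_constant(bijection, order):
    """Is the sum over m of (wH(z xor m) + wH(F(m)))^order the same for every z?"""
    sums = {sum((hamming_weight(z ^ m) + hamming_weight(bijection(m))) ** order
                for m in range(1 << N_BITS))
            for z in range(1 << N_BITS)}
    return len(sums) == 1


def first_leaking_order(bijection, max_order=8):
    """Smallest order whose leakage moment depends on z (None if none is found)."""
    for order in range(1, max_order + 1):
        if not moment_is_constant(bijection, order):
            return order
    return None


print("plain masking leaks at order:", first_leaking_order(identity))
print("squeezed masking leaks at order:", first_leaking_order(squeeze))
```

With the same single mask, the smallest order at which a moment depends on z moves from 2 to 4, which matches the dual distance 4 of the graph of `squeeze`.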

12.1.2 Vectorial functions in univariate form: minimizing the number of nonlinear
multiplications for reducing the cost of countermeasures
In [297], properties that an S-box could possess to be more resilient against side-channel
attacks, such as the (near) preservation of Hamming weight and a small Hamming distance
between input and output, are studied; the consequences for the nonlinearity and differential
uniformity are determined.
Additional protections, like masking, are in any case unavoidable. We have seen that
the complexity of masking additions and linear multiplications (like, for instance, x × x)
is negligible compared to that of masking nonlinear multiplications. To efficiently mask
an algorithm, we need to minimize the masking complexity of each S-box, that is, the
minimum number of nonlinear multiplications needed to implement it. This parameter is
affine invariant.
When the S-box is a power function $F(x) = x^d$ like in the AES, minimizing the number
of nonlinear multiplications results in a variant of the classical problem of minimizing
addition chains in a group (see [284]); determining the masking complexity amounts to
finding the addition chain for d with the least number of additions that are not doublings. For
instance, the inverse function $x \mapsto x^{254} = x^{-1}$ in $\mathbb{F}_{2^8}$ can be implemented with four nonlinear
multiplications, in many ways (we saw one at page 401; note that the well-known square-
and-multiply algorithm for computing the inverse needs more than four multiplications).

11 Note, however, that if the cipher is made like the AES, with identical substitution boxes up to affine
equivalence, the substitution layer can be slightly modified so as to be masked at no extra cost: the affine
equivalent boxes are replaced by masked versions of the same box; namely, the 16 byte masks that can be
applied to the 16 boxes are the codewords of the [8, 4, 4] self-dual code.
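One possible addition chain for 254 with four nonlinear multiplications is 1, 2, 3, 12, 15, 240, 252, 254 (it is not necessarily the one given at page 401). The sketch below assumes the AES representation of $\mathbb{F}_{2^8}$ and counts only the multiplications of two distinct powers, squarings being linear.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (the AES polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_square_times(x, k):
    """Apply x -> x^2 k times; each squaring is F2-linear, hence cheap to mask."""
    for _ in range(k):
        x = gf_mul(x, x)
    return x


def inverse_with_four_multiplications(x):
    """Return (x^254, number of nonlinear multiplications used)."""
    nonlinear = 0
    x2 = gf_square_times(x, 1)      # x^2
    x3 = gf_mul(x2, x)              # x^3   (multiplication 1)
    nonlinear += 1
    x12 = gf_square_times(x3, 2)    # x^12
    x15 = gf_mul(x12, x3)           # x^15  (multiplication 2)
    nonlinear += 1
    x240 = gf_square_times(x15, 4)  # x^240
    x252 = gf_mul(x240, x12)        # x^252 (multiplication 3)
    nonlinear += 1
    x254 = gf_mul(x252, x2)         # x^254 (multiplication 4)
    nonlinear += 1
    return x254, nonlinear


inverse_table = [inverse_with_four_multiplications(x)[0] for x in range(256)]
print(all(gf_mul(x, inverse_table[x]) == 1 for x in range(1, 256)))
```

Only the four products of distinct powers would require a masked multiplication such as ISW; the seven squarings can be applied share by share.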
When the S-box is a general polynomial, minimizing the number of nonlinear multiplications
is a new paradigm. It is proved in [382] that, for every positive integer n, there exists a
polynomial $P(x) \in \mathbb{F}_{2^n}[x]$ with masking complexity:
$\mathcal{MC}(P) \geq \sqrt{\frac{2^n}{n}} - 2$. (12.1)
There exist several methods for trying to minimize the multiplicative complexity¹² $\mathcal{MC}(P)$
of polynomials P and allowing their probing secure evaluation at minimized cost.
We refer to [284, 320] for more details. The first two methods are provable and the last two
are heuristic¹³ (and more efficient in practice).
• The cyclotomic method consists in rewriting $P(x)$ in the form:
\[ P(x) = u_0 + \sum_{i=1}^{q} L_i(x^{\alpha_i}) + u_{2^n-1}\, x^{2^n-1}, \]
where $q$ is a positive integer and $(L_i)_{i \le q}$ is a family of linear functions. Since the
transformations $x \in \mathbb{F}_{2^n} \mapsto x^{2^j}$ are $\mathbb{F}_2$-linear, their masking complexity is null. This
implies that the masking complexity of $\sum_{i=1}^{q} L_i(x^{\alpha_i})$ is bounded above by the number
of nonlinear multiplications required to evaluate all the monomials $x^{\alpha_i}$, that is, by
\[ \sum_{\delta \mid (2^n-1)} \frac{\varphi(\delta)}{\mu(\delta)} - 1, \]
where $\mu(m)$ denotes the multiplicative order of 2 modulo $m$ and $\varphi$ Euler's totient function.
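The sum appearing in this bound counts the cyclotomic classes of 2 modulo $2^n - 1$. The following Python sketch (function names are hypothetical, chosen for this example) enumerates the classes directly and compares their number with $\sum_{\delta \mid (2^n-1)} \varphi(\delta)/\mu(\delta)$, taking $\mu(1) = 1$ by convention.

```python
from math import gcd


def cyclotomic_classes(n):
    """Cyclotomic classes of 2 modulo 2^n - 1, as a list of sorted lists."""
    modulus = 2**n - 1
    seen, classes = set(), []
    for a in range(modulus):
        if a in seen:
            continue
        cls, b = [], a
        while b not in cls:
            cls.append(b)
            b = (2 * b) % modulus
        seen.update(cls)
        classes.append(sorted(cls))
    return classes


def euler_phi(m):
    """Euler's totient function (naive version, adequate for small m)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def order_of_two(m):
    """Multiplicative order of 2 modulo an odd m; taken to be 1 when m = 1."""
    if m == 1:
        return 1
    k, power = 1, 2 % m
    while power != 1:
        power = (2 * power) % m
        k += 1
    return k


def class_count_formula(n):
    """Sum of phi(delta) / mu(delta) over the divisors delta of 2^n - 1."""
    modulus = 2**n - 1
    return sum(euler_phi(d) // order_of_two(d)
               for d in range(1, modulus + 1) if modulus % d == 0)
```

For instance, `class_count_formula(8) - 1` gives the value of the bound above for $n = 8$.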


• The Knuth–Eve method is based on a recursive use of the observation that any
polynomial $P(x)$ of degree $t$ over $\mathbb{F}_{2^n}[x]$ can be written in the form
\[ P(x) = P_1(x^2) \oplus P_2(x^2)\,x, \]
where $P_1(x)$ and $P_2(x)$ have degrees bounded above by $\lfloor t/2 \rfloor$. This implies that the
masking complexity of $P(x)$ is at most
\[ \begin{cases} 3 \cdot 2^{(n/2)-1} - 2 & \text{if } n \text{ is even,} \\ 2^{(n+1)/2} - 2 & \text{if } n \text{ is odd.} \end{cases} \]
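The even/odd splitting used by the Knuth–Eve method can be checked numerically. The sketch below assumes $n = 8$ with the AES field polynomial (an assumption for the example, as are all function names); it splits a coefficient list into $P_1$ and $P_2$, verifies $P(x) = P_1(x^2) \oplus P_2(x^2)\,x$ by evaluation, and also evaluates the bound stated above.

```python
# Assumption: F_{2^8} is represented modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
AES_POLY = 0x11B


def gf_mul(a, b):
    """Multiply two elements of F_{2^8}."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
        b >>= 1
    return result


def evaluate(coeffs, x):
    """Horner evaluation of sum(coeffs[i] * x^i) over F_{2^8}."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc


def split_even_odd(coeffs):
    """Return (P1, P2) such that P(x) = P1(x^2) + P2(x^2) * x."""
    return coeffs[0::2], coeffs[1::2]


def knuth_eve_bound(n):
    """Upper bound on the masking complexity quoted in the text."""
    if n % 2 == 0:
        return 3 * 2 ** (n // 2 - 1) - 2
    return 2 ** ((n + 1) // 2) - 2
```

Applying `split_even_odd` recursively to $P_1$ and $P_2$ gives the recursive use mentioned in the text; only one level is shown here.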
• The Coron–Roy–Vivek (CRV) method [382] starts with a union $\mathcal{C}$ of cyclotomic classes
$C_i$ in $\mathbb{Z}/(2^n-1)\mathbb{Z}$, such that all power functions $x^j$, $j \in \mathcal{C}$, can be processed with
a small enough global number of nonlinear multiplications. This set of monomials $x^j$
spans a subspace $\mathcal{P}$ of $\mathbb{F}_{2^n}[x]$. A polynomial $R \in \mathbb{F}_{2^n}[x_1,\dots,x_t]$ is searched such that
\[ P(x) = R(P_1(x),\dots,P_t(x)), \]
where the $P_i$ are taken in $\mathcal{P}$. Denoting by $\mu$ the number of nonlinear multiplications
required to build $\mathcal{C}$, the search tries to minimize $MC(R) + \mu$. A heuristic approach (in
order to speed up the process) is proposed:
12 This term is meant here at the F2n field level; it can be also considered at the bit level, in relation with bitsliced
implementations; see, e.g., [565] and the references therein.
13 In the sense of “not proved.”

1. Build the union set $\mathcal{C}$ such that all the powers of $P$'s monomials are in $\mathcal{C} + \mathcal{C}$.
2. Choose and fix a set of $r$ polynomials $P_1(x),\dots,P_r(x)$ in $\mathcal{P}$ and search for $r+1$
polynomials $P_{r+1}(x),\dots,P_{2r+1}(x)$ in $\mathcal{P}$ such that
\[ P(x) = \sum_{i=1}^{r} P_i(x) \times P_{r+i}(x) + P_{2r+1}(x). \tag{12.2} \]
Thanks to the fact that $P_1(x),\dots,P_r(x)$ have been fixed, this results in solving a
linear system of $n2^n$ Boolean equations in at most $\min(r,|\mathcal{C}|) \times |\mathcal{C}| + |\mathcal{C}|$ unknowns.
The condition $2^n \le |\mathcal{C}| \times (1 + \min(r,|\mathcal{C}|))$ ensures then that the method outputs
at least one solution. The complexity of the resulting probing secure method is
$O(\sqrt{2^n/n})$, which is asymptotically better than the complexity of Knuth–Eve's
method. Moreover, a comparison of Coron's complexity with Inequality (12.1) shows
that it is asymptotically optimal.

• The CPRR method [321] is more recent and based on another algebraic decomposition
heuristic principle (CPRR stands for Carlet–Prouff–Rivain–Roche). It decomposes $P(x)$
by means of functions of low algebraic degree, and designs efficient probing-secure
evaluation methods for such low-degree functions. The decomposition step starts by
deriving a family of generators:
\[ \begin{cases} G_1(x) = F_1(x) \\ G_i(x) = F_i(G_{i-1}(x)) \end{cases} \]
where the $F_i$ are random polynomials of algebraic degree $s$. Then it randomly generates
$t$ polynomials $Q_i = \sum_{j=1}^{r} L_j \circ G_j$, where the $L_j$ are linearized polynomials. Eventually, it
searches for $t$ polynomials $P_i$ of algebraic degree $s$ and for $r+1$ linearized polynomials $L_i$ such that
\[ P(x) = \sum_{i=1}^{t} P_i(Q_i(x)) + \sum_{i=1}^{r} L_i(G_i(x)) + L_0(x). \]
As in the CRV method, the search for polynomials $P_i$ and $L_i$ amounts to solving a system
of linear equations over $\mathbb{F}_{2^n}$.
For masking a function $F$ of algebraic degree at most $s$, the method uses that for every
function from $\mathbb{F}_{2^n}$ to itself of algebraic degree at most $s$, the mapping
\[ \beta_F^{(s)}(a_1, a_2, \dots, a_s) = \sum_{I \subseteq \{1,\dots,s\}} F\Big(\sum_{i \in I} a_i\Big) \]
is multilinear (which is easily seen and has been first observed in [209]), which allows
us to prove that, for every $d \ge s$:
\[ F\Big(\sum_{i=1}^{d} a_i\Big) = \sum_{1 \le i_1 < \dots < i_s \le d} \beta_F^{(s)}(a_{i_1},\dots,a_{i_s}) + \sum_{j=0}^{s-1} \eta_{d,s}(j) \sum_{\substack{I \subseteq \{1,\dots,d\} \\ |I| = j}} F\Big(\sum_{i \in I} a_i\Big), \]
where $\eta_{d,s}(j) = \binom{d-j-1}{s-j-1} \bmod 2$ for every $j \le s-1$, and to deduce that
\[ F\Big(\sum_{i=1}^{d} a_i\Big) = \sum_{j=0}^{s} \mu_{d,s}(j) \sum_{\substack{I \subseteq \{1,\dots,d\} \\ |I| = j}} F\Big(\sum_{i \in I} a_i\Big), \]
where $\mu_{d,s}(j) = \binom{d-j-1}{s-j} \bmod 2$ for every $j \le s$. This reduces the complexity of the
d-masking of a degree s function to several s-maskings. An alternative (tree-based)
method is also proposed. It is shown that the processing of any S-box of dimension
n = 8 can be split into 11 evaluations of quadratic functions, or into four evaluations of
cubic functions.
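The second identity above (with the coefficients $\mu_{d,s}(j)$) can be tested numerically on random functions of bounded algebraic degree. The sketch below is only such a sanity check, not the CPRR evaluation scheme itself: elements of $\mathbb{F}_2^n$ are encoded as integers, the case $d > s$ is assumed, and all names and parameter values are choices made for this example.

```python
import random
from itertools import combinations
from math import comb


def xor_all(vectors):
    """Sum (XOR) of a collection of bit vectors encoded as integers."""
    acc = 0
    for v in vectors:
        acc ^= v
    return acc


def random_function_of_degree_at_most(s, n, m, rng):
    """Lookup table of a random (n, m)-function of algebraic degree <= s."""
    monomials = [I for size in range(s + 1) for I in combinations(range(n), size)]
    coeffs = {I: rng.randrange(2**m) for I in monomials}
    table = []
    for x in range(2**n):
        value = 0
        for I, a in coeffs.items():
            if all((x >> i) & 1 for i in I):   # monomial x^I equals 1 at x
                value ^= a
        table.append(value)
    return table


def mu(d, s, j):
    """mu_{d,s}(j) = binom(d - j - 1, s - j) mod 2, for d > s >= j."""
    return comb(d - j - 1, s - j) % 2


def evaluate_from_small_sums(table, shares, s):
    """Right-hand side of the identity: F evaluated on sums of at most s shares."""
    d = len(shares)
    total = 0
    for j in range(s + 1):
        if mu(d, s, j):
            for subset in combinations(shares, j):
                total ^= table[xor_all(subset)]
    return total


def identity_holds(s, d, n=5, m=4, trials=100, seed=0):
    """Check F(a_1 + ... + a_d) against the right-hand side on random shares."""
    rng = random.Random(seed)
    table = random_function_of_degree_at_most(s, n, m, rng)
    for _ in range(trials):
        shares = [rng.randrange(2**n) for _ in range(d)]
        if table[xor_all(shares)] != evaluate_from_small_sums(table, shares, s):
            return False
    return True
```

For example, `identity_holds(2, 4)` exercises the case of a quadratic function shared in four parts.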

12.1.3 Vectorial functions and algebraic side-channel attacks


In [990], an attack on block ciphers called algebraic side-channel attack is introduced, which
combines the two approaches of algebraic attacks and side-channel attacks. In [272], the
algebraic phase of this attack is studied. The notion of algebraic immunity is modified to
include the information from the leakage on Hamming weight or on Hamming distance, and
the authors study how this yields enough equations of degree one to solve
the algebraic system with Gröbner basis methods. We refer to these two papers for the technical
details.

12.1.4 Vectorial functions and threshold implementation


The countermeasures against SCA presented so far assume, in order to be efficient,
that the leakage has some regularity. Building hardware with such a property is expensive
in practice. In particular, hardware glitches, which are transient faults common in CMOS
technology (they occur when the input signals of the combinational logic arrive at different
moments in time instead of simultaneously, so that a signal switches several times when it
should switch once), change the leakage into functions $L$ having numerical degree
larger than one, because of the interactions between bits that they cause, and these functions
moreover vary with time. Glitch-free hardware is very expensive. We present here the main known
solutions that avoid needing it.
1. The problem of building implementations secure against d-th order side-channel attacks
in the presence of glitches is equivalent to the problem of securing the processing of a
function with several semihonest players (see [974]). A related way of masking S-boxes
is the so-called polynomial masking, introduced separately in [974] and [562], and which
gives a solution to this problem without needing sophisticated hardware. The idea is to
make the global circuit glitch-free-like by the implementation itself, splitting the circuit
implementing the S-box into several subcircuits communicating with each other on the
basis of a multiparty computation protocol (see page 146), like the one in [59]. The
masking operation of a sensitive datum $z \in \mathbb{F}_{2^n}$ is based on Shamir's secret sharing seen in
Subsection 3.6.1, page 145. It consists in constructing a function $f_z(x) = \sum_{i=1}^{m-1} a_i x^i + z$,
where $(a_i)_{1 \le i \le m-1}$ are some random secret coefficients; then, as in Boolean masking, $z$
can be represented by $m$ shares $(z_0,\dots,z_{m-1})$, with $z_i = (\alpha_i, f_z(\alpha_i))$ for $0 \le i \le m-1$,
for some random inputs $(\alpha_i)_{0 \le i \le m-1}$. To get $z$ (unmasked), we have to reconstruct $f_z$
by polynomial interpolation,14 and finally calculate $z = f_z(0)$. The advantage of this
method is that it is grounded in a well-studied theory (multiparty computation) and the
security models are clear. Its disadvantage is that it is not very efficient, especially when
first-order SCA is considered.

14 Better methods have been very recently found; see [340].
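The sharing and reconstruction steps of polynomial masking can be sketched as follows. This is only an illustration of Shamir-style sharing over $\mathbb{F}_{2^8}$, not a protected implementation: the AES field polynomial, the choice of distinct nonzero evaluation points $\alpha_i$, and all function names are assumptions of this example.

```python
import random

# Assumption: F_{2^8} is represented modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
AES_POLY = 0x11B


def gf_mul(a, b):
    """Multiply two elements of F_{2^8}."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
        b >>= 1
    return result


def gf_inv(a):
    """Inverse of a nonzero element, computed naively as a^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def share(z, m, rng):
    """Split z into m shares (alpha_i, f_z(alpha_i)), f_z(x) = sum_{i>=1} a_i x^i + z."""
    coeffs = [z] + [rng.randrange(256) for _ in range(m - 1)]
    alphas = rng.sample(range(1, 256), m)      # distinct nonzero evaluation points
    shares = []
    for alpha in alphas:
        value = 0
        for c in reversed(coeffs):             # Horner evaluation of f_z(alpha)
            value = gf_mul(value, alpha) ^ c
        shares.append((alpha, value))
    return shares


def reconstruct(shares):
    """Recover z = f_z(0) by Lagrange interpolation at 0."""
    z = 0
    for i, (alpha_i, y_i) in enumerate(shares):
        weight = 1
        for j, (alpha_j, _) in enumerate(shares):
            if j != i:
                # factor alpha_j / (alpha_j + alpha_i); subtraction is XOR in char. 2
                weight = gf_mul(weight, gf_mul(alpha_j, gf_inv(alpha_j ^ alpha_i)))
        z ^= gf_mul(y_i, weight)
    return z
```

A typical round trip would be `reconstruct(share(z, m, random.Random()))`, which should return `z` for any byte `z` and any number of shares `m` between 1 and 255.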
2. Another S-box masking method, also based on ideas of multiparty computation and
aiming at solving the problem posed by glitches, is threshold implementation15 (TI).
Threshold schemes are attractive from an academic viewpoint, because they come with an
information-theoretic proof of resistance against first-order DPA while allowing realistic-
size circuits.16 Introduced in [903] and presented more completely in [904], they pose
interesting challenges for vectorial functions. In the TI of an (n, m)-function F , the shares
of the output of F are the outputs of several functions of the shares of the input, each such
function being independent of at least one of the shares of the input to F (a different one
for each function); these two properties, called correctness and incompleteness, provide
first-order probing security (if the implementation is done properly). More precisely:
• The masked version with $t$ masks (i.e., with $t+1$ shares) of each input variable $x_i$ will
be denoted by $\mathbf{x}_i = (x_i^{(1)},\dots,x_i^{(t+1)}) \in \mathbb{F}_2^{t+1}$. We shall denote the sum
$x_i^{(1)} \oplus \dots \oplus x_i^{(t+1)}$ of the coordinates of $\mathbf{x}_i$ by $s(\mathbf{x}_i)$; we have $s(\mathbf{x}_i) = x_i$ for every $i$. Extending $s$
to a function over $(\mathbb{F}_2^n)^{t+1}$, we have then $s(\mathbf{x}) = x$, for every $\mathbf{x} = (\mathbf{x}_1,\dots,\mathbf{x}_n)$.
• A $t$-mask (i.e., $(t+1)$-share) realization of an $(n,m)$-function $F$ is a vector
$\mathbf{F} = (F_1,\dots,F_{t+1})$ of $((t+1)n, m)$-functions, i.e., a function from $(\mathbb{F}_2^n)^{t+1}$ to $(\mathbb{F}_2^m)^{t+1}$,
such that, for all $\mathbf{x} \in (\mathbb{F}_2^n)^{t+1}$, denoting also $\sum_{j=1}^{t+1} F_j(\mathbf{x})$ by $s(\mathbf{F}(\mathbf{x}))$, we have
\[ \text{if } x = s(\mathbf{x}), \text{ then } F(x) = s(\mathbf{F}(\mathbf{x})). \]


This property is called correctness. In practice, the numbers of input and output shares
may be different but we take them equal to simplify the presentation. To obtain $d$-th order
security against univariate attacks in a so-called HO-TI, we would need to have $td$ masks
and $\binom{td+1}{t}$ shares, and in a so-called consolidated masking scheme, $d$ masks and $(d+1)^t$
output shares later synchronized in a register and compressed back to $d+1$ shares (which
is often compared to the ISW multiplication), but with an extra register to protect against
glitches; we shall not develop this and refer the reader to [85] and [992].
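In terms of function graphs, let $G_F = \{(x, F(x));\ x \in \mathbb{F}_2^n\}$ and
$G_{\mathbf{F}} = \{(\mathbf{x}, \mathbf{F}(\mathbf{x}));\ \mathbf{x} \in (\mathbb{F}_2^n)^{t+1}\}$ be the graphs of functions $F$ and $\mathbf{F}$; correctness
corresponds to the fact that the linear function
\[ (\mathbf{x}, \mathbf{y}) \mapsto (s(\mathbf{x}), s(\mathbf{y})) \]
maps $G_{\mathbf{F}}$ to $G_F$.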
Correctness can be characterized by the Walsh transform in the following way:

Proposition 187 Given a $((t+1)n, (t+1)m)$-function:
\[ \mathbf{F} = (F_1,\dots,F_{t+1}) : \mathbf{x} \in (\mathbb{F}_2^n)^{t+1} \mapsto \mathbf{F}(\mathbf{x}) \in (\mathbb{F}_2^m)^{t+1} \]
and an $(n,m)$-function $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$, we have
\[ s(\mathbf{F}(\mathbf{x})) = F(s(\mathbf{x})) \]
if and only if
\[ \forall u^{(1)},\dots,u^{(t+1)} \in \mathbb{F}_2^n,\ \forall v \in \mathbb{F}_2^m, \quad W_{\mathbf{F}}\big((u^{(1)},\dots,u^{(t+1)}), (v,\dots,v)\big) = \begin{cases} 2^{tn}\, W_F(u^{(1)}, v) & \text{if } u^{(1)} = \dots = u^{(t+1)} \\ 0 & \text{otherwise.} \end{cases} \]

15 Not to be confused with threshold functions, seen in Subsection 10.1.7.
16 These can still be attacked by (univariate) mutual information and higher-order analyses.

Proof We have $(F_1 + \dots + F_{t+1})(x^{(1)},\dots,x^{(t+1)}) = F(x^{(1)} + \dots + x^{(t+1)})$ if and only if
these two functions have the same Walsh transform, that is, if for every $u^{(1)},\dots,u^{(t+1)} \in \mathbb{F}_2^n$
and $v \in \mathbb{F}_2^m$, $W_{F_1+\dots+F_{t+1}}((u^{(1)},\dots,u^{(t+1)}), v)$ equals the value at $((u^{(1)},\dots,u^{(t+1)}), v)$ of
the Walsh transform of function $(x^{(1)},\dots,x^{(t+1)}) \mapsto F(x^{(1)} + \dots + x^{(t+1)})$, that is,
\[ \sum_{(x^{(1)},\dots,x^{(t+1)}) \in (\mathbb{F}_2^n)^{t+1}} (-1)^{v \cdot F(x^{(1)} + \dots + x^{(t+1)}) \oplus u^{(1)} \cdot x^{(1)} \oplus \dots \oplus u^{(t+1)} \cdot x^{(t+1)}}, \]
which, by changing $x^{(1)}$ into $x^{(1)} + \dots + x^{(t+1)}$, equals
\[ \sum_{(x^{(1)},\dots,x^{(t+1)}) \in (\mathbb{F}_2^n)^{t+1}} (-1)^{v \cdot F(x^{(1)}) \oplus u^{(1)} \cdot x^{(1)} \oplus (u^{(1)} + u^{(2)}) \cdot x^{(2)} \oplus \dots \oplus (u^{(1)} + u^{(t+1)}) \cdot x^{(t+1)}}. \]
The rest is straightforward.
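The characterization of Proposition 187 can be verified exhaustively on a small case. The sketch below takes $F(x_1, x_2) = x_1 x_2$ (so $n = 2$, $m = 1$) and a 3-share realization ($t = 2$) that is assumed here for illustration. It uses the Walsh transform convention $W_F(u, v) = \sum_x (-1)^{v \cdot F(x) \oplus u \cdot x}$; all function names are hypothetical.

```python
from itertools import product


def dot(a, b):
    """Inner product of two bit tuples over F_2."""
    return sum(x & y for x, y in zip(a, b)) & 1


def f(x):
    """Unshared function F(x1, x2) = x1 x2."""
    return x[0] & x[1]


def shared_f(xs):
    """Assumed 3-share realization; xs = (x^(1), x^(2), x^(3)), each in F_2^2."""
    (a1, b1), (a2, b2), (a3, b3) = xs
    return ((a2 & b2) ^ (a2 & b3) ^ (a3 & b2),
            (a3 & b3) ^ (a1 & b3) ^ (a3 & b1),
            (a1 & b1) ^ (a1 & b2) ^ (a2 & b1))


POINTS = list(product((0, 1), repeat=2))   # the elements of F_2^2


def walsh_f(u, v):
    """W_F(u, v) for the unshared function."""
    return sum((-1) ** ((v & f(x)) ^ dot(u, x)) for x in POINTS)


def walsh_shared(us, v):
    """Walsh transform of the realization at ((u^(1), u^(2), u^(3)), (v, v, v))."""
    total = 0
    for xs in product(POINTS, repeat=3):
        out = shared_f(xs)
        exponent = v & (out[0] ^ out[1] ^ out[2])
        for u, x in zip(us, xs):
            exponent ^= dot(u, x)
        total += (-1) ** exponent
    return total


def proposition_187_holds():
    """Compare both sides of the characterization, with 2^(tn) = 16."""
    for us in product(POINTS, repeat=3):
        for v in (0, 1):
            expected = 16 * walsh_f(us[0], v) if us[0] == us[1] == us[2] else 0
            if walsh_shared(us, v) != expected:
                return False
    return True
```

Calling `proposition_187_holds()` runs the comparison over all 64 choices of $(u^{(1)}, u^{(2)}, u^{(3)})$ and both values of $v$.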

A necessary and sufficient condition for a function $\mathbf{G}$ to be the realization of some
function is then as follows:

Corollary 31 Given $\mathbf{G} = (G_1,\dots,G_{t+1}) : (\mathbb{F}_2^n)^{t+1} \to (\mathbb{F}_2^m)^{t+1}$, the function
$(G_1 + \dots + G_{t+1})(x^{(1)},\dots,x^{(t+1)})$ depends only on $x^{(1)} + \dots + x^{(t+1)}$ if and only if,
for every $u^{(1)},\dots,u^{(t+1)} \in \mathbb{F}_2^n$, $v \in \mathbb{F}_2^m$, $W_{\mathbf{G}}((u^{(1)},\dots,u^{(t+1)}), (v,\dots,v))$ equals zero
when the equalities $u^{(1)} = \dots = u^{(t+1)}$ are not all satisfied. Then we have that
$W_{\mathbf{G}}((u^{(1)},\dots,u^{(t+1)}), (v,\dots,v))$ is divisible by $2^{tn}$ for every $u^{(1)},\dots,u^{(t+1)}$.

Proof The condition that $W_{\mathbf{G}}((u^{(1)},\dots,u^{(t+1)}), (v,\dots,v))$ equals zero when the equalities
$u^{(1)} = \dots = u^{(t+1)}$ are not all satisfied is necessary, according to Proposition 187. It is
also sufficient since we have then, according to the inverse Walsh transform formula 2.43,
page 59:
\begin{align*}
2^{(t+1)n}\, (-1)^{v \cdot (G_1 + \dots + G_{t+1})(x^{(1)},\dots,x^{(t+1)})}
&= \sum_{(u^{(1)},\dots,u^{(t+1)}) \in (\mathbb{F}_2^n)^{t+1}} (-1)^{u^{(1)} \cdot x^{(1)} \oplus \dots \oplus u^{(t+1)} \cdot x^{(t+1)}}\, W_{\mathbf{G}}\big((u^{(1)},\dots,u^{(t+1)}), (v,\dots,v)\big) \\
&= \sum_{u \in \mathbb{F}_2^n} (-1)^{u \cdot (x^{(1)} + \dots + x^{(t+1)})}\, W_{\mathbf{G}}\big((u,\dots,u), (v,\dots,v)\big) \tag{12.3}
\end{align*}
and we have that, for every $v$, function $v \cdot (G_1 + \dots + G_{t+1})(x^{(1)},\dots,x^{(t+1)})$ depends then
only on $x^{(1)} + \dots + x^{(t+1)}$. It is easily seen that $(G_1 + \dots + G_{t+1})(x^{(1)},\dots,x^{(t+1)})$ depends
then only on $x^{(1)} + \dots + x^{(t+1)}$.
Using now Relation (12.3) with $x^{(2)} = \dots = x^{(t+1)} = 0_n$ and with $x$ instead of $x^{(1)}$
and applying the inverse Walsh transform formula to the resulting function of $x$, we have
\[ 2^{tn} \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot (G_1 + \dots + G_{t+1})(x, 0_n, \dots, 0_n) \oplus u \cdot x} = W_{\mathbf{G}}\big((u,\dots,u), (v,\dots,v)\big) \]
and this proves the divisibility property.

Correctness is a constraint on the realization $\mathbf{F}$, not really on $F$ itself. But in threshold
implementation, a second property is required for $\mathbf{F}$, and a third one is desired too; both also
put constraints on $F$:
• In a threshold implementation $\mathbf{F} = (F_1,\dots,F_{t+1})$, every function $F_j$ should be
independent of the $j$th coordinate of each $\mathbf{x}_i$ (in the sense that this $j$th coordinate should
not appear at all in the ANF of $F_j$), i.e. $F_j$ should be independent of the $j$-th share of $x$.
This property is called noncompleteness17 and implies that the output of $F_j$, individually,
is uncorrelated to any input variable $x_i$ (assuming, as in the subsections above, that any
vector of fewer than $t+1$ shares of $x_i$ is uncorrelated to $x_i$). Note that if $F$ has algebraic
degree at most $t$, then it is easy to build such $\mathbf{F}$: starting from the ANF of $F(x)$, replacing
each $x_i$ by $x_i^{(1)} \oplus \dots \oplus x_i^{(t+1)}$ and expanding, we obtain a sum of monomials in each
of which at least one upper index does not appear, and then, starting with $j = 1$ and
incrementing $j$ at each step, we store in $F_j$ all those monomials involving variables
whose upper indices are different from $j$ and that have not yet been stored. This procedure
guarantees both correctness and noncompleteness.
For instance, applying this method to the Boolean function $f(x) = x_1 x_2$ gives
\begin{align*}
f_1\big((x_1^{(1)}, x_1^{(2)}, x_1^{(3)}), (x_2^{(1)}, x_2^{(2)}, x_2^{(3)})\big) &= x_1^{(2)} x_2^{(2)} \oplus x_1^{(2)} x_2^{(3)} \oplus x_1^{(3)} x_2^{(2)} \\
f_2\big((x_1^{(1)}, x_1^{(2)}, x_1^{(3)}), (x_2^{(1)}, x_2^{(2)}, x_2^{(3)})\big) &= x_1^{(3)} x_2^{(3)} \oplus x_1^{(1)} x_2^{(3)} \oplus x_1^{(3)} x_2^{(1)} \\
f_3\big((x_1^{(1)}, x_1^{(2)}, x_1^{(3)}), (x_2^{(1)}, x_2^{(2)}, x_2^{(3)})\big) &= x_1^{(1)} x_2^{(1)} \oplus x_1^{(1)} x_2^{(2)} \oplus x_1^{(2)} x_2^{(1)}.
\end{align*}
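The three shares $f_1, f_2, f_3$ of $x_1 x_2$ given above are small enough to be checked exhaustively. The Python sketch below (function names are hypothetical) tests correctness and noncompleteness, and also tabulates the output distribution so that the uniformity condition discussed later in this subsection ($2^{t(n-m)} = 4$ preimages per preimage of $s(\mathbf{b})$) can be examined.

```python
from collections import Counter
from itertools import product


def f(x1, x2):
    """Unshared function f(x) = x1 x2."""
    return x1 & x2


def shared_f(a, b):
    """a = shares of x1, b = shares of x2; index 0 holds the share with upper index (1)."""
    f1 = (a[1] & b[1]) ^ (a[1] & b[2]) ^ (a[2] & b[1])
    f2 = (a[2] & b[2]) ^ (a[0] & b[2]) ^ (a[2] & b[0])
    f3 = (a[0] & b[0]) ^ (a[0] & b[1]) ^ (a[1] & b[0])
    return (f1, f2, f3)


TRIPLES = list(product((0, 1), repeat=3))


def flip(t, j):
    """Return the tuple t with its j-th bit complemented."""
    return t[:j] + (t[j] ^ 1,) + t[j + 1:]


def is_correct():
    """The XOR of the output shares equals f applied to the XOR of the input shares."""
    return all(f(a[0] ^ a[1] ^ a[2], b[0] ^ b[1] ^ b[2]) ==
               shared_f(a, b)[0] ^ shared_f(a, b)[1] ^ shared_f(a, b)[2]
               for a, b in product(TRIPLES, repeat=2))


def is_noncomplete():
    """Output share j must not change when the j-th share of x1 or of x2 is flipped."""
    for j in range(3):
        for a, b in product(TRIPLES, repeat=2):
            ref = shared_f(a, b)[j]
            if shared_f(flip(a, j), b)[j] != ref or shared_f(a, flip(b, j))[j] != ref:
                return False
    return True


def output_distribution():
    """Number of shared inputs mapped to each output triple."""
    return Counter(shared_f(a, b) for a, b in product(TRIPLES, repeat=2))


def is_uniform():
    """Each b needs 2^(t(n-m)) = 4 times as many preimages as s(b) has under f."""
    preimages_f = Counter(f(x1, x2) for x1, x2 in product((0, 1), repeat=2))
    dist = output_distribution()
    return all(dist[b] == 4 * preimages_f[b[0] ^ b[1] ^ b[2]] for b in TRIPLES)
```

Running `is_uniform()` on this sharing is a convenient way to see why uniformity is treated as a separate, harder requirement than correctness and noncompleteness.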
Noncompleteness can be characterized by the Walsh transform in the following way:

Proposition 188 Given a $((t+1)n, (t+1)m)$-function
\[ \mathbf{F} = (F_1,\dots,F_{t+1}) : \mathbf{x} \in (\mathbb{F}_2^n)^{t+1} \mapsto \mathbf{F}(\mathbf{x}) \in (\mathbb{F}_2^m)^{t+1}, \]
each function $F_j$ is independent of the $j$th coordinate of each $\mathbf{x}_i$ if and only if
\[ \forall j \in \{1,\dots,t+1\},\ \forall u^{(1)},\dots,u^{(t+1)} \in \mathbb{F}_2^n,\ \forall v^{(1)},\dots,v^{(t+1)} \in \mathbb{F}_2^m, \quad \big( v^{(k)} = 0_m,\ \forall k \ne j, \text{ and } u^{(j)} \ne 0_n \big) \Rightarrow \big( W_{\mathbf{F}}((u^{(1)},\dots,u^{(t+1)}), (v^{(1)},\dots,v^{(t+1)})) = 0 \big). \tag{12.4} \]

Indeed, according to the Walsh and inverse Walsh transform formulae, $F_j$ is independent
of $(x_1^{(j)},\dots,x_n^{(j)})$ if and only if, for every $(u^{(1)},\dots,u^{(t+1)}) \in (\mathbb{F}_2^n)^{t+1}$ such that
$u^{(j)} \ne 0_n$, we have $W_{F_j}(u^{(1)},\dots,u^{(t+1)}) = 0$.

17 In (univariate) HO-TI of order $d$, this property becomes: any combination of up to $d$ component functions of $\mathbf{F}$
must be independent of at least one input share.

Definition 86 We call $t$-mask (i.e., $(t+1)$-share) TI, or $t$-th order TI, of an
$(n,m)$-function any function $\mathbf{F}$ from $(\mathbb{F}_2^n)^{t+1}$ to $(\mathbb{F}_2^m)^{t+1}$ satisfying correctness and
noncompleteness.

We shall call $F$ the TI-masked function.


• The following property, called uniformity (of TI), is desired too but is often not included
in the very definition of TI: for every $\mathbf{b} = (b^{(1)}, b^{(2)},\dots,b^{(t+1)})$ in $(\mathbb{F}_2^m)^{t+1}$, the number
of $\mathbf{x}$ in $(\mathbb{F}_2^n)^{t+1}$ for which $\mathbf{F}(\mathbf{x}) = \mathbf{b}$ is equal to $2^{t(n-m)}$ times the number of $x$ in $\mathbb{F}_2^n$
for which $F(x) = s(\mathbf{b})$ (if $F$ is a permutation of $\mathbb{F}_2^n$ then this is equivalent18 to saying
that $\mathbf{F}$ is a permutation of $(\mathbb{F}_2^n)^{t+1}$). This property is needed to make sure that, if the
masking of the input to $\mathbf{F}$ is uniform, then the output of $\mathbf{F}$ is also a uniform masking of
the output of $F$. The uniformity property of a TI is then important when the output of the
TI is the input to another block (which is always the case in an iterative cryptographic
primitive such as a block cipher). If we use a nonuniform TI of a function, we need to add
sufficient refreshing (this is how HO-TI can be multivariate secure). Such a possibility is
used quite often in practice but is expensive.
Uniformity is the hard property among the three described above. The usual method
for trying to achieve it is by adding so-called correction terms to the output of TI when
they do not ensure uniformity; these are terms that are added in pairs to shares in a way
preserving noncompleteness. The terms in a pair canceling each other when the sum of
output shares is made, correctness is preserved as well. But this method is difficult and
has to be applied on each S-box; it is not really doable for infinite classes. How many
shares are needed for a uniform TI without extra randomness is also currently a question
without formal answer. The answer for (3, 3)-functions and (4, 4)-functions was given in
[86, 87] by exhaustive search.
In [904, theorem 1 and corollary 1], the authors observe that correctness, noncompleteness,
and the fact that the sharing at input is uniform suffice to ensure that
each of the output shares is statistically independent of the input variables and the
output variables and that the same holds for all intermediate results. Hence, if the
power consumption of each shared sub-circuit implementing one of the functions Fj is
independent of the other subcircuits, the implementation resists first-order SCA, even in
the presence of glitches. Uniformity ensures additionally that no more information than
a possible bias in the output distribution of the TI-masked function is provided. This uses
more random values during the setup and this is a disadvantage already acknowledged
in [903], but it does not need fresh randomness during the process. We shall specify
“TI with uniformity” when needed (that is, when the TI achieves uniformity without
additional fresh randomness).
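Uniformity can be characterized by means of the Walsh transforms of $\mathbf{F}$
and $F$ as well: the condition is equivalent to
\[ \sum_{\mathbf{x} \in (\mathbb{F}_2^n)^{t+1},\ \mathbf{v} \in (\mathbb{F}_2^m)^{t+1}} (-1)^{\mathbf{v} \cdot (\mathbf{F}(\mathbf{x}) + \mathbf{b})} = 2^{tn} \sum_{x \in \mathbb{F}_2^n,\ v \in \mathbb{F}_2^m} (-1)^{v \cdot (F(x) + b)}, \]
for every $\mathbf{b}$, where $\mathbf{v} = (v^{(1)},\dots,v^{(t+1)})$ and $b = s(\mathbf{b})$, that is,
\[ \forall \mathbf{b} \in (\mathbb{F}_2^m)^{t+1}, \quad \sum_{\mathbf{v} \in (\mathbb{F}_2^m)^{t+1}} (-1)^{\mathbf{v} \cdot \mathbf{b}}\, W_{\mathbf{F}}(0_{n(t+1)}, \mathbf{v}) = 2^{tn} \sum_{v \in \mathbb{F}_2^m} (-1)^{v \cdot s(\mathbf{b})}\, W_F(0_n, v). \]

18 If the output of $\mathbf{F}$ was shared in more shares than the input, it would be equivalent to saying that $\mathbf{F}$ is balanced.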

Algebraic degree of functions admitting a t-mask TI


A drawback of threshold implementation is that functions F of algebraic degree t can have
TI with at least t masks only19 (a necessary and sufficient condition for the existence of a
t-mask TI with or without uniformity is then that the algebraic degree be at most t). This has
been first observed in [903, theorem 1] (with incomplete statement and proof).

Proposition 189 Let F be any (n, m)-function admitting a t-mask (i.e., a (t + 1)-share) TI
with or without uniformity. Then dalg (F ) ≤ t.

Proof Consider the ANF of $F$:
\[ F(x) = \sum_{I \subseteq \{1,\dots,n\}} a_I\, x^I, \quad a_I \in \mathbb{F}_2^m, \]
where $x^I = \prod_{i \in I} x_i$. Let $\mathbf{F}$ be a $t$-mask TI of $F$. Because of correctness, the (unique) ANF
of the $((t+1)n, m)$-function $(s(\mathbf{F}))(\mathbf{x}_1,\dots,\mathbf{x}_n)$ is obtained by expanding:
\[ F(s(\mathbf{x}_1),\dots,s(\mathbf{x}_n)) = \sum_{I \subseteq \{1,\dots,n\}} a_I \prod_{i \in I} \big(x_i^{(1)} \oplus \dots \oplus x_i^{(t+1)}\big). \tag{12.5} \]
Suppose that $d_{alg}(F) \ge t+1$ and consider a monomial $x^I$ of degree $d_{alg}(F)$. Then
the ANF of $F(s(\mathbf{x}_1),\dots,s(\mathbf{x}_n))$ contains all the monomials of the form $\prod_{i \in I} x_i^{(j_i)}$, where
$j_i \in \{1,\dots,t+1\}$, with nonzero coefficients. Indeed, two distinct monomials $x^I$ and $x^{I'}$
of degree $d_{alg}(F)$ in the ANF of $F$ provide disjoint sets of monomials in the expansion
of (12.5), which cannot then cancel each other. Moreover, none of the monomials of the
form $\prod_{i=1}^{k} x_i^{(j_i)}$, where $i \mapsto j_i$ is onto $\{1,\dots,t+1\}$, can be obtained from $(s(\mathbf{F}))(\mathbf{x}_1,\dots,\mathbf{x}_n)$
because of noncompleteness. A contradiction.

Hence for instance, the inverse function $F(x) = x^{2^n-2}$ used in the AES, which has
algebraic degree $n-1$, cannot have an $(n-1)$-share (with $(n-2)$-masks) TI. A question
is: can it have an $n$-share (with $(n-1)$-masks) TI with uniformity? Many such questions
are open. For instance, recall that, for $n$ odd and $t = \frac{n+1}{2}$, any almost bent function
$F$ has algebraic degree at most $t$. Does any AB function have an $\frac{n+1}{2}$-mask TI with
uniformity?
Even for quadratic functions, there does not always exist a TI with uniformity of
minimum number of masks (that is, with two masks): see [87, corollaries 1 and 2]. In fact,
for any $t \ge 2$, we do not know how to characterize the functions for which a $t$-mask TI
exists, despite the theoretical results of [73].
This is a concern since the implementation cost of a function then increases exponentially
with its degree: according to Proposition 189, a monomial of degree $t$ results in the sum
of $(t+1)^t$ monomials. This drawback is bypassed by expressing (when it is possible)
high algebraic degree functions as the compositions of lower algebraic degree functions for
which TI can be found; see [86, 87, 724], and see also [321] and page 435 for a general
method (the CPRR method) addressing this problem. Then uniformity can be ensured
by introducing (when necessary) fresh randomness when making the composition of two
19 For multivariate first-order security, or td masks for univariate higher-order security.

threshold implementations, that is, by adding to the output of $\mathbf{F}$ a vector $\mathbf{c}$ such that
$s(\mathbf{c}) = 0_m$ and such that any subvector of $d$ components is random (this is sometimes called
re-masking). Note that when creating a masked implementation of a decomposition, the TI
of the lower-degree components have to be separated by a register stage to stop glitches
(and reducing the number of these register stages is needed). Of course, it is preferred to
minimize the number of the lower algebraic degree functions that are composed for giving
the S-box.
The TI of small S-boxes has been studied; see [86, 87, 113] and the references therein.
In particular, in [86] the authors designed the threshold implementation with at most four
masks for all (3, 3)-permutations and (4, 4)-permutations.20 But n ≤ 4 is interesting
for lightweight ciphers only. In [87], the authors studied APN (5, 5)-permutations (affine
equivalent to the AB power functions $x^3$, $x^5$, $x^7$, $x^{11}$ and $x^{15}$) and the sole known APN
(6, 6)-permutation by different methods, in particular by expressing them as the compositions of
quadratic functions (which needs remasking and does not provide a TI, properly speaking).
But designing TI with uniformity for these functions is an open question, as well as for
the multiplicative inverse differentially 4-uniform (8, 8)-function used in AES. Note that
the inverse function plays not only a role in relation with the AES, since we know (see
[331, 1184]) that any (n, n)-permutation can be expressed as the composition of functions
x → ax + b and of the inverse permutation.
An alternative approach for designing TI is, given some secondary construction of
functions, to deduce the TI of the built function from the TI of the used functions. In [1094],
the following construction of an $(n+1, n+1)$-function $H$ from two $(n,n)$-functions $F$ and
$G$ and two $n$-variable Boolean functions $f$ and $g$ is studied:
\[ H : (x, x_{n+1}) \in \mathbb{F}_2^n \times \mathbb{F}_2 \mapsto x_{n+1}\,(F(x), f(x)) + (x_{n+1} \oplus 1)\,(G(x), g(x)) \in \mathbb{F}_2^n \times \mathbb{F}_2. \]
Clearly, every $(n+1, n+1)$-function can be obtained this way. Note that $H$ is a permutation
if and only if it is injective, that is, if and only if $x \mapsto (F(x), f(x))$ and $x \mapsto (G(x), g(x))$
are injections (which is a necessary and sufficient condition for the fact that two distinct
inputs of the same form $(x, 0)$ or of the same form $(x, 1)$ do not give the same output) and
have disjoint value sets (which is a necessary and sufficient condition for the fact that an
input $(x, 0)$ and an input $(y, 1)$ do not give the same output). If $F$ and $G$ are permutations,
then the condition simplifies into $f \circ F^{-1} \oplus g \circ G^{-1} = 1$. It is then shown that if $F$ and
$f$ have $t$-mask TI with uniformity $\mathbf{F}$ and $\mathbf{f}$, and if $G(x_1,\dots,x_n)$ equals either $F(x_1,\dots,x_n)$
or $F(x_1,\dots,x_{i-1}, x_i \oplus 1, x_{i+1},\dots,x_n)$ for some $i = 1,\dots,n$, and $g$ is taken such that
$f \circ F^{-1} \oplus g \circ G^{-1} = 1$, then $H$ has a $t$-mask TI with uniformity. The idea of the
proof in the slightly more complex latter case is to decompose (F (x), f (x)) in the form
xi (Fi (x), fi (x)) + (xi ⊕ 1)(Fi (x), fi (x)), where (Fi (x), fi (x)) (resp. (Fi (x), fi (x))) is the
restriction of (F (x), f (x)) to the hyperplane of equation xi = 0 (resp. xi = 1) and to
observe that H (x, xn+1 ) equals (xn+1 ⊕ xi ⊕ 1)(Fi (x), fi (x)) + (xn+1 ⊕ xi )(Fi (x), fi (x)).
It would be nice if less restrictive cases within this general construction could be addressed.
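The permutation criterion for $H$ when $F$ and $G$ are permutations can be tested by brute force on small $n$. In the sketch below, functions are lookup tables indexed by integers encoding elements of $\mathbb{F}_2^n$; the helper names and the random-instance generator are assumptions of this example and are not taken from [1094].

```python
import random


def build_H(F, f, G, g, n):
    """Table of H: input (x, x_{n+1}) -> (F(x), f(x)) if x_{n+1} = 1, else (G(x), g(x))."""
    H = {}
    for x in range(2**n):
        H[(x, 1)] = (F[x], f[x])
        H[(x, 0)] = (G[x], g[x])
    return H


def is_permutation(H):
    """H maps a set of size 2^(n+1) to itself, so it is bijective iff injective."""
    return len(set(H.values())) == len(H)


def condition_holds(F, f, G, g, n):
    """Check f o F^{-1} xor g o G^{-1} = 1, for permutations F and G."""
    F_inv = {F[x]: x for x in range(2**n)}
    G_inv = {G[x]: x for x in range(2**n)}
    return all(f[F_inv[z]] ^ g[G_inv[z]] == 1 for z in range(2**n))


def random_instance(n, rng, satisfy_condition):
    """Random permutations F, G and Boolean functions f, g (hypothetical helper)."""
    size = 2**n
    F = rng.sample(range(size), size)
    G = rng.sample(range(size), size)
    f = [rng.randrange(2) for _ in range(size)]
    if satisfy_condition:
        F_inv = {F[x]: x for x in range(size)}
        g = [f[F_inv[G[x]]] ^ 1 for x in range(size)]   # forces g o G^{-1} = f o F^{-1} + 1
    else:
        g = [rng.randrange(2) for _ in range(size)]
    return F, f, G, g
```

With `satisfy_condition=True`, the function $g$ is derived from $f$, $F$ and $G$ exactly as the condition prescribes, so `is_permutation(build_H(...))` is expected to hold.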
In [105], the authors constructed (8, 8)-functions based on a Feistel network, a substitution
permutation network, or the (special case of) MISTY network [830], all using quadratic
4-bit S-boxes, which admit a TI while still maintaining a good level of
differential uniformity and nonlinearity.

20 All (2, 2)-permutations being affine, they do not need to be studied.

A recent general alternative technique, called the changing of the guards, and which
represents a nice step forward, has been presented in [402] and applied to the Keccak
S-box. It builds a (t + 1)-share threshold implementation with uniformity of any invertible
S-box layer of algebraic degree t, after a transformation (the S-box is subdivided into several
stages, separated by registers, and some shares receive additional components; see the details
in [402]); each share at the output of S-box i is made uniform by bitwise adding to it one or
two shares from the input of S-box i −1; this solves the problem of threshold implementation
with uniformity (but only after such transformation of the S-box). In a next important step
forward, a modification of the changing of the guards has been given in [1056] and applied to
the AES S-box, at the cost of a significant extension of the AES S-box design, but ensuring
in a nice way uniformity with 3-share TI (while Daemen’s method in [402], as is, needs
t + 1 shares, that is, 8 in the case of the AES S-box). The method includes a generic way
to construct a uniform sharing for any function, by changing (by extension and reduction)
the function to an invertible one while maintaining its essential functionality unchanged,21
which can be, in the case of AES S-box, decomposed into quadratic bijections. The extension
of an $(n,m)$-function $F$ is the $(n+m, n+m)$-permutation $(x, y) \mapsto (x, F(x) + y)$.
If $\mathbf{F}$ is a 3-share TI of $F$, then a 3-share TI with uniformity of the extension is, still
denoting $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$, the function $((x_1, y_1), (x_2, y_2), (x_3, y_3)) \mapsto
((x_2, F_1(\mathbf{x}) + y_3), (x_3, F_2(\mathbf{x}) + y_1), (x_1, F_3(\mathbf{x}) + y_2))$. The reduction of an $(n+m, n+m)$-
function $G$ is the $(n+m, n+m)$-function $(x, y) \mapsto (0_n, y)$. If $\mathbf{G}$ is a 3-share TI of $G$, then
a 3-share TI with uniformity of the reduction is the function $((x_1, y_1), (x_2, y_2), (x_3, y_3)) \mapsto
((x_2 + x_3, y_2), (x_3, y_3), (x_2, y_1))$. Composing these two 3-share TI provides a 3-share TI with
uniformity of $(x, y) \mapsto (0_n, F(x) + y)$. See more in [1056].

Invariance of the existence of a t-mask TI


The existence of a $t$-mask TI of $F(x)$ is invariant when changing $F(x)$ into $F(x + a)$ (by
changing $\mathbf{F}(\mathbf{x})$ into $\mathbf{F}(\mathbf{x} + \mathbf{a})$ for some $\mathbf{a}$ such that $s(\mathbf{a}) = a$, which also preserves uniformity),
and it is also invariant under linear equivalence $F \sim L' \circ F \circ L$ (which preserves uniformity
as well), as observed (and proved rather informally) in [86, theorem 2]. Indeed, applying
$L$ on each share of $\mathbf{x}$ and $L'$ on each share of the output of $\mathbf{F}$ preserves correctness since
$L$ and $L'$ are linear, and it preserves uniformity since $L$ and $L'$ are bijective, and it also
preserves noncompleteness, since the applications of $L$ and $L'$ are made separately on each
share. Hence the existence of a $t$-mask TI is an affine invariant, as well as the existence
of a t-mask TI with uniformity. And the existence of a t-mask TI is also invariant under
adding an affine function but it is not clear whether this preserves uniformity (in other words,
affine equivalence preserves the existence of a uniform TI, but extended affine equivalence
probably does not).

Remark. Given a permutation $F$ having a $t$-mask TI, function $F^{-1}$ does not necessarily
have a $t$-mask TI, since there are quadratic functions having 2-mask TI and whose inverses
are not quadratic22 and therefore do not have a 2-mask TI, according to Proposition 189.
In particular, if $\mathbf{F}$ is a $t$-mask TI of $F$, function $\mathbf{F}^{-1}$ is not necessarily a $t$-mask TI of $F^{-1}$.
This is because the condition “for every $j = 1,\dots,t+1$, the $j$th coordinate function of $\mathbf{F}$
is independent of the $j$th coordinate of each $\mathbf{x}_i$” is not equivalent to the condition “for every
$j = 1,\dots,t+1$, the $j$th coordinate function of $\mathbf{F}^{-1}$ is independent of the $j$th coordinate
of each $\mathbf{x}_i$.” This subtle difference can be more easily seen with the characterization by
Condition (12.4), which is not the same when applied to $\mathbf{F}$ and to its inverse: the hypothesis
“$v_k = 0_n$, $\forall k \ne j$ and $u_j \ne 0_n$” of the implication, when it is applied to $\mathbf{F}^{-1}$, becomes
“$u_k = 0_n$, $\forall k \ne j$, and $v_j \ne 0_n$,” since we know that changing a function into its inverse
corresponds for the Walsh transform to swapping the parameter(s) living in the domain and
the one (or those) living in the codomain, the same value of the Walsh transform being then
kept for this new input.

21 So the success of the method does not contradict Proposition 189.

3. A more recent way to protect against SCA in the presence of glitches is domain oriented
rather than function oriented; it organizes properly the ISW computations (we described
above the methods based on decompositions of polynomials and related masking) and
implements the concept of share domains (keeping each domain independent from the
others). Each share of a variable is associated with one share domain. This method
is called domain-oriented masking (see [574, 575]) and is an alternative to threshold
implementation requiring less chip area and less randomness, all the more when raising
the protection order.

12.1.5 Linear complementary dual codes and complementary pairs of codes used for direct sum masking

Direct sum masking has a weaker relationship with Boolean functions than with codes. We
briefly describe it, however, since a large number of recent papers deal with the related
notions of linear complementary dual codes and complementary pairs of codes, and also,
because of the dual distance of codes playing a central role, correlation immune functions
are closely related.
The impact of codes on protection against fault injection attacks is well studied; the
number of detected faults relates to their minimum distance. The (explicit) use of codes
for protecting against SCA is more recent. The direct sum masking (DSM) countermeasure
[131, 288] is a generalization of Boolean masking consisting in
• Encoding (see the definition at pages 5 and 7) the sensitive data, say $x$, that we consider
here as living in $\mathbb{F}_2^k$, into a codeword of a $k$-dimensional linear subcode $C$ of $\mathbb{F}_2^n$
• Encoding the mask $y$ drawn at random from $\mathbb{F}_2^{n-k}$ into a codeword of an $(n-k)$-
dimensional linear subcode $D$ of $\mathbb{F}_2^n$

The masked version of $x$ equals then the sum of these two codewords. This is only a first-
order masking scheme in terms of probing security, if there is a reuse of the mask.
If $G$ is a generator matrix of $C$ and $G'$ a generator matrix of $D$, we take then
\[ z = x \times G + y \times G'; \quad x \in \mathbb{F}_2^k,\ y \in \mathbb{F}_2^{n-k}. \tag{12.6} \]

22 This should be checked among quadratic functions with known 2-TI, though.

For allowing the final demasking at the end of the computation, it must be possible to recover $x$ from $z$ (but to avoid leaks, the algorithm should not include a computation of $x$, unless it has reached its end). This means that $C$ and $D$ must have trivial intersection, that is, be supplementary:23
$$\mathbb{F}_2^n = C \oplus D.$$
Every vector $z \in \mathbb{F}_2^n$ can then be written in a unique way as in (12.6). Note that this provides the possibility of removing the mask without knowing it.
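The encoding (12.6) and the unique decomposition of $z$ can be illustrated by a minimal Python sketch. It is only an illustration under assumptions that are not in the text: the matrices `G` and `Gp` (standing for $G'$) are a small hand-picked pair with $n = 4$, $k = 2$, and the helper names `vec_mat`, `inv_f2`, `dsm_mask` and `dsm_unmask` are hypothetical. Demasking is done here by inverting the stacked matrix whose rows are those of $G$ followed by those of $G'$, which is invertible exactly when $C$ and $D$ are supplementary.

```python
from itertools import product

def vec_mat(v, M):
    """Row vector v times matrix M over F_2."""
    return [sum(vi & row[j] for vi, row in zip(v, M)) % 2 for j in range(len(M[0]))]

def inv_f2(M):
    """Inverse of a square matrix over F_2 by Gauss-Jordan elimination."""
    n = len(M)
    A = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col]), None)
        if pivot is None:
            raise ValueError("singular matrix: C and D are not supplementary")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col and A[r][col]:
                A[r] = [a ^ b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

def dsm_mask(x, y, G, Gp):
    """z = x*G + y*G' over F_2, as in (12.6)."""
    return [a ^ b for a, b in zip(vec_mat(x, G), vec_mat(y, Gp))]

def dsm_unmask(z, G, Gp):
    """Recover x from z without knowing y, using F_2^n = C (+) D."""
    xy = vec_mat(z, inv_f2(G + Gp))  # z = (x, y) * [G; G'], so (x, y) = z * [G; G']^{-1}
    return xy[:len(G)]

# Illustrative pair (an assumption of this sketch): z = (x + y, y) with k = 2.
G = [[1, 0, 0, 0], [0, 1, 0, 0]]
Gp = [[1, 0, 1, 0], [0, 1, 0, 1]]

# Every (x, y) gives back x after demasking.
all_ok = all(dsm_unmask(dsm_mask(x, y, G, Gp), G, Gp) == list(x)
             for x in product((0, 1), repeat=2)
             for y in product((0, 1), repeat=2))
```

The pair chosen here is first-order Boolean masking; any other pair of generator matrices whose stacked matrix is invertible could be substituted.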
As mentioned above, $d$-th order masking is a particular case of DSM: we have then $n = (d + 1)k$, $C = \mathbb{F}_2^k \times \{0_{dk}\}$, where $0_{dk}$ is the all-zero vector of length $dk$, $G = [I_{k,k} : 0_{k,dk}]$, where $I_{k,k}$ is the $k \times k$ identity matrix and $0_{k,dk}$ is the $k \times dk$ all-zero matrix, $D = \{(y_0, \dots, y_d);\ y_i \in \mathbb{F}_2^k,\ \sum_{i=0}^{d} y_i = 0_k\}$ and, for instance:
$$G' = \begin{bmatrix}
I_{k,k} & I_{k,k} & 0_{k,k} & \cdots & 0_{k,k} \\
I_{k,k} & 0_{k,k} & I_{k,k} & \cdots & 0_{k,k} \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
I_{k,k} & 0_{k,k} & \cdots & \cdots & I_{k,k}
\end{bmatrix}.$$
But an advantage of DSM is that, when C and D are properly chosen, it can be also a
countermeasure against FIA (while classical masking cannot), which helps reduce the cost
of the overall countermeasure against SCA and FIA.
A pair $(C, D)$ of supplementary codes is called a linear complementary pair (LCP) of codes. It is shown in [131] that if the monovariate leak $L$ (which is a pseudo-Boolean function) has numerical degree 1, the encoding with an LCP of codes $(C, D)$ as described above protects against $d$-th order HO-SCA if and only if the dual distance of $D$ satisfies $d(D^\perp) > d$. Moreover, as shown in [960, section 3.2], $d$-th order bit-probing security is then ensured because, by definition, fewer than $d(D^\perp)$ of the equations expressing the coordinates of $z = x \times G + y \times G'$ by means of the coordinates of $x$ and $y$ do not allow eliminating the coordinates of $y$, since any set of fewer than $d(D^\perp)$ columns of $G'$ is linearly independent. The encoding protects against the injection of faults of Hamming weight at most $d$ if and only if the minimum distance of $C$ satisfies $d(C) > d$. Note that when the encoding is made over bits, the ensured security is the so-called bit-probing security, whose order can be higher than the probing security order when the attacker can probe symbols belonging to a larger alphabet.
According to the observations above, the security parameter against both HO-SCA and
FIA is min{d(C), d(D ⊥ )} − 1. But taking this minimum as the sole security parameter
supposes that the orders of needed protection against SCA and FIA are comparable. This
is not always the case. In a safety context like autonomous trains and cars, it is crucial to ensure the detection of faults, and side-channel leakage is a lesser risk, while in the context of the internet of things (IoT) or banking, minimizing side-channel leakage is a primary objective. We must then take the pair $(d(C) - 1, d(D^\perp) - 1)$ as security parameter.
In the case of Boolean masking, the codes seen above, $C = \mathbb{F}_2^k \times \{0_{dk}\}$ and $D = \{(y_0, \dots, y_d);\ y_i \in \mathbb{F}_2^k,\ \sum_{i=0}^{d} y_i = 0_k\}$, satisfy $d(C) = 1$ and $d(D^\perp) = d + 1$ (since, as we saw in a remark at page 17, the dual distance of a linear code equals the minimum nonzero number of linearly dependent columns in its generator matrix). This confirms that Boolean masking protects against SCA but not FIA.

23 We prefer using this term rather than complementary, which is ambiguous; we use the classical notation ⊕ to denote such a direct sum, which should not be confused with XOR.
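The values $d(C) = 1$ and $d(D^\perp) = d + 1$ can be checked by brute force on a small instance. The sketch below is only an illustration: the parameters $k = 1$, $d = 2$ and the helper names `span`, `dual`, `min_distance` and `security_parameters` are assumptions of this example, and exhaustive enumeration is practical only for very short codes.

```python
from itertools import product

def span(G):
    """All codewords of the binary linear code generated by the rows of G."""
    n = len(G[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(G)):
        w = [0] * n
        for c, row in zip(coeffs, G):
            if c:
                w = [a ^ b for a, b in zip(w, row)]
        words.add(tuple(w))
    return words

def dual(code, n):
    """Dual code, by exhaustive search over F_2^n."""
    return {v for v in product((0, 1), repeat=n)
            if all(sum(a & b for a, b in zip(v, c)) % 2 == 0 for c in code)}

def min_distance(code):
    """Minimum Hamming weight of a nonzero codeword of a linear code."""
    return min(sum(w) for w in code if any(w))

def security_parameters(G, Gp):
    """Return the pair (d(C) - 1, d(D^perp) - 1) for the codes generated by G and G'."""
    n = len(G[0])
    C, D = span(G), span(Gp)
    return min_distance(C) - 1, min_distance(dual(D, n)) - 1

# Boolean masking with k = 1 and d = 2 (three shares): C = F_2 x {00},
# D = words of even weight.
G = [[1, 0, 0]]
Gp = [[1, 1, 0], [1, 0, 1]]
params = security_parameters(G, Gp)  # (fault-detection order, SCA order)
```

With these matrices the pair is $(0, 2)$: second-order protection against SCA and no fault detection, as stated above.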
If D equals the dual C ⊥ of C in an LCP of codes, then C and D are so-called linear
complementary dual (LCD) codes. Such codes are well adapted to the cases where the need
for protection is the same for SCA and FIA; the security parameter of an LCD code C when
used in so-called orthogonal direct sum masking (ODSM) [131, 288] is simply d(C) − 1.
The notion of LCD code predates DSM. In [1134], Yang and Massey introduced it as an optimal linear coding solution for a rather particular problem: the two-user binary adder channel. They provided a necessary and sufficient condition under which a cyclic code is LCD. In [75], Bhasin et al. have also shown how to use and implement LCD codes and LCPs of codes to strengthen encoded circuits against hardware Trojan horses while minimizing the cost.
Note that $D = C^\perp$ if and only if $G'$ is a parity check matrix of $C$, that is, $G \times G'^t = 0_{k,n-k}$, where $G'^t$ is the transposed matrix of matrix $G'$. We can then denote $G'$ by $H$ and use an orthogonal projection to recover $x$ and $y$ from $z$: the relation $z = x \times G + y \times H$ implies $z \times H^t = y \times H \times H^t$ and $z \times G^t = x \times G \times G^t$, and this provides $x$ and $y$ since “$C$ is LCD,” “the matrix $H \times H^t$ is invertible,” and “the matrix $G \times G^t$ is invertible” are equivalent.
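The orthogonal projection can be sketched as follows. This is an illustration only: the $[4, 2]$ code generated by `G` below, its parity check matrix `H`, and the helper names are assumptions of the example, chosen by hand so that $G \times H^t = 0$ and $G \times G^t$ is invertible over $\mathbb{F}_2$.

```python
def mat_mul(A, B):
    """Matrix product over F_2."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inv_f2(M):
    """Inverse of a square matrix over F_2 by Gauss-Jordan elimination."""
    n = len(M)
    A = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col]), None)
        if pivot is None:
            raise ValueError("singular matrix: the code is not LCD")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col and A[r][col]:
                A[r] = [a ^ b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

def odsm_mask(x, y, G, H):
    """z = x*G + y*H over F_2."""
    xg, yh = mat_mul([x], G)[0], mat_mul([y], H)[0]
    return [a ^ b for a, b in zip(xg, yh)]

def odsm_recover(z, G, H):
    """x = z*G^t*(G*G^t)^{-1} and y = z*H^t*(H*H^t)^{-1}."""
    Gt, Ht = transpose(G), transpose(H)
    x = mat_mul(mat_mul([z], Gt), inv_f2(mat_mul(G, Gt)))[0]
    y = mat_mul(mat_mul([z], Ht), inv_f2(mat_mul(H, Ht)))[0]
    return x, y

# Hand-picked LCD example (an assumption of this sketch).
G = [[1, 0, 1, 1], [0, 1, 1, 1]]
H = [[1, 1, 1, 0], [1, 1, 0, 1]]
orthogonal = mat_mul(G, transpose(H)) == [[0, 0], [0, 0]]

z = odsm_mask([1, 0], [0, 1], G, H)
recovered = odsm_recover(z, G, H)  # both the data and the mask
```

Recovering $y$ as well as $x$ is what allows checking, at the end of a computation, that the mask has not been altered by a fault.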
Since the introduction of DSM, and the investigation of numerous constructions of LCD codes in [288], many papers have studied constructions of such codes and of LCPs of codes; see a description in [295], where a problem is also described: faults are detected by verifying that masks have not been altered during processing; checking this requires having access to the masks. A possibility is to mask with $z = (x \times G + y \times G', y)$ instead of $z = x \times G + y \times G'$. The leak is then modified. Some MDS codes can keep the same protection ability when the encoding is changed in this way (the code of generator matrix $[G' : I_{n-k}]$, where $I_{n-k}$ is the identity matrix, can have the same dual distance as the code of generator matrix $G'$, which can be MDS).

Remark. A particular case24 of DSM25 is inner product masking (IPM), whose principle for masking a sensitive data $x \in \mathbb{F}_{2^n}$ is to generate a vector over $\mathbb{F}_{2^n}$ whose inner product with some public vector equals $x$; see [44, 45, 46, 491] (see also [289, 366, 960]; the latter reference expresses the side-channel resistance of IPM in terms of, classically, the minimum distance of $D^\perp$ and, less classically, the first nonzero coefficient in its weight enumerator). With the public vector $(1, l_1, \dots, l_{n-1})$, since we want $x = (x \times G + y \times G') \cdot (1, l_1, \dots, l_{n-1})$; $x \in \mathbb{F}_{2^n}$, $y \in \mathbb{F}_{2^n}^{n-1}$ as explained above, we can take (as shown in [960]) $G = (1, 0, \dots, 0)$ (and $C$ has then minimum distance 1, which does not allow detecting FIA) and $y \times G' = (y \cdot (l_1, \dots, l_{n-1}), y_1, \dots, y_{n-1})$, and code $D$ has then the generator matrix
$$\begin{bmatrix}
l_1 & 1 & 0 & \cdots & 0 \\
l_2 & 0 & \ddots & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
l_{n-1} & 0 & \cdots & 0 & 1
\end{bmatrix}$$
(the masked information is $z = (x + \sum_{i=1}^{n-1} l_i y_i, y_1, \dots, y_{n-1})$, and we have $x = z \cdot (1, l_1, \dots, l_{n-1})$). Code $D^\perp$ has the generator matrix $(1, l_1, \dots, l_{n-1})$ (and the order of protection against SCA equals the Hamming weight of $(l_1, \dots, l_{n-1})$).

24 With a practical difference, though: IPM works over field elements, while DSM often works over bits; hence probing security may not mean the same in both cases. Nevertheless, it is proved in [960] that if the deterministic leaks of the shares are linear functions of their bits, the bounded moment security order of the IPM is equal to the probing security order of the bitwise encoding obtained by decomposing the elements of $\mathbb{F}_{2^n}$ over a basis over $\mathbb{F}_2$.
25 And also of leakage squeezing, which we saw at page 431, and which is also a particular case of DSM, but only when F is linear.
IPM also contains Boolean masking as a particular case (take (l1 , . . . , ln−1 ) = 1n−1 ) as
well as the methods of masking using secret sharing [974] (see Relation (3.43), page 146).
It has been recently modified in [365], so as to allow fault injection detection as well.
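The masking and demasking formulas of the remark can be tried out with the following sketch. Several choices are assumptions made for illustration and do not come from the text: the field is $\mathbb{F}_{2^8}$ represented modulo $X^8 + X^4 + X^3 + X + 1$, there are three shares (so the field degree and the number of shares are chosen independently here), the public coefficients in `l` are arbitrary, and Python's `random` module stands in for a proper source of masks.

```python
import random
from functools import reduce

def gf_mul(a, b, poly=0x11B, bits=8):
    """Multiplication in F_{2^8} = F_2[X]/(X^8 + X^4 + X^3 + X + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> bits:
            a ^= poly
    return r

def inner(u, v):
    """Inner product of two vectors over F_{2^8}."""
    return reduce(lambda acc, t: acc ^ gf_mul(t[0], t[1]), zip(u, v), 0)

def ipm_mask(x, l, rng):
    """z = (x + sum_i l_i*y_i, y_1, ..., y_{n-1}) with random masks y_i."""
    y = [rng.randrange(256) for _ in l]
    return [x ^ inner(l, y)] + y

def ipm_unmask(z, l):
    """x = z . (1, l_1, ..., l_{n-1})."""
    return inner([1] + list(l), z)

rng = random.Random(2020)   # illustrative only, not a secure mask generator
l = [0x02, 0x53]            # hypothetical public vector (l_1, l_2)
z = ipm_mask(0xA7, l, rng)
x_back = ipm_unmask(z, l)

# With l = (1, ..., 1), the shares simply XOR to x (Boolean masking).
zb = ipm_mask(0xA7, [1, 1], rng)
x_bool = zb[0] ^ zb[1] ^ zb[2]
```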

An important issue is to compute (in particular, multiply) efficiently over encodings. In the case of IPM, solutions exist [44, 46], but for DSM, it remains an open challenge, except in a simplified framework addressed in [260].

12.1.6 Robust codes, AMD codes and vectorial functions


In many cases of error detection, the assumption that the most probable errors have low
Hamming weight cannot be guaranteed. As shown successively in [664, 666, 667], the
classical method of error detection by codes having large enough minimum distance is
then often not efficient. In fact, it is in many cases almost impossible to predict the error
patterns (e.g., in the case of address decoder errors, or with power glitches, or when data are
compressed for transmission and decompressed after). For instance, the error characteristics
in silicon devices like memories are changing and in many instances can be unpredictable.
This is all the more true as embedded systems become ubiquitous and their roles become more mission critical for sensitive applications. Taking again the example of memories, embedded ones are exposed to unpredictable environments (for instance, when moved in a plane from sea level, where cosmic rays are weak, to high altitude, where the error rate can be large) and their reliability is now a matter of critical importance and safety. This situation of unpredictability is similar to FIA on hardware implementations of
cryptographic algorithms (as also observed in the references above): classical methods of
error detection are ineffective when the error distribution within a device is controlled by an
adversary. For instance, when an attacker induces stress (see [82]) resulting in bits vanishing
one after the other in the processed data, this results in the injection of an error on each
bit that was equal to 1; when increasing the stress, the error distribution turns to almost
uniform. Classical codes (and the codes seen in Subsection 12.1.5) may fail then to detect
errors, when the adversary succeeds in producing an error changing a correct codeword into
a wrong codeword. The worst error-masking probability is the maximal probability that
a given error e transforms a codeword into a codeword. In the case of linear codes, the
undetectable errors are the codewords themselves, and the attacker only needs, for his or her
error injection, to know the code that is used in the device and to be able to inject codewords
as errors. The worst error-masking probability being then equal to 1 (the worst possible value), linear codes are not suited to minimizing the worst error-masking probability.

Codes robust against fault injection with unknown error probability


In [665, 667, 717], the notion of robust code, aiming at providing uniform protection against all errors, without any assumption on the error distribution or on the capabilities of an attacker, has been presented:

Definition 87 Given a positive integer $R$, an unrestricted (i.e., not necessarily linear) code $C \subset \mathbb{F}_q^n$ is called $R$-robust if the size of the intersection of $C$ and any of its translates $e + C$, where $e \in \mathbb{F}_q^n$, $e \neq 0_n$, is bounded above by $R$. Given a code $C$, the smallest possible value of $R$ having this property shall be denoted by $R_C$:
$$R_C = \max_{0_n \neq e \in \mathbb{F}_q^n} |C \cap (e + C)|. \tag{12.7}$$

A binary R-robust code C of length n with M = |C| is denoted by a triple (n, M, R).

The code can be systematic (recall that this means there exists a subset $I$ of positions in codewords, called an information set of $C$, such that every possible tuple in $\mathbb{F}_q^I$ occurs in exactly one codeword within the specified coordinates $x_i$; $i \in I$; this implies that $M = q^{|I|}$). Systematic codes are more practical for error detection in computer hardware thanks to the separation between information bits and check bits. The code equals then, up to a permutation of the codeword coordinates, the graph of a function, $\{(x, F(x));\ x \in \mathbb{F}_q^I\}$, for some (not necessarily linear) function $F$. But we shall see that the code cannot then be perfect robust.
The probability of missing an algebraic manipulation with a code $C$ equals the so-called probability of error masking, which for each possible error $e$ is denoted by $Q(e)$ and is defined as
$$Q(e) = \frac{|C \cap (e + C)|}{|C|}. \tag{12.8}$$
The worst error-masking probability $\max_{e \neq 0_n} Q(e)$ equals then $\frac{R_C}{|C|}$. As observed in [666], we have $\max_{e \neq 0_n} Q(e) \geq \frac{|C| - 1}{q^n - 1}$ (with equality if and only if the code is uniformly robust; see below) for any code of length $n$ over $\mathbb{F}_q$ (which is easily shown by using that the maximum of a sequence of values is always larger than or equal to the arithmetic mean, and equals it if and only if the sequence is constant, and observing that $\sum_{e \neq 0_n} Q(e) = \frac{2}{|C|} \binom{|C|}{2} = |C| - 1$); a little more can be shown by using that the numerator in (12.8) is an integer. Note that there is a slight error in [667] about this result: it is written that for every code, we have $Q(e) \geq \frac{|C| - 1}{q^n - 1}$ for every $e \neq 0_n$, which is false (suppose that the minimum distance of the code is larger than 1 and take $e$ of Hamming weight smaller than the minimum distance).
A code is called a robust code if its worst error-masking probability is strictly less than 1, and it is called a uniformly robust code (or perfect robust code) if $Q(e)$ is constant for $e \neq 0_n$, that is, $Q(e) = \frac{|C| - 1}{q^n - 1}$, $\forall e \neq 0_n$ (note that the minimum distance of such a code is necessarily 1). This is equivalent to saying that $C$ is a difference set in $(\mathbb{F}_q^n, +)$, that is, in the case $q = 2$ and assuming that $C$ is neither equal to $\{0_n\}$ nor to $\mathbb{F}_q^n$, that the indicator function of $C$ is bent.
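These definitions are easy to experiment with for small binary codes. The sketch below is a hedged illustration: vectors of $\mathbb{F}_2^4$ are represented as integers (an encoding chosen for this example), the helper name `error_masking` is hypothetical, and the two codes are a small linear code and the support of the bent function $f(x, y) = x \cdot y$ with $x, y \in \mathbb{F}_2^2$.

```python
from fractions import Fraction

def error_masking(code, n):
    """Q(e) = |C ∩ (e + C)| / |C| for every nonzero e in F_2^n (vectors as ints)."""
    size = len(code)
    return {e: Fraction(sum((c ^ e) in code for c in code), size)
            for e in range(1, 2 ** n)}

n = 4

# A linear code: every nonzero codeword is an undetectable error.
linear_code = {0b0000, 0b0011, 0b1100, 0b1111}
q_linear = error_masking(linear_code, n)
worst_linear = max(q_linear.values())

# Support of the bent function f(x, y) = x . y, with x the two high bits
# and y the two low bits of the integer v.
bent_support = {v for v in range(2 ** n)
                if bin((v >> 2) & (v & 0b11)).count("1") % 2 == 1}
q_bent = error_masking(bent_support, n)

# Lower bound (|C| - 1) / (2^n - 1), met with equality by uniformly robust codes.
bound = Fraction(len(bent_support) - 1, 2 ** n - 1)
```

Here `worst_linear` is 1, while `q_bent` takes the single value $\frac{1}{3}$, which coincides with `bound`.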

Robustness and worst error-masking probability for supports of Boolean functions and graphs of vectorial functions
1. Let $C$ be the support of an $n$-variable Boolean function. Then, if we denote by $\Delta$ the symmetric difference between two sets, we have that $|C \,\Delta\, (e + C)| = 2|C| - 2|C \cap (e + C)|$ for every $e \in \mathbb{F}_2^n$, and therefore $|C \cap (e + C)| = |C| - \frac{1}{2}|C \,\Delta\, (e + C)|$. Let us revisit the case where the function is bent: we know then that $|C| = 2^{n-1} + (-1)^{\epsilon} 2^{\frac{n}{2}-1}$ for some $\epsilon \in \mathbb{F}_2$, and $|C \,\Delta\, (e + C)| = 2^{n-1}$ for every $e \neq 0_n$; this gives $|C \cap (e + C)| = 2^{n-2} + (-1)^{\epsilon} 2^{\frac{n}{2}-1}$, $Q(e) = \frac{2^{n-2} + (-1)^{\epsilon} 2^{\frac{n}{2}-1}}{2^{n-1} + (-1)^{\epsilon} 2^{\frac{n}{2}-1}}$ (equal to the optimum $\frac{2^{n-1} + (-1)^{\epsilon} 2^{\frac{n}{2}-1} - 1}{2^n - 1}$), and $C$ is uniformly robust (but cannot be systematic since its size is not a power of 2). In [666] (and the references therein), it has been proposed for $f$ the basic Maiorana–McFarland function $f(x, y) = x \cdot y$; $x, y \in \mathbb{F}_2^{n/2}$, $n$ even, but any (binary) bent function would behave the same. In this same reference, it is also proposed to take $C = \{(x, y) \in (\mathbb{F}_q^{n/2})^2;\ x \cdot y = u\}$, where $q$ is a power of a prime and $n$ is even. This improves $Q(e)$ in some cases (with a different value according to whether $u$ is zero or not). This same reference also investigates codes that are the unions of the codes $\{(x, y) \in (\mathbb{F}_q^{n/2})^2;\ x \cdot y = u\}$ for some values of $u$.
2. Let C be now systematic, i.e., the graph of a vectorial function. We have, by slightly
completing [717]:

Proposition 190 Let $C = \{(x, F(x)),\ x \in \mathbb{F}_2^k\}$ be the graph of a vectorial function $F$ from $\mathbb{F}_2^k$ to $\mathbb{F}_2^r$, with $k$ and $r$ nonnegative. The worst error-masking probability of $C$ equals the differential uniformity of $F$ divided by $2^k$. It is then bounded below by $2^{-r}$ and equals this optimum if and only if $F$ is perfect nonlinear.

Indeed, denoting $e = (a, b)$, we have
$$|C \cap (e + C)| = \left| \left\{ (x, y) \in (\mathbb{F}_2^k)^2;\ \begin{array}{l} x = y + a \\ F(x) = F(y) + b \end{array} \right\} \right| = \left| (D_a F)^{-1}(b) \right|.$$
For $a = 0_k$ and $b \neq 0_r$, this size is null and $\max_{e \neq 0_{k+r}} |C \cap (e + C)|$ equals then the differential uniformity of $F$ (see Definition 40, page 135). We know from the bound due to Nyberg that it is then bounded below by $2^{k-r}$, with equality if and only if $F$ is perfect nonlinear (in which case the derivatives are balanced). Since $C$ has size $2^k$, this gives the result.
Note that for this code, the value $\max_{e \neq 0_{k+r}} Q(e)$, equal to $2^{-r}$, is larger than $\frac{|C| - 1}{q^n - 1} = \frac{2^k - 1}{2^n - 1}$ (no systematic code can be perfect robust).
Note also that, according to Nyberg’s result26 (Proposition 104, page 269), the best codes $C$ from Proposition 190 can exist only if $k$ is even and $r \leq \frac{k}{2}$, that is, the length $n = k + r$ and the dimension $k$ satisfy $n \leq \frac{3k}{2}$ (i.e., their transmission rate is at least $\frac{2}{3}$). If this double condition is not satisfied, we can take $F$ almost perfect nonlinear, and the worst error-masking probability of $C$ is then $2^{-r+1}$.

26 The situation is different in odd characteristic; then, PN (n, n)-functions exist for every n.
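Proposition 190 can be checked numerically on a small almost perfect nonlinear function. The sketch below is an illustration under assumptions of this example: it takes $k = r = 3$, represents $\mathbb{F}_{2^3}$ modulo $X^3 + X + 1$, and uses the cube function $F(x) = x^3$, a standard APN example; the helper names are hypothetical.

```python
from fractions import Fraction

K = R = 3

def gf8_mul(a, b):
    """Multiplication in F_{2^3} = F_2[X]/(X^3 + X + 1)."""
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= 0b1011
    return r

def F(x):
    """Cube function x -> x^3 over F_{2^3}."""
    return gf8_mul(x, gf8_mul(x, x))

def differential_uniformity(f, k, r):
    """max over a != 0 and b of the number of x with f(x + a) + f(x) = b."""
    return max(sum(f(x ^ a) ^ f(x) == b for x in range(2 ** k))
               for a in range(1, 2 ** k) for b in range(2 ** r))

# Systematic code: the graph of F, each codeword (x, F(x)) packed in one integer.
graph = {(x << R) | F(x) for x in range(2 ** K)}

# R_C = max over nonzero e of |C ∩ (e + C)|, as in (12.7).
r_c = max(sum((c ^ e) in graph for c in graph) for e in range(1, 2 ** (K + R)))

delta = differential_uniformity(F, K, R)
worst_q = Fraction(r_c, len(graph))
```

Here `r_c` and `delta` coincide, and `worst_q` equals $2^{-r+1} = \frac{1}{4}$.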

It is observed in [717] that, thanks to the fact that the robust codes above are nonlinear, the error detection for these codes depends on the encoded data (while for a linear code, the set of missed errors is the same for all encoded data). This makes the set of necessary errors harder for the attacker to determine, all the more when the data depend on the secret key or when randomization is applied. But it also implies, as observed in [304, 690], that the efficiency of these codes depends on the data being uniformly distributed, which is not a reasonable assumption in many situations, in particular when the information bits of messages are also controllable by an attacker. This limitation can be overcome by algebraic manipulation detection (AMD) codes, which we address in the next paragraph.

Codes and algebraic manipulation


A model for error injection has been introduced in [395] under the name of algebraic manipulation. This model assumes that the attacker is able to modify the value of some abstract data storage device without having read access to the data. Such a device is denoted by $\Sigma(G)$ and can hold an element $g$ (corresponding to some secret $s$) from a public finite Abelian (additive) group $G$. The attacker is not able to obtain any information about the element $g$ stored in the device $\Sigma(G)$. However, he can change the stored element $g$ by adding an error $e \in G$ of his choice. After such algebraic manipulation (tampering), the abstract storage device $\Sigma(G)$ will store the value $g + e$. The attacker can choose the value $e$ only on the basis of what he already knew about $g$ before it was stored in the device (his a priori knowledge of $g$). This models for instance the situation with linear secret sharing schemes (see Subsection 3.6.1, page 145), in which the correctness of the secret $s$ reconstructed from the shares of a qualified coalition of players is guaranteed only if all these shares are correct. If the coalition contains dishonest players and if the honest players in it are not able to reconstruct $s$ on their own (i.e., if they do not constitute a qualified coalition), then the dishonest players can cause the reconstruction of a modified secret $s'$, and they can control the difference between $s$ and $s'$, thanks to the linearity of the secret sharing. In particular, in a minimal qualified coalition of players, a single corrupted player can cause the reconstruction of an incorrect secret.
Two types of fault injection attacks can be considered. In the weaker ones, the adversary cannot choose the input. So, from the attacker’s point of view, the source $s$ is uniformly distributed; he can only inject an error $e$ in the storage device $\Sigma(G)$, but he cannot change value $s$ at his own discretion. In the stronger version, the adversary knows the value $s$ and moreover can choose it, and change it after he got some information from the device (in a kind of adaptive chosen attack). In both types of fault injection attacks, the value $g$ stored in $\Sigma(G)$ is hidden from the attacker.
The countermeasure against algebraic manipulation consists in using so-called algebraic manipulation detection codes, which were introduced in [395] after observations were made in [481]. AMD codes encode an original information $s \in S$ as an element $g \in G$ in such a way that any algebraic manipulation is detected with high probability. No secret key is needed, contrary to the case of message authentication codes.

Definition 88 An AMD code is a pair of two functions: a probabilistic encoding function $E : S \to G$ from a set $S$ into a finite Abelian group $G$, and a deterministic decoding function $D : G \to S \cup \{\perp\}$, where $\perp \notin S$ symbolizes that algebraic manipulation has been detected, satisfying that $D(E(s)) = s$ with probability 1 for every $s \in S$.
The AMD code is called $\epsilon$-secure for $\epsilon > 0$ if, for every $s \in S$ and for every $e \in G$, the probability that $D(E(s) + e) \notin \{s, \perp\}$ is at most $\epsilon$. It is called weak $\epsilon$-secure if, for every $e \in G$ and for every $s \in S$ sampled from $S$ with uniform distribution (independently of $e$, then), the probability that $D(E(s) + e) \notin \{s, \perp\}$ is at most $\epsilon$.
A systematic AMD code is an AMD code in which set $S$ is a group and the encoding function $E$ has the form
$$E : S \to G = S \times G_1 \times G_2, \qquad s \mapsto (s, x, F(x, s)), \tag{12.9}$$
where $G_1$ and $G_2$ are groups, $F$ is a function and $x$ is randomly chosen with uniform probability in $G_1$. The decoding is then $D(s', x', r') = s'$ if $F(x', s') = r'$, and $D(s', x', r') = \perp$ otherwise.

Given an AMD code, $E(s)$ can safely be stored on $\Sigma(G)$ (supposed protected from reading) so that the adversary who manipulates the stored value by adding some nonzero $e$ can cause it to decode to some $s' \neq s$ with probability at most $\epsilon$ only. AMD codes also allow the protection of hardware and memories against FIA (seen in Section 12.1), see [1113]27.
Note that if the AMD code is $\epsilon$-secure, then for every $s \in S$, the size $|D^{-1}(s)|$ of the preimage of $s$ by $D$ is necessarily at least $\frac{1}{\epsilon}$ (this property will be used below), since denoting by $E_s$ the set of all possible images of $s$ by $E$, and choosing $e$ so that there exists in $E_s + e$ an element $x$ of $E_{s'}$ with $s' \neq s$ (and so $D(x) = s' \notin \{s, \perp\}$), the size of $E_s$ needs to be at least $\frac{1}{\epsilon}$ for allowing the probability that $D(E(s) + e) \notin \{s, \perp\}$ to be at most $\epsilon$.
Deterministic weak secure AMD codes are a randomization of systematic robust codes (seen above), with $\max_{e \neq 0} Q(e) = \max_{e \neq 0} \mathrm{Prob}[D(E(s) + e) \notin \{s, \perp\}]$.
Note that, as already seen, every $(|G|, |S|, \lambda)$-difference set $D$ (see page 196) in $(G, +)$, where $G = S \times G_1 \times G_2$ (assuming that $S$ is an additive group and that such a difference set exists) provides then a weak $\epsilon$-secure AMD code with $\epsilon = \frac{\lambda}{|S|}$, by taking for $E$ any bijection between $S$ and $D$. In fact, it is enough (and necessary) that every nonzero element $e$ in $S \times G_1 \times G_2$ can be written in at most (rather than exactly) $\lambda$ ways as the difference between two elements of $D$, that is, $|D \cap (e + D)| \leq \lambda$. The graphs of perfect nonlinear (resp. almost perfect nonlinear) $(r, s)$-functions have such property with $\lambda = 2^{r-s}$ (resp. $\lambda = 2^{r-s+1}$).
In [395], the systematic AMD code with $F(x, s) = x^{d+2} + \sum_{i=1}^{d} s_i x^i$, where $s \in \mathbb{F}_q^d$, $x \in \mathbb{F}_q$, is proposed; it provides a systematic $\frac{d+1}{q}$-secure AMD code, thanks to the fact that, when $e_x \neq 0$, $(x + e_x)^{d+2} - x^{d+2}$ equals a polynomial of degree exactly $d + 1$ (which matches any value at most $d + 1$ times), and when $e_x = 0$ and $e_s \neq 0$, $\sum_{i=1}^{d} (s_i + [e_s]_i) x^i - \sum_{i=1}^{d} s_i x^i$ is a nonzero polynomial of degree at most $d$ (which matches any value at most $d$ times).
This construction is generalized in [396], where it is shown (by extending an idea from [1112], which worked with generalized Reed–Muller codes) how systematic AMD codes can be deduced from classical codes: from any subset $S$ of $G_2^{G_1}$ (i.e., any code of length $|G_1|$ over $G_2$ whose codewords are indexed in $G_1$), we take for $F(x, s)$ the coordinate of index $x$ in the codeword $s$; the condition for the $\epsilon$-security of such AMD code is given in [396, 397]. The AMD code from [395] given above corresponds to the case where $S$ equals the subset of a Reed–Solomon code (viewed as in the remark on RS codes at page 45) whose elements correspond to monic polynomials of degree $d + 2$ with no term of degree $d + 1$. Other examples of AMD codes are given in [397, 668, 1112]. In [535], the authors proposed modifications of AMD codes, which can have minimum distances larger than 1, and then not only detect injected faults but also correct errors caused by natural reasons.

27 In this reference, it is required for a systematic AMD code that, for any nonzero $(e_x, e_s)$, $D_{e_x,e_s}$ is nonconstant on any section $G_1 \times \{s\}$.
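The security level of the polynomial construction from [395] recalled above can be checked exhaustively for tiny parameters. The sketch below is an illustration under assumptions of this example: $q = p$ is a prime with $p$ not dividing $d + 2$ (so that the coefficient $(d+2)e_x$ of $x^{d+1}$ in $(x + e_x)^{d+2} - x^{d+2}$ is nonzero), the parameters $p = 5$, $d = 1$ are arbitrary, and the function names are hypothetical. Following Definition 88, a manipulation counts as a success only if the decoder outputs some $s' \neq s$.

```python
from fractions import Fraction
from itertools import product

def F(x, s, p, d):
    """F(x, s) = x^(d+2) + sum_{i=1..d} s_i x^i over the prime field F_p."""
    return (pow(x, d + 2, p) + sum(s[i - 1] * pow(x, i, p) for i in range(1, d + 1))) % p

def encode(s, x, p, d):
    """Systematic encoding (s, x, F(x, s)); x is meant to be uniformly random."""
    return s, x, F(x, s, p, d)

def decode(word, p, d):
    """Return s' if the check value matches, None (standing for ⊥) otherwise."""
    s, x, r = word
    return s if F(x, s, p, d) == r else None

def worst_success(p, d):
    """Largest probability, over the random x, that an error decodes to s' != s."""
    worst = Fraction(0)
    for s in product(range(p), repeat=d):
        for e_s in product(range(p), repeat=d):
            if not any(e_s):
                continue  # source unchanged: the decoder outputs s or ⊥
            s2 = tuple((a + b) % p for a, b in zip(s, e_s))
            for e_x, e_f in product(range(p), repeat=2):
                hits = 0
                for x in range(p):
                    _, _, r = encode(s, x, p, d)
                    tampered = (s2, (x + e_x) % p, (r + e_f) % p)
                    hits += decode(tampered, p, d) is not None
                worst = max(worst, Fraction(hits, p))
    return worst

p, d = 5, 1
eps = worst_success(p, d)   # to be compared with (d + 1) / p
```

For these parameters the exhaustive search gives exactly $\frac{d+1}{p} = \frac{2}{5}$, so the bound is attained.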
It is shown in [916] (which dealt with cheating detection in secret sharing) and recalled in [396] that
• For any $\epsilon$-secure AMD code, we have $|G| \geq \frac{|S| - 1}{\epsilon^2} + 1$; indeed, given $s \in S$, applying the inequality $|D^{-1}(s')| \geq \frac{1}{\epsilon}$ for each $s' \neq s$, we have that the probability that $D(E(s) + e) \notin \{s, \perp\}$ when $e$ is chosen uniformly at random in $G \setminus \{0\}$ is at least $\frac{|S| - 1}{\epsilon (|G| - 1)}$, and we have by hypothesis that this probability is at most $\epsilon$.
– As observed in [396], the inequality $|G| \geq \frac{|S| - 1}{\epsilon^2} + 1$ cannot be an equality for systematic codes, since for such codes, the size of $E_s$ (the set of all possible images of $s$ by $E$) equals $|G_1|$, which is then at least $\frac{1}{\epsilon}$, and we have also $|G_2| \geq \frac{1}{\epsilon}$ because, for every $s \in S$ and $(e_s, e_x) \in S \times G_1 \setminus \{(0, 0)\}$, we have $\max_{e_F \in G_2} \mathrm{Prob}[D(E(s) + (e_s, e_x, e_F)) \notin \{s, \perp\}] \geq \frac{1}{|G_2|}$, as this probability equals that of the event $F(x + e_x, s + e_s) - F(x, s) = e_F$, where $F(x + e_x, s + e_s) - F(x, s) \in G_2$; these two inequalities imply $|G| \geq \frac{|S|}{\epsilon^2}$.
– Moreover, it is shown in [396] (which adapted a proof from [1112] dealing with a different notion of AMD codes) that, for any systematic $\epsilon$-secure AMD code with $\epsilon < 1$, we have $|G_1| \geq \frac{\log |S|}{\epsilon \log |G_2|}$, where $\log$ is (for instance) the base 2 logarithm; indeed, the code over $G_2$ of all functions $x \mapsto F(x, s) + e_F$, where $(s, e_F)$ ranges over $S \times G_2$, contains $|S|\,|G_2|$ codewords of length $|G_1|$ and has minimum distance at least $|G_1|(1 - \epsilon) > 0$ (since the code is $\epsilon$-secure); the bound is then deduced from the Singleton bound28 (see page 6): $|G_1|(1 - \epsilon) \leq |G_1| - \log_{|G_2|}(|S|\,|G_2|) + 1$, that is, $\epsilon |G_1| \geq \log_{|G_2|}(|S|)$.

• For weak $\epsilon$-secure AMD codes, we have $|G| \geq \frac{|S| - 1}{\epsilon} + 1$; indeed, all that we can then say is that $|D^{-1}(s')| \geq 1$.

A stronger definition of AMD codes has been proposed in [1112, 1113], in which the condition becomes that the probability that $D(E(s) + e) \neq \perp$ is at most $\epsilon$ (hence, in this definition, every undetected algebraic manipulation is treated as a success of the adversary, while in Definition 88, when the source message is unaltered, it is not). This is preferred for some applications (such as nonmalleable secret sharing schemes [558]). More precisely, systematic AMD codes detect algebraic manipulation for errors $(e_s, e_x, e_F)$ under the condition that the information part contains an error, $e_s \neq 0$, while this stronger version also detects errors with zero information part $(e_s = 0, e_x, e_F)$; for some secure architectures, the integrity of redundant bits of the codes is indeed also important. The AMD codes described above from [395] satisfy this stronger requirement, as does the main construction in [397]. Lower bounds on the values of $\epsilon$ such that such codes can be $\epsilon$-secure are studied in [1113] (with other notation) and constructions are given.

28 Recall that this bound is valid for (unrestricted) codes over any alphabet.
The first domains of application of AMD codes have been, as indicated in [395], robust
secret sharing schemes (which ensure that, given a coalition $S$ of players able to reconstruct some secret value $s$, no subcoalition of (dishonest) players unable on their own to reconstruct $s$ can modify their shares and lead with the other players from $S$ to the reconstruction of some value $s' = s + t$, where $t \neq 0$ could be controlled by the dishonest players; this is achieved by applying a linear secret sharing scheme to an encoding of the secret by an AMD code rather than to the secret itself) and robust fuzzy extractors (enabling the recovery of a uniformly random key from a noisy and nonuniform secret, such as those obtained by biometrics, in such a way that the key can be recovered from any value close to the secret) [395]. Other
cryptographic applications are the message authentication codes that remain secure when the
adversary can manipulate the key, unconditionally secure multiparty computation protocols
with a dishonest majority, anonymous message transmission (and quantum communication),
and more applications mentioned in [396, 397]. Applications to memory security have been
developed in [534, 1114].

12.2 Fully homomorphic encryption and related questions on Boolean functions


We refer to [305, 306, 839] for the present section. We observe nowadays two comple-
mentary phenomena: the proliferation of small embedded devices having growing but still
limited computing (and data storage) facilities, and the development of cloud services
with extensive storage and computing means. The cloud becomes then a more and more
unavoidable complement to embedded devices. But the outsourcing of data processing
raises new privacy concerns. The users want to prevent the servers from learning about
their data, while these servers are needed to help computing values from them. Gentry’s
fully homomorphic encryption (FHE) scheme [536, 537] gives a theoretical solution to
this problem, by allowing encryption CH preserving both operations of addition and
multiplication:
$$C_H(m + m') = C_H(m) + C_H(m'); \qquad C_H(m\,m') = C_H(m)\, C_H(m'). \tag{12.10}$$
Given a vectorial function F from a finite field to itself (possibly to a subfield), if Alice
wants to compute F (m) and needs the help of the cloud for that, she can send CH (m) to
Claude,29 who computes F (CH (m)), which equals CH (F (m)), thanks to (12.10) and since
F has a polynomial representation. After decryption, Alice gets F (m), but the server has not
learned anything about m nor about F (m).
But repetitive use of homomorphic encryption requires more computational power and
storage capacity than small devices can offer (see more in [305]). A solution to this problem
is that Alice uses a hybrid symmetric-FHE encryption protocol, which works according to
the following phases:

29 The name used now in cryptography to personify the cloud.



1. Initialization. Alice sends to Claude her homomorphic public key pkH and the homo-
morphic ciphertext of her symmetric key CH (skS ) (which is much easier to compute
than CH (m) since skS is much shorter than m, and which needs to be computed once for
all further communication with Claude).
2. Storage. Alice encrypts her data m with the symmetric encryption scheme CS , and sends
CS (m) to Claude.
3. Evaluation. Claude calculates CH (CS (m)) and homomorphically evaluates the decryp-
tion of the symmetric scheme on Alice’s data and gets CH (m).
4. Computation. Claude homomorphically executes the treatment of F on Alice’s data, and
gets CH (F (m)).
5. Result. Claude sends CH (F (m)) and Alice gets F (m) by deciphering (deciphering being
much less costly than enciphering in FHE).
However, the best-adapted generations of FHE, that is, second and third generations,
are noise based (being built on the learning with errors [LWE] problem) and need
expensive “bootstrapping” when the noise grows too much. It is then mandatory to reduce
the error growth during evaluation-computation, and this is more or less equivalent to
reducing the number of multiplications for the second generation (more precisely, to reduce
the multiplication depth), and the number of additions for the third generation (in fact, the correct parameter is much more complex; it also depends on multiplications, see [537], but describing it precisely would be too long). The choice of the symmetric cipher CS is then
central for reducing the cost.

12.2.1 The FLIP cipher


The multiplicative depth of AES being too large (and its additive depth being still larger),
other symmetric encryption schemes have been proposed: block ciphers, like LowMC [11],
Rasta and Agrasta [480], and the stream cipher Kreyvium [192]. These solutions have
drawbacks: Kreyvium is expensive (all the more so if it needs to be restarted, which
can happen often), and LowMC has low-complexity rounds, but their iteration makes it
ill-suited, like almost any other block cipher (if we look precisely at how they can work with
HElib [584], for instance). The exceptions are Rasta and Agrasta, which are also well adapted to
multiparty computation, but whose originality is not in the choice of the S-box, which is
why we do not describe them here.

The filter permutator and the FLIP cipher


The FLIP cipher is an encryption scheme described in [839], which tries to minimize the
parameters mentioned above (in particular, the multiplicative depth). It is based on a new
stream cipher model, called the filter permutator (see Figure 12.1 below), consisting in
updating at each clock cycle a key register by a permutation of the coordinates, piloted by a
pseudorandom number generator (PRNG), and in filtering the resulting permuted key with
a Boolean function f , whose input is the whole register30 and whose output provides the
keystream. Applying the nonlinear filtering function directly on the key bits allows reducing

30 A future version of FLIP gets rid of this constraint.


12.2 Fully homomorphic encryption and related questions on Boolean functions 455

Figure 12.1 Filter permutator construction (blocks shown: PRNG, permutation generator producing P_i, key register K, plaintext, ciphertext).

the noise level when used in hybrid symmetric-FHE encryption protocols. In theory, there is
no big difference between the filter model seen on page 23 and the filter permutator since the
LFSR is simply replaced by a permutator. But in practice, the difference is large since the
filter function has hundreds of input bits instead of about 20, and there is another important
difference that we shall see in the next subsection.
In the versions of the cipher proposed in [839], function f has $n = n_1 + n_2 + n_3 \geq 500$
variables, where $n_2$ is even and $n_3$ equals $\frac{k(k+1)}{2}\,t$ for some k and t. It is defined as

$$f(x_0, \dots, x_{n_1-1}, y_0, \dots, y_{n_2-1}, z_0, \dots, z_{n_3-1}) = \bigoplus_{i=0}^{n_1-1} x_i \;\oplus\; \bigoplus_{i=0}^{n_2/2-1} y_{2i}\,y_{2i+1} \;\oplus\; \bigoplus_{j=1}^{t} T_k\Big(z_{(j-1)\frac{k(k+1)}{2}},\ z_{(j-1)\frac{k(k+1)}{2}+1},\ \dots,\ z_{(j-1)\frac{k(k+1)}{2}+\frac{k(k+1)}{2}-1}\Big),$$

where the triangular function $T_k$ is defined as

$$T_k(z_0, \dots, z_{j-1}) = z_0 \oplus z_1 z_2 \oplus z_3 z_4 z_5 \oplus \dots \oplus z_{\frac{k(k-1)}{2}} \cdots z_{\frac{k(k+1)}{2}-1}.$$

We have seen in Subsection 6.2.6, page 265, how to calculate the nonlinearity of direct
sums, in Subsection 9.1.4, page 341, how to calculate their algebraic immunity, and we have
calculated in Subsection 10.3.1, page 363, the values of the nonlinearities and algebraic
immunities of triangular functions.
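The definition above can be sketched in a few lines of Python. This is only an illustrative toy, not the reference implementation of [839]: the function names `triangular`, `flip_filter`, and `toy_keystream` are hypothetical, and Python's `random.Random` stands in for the PRNG that pilots the permutations (it is not cryptographically secure, and the exact way the permutation is derived from the PRNG is an assumption).

```python
import random

def triangular(z, k):
    """T_k on k(k+1)/2 bits: z0 ^ z1*z2 ^ z3*z4*z5 ^ ... (one monomial per degree 1..k)."""
    assert len(z) == k * (k + 1) // 2
    out, pos = 0, 0
    for deg in range(1, k + 1):
        out ^= int(all(z[pos:pos + deg]))  # product of `deg` consecutive bits
        pos += deg
    return out

def flip_filter(bits, n1, n2, t, k):
    """Direct sum of a linear part (n1 bits), n2/2 quadratic monomials, and t copies of T_k."""
    tri = k * (k + 1) // 2
    assert n2 % 2 == 0 and len(bits) == n1 + n2 + t * tri
    x, y, z = bits[:n1], bits[n1:n1 + n2], bits[n1 + n2:]
    out = 0
    for b in x:
        out ^= b
    for i in range(n2 // 2):
        out ^= y[2 * i] & y[2 * i + 1]
    for j in range(t):
        out ^= triangular(z[j * tri:(j + 1) * tri], k)
    return out

def toy_keystream(key, n1, n2, t, k, nbits, seed=0):
    """Toy filter permutator: permute the key register at each clock, then filter it."""
    rng = random.Random(seed)  # stand-in for the PRNG (assumption, not secure)
    register = list(key)
    stream = []
    for _ in range(nbits):
        rng.shuffle(register)  # a permutation of the coordinates keeps the Hamming weight
        stream.append(flip_filter(register, n1, n2, t, k))
    return stream
```

For instance, `flip_filter(bits, 42, 128, 8, 9)` evaluates a function with the shape of the 530-variable instance on a list of 530 bits. Because the register is only permuted, its Hamming weight never changes, which is the point taken up in the next subsection.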
Four sets of parameters were proposed for the filtering function. The Hamming weight
of the input to the function being forced to $n/2$, where n is the size of the register, the four
proposed instances, displayed in Table 12.1, ensure that $\binom{n}{n/2} \geq 2^{\lambda}$, where λ is a security
parameter (the number of elementary operations needed for a cryptanalysis by exhaustive
search being $2^{\lambda}$). There exists a guess and determine attack on a preliminary version of
FLIP [490]. It is not efficient on the regular versions of FLIP. As checked in [839], FLIP is
well suited for reducing the increase of the noise in homomorphic encryption, particularly
for the third generation, and even for the second generation.

Table 12.1 n: total number of variables, n1: linear part, n2: quadratic part, t: number of
triangular functions, k: degree of the triangular functions; λ: resulting security parameter.

Name         n      n1   n2    t   k    λ
FLIP-530     530    42   128   8   9    80
FLIP-662     662    46   136   4   15   80
FLIP-1394    1,394  82   224   8   16   128
FLIP-1704    1,704  86   238   5   23   128
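The parameter sets of Table 12.1 can be cross-checked against the relations $n = n_1 + n_2 + t\,\frac{k(k+1)}{2}$ and $\binom{n}{n/2} \geq 2^{\lambda}$ stated above. The following sketch does so; the names `INSTANCES` and `check_instance` are illustrative only.

```python
from math import comb

# (name, n, n1, n2, t, k, lambda), copied from Table 12.1
INSTANCES = [
    ("FLIP-530", 530, 42, 128, 8, 9, 80),
    ("FLIP-662", 662, 46, 136, 4, 15, 80),
    ("FLIP-1394", 1394, 82, 224, 8, 16, 128),
    ("FLIP-1704", 1704, 86, 238, 5, 23, 128),
]

def check_instance(n, n1, n2, t, k, lam):
    """Return (sizes consistent, number of weight-n/2 registers at least 2^lambda)."""
    n3 = t * k * (k + 1) // 2              # variables used by the t triangular functions
    sizes_ok = (n == n1 + n2 + n3) and n2 % 2 == 0
    keyspace_ok = comb(n, n // 2) >= 2 ** lam
    return sizes_ok, keyspace_ok

for name, *params in INSTANCES:
    print(name, check_instance(*params))
```

The second check only confirms the counting condition quoted in the text; it says nothing about the actual attack complexities studied in [839] and [306].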

12.2.2 Boolean functions with restricted inputs


It was asserted in [839] that function f has sufficiently good cryptographic parameters
(small balance bias, large algebraic degree, large nonlinearity, large algebraic immunity,
and fast algebraic immunity), but by definition in the filter permutator, the input to f has
constant Hamming weight (equal to the weight of the secret key), while the study of f was
made over the whole space $\mathbb{F}_2^n$. An important question has then been to see if the filtering
function proposed in [839] maintains good behavior with respect to classical attacks when
its domain is restricted. This has been established in a subsequent paper [306]. The work
consisted in
• Reconsidering all classical attacks in the framework of Boolean functions restricted to
some generic subset E of $\mathbb{F}_2^n$ (resulting from the specifications of the cryptosystem that
uses them, and also possibly of the cryptanalysis performed on it, for instance a guess
and determine attack), and in particular to a set of vectors of constant Hamming weight
• Studying how a generic function can contribute to the resistance against each attack in
such framework
• Revisiting all related criteria, and studying constructions of functions satisfying the new
versions of these criteria
• Studying specifically FLIP’s function and seeing if it provides a good trade-off

Set E may change when processing the algorithm or during the cryptanalysis. We may
also want the function to be usable in a variety of situations. We are then interested in
Boolean functions achieving good trade-off among all important cryptographic criteria,
when they are restricted to each set E in some family $\mathcal{E}$. A particular family plays a special
role for FLIP, as explained above:

$$\mathcal{E} = \{E_{n,1}, \dots, E_{n,n-1}\}, \quad \text{where } E_{n,k} = \{x \in \mathbb{F}_2^n;\ w_H(x) = k\}.$$

These sets are called slices in some papers (see, e.g., [504, 505]). Note that symmetric
functions (see Section 10.1, page 352), among which are balanced functions, bent functions,
and functions with optimal algebraic immunity, are constant on each set $E_{n,k}$ and then
completely lose their desirable properties. We shall see other examples of similar degradation.
We recall below from [306] the general study of the most important cryptographic criteria
in this general framework and how they particularize when E belongs to the family $\mathcal{E}$ above.
Note that for the FLIP cipher, Siegenthaler’s correlation attack (see page 86) does not
seem to apply. We do not study then the resilience of restricted Boolean functions, but such
study could be useful for other ciphers and for the resistance to guess and determine attacks.

Remark. A probabilistic and asymptotic study has been made in [504, 507, 508] on the
restrictions of Boolean functions to sets of inputs of fixed Hamming weight. We refer the
interested reader to these papers (which also contain other interesting results); we deal here
with fixed (generic) numbers of variables.
The nonlinearity of Boolean functions under nonuniform input distribution (which is
another, possibly more general, way of not reducing the study of Boolean functions to the
usual framework) has been also studied in [525], but the chosen distribution is binomial and
does not fit with the framework of FLIP nor that of guess and determine attacks.

Balance We denote by $w_H(f)_k$ the Hamming weight of the restriction of f to $E_{n,k}$:

$$w_H(f)_k = |\{x \in \mathbb{F}_2^n;\ w_H(x) = k,\ f(x) = 1\}|.$$

For all n ≥ 2, there exist balanced Boolean functions that are unbalanced on $E_{n,k}$ for every
k ∈ [1, n − 1]; these functions can even be (n − 1)-resilient (and remain then balanced
when at most n − 1 of their variables are arbitrarily fixed): an example is the first elementary
symmetric Boolean function $\sigma_1(x) = \bigoplus_{i=1}^{n} x_i = w_H(x) \pmod 2$. But there exist, for some
values of n, balanced functions that are balanced on each $E_{n,k}$, k ∈ [1, n − 1]:

Definition 89 We call weightwise perfectly balanced the functions that are balanced on
any $E_{n,k}$ for k = 1, . . . , n − 1, that is, such that

$$\forall k \in [1, n-1], \quad w_H(f)_k = \frac{1}{2}\binom{n}{k}, \qquad (12.11)$$

and such that $f(0_n) = 0$ and $f(1_n) = 1$.

The double condition “$f(0_n) = 0$ and $f(1_n) = 1$” makes f globally balanced and is
not restrictive for balanced functions satisfying (12.11), up to the addition of constant 1. Of
course, such functions can exist only if $\binom{n}{k}$ is even for every k = 1, . . . , n − 1, i.e., n is a
power of 2.
Necessary conditions on the algebraic normal form of Boolean functions to be weightwise
perfectly balanced are given in [306]. A secondary construction based on the “indirect
sum” (see Theorem 21, page 300) has been given in this same reference. We recall the
proof.
Proposition 191 [306] Let $f$, $f'$, and $g$ be weightwise perfectly balanced n-variable
functions and let $g'$ be any n-variable Boolean function; then

$$h(x, y) = f(x) \oplus \prod_{i=1}^{n} x_i \oplus g(y) \oplus (f(x) \oplus f'(x))\,g'(y); \quad x, y \in \mathbb{F}_2^n$$

is a weightwise perfectly balanced 2n-variable function.

Proof
• If $w_H(x, y) = 0$, then $h(x, y) = 0$.
• If k ∈ {1, . . . , n − 1}, then the set $\{(x, y) \in \mathbb{F}_2^{2n};\ w_H(x, y) = k\}$ equals the disjoint union
of the following sets:
– $\{0_n\} \times \{y \in \mathbb{F}_2^n;\ w_H(y) = k\}$, on which h(x, y) equals g(y) and is then balanced.
– $\{x \in \mathbb{F}_2^n;\ w_H(x) = i\} \times \{y\}$, where 1 ≤ i ≤ k and $w_H(y) = k - i$, on each of which
h(x, y) equals $f(x) \oplus g(y)$ if $g'(y) = 0$ and $f'(x) \oplus g(y)$ if $g'(y) = 1$; in both cases,
it is balanced.
• If k = n, then the set $\{(x, y) \in \mathbb{F}_2^{2n};\ w_H(x, y) = k\}$ equals the disjoint union of the
following sets:
– $\{(0_n, 1_n)\} \cup \{(1_n, 0_n)\}$, on which h(x, y) equals respectively 1 and 0 and is then
globally balanced.
– $\{x \in \mathbb{F}_2^n;\ w_H(x) = i\} \times \{y\}$, where 1 ≤ i ≤ n − 1 and $w_H(y) = n - i$, on each of
which h(x, y) equals $f(x) \oplus g(y)$ if $g'(y) = 0$ and $f'(x) \oplus g(y)$ if $g'(y) = 1$; in
both cases, it is balanced.
• If k ∈ {n + 1, . . . , 2n − 1}, then the set $\{(x, y) \in \mathbb{F}_2^{2n};\ w_H(x, y) = k\}$ equals the disjoint
union of the following sets:
– $\{1_n\} \times \{y \in \mathbb{F}_2^n;\ w_H(y) = k - n\}$, on which h(x, y) equals g(y) and is then balanced.
– $\{x \in \mathbb{F}_2^n;\ w_H(x) = i\} \times \{y\}$, where k − n + 1 ≤ i ≤ n − 1 and $w_H(y) = k - i$, on
each of which h(x, y) equals $f(x) \oplus g(y)$ if $g'(y) = 0$ and $f'(x) \oplus g(y)$ if $g'(y) = 1$;
in both cases, it is balanced.
• If k = 2n, then $w_H(x, y) = k$ is equivalent to $x = y = 1_n$, then $h(x, y) = 1$.

Noting that $f(x_1, x_2) = x_1$ is weightwise perfectly balanced, we can recursively build
weightwise perfectly balanced Boolean functions of $2^{\ell}$ variables, for all ℓ in $\mathbb{N}^*$. For
instance, with $f = f'$, we obtain the following class:

$$f(x_1, x_2, \dots, x_{2^{\ell}}) = \bigoplus_{a=1}^{\ell}\ \bigoplus_{i=1}^{2^{\ell-a}}\ \prod_{j=0}^{2^{a-1}-1} x_{i + j\,2^{\ell-a+1}}.$$
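The class above is easy to check exhaustively for small numbers of variables. The sketch below evaluates the function from its formula, with the index ranges read as a = 1..ℓ, i = 1..2^(ℓ−a), j = 0..2^(a−1)−1, and tests Definition 89 slice by slice; the helper names are illustrative, and the brute-force enumeration is only practical for small n.

```python
from itertools import combinations
from math import comb

def wpb_class(x):
    """XOR over a = 1..l and i = 1..2^(l-a) of the product over j of x_{i + j*2^(l-a+1)}."""
    n = len(x)
    l = n.bit_length() - 1
    assert l >= 1 and n == 1 << l          # n must be a power of 2
    out = 0
    for a in range(1, l + 1):
        step = 1 << (l - a + 1)
        for i in range(1, (1 << (l - a)) + 1):
            # x is 0-indexed in Python, hence the "- 1"
            out ^= int(all(x[i + j * step - 1] for j in range(1 << (a - 1))))
    return out

def slice_weights(f, n):
    """Hamming weight of f restricted to each slice E_{n,k}, for k = 0..n."""
    weights = []
    for k in range(n + 1):
        w = 0
        for support in combinations(range(n), k):
            x = [0] * n
            for s in support:
                x[s] = 1
            w += f(x)
        weights.append(w)
    return weights

def is_wpb(f, n):
    """Definition 89: f(0_n) = 0, f(1_n) = 1 and f balanced on every E_{n,k}, 1 <= k <= n-1."""
    w = slice_weights(f, n)
    return w[0] == 0 and w[n] == 1 and all(2 * w[k] == comb(n, k) for k in range(1, n))

def all_binomials_even(n):
    """Necessary condition for weightwise perfect balancedness to be possible."""
    return all(comb(n, k) % 2 == 0 for k in range(1, n))
```

For n = 4 the formula gives $x_1 \oplus x_2 \oplus x_1 x_3$, and `is_wpb(wpb_class, 4)` confirms that it is balanced on the slices of weight 1, 2, and 3. The parity function $\sigma_1$ fails the same test, as noted earlier.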

In [787], another construction is proposed based on the nice idea that if a Boolean function
f on $\mathbb{F}_{2^n}$ satisfies $f(0_n) = 0$, $f(1_n) = 1$ and $f(x^2) = f(x) \oplus 1$ for all $x \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, the
function over $\mathbb{F}_2^n$ obtained by decomposing x over a normal basis is weightwise perfectly
balanced. Indeed, the transformation $x \mapsto x^2$ results in a cyclic shift. These functions are
invariant under a shift by two positions of the input (we already mentioned in Section 10.2,
page 360, the interest and risk of such rotation symmetry). The restricted nonlinearities of
the functions (see the definition below) are also studied.
In [1073], the authors give a large family of Boolean functions that are weightwise
perfectly balanced if n is equal to a power of 2 and weightwise almost perfectly balanced
(see below) otherwise, and which have optimal algebraic immunity and keep good algebraic
immunity when restricted.
It is possible to extend the construction of Proposition 191 to get for all n weightwise
almost perfectly balanced functions, satisfying by definition that for all k ∈ [1, n − 1],
$w_H(f)_k$ equals $\frac{1}{2}\binom{n}{k}$ when $\binom{n}{k}$ is even and $\frac{\binom{n}{k} \pm 1}{2}$ when $\binom{n}{k}$ is odd; see the proof and more
results in [306].
The transformation $f \mapsto \big(g \mapsto \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)+g(x)}\big)$, where g ranges over the set of
all symmetric Boolean functions null at zero input, is introduced in this same reference.
This transformation is similar to the Walsh transform, but with symmetric functions playing
the role played normally by affine functions. It is shown that, for every n-variable Boolean
function f, the quadratic mean of the sequence $k \mapsto \sum_{w_H(x)=k} (-1)^{f(x)}$ equals $\frac{1}{\sqrt{n+1}}$ times
the quadratic mean of the sequence $g \mapsto \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)+g(x)}$.

Nonlinearity The Hamming distance between a function f and a linear function $\ell_a(x) = a \cdot x$
on inputs ranging over some set E equals

$$d_E(f, \ell_a) = \frac{|E|}{2} - \frac{1}{2} \sum_{x \in E} (-1)^{f(x) \oplus a \cdot x}$$

(sum performed in $\mathbb{Z}$). The minimal distance $nl_E(f)$ between f and affine functions over
E, which we shall call nonlinearity with inputs in E, equals then

$$nl_E(f) = \frac{|E|}{2} - \frac{1}{2} \max_{a \in \mathbb{F}_2^n} \Big| \sum_{x \in E} (-1)^{f(x) \oplus a \cdot x} \Big|.$$

Since $\sum_{a \in \mathbb{F}_2^n} \big( \sum_{x \in E} (-1)^{f(x) \oplus a \cdot x} \big)^2 = 2^n |E|$, we have then

$$nl_E(f) \leq \frac{|E|}{2} - \frac{\sqrt{|E|}}{2}. \qquad (12.12)$$

For $E \subsetneq \mathbb{F}_2^n$, this bound is in general not achievable with equality (contrary to the
unrestricted case for n even). In the case of $E = E_{n,k}$, it is never tight, except maybe for two
particular pairs (n, k): (50, 3) and (50, 47), since Erdős showed that the binomial coefficient
$\binom{n}{k}$ with 3 ≤ k ≤ n/2 is the square of an integer for the single case $\binom{50}{3}$.
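As a rough illustration of these definitions, the following brute-force sketch computes $nl_E(f)$ for small n and checks Bound (12.12) in the equivalent form $(|E| - 2\,nl_E(f))^2 \geq |E|$. The helper names and the sample function are assumptions chosen for the example, and the exhaustive loop over all a is only feasible for small n.

```python
from itertools import combinations, product
from math import comb

def slice_points(n, k):
    """All vectors of F_2^n with Hamming weight k (the slice E_{n,k})."""
    return [tuple(1 if i in s else 0 for i in range(n))
            for s in map(set, combinations(range(n), k))]

def restricted_nonlinearity(f, E, n):
    """nl_E(f) = |E|/2 - (1/2) max_a |sum_{x in E} (-1)^(f(x) xor a.x)|."""
    best = 0
    for a in product((0, 1), repeat=n):
        s = sum((-1) ** (f(x) ^ (sum(ai & xi for ai, xi in zip(a, x)) & 1)) for x in E)
        best = max(best, abs(s))
    return (len(E) - best) // 2   # |E| and the sum always have the same parity

def sigma2(x):
    """Second elementary symmetric function: C(w_H(x), 2) mod 2."""
    return comb(sum(x), 2) & 1

n = 6
E = slice_points(n, 3)
f = lambda x: (x[0] & x[1]) ^ (x[2] & x[3]) ^ (x[4] & x[5])   # sample quadratic function
nl = restricted_nonlinearity(f, E, n)
print(nl, (len(E) - 2 * nl) ** 2 >= len(E))                   # Bound (12.12), squared form
print([restricted_nonlinearity(sigma2, slice_points(n, k), n) for k in range(n + 1)])
print(comb(50, 3) == 140 ** 2)                                # the square case C(50, 3)
```

The second line of output illustrates the remark made earlier about symmetric functions: being constant on each slice, $\sigma_2$ has restricted nonlinearity 0 on every $E_{n,k}$.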
Bound (12.12) can be improved:

Proposition 192 [306] Let E be a subset of $\mathbb{F}_2^n$ and f a Boolean function over E. Then:

$$nl_E(f) \leq \frac{|E|}{2} - \frac{1}{2} \sqrt{|E| + \lambda},$$

where

$$\lambda = \max_{a \in \mathbb{F}_2^n;\ a \neq 0_n} \Big| \sum_{(x,y) \in E^2;\ x+y=a} (-1)^{f(x) \oplus f(y)} \Big|.$$

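Proof For every nonzero $a \in \mathbb{F}_2^n$, we have

$$\sum_{b \in \mathbb{F}_2^n;\ a \cdot b = 0} \Big( \sum_{x \in E} (-1)^{f(x) \oplus b \cdot x} \Big)^2 = \sum_{(x,y) \in E^2} (-1)^{f(x) \oplus f(y)} \sum_{b \in \mathbb{F}_2^n;\ a \cdot b = 0} (-1)^{b \cdot (x+y)} = 2^{n-1} \sum_{(x,y) \in E^2;\ x+y \in \{0_n, a\}} (-1)^{f(x) \oplus f(y)},$$

which implies

$$\max_{b \in \mathbb{F}_2^n;\ a \cdot b = 0} \Big| \sum_{x \in E} (-1)^{f(x) \oplus b \cdot x} \Big| \geq \sqrt{|E| + \sum_{(x,y) \in E^2;\ x+y=a} (-1)^{f(x) \oplus f(y)}}.$$

If $\sum_{(x,y) \in E^2;\ x+y=a} (-1)^{f(x) \oplus f(y)}$ is negative, then we can apply this inequality to function
$f'(x) = f(x) \oplus v \cdot x$, where $v \cdot a = 1$; we have
$\sum_{(x,y) \in E^2;\ x+y=a} (-1)^{f'(x) \oplus f'(y)} = -\sum_{(x,y) \in E^2;\ x+y=a} (-1)^{f(x) \oplus f(y)}$.
Relation (3.1), page 79, completes the proof.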

Note that this result applied for $E = \mathbb{F}_2^n$ proves again that the derivatives of bent functions
are all balanced.
More observations are made for E = En,k in [306] and the case of direct sums is studied.
Proposition 192 is a particular case of a more general and slightly more complex result
given in this same reference, which has been generalized in [878], where the consequences
are studied in detail.
The maximal value of $nl_E(f)$ is the covering radius of the punctured first-order Reed–
Muller code obtained by deleting all the coordinates whose indices lie outside E and is then
at least $\frac{d}{2}$, where d is the minimum distance of this code.
For $E = E_{n,k}$, this minimum distance has been determined by Dumer and Kapralova
[488]; we have

• For 0 ≤ k < n/2, $d = \binom{n-1}{k-1}$
• For k = n/2, $d = \binom{n-2}{k-2}$
• For n/2 < k ≤ n − 1, $d = \binom{n-1}{k}$
• For k = n, d = 1.

The maximal value of $nl_{E_{n,k}}(f)$ is then nonzero except for particular values of k.
Nevertheless, fixing the input Hamming weight of some functions may deteriorate their
nonlinearity in an extreme way: for every n, there exists f of large nonlinearity such that
$nl_k(f) = 0$, ∀k = 0, . . . , n (where $nl_k$ abbreviates $nl_{E_{n,k}}$). For instance, the (bent) elementary
symmetric function $\sigma_2$ (n even) has this latter property (like any other symmetric function).
We leave open the determination of all the bent n-variable Boolean functions such that
$nl_k(f) = 0$, ∀k = 0, . . . , n. Those that are quadratic have been studied in [306], but the proof
was incomplete, because the third item of the next technical lemma was viewed as
straightforward while it is not.

Lemma 13 Let n be any positive integer.

1. The n-variable Boolean functions such that $nl_k(f) = 0$ for every k = 1, . . . , n are the
functions of the form

$$f(x) = \bigoplus_{i=1}^{n} x_i\,\varphi_i(x) \oplus \varphi_0(x), \qquad (12.13)$$

where $\varphi_0, \varphi_1, \dots, \varphi_n$ are symmetric Boolean functions.
2. Up to the addition of an affine function, such a function equals

$$\bigoplus_{i=1}^{n} \ell_i(x)\,\sigma_i(x), \qquad (12.14)$$

where $\sigma_i$ is the ith elementary symmetric Boolean function and where the $\ell_i$ are all affine.
3. If n ≥ 6, then f is quadratic if and only if, up to the addition of an affine function, we
have

$$f(x) = \ell(x)\,\sigma_1(x) \oplus \epsilon\,\sigma_2(x), \qquad (12.15)$$

where $\epsilon \in \mathbb{F}_2$, and $\ell(x)$ is a linear function.

Proof 1. Any function of the form (12.13) coincides with an affine function on every $E_{n,k}$
since each symmetric function is constant on it, and conversely, if a Boolean function
f coincides on every $E_{n,k}$ with an affine function, say with $\ell_k(x) = \bigoplus_{i \in I_k} x_i \oplus \epsilon_k$,
then defining, for every $x \in E_{n,k}$ and every i = 1, . . . , n, that $\varphi_i(x) = 1$ if $i \in I_k$ and
$\varphi_i(x) = 0$ otherwise, and $\varphi_0(x) = \epsilon_k$, we have $f(x) = \bigoplus_{i=1}^{n} x_i \varphi_i(x) \oplus \varphi_0(x)$, where
the $\varphi_i$ are symmetric functions.
2. Expressing each function $\varphi_0, \dots, \varphi_n$ by means of the elementary symmetric functions
$\sigma_1, \dots, \sigma_n$, we obtain, up to the addition of an affine function, $f(x) = \bigoplus_{i=1}^{n} \ell_i(x)\sigma_i(x)$,
where the $\ell_i$ are all affine.
3. All the terms obtained after expansion of $\bigoplus_{i=3}^{n} \ell_i(x)\sigma_i(x)$ in (12.14) have degree at least
3 and, using the uniqueness of the ANF of a Boolean function, f is quadratic if and only
if all those whose degree is at least 4 cancel and those of degree 3 are canceled by those
from $\ell_2(x)\sigma_2(x)$ (the expression of the function can then be taken equal to the quadratic
part of (12.14) expanded). Let us translate this into explicit conditions on (12.14).
For every i, j = 1, . . . , n, we have

$$x_j\,\sigma_i(x) = \Big( \bigoplus_{I \subseteq \{1,\dots,n\};\ |I| = i,\ j \in I} x^I \Big) \oplus \Big( \bigoplus_{I \subseteq \{1,\dots,n\};\ |I| = i+1,\ j \in I} x^I \Big).$$

We deduce that, writing $\ell_i(x) = \bigoplus_{j \in J_i} x_j \oplus \epsilon_i$, we have, for i < n:

$$\ell_i(x)\,\sigma_i(x) = \Big( \bigoplus_{I \subseteq \{1,\dots,n\};\ |I| = i,\ |I \cap J_i|\ [\mathrm{mod}\ 2] = \epsilon_i \oplus 1} x^I \Big) \oplus \Big( \bigoplus_{I \subseteq \{1,\dots,n\};\ |I| = i+1,\ |I \cap J_i|\ \mathrm{odd}} x^I \Big), \qquad (12.16)$$

since each $x_j$, $j \in J_i$, contributes once for each $x^I$ such that |I| = i and j ∈ I and once
for each $x^I$ such that |I| = i + 1 and j ∈ I. And for i = n, $\ell_n(x)\sigma_n(x)$ equals $\sigma_n(x)$ if
$|J_n|$ [mod 2] $= \epsilon_n \oplus 1$ and is zero otherwise. Hence:
• For i = n, we have $\ell_n(x)\sigma_n(x) = (|J_n|\ [\mathrm{mod}\ 2] \oplus \epsilon_n)\,\sigma_n(x)$.
• For 1 ≤ i ≤ n − 1, specifying the values of the two subsums in (12.16), we have:
– If $0 < |J_i| < n$, then $\ell_i(x)\sigma_i(x)$ contains terms of degree i but not all of them
(since both parities can be achieved by $|I \cap J_i|$ when |I| = i) and terms of degree
i + 1 but, if i ≤ n − 2, not all of them as well, and if i = n − 1 the part in
$\sigma_{i+1} = \sigma_n$ has coefficient $|J_{n-1}|$ [mod 2].
– If $|J_i| = 0$, then $\ell_i(x)\sigma_i(x) = \epsilon_i\,\sigma_i(x)$ (note that, for i = n − 1, the coefficient
of $\sigma_{i+1}$, which is then 0, takes the same value $|J_{n-1}|$ [mod 2], obtained above for
$0 < |J_i| < n$).
– If $|J_i| = n$, then $\ell_i(x)\sigma_i(x) = (\sigma_1(x) \oplus \epsilon_i)\sigma_i(x) = (i\ [\mathrm{mod}\ 2] \oplus \epsilon_i)\,\sigma_i(x) \oplus (i + 1\ [\mathrm{mod}\ 2])\,\sigma_{i+1}(x)$
(and the coefficient i + 1 [mod 2] of $\sigma_n$ for i = n − 1 matches
the value $|J_{n-1}|$ [mod 2] above as well).

It cannot then happen, when the function is quadratic, that $0 < |J_i| < n$ for some
value of i ≥ 3 and $|J_i| = 0$ or $|J_i| = n$ for another value of i ≥ 3.

Then f is quadratic if and only if we have $\epsilon_n = (|J_n| + |J_{n-1}|)$ [mod 2] and, addressing
first the two latter cases above and then the first case:
• Either, for every i = 2, . . . , n − 1, we have $|J_i| = \eta_i n$ with $\eta_i \in \{0, 1\}$ and for i ≥ 3,
$\epsilon_i = (\eta_i + \eta_{i-1})\,i$ [mod 2].
• Or, for every i = 3, . . . , n − 1, we have $0 < |J_i| < n$ and the two following sets
$\{I \subseteq \{1, \dots, n\};\ |I| = i \text{ and } |I \cap J_i|\ [\mathrm{mod}\ 2] = \epsilon_i \oplus 1\}$ and $\{I \subseteq \{1, \dots, n\};\ |I| = i \text{ and } |I \cap J_{i-1}| \text{ odd}\}$
are equal. Denoting by $z_i$ the vector of $\mathbb{F}_2^n$ of support $J_i$, by
$B_{\geq 3}$ the set of vectors of $\mathbb{F}_2^n$ of Hamming weight at least 3, by $E^0$ (rather than $E^{\perp}$)
the orthogonal of an $\mathbb{F}_2$-vector space E and by $E^1$ its complement, the condition
writes $\{0_n, z_i\}^{\epsilon_i \oplus 1} \cap B_{\geq 3} = \{0_n, z_{i-1}\}^1 \cap B_{\geq 3}$, or equivalently
$\{0_n, z_i\}^{\epsilon_i} \cap B_{\geq 3} = \{0_n, z_{i-1}\}^0 \cap B_{\geq 3}$.
If n ≥ 6 then the linear space $\{0_n, z_{i-1}\}^0$ contains elements of Hamming weight at
least 5 and each of its elements of weight at most 2 is then the sum of two elements
of $\{0_n, z_{i-1}\}^0 \cap B_{\geq 3}$ (one of weight at least 5 and one of weight at least 3); hence the
vector space $\langle \{0_n, z_{i-1}\}^0 \cap B_{\geq 3} \rangle$ spanned by $\{0_n, z_{i-1}\}^0 \cap B_{\geq 3}$ equals $\{0_n, z_{i-1}\}^0$; the
same is true for $\{0_n, z_i\}^{\epsilon_i} \cap B_{\geq 3}$ if $\epsilon_i = 0$, in which case we have $z_i = z_{i-1}$, that is,
$J_i = J_{i-1}$, and if $\epsilon_i = 1$ then $\{0_n, z_{i-1}\}^0$ equals $\langle \{0_n, z_i\}^{\epsilon_i} \cap B_{\geq 3} \rangle$, which contains
$\{0_n, z_i\}^0$ for the same reasons as above and cannot be reduced to a hyperplane, and is
then equal to $\mathbb{F}_2^n$, a contradiction.

Summarizing, we have, up to the addition of an affine function:

• We are in the first case above, and f(x) equals the quadratic part of a function of
the form $\ell(x)\,\sigma_1(x) \oplus (\eta\,\sigma_1(x) \oplus \epsilon)\,\sigma_2(x)$, where $\epsilon, \eta \in \mathbb{F}_2$, and $\ell(x)$ is a linear
function. Since $\eta\,\sigma_1(x)\,\sigma_2(x) = \eta\,\sigma_3(x)$, we can take η = 0 and we obtain then
(12.15).
• Or we are in the second case above with $z_2 = z_3 = \dots = z_{n-1}$ and $\epsilon_3 = \dots = \epsilon_{n-1} = 0$,
and f(x) is the quadratic part of $\ell(x)\sigma_1(x) \oplus \epsilon\,\sigma_2(x) \oplus \ell'(x)\big(\bigoplus_{i=2}^{n-1} \sigma_i(x)\big) \oplus \epsilon_n \sigma_n(x)$,
where $\ell$ and $\ell'$ are linear and $\epsilon_n = \ell'(1)$, that is,
$\ell(x)\sigma_1(x) \oplus \epsilon\,\sigma_2(x) \oplus \ell'(x)(\sigma_1(x) \oplus 1 \oplus \delta_0(x)) = \ell(x)\sigma_1(x) \oplus \epsilon\,\sigma_2(x) \oplus \ell'(x)(\sigma_1(x) \oplus 1)$,
and this second case happens
then to be equivalent, up to the addition of an affine function, to a particular case of the
first.

Remark. The same proof shows that a function (12.14) has algebraic degree at most k if
and only if all the terms of degree at least k + 2 in $\bigoplus_{i=k+1}^{n} \ell_i(x)\sigma_i(x)$ cancel and those
of degree k + 1 are canceled by those from $\ell_k(x)\sigma_k(x)$. The expression of the function
equals then the part of degree at most k in (12.14), which is the part of degree at most k
in $\bigoplus_{i=1}^{k-1} \ell_i(x)\sigma_i(x) \oplus (\eta\,\sigma_1(x) \oplus \epsilon)\sigma_k(x)$, where $\ell_i(x)$ is affine for every i = 1, . . . , k − 1,
and $\epsilon, \eta \in \mathbb{F}_2$. Since the degree k part of $\sigma_1(x)\sigma_k(x)$ equals $\sigma_k(x)$ if k is odd and equals
0 otherwise, we obtain $f(x) = \bigoplus_{i=1}^{k-1} \ell_i(x)\sigma_i(x) \oplus \epsilon\,\sigma_k(x)$, where the $\ell_i$ are affine and
$\epsilon \in \mathbb{F}_2$.
Proposition 193 [306] For every even n ≥ 6, the quadratic bent functions satisfying
$nl_k(f) = 0$ for every k are, up to the addition of an affine function, the functions
$f(x) = \ell(x)\,\sigma_1(x) \oplus \sigma_2(x)$, where $\ell$ is linear and $\ell(1_n) = 0$.

Proof The symplectic form $(x, y) \mapsto f(x + y) \oplus f(x) \oplus f(y) \oplus f(0_n)$ associated with
the function in (12.15) equals

$$\ell(x)\sigma_1(y) \oplus \ell(y)\sigma_1(x) \oplus \epsilon \Big( \bigoplus_{1 \leq j \neq i \leq n} x_j y_i \Big).$$

Denoting $\ell(x) = \bigoplus_{i=1}^{n} l_i x_i$, the kernel

$$E = \{x \in \mathbb{F}_2^n;\ \forall y \in \mathbb{F}_2^n,\ f(x + y) \oplus f(x) \oplus f(y) \oplus f(0_n) = 0\}$$

of this symplectic form is the $\mathbb{F}_2$-vector space of the solutions of the equations:

$$(L_i):\ \ell(x) \oplus l_i \Big( \bigoplus_{j=1}^{n} x_j \Big) \oplus \epsilon \Big( \bigoplus_{j \neq i} x_j \Big) = 0, \quad i = 1, \dots, n.$$

If $\epsilon = 0$, then since the hyperplane of equation $\bigoplus_{j=1}^{n} x_j = 0$ has nontrivial intersection
with the kernel of $\ell$ (because n ≥ 3), and since every element in this intersection satisfies
all equations, f cannot be bent. We assume then that $\epsilon = 1$. For all those $x \in E$ such that
$\bigoplus_{j=1}^{n} x_j = 0$, the equation

$$(L_i + L_{i'}):\ (l_i \oplus l_{i'}) \Big( \bigoplus_{j=1}^{n} x_j \Big) \oplus (x_i \oplus x_{i'}) = 0,$$

valid for every $i \neq i'$, results in $x_i \oplus x_{i'} = 0$ and implies that either all $x_i$ are null (in which
case $(L_i)$ is of course satisfied), or all are equal to 1, in which case $(L_i)$ becomes (since n
is even) $\ell(1_n) = 1$. Hence, $\epsilon = 1$ and $\ell(1_n) = 0$ is a necessary condition for the function
to be bent. It is also sufficient, because for all $x \in E$ such that $\bigoplus_{j=1}^{n} x_j = 1$, according
to Equation $(L_i + L_{i'})$ again, all those $x_i$ such that $l_i = 0$ are equal to some value $\eta \in \mathbb{F}_2$
and all those $x_i$ such that $l_i = 1$ are equal to another value, which can be only η ⊕ 1, since
$\bigoplus_{j=1}^{n} x_j = 1$, and the number of i such that $l_i = 1$ is odd, that is, $\ell(1_n) = 1$, a contradiction.
This completes the proof.
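For n = 6 the bentness criterion of Proposition 193 can be checked by brute force through the Walsh transform (a function on an even number n of variables is bent when all Walsh coefficients have absolute value $2^{n/2}$). The sketch below is illustrative only; `make_f` and `is_bent` are hypothetical helper names, and the linear function $\ell$ is passed as its coefficient vector $(l_1, \dots, l_n)$.

```python
from itertools import product
from math import comb

def make_f(l):
    """f(x) = l(x) * sigma_1(x) xor sigma_2(x), with l(x) the XOR of l_i * x_i."""
    def f(x):
        w = sum(x)                                    # Hamming weight of x
        lin = sum(li & xi for li, xi in zip(l, x)) & 1
        return (lin & (w & 1)) ^ (comb(w, 2) & 1)     # sigma_1 = w mod 2, sigma_2 = C(w,2) mod 2
    return f

def is_bent(f, n):
    """True if every Walsh coefficient of f has absolute value 2^(n/2)."""
    points = list(product((0, 1), repeat=n))
    target = 1 << (n // 2)
    for a in points:
        s = sum((-1) ** (f(x) ^ (sum(ai & xi for ai, xi in zip(a, x)) & 1)) for x in points)
        if abs(s) != target:
            return False
    return True

print(is_bent(make_f((0, 0, 0, 0, 0, 0)), 6))   # l = 0, so l(1_n) = 0
print(is_bent(make_f((1, 1, 0, 0, 0, 0)), 6))   # l(1_n) = 0
print(is_bent(make_f((1, 0, 0, 0, 0, 0)), 6))   # l(1_n) = 1
```

According to the proposition, the first two functions should be bent and the third should not, since in the third case $1_n$ lies in the kernel of the symplectic form.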

It is shown in [306] that for the direct sum of any n-variable function f and any m-variable
function g, we have

$$nl_{E_{n+m,k}}(f \oplus g) \geq \sum_{i=0}^{k} \binom{n}{i}\, nl_{E_{m,k-i}}(g) + \sum_{i=0}^{k} nl_{E_{n,i}}(f) \left( \binom{m}{k-i} - 2\,nl_{E_{m,k-i}}(g) \right).$$

We refer to this reference for the proof.

Algebraic immunity The majority function being, like every symmetric function, con-
stant on all inputs of the same Hamming weight, and having optimal algebraic immunity, it
is an extreme example of degradation of the algebraic immunity when inputs are restricted
to $E_{n,k}$. We call algebraic immunity with inputs in E of a given Boolean function f the
nonnegative integer:

$$AI_E(f) = \min\{\max(d_{alg}(g),\ d_{alg}((fg)_{|E}));\ g \not\equiv 0 \text{ on } E\}$$
$$\phantom{AI_E(f)} = \min\{d_{alg}(g);\ (fg)_{|E} \equiv 0 \text{ or } ((f \oplus 1)g)_{|E} \equiv 0;\ g \not\equiv 0 \text{ on } E\},$$

where $d_{alg}((fg)_{|E})$ equals the minimum algebraic degree of Boolean functions over $\mathbb{F}_2^n$ that
coincide with fg over E, and we call annihilators of f over E the functions g such that
$(fg)_{|E} \equiv 0$.
The equality between these two minima is shown easily: if g and h = fg achieving the
former minimum coincide on E, we have then g ⊕ h = g(f ⊕ 1) = 0 on E, where g has
nonzero restriction to E, and if they do not, then after multiplication of equality h = fg
by f, we have (g ⊕ h)f = 0, where g ⊕ h has nonzero restriction to E; this proves that
the former minimum is bounded below by the latter. The inequality in the other direction is
even more obvious since the set over which the latter minimum is taken is a subset of the set over
which the former is taken.

Remark. Taking the restriction to E may (often) decrease the algebraic immunity (we
shall see examples below) since it weakens the condition on g to be an annihilator of f or of
f ⊕ 1, but it may also increase the algebraic immunity, because it strengthens the condition
on g to be nonzero. Take for instance an (n−1)-variable function f of algebraic immunity at
least 2, and define $f'(x, 0) = f(x)$, $f'(x, 1) = 0$, for every $x \in \mathbb{F}_2^{n-1}$. Then the indicator of
$\mathbb{F}_2^{n-1} \times \{1\}$ being an annihilator, we have $AI(f') = 1$, while $AI_{\mathbb{F}_2^{n-1} \times \{0\}}(f') = AI(f) \geq 2$.
Moreover, for the same reason, we have $AI_k(f') = 1$ for every k ∈ {1, . . . , n − 1}, while for
some functions f, we can have $AI_k(f) \geq 2$ for some k.

The upper bound of Proposition 26, page 92, has been adapted to the algebraic immunity
with inputs in E.

Proposition 194 [306] Let $E \subseteq \mathbb{F}_2^n$ and let f be defined over E. Let d and e be
nonnegative integers. Let $M_{d,E}$ be the $\big(\sum_{i=0}^{d} \binom{n}{i}\big) \times |E|$ matrix whose term at row indexed
by $u \in \mathbb{F}_2^n$ such that $w_H(u) \leq d$, and at column indexed by $x \in E$, equals $\prod_{i=1}^{n} x_i^{u_i}$.
If $\mathrm{rank}(M_{d,E}) + \mathrm{rank}(M_{e,E}) > |E|$, then there exist two Boolean functions g and h on
E, such that g is not identically null on E and

$$d_{alg}(g) \leq e, \quad d_{alg}(h) \leq d \quad \text{and} \quad fg = h \text{ on } E.$$

We have then

$$AI_E(f) \leq \min\Big\{e;\ \mathrm{rank}(M_{e,E}) > \frac{|E|}{2}\Big\}. \qquad (12.17)$$

Proof By definition, $\mathrm{rank}(M_{d,E})$ equals the maximum size of a free family $\mathcal{F}_d$ of
restrictions to E of monomials $x^u$ of algebraic degree $w_H(u) \leq d$ (such a family generates
the restrictions to E of the Boolean functions of algebraic degree at most d) and |E|
is the dimension of the $\mathbb{F}_2$-vector space of Boolean functions over E. If $\mathrm{rank}(M_{d,E}) +
\mathrm{rank}(M_{e,E}) > |E|$, the elements of $\mathcal{F}_d$ and the products between f and the elements of a
maximum size free family $\mathcal{F}_e$ are necessarily $\mathbb{F}_2$-linearly dependent. Gathering the part of
this linear combination dealing with the elements of $\mathcal{F}_d$ and those dealing with $\mathcal{F}_e f$, this
linear dependence gives two functions h and g of degrees at most d and e, respectively,
such that $(fg)_{|E} = h_{|E}$ and $(g_{|E}, h_{|E}) \not\equiv (0, 0)$, i.e., $g_{|E} \not\equiv 0$. Inequality (12.17) is then
straightforward by taking d = e.

In the case of fixed input weights, a recurrence relation on the rank of $M_{d,E_{n,k}}$ has been
found in [306] (we refer to this reference for the proof, which is a little too long to be given),
where it has been deduced that this rank equals

$$\binom{n}{\min(d, k, n-k)}.$$

For k ≤ n/2, Relation (12.17) implies then that, for every n-variable Boolean function f:

$$AI_{E_{n,k}}(f) \leq \min\Big\{e;\ 2\binom{n}{e} > \binom{n}{k}\Big\}.$$

It is deduced in this same reference (by technical observations dealing with binomial
coefficients) that the best possible algebraic immunity of a function with constrained input
Hamming weight is lower than for unconstrained functions.
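The rank formula can be checked numerically for small parameters. The sketch below builds $M_{d,E_{n,k}}$ with each row stored as an integer bitmask, computes its rank over $\mathbb{F}_2$ by Gaussian elimination, and evaluates the resulting upper bound on $AI_{E_{n,k}}(f)$; all function names are illustrative, and the exhaustive construction is only meant for small n.

```python
from itertools import combinations
from math import comb

def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    basis = {}                       # leading bit position -> reduced row
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in basis:
                basis[h] = r
                break
            r ^= basis[h]
    return len(basis)

def monomial_matrix(n, k, d):
    """Rows: monomials x^u with w_H(u) <= d; columns: points of the slice E_{n,k}."""
    points = [sum(1 << i for i in s) for s in combinations(range(n), k)]
    rows = []
    for deg in range(d + 1):
        for u in combinations(range(n), deg):
            mask = sum(1 << i for i in u)
            row = 0
            for col, x in enumerate(points):
                if x & mask == mask:             # x^u = 1 iff supp(u) is included in supp(x)
                    row |= 1 << col
            rows.append(row)
    return rows

def ai_upper_bound(n, k):
    """Smallest e with 2*C(n, e) > C(n, k), for k <= n/2 (bound derived from (12.17))."""
    assert 2 * k <= n
    return next(e for e in range(n + 1) if 2 * comb(n, e) > comb(n, k))

for n, k, d in [(4, 2, 1), (6, 3, 2), (6, 2, 1)]:
    print((n, k, d), gf2_rank(monomial_matrix(n, k, d)), comb(n, min(d, k, n - k)))
```

For n = 4 and k = 2, for example, the bound gives 1, whereas unrestricted 4-variable functions can reach algebraic immunity 2.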
A lower bound exists on the algebraic immunity of the direct sum of two Boolean
functions. We recall the proof from [306]:

Proposition 195 Let $(f \oplus g)(x, y) = f(x) \oplus g(y)$, $x \in \mathbb{F}_2^n$, $y \in \mathbb{F}_2^m$, where n ≤ m. Let k
be such that n ≤ k ≤ m. Then the following relation holds:

$$AI_k(f \oplus g) \geq AI(f) - d_{alg}(g). \qquad (12.18)$$

Proof Let h(x, y) be a nonzero annihilator of f ⊕ g over $E_{n+m,k}$. Let $(a, b) \in \mathbb{F}_2^{n+m}$
have Hamming weight k and be such that h(a, b) = 1. Since (a, b) has Hamming weight
k with n ≤ k ≤ m, we may, up to changing the order of the coordinates of b (and without
loss of generality), assume that, for every j = 1, . . . , n, we have $b_j = a_j \oplus 1$ and for
every j = n + 1, . . . , k, we have $b_j = 1$ (so that for every j = k + 1, . . . , m, we have
$b_j = 0$). This is possible since k ≥ n and in all cases, the last 1 in (a, b) is at position
2n + (k − n) = n + k ≤ n + m. We define the following affine function over $\mathbb{F}_2^n$:

$$L(x) = (x_1 \oplus 1, x_2 \oplus 1, \dots, x_n \oplus 1, 1, \dots, 1, 0, \dots, 0),$$

where the length of the part “1, . . . , 1” equals k − n. We have L(a) = b. The n-variable
function h(x, L(x)) is then nonzero and is an annihilator of f(x) ⊕ g(L(x)) over $\mathbb{F}_2^n$. If
g(b) = 0, then function h(x, L(x)) (g(L(x)) ⊕ 1) is a nonzero annihilator of f and has
algebraic degree at most $d_{alg}(h) + d_{alg}(g)$; then we have $d_{alg}(h) + d_{alg}(g) \geq AI(f)$. If
g(b) = 1, then by applying the same reasoning to f ⊕ 1 instead of f and g ⊕ 1 instead of
g, we have $d_{alg}(h) + d_{alg}(g) \geq AI(f)$. If h(x, y) is a nonnull annihilator of f ⊕ g ⊕ 1 over
$E_{n+m,k}$, we have the same conclusion by replacing f by f ⊕ 1 or g by g ⊕ 1. This completes
the proof.

Bound (12.18) may seem loose because of the presence of $-d_{alg}(g)$, but it is not. Let
us see with an example (given in [306]) that making the direct sum with some nonconstant
Boolean functions g may indeed contribute to a decrease of the algebraic immunity over
inputs of fixed Hamming weight: take n odd, $f(x) = 1 \oplus maj(x)$, where maj is the majority
function over n variables (which has optimal algebraic immunity $\frac{n+1}{2}$) and $g(y) = maj(y)$
over n variables as well. Then the 2n-variable function f ⊕ g is null at fixed input weight
n, because if $w_H(x) + w_H(y) = n$, then either $w_H(x) \leq \frac{n-1}{2}$ and $w_H(y) \geq \frac{n+1}{2}$, and
we have then f(x) = g(y) = 1, or $w_H(x) \geq \frac{n+1}{2}$ and $w_H(y) \leq \frac{n-1}{2}$, and we have then
f(x) = g(y) = 0. The algebraic immunity with input weight n equals then 0.
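This degenerate example is simple to confirm by enumeration for small odd n. The following sketch, with illustrative helper names, collects the values taken by $(1 \oplus maj(x)) \oplus maj(y)$ over all inputs (x, y) of total Hamming weight n.

```python
from itertools import combinations

def maj(bits):
    """Majority function: 1 iff more than half of the bits are 1."""
    return int(2 * sum(bits) > len(bits))

def direct_sum_values_on_slice(n):
    """Set of values of (1 ^ maj(x)) ^ maj(y) over all (x, y) in E_{2n,n}, for odd n."""
    values = set()
    for support in combinations(range(2 * n), n):
        v = [0] * (2 * n)
        for s in support:
            v[s] = 1
        x, y = v[:n], v[n:]
        values.add((1 ^ maj(x)) ^ maj(y))
    return values

print(direct_sum_values_on_slice(3))   # the function vanishes on the slice of weight n
print(direct_sum_values_on_slice(5))
```

Since the restricted function is identically zero, the constant 1 is an annihilator of degree 0 on this slice, which is the situation described in the text.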
Bound (12.18) also shows that, if k ≥ n, then taking g = 0 (i.e., adding m ≥ k
virtual variables to f ) gives AIk (f ⊕ 0) ≥ AI (f ); this latter bound is tight (take for f a
function whose algebraic immunity equals its algebraic degree). Another example showing
the tightness of Bound (12.18) when dalg (g) = 1 is also given in this same reference.
It is shown in [306] that for the direct sum of any n-variable function f and any m-variable
function g, we have

$$AI_k(f \oplus g) \geq \min_{0 \leq \ell \leq k} \big( \max[AI_{\ell}(f),\ AI_{k-\ell}(g)] \big).$$

We refer to this reference for the proof.

Impact of Boolean functions with restricted input on FLIP


Balancedness For given k, let $p_k = \Pr_{x \in E_{n,k}}[f(x) = 1] = \frac{1}{2} - \epsilon_k$. The amount of data
needed for an attacker to detect the bias $\epsilon_k$ is equal to $\frac{1}{\epsilon_k^2}$. In the case of the FLIP cipher, we
have $k = \frac{n}{2}$. It has been checked in [306] that the bias is not exploitable, even in the case of
guess and determine attacks.

Nonlinearity In a fast correlation attack (approximating the keystream equations by linear
approximation of the filtering function and using a decoding method), the attacker builds
a linear system that can be seen as an instance of the learning parity with noise (LPN)
problem [99], where the noise parameter is $\eta_k = \frac{nl_{E_{n,k}}(f)}{\binom{n}{k}}$. The data complexity of the attack
is $O(2^h \eta_k^{-2(r+1)})$, where the parameters h and r depend on the algorithm used and on the
number of variables. It could be shown in [306] that $nl_{E_{n,n/2}}(f)$ is large enough for allowing
the FLIP cipher to resist the fast correlation attack, combined or not with a guess and
determine attack.

Algebraic immunity Proposition 195 has allowed bounding $nl_{E_{n,n/2}}(f)$ from below, with
the help of the next proposition.

Proposition 196 Let f(x) be an n-variable Boolean function such that

$$\forall x \in \mathbb{F}_2^{n-2}, \quad f(0, 0, x) = f(0, 1, x) = f(1, 0, x).$$

Let $f'(X, x_3, \dots, x_n)$ be the Boolean function in n − 1 variables defined by

$$\forall x \in \mathbb{F}_2^{n-2}, \quad f'(1, x) = f(1, 1, x) \ \text{ and } \ f'(0, x) = f(0, 0, x).$$

If $AI(f) \le d$, then $AI(f') \le d$.

This proposition, whose proof can be found in [306], implies that if f is the direct sum
of d monomials and if, for every i ∈ [k, d], f has a monomial of degree i, where k is the
smallest degree of all monomials of f , then AI (f ) = d. This made it possible to show that all
instances of the FLIP cipher resist the algebraic attack. Determining whether they resist the
algebraic attack combined with the guess and determine attack is open. An interesting point
observed in [306] is that the high number of triangular functions used in FLIP to prevent the
guess and determine attack combined with fast algebraic attack may reduce the algebraic
immunity, and there is then a trade-off to be found.

12.3 Local pseudorandom generators and related criteria on Boolean functions


Recall that the principle of pseudorandom generators is to allow expanding short random
strings (like private keys), called seeds, into pseudorandom strings, whose length is
significantly larger (say, polynomial, that is, in $O(n^s)$, where n is the length of the seed,
with s > 1). They are called local if each output bit depends on a constant number d
of input bits. This property, related to the design of cryptographic primitives that can be
evaluated in constant time while using polynomially many cores, allows a wide variety of
applications. The only known example of a local pseudorandom generator is the so-called
Goldreich’s PRG, which applies a simple d-variable Boolean function (Goldreich calls it a
d-ary predicate) to public random subsets of size d of the seed.

12.3.1 The Goldreich pseudorandom generator


In [541], Goldreich proposed a one-way function (OWF), which is an asymptotic construction
aiming at being a “simplest possible function that we do not know how to invert
efficiently.” This OWF is built as a random local function (see below) and was later modified
into a local pseudorandom generator [543] with nice applications (making possible, with
constant computational overhead, a secure two-party computation of any Boolean circuit,
and having other applications; see [393, 636]).

Let n and m be two integers, let $(S_1, \dots, S_m)$ be a list of m subsets of {1, . . . , n} of size
d, where d is small compared to n (it can be logarithmic in n or even constant), and let f be
a Boolean function in d variables (the so-called predicate). The corresponding Goldreich’s
function $G : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is defined as $G(x) = (f(S_1(x)), f(S_2(x)), \dots, f(S_m(x)))$ for every
$x \in \mathbb{F}_2^n$, where $S_i(x)$ is a vector made of those bits of x indexed by $S_i$. Originally, m was of
size comparable to n. The one-way property of the function was related to the choice of the
predicate and to the fact that $(S_1, \dots, S_m)$ was an expander graph³¹, which corresponds to
saying, in the particular framework we are in, that for some k, every k subsets cover $k + \Omega(n)$
elements of {1, . . . , n} (which happens to be the case with probability tending to 1 for subsets
drawn at random).
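To make the definition of G concrete, here is a minimal Python sketch. It assumes that each subset $S_i$ is stored as an ordered tuple of 0-based indices (the order matters as soon as the predicate is not symmetric), and the helper names `goldreich`, `random_subsets` and `xor3_and2` are illustrative choices, not notation from [541] or [543]. The predicate used is the five-variable function $x_1 \oplus x_2 \oplus x_3 \oplus x_4x_5$ discussed in this section.

```python
import random

def goldreich(x, subsets, predicate):
    """G(x) = (f(S_1(x)), ..., f(S_m(x))) for a seed x given as a list of bits."""
    return [predicate([x[j] for j in S]) for S in subsets]

def random_subsets(n, m, d, rng):
    """m public subsets of size d of {0, ..., n-1}, drawn uniformly (illustration only)."""
    return [tuple(rng.sample(range(n), d)) for _ in range(m)]

def xor3_and2(b):
    """Five-variable predicate x1 + x2 + x3 + x4*x5 over F_2."""
    return b[0] ^ b[1] ^ b[2] ^ (b[3] & b[4])

rng = random.Random(2020)                 # fixed seed so that the sketch is reproducible
n, m, d = 16, 40, 5
subsets = random_subsets(n, m, d, rng)    # public part of the generator
seed = [rng.randrange(2) for _ in range(n)]
output = goldreich(seed, subsets, xor3_and2)
print(len(output), output[:8])
```

This sketch does not test whether the drawn subsets are sufficiently expanding, and the parameters are far too small to be of any cryptographic significance.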
Goldreich’s pseudorandom generator proposed later takes a larger value of m, polynomial
in n. The integer d is called the locality of the PRG, and many works have focused on a
framework called “polynomial-stretch local” in which d is constant and $m = n^s$, where s >
1 (s is called the stretch). For these polynomial-stretch local PRG, the security is considered
asymptotically, relative to the class of polynomial adversaries as linear distinguishers. They
are conjectured secure under some necessary conditions on the predicate and on the subsets
Si (see the survey [22] and, for a faster overview, [393, section 1.2]). For instance, to avoid
an attack by Gaussian elimination, the predicate f must be nonlinear in a basic sense.
Moreover, the higher is the algebraic degree, the better, since a random local function with
a predicate of algebraic degree s cannot be pseudorandom for a stretch as large as s. The
predicate must also be such that, when fixing some number r of input bits to f , its algebraic
degree remains large. Note that this has a close relation with algebraic immunity (since
if the algebraic degree of f falls down to k when r input bits are fixed, we know that
AI (f ) ≤ r + k) and algebraic attacks have been actually further investigated, and the
algebraic immunity AI (f ) (sometimes called rational degree among people working on
Goldreich’s PRG) happens to play a direct role and should be large enough (larger than s).
There is also an attack [889] when the output of the function is correlated with a number
of its input bits smaller than or equal to 2s , and f should then be resilient with a sufficient
order (all the more since this attack has been extended to cases where m ≥ λn for large
λ); in [915] the authors show that f should be (at least) 2-resilient. The very simple five-
variable function f (x) = x1 ⊕ x2 ⊕ x3 ⊕ x4 x5 (whose structure is similar to that of the FLIP
function, but simpler and with considerably smaller parameters) has been proved resisting
some attacks based on $\mathbb{F}_2$-linear distinguishers when $m \in O(n^s)$ with s < 1.5 (note that
its algebraic degree, resiliency order, and algebraic immunity are all three equal to 2), but
of course does not resist the attacks evoked above for larger stretches, nor does it resist
algebraic attacks. A general structure has been proposed in [23] for predicates: the direct sum of the
full linear function $\bigoplus_{i=1}^{k} x_i$ and of the majority function (see page 335) in n − k variables.
No attack is known on such functions when $k \ge 2s$ and $\lceil \frac{n-k}{2} \rceil \ge s$.
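The three parameters quoted above for the predicate $x_1 \oplus x_2 \oplus x_3 \oplus x_4x_5$ can be recomputed by brute force. The sketch below is only an illustration under the following assumptions: inputs are encoded as integers (bit j holds $x_{j+1}$), the resiliency order is read from the Walsh transform (Xiao–Massey characterization), and the algebraic immunity is obtained by looking for nonzero annihilators of f and of $f \oplus 1$ through a rank computation over $\mathbb{F}_2$. All function names are hypothetical.

```python
from itertools import combinations

N = 5

def predicate(x):
    """f(x) = x1 + x2 + x3 + x4*x5 over F_2; bit j of the integer x holds x_{j+1}."""
    b = [(x >> j) & 1 for j in range(N)]
    return b[0] ^ b[1] ^ b[2] ^ (b[3] & b[4])

def algebraic_degree(tt, n):
    """Degree of the ANF, obtained with the binary Moebius transform."""
    a = list(tt)
    for j in range(n):
        for x in range(1 << n):
            if (x >> j) & 1:
                a[x] ^= a[x ^ (1 << j)]
    return max((bin(u).count("1") for u in range(1 << n) if a[u]), default=-1)

def walsh(tt, n, a):
    """W_f(a) = sum over x of (-1)^(f(x) + a.x)."""
    return sum(1 - 2 * (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(1 << n))

def resiliency_order(tt, n):
    """Largest t with W_f(a) = 0 for all a of weight <= t (-1 if f is unbalanced)."""
    t = -1
    for w in range(n + 1):
        if any(walsh(tt, n, a) for a in range(1 << n) if bin(a).count("1") == w):
            break
        t = w
    return t

def gf2_rank(rows):
    """Rank over F_2 of rows encoded as integers (Gaussian elimination)."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def has_annihilator(points, n, d):
    """True iff a nonzero function of degree <= d vanishes on all the given points."""
    monomials = [sum(1 << j for j in c)
                 for k in range(d + 1) for c in combinations(range(n), k)]
    rows = [sum(1 << i for i, u in enumerate(monomials) if x & u == u) for x in points]
    return gf2_rank(rows) < len(monomials)

def algebraic_immunity(tt, n):
    """Minimum degree of a nonzero annihilator of f or of f + 1."""
    ones = [x for x in range(1 << n) if tt[x]]
    zeros = [x for x in range(1 << n) if not tt[x]]
    for d in range(n + 1):
        if has_annihilator(ones, n, d) or has_annihilator(zeros, n, d):
            return d

tt = [predicate(x) for x in range(1 << N)]
params = (algebraic_degree(tt, N), resiliency_order(tt, N), algebraic_immunity(tt, N))
print(params)
```

The triple printed should be (2, 2, 2), in agreement with the values stated in the text. The same helpers can be applied to other small predicates by changing `predicate` and `N`.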
Of course, constraints also exist on the choice of the subsets (S1 , . . . , Sm ), more precisely
on the hypergraph (see page 70) given by them, which needs to be sufficiently expanding,
but in practice, an overwhelming proportion of hypergraphs are sufficiently expanding.

31 A hypergraph whose hyperedges have size d is said to be (α, β)-expanding if, for any choice of k ≤ αm edges,
their union has size at least βkd.

As we can see, Goldreich’s PRG gives one more example (and an important one) where
the classical notions on Boolean functions for cryptography play central roles in frameworks
different from stream or block ciphers. An open question is asked in this context by
Applebaum and Lovett in [23]: given two positive integers e and k, what is the smallest
number of variables (the reference writes “the smallest arity”) for which there exists a
Boolean function (a predicate) of algebraic immunity (of rational degree) at least e and
of resiliency order at least k? The parameters of the direct sum of the full linear function
$\bigoplus_{i=1}^{k} x_i$ and of the majority function in 2e − 1 variables show that this number is at most
k + 2e − 1 as observed in [838] (Applebaum and Lovett give the bound k + 2e and propose
as example the direct sum of function $\bigoplus_{i=1}^{k} x_i$ and of the majority function in 2e variables;
note that this very function is not k-resilient, at least in the usual sense, because the majority
function in 2e variables is not balanced, but they probably think of a majority function
modified into a balanced function with the same AI). As we can see, the upper bound k + 2e
is not optimal, and even k + 2e − 1 may not be optimal.
This problem is clearly related to another open question: is there, for k ≤ n − 2, an
upper bound on the algebraic immunity of k-resilient functions that would be sharper than
$\min(n - k - 1, \lceil \frac{n}{2} \rceil)$ (implied by the Siegenthaler bound and the Courtois–Meier bound)? We
specify k ≤ n − 2 because for k = n − 1, Siegenthaler’s bound gives 1 and not 0 (and the
two (n−1)-resilient functions equal to the full linear n-variable function and its complement
have algebraic immunity 1). For very large values of k, the reply to the latter open question
is probably no (for instance, for k = n − 2, it is clearly no, and for k = n − 3, there exist
(n−3)-resilient functions of algebraic immunity 2, which are easy to obtain with Maiorana–
McFarland’s construction; an example is the direct sum of the full linear (n − 2)-variable
function and of the two-variable majority function xn−1 xn ). Note that many infinite classes
of 1-resilient functions of optimal algebraic immunity have been found in even numbers of
variables, but as far as we know, none has been found being 2-resilient, nor 1-resilient in
odd numbers of variables (recall that, if f (x) is a 1-resilient function with optimal AI in odd
number n of variables, then f (x) ⊕ xn+1 is a 2-resilient function with optimal AI in n + 1
variables). So already for k = 2, the question is open.

12.4 The Gowers norm on pseudo-Boolean functions


The Gowers uniformity norm was introduced in 2001 in the paper [568], which
provided a new proof of a result originally shown by van der Waerden, on the existence,
for any positive integers k, r, of a positive integer M such that, in any r-partition of
{1, 2, . . . , M}, there exists at least one class containing an arithmetic progression of length k.
We shall not try to summarize the content of this rather dense 129-page long paper (available
on the internet), in which the norm was defined for functions over Z/N Z. The Gowers norm
can also be expressed, as in [571], in terms of pseudo-Boolean functions (see below). Since
2001, the Gowers norm has been intensively studied and applied in additive combinatorics
and in the probabilistic testing of specific properties of Boolean functions (knowing only a
few of their values; see the thesis [361], and see also [13], where the Reed–Muller codes
are addressed, and [542]). When applied to the sign function of a Boolean function f , it deals, as we
shall see, with the higher-order derivatives of f (whose definition and notation have been
given at page 39). It results in a measure related to the higher-order nonlinearity. We shall

see with Corollary 32 below that the smaller the Gowers $U_k$ norm of f , the higher the
contribution of f to the resistance to attacks by approximations by Boolean functions of
algebraic degree at most k − 1 (these attacks are listed after Definition 20, page 83). The
definition of the Gowers Uk norm, valid for all pseudo-Boolean functions, is as follows:

Definition 90 [568, 569, 571] Let k, n be positive integers such that k < n. Let $\varphi : \mathbb{F}_2^n \to \mathbb{R}$
be a pseudo-Boolean function. The k-th order Gowers uniformity norm of ϕ equals:

$$\|\varphi\|_{U_k} = \left( E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k\}} \varphi\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^{\frac{1}{2^k}},$$

where $E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n}$ is the notation for arithmetic mean (i.e., for expectation in uniform
probability).

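Note that, by considering separately the cases where S does not contain k
and those where it does, and using that $(x, x_k) \mapsto (x, x + x_k)$ is a permutation
of $(\mathbb{F}_2^n)^2$, the expression $E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n}\big[\prod_{S \subseteq \{1,\dots,k\}} \varphi\big(x + \sum_{i \in S} x_i\big)\big]$ is equal to
$E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n}\Big[\big(E_x\big[\prod_{S \subseteq \{1,\dots,k-1\}} \varphi\big(x + \sum_{i \in S} x_i\big)\big]\big)^2\Big]$. Being then always nonnegative,
the expression does admit a $2^k$-th root.
Equality

$$\|\varphi\|_{U_k} = \left( E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ \left( E_x \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^2 \right] \right)^{\frac{1}{2^k}} \tag{12.19}$$

has its own interest. For k = 1, it shows that

$$\|\varphi\|_{U_1} = \big| E_{x \in \mathbb{F}_2^n}[\varphi(x)] \big| = 2^{-n} \Big| \sum_{x \in \mathbb{F}_2^n} \varphi(x) \Big| \tag{12.20}$$

(which is then not a norm), and for k = 2 that

$$\|\varphi\|_{U_2} = \Big( E_{x_1 \in \mathbb{F}_2^n} \Big[ \big( E_x[\varphi(x)\varphi(x + x_1)] \big)^2 \Big] \Big)^{\frac{1}{4}}.$$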

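Another identity is also useful:

$$\big(\|\varphi\|_{U_k}\big)^{2^k} = E_{x_k \in \mathbb{F}_2^n} \left[ E_{x, x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \psi_{x_k}\Big(x + \sum_{i \in S} x_i\Big) \right] \right], \tag{12.21}$$

where $\psi_{x_k}(x) = \varphi(x)\varphi(x + x_k)$.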



Proposition 197 For every pseudo-Boolean function ϕ, the sequence $\big(\|\varphi\|_{U_k}\big)_{k \ge 1}$ is
nondecreasing:
$$\|\varphi\|_{U_1} \le \|\varphi\|_{U_2} \le \cdots \le \|\varphi\|_{U_k} \le \cdots \tag{12.22}$$

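This is due to the inequality

$$E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right] \le \left( E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ \left( E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^2 \right] \right)^{\frac{1}{2}},$$

applied with $\varphi_S = \varphi$ for every S (the left-hand side then equals $\|\varphi\|_{U_{k-1}}^{2^{k-1}}$ and the right-hand
side equals $\|\varphi\|_{U_k}^{2^{k-1}}$), which is a direct consequence of the Cauchy–Schwarz inequality, since it is
equivalent to inequality

$$\left( \sum_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^2 \le 2^{(k-1)n} \sum_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left( E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^2 .$$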
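Definition 90, Relation (12.20) and the monotonicity (12.22) can be observed numerically on a small example. The following Python sketch evaluates the norm directly from the definition; it assumes that elements of $\mathbb{F}_2^n$ are encoded as integers (so that addition is bitwise XOR), and the test function `phi` is an arbitrary choice made for illustration.

```python
from itertools import product

def gowers_norm(phi, n, k):
    """U_k norm of phi (a list of 2**n reals indexed by x in F_2^n, x as an integer)."""
    size = 1 << n
    total = 0.0
    for x in range(size):
        for xs in product(range(size), repeat=k):
            term = 1.0
            for mask in range(1 << k):          # mask encodes the subset S of {1,...,k}
                shift = 0
                for i in range(k):
                    if (mask >> i) & 1:
                        shift ^= xs[i]          # sum of the x_i, i in S (XOR in F_2^n)
                term *= phi[x ^ shift]
            total += term
    mean = total / size ** (k + 1)              # expectation over (x, x_1, ..., x_k)
    return max(mean, 0.0) ** (1.0 / (1 << k))   # clamp tiny negative rounding errors

n = 4
phi = [((5 * x * x + 3 * x) % 11 - 5) / 5 for x in range(1 << n)]   # arbitrary real values
norms = [gowers_norm(phi, n, k) for k in (1, 2, 3)]
u1_closed_form = abs(sum(phi)) / (1 << n)       # Relation (12.20)
print(norms, u1_closed_form)
```

The cost of this direct evaluation grows like $2^{(k+1)n + k}$, so it is only usable for very small n and k.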

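As shown in [568], for every k ≥ 2, $\| \cdot \|_{U_k}$ is a norm. The triangular inequality

$$\|\varphi + \psi\|_{U_k} \le \|\varphi\|_{U_k} + \|\psi\|_{U_k}$$

can be checked as follows: expanding $\prod_{S \subseteq \{1,\dots,k\}} (\varphi + \psi)\big(x + \sum_{i \in S} x_i\big)$ leads to $2^{2^k}$ terms
of the form $\prod_{S \subseteq \{1,\dots,k\}} \varphi_S\big(x + \sum_{i \in S} x_i\big)$, where each function $\varphi_S$ is either ϕ or ψ; for each
of these terms, we have, using the Cauchy–Schwarz inequality and Relation (12.19) (in both
ways):

$$\begin{aligned}
& \left| E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right| \\
&= \left| E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ \left( E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right) \cdot \left( E_{x' \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_{S \cup \{k\}}\Big(x' + \sum_{i \in S} x_i\Big) \right] \right) \right] \right| \\
&\le \left( E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ \left( E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_S\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^2 \right] \right)^{\frac{1}{2}} \cdot \left( E_{x_1, \dots, x_{k-1} \in \mathbb{F}_2^n} \left[ \left( E_{x \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k-1\}} \varphi_{S \cup \{k\}}\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^2 \right] \right)^{\frac{1}{2}} \\
&= \left( E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k\}} \varphi_{S \setminus \{k\}}\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^{\frac{1}{2}} \cdot \left( E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n} \left[ \prod_{S \subseteq \{1,\dots,k\}} \varphi_{S \cup \{k\}}\Big(x + \sum_{i \in S} x_i\Big) \right] \right)^{\frac{1}{2}} \le \cdots \\
&\le \prod_{S \subseteq \{1,\dots,k\}} \left| E_{x, x_1, \dots, x_k \in \mathbb{F}_2^n} \left[ \prod_{S' \subseteq \{1,\dots,k\}} \varphi_S\Big(x + \sum_{i \in S'} x_i\Big) \right] \right|^{\frac{1}{2^k}},
\end{aligned}$$

resulting in an upper estimate by $\|\varphi\|_{U_k}^{r} \|\psi\|_{U_k}^{2^k - r}$, where r and $2^k - r$ are the numbers
of times that $\varphi_S$ equals ϕ and ψ respectively, and this proves that
$\big(\|\varphi + \psi\|_{U_k}\big)^{2^k} \le \sum_{r=0}^{2^k} \binom{2^k}{r} \|\varphi\|_{U_k}^{r} \|\psi\|_{U_k}^{2^k - r}$, that is, $\|\varphi + \psi\|_{U_k} \le \|\varphi\|_{U_k} + \|\psi\|_{U_k}$.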

Let now f be an n-variable Boolean function. We can consider the Uk norm of the sign
function fχ = (−1)f or that of f itself, viewed as a function from Fn2 to {0, 1}. In the former
case, which is the most studied one, we have from the very definition:

Proposition 198 Let k, n be positive integers such that k < n. Let f be an n-variable
Boolean function. Then $\|f_\chi\|_{U_k}$ equals the $2^k$-th root of the average value
of $2^{-n} F(D_{a_1} D_{a_2} \cdots D_{a_k} f)$, where $F(g) = \sum_{x \in \mathbb{F}_2^n} (-1)^{g(x)}$, when $a_1, a_2, \dots, a_k$ range
independently over $\mathbb{F}_2^n$.

For any k ≥ 1 and any Boolean function f , according to Proposition 198 and to the fact
that, for every n-variable Boolean function g, we have F (g) ≤ 2n with equality if and only
if g is the null function, we have that ||fχ ||Uk is bounded above by 1, with equality if and
only if all k-th order derivatives of f are null, and we know, according to Proposition 5, page
38, that this is equivalent to saying that f has algebraic degree at most k − 1.
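Proposition 198 and the characterization of the case of equality can be illustrated on a small function. The Python sketch below computes $\|f_\chi\|_{U_k}$ through iterated derivatives; the quadratic bent function $x_1x_2 \oplus x_3x_4$ and the encoding of inputs as integers are choices made for this illustration only.

```python
from itertools import product

def derivative(tt, a):
    """Truth table of D_a f : x -> f(x) XOR f(x + a)."""
    return [tt[x] ^ tt[x ^ a] for x in range(len(tt))]

def gowers_norm_sign(tt, n, k):
    """U_k norm of the sign function (-1)^f, computed as in Proposition 198."""
    size = 1 << n
    total = 0
    for directions in product(range(size), repeat=k):
        g = tt
        for a in directions:
            g = derivative(g, a)
        total += sum(1 - 2 * v for v in g)      # F(g) = sum over x of (-1)^g(x)
    mean = total / size ** (k + 1)              # average of 2^(-n) F(...) over a_1, ..., a_k
    return mean ** (1.0 / (1 << k))

n = 4
# f(x) = x1 x2 XOR x3 x4, where bit j of the index holds x_{j+1}
bent = [((x & 1) & (x >> 1 & 1)) ^ ((x >> 2 & 1) & (x >> 3 & 1)) for x in range(1 << n)]
u2 = gowers_norm_sign(bent, n, 2)
u3 = gowers_norm_sign(bent, n, 3)
print(u2, u3)
```

Since this function has algebraic degree 2, its $U_3$ norm should equal 1 while its $U_2$ norm is strictly smaller (here $2^{-n/4} = 1/2$ up to rounding, as for every bent function in four variables).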
Relation (12.21) shows that $\|f_\chi\|_{U_k}$ satisfies the “recurrence” relation:

$$\|f_\chi\|_{U_k} = \Big( E_{h \in \mathbb{F}_2^n} \Big[ \|(D_h f)_\chi\|_{U_{k-1}}^{2^{k-1}} \Big] \Big)^{\frac{1}{2^k}}. \tag{12.23}$$

This relation can be iterated and shows then again the role of higher-order derivatives.
Note that, for k = 2, according to Proposition 198 and to Relation (3.9), page 98, ||fχ ||U2
is related to the second moment V (f ) of the autocorrelation coefficients by
$$\big(\|f_\chi\|_{U_2}\big)^4 = 2^{-3n}\, V(f), \tag{12.24}$$
and all the observations made at page 98 on V (f ) give then corresponding equalities and
bounds on ||fχ ||U2 (hence, studying the U2 norm has limited interest).
For instance, we have $nl(f) \le 2^{n-1} - 2^{n-1} \big(\|f_\chi\|_{U_2}\big)^2 \le 2^{n-1} - 2^{\frac{3n}{4} - 1} \|f_\chi\|_{U_2}$, with
equality on the left-hand side if and only if f is plateaued and overall equality if and only
if f is bent. We have also, according to Relations (12.24) and (3.10), page 98, that $\|f_\chi\|_{U_2}$
equals the normalized quartic mean of the Walsh transform of f :

$$\|f_\chi\|_{U_2} = 2^{-n} \Big( \sum_{b \in \mathbb{F}_2^n} W_f^4(b) \Big)^{\frac{1}{4}}. \tag{12.25}$$

In fact, it is easily shown that, for any pseudo-Boolean function ϕ, we have

$$\|\varphi\|_{U_2} = 2^{-n} \Big( \sum_{b \in \mathbb{F}_2^n} \widehat{\varphi}^{\,4}(b) \Big)^{\frac{1}{4}}. \tag{12.26}$$
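Relation (12.26) can be compared numerically with the expression of the $U_2$ norm obtained from (12.19) for k = 2. In the Python sketch below, the Fourier transform $\widehat{\varphi}(b) = \sum_x \varphi(x)(-1)^{b \cdot x}$ is computed with a standard fast butterfly; the function names and the sample function `phi` are illustrative assumptions.

```python
def fourier(phi):
    """Fourier-Hadamard transform: b -> sum over x of phi(x) (-1)^(b.x)."""
    a = list(phi)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def u2_direct(phi):
    """U_2 norm from (12.19) with k = 2: fourth root of E_{x1}[(E_x[phi(x) phi(x + x1)])^2]."""
    size = len(phi)
    acc = 0.0
    for x1 in range(size):
        inner = sum(phi[x] * phi[x ^ x1] for x in range(size)) / size
        acc += inner ** 2
    return (acc / size) ** 0.25

def u2_fourier(phi):
    """U_2 norm from Relation (12.26); len(phi) equals 2**n."""
    return sum(c ** 4 for c in fourier(phi)) ** 0.25 / len(phi)

phi = [((5 * x * x + 3 * x) % 11 - 5) / 5 for x in range(16)]   # arbitrary values, n = 4
print(u2_direct(phi), u2_fourier(phi))
```

The two printed values should agree up to floating-point rounding.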

Relations (12.23) and (12.25) and Relation (6.5), page 198, directly imply that a bent
function and its dual have the same U3 norm (as observed in [528]).
Of course, thanks to Relation (12.23) iterated k − 2 times, Relation (12.25) results in a
similar relation between ||fχ ||Uk and the average quartic mean of the Walsh transforms of
the (k − 2)th derivatives of f .
An important property of the Gowers uniformity norm is that $\|\varphi\|_{U_k}$ is an upper bound
for the normalized correlations $\big| E_{x \in \mathbb{F}_2^n}\big[\varphi(x)(-1)^{g(x)}\big] \big|$ between pseudo-Boolean function ϕ
and the sign functions $(-1)^g$ of all Boolean functions g of algebraic degree at most k − 1
(and in fact, between ϕ and a larger set of pseudo-Boolean functions; see below). This is a
direct corollary of Proposition 197:

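Corollary 32 [568, 572] Let k, n be positive integers such that k < n. Let ϕ be an n-
variable pseudo-Boolean function and g an n-variable Boolean function of algebraic degree
at most k − 1. Then:

$$\big| E_{x \in \mathbb{F}_2^n}\big[\varphi(x)(-1)^{g(x)}\big] \big| \le \|\varphi\|_{U_k}.$$

Indeed, according to Relation (12.22), we have that $\big| E_{x \in \mathbb{F}_2^n}\big[\varphi(x)(-1)^{g(x)}\big] \big| = \|\varphi(-1)^g\|_{U_1}
\le \|\varphi(-1)^g\|_{U_k} = \|\varphi\|_{U_k}$, the last equality being due to the fact that
$\prod_{S \subseteq \{1,\dots,k\}} (-1)^{g(x + \sum_{i \in S} x_i)} = (-1)^{D_{x_1} \cdots D_{x_k} g(x)} = 1$, since all k-th order derivatives of g are null.
The result applies more generally when replacing $(-1)^g$ by what is called, in the Gowers
norm domain, a “polynomial of degree at most k − 1,” that is, a pseudo-Boolean function
ψ such that $\prod_{S \subseteq \{1,\dots,k\}} \psi\big(x + \sum_{i \in S} x_i\big)$ equals the constant function 1 for every choice of
$x_1, \dots, x_k$.
For $\varphi = f_\chi$, we have $\min(d_H(f, g), d_H(f, g \oplus 1)) = 2^{n-1}\Big(1 - \big| E_{x \in \mathbb{F}_2^n}\big[\varphi(x)(-1)^{g(x)}\big] \big|\Big)$,
and taking the minimum for all Boolean functions g of algebraic degree at most k − 1,
Corollary 32 implies

$$nl_{k-1}(f) \ge 2^{n-1}\big(1 - \|f_\chi\|_{U_k}\big). \tag{12.27}$$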

Relation (12.27) means (similarly to what we announced at page 470) that the functions
with small Uk norm have large (k − 1)-th order nonlinearity. Recall that we have seen at
page 83 that, asymptotically and for every $\epsilon > 0$, almost all Boolean functions are such that
$nl_{k-1}(f) > 2^{n-1}(1 - \epsilon)$.
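For k = 2, Relation (12.27) and the upper bound on nl(f) stated after Relation (12.24) sandwich the nonlinearity between two quantities depending only on $\|f_\chi\|_{U_2}$. The Python sketch below checks both inequalities exhaustively for n = 3, using Relation (12.25) to compute the norm; the helper names are illustrative.

```python
def walsh_spectrum(tt):
    """Walsh transform W_f(b) = sum over x of (-1)^(f(x) + b.x), by a fast butterfly."""
    a = [1 - 2 * v for v in tt]
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def bounds_and_nonlinearity(tt):
    """Return (lower bound, nl(f), upper bound) for a truth table of length 2**n."""
    size = len(tt)
    w = walsh_spectrum(tt)
    nl = size // 2 - max(abs(c) for c in w) // 2     # nl(f) = 2^(n-1) - max|W_f| / 2
    u2 = sum(c ** 4 for c in w) ** 0.25 / size       # Relation (12.25)
    lower = size / 2 * (1 - u2)                      # Relation (12.27) with k = 2
    upper = size / 2 * (1 - u2 ** 2)                 # bound stated after Relation (12.24)
    return lower, nl, upper

n = 3
results = []
for code in range(1 << (1 << n)):                    # all 256 Boolean functions in 3 variables
    tt = [(code >> x) & 1 for x in range(1 << n)]
    results.append(bounds_and_nonlinearity(tt))
all_ok = all(lo - 1e-9 <= nl <= up + 1e-9 for lo, nl, up in results)
print(all_ok)
```

The small tolerance only absorbs floating-point rounding in the fourth root; for affine functions both bounds coincide with nl(f) = 0.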
The Gowers inverse conjecture (GIC) is that if ||ϕ||Uk is positive for a given pseudo-
Boolean function of absolute value bounded above by 1, then ϕ correlates (at a level to be
determined for each k) with a polynomial of algebraic degree k − 1 (as defined above).
This is straightforwardly true for k = 1, according to Relation (12.20) (taking constant
polynomial 1).

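The GIC is also easily checked for k = 2: Relation (12.26) gives

$$\begin{aligned}
\|\varphi\|_{U_2} &= 2^{-n} \Big( \sum_{b \in \mathbb{F}_2^n} \widehat{\varphi}^{\,4}(b) \Big)^{\frac{1}{4}} \\
&\le 2^{-n} \Big( \max_{b \in \mathbb{F}_2^n} \widehat{\varphi}^{\,2}(b) \sum_{b \in \mathbb{F}_2^n} \widehat{\varphi}^{\,2}(b) \Big)^{\frac{1}{4}} \\
&= 2^{-n} \Big( 2^n \max_{b \in \mathbb{F}_2^n} \widehat{\varphi}^{\,2}(b) \sum_{x \in \mathbb{F}_2^n} \varphi^2(x) \Big)^{\frac{1}{4}} \\
&\le 2^{-\frac{n}{2}} \Big( \max_{b \in \mathbb{F}_2^n} \widehat{\varphi}^{\,2}(b) \Big)^{\frac{1}{4}},
\end{aligned}$$

and we observe that $|\widehat{\varphi}(b)|$ measures the correlation between ϕ and $(-1)^{b \cdot x}$.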
The GIC is proved for ϕ = (−1)f and k = 3 in [1010, Appendix A]. The proof is a little
too long for being included here.
But for generic values of k, the GIC has been independently refuted by Green and Tao
[573] and by Lovett et al. [805] (a counter-example ϕ for k = 4 is the sign function of
the elementary symmetric function σ4 : its maximum normalized correlation with the sign
functions of cubic Boolean functions tends to 0 when n tends to infinity, but its Gowers U4
norm is bounded below by a strictly positive number). Bergelson et al. [9] have proposed
and proved a modification of the inverse Gowers conjecture valid in low characteristic, but
in characteristic 2, their result does not relate to distances, and a better adapted modification
needs then to be found.
In [528], the authors studied $\|f_\chi\|_{U_3}$ for some Maiorana–McFarland bent functions (the
value is determined when the permutation involved in the definition of the function, see
Relation (6.9), page 209, is APN) and for some cubic monomial functions.
13

Open questions

In this chapter, we list open problems related to the main chapters of this book; some have
been already mentioned in [245, 248]. We avoid stating those that seem elusive, such as the
determination of all bent functions. Some open questions are, however, quite difficult, while
others, more recent, may be easier to address, and some (which have never been proposed
until now) may even be easy.

13.1 Questions of general cryptography dealing with functions


1. Generalize the higher-order differential attack to block ciphers using S-boxes CCZ
equivalent to quadratic functions.
2. Find an expression of the period of general nonlinear-feedback shift register sequences
(NFSR) by means of the initialization and the feedback function.

13.2 General questions on Boolean functions and vectorial functions


1. Determine, for all values of n, the exact minimum numerical degree of n-variable
Boolean functions depending on all their variables (this value is near log2 n, according
to Proposition 15, page 67, and to the few lines after its proof); determine the functions
having such numerical degree.
2. Determine the set of all possible coset leaders of the first order Reed–Muller code, i.e.,
of those Boolean functions whose nonlinearity equals the Hamming weight, that is, such
that the null function is a best affine approximation, or equivalently such that Wf (0n ) =
maxa∈Fn2 |Wf (a)| (see page 79).
3. Find simple formulae for the number of balanced quadratic functions in n variables and
for the weight distribution of the dual RM(n − 3, n) of the second-order Reed–Muller
code.
4. Determine the possible Hamming weights in the third-order Reed–Muller code (we know
they are diverse, see Section 5.3, page 180); determine the weight distribution of this
code.
5. Determine, for all values of n and all 3 ≤ k ≤ n − 2, those n-variable Boolean functions
whose Walsh transform is divisible by 2k (see the second remark at page 64).
6. Use the numerical normal form (see Definition 12, page 47) to design relevant
secondary constructions of Boolean functions (see some ideas of constructions in
[248, subsection 4.1]).


7. Determine the best nonlinearities of Boolean functions in odd dimension n ≥ 9; in
particular, find nine-variable Boolean functions having nonlinearity larger than 242 or
show they do not exist.
8. Determine the best nonlinearities of balanced Boolean functions in dimension n ≥ 8;
in particular, find an eight-variable (resp. ten-variable) balanced Boolean function with
nonlinearity 118 (resp. 494) or show they do not exist. Prove or disprove Dobbertin’s
conjecture for balanced functions (see page 297).
9. Find a better upper bound than the known ones for (n, m)-functions when
– n is odd and m < n
– n is even and n/2 < m < n

13.3 Bent functions and plateaued functions


1. Characterize all bent functions of algebraic degree 3; extend this characterization to
plateaued functions for any amplitude.
2. Determine an efficient lower bound on the number of n-variable bent functions (see page
242); same question for plateaued functions of any amplitude.
3. Is any n-variable Boolean function (n even) of algebraic degree at most n/2 the sum of
two bent functions? (This is called the bent sum decomposition problem; see page 242
as well.)
4. Determine an efficient upper bound on the number of n-variable bent functions (see page
243); same question for plateaued functions of any amplitude.
5. What is the minimum, for all n-variable bent Boolean functions, of the maximal
dimension of those affine subspaces of Fn2 on which they are constant? (We know it
is strictly smaller than n/2, according to the existence of nonnormal bent functions.)
Same question for plateaued functions of any amplitude.
6. Characterize all self-dual bent functions; see page 198 (quadratic ones have been
determined in [626]).
7. Characterize the algebraic normal forms of the elements of class PS (see page 212) or
their trace representations.
8. Investigate the structure of PS , and find a constructive definition of PS functions.
9. Evaluate the size of PS ; determine whether the subclass of those PS functions that are
related to full spreads is a large part of it (as observed by Dillon, the PS functions related
to those partial spreads that can be extended to spreads of larger sizes – in particular,
those related to full spreads – have necessarily algebraic degree n/2).
10. What are the possible algebraic degrees of $PS^+$ bent functions?
11. Determine whether all Kasami bent functions $tr_n(ax^{2^{2k} - 2^k + 1})$ are non-weakly normal
for n ≥ 14 not divisible by 3 and 1 < k < n/2 coprime with n and $a \in \mathbb{F}_4 \setminus \mathbb{F}_2$ (see
page 253 and [270]).
12. Clarify what can be all the univariate representations of those Niho bent functions (see
page 221) related to known o-polynomials such as Subiaco and Adelaide.
13. Determine the duals of the Niho-bent functions numbers 1 and 3 of pages 221 and foll.
(the dual of function number 2 has been determined in [311]).
14. Determine what is the largest possible number of distinct affine derivatives of a
nonquadratic bent n-variable function (n even), which results in determining what
is the maximal dimension of this vector space (it is easy to see that it is at least
n − 3: take a quadratic bent function g in n − 6 variables and a cubic bent function
h in 6 variables; then g(x) has $2^{n-6}$ affine derivatives and it is easy to see that h
can have $2^3$ affine derivatives (take for instance the Maiorana–McFarland function
$x_1y_1 \oplus x_2y_2 \oplus (x_1x_2 \oplus x_3)y_3$). Then $g(x) \oplus h(y)$, $x \in \mathbb{F}_2^{n-6}$, $y \in \mathbb{F}_2^6$, has $2^{n-3}$ distinct
affine derivatives).
15. Find codes with the same parameters as the Kerdock codes (see page 254) and which are
not equivalent to subcodes of the second-order Reed–Muller code.
16. Find new simple and general constructions of perfect nonlinear/bent (n, m)-functions
(see pages 268 and 270).
17. Find hyper-bent functions (see page 244) EA inequivalent to $PS_{ap}$ functions in more
than four variables (a sporadic example exists in four variables [278]).
18. Determine if there exist Boolean functions in more than three variables whose second-
order derivatives Da Db f are all balanced when a and b are F2 -linearly independent;
see page 257.

13.4 Correlation immune and resilient functions


1. Determine an efficient lower bound on the number of n-variable k-resilient functions (see
page 311).
2. Determine an efficient upper bound on the number of n-variable k-resilient functions (see
page 312).
3. Determine whether there exists any nonaffine 3-resilient symmetric Boolean function (see
page 356).
4. Determine whether the minimum nonzero Hamming weight ωn,t of n-variable t-th order
correlation immune functions satisfies ωn,t ≤ ωn+1,t for every n and t (see page 305).
5. Determine ωn,t for every n and t.

13.5 Algebraic immune functions


1. Determine, for n odd and for n even, an efficient lower bound on the number of n-variable
functions of maximal algebraic immunity (see page 92).
2. Determine, for n odd and for n even, an efficient upper bound on the number of n-variable
functions of maximal algebraic immunity.
3. Determine a lower bound on the hyper-nonlinearity of the indicator function of
$\{\alpha, \dots, \alpha^{2^{n-1}}\}$ in $\mathbb{F}_{2^n}$ (α primitive element), see page 337, which would be not far
from the values of the nonlinearity computed for n ≤ 26.
4. Find a class of Boolean functions that would be as fast to compute as the hidden weight bit
function (see page 343) and with provably not bad algebraic immunity and fast algebraic
immunity, and whose nonlinearity would be good.
5. Determine, for any n, what is the best possible resiliency order of n-variable Boolean
functions with optimal algebraic immunity.
6. Determine, for any n, what is the best possible nonlinearity of Boolean functions with
optimal algebraic immunity.

13.6 Highly nonlinear vectorial functions with low differential uniformity


1. Determine whether there exist (n, n)-functions with nonlinearity strictly larger than
$2^{n-1} - 2^{\frac{n}{2}}$ when n is even (see page 371).
2. Determine whether nonquadratic crooked functions exist (see page 278).
3. Find APN functions (see page 137) new up to CCZ equivalence (see page 28), by means
of their ANF.
4. Find new APN exponents (see page 390) or prove that all are known.
5. Determine whether APN functions with bad nonlinearity exist.
6. Determine whether the APN binomials of Proposition 178, page 405, can be generalized
for t = 4 to trinomials or quadrinomials.
7. Find simple and general secondary constructions of APN and AB functions (see page
119), and of differentially 4-uniform (n, n)-functions (see page 135) different from the
switching construction (see page 407), and if possible more systematic.
8. Find classes of AB functions by using CCZ equivalence with Kasami (resp. Welch,
Niho) functions (see page 397).
9. Find an example of an AB function CCZ inequivalent to power functions and to
quadratic functions (we have only one APN function known, with n = 6, having such
property [494]).
10. Find infinite classes of APN and AB functions CCZ inequivalent to power functions and
to quadratic functions.
11. Determine whether there exist componentwise APN (CAPN) functions (see page 390)
that are neither AB nor power permutations.
12. Determine whether there exist APN functions in odd dimension that are not CAPN.
13. Determine whether the CAPNness of permutations is equivalent to the CAPNness of
their compositional inverses, and more generally, whether CAPNness is CCZ invariant.
14. Determine whether Kasami APN functions are componentwise Walsh uniform (CWU;
see page 414).
15. Find a systematic way, given an APN function F , to build another (EA inequivalent)
function $F'$ such that $\gamma_{F'} = \gamma_F$ (see Proposition 158, page 375).
16. Find an APN permutation in even dimension n ≥ 8, or better, an infinite class (this is
the so-called “big APN problem”; see observations on this problem in [136, 199, 248]
and how to work with CCZ equivalence to reach EA-inequivalent functions in [145]).
17. Derive new simple and general constructions of APN/AB functions from perfect
nonlinear functions (see page 409), and vice versa.
18. If possible, classify APN functions, or at least their extended Walsh spectra, or at least
their nonlinearities.
19. Determine whether differentially 6-uniform (n, n − 2)-functions exist for n > 5.
20. Determine the pairs (n, m) for which Nyberg’s bound (see page 423) is tight.
21. Construct infinite classes of CWU differentially 4-uniform (n, n − 1)-functions.

13.7 Recent uses of Boolean and vectorial functions and related problems
1. Characterize for t ≥ 2 (or at least for t = 2) those functions that admit a threshold
implementation (TI) with t masks (i.e., a t-th order TI) and with uniformity; see pages
436 and foll.

2. Can the multiplicative inverse function $F(x) = x^{2^n - 2}$ have an (n − 1)-th order TI with
uniformity, in particular for n = 8?
3. Does any AB function have an $\frac{n+1}{2}$-th order TI with uniformity?
4. Find generic primary constructions of TI with uniformity.
5. Provide cases of secondary constructions of TI with uniformity more general than those
exhibited in [1094] (see page 442).
6. Determine all nonquadratic bent functions whose restrictions to the set of binary vectors
of length n and Hamming weight k have null nonlinearity (i.e., coincide with an affine
function), for every k = 1, . . . , n − 1 (the determination of the quadratic ones is given
in Proposition 193, page 463).  
7. Determine, for every 1 ≤ k ≤ n, the smallest integer e such that 2 \binom{n}{e} > \binom{n}{k} (providing
an upper bound on algebraic immunity with input restricted to Hamming weight k;
see page 465), and study its asymptotic behavior relative to the standard algebraic
immunity upper bound n/2.
8. Determine whether the four instances of the FLIP cipher (see Subsection 12.2.1) resist
algebraic attacks combined with guess and determine attacks.
9. Given two positive integers e and k, what is the smallest number of variables for which
there exists a Boolean function of algebraic immunity at least e and resiliency order at
least k (see page 469)?
10. Is there, for k ≤ n − 2, an upper bound on the algebraic immunity of k-resilient functions
that would be sharper than min(n − k − 1, ⌈n/2⌉)?
11. Find a modification of the inverse Gowers conjecture (see page 473) that would be true
in characteristic 2 and would involve Hamming distances.
14

Appendix: finite fields

We briefly recall the basics of finite fields and the main properties used in the body of the
present book. Within the limits of this appendix, we are far from complete, and we refer the
reader to the books [775, 890], whose sizes show the extent of the state of the art that we
briefly summarize here.
Reminder: A field (F, +, ∗) is by definition such that
– (F, +) is an Abelian group (we denote its neutral element by 0).
– (F \ {0}, ∗) is an Abelian group (we denote its neutral element by 1).
– ∗ is distributive with respect to +.

Notation: F \ {0} can be denoted by F∗ .


Important property: F has no nonzero “zero divisor” (this is what we call any element α of F
such that there exists β ≠ 0 such that α ∗ β = 0).

Exercise: Show (by Euclidean division and factorization) that a polynomial of degree n
over a field can have at most n zeros in this field.

14.1 Prime fields and fields with four, eight, and nine elements
14.1.1 Characteristic of a finite field
The cardinality of a field is called its order. Let F be a finite field (i.e., a field with a finite
order, also called a Galois field). The mapping m ∈ N → m · 1 = 1 + · · · + 1 ∈ F cannot
be injective. Hence there exist positive integers m, m′ such that m < m′ and m · 1 = m′ · 1.
Then we have (m′ − m) · 1 = 0 and m′ − m > 0. The smallest positive integer p such that
p · 1 = 0 is called the characteristic of F.

Remark. For every x ∈ F, we have p · x = (p · 1) ∗ x = 0, i.e., the iterated addition of
any element with itself is “mod p.”

Theorem The characteristic of any finite field is a prime integer.


Proof Suppose that p = kl for some integers 1 < k < p and 1 < l < p. Then
(k · 1) ∗ (l · 1) = (kl) · 1 = 0, while k · 1 and l · 1 are nonzero by the minimality of p.
Hence k · 1 and l · 1 are nonzero divisors of zero, a contradiction.

Notation: we will now write p instead of p · 1 and the multiplication between field elements
will be (classically) represented with a lack of symbol instead of ∗.

14.1.2 Prime fields


For every field F of characteristic p (prime), we have {0, 1, . . . , p − 1} ⊆ F. As observed
already, 0, 1, . . . , p − 1 have here to be considered as integers mod p. Hence they belong to
Z/pZ. Hence, more precisely, we have Z/pZ ⊆ F.

Theorem Let p be any prime integer. Every field of characteristic p admits Z/pZ as a
subfield, and Z/pZ does not have a proper subfield (and is then called a prime field).

Indeed, Z/pZ is itself a field (and it is the smallest field of characteristic p): it is a ring
(i.e., (Z/pZ, +) is an Abelian group and ∗ is associative and distributive with respect to +)
and every nonzero element a has an inverse, since the mapping x ∈ Z/pZ → ax ∈ Z/pZ
is injective and hence, Z/pZ being finite, bijective.
If n > 1 is not a prime, then Z/nZ is not a field, since it has zero divisors.

Remark. According to Bézout’s identity, for p prime and a ∈ Z/pZ∗ , since a and p are
coprime, there exist u and v such that 1 = au + pv and (u mod p) is the inverse of a. It can
be calculated by the (extended) Euclidean algorithm.
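
The remark above can be illustrated with a minimal Python sketch of the extended Euclidean algorithm. The function names `extended_gcd` and `inverse_mod_p` are chosen here for illustration only; they do not come from the book.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and a*u + b*v == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_u, u = u, old_u - quotient * u
        old_v, v = v, old_v - quotient * v
    return old_r, old_u, old_v


def inverse_mod_p(a, p):
    """Inverse of a nonzero residue a in Z/pZ, read off Bezout's identity 1 = a*u + p*v."""
    g, u, _ = extended_gcd(a % p, p)
    if g != 1:
        raise ValueError("a is not invertible modulo p")
    return u % p


# Inverses of the nonzero elements of Z/7Z, consistent with the multiplication table below.
inverses_mod_7 = {a: inverse_mod_p(a, 7) for a in range(1, 7)}
```

Each entry of `inverses_mod_7` can be checked against the row of the multiplication table of Z/7Z in which the product 1 appears.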

Example: operations in Z/7Z:

+ | 0 1 2 3 4 5 6      * | 0 1 2 3 4 5 6
--+--------------      --+--------------
0 | 0 1 2 3 4 5 6      0 | 0 0 0 0 0 0 0
1 | 1 2 3 4 5 6 0      1 | 0 1 2 3 4 5 6
2 | 2 3 4 5 6 0 1      2 | 0 2 4 6 1 3 5
3 | 3 4 5 6 0 1 2      3 | 0 3 6 2 5 1 4
4 | 4 5 6 0 1 2 3      4 | 0 4 1 5 2 6 3
5 | 5 6 0 1 2 3 4      5 | 0 5 3 1 6 4 2
6 | 6 0 1 2 3 4 5      6 | 0 6 5 4 3 2 1

Notation: Since Z/pZ is a field, we shall denote it by Fp .

14.1.3 Possible size of a finite field


Let F be a finite field of characteristic p. Since Fp is a subfield, F is a vector space over
Fp , and since F is a finite set, F must have a finite basis, say of size n. Let {b1 , b2 , . . . , bn }
denote such a basis. Any element of F can be written uniquely as a linear combination
c1 b1 + c2 b2 + · · · + cn bn with c1, . . . , cn ∈ Fp, and there are then p^n elements in F.

Theorem A finite field of characteristic p must have size q = p^n for some natural
number n.

The number n is called the degree.

14.1.4 Extending prime fields; fields with four, eight, and nine elements
Reminder: C is an extension field of R as follows: every element (a0 , a1 ) of R2 is identified
to the polynomial a0 + a1 x; C equals the set of polynomials a0 + a1 x with the usual addition
and with multiplication mod x 2 + 1; x is written i.
C has no nonzero zero divisor because the polynomial x 2 + 1 is irreducible over R (i.e.,
cannot be factored). C is a field. It is the smallest field containing R and an additional
element i, solution of the equation x 2 + 1 = 0.
C is an extension field of degree 2 of R.
The field F4: It is easily checked that the polynomial x^2 + x + 1 is irreducible over F2. Any
element (a0, a1) of F2^2 is identified with the polynomial a0 + a1 x, and F4 equals F2^2 with the
usual addition and multiplication mod x^2 + x + 1.

+   | 0   1   x   1+x      *   | 0   1   x   1+x
----+----------------      ----+----------------
0   | 0   1   x   1+x      0   | 0   0   0   0
1   | 1   0   1+x x        1   | 0   1   x   1+x
x   | x   1+x 0   1        x   | 0   x   1+x 1
1+x | 1+x x   1   0        1+x | 0   1+x 1   x

F4 has no nonzero zero divisor because x^2 + x + 1 is irreducible. The mapping x ∈ F4 →
ax ∈ F4 being injective for a ≠ 0, it is bijective and F4 = {0, 1, x, 1 + x} is a field. We
also have F4 = {0, 1, x, x^2}, that is, x is a generator of the multiplicative group of F4, which
is related to the fact that x^2 + x + 1 is a primitive polynomial (see below).
Notation: x is written α. It is a root in F4 of the polynomial x^2 + x + 1. We have

+   | 0   1   α   1+α      *   | 0   1   α   1+α
----+----------------      ----+----------------
0   | 0   1   α   1+α      0   | 0   0   0   0
1   | 1   0   1+α α        1   | 0   1   α   1+α
α   | α   1+α 0   1        α   | 0   α   1+α 1
1+α | 1+α α   1   0        1+α | 0   1+α 1   α

that is,

+   | 0   1   α   α^2      *   | 0   1   α   α^2
----+----------------      ----+----------------
0   | 0   1   α   α^2      0   | 0   0   0   0
1   | 1   0   α^2 α        1   | 0   1   α   α^2
α   | α   α^2 0   1        α   | 0   α   α^2 1
α^2 | α^2 α   1   0        α^2 | 0   α^2 1   α

The first representation of the elements 0, 1, α, 1 + α of F4 is called the additive form and the
second one, 0, 1, α, α^2, the multiplicative form.

Exercise: Calculate the tables of + and ∗, in the two representations, for
– F8, constructed with F2 and the irreducible (primitive) polynomial x^3 + x + 1
– F9 and F27, constructed with F3 and the irreducible (primitive) polynomials x^2 + x + 2
and x^3 + 2x^2 + 1, respectively

14.2 General finite fields: construction, primitive element


Let p be a prime number, n a positive integer, and f(x) an irreducible polynomial of degree
n over Fp (we shall see below that such a polynomial exists for any p and any n). Then
Kronecker’s construction is to identify Fp^n with the set of polynomials of degree at most
n − 1 over Fp and to endow it with the usual addition and with the multiplication mod
f(x). In precise mathematical terms, we construct Fp[x]/(f(x)), the quotient of the ring
Fp[x] by the ideal (f(x)) equal to the set of multiples of f(x). In more practical words,
by denoting x by α, we take this symbol and allow it to add and multiply with elements
of Fp and with itself, with the restriction that f(α) = 0. We obtain a ring, denoted by
Fp(α), with no nonzero divisor of zero. Since this ring is finite, it is a field: for a ≠ 0, the
mapping x ∈ Fp(α) → ax ∈ Fp(α) being injective, it is bijective.

Theorem Let f (x) be an irreducible polynomial over any finite field K (in practice, a
prime field). The set K[x]/(f (x)) obtained from Kronecker’s construction is again a field.
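
As a hedged illustration of Kronecker’s construction over a prime field, the following Python sketch multiplies two elements of Fp[x]/(f(x)). The representation (coefficient lists, lowest degree first, f monic) and the name `poly_mod_mul` are assumptions made for this example.

```python
def poly_mod_mul(a, b, f, p):
    """Multiply a and b in F_p[x]/(f).

    a, b: coefficient lists of length n (lowest degree first).
    f: monic polynomial of degree n, as a list of n + 1 coefficients.
    """
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Reduce from the top degree down, using x^n = -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}).
    for d in range(2 * n - 2, n - 1, -1):
        c = prod[d]
        if c:
            prod[d] = 0
            for k in range(n):
                prod[d - n + k] = (prod[d - n + k] - c * f[k]) % p
    return prod[:n]


# F_4 = F_2[x]/(x^2 + x + 1): x * x = 1 + x and x * (1 + x) = 1.
x_squared = poly_mod_mul([0, 1], [0, 1], [1, 1, 1], 2)
x_times_x_plus_1 = poly_mod_mul([0, 1], [1, 1], [1, 1, 1], 2)
```

The two products reproduce the corresponding entries of the multiplication table of F4 given in Subsection 14.1.4.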

Electronic circuits for calculating in finite fields are based on flip-flops (for storing bits),
adders, and multipliers:

[Figure: circuit symbols for the adder (+) and the multiplier (∗)]

For instance, in F_{2^4} with irreducible polynomial X^4 + X + 1, the operation of multiplication
by α results in

[Figure: circuit implementing multiplication by α in F_{2^4}]
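
Multiplication by α in F_{2^4} can also be written out in software. The sketch below assumes that an element a0 + a1 α + a2 α^2 + a3 α^3 is stored as the tuple (a0, a1, a2, a3); this layout and the names `times_alpha` and `powers_of_alpha` are illustrative choices, not the book’s circuit.

```python
def times_alpha(state):
    """Multiply a0 + a1*α + a2*α^2 + a3*α^3 by α in F_2[X]/(X^4 + X + 1)."""
    a0, a1, a2, a3 = state
    # α^4 = α + 1, so the coefficient a3 is fed back into positions 0 and 1.
    return (a3, a0 ^ a3, a1, a2)


def powers_of_alpha():
    """Return the list [α^0, α^1, ..., α^14] as coefficient tuples."""
    powers = []
    state = (1, 0, 0, 0)
    for _ in range(15):
        powers.append(state)
        state = times_alpha(state)
    return powers
```

Since X^4 + X + 1 is primitive, the 15 tuples returned by `powers_of_alpha()` are pairwise distinct and one more multiplication by α returns to (1, 0, 0, 0).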

Notation: Fp(α) is denoted by F_{p^n} (the same notation for all choices of f(x), for reasons
that will appear later). Some authors write GF(p^n) instead of F_{p^n}.

Remark. Different time/memory trade-offs exist in the literature for implementing
multiplications. For hardware implementations and large dimensions n, several works have been
published, among which are the Omura–Massey method, the Sunar–Koç method, and the
Karatsuba algorithm. For software implementations in small dimensions (e.g., n ≤ 10), the
number of pertinent possibilities is reduced. See a survey in [320].

Remark. In characteristic 2, Newton’s formula (u + v)^k = Σ_{i=0}^{k} \binom{k}{i} u^i v^{k−i} has to be
applied in conjunction with Lucas’ theorem (see page 487), which says that \binom{k}{i} [mod 2]
equals 1 if and only if the binary expansion of k covers that of i.

Calculating the inverse: Recall that there is a Euclidean algorithm for polynomials similar
to the one for integers. Let a(x), b(x) be a pair of polynomials. The (extended) Euclidean
algorithm for polynomials provides their greatest common divisor d(x) and also a pair of
polynomials s(x), t (x) such that s(x)a(x) + t (x)b(x) = d(x).
In our situation, one of the polynomials, b(x) = f (x), is irreducible, and the other, a(x)
representing a nonzero element of the field, is of degree lower than the irreducible one, so
that their greatest common divisor is 1. Then we can find polynomials s(x) and t (x) such
that s(x)a(x) + t (x)f (x) = 1.
Denoting as usual x by α, we have f (α) = 0, then we see that we have found that s(α) is
an inverse of a(α).

Example [Euclidean algorithm for x^2 + 1 in F27 constructed with the irreducible polynomial
x^3 + 2x^2 + 1 over F3]. We detail the operations mathematically, as an illustration. A simpler
method will be possible when we arrive at the notion of primitive element (thanks to the
multiplicative representation). First we divide f(x) = x^3 + 2x^2 + 1 by a(x) = x^2 + 1.
We get
x^3 + 2x^2 + 1 = (x + 2)(x^2 + 1) + 2x + 2.
The remainder 2x + 2 is not of degree zero, so we must divide x^2 + 1 by 2x + 2; we get
x^2 + 1 = (2x + 1)(2x + 2) + 2.
Since the remainder is a constant, the Euclidean algorithm stops; the two polynomials are
relatively prime with greatest common divisor equal to 1 (after division by 2). Replacing
each remainder by its expression obtained in each division, from bottom to top, we have
2 = x^2 + 1 − (2x + 1)(2x + 2)
  = x^2 + 1 − (2x + 1)[x^3 + 2x^2 + 1 − (x + 2)(x^2 + 1)]
  = −(2x + 1)(x^3 + 2x^2 + 1) + (2x^2 + 2x)(x^2 + 1)
and therefore (2x^2 + 2x)(x^2 + 1) ≡ 2 [mod x^3 + 2x^2 + 1]. If we divide by 2 (i.e., multiply
by 2, since 2 · 2 = 1 in F3), we conclude that the inverse of x^2 + 1 is x^2 + x.
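
The computation in this example can be reproduced with a small Python sketch of the extended Euclidean algorithm for polynomials over Fp. Polynomials are stored as coefficient lists, lowest degree first; this convention and all helper names (`trim`, `poly_divmod`, `poly_mul`, `poly_sub`, `poly_inverse`) are assumptions made for this illustration.

```python
def trim(a):
    """Drop leading zero coefficients (stored at the end of the list)."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a


def poly_divmod(a, b, p):
    """Euclidean division of a by b over F_p (p prime, b nonzero)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    inv_lead = pow(b[-1], p - 2, p)  # inverse of the leading coefficient of b
    q = [0] * max(len(a) - len(b) + 1, 0)
    r = list(a)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coef = r[-1] * inv_lead % p
        q[shift] = coef
        for i, bi in enumerate(b):
            r[shift + i] = (r[shift + i] - coef * bi) % p
        r = trim(r)
    return q, r


def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return trim(out)


def poly_sub(a, b, p):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])


def poly_inverse(a, f, p):
    """Inverse of a modulo the irreducible polynomial f over F_p."""
    r0, r1 = trim(list(f)), trim([c % p for c in a])
    s0, s1 = [], [1]  # invariant: r_i is congruent to s_i * a modulo f
    while r1:
        q, r = poly_divmod(r0, r1, p)
        r0, r1 = r1, r
        s0, s1 = s1, poly_sub(s0, poly_mul(q, s1, p), p)
    if len(r0) != 1:
        raise ValueError("a and f are not coprime")
    c_inv = pow(r0[0], p - 2, p)  # normalize the constant gcd to 1
    return [c * c_inv % p for c in s0]


# Inverse of x^2 + 1 modulo x^3 + 2x^2 + 1 over F_3; [0, 1, 1] encodes x + x^2.
inverse = poly_inverse([1, 0, 1], [1, 0, 2, 1], 3)
```

With the inputs of the example, `inverse` encodes x^2 + x, in agreement with the hand computation.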

14.2.1 The fundamental equation over finite fields


Let Fq be a field of order q = p^n, where p is a prime. Consider the multiplicative group
Fq∗ of nonzero elements of Fq. It has order q − 1. Let β denote an arbitrary element of the
multiplicative group. By Lagrange’s theorem (saying that, for any finite group G, the order
of every subgroup divides the order of G), an element of a group raised to the order of the
group equals the identity, that is, β^{q−1} = 1 (indeed, the set of all powers of β is a subgroup
of Fq∗ whose order equals the order of β). This property is called Fermat’s little theorem for
finite fields.
This equation is valid for any nonzero β. Multiplying through by β yields
β^q = β,
which is still valid for nonzero elements and now also valid for zero.
In particular, for all j ∈ Z/pZ, we have j^p = j.


Another proof of the equation β^q = β (without using Lagrange’s theorem) is to say that
if a_1, a_2, . . . , a_{q−1} denote the nonzero elements of Fq and β is any nonzero element of
Fq, the elements βa_1, βa_2, . . . , βa_{q−1} are all different and are then all the nonzero elements of
Fq; hence we have a_1 a_2 · · · a_{q−1} = βa_1 βa_2 · · · βa_{q−1} = β^{q−1} a_1 a_2 · · · a_{q−1} and therefore
β^{q−1} = 1. The rest is similar.

Exercise: Let P(x) be a polynomial over Fq. Show that P(x) is in fact over Fp if and only
if (P(x))^p = P(x^p).

Note that the polynomial x^q − x, having degree q, cannot have more than q zeros (roots),
and we know then all its q distinct zeros in Fq, that is, x^q − x factors completely into linear
factors in Fq (i.e., splits): x^q − x = Π_{β∈Fq} (x − β). We say that Fq is the splitting field of
x^q − x:
Fq = {x; x^q − x = 0}.
Let us now prove that any irreducible polynomial f (x) of degree n over Fp is a divisor
of x q − x. As we saw, this polynomial f (x) does not have any zero in Fp , but Kronecker’s
construction makes it possible to construct an extension field Fq in which f (x) does have a
zero. We know that in Fq the polynomial x q − x has all elements of the field as zeros, so it
must have a zero in common with f (x), and since f (x) is irreducible, gcd(f (x), x q − x),
which is not trivial, equals f (x).

Theorem Let q = p^n be a power of a prime. Then every irreducible polynomial of degree
n over the prime field Fp divides the polynomial x^q − x.

Moreover, since x^q − x splits in Fq, f(x) also splits in Fq. Since f(β^p) = (f(β))^p
for every β ∈ Fq, because the Newton formula reduces to (β + β′)^p = β^p + β′^p, and
because j^p = j for every j ∈ Z/pZ, the elements of the form α^{p^j}, j = 0, . . . , n − 1,
where α denotes a zero of f(x) in Fq, are zeros of f(x), and they are then all the zeros of
f(x) since the degree of this polynomial is assumed equal to n and all the elements α^{p^j} are
distinct (otherwise, f(x) would be divisible by a polynomial of degree strictly less than n
with coefficients in Fp and would then not be irreducible). Note that if we do not assume
anymore that f(x) has degree n, its degree d is necessarily a divisor of n and x^{p^d} − x divides
x^{p^n} − x.

14.2.2 Existence of finite fields


Let q = pn be a power of a prime. If we wish to build Fq , the only thing we need is an
irreducible polynomial of degree n over Fp .

Lemma There exist irreducible polynomials of every degree n over every prime field Fp .

Proof Let F_d(x) denote the product of all monic polynomials irreducible over Fp of degree d.
Then since distinct irreducible polynomials are coprime, we have

x^q − x = Π_{d|n} F_d(x).

Let N_d denote the number of monic irreducible polynomials of degree d over Fp. By equating
degrees on both sides of this relation, we see that p^n = Σ_{d|n} d N_d.
By the Möbius inversion formula [635], denoting by μ the Möbius function, defined by
μ(d) = 0 if d is divisible by a square greater than 1, and
μ(d) = (−1)^{n_d} otherwise, where n_d is the number of primes dividing d,
we have then

n N_n = Σ_{d|n} μ(d) p^{n/d} > 0.

Hence, for any power of a prime q, there exists a field with q elements.
Reminder: We have Σ_{d|n} μ(d) = 1 if n = 1 and Σ_{d|n} μ(d) = 0 if n > 1, which implies
the Möbius inversion formula.
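
The counting formula n N_n = Σ_{d|n} μ(d) p^{n/d} is easy to evaluate numerically. The following Python sketch does so; the function names `mobius` and `count_irreducible` are illustrative.

```python
def mobius(d):
    """Möbius function: 0 if d has a square factor > 1, else (-1)^(number of prime factors)."""
    result = 1
    prime = 2
    while prime * prime <= d:
        if d % prime == 0:
            d //= prime
            if d % prime == 0:
                return 0
            result = -result
        prime += 1
    if d > 1:  # one remaining prime factor
        result = -result
    return result


def count_irreducible(p, n):
    """Number N_n of monic irreducible polynomials of degree n over F_p."""
    total = sum(mobius(d) * p ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


# Small cases over F_2: degrees 1 to 4.
counts_over_f2 = [count_irreducible(2, n) for n in range(1, 5)]
```

For instance, over F2 the only irreducible polynomial of degree 2 is x^2 + x + 1, which matches N_2 = (4 − 2)/2 = 1.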

14.2.3 Uniqueness of finite fields


We show now that when building Fq by Kronecker’s construction, any irreducible polyno-
mial of degree n gives the same field up to isomorphism.

Definition Let F and K be two fields, finite or not. An isomorphism between F and K is
a bijective (one-to-one) correspondence φ from F to K such that for any a and b in F, the
following equalities hold:
φ(a + b) = φ(a) + φ(b)
φ(ab) = φ(a)φ(b).

If such an isomorphism exists, the tables of operations + and ∗ are the same in the
two isomorphic fields, but with different orderings, or names, of the elements. We shall
understand this as: the two fields being in fact the same field.

Exercise: Let q = pn with p prime. Let α be a zero of some irreducible polynomial f (x)
over Fp and let Fq = Fp (α). Let K be a field of the same order q.
1. Show that the image of α by any isomorphism from Fq to K must be a zero of f (x).
2. Prove that sending α to a zero of f (x) in K gives an isomorphism from Fq to K.

Theorem All finite fields of the same size are isomorphic and can be obtained with
Kronecker’s construction.

Example: Let F27 denote the field obtained by adjoining a zero α of f(x) = x^3 + 2x^2 + 1
to F3, and let K27 be the field obtained by adjoining a zero β of g(x) = x^3 + 2x + 1. To
find an isomorphism from F27 to K27, we can factor the polynomial f(x) = x^3 + 2x^2 + 1 in
K27. We know that f(x) must factor completely in K27. So to factor f(x), we only need to
look for its three zeros in K27, and there are only 27 elements to try (actually, 24, because
no element of F3 is a zero). We obtain f(x) = (x + β^2 + 2)(x + β^2 + β)(x + β^2 + 2β), from
which we can read that the three zeros are 2β^2 + 1, 2β^2 + 2β, and 2β^2 + β. The isomorphism
is now given by sending α to any of the zeros of f(x) in K27.
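
The exhaustive search described in this example can be sketched in Python as follows. Elements of K27 are assumed to be stored as triples (c0, c1, c2) standing for c0 + c1 β + c2 β^2; the names `mul` and `f` are chosen for this illustration.

```python
from itertools import product

P = 3
G = (1, 2, 0, 1)  # g(x) = x^3 + 2x + 1, lowest degree first


def mul(a, b):
    """Multiply two elements of K27 = F_3[β]/(g(β)), given as triples (c0, c1, c2)."""
    prod = [0] * 5
    for i in range(3):
        for j in range(3):
            prod[i + j] = (prod[i + j] + a[i] * b[j]) % P
    for d in (4, 3):  # reduce with β^3 = -(2β + 1)
        c = prod[d]
        prod[d] = 0
        for k in range(3):
            prod[d - 3 + k] = (prod[d - 3 + k] - c * G[k]) % P
    return tuple(prod[:3])


def f(y):
    """Evaluate f(y) = y^3 + 2y^2 + 1 in K27."""
    y2 = mul(y, y)
    y3 = mul(y2, y)
    return tuple((y3[i] + 2 * y2[i] + (1 if i == 0 else 0)) % P for i in range(3))


# Try all 27 elements and keep the zeros of f.
zeros = sorted(y for y in product(range(P), repeat=3) if f(y) == (0, 0, 0))
```

The three triples found correspond to 2β^2 + β, 2β^2 + 2β, and 2β^2 + 1, the zeros read off the factorization above.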
Exercise: Show that x^{p^n} − x divides x^{p^m} − x if and only if n divides m. Deduce that F_{p^n}
is a subfield of F_{p^m} if and only if n divides m.

14.2.4 Frobenius automorphism


In Fq, the mapping φ : x → x^p is an automorphism (that we already encountered above),
since
– (x + y)^p = x^p + y^p, ∀x, y ∈ Fq, since, p being a prime, \binom{p}{i} = (p/i) \binom{p−1}{i−1} is divisible by p
for every 0 < i < p.
– (xy)^p = x^p y^p, and x → x^p is bijective since its (additive or multiplicative) kernel is
trivial. It is called the Frobenius automorphism.
This implies, for instance, that (x + y)^{Σ_{i∈I} p^i} = Π_{i∈I} (x^{p^i} + y^{p^i}), which for p = 2 implies
Lucas’ theorem [809, page 404]: \binom{n}{j} is odd if and only if the binary expansion of j is covered
by (i.e., has support included in) that of n.
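
The binary case of Lucas’ theorem stated above lends itself to a one-line bitwise test. In this Python sketch, the function name `binomial_is_odd` is an illustrative choice, and the comparison with `math.comb` serves only as a sanity check.

```python
from math import comb


def binomial_is_odd(n, j):
    """Lucas' theorem for p = 2: binom(n, j) is odd iff every binary digit of j is covered by n."""
    return (j & ~n) == 0


# Compare the bitwise test with the actual parity of the binomial coefficient.
agrees = all(
    binomial_is_odd(n, j) == (comb(n, j) % 2 == 1)
    for n in range(64)
    for j in range(n + 1)
)
```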

Exercise: We saw that any polynomial f (X) over Fp satisfies f (φ(β)) = φ(f (β)) for
every β ∈ Fq . Show that this is characteristic of polynomials over Fp among polynomials
over Fq .

The automorphisms of Fq are the powers of the Frobenius automorphism; their set, with
the composition operation, is a group, called the Galois group of Fq .

14.2.5 Primitive element


When studying F4, we have seen that the element that we denoted by α is such that F4 =
{0, 1, α, α^2}. The lemma below implies that, for every power q of a prime, the multiplicative
group Fq∗ is also cyclic, that is, there exists α ∈ Fq such that Fq = {0, 1, α, . . . , α^{q−2}} (i.e.,
q − 1 is the smallest positive integer i such that α^i = 1). Such an α will be called a primitive
element.

Exercise: Recall that any irreducible polynomial P(x) ∈ Fp[x] of degree n > 1 is a
divisor of x^{p^n − 1} − 1. We say that P(x) is primitive if one of its zeros is a primitive element
of F_{p^n} (and then, all its zeros are).
1. Show that if P(x) is a primitive polynomial, then min{m; P(x) | x^m − 1} = p^n − 1.
2. Show conversely that if P(x) is any irreducible polynomial of degree n such that
min{m; P(x) | x^m − 1} = p^n − 1, then P(x) is primitive.
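
The quantity min{m; P(x) | x^m − 1} is the multiplicative order of x modulo P(x), and it can be computed directly for small binary polynomials. The sketch below assumes p = 2 and encodes a polynomial as a bitmask (bit i is the coefficient of x^i); the name `order_of_x` is illustrative.

```python
def order_of_x(poly_mask, degree):
    """Smallest m >= 1 with x^m = 1 in F_2[x]/(P), for P irreducible of the given degree."""
    state = 1  # x^0
    for m in range(1, 2 ** degree):
        state <<= 1  # multiply by x
        if (state >> degree) & 1:
            state ^= poly_mask  # reduce modulo P
        if state == 1:
            return m
    raise ValueError("x is not invertible modulo P")


# x^4 + x + 1 is primitive (order 15); x^4 + x^3 + x^2 + x + 1 is irreducible but not primitive.
order_primitive = order_of_x(0b10011, 4)
order_nonprimitive = order_of_x(0b11111, 4)
```

The second polynomial divides x^5 − 1, so its zeros have order 5 rather than 2^4 − 1 = 15.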

Lemma Let G be a finite multiplicative group with k elements, in which for every m ≤ k,
there are at most m solutions for the equation x m = 1. Then G is a cyclic group.

Proof Denote by a_m the number of elements of G of order m (that is, satisfying x^m = 1
and x^{m′} ≠ 1 for 0 < m′ < m). Note that, if m does not divide k, then a_m = 0, according to
Lagrange’s theorem. If a_m ≠ 0, then there exists g ∈ G of order m. Then, according to the
hypothesis that there are at most m solutions for the equation x^m = 1, the m powers of g
are all the solutions of x^m = 1. Moreover, g^i has order m/gcd(i, m). Hence, the group generated by
g has φ(m) generators, where φ(m) is Euler’s totient function (whose value is the number
of elements in {1, . . . , m} that are coprime with m). Thus, if a_m is nonzero, then it is
precisely φ(m).
But Σ_{m|k} φ(m) = k and Σ_{m|k} a_m = Σ_{m=1}^{k} a_m = k. Hence a_m = φ(m) for every m that
divides k.
In particular, a_k = φ(k) > 0, so G contains elements of order k, and is cyclic.

Exercise: We know that every multiplicative subgroup of Fq∗ has for order a divisor of
q − 1. Show that, for each divisor k of q − 1, there exists a unique multiplicative subgroup of
Fq∗ of order k. Show that a generator of this subgroup is α^{(q−1)/k}, where α is a primitive element
of Fq.

14.3 Representation (additive and multiplicative); trace function


For q = p^n, Kronecker’s construction with any irreducible polynomial of degree n over Fp
leads for any element x ∈ Fq to:

x = x0 + x1 α + x2 α^2 + · · · + x_{n−1} α^{n−1}; x0, . . . , x_{n−1} ∈ Fp.

This is called the additive representation of x.

For α primitive, since the fundamental polynomial x^q − x splits in Fq, this α is the zero
of an irreducible (a primitive) polynomial over Fp. In other words, for every n, there exists
a primitive polynomial of degree n.
Then any nonzero element x ∈ Fq can be written

x = α^i; for i ∈ Z/(q − 1)Z, that is, i ∈ {0, . . . , q − 2}.

This is the multiplicative representation.

Remark. Denote by f_{n,α}(i, j) the bivariate function over Z/(p^n − 1)Z such that α^i +
α^j = α^{f_{n,α}(i,j)}. There is no known expression of f_{n,α}(i, j) (see for instance [890, subsection
2.1.7.5, page 27]), but we have f_{n,α}(i + 1, j + 1) = f_{n,α}(i, j) + 1, f_{n,α}(pi, pj) = p f_{n,α}(i, j);
if k is coprime with p^n − 1, then we have f_{n,α^k}(i, j) = f_{n,α}(ki, kj), and if β is a primitive
element of F_{p^{rn}} and α = β^{(p^{rn} − 1)/(p^n − 1)}, then we have
f_{rn,β}((p^{rn} − 1)i/(p^n − 1), (p^{rn} − 1)j/(p^n − 1)) = (p^{rn} − 1) f_{n,α}(i, j)/(p^n − 1).

Exercise: Show that, for every integer i, we have

{x^i, x ∈ Fq} = {x^{gcd(i,q−1)}, x ∈ Fq}.

14.3.1 Absolute trace function


Let q = p^n. Recall that the Frobenius automorphism φ : x → x^p satisfies φ^n = Id and
that Fp is the set of solutions of the equation φ(x) = x. Two elements are called conjugate if
they correspond through φ^i for some integer i. We have seen that the zeros of any irreducible
polynomial over Fp are conjugate. The function
tr_{q/p}(x) = x + φ(x) + φ^2(x) + · · · + φ^{n−1}(x)
            = x + x^p + x^{p^2} + · · · + x^{p^{n−1}}
is Fp-linear on Fq. In the body of our book and below, for p = 2 and q = 2^n, we simply write
tr_n instead of tr_{q/p}.
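
As a hedged numerical illustration of the definition, the following Python sketch computes tr_3 on F8 = F2[x]/(x^3 + x + 1). Elements are assumed to be encoded as 3-bit integers (bit i is the coefficient of x^i); the names `gf_mul` and `trace` are illustrative.

```python
def gf_mul(a, b, mod_mask=0b1011, n=3):
    """Multiply two elements of F_{2^n} = F_2[x]/(P), P given by mod_mask."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1  # multiply a by x
        if (a >> n) & 1:
            a ^= mod_mask  # reduce modulo P
    return result


def trace(x, mod_mask=0b1011, n=3):
    """Absolute trace x + x^2 + x^4 + ... + x^(2^(n-1))."""
    total, power = 0, x
    for _ in range(n):
        total ^= power
        power = gf_mul(power, power, mod_mask, n)  # apply the Frobenius map
    return total


trace_values = [trace(x) for x in range(8)]
```

Every entry of `trace_values` is 0 or 1, addition in F8 (bitwise XOR) is mapped to addition in F2, and each value is taken four times.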

Exercise: Show that
tr_{q/p}(φ(x)) = φ(tr_{q/p}(x)) = tr_{q/p}(x) for every x ∈ Fq.
Hence tr_{q/p}(x) ∈ Fp for every element x of Fq (but not for every element of a superfield of
Fq, if we extend the polynomial function tr_{q/p} to this superfield) and tr_{q/p} is an Fp-linear
form over the space Fq. It is called the absolute trace function over Fq.

Exercise: 1. Show that, for every a ∈ Fq the function trq/p (ax) is identically null on Fq
if and only if a = 0.
2. Deduce that the set of all Fp -linear forms over Fq equals the set of these functions.

Remark. For every nonzero a ∈ Fq, the set {x ∈ Fq; tr_{q/p}(ax) = 0} is a hyperplane (a
vector space of codimension 1) in the vector space Fq over Fp. If (α_1, . . . , α_n) is a basis of
Fq over Fp and (β_1, . . . , β_n) an orthonormal basis (such that tr_{q/p}(α_i β_j) = 1 if i = j and
tr_{q/p}(α_i β_j) = 0 otherwise) and if a = Σ_{i=1}^{n} a_i β_i, x = Σ_{i=1}^{n} x_i α_i, then the equation of
this hyperplane in Fp^n is Σ_{i=1}^{n} a_i x_i = 0.

Exercise: Show that every hyperplane of Fq (viewed as a vector space over Fp) has this form.

Exercise: Show that, for every i ∈ Fp, we have

tr_{q/p}(X) − i = Π_{u∈Fq; tr_{q/p}(u)=i} (X − u).

Exercise: 1. Recall why a binary linear recurring sequence

s_n = a_1 s_{n−1} ⊕ · · · ⊕ a_L s_{n−L}; a_1, . . . , a_L ∈ F2 (14.1)

has ultimate period at most 2^L − 1.
2. Show that if a_L ≠ 0, then the sequence is fully periodic.
3. We assume that the polynomial f(x) = x^L + a_1 x^{L−1} + · · · + a_L is primitive.
a. Show that any sequence of the form s_n = tr_L(aα^n), where a ∈ F_{2^L}∗ and α is a zero of
f(x), satisfies Relation (14.1).
b. Deduce that all the nonzero sequences satisfying Relation (14.1) are of the form s_n =
tr_L(aα^n) where a ∈ F_{2^L}∗. Show that they admit 2^L − 1 as minimal period.
4. Conversely, we assume that the nonzero sequences s_n satisfying Relation (14.1) admit
2^L − 1 as minimal period. Show that f(x) is primitive.
These sequences are called m-sequences.
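
A minimal Python sketch of Relation (14.1) is given below, under the assumption that the feedback coefficients are passed as the tuple (a_1, . . . , a_L) and the initial state as (s_0, . . . , s_{L−1}). The function names are illustrative. With f(x) = x^4 + x + 1, that is (a_1, a_2, a_3, a_4) = (0, 0, 1, 1), the generated sequence is an m-sequence.

```python
def lfsr_sequence(coeffs, seed, length):
    """First `length` terms of s_n = a_1 s_{n-1} xor ... xor a_L s_{n-L}."""
    s = list(seed)
    while len(s) < length:
        bit = 0
        for i, a in enumerate(coeffs, start=1):
            bit ^= a & s[-i]  # contribution of a_i * s_{n-i}
        s.append(bit)
    return s[:length]


def minimal_period(seq):
    """Smallest T such that seq[i] == seq[i + T] on the whole observed window."""
    n = len(seq)
    for T in range(1, n // 2 + 1):
        if all(seq[i] == seq[i + T] for i in range(n - T)):
            return T
    return None


# Three full periods of the sequence attached to x^4 + x + 1, started from state 1, 0, 0, 0.
sequence = lfsr_sequence((0, 0, 1, 1), (1, 0, 0, 0), 45)
period = minimal_period(sequence)
```

Here `period` equals 2^4 − 1 = 15, and each period contains eight ones and seven zeros.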

Exercise: Determine the kernel of the linear mapping x ∈ F2n → x + x 2 and deduce that,
for every u ∈ F2n , there exists a solution of the equation x 2 + x = u in F2n if and only if
trn (u) = 0.

14.3.2 Subfields and other trace functions


Kronecker’s construction can be applied the same way as before to any finite field instead of
a prime field: let q be a prime power and f(x) be an irreducible polynomial of degree k over
Fq. Then Fq[x]/(f(x)) is a field of order q^k. This implies again that, if n divides m, then
(up to isomorphism) F_{p^n} is a subfield of F_{p^m}. Recall that, conversely, if F_{p^n} is a subfield of
F_{p^m}, then n divides m.
The trace function from F_{q^k} to Fq is the function
tr_{q^k/q}(x) = x + x^q + x^{q^2} + · · · + x^{q^{k−1}}.

Exercise: Prove that tr_{q^k/q} is an Fq-linear form over F_{q^k}.

Exercise: Check that, if n | m | s, then tr_{p^s/p^n} = tr_{p^m/p^n} ◦ tr_{p^s/p^m}.

14.4 Permutations on a finite field


Exercise: Show that, for every prime power q, every function f(x) from Fq to Fq is a
polynomial function of degree at most q − 1 over Fq, that is,

f(x) = Σ_{i=0}^{q−1} a_i x^i; a_i ∈ Fq

and that this representation is unique.

This polynomial is seen as an element of Fq[x]/(x^q − x). It is called a permutation
polynomial if the function f(x) is bijective.

14.4.1 Examples of permutation polynomials


Affine polynomials: P(x) = ax + b, a ≠ 0

Bijective linearized polynomials (or p-polynomials): P(x) = Σ_{i=0}^{n−1} a_i x^{p^i}, a_i ∈ Fq, such
that ker(P) = {0}. These polynomials being viewed [mod x^q − x], the exponents i are
viewed in Z/nZ, where q = p^n.

Exercise: Show that the gcd of two linearized polynomials L(x) = Σ_{i=0}^{n−1} a_i x^{p^i}, a_i ∈ Fp,
and L′(x) = Σ_{i=0}^{n−1} b_i x^{p^i}, b_i ∈ Fp (p-polynomials over Fp), equals Σ_{i=0}^{n−1} c_i x^{p^i}, c_i ∈ Fp,
where Σ_{i=0}^{n−1} c_i x^i = gcd(Σ_{i=0}^{n−1} a_i x^i, Σ_{i=0}^{n−1} b_i x^i). Deduce that L(x) is a permutation
polynomial over Fq if and only if Σ_{i=0}^{n−1} a_i x^i (its p-associate polynomial) is coprime with
x^n − 1.
This result extends to polynomials Σ_{i=0}^{n−1} a_i x^{q^i}, a_i ∈ Fq, over F_{q^n} for q = p^r.

The sums of such polynomials with constants (affine permutations).


Power (monomial) functions: recall that if α is a primitive element of Fq , then for every
i ∈ Z/(q − 1)Z, α i is primitive if and only if gcd(i, q − 1) = 1.

Exercise: Show that, for every i ∈ Z/(q − 1)Z, the function x → x i is a permutation of
Fq if and only if gcd(i, q − 1) = 1.
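
For a prime field Fp (so q = p), the statement of this exercise can be checked by brute force with ordinary modular arithmetic. The sketch below is only such a check, not a proof; the name `is_power_permutation` is illustrative.

```python
from math import gcd


def is_power_permutation(i, p):
    """True if x -> x^i is a bijection of Z/pZ (p prime, i >= 1)."""
    return len({pow(x, i, p) for x in range(p)}) == p


# Exponents i in 1..10 for which x -> x^i permutes F_11.
permuting_exponents = [i for i in range(1, 11) if is_power_permutation(i, 11)]
```

The list `permuting_exponents` consists exactly of the exponents coprime with q − 1 = 10.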

Exercise: Show for i = 2^j + 1 and q = 2^n that
gcd(i, q − 1) = gcd(2^{2j} − 1, q − 1)/gcd(2^j − 1, q − 1) = (2^{gcd(2j,n)} − 1)/(2^{gcd(j,n)} − 1)
and that x → x^i is a permutation of Fq if and only if n/gcd(j, n) is odd.

Dickson polynomials

Theorem For every positive integer k, there exists a polynomial D_k over Z satisfying the
formal equality D_k(x + 1/x) = x^k + 1/x^k.

Proof By induction on k: if k is odd then
(x + 1/x)^k = Σ_{i=0}^{(k−1)/2} \binom{k}{i} (x^{2i−k} + x^{k−2i})
and therefore, assuming the property valid until k − 1, it is proved for k, with
x^k = Σ_{i=0}^{(k−1)/2} \binom{k}{i} D_{k−2i}(x) and
D_k(x) = x^k − Σ_{i=1}^{(k−1)/2} \binom{k}{i} D_{k−2i}(x),
and if k is even, then
(x + 1/x)^k = Σ_{i=0}^{k/2−1} \binom{k}{i} (x^{2i−k} + x^{k−2i}) + \binom{k}{k/2}
implies x^k = Σ_{i=0}^{k/2−1} \binom{k}{i} D_{k−2i}(x) + \binom{k}{k/2} and therefore
D_k(x) = x^k − Σ_{i=1}^{k/2−1} \binom{k}{i} D_{k−2i}(x) − \binom{k}{k/2}.
This completes the proof.
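
The polynomials D_k can be generated numerically. The Python sketch below does not follow the binomial induction of the proof; it uses instead the classical recurrence D_0 = 2, D_1 = x, D_k = x D_{k−1} − D_{k−2}, which follows from (x + 1/x)(x^{k−1} + 1/x^{k−1}) = (x^k + 1/x^k) + (x^{k−2} + 1/x^{k−2}). Coefficient lists (lowest degree first) and the function names are assumptions of this example.

```python
from math import gcd


def dickson(k):
    """Integer coefficients (lowest degree first) of D_k, via D_k = x*D_{k-1} - D_{k-2}."""
    prev, cur = [2], [0, 1]  # D_0 = 2, D_1 = x
    if k == 0:
        return prev
    for _ in range(k - 1):
        nxt = [0] + cur  # x * D_{k-1}
        for i, c in enumerate(prev):
            nxt[i] -= c  # subtract D_{k-2}
        prev, cur = cur, nxt
    return cur


def is_permutation_mod_p(coeffs, p):
    """True if the polynomial, reduced mod p, induces a bijection of Z/pZ."""
    values = {sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p for x in range(p)}
    return len(values) == p


d3 = dickson(3)  # x^3 - 3x
d5_permutes_f7 = is_permutation_mod_p(dickson(5), 7)
```

Over F7 we have q^2 − 1 = 48, so D_5 is expected to be a permutation polynomial while D_3 is not; this is consistent with the exercise on Dickson polynomials further below.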

Exercise: 1. Let q = 2^n. Recall why the equation x^2 + x = c has solutions if and only if
tr_n(c) = 0. Deduce that, for a ≠ 0, the equation x^2 + ax = b has solutions if and only
if tr_n(b/a^2) = 0.
2. a. Let q = 2^n. Show that any element x of Fq satisfies tr_{2n}(x) = 0. Deduce that, for
every x ∈ Fq∗, there exist two elements h of F_{q^2}∗ such that h + h^{−1} = x (by transforming
this equation in h into an equation of degree 2 in h/x) and that these two elements are
inverse of each other.
b. Show that h belongs to Fq if and only if tr_n(x^{−1}) = 0 and that, otherwise, h belongs
to the multiplicative subgroup of order q + 1 of F_{q^2}∗.
3. Let q = p^n with p odd.
a. Does any element of Fq have a square root in Fq?
b. Show that, given a primitive element α of F_{q^2}, every element x ∈ Fq∗ can be written
α^{i(q+1)} and has a square root in F_{q^2}.
c. Show that, for every x ∈ Fq∗, there exist two elements h of F_{q^2}∗ such that h + h^{−1} = x
and that these two elements are inverse of each other.

Exercise:
1. Show that the Dickson polynomial defines a function from Fq to Fq.
2. Show that, if gcd(k, q^2 − 1) = 1, then D_k is a permutation polynomial over Fq.
3. To prove the converse, assume that gcd(k, q^2 − 1) = d > 1.
a. If d is even, show that D_k(−x) = D_k(x) and −x ≠ x.
b. If d is odd, then let r be an odd prime dividing d. Show that there exist two distinct
elements b, c in F_{q^2}∗ such that b + 1/b ∈ Fq, c + 1/c ∈ Fq, b ≠ 1/c and b^r = c^r (the two
cases where r divides q − 1 and r divides q + 1 can be distinguished). Check that
D_k(b + 1/b) = D_k(c + 1/c) and b + 1/b ≠ c + 1/c.

Then D_k is not a permutation polynomial over Fq.

14.4.2 General results on permutation polynomials


Exercise:
1. Show that Σ_{b∈Fq} b^t = 0 if t = 0, . . . , q − 2, and Σ_{b∈Fq} b^t = −1 if t = q − 1.
2. Deduce that, for any polynomial P(x) = Σ_{i=0}^{q−1} p_i x^i over Fq, we have
p_{q−1} = −Σ_{b∈Fq} P(b).

Exercise: With the usual convention 0^0 = 1, show that, given b ∈ Fq, we have

Σ_{t=0}^{q−2} b^t = 0 if b ≠ 0, 1; Σ_{t=0}^{q−2} b^t = 1 if b = 0; Σ_{t=0}^{q−2} b^t = −1 if b = 1

and

Σ_{t=0}^{q−1} b^t = 1 if b ≠ 1; Σ_{t=0}^{q−1} b^t = 0 if b = 1.

Lemma Let a_0, . . . , a_{q−1} be elements of Fq. These elements are all distinct if and only if

Σ_{i=0}^{q−1} a_i^t = 0 if t = 0, . . . , q − 2, and Σ_{i=0}^{q−1} a_i^t = −1 if t = q − 1.

Proof If a_0, . . . , a_{q−1} are distinct, then {a_0, . . . , a_{q−1}} = Fq and, denoting by α a primitive
element of Fq, we have

Σ_{i=0}^{q−1} a_i^t = 0^t + Σ_{j=0}^{q−2} α^{jt} = 0^t + Σ_{j=0}^{q−2} (α^t)^j,

which equals 0 if t = 0, . . . , q − 2 and −1 if t = q − 1.
Conversely, if Σ_{i=0}^{q−1} a_i^t = 0 for t = 0, . . . , q − 2 and Σ_{i=0}^{q−1} a_i^t = −1 for t = q − 1, then

P(x) = Σ_{i=0}^{q−1} Σ_{t=0}^{q−1} a_i^t x^{q−1−t} = −1.

For every b ∈ Fq∗, we have P(b) = Σ_{i=0}^{q−1} Σ_{t=0}^{q−1} (a_i/b)^t = |{i = 0, . . . , q − 1; a_i ≠ b}|
[mod p], according to the previous exercise. Since P(b) = −1, this gives
|{i = 0, . . . , q − 1; a_i = b}| ≡ 1 [mod p]. Hence, |{i = 0, . . . , q − 1; a_i = b}| ≠ 0, and
the same happens for b = 0. Every element of Fq is then taken by some a_i, so the q elements
a_0, . . . , a_{q−1} are distinct. This completes the proof.

Theorem (Hermite’s criterion) A polynomial P(x) over Fq is a permutation polynomial
if and only if the following two conditions hold:
1. P(x) has a single root in Fq.
2. For each integer t = 1, . . . , q − 2 not divisible by p, the polynomial (P(x))^t [mod x^q − x]
has degree at most q − 2.
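
Hermite’s criterion can be tested directly in the special case of a prime field, q = p. The Python sketch below assumes this special case and represents a polynomial by its coefficient list of length at most p, lowest degree first; all function names are illustrative.

```python
def evaluate(P, x, p):
    """Value of the polynomial P (lowest degree first) at x, modulo p."""
    return sum(c * pow(x, i, p) for i, c in enumerate(P)) % p


def mul_mod_xq_minus_x(a, b, p):
    """Product of a and b (length-p coefficient lists) modulo x^p - x over F_p."""
    out = [0] * p
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            e = i + j
            if e >= p:
                e -= p - 1  # x^p = x, hence x^e = x^(e - (p - 1)) for e >= p
            out[e] = (out[e] + ai * bj) % p
    return out


def hermite_is_permutation(P, p):
    """Hermite's criterion over the prime field F_p (q = p)."""
    P = list(P) + [0] * (p - len(P))
    # Condition 1: exactly one root in F_p.
    if sum(1 for x in range(p) if evaluate(P, x, p) == 0) != 1:
        return False
    # Condition 2: P^t mod (x^p - x) has degree <= p - 2 for t = 1, ..., p - 2.
    power = [1] + [0] * (p - 1)
    for _ in range(1, p - 1):
        power = mul_mod_xq_minus_x(power, P, p)
        if power[p - 1] != 0:
            return False
    return True


cube_permutes_f5 = hermite_is_permutation([0, 0, 0, 1], 5)    # x^3, gcd(3, 4) = 1
square_permutes_f5 = hermite_is_permutation([0, 0, 1], 5)     # x^2
```

When q = p, no t in {1, . . . , q − 2} is divisible by p, so every such t is tested. The criterion can be compared with a brute-force bijectivity check on all polynomials of small degree.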

Proof Condition 1 is necessary and implies that Σ_{b∈Fq} (P(b))^{q−1} = −1. Condition 2 is
equivalent to Σ_{b∈Fq} (P(b))^t = 0 for every t = 1, . . . , q − 2 not divisible by p. For t = pt′,
we have Σ_{b∈Fq} (P(b))^t = (Σ_{b∈Fq} (P(b))^{t′})^p. The lemma above completes the proof.

Exercise: Show that Σ_{v∈Fq} e^{2iπ tr_{q/p}(va)/p} = q if a = 0, and 0 otherwise.

Theorem (characterization through component functions) A polynomial P(x) over Fq is
a permutation polynomial if and only if, for every v ∈ Fq∗, the function tr_{q/p}(vP(x)) is
balanced (that is, takes every value of Fp the same number of times). Equivalently, for every
v ∈ Fq∗:

Σ_{c∈Fq} e^{2iπ tr_{q/p}(vP(c))/p} = 0.

Proof If P(x) is a permutation polynomial over Fq, then, for every v ∈ Fq∗, the function
tr_{q/p}(vP(x)) is balanced since the function tr_{q/p} is balanced over Fq. This implies that
Σ_{c∈Fq} e^{2iπ tr_{q/p}(vP(c))/p} is proportional to Σ_{j=0}^{p−1} (e^{2iπ/p})^j = 0.
Conversely, if the condition is satisfied, then for every b ∈ Fq, we have

|{c ∈ Fq; P(c) = b}| = (1/q) Σ_{c∈Fq} Σ_{v∈Fq} e^{2iπ tr_{q/p}(vP(c))/p} e^{−2iπ tr_{q/p}(vb)/p} = 1.

Remark. As proved by Carlitz in [330], all permutation polynomials over Fq with q > 2
are generated through composition by the multiplicative inverse monomial x q−2 and the
degree 1 polynomials ax + b with a ∈ F∗q , b ∈ Fq .

14.5 Equations over finite fields


Fundamental equation and general equations
We have seen that, for every prime power $q$, the equation $x^q - x = 0$ admits $\mathbb{F}_q$ as set of solutions. Consequently, given a prime $p$ and two positive integers $r$ and $s$, the equation $x^{p^s} - x = 0$ has for solutions in $\mathbb{F}_{p^r}$ the elements of $\mathbb{F}_{p^r} \cap \mathbb{F}_{p^s} = \mathbb{F}_{p^{\gcd(r,s)}}$, and in $\mathbb{F}_{p'^r}$ with $p' \neq p$, it admits 0 and 1 as solutions.
Important remark. More generally and for the same reason, finding the solutions in $\mathbb{F}_q$ of an equation $P(x) = 0$ over $\mathbb{F}_q$ is equivalent to finding the roots of the polynomial $\gcd(P(x), x^q - x)$. Since the polynomial $x^q - x$ splits over $\mathbb{F}_q$ with simple roots, the number of distinct solutions equals the degree of $\gcd(P(x), x^q - x)$.
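As an illustration of this remark, the number of distinct roots of a polynomial can be read off a polynomial gcd. The sketch below assumes $q = p$ prime and represents polynomials as coefficient lists (lowest degree first); `poly_rem`, `poly_gcd`, and `count_roots` are hypothetical helper names, and inverses are taken with Fermat's little theorem.

```python
def trim(f):
    """Drop trailing zero coefficients in place and return the list."""
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_rem(f, g, p):
    """Remainder of f divided by a nonzero polynomial g over F_p."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    inv_lead = pow(g[-1], p - 2, p)  # inverse of the leading coefficient
    while len(f) >= len(g):
        coef = f[-1] * inv_lead % p
        shift = len(f) - len(g)
        for i, gi in enumerate(g):
            f[shift + i] = (f[shift + i] - coef * gi) % p
        trim(f)
    return f


def poly_gcd(f, g, p):
    """Euclidean algorithm over F_p (the result is not normalized)."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    while g:
        f, g = g, poly_rem(f, g, p)
    return f


def count_roots(P, p):
    """Number of distinct roots of P in F_p, as deg gcd(P, x^p - x)."""
    x_p_minus_x = [0, -1] + [0] * (p - 2) + [1]
    return len(poly_gcd(x_p_minus_x, P, p)) - 1
```

For example, `count_roots([-1, 0, 0, 1], 7)` counts the cube roots of unity in $\mathbb{F}_7$. For large $p$ one would first compute $x^p$ modulo $P$ by repeated squaring rather than build $x^p - x$ explicitly.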

Equations of degree 1
$ax + b = 0$, $a \neq 0$, has solution $-b/a$.

Equations of degree 2
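$ax^2 + bx + c = 0$, $a \neq 0$, behaves differently according to whether $p = 2$ or not:
• If $p \neq 2$, then the usual resolution works.
• If $p = 2$ then, if $a \neq 0$ and $b \neq 0$, $ax^2 + bx + c = 0$ is equivalent to $\left(\frac{ax}{b}\right)^2 + \frac{ax}{b} = \frac{ac}{b^2}$. Hence, solving the equation $ax^2 + bx + c = 0$ of degree 2 reduces to solving the equation $x^2 + x = \beta$ for some $\beta$. Let $c \in \mathbb{F}_{2^n}$ and $x = \sum_{j=1}^{n-1} \beta^{2^j} \sum_{k=0}^{j-1} c^{2^k}$, then
\[
\begin{aligned}
x + x^2 &= \sum_{j=1}^{n-1} \sum_{k=0}^{j-1} \beta^{2^j} c^{2^k} + \sum_{j=2}^{n} \sum_{k=1}^{j-1} \beta^{2^j} c^{2^k} \\
&= \sum_{j=1}^{n-1} \sum_{k=0}^{j-1} \beta^{2^j} c^{2^k} + \sum_{j=2}^{n} \sum_{k=0}^{j-1} \beta^{2^j} c^{2^k} + \sum_{j=2}^{n} \beta^{2^j} c \\
&= \beta^2 c + \beta \sum_{k=1}^{n-1} c^{2^k} + \sum_{j=2}^{n-1} \beta^{2^j} c \\
&= c\, \mathrm{tr}_n(\beta) + \beta\, \mathrm{tr}_n(c).
\end{aligned}
\]
This equality and what we have seen already in an exercise imply: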

Theorem Let $n$ be any positive integer and $\beta \in \mathbb{F}_{2^n}$. A necessary and sufficient condition for the existence of solutions in $\mathbb{F}_{2^n}$ of the equation $x^2 + x = \beta$ is that $\mathrm{tr}_n(\beta) = 0$. Assuming that this condition is satisfied, the solutions of the equation are
\[ x = \sum_{j=1}^{n-1} \beta^{2^j} \Big( \sum_{k=0}^{j-1} c^{2^k} \Big) \quad \text{and} \quad x = 1 + \sum_{j=1}^{n-1} \beta^{2^j} \Big( \sum_{k=0}^{j-1} c^{2^k} \Big), \]
where $c$ is any (fixed) element such that $\mathrm{tr}_n(c) = 1$.
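The explicit solution formula of this theorem can be tried out in a small field. The sketch below works in $\mathbb{F}_{2^8}$, represented (as an assumption of this example) by bytes modulo the irreducible polynomial $X^8 + X^4 + X^3 + X + 1$; the helper names are illustrative, and the element $c$ of trace 1 is simply found by search.

```python
N, POLY = 8, 0x11B  # GF(2^8) = GF(2)[X] / (X^8 + X^4 + X^3 + X + 1)


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:  # degree reached N: reduce modulo POLY
            a ^= POLY
    return r


def trace(x):
    """Absolute trace tr_n(x) = x + x^2 + ... + x^(2^(N-1)); returns 0 or 1."""
    t = 0
    for _ in range(N):
        t ^= x
        x = gf_mul(x, x)
    return t


def solve_x2_plus_x(beta):
    """Return the two solutions of x^2 + x = beta, or None if tr(beta) = 1."""
    if trace(beta) != 0:
        return None
    c = next(e for e in range(1 << N) if trace(e) == 1)
    x, inner, c_pow, b_pow = 0, 0, c, beta
    for _ in range(1, N):             # j = 1, ..., N-1
        inner ^= c_pow                # inner = sum_{k<j} c^(2^k)
        c_pow = gf_mul(c_pow, c_pow)  # c^(2^j)
        b_pow = gf_mul(b_pow, b_pow)  # beta^(2^j)
        x ^= gf_mul(b_pow, inner)
    return x, x ^ 1
```

Since $\mathrm{tr}_n$ is balanced, exactly half of the elements $\beta$ admit solutions; a loop over all 256 values of `beta` can confirm that each returned pair satisfies $x^2 + x = \beta$.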

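Note that if $n = 2m$ and $\beta \in \mathbb{F}_{2^m}$, then the condition $\mathrm{tr}_n(\beta) = 0$ is satisfied and $x$ simplifies into
\[ x = \beta \Big( \sum_{k=0}^{m-1} c^{2^k} \Big) + \sum_{j=1}^{m-1} \beta^{2^j} \Big( \sum_{k=j}^{m+j-1} c^{2^k} \Big) = \sum_{j=0}^{m-1} (\beta d)^{2^j}, \]
where $d = \sum_{k=0}^{m-1} c^{2^k}$, and since $\mathrm{tr}_n(c) = \mathrm{tr}^n_m(d)$, we have that $x = \sum_{j=0}^{m-1} (\beta d)^{2^j}$, where $d$ is any element of $\mathbb{F}_{2^n}$ such that $\mathrm{tr}^n_m(d) = 1$. Then, for every $u \neq 0$ and $v$ in $\mathbb{F}_{2^m}$, the equation $x^2 + ux = v$ has for solutions $x$ and $x + u$ in $\mathbb{F}_{2^n}$, where
\[ x = u \sum_{j=0}^{m-1} \left( \frac{vd}{u^2} \right)^{2^j}, \]
where $d \in \mathbb{F}_{2^n}$, $\mathrm{tr}^n_m(d) = 1$.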

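Remark. For the more general equation $x + x^{2^k} = b$, where $k$ is odd, $\gcd(k, n) = 1$, and $\mathrm{tr}_n(b) = 0$, let $S_{n,k}(x) = \sum_{i=0}^{n-1} x^{2^{ki}}$ and $U = \{\zeta \in \mathbb{F}_{2^{2n}} \mid \zeta^{2^n+1} = 1\}$; let $\zeta \in U \setminus \{1\}$; then, for any $b \in \mathbb{F}_{2^n}^*$, we have $\{x \in \mathbb{F}_{2^n} \mid x + x^{2^k} = b\} = S_{n,k}\left(\frac{b}{\zeta+1}\right) + \mathbb{F}_2$; see [700].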

Equations of degree 3
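Theorem [64, 1119] Let $t_1, t_2$ denote the roots of $t^2 + bt + a^3 = 0$ in $\mathbb{F}_{2^{2n}}$, where $a \in \mathbb{F}_{2^n}$, $b \in \mathbb{F}_{2^n}^*$. Then the factorization of $f(x) = x^3 + ax + b$ over $\mathbb{F}_{2^n}$ is characterized as follows:
– $f$ has three zeros in $\mathbb{F}_{2^n}$ if and only if $\mathrm{tr}_n\left(\frac{a^3}{b^2} + 1\right) = 0$ and $t_1, t_2$ are cubes in $\mathbb{F}_{2^n}$ ($n$ even), $\mathbb{F}_{2^{2n}}$ ($n$ odd).
– $f$ has exactly one zero in $\mathbb{F}_{2^n}$ if and only if $\mathrm{tr}_n\left(\frac{a^3}{b^2} + 1\right) = 1$.
– $f$ has no zero in $\mathbb{F}_{2^n}$ if and only if $\mathrm{tr}_n\left(\frac{a^3}{b^2} + 1\right) = 0$ and $t_1, t_2$ are not cubes in $\mathbb{F}_{2^n}$ ($n$ even), $\mathbb{F}_{2^{2n}}$ ($n$ odd).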

We refer to [64, 1119] for the proof.
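The trace condition of this theorem (the middle case) lends itself to an exhaustive check in small fields. The following sketch compares the brute-force zero count of $x^3 + ax + b$ with the value of $\mathrm{tr}_n(a^3/b^2 + 1)$; the field representation by an integer bit mask `poly` and all function names are assumptions of this example, and the cube conditions on $t_1, t_2$ are not tested.

```python
def gf_mul(a, b, n, poly):
    """Multiply in GF(2^n) = GF(2)[X]/(poly), elements encoded as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return r


def gf_pow(a, e, n, poly):
    """Square-and-multiply exponentiation in GF(2^n)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n, poly)
        a = gf_mul(a, a, n, poly)
        e >>= 1
    return r


def trace(x, n, poly):
    """Absolute trace of x, returned as the integer 0 or 1."""
    t = 0
    for _ in range(n):
        t ^= x
        x = gf_mul(x, x, n, poly)
    return t


def zero_count(a, b, n, poly):
    """Brute-force number of zeros of x^3 + a*x + b in GF(2^n)."""
    return sum(1 for x in range(1 << n)
               if (gf_pow(x, 3, n, poly) ^ gf_mul(a, x, n, poly) ^ b) == 0)


def trace_value(a, b, n, poly):
    """tr_n(a^3 / b^2 + 1) for b != 0 (inverse computed as b^(2^n - 2))."""
    b_inv = gf_pow(b, (1 << n) - 2, n, poly)
    ratio = gf_mul(gf_pow(a, 3, n, poly),
                   gf_mul(b_inv, b_inv, n, poly), n, poly)
    return trace(ratio ^ 1, n, poly)
```

With `n, poly = 4, 0x13` ($X^4 + X + 1$) or `n, poly = 5, 0x25` ($X^5 + X^2 + 1$), one can loop over all $a$ and all $b \neq 0$ and compare `zero_count(a, b, n, poly) == 1` with `trace_value(a, b, n, poly) == 1`.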

Equation $x^{2^k+1} + x + a = 0$

This equation, first studied in [96, 594, 595], is solved in [700] for gcd(k, n) = 1 (which is
a breakthrough).

Power equations
The image set of a power function $x^i$ equals the union of $\{0\}$ and of a multiplicative subgroup of $\mathbb{F}_q^*$ of order $\frac{q-1}{\gcd(i, q-1)}$. The equation $x^i = a$ has one solution if $a = 0$, no solution if $a$ does not belong to this subgroup and $\gcd(i, q-1)$ solutions if $a$ belongs to it, since there exist integers $k$ (coprime with $q - 1$) and $l$ (coprime with $i$), such that $ik + l(q-1) = \gcd(i, q-1)$.
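These solution counts can be observed directly in a prime field. The sketch below assumes $q = p$ prime and $i \geq 1$, and enumerates solutions by brute force; the function name is illustrative.

```python
from math import gcd


def power_equation_solutions(i, a, p):
    """All x in F_p with x^i = a, for a prime p and an exponent i >= 1."""
    return [x for x in range(p) if pow(x, i, p) == a % p]


def solution_count_profile(i, p):
    """Map each a in F_p to the number of solutions of x^i = a."""
    return {a: len(power_equation_solutions(i, a, p)) for a in range(p)}
```

For $p = 13$ and $i = 4$, `solution_count_profile(4, 13)` should show one solution for $a = 0$, and `gcd(4, 12)` solutions for each of the $(p-1)/\gcd(i, p-1)$ nonzero elements of the image subgroup, with no solution elsewhere.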

Multivariate method: an example


Hans Dobbertin has developed a method for solving some kinds of equations that play a role
when proving that some vectorial functions are APN. The method applies if n is a multiple
of a small positive number, say 2 or 3. Assume for instance that n is a multiple of 3, denote
k = n/3, and assume that we are given some equation in which x appears with exponents
that are linear combinations of $1$, $2^k$, and $2^{2k}$. The idea of the method is then to introduce the
two new variables $y = x^{2^k}$ and $z = y^{2^k}$, to express the equation and its $2^k$-th and $2^{2k}$-th powers
by means of the unknowns x, y, z and to eliminate (for instance by using resultants) some of
these variables from these three equations. Then even if y and z are eliminated, it happens
that the resulting equation is different from the original one. We give an example of this
method taken from [158].
Let $s$ and $k$ be positive integers with $\gcd(s, 3k) = 1$, and $n = 3k$. Let
\[ d = 2^{2k} + 2^{k+s} - (2^s + 1), \quad g_1 = \gcd(2^{3k} - 1, d/(2^k - 1)), \quad g_2 = \gcd(2^k - 1, d/(2^k - 1)), \]
and let $a \in \mathbb{F}_{2^n}^*$ have the order $2^{2k} + 2^k + 1$ (i.e. $a = \alpha^{2^k - 1}$ for some primitive element $\alpha$ of $\mathbb{F}_{2^n}^*$). Let
\[ \Delta_a(x) = a\left(x^{2^{2k}} + x^{2^{k+s}}\right) + x^{2^s} + x. \]

The equation $\Delta_a(x) = 0$ has $x = 0$ and $x = 1$ for zeros. Let us show that if $g_1 \neq g_2$, then there are no other zeros.
We denote $y = x^{2^k}$, $z = y^{2^k}$ and $b = a^{2^k}$, $c = b^{2^k}$, and so the equation $\Delta_a(x) = 0$ can be rewritten as $a(z + y^{2^s}) + (x^{2^s} + x) = 0$. By definition, $a$ is always a $(2^k - 1)$th power and thus $abc = 1$. Besides, $a \notin \mathbb{F}_2$. Considering also the conjugated equations we derive the following system of equations:
\[
\begin{aligned}
f_1 &= \Delta_a(x) = a(z + y^{2^s}) + x^{2^s} + x = 0 \\
f_2 &= f_1^{2^k} = b(x + z^{2^s}) + y^{2^s} + y = 0 \\
f_3 &= f_1^{2^{2k}} = \frac{1}{ab}(y + x^{2^s}) + z^{2^s} + z = 0.
\end{aligned}
\]
Eliminating $y$ and $z$ from these equations gives an equation in $x$. It happens that this equation is in general different from the original equation and is often simpler: we compute
\[
\begin{aligned}
R_1 &= b(f_1)^{2^s} + a^{2^s} f_2 = a^{2^s} b\, y^{2^{2s}} + a^{2^s} y^{2^s} + a^{2^s} y + b\, x^{2^{2s}} + b\, x^{2^s} + a^{2^s} b\, x \\
R_2 &= \frac{1}{a(b+1)} (b f_1 + a f_2 + ab f_3) = y^{2^s} + \frac{a+1}{ab+a}\, y + \frac{1}{a}\, x^{2^s} + \frac{ab+b}{ab+a}\, x
\end{aligned}
\]
to eliminate $z$. To eliminate $y^{2^{2s}}$, we compute
\[
\begin{aligned}
R_3 &= R_1 + a^{2^s} b (R_2)^{2^s} \\
&= \frac{a^{2^s}(b+1)^{2^s} + (a+1)^{2^s} b}{(b+1)^{2^s}}\, y^{2^s} + a^{2^s} y + \frac{a^{2^s} b^{2^s+1} + b}{b^{2^s} + 1}\, x^{2^s} + a^{2^s} b\, x.
\end{aligned}
\]
Using equations $R_2$ and $R_3$, we can eliminate $y^{2^s}$ by computing
\[ R_4 = R_3 + \frac{a^{2^s}(b+1)^{2^s} + (a+1)^{2^s} b}{(b+1)^{2^s}}\, R_2 = P(a)\left(y + (b+1)\, x^{2^s} + b x\right), \]
where
\[ P(a) = \frac{(ab)^{2^s+1} + (ab)^{2^s} + a^{2^s} b + a^{2^s} + ab + b}{(b+1)^{2^s+1}\, a}. \]
Computing
\[ R_5 = (R_4)^{2^s} + P(a)^{2^s} R_2 = P(a)^{2^s} \times \left( \frac{a+1}{ab+a}\, y + (b^{2^s} + 1)\, x^{2^{2s}} + \frac{a b^{2^s} + 1}{a}\, x^{2^s} + \frac{ab+b}{ab+a}\, x \right) \]
we finally get our desired equation by
\[ R_6 = \frac{a+1}{ab+a}\, P(a)^{2^s-1} R_4 + R_5 = P(a)^{2^s} (b+1)^{2^s} \left( x^{2^{2s}} + x^{2^s} \right). \]
Obviously if $x$ is a solution of $\Delta_a(x) = 0$, then $R_6(x) = 0$. For $P(a)^{2^s}(b+1) \neq 0$, this is equivalent to $x = 0, 1$. Thus to prove the result, it is sufficient to show that $P(a)$ does not vanish for elements $a$ fulfilling the equation
\[ a = \left( \alpha\, v^{2^k + 2^s + 1} \right)^{2^k - 1}. \tag{14.2} \]
Note that, if $a$ satisfies (14.2), then $a$ is not a $(2^k + 2^s + 1)$th power, since $\alpha^{2^k-1}$ is not: $g_2 = \gcd(2^k - 1, 2^k + 2^s + 1)$ is by hypothesis a strict divisor of $g_1 = \gcd(2^n - 1, 2^k + 2^s + 1)$ and $\alpha$ being a primitive element, it cannot be a $(g_1/g_2)$th power.
Consequently, it is sufficient to show that if $P(a) = 0$, then $a$ is a $(2^k + 2^s + 1)$th power. For $a \notin \mathbb{F}_2$ the equation $P(a) = 0$ is equivalent to
\[ a = \left( \frac{a+1}{c+1} \right)^{2^s+1} c^{2^s+1}\, \frac{b+1}{a+1}\, a = \left( \frac{a+1}{c+1}\, c \right)^{2^k + 2^s + 1}, \]
as can be easily seen by dividing this equality by $a$, simplifying it by $(a + 1)$, and then expanding it, using that $c = 1/ab$. Note that the right-hand side is always a $(2^k + 2^s + 1)$-th power. This proves the property.
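For small parameters, the statement proved in this example can be explored by enumeration. The sketch below takes $k = 2$, $s = 1$ (so $n = 6$), and represents $\mathbb{F}_{2^6}$ modulo $X^6 + X + 1$ with $\alpha = X$, assumed here to be primitive; these choices and all names are assumptions of the illustration, not taken from [158]. It computes $g_1$, $g_2$ and lists the zeros of $\Delta_a$, which the argument above predicts to be only 0 and 1 when $g_1 \neq g_2$.

```python
from math import gcd

N, POLY = 6, 0x43  # GF(2^6) = GF(2)[X] / (X^6 + X + 1)


def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^N)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def delta(a, x, k, s):
    """Delta_a(x) = a*(x^(2^(2k)) + x^(2^(k+s))) + x^(2^s) + x."""
    bracket = gf_pow(x, 1 << (2 * k)) ^ gf_pow(x, 1 << (k + s))
    return gf_mul(a, bracket) ^ gf_pow(x, 1 << s) ^ x


k, s = 2, 1
d = 2 ** (2 * k) + 2 ** (k + s) - (2 ** s + 1)
g1 = gcd(2 ** (3 * k) - 1, d // (2 ** k - 1))
g2 = gcd(2 ** k - 1, d // (2 ** k - 1))
alpha = 2                       # the class of X
a = gf_pow(alpha, 2 ** k - 1)   # a = alpha^(2^k - 1)
zeros = [x for x in range(1 << N) if delta(a, x, k, s) == 0]
```

Inspecting `g1`, `g2`, and `zeros` after running the snippet gives a quick sanity check of the hypotheses and of the conclusion for this particular choice of parameters.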
References

[1] E. Abbe, A. Shpilka, and A. Wigderson. Reed–Muller codes for random erasures and errors. IEEE
Transactions on Information Theory 61 (10), pp. 5229–5252, 2015. See page 151.
[2] K. Abdukhalikov. Bent functions and line ovals. Finite Fields and Their Applications 47,
pp. 94–124, 2017. See pages 220 and 221.
[3] K. Abdukhalikov. Hyperovals and bent functions. European J. Combin. 79, pp. 123–139, 2019. See
page 220.
[4] K. Abdukhalikov and S. Mesnager. Explicit constructions of bent functions from pseudo-planar
functions. Advances in Mathematics of Communications 11 (2), pp. 293–299, 2017. See page 217.
[5] K. Abdukhalikov and S. Mesnager. Bent functions linear on elements of some classical spreads and
presemifields spreads. Cryptography and Communications 9 (1), pp. 3–21, 2017. See page 227.
[6] C. M. Adams. Constructing symmetric ciphers using the cast design procedure. Designs, Codes and
Cryptography 12 (3), pp. 283–316, 1997. See page 26.
[7] C. M. Adams and S. E. Tavares. Generating and counting binary bent sequences. IEEE Transactions
on Information Theory 36 (5), pp. 1170–1173, 1990. See pages 234 and 293.
[8] S. Agievich. On the representation of bent functions by bent rectangles. Proceedings of Proba-
bilistic Methods in Discrete Mathematics: Fifth International Conference, pp. 121–135, 2002, and
arXiv /0502087, 2005. See pages 237 and 239.
[9] S. Agievich. On the affine classification of cubic bent functions. IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2005/44, 2005. See page 208.
[10] S. Agievich. Bent rectangles. NATO Science for Peace and Security Series – D: Information and
Communication Security, Vol 18: Boolean Functions in Cryptology and Information Security, IOS
Press, pp. 3–22, 2008. See page 239.
[11] M. R. Albrecht, C. Rechberger, T. Schneider, T. Tiessen, and M. Zohner. Ciphers for MPC and FHE.
Proceedings of EUROCRYPT (1) 2015, Lecture Notes in Computer Science 9056, pp. 430–454,
2015. See page 454.
[12] N. Alon, O. Goldreich, J. Hastad, and R. Peralta. Simple constructions of almost k-wise independent
random variables. Random Structures and Algorithms 3 (3), pp. 289–304, 1992. See page 105.
[13] N. Alon, T. Kaufman, M. Krivelevich, S. Litsyn, and D. Ron. Testing Reed–Muller codes. IEEE
Transactions on Information Theory 51 (11), pp. 4032–4039, 2005. See page 469.
[14] N. Alon and J. H. Spencer. The Probabilistic Method. Wiley-VCH, 2000 (second edition). See
page 84.
[15] Y. Alsalami. Constructions with high algebraic degree of differentially 4-uniform (n, n − 1)-
functions and differentially 8-uniform (n, n − 2)-functions. Cryptography and Communications
10 (4), pp. 611–628, 2018. See page 424.
[16] A. S. Ambrosimov. Properties of bent functions of q-valued logic over finite fields. Discrete
Mathematics and Applications 4 (4), pp. 341–350, 1994. See page 193.
[17] N. Anbar and W. Meidl. Bent and bent4 spectra of Boolean functions over finite fields. Finite Fields
and Their Applications 46, 163–178, 2017. See pages 266 and 268.
[18] N. Anbar and W. Meidl. Modified planar functions and their components. Cryptography and
Communications 10 (2), pp. 235–249, 2018. See pages 266, 267, 268, and 274.


[19] N. Anbar, W. Meidl, and A. Topuzoğlu. On the nonlinearity of idempotent quadratic functions
and the weight distribution of subcodes of Reed–Muller codes. Proceedings of the 9th International
Workshop on Coding and Cryptography 2015 WCC2015, (https://hal.archives-ouvertes.fr/
WCC2015browse/latest-publications), 2015. See page 178.
[20] R. J. Anderson. Searching for the optimum correlation attack. Proceedings of Fast Software
Encryption FSE 1994, Lecture Notes in Computer Science 1008, pp. 137–143, 1994. See page 89.
[21] B. Applebaum. Pseudorandom generators with long stretch and low locality from random local
one-way functions. Proceedings of ACM STOC 2012, pp. 805–816. ACM Press, 2012.
[22] B. Applebaum. Cryptographic hardness of random local functions-survey. Computational Com-
plexity 25 (3), pp. 667–722, 2013. See page 468.
[23] B. Applebaum and S. Lovett. Algebraic attacks against random local functions and their counter-
measures. Proceedings of ACM STOC 2016, pp. 1087–1100, 2016. See pages 468 and 469.
[24] R. Aragona, M. Calderini, D. Maccauro, and M. Sala. On some differential properties of Boolean
functions. Applicable Algebra in Engineering, Communication and Computing 27 (5), pp. 359–372,
2016. See pages 373, 383, and 411.
[25] F. Armknecht. Improving fast algebraic attacks. Proceedings of Fast Software Encryption FSE
2004, Lecture Notes in Computer Science 3017, pp. 65–82, 2004. See page 94.
[26] F. Armknecht and G. Ars. Introducing a new variant of fast algebraic attacks and minimizing their
successive data complexity. Proceedings of International Conference on Cryptology in Malaysia
Mycrypt 2005, Lecture Notes in Computer Science 3715, pp. 16–32, 2005. See page 94.
[27] F. Armknecht, C. Carlet, P. Gaborit, S. Künzli, W. Meier, and O. Ruatta. Efficient computation of
algebraic immunity for algebraic and fast algebraic attacks. Proceedings of EUROCRYPT 2006,
Lecture Notes in Computer Science 4004, pp. 147–164, 2006. See pages 92, 321, 334, 335, 336,
and 338.
[28] F. Armknecht and M. Krause. Algebraic attacks on combiners with memory. Proceedings of
CRYPTO 2003, Lecture Notes in Computer Science 2729, pp. 162–175, 2003. See page 91.
[29] F. Armknecht and M. Krause. Constructing single- and multi-output Boolean functions with max-
imal algebraic immunity. Proceedings of ICALP 2006, Lecture Notes of Computer Science 4052,
pp. 180–191, 2006. See pages 127, 128, 344, 345, 346, and 348.
[30] F. Arnault and T. P. Berger. Design and properties of a new pseudorandom generator based on a
filtered FCSR automaton. IEEE Transactions on Computers 54 (11), pp. 1374–1383, 2005. See
page 23.
[31] S. Arora and B. Barak. Computational Complexity: A Modern Approach. Cambridge University
Press, 2009. See page 19.
[32] G. Ars and J.-C. Faugère. Algebraic immunities of functions over finite fields. Proceedings of the
Conference BFCA 2005, Publications des universités de Rouen et du Havre, pp. 21–38, 2005. See
pages 127 and 344.
[33] A. Ashikhmin and A. Barg. Minimal vectors in linear codes. IEEE Transactions on Information
Theory 44 (5), pp. 2010–2017, 1998. See pages 148 and 149.
[34] E. F. Assmus. On the Reed–Muller codes. Discrete Mathematics 106/107, pp. 25–33, 1992. See
page 154.
[35] E. F. Assmus and J. D. Key. Designs and Their Codes. Cambridge University Press, 1992.
[36] E. F. Assmus Jr. and H. F. Mattson Jr. New 5-designs. Journal of Combinatorial Theory, Series A 6
(2), pp. 122–151, 1969. See page 270.
[37] E. F. Assmus Jr. and H. F. Mattson Jr. The weight-distribution of a coset of a linear code. IEEE
Transactions on Information Theory 24 (4), p. 497, 1978. See page 15.
[38] Y. Aubry, D. J. Katz, and P. Langevin. Cyclotomy of Weil sums of binomials. Journal of Number
Theory 154, pp. 160–178, 2015. See page 73.
[39] Y. Aubry and P. Langevin. On a conjecture of Helleseth. Proceedings of Algebraic informatics CAI
2013, Lecture Notes in Comput. Science 8080, pp. 113–118, 2013. See page 73.
[40] Y. Aubry, G. McGuire, and F. Rodier. A few more functions that are not APN infinitely often.
Proceedings of Fq9, Contemporary Mathematics 518, 2010. ArXiv 0909.2304. See page 404.

[41] J. Ax. Zeroes of polynomials over finite fields. American Journal on Mathematics 86, pp. 255–261,
1964. See page 156.
[42] T. Baignères, P. Junod, and S. Vaudenay. How far can we go beyond linear cryptanalysis?
Proceedings of ASIACRYPT 2004, Lecture Notes in Computer Science 3329, pp. 432–450, 2004.
See page 191.
[43] R. D. Baker, J. H. Van Lint, and R. M. Wilson. On the Preparata and Goethals codes. IEEE
Transactions on Information Theory 29, pp. 342–345, 1983. See page 255.
[44] J. Balasch, S. Faust, and B. Gierlichs. Inner product masking revisited. Proceedings of EURO-
CRYPT 2015, Lecture Notes in Computer Science 9056, pp. 486–510, 2015. See pages 446 and 447.
[45] J. Balasch, S. Faust, B. Gierlichs, C. Paglialonga, and F.-X. Standaert. Consolidating inner
product masking. Proceedings of ASIACRYPT 2017, Lecture Notes in Computer Science 10624,
pp. 724–754, 2017. See page 446.
[46] J. Balasch, S. Faust, B. Gierlichs, and I. Verbauwhede. Theory and practice of a leakage resilient
masking scheme. Proceedings of ASIACRYPT 2012, Lecture Notes in Computer Science 7658,
pp. 758–775, 2012. See pages 446 and 447.
[47] B. Barak, G. Kindler, R. Shaltiel, B. Sudakov, and A. Wigderson. Simulating Independence: new
constructions of condensers, Ramsey graphs, dispersers, and extractors. Proceedings of ACM STOC
2005, 2005. See page 108.
[48] G. Barthe, S. Belaïd, F. Dupressoir, et al. Strong non-interference and type-directed higher-order
masking. Proceedings of ACM CCS 16, ACM Press, pp. 116–129, 2016. See page 430.
[49] G. Barthe, F. Dupressoir, S. Faust, B. Grégoire, F.-X. Standaert, and P.-Y. Strub. Parallel implemen-
tations of masking schemes and the bounded moment leakage model. Proceedings of EUROCRYPT
2017, Lecture Notes in Computer Science 10210, pp. 535–566, 2017. See page 429.
[50] L. A. Bassalygo, G. V. Zaitsev, and V. A. Zinoviev. Uniformly packed codes. Problems of
Information Transmission 10, No. 1, pp. 9–14, 1974. See page 10.
[51] L. A. Bassalygo and V. A. Zinoviev. Remarks on uniformly packed codes. Problems of Information
Transmission 13, No 3, pp. 22–25, 1977. See page 10.
[52] L. M. Batten. Algebraic attacks over GF(q). Proceedings of INDOCRYPT 2004, Lecture Notes in
Computer Science 3348, pp. 84–91, 2004. See page 91.
[53] P. Beelen and G. Leander. A new construction of highly nonlinear S-boxes. Cryptography and
Communications 4(1), pp. 65–77, 2012. See pages 122 and 160.
[54] S. Belaïd, F. Benhamouda, A. Passelègue, E. Prouff, A. Thillard, and D. Vergnaud. Private
multiplication over finite fields. Proceedings of CRYPTO 2017, Lecture Notes in Computer Science
10403, pp. 397–426, 2017. See page 431.
[55] S. Belaïd, D. Goudarzi, and M. Rivain. Tight private circuits: achieving probing security with the
least refreshing. Proceedings of ASIACRYPT 2018, Lecture Notes in Computer Science 11273,
pp. 343–372, 2018. See page 430.
[56] T. D. Bending. Bent functions, SDP designs and their automorphism groups. Ph.D. thesis, Queen
Mary and Westfield College, 1993. See page 192.
[57] T. Bending and D. Fon-Der-Flaass. Crooked functions, bent functions and distance regular graphs.
Electron. J. Comb. 5, Research paper 34 (electronic), 14 pages, 1998. See pages 231, 278, 279,
and 373.
[58] C. H. Bennett, G. Brassard, and J. M. Robert. Privacy amplification by public discussion. SIAM
Journal on Computing 17, pp. 210–229, 1988. See pages 129, 292, and 314.
[59] M. Ben-Or, S. Goldwasser, and A. Wigderson. Completeness theorems for non-cryptographic
fault-tolerant distributed computation. Proceedings of ACM STOC 1988, pp. 1–10, 1988. See page 436.
[60] C. Berbain, O. Billet, A. Canteaut, et al. Decim v2. New Stream Cipher Designs – The eSTREAM
Finalists, Lecture Notes in Computer Science 4986, pp. 140–151, 2008. See page 318.
[61] C. Berbain, H. Gilbert, and J. Patarin. QUAD: a practical stream cipher with provable security.
Proceedings of EUROCRYPT 2006, Lecture Notes in Computer Science 4004, pp 109–128, 2006.
See page 4.

[62] T. Berger, A. Canteaut, P. Charpin, and Y. Laigle-Chapuy. On almost perfect nonlinear functions.
IEEE Transactions on Information Theory 52 (9), pp. 4160–4170, 2006. See pages 385, 386, 391,
and 411.
[63] E. Berlekamp, Algebraic Coding Theory, McGraw-Hill, 1968. See page 4.
[64] E. R. Berlekamp, H. Rumsey, and G. Solomon. On the solution of algebraic equations over finite
fields. Information and Control 10, pp. 553–564, 1967. See page 495.
[65] E. R. Berlekamp and N. J. A. Sloane. Restrictions on the weight distributions of the Reed–Muller
codes. Information and Control 14, pp. 442–446, 1969. See page 157.
[66] E. R. Berlekamp and L. R. Welch. Weight distributions of the cosets of the (32,6) Reed–Muller code.
IEEE Transactions on Information Theory, 18 (1), pp. 203–207, 1972. See pages 143 and 156.
[67] S. D. Berman. On the theory of group codes. Kibernetica 1 (1), pp. 31–39, 1967. See page 152.
[68] A. Bernasconi and B. Codenotti. Spectral analysis of Boolean functions as a graph eigenvalue
problem. IEEE Transactions on Computers 48 (3), pp. 345–351, 1999. See pages 70 and 193.
[69] A. Bernasconi, B. Codenotti, and J. M. Vanderkam. A characterization of bent functions in terms of
strongly regular graphs. IEEE Transactions on Computers 50 (9), pp. 984–985, 2001. See page 70.
[70] D. J. Bernstein. Post-quantum cryptography. Encyclopedia of Cryptography and Security,
pp. 949–950, 2011. See page 1.
[71] T. Beth and C. Ding. On almost perfect nonlinear permutations. Proceedings of EUROCRYPT 93,
Lecture Notes in Computer Science 765, pp. 65–76, 1994. See pages 137, 373, 374, and 400.
[72] C. Bey and G. M. Kyureghyan. On Boolean functions with the sum of every two of them being
bent. Designs, Codes and Cryptography 49, pp. 341–346, 2008. See page 255.
[73] T. Beyne and B. Bilgin. Uniform first-order threshold implementations. Proceedings of SAC 2016,
Lecture Notes in Computer Science 10532, pp. 79–98, 2016. See page 441.
[74] S. Bhasin, C. Carlet, and S. Guilley. Theory of masking with codewords in hardware: low-weight
dth-order correlation-immune Boolean functions. IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2013/303, 2013. See pages 144, 304, 306, and 433.
[75] S. Bhasin, J.-L. Danger, S. Guilley, Z. Najm, and X. T. Ngo. Linear complementary dual code
improvement to strengthen encoded circuit against hardware Trojan horses. IEEE International
Symposium on Hardware Oriented Security and Trust (HOST), May 5–7, 2015. See page 446.
[76] A. Bhattacharyya, S. Kopparty, G. Shoenebeck, M. Sudan, and D. Zuckerman. Optimal testing of
Reed–Muller codes. Proceedings of Electronic Colloquium on Computational Complexity, report
no. 86, 2009.
[77] J. Bierbrauer. New semifields, PN and APN functions. Designs, Codes and Cryptography 54,
pp. 189–200, 2010. See page 394.
[78] J. Bierbrauer, K. Gopalakrishnan, and D. R. Stinson. Bounds for resilient functions and orthogonal
arrays. Proceedings of CRYPTO 1994, Lecture Notes in Computer Science 839, pp. 247–256, 1994.
See pages 313 and 357.
[79] J. Bierbrauer, K. Gopalakrishnan, and D. R. Stinson. Orthogonal arrays, resilient functions, error-
correcting codes, and linear programming bounds. SIAM Journal on Discrete Mathematics 9 (3),
pp. 424–452, 1996. See page 129.
[80] J. Bierbrauer and G. Kyureghyan. Crooked binomials. Designs Codes Cryptography 46(3),
pp. 269–301, 2008. See pages 278, 373, and 393.
[81] A. Biere, M. Heule, H. van Maaren, and T. Walsh, eds. Handbook of Satisfiability. IOS Press, 2009.
See page 19.
[82] E. Biham and A. Shamir. Differential cryptanalysis of DES-like cryptosystems. Journal of
Cryptology 4 (1), pp. 3–72, 1991. See pages 134 and 447.
[83] E. Biham and A. Shamir. Differential fault analysis of secret key cryptosystems. Proceedings of
CRYPTO 1997, Lecture Notes in Computer Science 1294, 1997, pp. 513–525. See page 427.
[84] B. Bilgin, A. Bogdanov, M. Knezevic, F. Mendel, and Q. Wang. Fides: lightweight authenticated
cipher with side-channel resistance for constrained hardware. Proceedings of International Work-
shop Cryptographic Hardware and Embedded Systems CHES 2013, Lecture Notes in Computer
Science 8086, pp. 142–158, 2013. See page 411.

[85] B. Bilgin, B. Gierlichs, S. Nikova, V. Nikov, and V. Rijmen. Higher-order threshold implementa-
tions. Proceedings of ASIACRYPT 2014, Lecture Notes in Computer Science 8874, pp. 326–343,
2014. See page 437.
[86] B. Bilgin, S. Nikova, V. Nikov, V. Rijmen, and G. Stütz. Threshold implementations of all 3 × 3 and
4 × 4 S-boxes. Proceedings of International Workshop Cryptographic Hardware and Embedded
Systems CHES 2012, Lecture Notes in Computer Science 7428, pp. 76–91, 2012. See pages 440,
441, 442, 443, and 502.
[87] B. Bilgin, S. Nikova, V. Nikov, V. Rijmen, N. N. Tokareva, and V. Vitkup. Threshold implementa-
tions of small S-boxes. Cryptography and Communications 7(1), pp. 3–33, 2015 (extended version
of [86]). See pages 440, 441, and 442.
[88] A. Biryukov and C. De Cannière. Data Encryption Standard (DES). Encyclopedia of cryptography
and security; Editors: H. C. A. van Tilborg, S. Jajodia, pp. 295–301, 2011. See page 25.
[89] A. Biryukov and D. Wagner. Slide attacks. Proceedings of Fast Software Encryption FSE 1999,
Lecture Notes in Computer Science 1636, pp. 245–259, 1999. See page 142.
[90] G. Blakely. Safeguarding cryptographic keys. National Comp. Conf. 48, pp. 313–317, New York,
June 1979. AFIPS Press. See page 145.
[91] J. Blömer and J.-P. Seifert. Fault based cryptanalysis of the Advanced Encryption Standard (AES).
Proceedings of Financial Cryptography, Lecture Notes in Computer Science 2742, pp. 162–181,
2003. See page 427.
[92] C. Blondeau, A. Canteaut, and P. Charpin. Differential properties of power functions. International
Journal of Information and Coding Theory 1 (2), pp. 149–170, 2010. See also the Proceedings of
the 2010 IEEE International Symposium on Information Theory (ISIT), pp. 2478–2482, 2010. See
page 418.
[93] C. Blondeau, A. Canteaut, and P. Charpin. Differential properties of $x \mapsto x^{2^t-1}$. IEEE Transactions
on Information Theory 57 (12), pp. 8127–8137, 2011. See pages 401 and 422.
[94] C. Blondeau and K. Nyberg. Perfect nonlinear functions and cryptography. Finite Fields and Their
Applications 32, pp. 120–147, 2015. See pages 369 and 373.
[95] C. Blondeau and L. Perrin. More differentially 6-uniform power functions. Designs, Codes and
Cryptography 73(2), pp. 487–505, 2014. See page 422.
[96] A. W. Bluher. On $x^{q+1} + ax + b$. Finite Fields and Their Applications 10(3), pp. 285–305, 2004.
See page 495.
[97] A. W. Bluher. On existence of Budaghyan–Carlet APN hexanomials. Finite Fields and Their
Applications 24, pp. 118–123, 2013. See pages 408 and 409.
[98] L. Blum, M. Blum, and M. Shub. A simple unpredictable pseudo-random number generator. SIAM
Journal on Computing 15 (2), pp. 364–383, 1986. See page 4.
[99] A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the
statistical query model. Journal ACM 50(4), pp. 506–519, 2003. See page 466.
[100] A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M. J. Robshaw, Y. Seurin, and
C. Vikkelsoe. PRESENT: An ultra-lightweight block cipher. Proceedings of 9th International Work-
shop Cryptographic Hardware and Embedded Systems CHES 2007, Lecture Notes in Computer
Science 4727, pp. 450–466, 2007. See page 26.
[101] Y. Borissov, A. Braeken, S. Nikova, and B. Preneel. On the covering radii of binary Reed–Muller
codes in the set of resilient Boolean functions. IEEE Transactions on Information Theory 51 (3),
pp. 1182–1189, 2005. See page 287.
[102] Y. Borissov, A. Braeken, S. Nikova, and B. Preneel. Classification of the cosets of RM(1,
7) in RM(3, 7) revisited. NATO Science for Peace and Security Series – D: Information and
Communication Security, Vol 18: Boolean Functions in Cryptology and Information Security, IOS
Press, pp. 58–72, 2008. See page 144.
[103] Y. Borissov, N. Manev, and S. Nikova. On the non-minimal codewords of weight 2dmin in the binary
Reed–Muller code. Proceedings of the Workshop on Coding and Cryptography 2001, Electronic
Notes in Discrete Mathematics, Elsevier, vol. 6, pp. 103–110, 2001. Revised version in Discrete
Applied Mathematics 128 (Special Issue “Int. Workshop on Coding and Cryptography (2001)”),
pp. 65–74, 2003. See page 148.
[104] Y. Borissov, N. Manev, and S. Nikova. On the non-minimal codewords in binary Reed–Muller
codes. Discrete Applied Mathematics 128, pp. 65–74, 2003. See page 148.
[105] E. Boss, V. Grosso, T. Güneysu, G. Leander, A. Moradi, and T. Schneider. Strong
8-bit Sboxes with efficient masking in hardware extended version. Journal of Cryptographic
Engineering JCEN 7 (2), pp. 149–165, 2017 and IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2016/647. See page 442.
[106] C. Boura and A. Canteaut. On the influence of the algebraic degree of F −1 on the algebraic degree
of G ◦ F . IEEE Transactions on Information Theory 59 (1), pp. 691–702, 2013. See pages 40, 41,
114, and 115.
[107] C. Boura and A. Canteaut. On the boomerang uniformity of cryptographic Sboxes. IACR Transac-
tions on Symmetric Cryptology 2018 (3), pp. 290–310, 2018. See pages 141 and 142.
[108] C. Boura, A. Canteaut, and C. Cannière. Higher-order differential properties of Keccak and Luffa.
Proceedings of Fast Software Encryption FSE 2011, Lecture Notes in Computer Science 6733,
pp. 252–269, 2011. See page 115.
[109] C. Boura, A. Canteaut, J. Jean, and V. Suder. Two notions of Differential equivalence on Sboxes.
Designs, Codes and Cryptography 87 (2–3), pp. 185–202, 2019 and IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2018/617. See page 376.
[110] J. Bourgain. On the construction of affine extractors. Geometric & Functional Analysis GAFA 17
(1), pp. 33–57, 2007. See page 105.
[111] J. Boyar and M. G. Find. Constructive relationships between algebraic thickness and normality.
Proceedings of International Symposium on Fundamentals of Computation Theory, pp. 106–117,
2015. See page 109.
[112] P. Boyvalenkov, T. Marinova, and M. Stoyanovac. Nonexistence of a few binary orthogonal arrays.
Discrete Applied Mathematics 217, Part 2, pp. 144–150, 2017. See page 305.
[113] D. Bozilov, B. Bilgin, and H. Sahin. A note on 5-bit quadratic permutations classification. IACR
Transactions on Symmetric Cryptology 2017 1, pp. 398–404, 2017. See page 442.
[114] C. Bracken, E. Byrne, N. Markin, and G. McGuire. On the Walsh spectrum of a new APN function.
Proceedings of IMA Conference on Cryptography and Coding 2007, Lecture Notes in Computer
Science 4887, pp. 92–98, 2007. See page 406.
[115] C. Bracken, E. Byrne, N. Markin, and G. McGuire. Determining the nonlinearity of a new family of
APN functions. Proceedings of AAECC-17 Conference, Lecture Notes in Computer Science 4851,
pp. 72–79, 2007. See page 422.
[116] C. Bracken, E. Byrne, N. Markin, and G. McGuire. New families of quadratic almost perfect
nonlinear trinomials and multinomials. Finite Fields and Their Applications 14, pp. 703–714, 2008.
See pages 231, 398, 406, and 407.
[117] C. Bracken, E. Byrne, N. Markin, and G. McGuire. On the Fourier spectrum of binomial APN
functions. SIAM Journal on Discrete Mathematics 23 (2), pp. 596–608, 2009. See pages 398
and 412.
[118] C. Bracken, E. Byrne, N. Markin, and G. McGuire. A few more quadratic APN functions.
Cryptography and Communications 3 (1), pp. 43–53, 2011. See pages 398, 405, and 406.
[119] C. Bracken, E. Byrne, G. McGuire, and G. Nebe. On the equivalence of quadratic APN functions.
Designs, Codes and Cryptography 61 (3), pp. 261–272, 2011. See page 30.
[120] C. Bracken and G. Leander. A highly nonlinear differentially 4 uniform power mapping that
permutes fields of even degree. Finite Fields and Their Applications 16(4), pp. 231–242, 2010.
See pages 418 and 422.
[121] C. Bracken, C. Tan, and Y. Tan. Binomial differentially 4 uniform permutations with high
nonlinearity. Finite Fields and Their Applications 18(3), pp. 537–546, 2012. See page 418.
[122] C. Bracken, C. H. Tan, and Y. Tan. On a class of quadratic polynomials with no zeros and its
application to APN functions. Finite Fields and Their Applications 25, pp. 26–36, 2014. See
pages 408 and 409.

[123] C. Bracken and Z. Zha. On the Fourier spectra of the infinite families of quadratic APN functions.
Advances in Mathematics of Communications 3 (3), pp. 219–226, 2009. See pages 406 and 412.
[124] A. Braeken, Y. Borisov, S. Nikova, and B. Preneel. Classification of Boolean functions of 6 variables
or less with respect to cryptographic properties. Proceedings of ICALP 2005, Lecture Notes in
Computer Science 3580, pp. 324–334, 2005. See pages 143 and 208.
[125] A. Braeken, Y. Borisov, S. Nikova, and B. Preneel. Classification of cubic (n − 4)-resilient boolean
functions. IEEE Transactions on Information Theory 52 (4), pp. 1670–1676, 2006. See page 312.
[126] A. Braeken, V. Nikov, S. Nikova, and B. Preneel. On Boolean functions with generalized
cryptographic properties. Proceedings of INDOCRYPT 2004, Lecture Notes in Computer Science
3348, pp. 120–135, 2004. See page 290.
[127] A. Braeken and B. Preneel. On the algebraic immunity of symmetric Boolean functions. Proceedings
of INDOCRYPT 2005, Lecture Notes in Computer Science 3797, pp. 35–48, 2005. Some false
results of this reference are corrected in Braeken’s PhD thesis “Cryptographic properties of Boolean
functions and S-boxes.” See pages 335, 336, and 358.
[128] A. Braeken and B. Preneel. Probabilistic algebraic attacks. Proceedings of IMA Conference on
Cryptography and Coding 2005, Lecture Notes in Computer Science 3796, pp. 290–303, 2005.
See page 324.
[129] N. Brandstätter, T. Lange, and A. Winterhof. On the non-linearity and sparsity of Boolean functions
related to the discrete logarithm in finite fields of characteristic two. Proceedings of International
Workshop on Coding and Cryptography WCC 2005, Lecture Notes in Computer Science 3969,
pp. 135–143, 2006. See page 338.
[130] E. Brier and P. Langevin. Classification of cubic Boolean functions of 9 variables. Proceedings of
the IEEE Information Theory Workshop ITW 2003, Paris, France, 2003. See page 155.
[131] J. Bringer, C. Carlet, H. Chabanne, S. Guilley, and H. Maghrebi. Orthogonal direct sum masking –
a smartcard friendly computation paradigm in a code, with builtin protection against side-channel
and fault attacks. Proceedings of WISTP, Springer, Heraklion, pp. 40–56, 2014. See pages 444, 445,
and 446.
[132] J. Bringer, H. Chabanne, and T. Ha Le. Protecting AES against side-channel analysis using wire-tap
codes. Journal of Cryptographic Engineering JCEN 2 (2), pp. 129–141, 2012. See page 431.
[133] J. Bringer, V. Gillot, and P. Langevin. Exponential sums and Boolean functions. Proceedings of the
Conference BFCA 2005, Publications des universités de Rouen et du Havre, pp. 177–185, 2005.
See page 81.
[134] M. Brinkmann and G. Leander. On the classification of APN functions up to dimension five.
Designs, Codes and Cryptography 49, Issue 1–3, pp. 273–288, 2008. Revised and extended version
of a paper with the same title in the Proceedings of the Workshop on Coding and Cryptography
WCC 2007, pp. 39–48, 2007. See pages 138, 144, 399, 402, and 408.
[135] K. Browning, J. F. Dillon, R. E. Kibler, and M. McQuistan. APN polynomials and related codes.
Special Volume of Journal of Combinatorics, Information and System Sciences, Honoring the 75-th
Birthday of Prof. D.K. Ray-Chaudhuri 34, Issue 1–4, pp. 135–159, 2009. See pages 144, 379, 380,
392, 399, 403, 404, and 411.
[136] K. Browning, J. F. Dillon, M. McQuistan, and A. J. Wolfe. An APN permutation in dimension 6.
Proceedings of Conference Finite Fields and Applications Fq9, Contemporary Mathematics 518,
pp. 33–42, 2009. See pages 10, 399, 411, 417, and 478.
[137] R. A. Brualdi, N. Cai, and V. S. Pless. Orphans of the first order Reed–Muller codes. IEEE
Transactions on Information Theory 36, pp. 399–401, 1990. See page 262.
[138] J. O. Brüer. On pseudorandom sequences as crypto generators. Proceedings of International Zurich
Seminar on Digital Communications, pp. 157–161, 1984. See page 356.
[139] L. Budaghyan. The equivalence of almost bent and almost perfect nonlinear functions and their
generalizations. PhD thesis. Otto-von-Guericke-University, 2005.
[140] L. Budaghyan. The simplest method for constructing APN polynomials EA-inequivalent to power
functions. Proceedings of the International Workshop on the Arithmetic of Finite Fields, WAIFI
2007, Lecture Notes in Computer Science 4547, pp. 177–188, 2007. See page 397.
[141] L. Budaghyan. Construction and Analysis of Cryptographic Functions. Springer, 2014. See
page 369.
[142] L. Budaghyan, M. Calderini, C. Carlet, R. S. Coulter and I. Villa. Constructing APN Functions
through isotopic shifts. To appear in IEEE Transactions on Information Theory. See also “On
isotopic construction of APN functions”, IACR Cryptology ePrint Archive (http://eprint.iacr.org/)
2018/769, 2018. Presented at SETA 2018. See pages 398 and 407.
[143] L. Budaghyan, M. Calderini, C. Carlet, R. S. Coulter, and I. Villa. Generalized isotopic shift of
Gold functions. Proceedings of International Workshop on Coding and Cryptography WCC 2019,
2019. See page 399.
[144] L. Budaghyan, M. Calderini, C. Carlet, and N. Kaleyski. On a relationship between Gold and
Kasami functions and its generalization for other power APN functions. Boolean Functions and
Application, Florence, Italy, June 16–21, 2019. See page 395.
[145] L. Budaghyan, M. Calderini, and I. Villa. On relations between CCZ- and EA-equivalences.
Cryptography and Communications 12 (1), pp. 85–100, 2020. See also IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2018/796, 2018. See pages 397 and 478.
[146] L. Budaghyan, M. Calderini and I. Villa. On equivalence between known families of quadratic
APN functions. To appear in Finite Fields and Their Applications. See also “On equivalence
between some families of APN functions”, Proceedings of International Workshop on Coding and
Cryptography WCC 2019. See pages 406, 408, and 409.
[147] L. Budaghyan and C. Carlet. Classes of quadratic APN trinomials and hexanomials and related
structures. IEEE Transactions on Information Theory 54 (5), pp. 2354–2357, 2008. See pages 393,
407, and 408.
[148] L. Budaghyan and C. Carlet. On CCZ-equivalence and its use in secondary constructions of bent
functions. Proceedings of International Workshop on Coding and Cryptography WCC 2009 and
IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2009/42, 2009. See pages 30 and 269.
[149] L. Budaghyan and C. Carlet. CCZ-equivalence of single and multi output Boolean functions. AMS
Contemporary Math. 518, Post-Proceedings of the Conference Fq9, pp. 43–54, 2010. See pages 29,
30, and 192.
[150] L. Budaghyan and C. Carlet. CCZ-equivalence of bent vectorial functions and related constructions.
Designs, Codes and Cryptography 59(1–3), pp. 69–87, 2011. See pages 30, 192, 231, 269, and 272.
[151] L. Budaghyan, C. Carlet, P. Felke, and G. Leander. An infinite class of quadratic APN functions
which are not equivalent to power functions. Proceedings of IEEE International Symposium on
Information Theory (ISIT) 2006, 2006. See pages 373, 393, 397, 398, and 506.
[152] L. Budaghyan, C. Carlet, and T. Helleseth. On bent functions associated to AB functions.
Proceedings of the IEEE Information Theory Workshop ITW 2011, 2011. See pages 228 and 376.
[153] L. Budaghyan, C. Carlet, T. Helleseth, and N. Kaleyski. Changing values in APN functions. To
appear in IEEE Transactions on Information Theory. See also Changing points in APN functions.
IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2018/1217. Preprint, 2019. See pages 373
and 403.
[154] L. Budaghyan, C. Carlet, T. Helleseth, and A. Kholosha. On o-equivalence of Niho bent functions.
Proceedings of International Workshop on the Arithmetic of Finite Fields, pp. 155–168, 2014. See
pages 219 and 220.
[155] L. Budaghyan, C. Carlet, T. Helleseth, A. Kholosha, and S. Mesnager. Further results on Niho
bent functions. IEEE Transactions on Information Theory 58, No. 11, pp. 6979–6985, 2012. See
page 222.
[156] L. Budaghyan, C. Carlet, T. Helleseth, N. Li, and B. Sun. On upper bounds for algebraic degrees
of APN functions. IEEE Transactions on Information Theory 64 (6), pp. 4399–4411, 2018. See
page 373.
[157] L. Budaghyan, C. Carlet, and G. Leander. Another class of quadratic APN binomials over $\mathbb{F}_{2^n}$:
the case n divisible by 4. Proceedings of the Workshop on Coding and Cryptography, WCC 2007,
pp. 49–58, 2007. See pages 393, 405, and 506.
[158] L. Budaghyan, C. Carlet, and G. Leander. Two classes of quadratic APN binomials inequivalent
to power functions. IEEE Transactions on Information Theory 54 (9), pp. 4218–4229, 2008. This
paper is a completed and merged version of [151] and [157]. See pages 373, 393, 396, 397, 398, 403,
405, 407, and 496.
[159] L. Budaghyan, C. Carlet, and G. Leander. On inequivalence between known power APN functions.
Proceedings of the conference BFCA 2008, 2008. See pages 396, 402, and 403.
[160] L. Budaghyan, C. Carlet, and G. Leander. Constructing new APN functions from known ones.
Finite Fields and Their Applications 15, pp. 150–159, 2009. See pages 393, 398, 403, 406, and 407.
[161] L. Budaghyan, C. Carlet, and G. Leander. On a construction of quadratic APN functions.
Proceedings of the IEEE Information Theory Workshop ITW 2009, pp. 374–378, 2009. See
pages 398, 406, and 407.
[162] L. Budaghyan, C. Carlet, and A. Pott. New classes of almost bent and almost perfect nonlinear
polynomials. Proceedings of the Workshop on Coding and Cryptography 2005, pp. 306–315, 2005.
See pages 29, 36, 138, 396, 397, 404, 405, and 506.
[163] L. Budaghyan, C. Carlet, and A. Pott. New classes of almost bent and almost perfect nonlinear
functions. IEEE Transactions on Information Theory 52 (3), pp. 1141–1152, 2006. This is a
completed version of [162]. See pages 28, 29, 36, 138, 272, 396, 397, 404, and 405.
[164] L. Budaghyan and T. Helleseth. New commutative semifields defined by new PN multinomials.
Cryptography and Communications 3 (1), pp. 1–16, 2011. See page 30.
[165] L. Budaghyan, T. Helleseth and N. Kaleyski. A new family of APN quadrinomials. IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2019/994. See pages 407 and 410.
[166] L. Budaghyan, T. Helleseth, N. Li, and B. Sun. Some results on the known classes of quadratic
APN functions. Proceedings of C2SI 2017, Lecture Notes in Computer Science 10194, pp. 3–16,
2017. See pages 406 and 408.
[167] L. Budaghyan, N. Kaleyski, S. Kwon, C. Riera, and P. Stănică. Partially APN Boolean functions
and classes of functions that are not APN infinitely often. To appear in the Special Issue on Boolean
Functions and Their Applications 2018, Cryptography and Communications. See page 373.
[168] L. Budaghyan, A. Kholosha, C. Carlet, and T. Helleseth. Univariate Niho bent functions from
o-polynomials. IEEE Transactions on Information Theory 62 (4), pp. 2254–2265, 2016. Extended
version of “Niho Bent functions from quadratic o-monomials.” Proceedings of IEEE International
Symposium on Information Theory (ISIT) 2014, pp. 1827–1831, 2014. See pages 220 and 222.
[169] L. Budaghyan and A. Pott. On differential uniformity and nonlinearity of functions. Discrete
Mathematics, Special Issue “Combinatorics 2006” 309 (2), pp. 371–384, 2009.
[170] L. Burnett, G. Carter, E. Dawson, and W. Millan. Efficient methods for generating MARS-like
S-boxes. Proceedings of Fast Software Encryption FSE 2000, Lecture Notes in Computer Science
1978, pp. 300–314, 2000. See page 145.
[171] W. Burnside. Theory of Groups of Finite Order. Cambridge University Press, 1897. See page 144.
[172] E. Byrne and G. McGuire. On the non-existence of crooked functions on finite fields. Proceedings
of the Workshop on Coding and Cryptography, WCC 2005, pp. 316–324, 2005. See page 373.
[173] J. Cai, F. Green, and T. Thierauf. On the correlation of symmetric functions. Math. Systems Theory
29, pp. 245–258, 1996. See page 354.
[174] A. R. Calderbank, G. McGuire, B. Poonen, and M. Rubinstein. On a conjecture of Helleseth
regarding pairs of binary m-sequences. IEEE Transactions on Information Theory 42, pp. 988–990,
1996. See pages 73, 275, and 384.
[175] A. R. Calderbank and W. M. Kantor. The geometry of two-weight codes. Bull. Lond. Math. Soc. 18
(2), pp. 97–122, 1986.
[176] M. Calderini, M. Sala, and I. Villa. A note on APN permutations in even dimension. Finite Fields
and Their Applications 46, pp. 1–16, 2017. See page 411.
[177] P. J. Cameron and J. H. van Lint. Designs, Graphs, Codes and Their Links. Cambridge University
Press, 1991. See page 254.
[178] P. Camion and A. Canteaut. Construction of t-resilient functions over a finite alphabet. Proceedings
of EUROCRYPT 1996, Lecture Notes in Computer Science 1070, pp. 283–293, 1996. See page 284.
[179] P. Camion and A. Canteaut. Generalization of Siegenthaler inequality and Schnorr–Vaudenay
multipermutations. Proceedings of CRYPTO 1996, Lecture Notes in Computer Science 1109,
pp. 372–386, 1996. See page 129.
[180] P. Camion and A. Canteaut. Correlation-immune and resilient functions over finite alphabets and
their applications in cryptography. Designs, Codes and Cryptography 16, 1999. See pages 314
and 317.
[181] P. Camion, C. Carlet, P. Charpin, and N. Sendrier. On correlation-immune functions. Proceedings
of CRYPTO 1991, Lecture Notes in Computer Science 576, pp. 86–100, 1991. See pages 86, 165,
210, 284, 293, 298, 312, and 314.
[182] E. R. Canfield, Z. Gao, C. Greenhill, B. D. McKay, and R. W. Robinson. Asymptotic enumeration
of correlation-immune boolean functions. Cryptography and Communications 2 (1), pp. 111–126,
2010. See page 312.
[183] C. De Cannière. Analysis and design of symmetric encryption algorithms. PhD thesis, KU Leuven,
2007. See pages 144 and 417.
[184] A. Canteaut. Differential cryptanalysis of Feistel ciphers and differentially uniform mappings.
Proceedings of Selected Areas on Cryptography, SAC 1997, pp. 172–184, 1997. See page 411.
[185] A. Canteaut. On the weight distributions of optimal cosets of the first-order Reed–Muller code.
IEEE Transactions on Information Theory, 47(1), pp. 407–413, 2001. See page 157.
[186] A. Canteaut. Cryptographic functions and design criteria for block ciphers. Proceedings of
INDOCRYPT 2001, Lecture Notes in Computer Science 2247, pp. 1–16, 2001. See page 382.
[187] A. Canteaut. On the correlations between a combining function and functions of fewer variables.
Proceedings of the IEEE Information Theory Workshop ITW 2002, pp. 78–81, 2002. See pages 87,
102, 290, and 319.
[188] A. Canteaut. Open problems related to algebraic attacks on stream ciphers. Proceedings of
Workshop on Coding and Cryptography WCC 2005, pp. 1–10, 2005. See also a revised version
in Lecture Notes in Computer Science 3969, pp. 120–134, 2006. See pages 96 and 328.
[189] A. Canteaut. Analysis and design of symmetric ciphers. Habilitation for Directing Theses,
University of Paris 6, 2006. See pages 89, 194, 344, and 386.
[190] A. Canteaut, C. Carlet, P. Charpin, and C. Fontaine. Propagation characteristics and correlation-
immunity of highly nonlinear Boolean functions. Proceedings of EUROCRYPT 2000, Lecture Notes
in Computer Science 1807, pp. 507–522, 2000. See pages 58, 100, and 192.
[191] A. Canteaut, C. Carlet, P. Charpin, and C. Fontaine. On cryptographic properties of the cosets
of R(1, m). IEEE Transactions on Information Theory 47 (4), pp. 1494–1513, 2001. See pages 58,
62, 98, 106, 153, 192, 232, 240, 241, 259, 262, and 263.
[192] A. Canteaut, S. Carpov, C. Fontaine, T. Lepoint, M. Naya-Plasencia, P. Paillier, and R. Sirdey.
Stream ciphers: a practical solution for efficient homomorphic-ciphertext compression. Proceedings
of Fast Software Encryption FSE 2016, Lecture Notes in Computer Science 9783, pp. 313–333,
2016. See page 454.
[193] A. Canteaut and P. Charpin. Decomposing bent functions. IEEE Transactions on Information
Theory 49, pp. 2004–2019, 2003. See pages 100, 198, 203, 208, and 241.
[194] A. Canteaut, P. Charpin, and H. Dobbertin. A new characterization of almost bent functions.
Proceedings of Fast Software Encryption 99, Lecture Notes in Computer Science 1636,
pp. 186–200, 1999. See pages 73, 276, and 382.
[195] A. Canteaut, P. Charpin, and H. Dobbertin. Binary m-sequences with three-valued crosscorrelation:
a proof of Welch’s conjecture. IEEE Transactions on Information Theory 46 (1), pp. 4–8, 2000.
See pages 382, 388, and 395.
[196] A. Canteaut, P. Charpin, and H. Dobbertin. Weight divisibility of cyclic codes, highly nonlinear
functions on GF($2^m$) and crosscorrelation of maximum-length sequences. SIAM Journal on
Discrete Mathematics, 13(1), pp. 105–138, 2000. See pages 73, 276, 382, 395, 401, 402, and 412.
[197] A. Canteaut, P. Charpin, and G. Kyureghyan. A new class of monomial bent functions. Finite Fields
and Their Applications 14(1), pp. 221–241, 2008. See page 230.
[198] A. Canteaut, M. Daum, H. Dobbertin, and G. Leander. Finding nonnormal bent functions. Discrete
Applied Mathematics 154, pp. 202–218, 2006. See also “Normal and Non-Normal Bent Functions.”
Proceedings of the Workshop on Coding and Cryptography 2003, pp. 91–100, 2003. See pages 203,
211, 241, and 252.
[199] A. Canteaut, S. Duval, and L. Perrin. A generalisation of Dillon’s APN permutation with the best
known differential and nonlinear properties for all fields of size $2^{4k+2}$. IEEE Transactions on
Information Theory 63 (11), pp. 7575–7591, 2017. See pages 411, 421, and 478.
[200] A. Canteaut and M. Naya-Plasencia. Structural weaknesses of permutations with a low differential
uniformity and generalized crooked functions. Contemporary Mathematics 518, pp. 55–71, 2010.
See page 279.
[201] A. Canteaut and L. Perrin. On CCZ-equivalence, extended-affine equivalence, and function
twisting. Finite Fields and Their Applications 56, pp. 209–246, 2019. Preliminary version available
in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2018/713. See pages 72 and 136.
[202] A. Canteaut and Y. Rotella. Attacks against filter generators exploiting monomial mappings.
Proceedings of Fast Software Encryption FSE 2016, Lecture Notes in Computer Science 9783,
pp. 78–98, 2016. See page 243.
[203] A. Canteaut and M. Trabbia. Improved fast correlation attacks using parity-check equations of
weight 4 and 5. Proceedings of EUROCRYPT 2000, Lecture Notes in Computer Science 1807,
pp. 573–588, 2000. See pages 78, 87, 194, and 290.
[204] A. Canteaut and M. Videau. Degree of composition of highly nonlinear functions and applications
to higher order differential cryptanalysis. Proceedings of EUROCRYPT 2002, Lecture Notes in
Computer Science 2332, pp. 518–533, 2002. See pages 57, 64, 114, 115, 295, and 371.
[205] A. Canteaut and M. Videau. Symmetric Boolean functions. IEEE Transactions on Information
Theory 51 (8), pp. 2791–2811, 2005. See pages 354, 355, 356, and 357.
[206] X. Cao, H. Chen, and S. Mesnager. Further results on semi-bent functions in polynomial form.
Advances in Mathematics of Communications 10 (4), pp. 725–741, 2016. See page 263.
[207] J. R. du Carlet. La Cryptographie, contenant une très subtile manière d’escrire secrètement,
composée par Maistre Jean Robert Du Carlet, 1644. A manuscript exists at the Bibliothèque
Nationale (Très Grande Bibliothèque), Paris, France. See page 1.
[208] C. Carlet. A simple description of Kerdock codes. Proceedings of Coding Theory and Applications
1988, 3rd International Colloquium, Lecture Notes in Computer Science 388, pp. 202–208, 1989.
See page 255.
[209] C. Carlet. Codes de Reed–Muller, codes de Kerdock et de Preparata. PhD thesis. Publication of
LITP, Institut Blaise Pascal, Université Paris 6, 90.59, 1990. See pages 46, 171, 197, and 435.
[210] C. Carlet. A transformation on Boolean functions, its consequences on some problems related to
Reed–Muller codes. Proceedings of EUROCODE 1990, Lecture Notes in Computer Science 514,
pp. 42–50, 1991. See pages 157, 180, 203, and 204.
[211] C. Carlet. Partially-bent functions. Designs, Codes and Cryptography 3, pp. 135–145, 1993, and
Proceedings of CRYPTO 1992, Lecture Notes in Computer Science 740, pp. 280–291, 1993. See
pages 62, 190, 256, and 257.
[212] C. Carlet. Two new classes of bent functions. Proceedings of EUROCRYPT 1993, Lecture Notes
in Computer Science 765, pp. 77–101, 1994. See pages 58, 63, 197, 200, 202, 208, 210, 211, 215,
and 252.
[213] C. Carlet. Generalized partial spreads. IEEE Transactions on Information Theory 41 (5),
pp. 1482–1487, 1995. See pages 241 and 242.
[214] C. Carlet. Hyper-bent functions. PRAGOCRYPT 1996, Czech Technical University Publishing
House, pp. 145–155, 1996. See page 244.
[215] C. Carlet. A construction of bent functions. Finite Fields and Applications, London Mathematical
Society, Lecture Series 233, Cambridge University Press, pp. 47–58, 1996. See pages 234 and 235.
[216] C. Carlet. More correlation-immune and resilient functions over Galois fields and Galois rings.
Proceedings of EUROCRYPT 1997, Lecture Notes in Computer Science 1233, pp. 422–433, 1997.
See pages 295 and 303.
[217] C. Carlet. On Kerdock codes, American Mathematical Society. Proceedings of the Conference
Finite Fields and Applications Fq4, Contemporary Mathematics 225, pp. 155–163, 1999. See
page 255.
[218] C. Carlet. On the propagation criterion of degree ℓ and order k. Proceedings of EUROCRYPT 1998,
Lecture Notes in Computer Science 1403, pp. 462–474, 1998. See pages 318, 319, and 320.
[219] C. Carlet. On cryptographic propagation criteria for Boolean functions. Information and Computation
151, Academic Press, pp. 32–56, 1999. See pages 97, 198, 318, 319, and 320.
[220] C. Carlet. On the coset weight divisibility and nonlinearity of resilient and correlation-immune
functions. Proceedings of International Conference on Sequences and Their Applications SETA
2001, Discrete Mathematics and Theoretical Computer Science, pp. 131–144, 2001. See pages 47,
286, 287, and 289.
[221] C. Carlet. A larger class of cryptographic Boolean functions via a study of the Maiorana–McFarland
construction. Proceedings of CRYPTO 2002, Lecture Notes in Computer Science 2442, pp. 549–564,
2002. See pages 179, 293, 294, and 295.
[222] C. Carlet. On cryptographic complexity of Boolean functions. Finite Fields with Applications to
Coding Theory, Cryptography and Related Areas (Proceedings of the Conference Fq6), Springer-
Verlag, Berlin, pp. 53–69, 2002. See pages 103, 104, 105, 107, 108, 328, and 509.
[223] C. Carlet. On the confusion and diffusion properties of Maiorana–McFarland’s and extended
Maiorana–McFarland’s functions. Special Issue “Complexity Issues in Coding and Cryptography”,
Dedicated to Prof. Harald Niederreiter on the Occasion of his 60th Birthday, Journal of Complexity
20, pp. 182–204, 2004. See pages 211, 264, 293, and 296.
[224] C. Carlet. On the degree, nonlinearity, algebraic thickness and non-normality of Boolean functions,
with developments on symmetric functions. Extended version of [222]. IEEE Transactions on
Information Theory 50, pp. 2178–2185, 2004. See pages 80, 103, 104, 105, 107, 328, and 356.
[225] C. Carlet. On the secondary constructions of resilient and bent functions. Proceedings of the
Workshop on Coding, Cryptography and Combinatorics 2003, Published by Birkhäuser Verlag,
pp. 3–28, 2004. See pages 233, 300, 301, and 317.
[226] C. Carlet. Concatenating indicators of flats for designing cryptographic functions. Designs, Codes
and Cryptography 36 (2), pp. 189–202, 2005. See pages 181, 182, and 295.
[227] C. Carlet. On bent and highly nonlinear balanced/resilient functions and their algebraic immunities.
Proceedings of AAECC-16 Conference, Lecture Notes in Computer Science 3857, pp. 1–28, 2006.
Extended version of “Improving the algebraic immunity of resilient and nonlinear functions and
constructing bent function”, IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2004/276,
2004, and of “Designing bent functions and resilient functions from known ones, without extending
their number of variables,” Proceedings of IEEE International Symposium on Information Theory
(ISIT) 2005, pp. 1096–1100, 2005. See pages 231, 236, 237, 238, 302, 303, 332, and 341.
[228] C. Carlet. On the higher order nonlinearities of algebraic immune functions. Proceedings of
CRYPTO 2006, Lecture Notes in Computer Science 4117, pp. 584–601, 2006. See pages 323
and 331.
[229] C. Carlet. The complexity of Boolean functions from cryptographic viewpoint. Dagstuhl
Seminar Proceedings 06111 Complexity of Boolean Functions, 2006 (http://drops.dagstuhl.de/opus/volltexte/)
2006/604. See pages 80, 83, and 103.
[230] C. Carlet. A method of construction of balanced functions with optimum algebraic immunity.
Proceedings of the International Workshop on Coding and Cryptography 2007, World Scientific
Publishing, Series of Coding and Cryptology, pp. 25–43, 2008. Preliminary version available in
IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2006/149, pp. 25–43 2006. See page 336.
[231] C. Carlet. Partial covering sequences: a method for designing classes of cryptographic functions.
Proceedings of “The First Symposium on Algebraic Geometry and Its Applications” Dedicated
to Gilles Lachaud (SAGA’07), Tahiti, 2007, World Scientific, Series on Number Theory and its
Applications 5, pp. 366–387, 2008. See pages 183 and 184.
[232] C. Carlet. Recursive lower bounds on the nonlinearity profile of Boolean functions and their
applications. IEEE Transactions on Information Theory 54 (3), pp. 1262–1272, 2008. See pages 84,
85, and 125.
[233] C. Carlet. On the higher order nonlinearities of Boolean functions and S-boxes, and their
generalizations. Proceedings of International Conference on Sequences and Their Applications
SETA 2008, Lecture Notes in Computer Science 5203, pp. 345–367, 2008. See pages 122, 123,
124, 125, 348, and 349.
[234] C. Carlet. On almost perfect nonlinear functions. Special Section on Signal Design and Its
Application in Communications, IEICE Trans. Fundamentals E91-A (12), pp. 3665–3678, 2008.
See pages 122 and 273.
[235] C. Carlet. On the algebraic immunities and higher order nonlinearities of vectorial Boolean
functions. NATO Science for Peace and Security Series, D: Information and Communication
Security – Vol 23; Enhancing Cryptographic Primitives with Techniques from Error Correcting
Codes, pp. 104–116, 2009. See pages 128, 129, 344, 345, 346, 347, and 348.
[236] C. Carlet. Boolean functions for cryptography and error correcting codes. Chapter of the monograph
Boolean Models and Methods in Mathematics, Computer Science, and Engineering, Y. Crama
and P. Hammer, eds., Cambridge University Press, pp. 257–397, 2010. See pages ix, 198, 207,
and 338.
[237] C. Carlet. Vectorial Boolean functions for cryptography. Chapter of the monograph Boolean Models
and Methods in Mathematics, Computer Science, and Engineering, Y. Crama and P. Hammer,
eds., Cambridge University Press, pp. 398–469, 2010. See pages ix, 138, and 381.
[238] C. Carlet. Comment on “constructions of cryptographically significant Boolean functions using
primitive polynomials.” IEEE Transactions on Information Theory 57 (7), pp. 4852–4853, 2011.
See pages 337 and 339.
[239] C. Carlet. Relating three nonlinearity parameters of vectorial functions and building APN functions
from bent. Designs, Codes and Cryptography 59 (1), pp. 89–109, 2011. See pages 119, 120, 138,
140, 141, 393, 406, 408, 409, and 422.
[240] C. Carlet. More vectorial Boolean functions with unbounded nonlinearity profile. Special Issue on
Cryptography of International Journal of Foundations of Computer Science 22 (6), pp. 1259–1269,
2011. See page 85.
[241] C. Carlet. On known and new differentially uniform functions. Proceedings of the 16th Australasian
Conference on Information Security and Privacy ACISP 2011, Lecture Notes in Computer Science
6812, pp. 1–15, 2011. See pages 418 and 419.
[242] C. Carlet. A survey on nonlinear Boolean functions with optimal algebraic immunity suitable for
stream ciphers. Proceedings of the SMF-VMS conference, Hué, Vietnam, August 20–24, 2012.
Special Issue of the Vietnam Journal of Mathematics 41 (4), pp. 527–541, 2013. See pages 3, 337,
338, and 340.
[243] C. Carlet. More constructions of APN and differentially 4-uniform functions by concatenation.
Science China Mathematics 56 (7), pp. 1373–1384, 2013. See pages 410 and 422.
[244] C. Carlet. Correlation-immune Boolean functions for leakage squeezing and rotating S-box
masking against side channel attacks. Proceedings of SPACE 2013, Lecture Notes in Computer
Science 8204, pp. 70–74, 2013. See page 433.
[245] C. Carlet. Open problems on binary bent functions. Proceedings of the Conference Open Problems
in Mathematical and Computational Sciences, September 18–20, 2013, in Istanbul, Turkey,
Springer, pp. 203–241, 2014. See pages 47, 234, 249, 250, 251, 361, 362, and 475.
[246] C. Carlet. More PS and H-like bent functions. IACR Cryptology ePrint Archive (http://eprint.iacr.org/)
2015/168, 2015. See pages 216, 223, 224, and 225.
[247] C. Carlet. Boolean and vectorial plateaued functions, and APN functions. IEEE Transactions on
Information Theory 61 (11), pp. 6272–6289, 2015. See pages 190, 258, 260, 261, 263, 265, 266, 275,
276, 277, 278, 279, 280, 281, 282, 392, 393, 394, and 402.
[248] C. Carlet. Open questions on nonlinearity and on APN functions. Proceedings of Arithmetic of
Finite Fields 5th International Workshop, WAIFI 2014, 2014, Lecture Notes in Computer Science
9061, pp. 83–107, 2015. See pages 82, 272, 338, 475, and 478.
[249] C. Carlet. On the nonlinearity of monotone Boolean functions. Special Issue SETA 2016 of
Cryptography and Communications 10 (6), pp. 1051–1061, 2018. See pages 364, 366, 367, and 368.
[250] C. Carlet. Characterizations of the differential uniformity of vectorial functions by the Walsh
transform. IEEE Transactions on Information Theory 64 (9), pp. 6443–6453, 2018. Preliminary
version available in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2017/516, 2017. See
pages 386, 387, 412, 413, and 414.
[251] C. Carlet. Componentwise APNness, Walsh uniformity of APN functions and cyclic-additive
difference sets. Finite Fields and Their Applications 53, pp. 226–253, 2018. Preliminary version
available in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2017/528, 2017. See pages 390,
391, 414, 415, and 416.
[252] C. Carlet. On APN exponents, characterizations of differentially uniform functions by the Walsh
transform, and related cyclic-difference-set-like structures. Designs, Codes and Cryptography 87
(2) (Postproceedings of WCC 2017), pp. 203–224, 2018. See pages 278, 389, 412, 414, 415, and 416.
[253] C. Carlet. Handling vectorial functions by means of their graph indicators. To appear in IEEE
Transactions on Information Theory, 2020. See pages 35, 40, and 47.
[254] C. Carlet. Graph indicators of S-boxes and related bounds on the algebraic degree of composite
functions. Preprint, 2020. See pages 35, 39, 40, 41, 47, 64, and 115.
[255] C. Carlet and Y. Alsalami. A new construction of differentially 4-uniform (n, n − 1)-functions.
Advances in Mathematics of Communications 9 (4), pp. 541–565, 2015. See pages 423 and 424.
[256] C. Carlet and P. Charpin. Cubic Boolean functions with highest resiliency. IEEE Transactions on
Information Theory 51 (2), pp. 562–571, 2005. See page 312.
[257] C. Carlet, P. Charpin, and V. Zinoviev. Codes, bent functions and permutations suitable for DES-
like cryptosystems. Designs, Codes and Cryptography, 15 (2), pp. 125–156, 1998. See pages 28,
160, 370, 372, 375, 376, 378, 380, 381, 382, 385, 387, 396, and 398.
[258] C. Carlet and X. Chen. Constructing low-weight dth-order correlation-immune Boolean functions
through the Fourier–Hadamard transform. IEEE Transactions on Information Theory 64 (4)
(Special Issue in honor of Solomon Golomb), pp. 2969–2978, 2018. See pages 291, 304, 305,
306, 307, 308, 309, 310, and 311.
[259] C. Carlet, X. Chen, and L. Qu. Constructing infinite families of low differential uniformity
(n, m)-functions with m > n/2. Designs, Codes and Cryptography 87 (7), pp. 1577–1599, 2019.
Preliminary version available in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2018/1046.
See page 423.
[260] C. Carlet, A. Daif, S. Guilley, and C. Tavernier. Polynomial direct sum masking to protect against
both SCA and FIA. Journal of Cryptographic Engineering JCEN, 9 (3), pp. 303–312, 2019.
Preliminary version available in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2018/531,
2018. See page 447.
[261] C. Carlet, D. Dalai, K. Gupta, and S. Maitra. Algebraic immunity for cryptographically significant
Boolean functions: analysis and construction. IEEE Transactions on Information Theory 52 (7),
pp. 3105–3121, 2006. See pages 323, 330, 331, 336, 342, 343, and 344.
[262] C. Carlet, J.-L. Danger, S. Guilley, and H. Maghrebi. Leakage squeezing of order two. Proceedings
of INDOCRYPT 2012, Lecture Notes in Computer Science 7668, pp. 120–139, 2012. See pages 427
and 432.
[263] C. Carlet, J.-L. Danger, S. Guilley, and H. Maghrebi. Leakage squeezing: optimal implementation
and security evaluation. Journal of Mathematical Cryptology 8 (3), pp. 249–295, 2014. See
page 431.
[264] C. Carlet, J.-L. Danger, S. Guilley, H. Maghrebi, and E. Prouff. Achieving side-channel high-order
correlation immunity with leakage squeezing. Journal of Cryptographic Engineering JCEN 4 (2),
pp. 107–121, 2014. See pages 427 and 432.
[265] C. Carlet, L. E. Danielsen, M. G. Parker, and P. Solé. Self-dual bent functions. Special Issue of
the International Journal of Information and Coding Theory (IJICoT) dedicated to Vera Pless
1 (4), pp. 384–399, 2010. A preliminary version appeared in the proceedings of the BFCA 2008
conference. See pages 198 and 199.
[266] C. Carlet and C. Ding. Highly nonlinear mappings. Special Issue “Complexity Issues in Coding
and Cryptography,” dedicated to Prof. Harald Niederreiter on the Occasion of his 60th Birthday,
Journal of Complexity 20, pp. 205–244, 2004. See pages 122 and 193.
[267] C. Carlet and C. Ding. Nonlinearities of S-boxes. Finite Fields and Their Applications 13 (1),
pp. 121–135, 2007. See pages 114, 122, 138, and 140.
[268] C. Carlet, C. Ding, and H. Niederreiter. Authentication schemes from highly nonlinear functions.
Designs, Codes and Cryptography 40 (1), pp. 71–79, 2006. See page 150.
[269] C. Carlet, C. Ding, and J. Yuan. Linear codes from perfect nonlinear mappings and their secret
sharing schemes. IEEE Transactions on Information Theory 51 (6), pp. 2089–2102, 2005. See
pages 147 and 160.
[270] C. Carlet, H. Dobbertin, and G. Leander. Normal extensions of bent functions. IEEE Transactions
on Information Theory 50 (11), pp. 2880–2885, 2004. See pages 235, 253, and 476.
[271] C. Carlet and S. Dubuc. On generalized bent and q-ary perfect nonlinear functions. Proceedings of
Finite Fields and Applications Fq5, Springer, pp. 81–94, 2000. See page 193.
[272] C. Carlet, J.-C. Faugère, C. Goyet, and G. Renault. Analysis of the algebraic side channel attack.
Journal of Cryptographic Engineering JCEN 2 (1), pp. 45–62, 2012. See page 436.
[273] C. Carlet and K. Feng. An infinite class of balanced functions with optimum algebraic immunity,
good immunity to fast algebraic attacks and good nonlinearity. Proceedings of ASIACRYPT 2008,
Lecture Notes in Computer Science 5350, pp. 425–440, 2008. See pages 321, 336, 337, 338, and 348.
[274] C. Carlet and K. Feng. An infinite class of balanced vectorial Boolean functions with optimum
algebraic immunity and good nonlinearity. Proceedings of IWCC 2009, Lecture Notes in Computer
Science 5557, pp. 1–11, 2009. See page 350.
[275] C. Carlet and S. Feukoua. Three basic questions on Boolean functions. Advances in Mathematics
of Communications 11 (4), pp. 837–855, 2017. See pages 38, 257, and 258.
[276] C. Carlet and S. Feukoua. Three parameters of Boolean functions related to their constancy on
affine spaces. To appear in Advances in Mathematics of Communications. See page 105.
[277] C. Carlet, F. Freibert, S. Guilley, M. Kiermaier, J.-L. Kim, and P. Solé. Higher-order CIS codes.
IEEE Transactions on Information Theory 60 (9), pp. 5283–5295, 2014. See page 432.
[278] C. Carlet and P. Gaborit. Hyper-bent functions and cyclic codes. Journal of Combinatorial Theory,
Series A 113 (3), pp. 466–482, 2006. See pages 189, 216, 243, 244, 245, 246, 253, and 477.
[279] C. Carlet and P. Gaborit. On the construction of balanced Boolean functions with a good algebraic
immunity. Proceedings of IEEE International Symposium on Information Theory (ISIT) 2005.
Longer version in the Proceedings of the Conference BFCA 2005, Publications des universités de
Rouen et du Havre, pp. 1–20, 2005. See pages 324, 335, 342, and 363.
[280] C. Carlet, P. Gaborit, J.-L. Kim, and P. Solé. A new class of codes for Boolean masking of
cryptographic computations. IEEE Transactions on Information Theory 58 (9), pp. 6000–6011,
2012. See page 432.
[281] C. Carlet and G. Gao. A secondary construction and a transformation on rotation symmetric
functions, and their action on bent and semi-bent functions. Journal of Combinatorial Theory,
Series A 127 (1), pp. 161–175, 2014. See pages 249, 251, 252, 361, and 362.
[282] C. Carlet, G. Gao, and W. Liu. Results on constructions of rotation symmetric bent and semi-bent
functions. Proceedings of International Conference on Sequences and Their Applications SETA
2014, Lecture Notes in Computer Science 8865, pp. 21–33, 2014. See pages 251, 252, and 263.
[283] C. Carlet, G. Gong, and Y. Tan. Quadratic zero-difference balanced functions, APN functions
and strongly regular graphs. Designs, Codes and Cryptography 78 (3), pp. 629–654, 2016. See
pages 153, 393, and 394.
[284] C. Carlet, L. Goubin, E. Prouff, M. Quisquater, and M. Rivain. Higher-order masking schemes for
S-boxes. Proceedings of Fast Software Encryption FSE 2012, Lecture Notes in Computer Science
7549, pp. 366–384, 2012. See pages 433 and 434.
[285] C. Carlet and A. Gouget. An upper bound on the number of m-resilient Boolean functions.
Proceedings of ASIACRYPT 2002, Lecture Notes in Computer Science 2501, pp. 484–496, 2002.
See page 312.
[286] C. Carlet and S. Guilley. Side-channel indistinguishability. Proceedings of HASP 2013, 2nd
International Workshop on Hardware and Architectural Support for Security and Privacy, ACM,
pp. 9:1–9:8, 2013. See pages 284 and 433.
[287] C. Carlet and S. Guilley. Correlation-immune Boolean functions for easing counter-measures to side
channel attacks. Proceedings of the Workshop “Emerging Applications of Finite Fields”, Algebraic
Curves and Finite Fields, Radon Series on Computational and Applied Mathematics, de Gruyter,
pp. 41–70, 2014. See pages 304 and 306.
[288] C. Carlet and S. Guilley. Complementary dual codes for counter-measures to side-channel attacks.
Advances in Mathematics of Communications 10 (1), pp. 131–150, 2016. See pages 444 and 446.
[289] C. Carlet and S. Guilley. Statistical properties of side-channel and fault injection attacks using
coding theory. Cryptography and Communications 10 (5), pp. 909–933, 2018. See pages 433
and 446.
[290] C. Carlet and P. Guillot. A characterization of binary bent functions. Journal of Combinatorial
Theory, Series A 76 (2), pp. 328–335, 1996. See page 241.
[291] C. Carlet and P. Guillot. An alternate characterization of the bentness of binary functions, with
uniqueness. Designs, Codes and Cryptography 14, pp. 133–140, 1998. See page 242.
[292] C. Carlet and P. Guillot. A new representation of Boolean functions. Proceedings of AAECC-13
Conference, Lecture Notes in Computer Science 1719, pp. 94–103, 1999. See pages 47, 48, 49, 50,
51, 66, 156, 157, 195, and 201.
[293] C. Carlet and P. Guillot. Bent, resilient functions and the numerical normal form. DIMACS Series
in Discrete Mathematics and Theoretical Computer Science, 56, pp. 87–96, 2001. See pages 47, 48,
51, 195, and 286.
[294] C. Carlet, P. Guillot, and S. Mesnager. On immunity profile of Boolean functions. Proceedings
of International Conference on Sequences and Their Applications SETA 2006, Lecture Notes in
Computer Science 4086, pp. 364–375, 2006. See page 290.
[295] C. Carlet, C. Güneri, S. Mesnager, and F. Özbudak. Construction of codes suitable for both SCA
and FIA. Proceedings of WAIFI 2018, Lecture Notes in Computer Science 11321, pp. 95–107, 2018.
See page 446.
[296] C. Carlet, T. Helleseth, A. Kholosha, and S. Mesnager. On the duals of bent functions with 2^r
Niho exponents. Proceedings of IEEE International Symposium on Information Theory (ISIT) 2011,
pp. 703–707, 2011. See page 222.
[297] C. Carlet, A. Heuser, and S. Picek. Trade-offs for S-boxes: cryptographic properties and side-
channel resilience. Proceedings of ACNS 2017, Lecture Notes in Computer Science 10355,
pp. 393–414, 2017. See pages 145 and 433.
[298] C. Carlet, D. Joyner, P. Stănică, and D. Tang. Cryptographic properties of monotone Boolean
functions. Journal of Mathematical Cryptology 10 (1), pp. 1–14, 2016. See pages 364 and 368.
[299] C. Carlet, K. Khoo, C.-W. Lim, and C.-W. Loe. Generalized correlation analysis of vectorial
Boolean functions. Proceedings of Fast Software Encryption FSE 2007, Lecture Notes in Computer
Science 4593, pp. 382–398, 2007. See pages 131 and 132.
[300] C. Carlet, K. Khoo, C.-W. Lim, and C.-W. Loe. On an improved correlation analysis of stream
ciphers using multi-output Boolean functions and the related generalized notion of nonlinearity.
Advances in Mathematics of Communications 2 (2), pp. 201–221, 2008. See page 133.
[301] C. Carlet and A. Klapper. Upper bounds on the numbers of resilient functions and of bent functions.
This paper was meant to appear in an issue of Lecture Notes in Computer Sciences dedicated to
Philippe Delsarte, Editor Jean-Jacques Quisquater, which never appeared. Shorter version in the
Proceedings of the 23rd Symposium on Information Theory in the Benelux, Louvain-La-Neuve,
Belgium, 2002. See pages 67, 195, 243, and 312.
[302] C. Carlet and A. Klapper. On the arithmetic Walsh coefficients of Boolean functions. Designs,
Codes and Cryptography 73 (2), pp. 299–318, 2014. See page 57.
[303] C. Carlet, J. C. Ku-Cauich, and H. Tapia-Recillas. Bent functions on a Galois ring and systematic
authentication codes. Advances in Mathematics of Communications 6 (2), pp. 249–258, 2012. See
page 150.
[304] C. Carlet, A.B. Levina, and S. V. Taranov. Algebraic manipulation detection codes with perfect
nonlinear functions under non-uniform distribution. Scientific and Technical Journal of Information
Technologies, Mechanics and Optics 17, 6 (112), pp. 1052–1062, 2017. See page 450.
[305] C. Carlet and P. Méaux. Boolean functions for homomorphic-friendly stream ciphers. Proceedings
of the Conference on Algebra, Codes and Cryptology (A2C), Lecture Notes in Computer Science,
pp. 166–182, Springer, Cham, 2019 (this version does not include proofs; a full paper is to
come later). See pages 341, 342, 360, 363, and 453.
[306] C. Carlet, P. Méaux, and Y. Rotella. Boolean functions with restricted input and their robustness;
application to the FLIP cipher. IACR Transactions on Symmetric Cryptology 2017 (3), pp. 192–227,
2017. See pages 321, 363, 453, 456, 457, 458, 459, 460, 461, 463, 464, 465, 466, and 467.
[307] C. Carlet and B. Merabet. Asymptotic lower bound on the algebraic immunity of random balanced
multi-output Boolean functions. Advances in Mathematics of Communications 7 (2), pp. 197–217,
2013. See page 93.
[308] C. Carlet and S. Mesnager. On the supports of the Walsh transforms of Boolean functions.
Proceedings of the Conference BFCA 2005, Publications des universités de Rouen et du Havre,
pp. 65–82, 2005. See pages 57 and 183.
[309] C. Carlet and S. Mesnager. Improving the upper bounds on the covering radii of binary Reed–
Muller codes. IEEE Transactions on Information Theory 53, pp. 162–173, 2007. See pages 83, 158,
and 159.
[310] C. Carlet and S. Mesnager. On the construction of bent vectorial functions. Special Issue of the
International Journal of Information and Coding Theory (IJICoT) 1 (2), dedicated to Vera Pless,
pp. 133–148, 2010. See pages 269, 270, 271, and 274.
[311] C. Carlet and S. Mesnager. On Dillon’s class H of bent functions, Niho bent functions and
o-polynomials. Journal of Combinatorial Theory, Series A 118, pp. 2392–2410, 2011. See
pages 168, 199, 217, 218, 219, 220, 222, 362, and 476.
[312] C. Carlet and S. Mesnager. On semi-bent Boolean functions. IEEE Transactions on Information
Theory 58, pp. 3287–3292, 2012. See pages 61, 263, and 362.
[313] C. Carlet and S. Mesnager. Four decades of research on bent functions. Designs, Codes and
Cryptography 78 (1), pp. 5–50, 2016. See pages 190, 201, 222, 266, and 270.
[314] C. Carlet and S. Mesnager. Characterizations of o-polynomials by the Walsh transform,
arXiv:1709.03765, 2017, https://arxiv.org/abs/1709.03765. See page 414.
[315] C. Carlet and S. Mesnager. On those multiplicative subgroups of F_{2^n}^* which are Sidon sets and/or
sum-free sets. Preprint 2019. See page 390.
[316] C. Carlet and S. Picek. On the exponents of APN power functions and Sidon sets, sum-free sets, and
Dickson polynomials. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2017/1179, 2017.
See pages 388, 389, and 390.
[317] C. Carlet and E. Prouff. On plateaued functions and their constructions. Proceedings of Fast
Software Encryption FSE 2003, Lecture Notes in Computer Science 2887, pp. 54–73, 2003. See
pages 180, 193, 258, 260, 264, 265, and 295.
[318] C. Carlet and E. Prouff. On a new notion of nonlinearity relevant to multi-output pseudo-random
generators. Proceedings of Selected Areas in Cryptography 2003, Lecture Notes in Computer
Science 3006, pp. 291–305, 2004. See pages 131 and 133.
[319] C. Carlet and E. Prouff. Vectorial functions and covering sequences. Proceedings of Finite Fields
and Applications, Fq7, Lecture Notes in Computer Science 2948, pp. 215–248, 2004. See pages 184,
186, 187, and 316.
[320] C. Carlet and E. Prouff. Polynomial evaluation and side channel analysis. The New Codebreakers,
Dedicated to David Kahn on the Occasion of His 85th Birthday, pp. 315–341, 2016. See pages 74,
425, 431, 434, and 484.
[321] C. Carlet, E. Prouff, M. Rivain, and T. Roche. Algebraic decomposition for probing security.
Proceedings of CRYPTO 2015, Lecture Notes in Computer Science 9215, pp. 742–763, 2015. See
pages 435 and 441.
[322] C. Carlet and P. Sarkar. Spectral domain analysis of correlation immune and resilient Boolean
functions. Finite Fields and Their Applications 8, pp. 120–130, 2002. See pages 287, 289, and 290.
[323] C. Carlet and Y. Tan. On group rings and some of their applications to combinatorics and
cryptography. International Journal of Group Theory 4 (4), pp. 61–74, 2015. See page 153.
[324] C. Carlet and D. Tang. Enhanced Boolean functions suitable for the filter model of pseudo-random
generator. Designs, Codes and Cryptography 76 (3), pp. 571–587, 2015. See pages 94, 322, 343,
and 344.
[325] C. Carlet, D. Tang, X. Tang, and Q. Liao. New construction of differentially 4-uniform
bijections. Proceedings of INSCRYPT 2013, 9th International Conference, Guangzhou, China,
November 27–30, 2013, Lecture Notes in Computer Science 8567, pp. 22–38, 2014. See pages 420
and 421.
[326] C. Carlet and Y. V. Tarannikov. Covering sequences of Boolean functions and their cryptographic
significance. Designs, Codes and Cryptography, 25, pp. 263–279, 2002. See pages 182, 290,
and 291.
[327] C. Carlet and J. L. Yucas. Piecewise constructions of bent and almost optimal Boolean functions.
Designs, Codes and Cryptography 37 (3), pp. 449–464, 2005. See pages 237 and 255.
[328] C. Carlet, X. Zeng, C. Lei, and L. Hu. Further properties of several classes of Boolean functions
with optimum algebraic immunity. Proceedings of the First International Conference on Symbolic
Computation and Cryptography SCC 2008, LMIB, pp. 42–54, 2008. See pages 336 and 355.
[329] C. Carlet, F. Zhang, and Y. Hu. Secondary constructions of bent functions and their enforcement.
Advances in Mathematics of Communications 6 (3), pp. 305–314, 2012. See page 233.
[330] L. Carlitz. Permutations in a finite field. Proc. Amer. Math. Soc. 4, p. 538, 1953. See page 494.
[331] L. Carlitz. A note on permutation functions over a finite field. Duke Math. Journal 29 (2),
pp. 325–332, 1962. See page 442.
[332] L. Carlitz. Explicit evaluation of certain exponential sums. Math. Scand., 44, pp. 5–16, 1979. See
page 177.
[333] L. Carlitz and S. Uchiyama. Bounds for exponential sums. Duke Math. Journal 1, pp. 37–41, 1957.
See pages 188 and 401.
[334] F. N. Castro and L. A. Medina. Linear recurrences and asymptotic behavior of exponential sums
of symmetric Boolean functions. The Electronic Journal of Combinatorics 18 (2), p. 8, 2011. See
page 355.
[335] A. Çeşmelioglu, W. Meidl, and A. Pott. Bent functions, spreads and o-polynomials. SIAM Journal
on Discrete Mathematics 29 (2), pp. 854–867, 2015. See page 224.
[336] A. Çeşmelioglu, W. Meidl, and A. Pott. There are infinitely many bent functions for which the dual
is not bent. IEEE Transactions on Information Theory 62 (9), pp. 5204–5208, 2016. See page 234.
[337] A. Çeşmelioglu, W. Meidl, and A. Pott. Vectorial bent functions and their duals. Linear Algebra
and Its Applications 548, pp. 305–320, 2018. See page 269.
[338] A. Çeşmelioglu, W. Meidl, and A. Topuzoğlu. Partially bent functions and their properties. Applied
Algebra and Number Theory 2014, pp. 22–38, 2014. See page 256.
[339] H. Chabanne, G. D. Cohen, and A. Patey. Towards secure two-party computation from the wire-tap
channel. Proceedings of ICISC 2013, Lecture Notes in Computer Science 8565, pp. 34–46, 2013.
See page 149.
[340] H. Chabanne, H. Maghrebi, and E. Prouff. Linear repairing codes and side-channel attacks. IACR
Transactions on Cryptographic Hardware and Embedded Systems 2018 (1), pp. 118–141, 2018.
See pages 428 and 436.
[341] F. Chabaud and S. Vaudenay. Links between differential and linear cryptanalysis. Proceedings of
EUROCRYPT 1994, Lecture Notes in Computer Science 950, pp. 356–365, 1995. See pages 117,
118, 119, and 372.
[342] C. Chaigneau, T. Fuhr, H. Gilbert, J. Guo, J. Jean, J.-R. Reinhard, and L. Song. Key-recovery
attacks on full Kravatte. IACR Transactions on Symmetric Cryptology 2018 (1), pp. 5–28, 2018.
See page 96.
[343] K. Chakraborty, S. Sarkar, S. Maitra, B. Mazumdar, D. Mukhopadhyay, and E. Prouff. Redefining
the transparency order. Designs, Codes and Cryptography 82 (1–2), pp. 95–115, 2017. See
page 429.
[344] A. H. Chan and R. A. Games. On the quadratic spans of De Bruijn sequences. IEEE Transactions
on Information Theory 36 (4), pp. 822–829, 1990. See page 23.
[345] S. Chang and J. Y. Hyun. Linear codes from simplicial complexes. Designs, Codes and Cryptogra-
phy 86 (10), pp. 2167–2181, 2018. See page 149.
[346] S. Chanson, C. Ding, and A. Salomaa. Cartesian authentication codes from functions with optimal
nonlinearity. Theoretical Computer Science 290, pp. 1737–1752, 2003. See page 370.
[347] C. Charnes, M. Rötteler, and T. Beth. Homogeneous bent functions, invariants, and designs.
Designs, Codes and Cryptography, 26, pp. 139–154, 2002. See pages 209 and 248.
[348] P. Charpin. Open Problems on Cyclic Codes. In Handbook of Coding Theory, Part 1, chapter 11,
V. S. Pless and W. C. Huffman, eds., R. A. Brualdi, assistant editor. Elsevier,
pp. 963–1063, 1998. See page 6.
[349] P. Charpin. Normal Boolean functions. Special Issue “Complexity Issues in Coding and Cryptog-
raphy,” Dedicated to Prof. Harald Niederreiter on the Occasion of His 60th Birthday, Journal of
Complexity 20, pp. 245–265, 2004. See pages 241 and 253.
[350] P. Charpin and G. Gong. Hyperbent functions, Kloosterman sums and Dickson polynomials. IEEE
Transactions on Information Theory 54 (9), pp. 4230–4238, 2008. See pages 215, 231, 246, and 247.
[351] P. Charpin, T. Helleseth, and V. Zinoviev. Propagation characteristics of x → 1/x and Kloosterman
sums. Finite Fields and Their Applications 13 (2), pp. 366–381, 2007. See page 401.
[352] P. Charpin and G. Kyureghyan. Cubic monomial bent functions: a subclass of M. SIAM Journal
on Discrete Mathematics 22 (2), pp. 650–665, 2008. See pages 230 and 231.
[353] P. Charpin and G. Kyureghyan. On sets determining the differential spectrum of mappings.
International Journal of Information and Coding Theory (IJICoT) 4 (2/3), pp. 170–184, 2017. See
also: A note on verifying the APN property. IACR Cryptology ePrint Archive (http://eprint.iacr.org/)
2013/475, 2013. See pages 192 and 373.
[354] P. Charpin and E. Pasalic. On propagation characteristics of resilient functions. Proceedings of SAC
2002, Lecture Notes in Computer Science 2595, pp. 356–365, 2002. See pages 101, 290, and 292.
[355] P. Charpin, E. Pasalic, and C. Tavernier. On bent and semi-bent quadratic Boolean functions. IEEE
Transactions on Information Theory 51 (12), pp. 4286–4298, 2005. See pages 178, 206, 230,
and 263.
[356] P. Charpin and J. Peng. New links between nonlinearity and differential uniformity. Finite Fields
and Their Applications 56, pp. 188–208, 2019. See pages 387 and 418.
[357] S. Chee, S. Lee, and K. Kim. Semi-bent functions. Proceedings of ASIACRYPT 1994, Lecture Notes
in Computer Science 917, pp. 107–118, 1994. See page 263.
[358] S. Chee, S. Lee, K. Kim, and D. Kim. Correlation immune functions with controlable nonlinearity.
ETRI Journal 19 (4), pp. 389–401, 1997. See page 294.
[359] S. Chee, S. Lee, D. Lee, and S. H. Sung. On the correlation immune functions and their nonlinearity.
Proceedings of ASIACRYPT 1996, Lecture Notes in Computer Science 1163, pp. 232–243, 1997.
See page 294.
[360] L. Chen, S. Jordan, Y.-K. Liu, D. Moody, R. Peralta, R. Perlner, and D. Smith-Tone. Report on
post-quantum cryptography. US Department of Commerce, National Institute of Standards and
Technology, NISTIR 8105. See page 1.
[361] V. Y.-W. Chen. The Gowers’ norm in the testing of Boolean functions. PhD thesis, Massachusetts
Institute of Technology, 2009. See page 469.
[362] X. Chen, Y. Deng, M. Zhu, and L. Qu. An equivalent condition on the switching construction
of differentially 4-uniform permutations on F_{2^{2k}} from the inverse function. International Journal of
Computer Mathematics 94 (6), pp. 1252–1267, 2017. See page 420.
[363] X. Chen, L. Qu, C. Li, and J. Du. A new method to investigate the CCZ-equivalence between
functions with low differential uniformity. Finite Fields and Their Applications 42, pp. 165–186,
2016. See page 420.
[364] Y. Chen and P. Lu. Two classes of symmetric Boolean functions with optimum algebraic immunity:
construction and analysis. IEEE Transactions on Information Theory 57, pp. 2522–2538, 2011. See
page 358.
[365] W. Cheng, C. Carlet, K. Goli, S. Guilley, and J.-L. Danger. Detecting faults in inner product masking
scheme. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2019/919. Presented at PROOFS
2019, 8th International Workshop on Security Proofs for Embedded Systems, Atlanta, USA, 2019
(https://easychair.org/publications/paper/HTzP). See page 447.
[366] W. Cheng, S. Guilley, C. Carlet, J.-L. Danger, and A. Schaub. Optimal codes for inner product
masking. 17th CryptArchi Workshop, Prague 2019. See page 446.
[367] J. H. Cheon. Nonlinear vector resilient functions. Proceedings of CRYPTO 2001, Lecture Notes in
Computer Science 2139, pp. 458–469, 2001. See page 317.
[368] J. H. Cheon and D. H. Lee. Resistance of S-boxes against algebraic attacks. Proceedings of Fast
Software Encryption FSE 2004, Lecture Notes in Computer Science 3017, pp. 83–94, 2004. See
page 345.
[369] V. Chepyzhov and B. Smeets. On a fast correlation attack on certain stream ciphers. Proceedings of
EUROCRYPT 1991, Lecture Notes in Computer Science 547, pp. 176–185, 1992. See page 78.
[370] B. Chor, O. Goldreich, J. Hastad, J. Freidmann, S. Rudich, and R. Smolensky. The bit extraction
problem or t-resilient functions. Proceedings of the 26th IEEE Symposium on Foundations of
Computer Science, pp. 396–407, 1985. See pages 86, 129, 284, 314, and 356.
[371] C. Cid, T. Huang, T. Peyrin, Y. Sasaki, and L. Song. Boomerang connectivity table: a new
cryptanalysis tool. Proceedings of EUROCRYPT (2) 2018, Lecture Notes in Computer Science
10821, pp. 683–714, 2018. See page 141.
[372] J. A. Clark, J. L. Jacob, and S. Stepney. The design of S-boxes by simulated annealing. New
Generation Computing 23 (3), pp. 219–231, 2005. See page 145.
[373] E. M. Clarke, K. L. McMillan, X. Zhao, M. Fujita, and J. Yang. Spectral transforms for large
Boolean functions with applications to technology mapping. Proceedings of 30th ACM/IEEE
Design Automation Conference, IEEE, pp. 54–60, 1993. See page 57.
[374] G. Cohen and J. P. Flori. On a generalized combinatorial conjecture involving addition mod 2^k − 1.
IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2011/400, 2011. See page 340.
[375] G. Cohen, I. Honkala, S. Litsyn, and A. Lobstein. Covering Codes. North-Holland, 1997. See
pages 7 and 158.
[376] G. Cohen and S. Mesnager. On constructions of semi-bent functions from bent functions. Journal
Contemporary Mathematics 625, Discrete Geometry and Algebraic Combinatorics, American
Mathematical Society, pp. 141–154, 2014. See page 263.
[377] G. Cohen and A. Tal. Two structural results for low degree polynomials and applications. 18th
International Workshop on Approximation Algorithms for Combinatorial Optimization Problems,
APPROX 2015, and 19th International Workshop on Randomization and Computation, RANDOM
2015 – Princeton, United States. CoRR, abs/1404.0654, 2014. See page 109.
[378] G. D. Cohen, M. G. Karpovsky, H. F. Mattson Jr., and J. R. Schatz. Covering radius – survey and
recent results. IEEE Transactions on Information Theory 31 (3), pp. 328–343, 1985. See page 158.
[379] G. D. Cohen and S. Litsyn. On the covering radius of Reed–Muller codes. Discrete Mathematics
106–107, pp. 147–155, 1992. See page 158.
[380] S. D. Cohen and R. W. Matthews. A class of exceptional polynomials. Transactions of the AMS
345, pp. 897–909, 1994. See page 400.
[381] J.-S. Coron, E. Prouff, M. Rivain, and T. Roche. Higher-order side channel security and mask
refreshing. Proceedings of Fast Software Encryption FSE 2013, Lecture Notes in Computer Science
8424, pp. 410–424, 2013, and IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2015/359,
2015. See page 427.
[382] J.-S. Coron, A. Roy, and S. Vivek. Fast evaluation of polynomials over binary finite fields
and application to side-channel countermeasures. Proceedings of International Workshop Crypto-
graphic Hardware and Embedded Systems CHES 2014, Lecture Notes in Computer Science 8731,
pp. 170–187, 2014. See page 434.
[383] A. Coşgun and F. Özbudak. A correction and improvements of some recent results on Walsh
transforms of Gold type and Kasami–Welch type functions. Proceedings of WAIFI 2016, Lecture
Notes in Computer Science 10064, pp. 243–257, 2016. See page 178.
[384] R. S. Coulter. On the evaluation of a class of Weil sums in characteristic 2, New Zealand J. Math.
28, pp. 171–184, 1999. See page 178.
[385] R. S. Coulter. The number of rational points of a class of Artin–Schreier curves. Finite Fields and
Their Applications 8, pp. 397–413, 2002. See page 178.
[386] R. S. Coulter and S. Mesnager. Bent functions from involutions over F_{2^n}. IEEE Transactions on
Information Theory 64 (4), pp. 2979–2986, 2018. See page 237.
[387] N. Courtois. Higher order correlation attacks, XL algorithm and cryptanalysis of Toyocrypt.
Proceedings of ICISC 2002, Lecture Notes in Computer Science 2587, pp. 182–199, 2003. See
pages 3 and 83.
[388] N. Courtois. Fast algebraic attacks on stream ciphers with linear feedback. Proceedings of CRYPTO
2003, Lecture Notes in Computer Science 2729, pp. 177–194, 2003. See pages 76, 89, 93, and 284.
[389] N. Courtois. Algebraic attacks on combiners with memory and several outputs. Proceedings of
ICISC 2004, Lecture Notes in Computer Science 3506, pp. 3–20, 2005. See page 91.
[390] N. Courtois. Cryptanalysis of SFINKS. Proceedings of ICISC 2005. Preliminary version available
in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) pp. 261–269, 2005/243, 2005. See
page 93.
[391] N. Courtois and W. Meier. Algebraic attacks on stream ciphers with linear feedback. Proceedings
of EUROCRYPT 2003, Lecture Notes in Computer Science 2656, pp. 346–359. See pages 76, 89,
91, 92, and 93.
[392] N. Courtois and J. Pieprzyk. Cryptanalysis of block ciphers with overdefined systems of equations.
Proceedings of ASIACRYPT 2002, Lecture Notes in Computer Science 2501, pp. 267–287, 2003.
See pages 125, 126, 127, and 345.
[393] G. Couteau, A. Dupin, P. Méaux, M. Rossi, and Y. Rotella. On the concrete security of Goldreich’s
pseudorandom generator. Proceedings of ASIACRYPT 2018, Part I, Lecture Notes in Computer
Science, 11273, pp. 96–124, 2018. See pages 467 and 468.
[394] Y. Crama and P. L. Hammer. Boolean Models and Methods in Mathematics, Computer Science, and
Engineering. Cambridge University Press, 2010. See page ix.
[395] R. Cramer, Y. Dodis, S. Fehr, C. Padro, and D. Wichs. Detection of algebraic manipulation with
application to robust secret sharing and fuzzy extractors. Proceedings of EUROCRYPT 2008,
Lecture Notes in Computer Science 4965, pp. 471–488, 2008 (preliminary version available in IACR
Cryptology ePrint Archive http://eprint.iacr.org/ 2008/030). See pages 450, 451, 452, and 453.
[396] R. Cramer, S. Fehr, and C. Padro. Algebraic manipulation detection codes. Science China
Mathematics 56 (7), pp. 1349–1358, 2013. See pages 451, 452, and 453.
[397] R. Cramer, C. Padró, and C. Xing. Optimal algebraic manipulation detection codes in the constant-
error model. Proceedings of TCC (1) 2015, Lecture Notes in Computer Science 9014, pp. 481–501,
2015. See pages 452 and 453.
[398] T. W. Cusick. On constructing balanced correlation immune functions. Proceedings of International
Conference on Sequences and Their Applications SETA 1998, Discrete Mathematics and Theoreti-
cal Computer Science, pp. 184–190, 1999. See page 293.
[399] T. W. Cusick. Weight recursions for any rotation symmetric Boolean functions. IEEE Transactions
on Information Theory 64 (4), pp. 2962–2968, 2018. See page 362.
[400] T. W. Cusick, C. Ding, and A. Renvall. Stream Ciphers and Number Theory, North-Holland
Mathematical Library 55. North-Holland/Elsevier, 1998.
[401] T. W. Cusick and P. Stănică. Cryptographic Boolean Functions and Applications (second edition),
Elsevier, 2017. See page 76.
[402] J. Daemen. Changing of the guards: a simple and efficient method for achieving uniformity in
threshold sharing. Proceedings of International Workshop Cryptographic Hardware and Embedded
Systems CHES 2017, Lecture Notes in Computer Science 10529, pp. 137–153, 2017. See page 443.
[403] J. Daemen and V. Rijmen. AES proposal: Rijndael, 1999. See www.quadibloc.com/crypto/
co040401.htm. See page 3.
[404] J. Daemen and V. Rijmen. The Design of Rijndael: AES – The Advanced Encryption Standard.
Springer, 2002. See pages 3 and 25.
[405] J. Daemen and V. Rijmen. Probability distributions of correlation and differentials in block ciphers.
Journal of Mathematical Cryptology (JMC) 1 (3), pp. 221–242, 2007. See page 136.
[406] D. Dalai. On 3-to-1 and power APN S-boxes. Proceedings of International Conference on
Sequences and Their Applications SETA 2008, Lecture Notes in Computer Science 5203,
pp. 377–389, 2008. See pages 385 and 386.
[407] D. K. Dalai, K. C. Gupta, and S. Maitra. Results on algebraic immunity for cryptographically
significant Boolean functions. Proceedings of Indocrypt 2004, Lecture Notes in Computer Science
3348, pp. 92–106, 2004. See pages 324, 334, and 343.
[408] D. K. Dalai, K. C. Gupta, and S. Maitra. Cryptographically significant Boolean functions:
construction and analysis in terms of algebraic immunity. Proceedings of Fast Software Encryption
FSE 2005, Lecture Notes in Computer Science 3557, pp. 98–111, 2005. See page 336.
[409] D. K. Dalai, S. Maitra, and S. Sarkar. Basic theory in construction of Boolean functions with
maximum possible annihilator immunity. Designs, Codes and Cryptography 40 (1), pp. 41–58, 2006
(preliminary version available in IACR Cryptology ePrint Archive, http://eprint.iacr.org/ 2005/229,
2005). See pages 334, 335, and 360.
[410] E. R. van Dam and D. Fon-Der-Flaass. Codes, graphs, and schemes from nonlinear functions.
European Journal of Combinatorics 24 (1), pp. 85–98, 2003. See pages 373 and 376.
[411] E. R. van Dam and M. Muzychuk. Some implications on amorphic association schemes. Journal of
Combinatorial Theory, Series A 117 (2), pp. 111–127, 2010. See page 162.
[412] M. Daum, H. Dobbertin, and G. Leander. An algorithm for checking normality of Boolean
functions. Proceedings of the Workshop on Coding and Cryptography 2003, pp. 133–142, 2003.
See page 252.
[413] M. Daum, H. Dobbertin, and G. Leander. Short description of an algorithm to create bent functions.
Private communication. See page 243.
[414] D. Davidova. Magic action of o-polynomials and EA equivalence of Niho bent functions.
BFA 2018, June 2018, Loen, Norway
(https://people.uib.no/chunlei.li/workshops/BFA2018/Slides/Davidova.pdf). See page 221.
[415] J. A. Davis and J. Jedwab. A unifying construction for difference sets. Journal of Combinatorial
Theory Series A 80, pp. 13–78, 1997. See page 235.
[416] J. A. Davis and J. Jedwab. Peak-to-mean power control in OFDM, Golay complementary sequences
and Reed–Muller codes. IEEE Transactions on Information Theory 45 (7), pp. 2397–2417, 1999.
See page 189.
[417] E. Dawson and C.-K. Wu. Construction of correlation immune Boolean functions. Proceedings of
ICICS 1997, pp. 170–180, 1997. See page 292.
[418] M. Delgado. The state of the art on the conjecture of exceptional APN functions. Note Mat. 37 (1),
pp. 41–51, 2017. See page 404.
[419] M. Delgado and H. Janwa. On the conjecture on APN functions and absolute irreducibility of
polynomials. Designs, Codes and Cryptography 82 (3), pp. 617–627, 2017. See page 404.
[420] P. Delsarte. A geometrical approach to a class of cyclic codes. Journal of Combinatorial Theory 6
(4), pp. 340–358, 1969. See page 154.
[421] P. Delsarte. Bounds for unrestricted codes, by linear programming. Philips Research Reports 27,
pp. 272–289, 1972. See page 129.
[422] P. Delsarte. An algebraic approach to the association schemes of coding theory. PhD thesis.
Université Catholique de Louvain, 1973. See pages 88, 254, 304, and 314.
[423] P. Delsarte. Four fundamental parameters of a code and their combinatorial significance. Informa-
tion and Control 23 (5), pp. 407–438, 1973. See page 88.
[424] P. Delsarte and V. I. Levenshtein. Association schemes and coding theory. IEEE Transactions on
Information Theory 44 (6), pp. 2477–2504, 1998. See page 162.
[425] P. Dembowski. Finite Geometries. Springer, 1968. See pages 224 and 225.
[426] U. Dempwolff. Automorphisms and equivalence of bent functions and of difference sets in
elementary Abelian 2-groups. Comm. Algebra 34 (3), pp. 1077–1131, 2006. See page 192.
[427] U. Dempwolff and T. Neumann. Geometric and design-theoretic aspects of semi-bent functions I.
Designs, Codes and Cryptography 57 (3), pp. 373–381, 2010. See page 262.
[428] U. Dempwolff. Dimensional doubly dual hyperovals and bent functions. Innov. Incidence Geom.
13 (1), pp. 149–178, 2013. See page 197.
[429] U. Dempwolff. CCZ equivalence of power functions. Designs, Codes and Cryptography 86 (3),
pp. 665–692, 2018. See page 396.
[430] U. Dempwolff and Y. Edel. Isomorphisms and automorphisms of extensions of bilinear dimensional
dual hyperovals and quadratic APN functions. Journal of Group Theory 19 (2), pp. 249–322, 2016.
See page 383.
[431] U. Dempwolff and P. Müller. Permutation polynomials and translation planes of even order. Adv.
Geom. 13, pp. 293–313, 2013. See page 225.
[432] J. D. Denev and V. D. Tonchev. On the number of equivalence classes of Boolean functions under
a transformation group. IEEE Transactions on Information Theory 26 (5), pp. 625–626, 1980. See
page 29.
[433] O. Denisov. An asymptotic formula for the number of correlation-immune of order k Boolean
functions. Discrete Mathematics and Applications 2 (4), pp. 407–426, 1992. Translation of a
Russian article in Diskretnaya Matematika 3, pp. 25–46, 1990. See page 312.
[434] O. Denisov. A local limit theorem for the distribution of a part of the spectrum of a random binary
function. Discrete Mathematics and Applications 10 (1), pp. 87–102, 2000. See page 312.
[435] S. Dib. Distribution of Boolean functions according to the second-order nonlinearity. Proceedings
of Arithmetic of Finite Fields WAIFI 2010, Lecture Notes in Computer Science 6087, pp. 86–96,
2010. See page 84.
[436] S. Dib. Asymptotic nonlinearity of vectorial Boolean functions. Cryptography and Communications
6 (2), pp. 103–115, 2013. See pages 117 and 369.
[437] F. Didier. A new upper bound on the block error probability after decoding over the erasure channel.
IEEE Transactions on Information Theory 52, pp. 4496–4503, 2006. See pages 93 and 334.
[438] F. Didier. Using Wiedemann’s algorithm to compute the immunity against algebraic and fast
algebraic attacks. Proceedings of Indocrypt 2006, Lecture Notes in Computer Science 4329,
pp. 236–250, 2006. See page 334.
[439] F. Didier and J.-P. Tillich. Computing the algebraic immunity efficiently. Proceedings of Fast
Software Encryption FSE 2006, Lecture Notes in Computer Science 4047, pp. 359–374, 2006. See
page 334.
[440] J. F. Dillon. A survey of bent functions. NSA Technical Journal Special Issue, pp. 191–215, 1972.
See pages 196, 208, 210, and 230.
[441] J. F. Dillon. Elementary Hadamard difference sets. Ph. D. Thesis, University of Maryland, 1974.
See pages 165, 167, 197, 198, 200, 201, 209, 212, 213, 216, 217, 218, 227, 231, 232, and 246.
[442] J. F. Dillon. Elementary Hadamard difference sets. Proceedings of the Sixth S-E Conf. Comb. Graph
Theory and Comp., Winnipeg Utilitas Math, pp. 237–249, 1975. See pages 214, 215, and 232.
[443] J. F. Dillon. Multiplicative difference sets via additive characters. Designs, Codes and Cryptography
17, pp. 225–235, 1999. See pages 394, 395, and 416.
[444] J. F. Dillon. Geometry, codes and difference sets: exceptional connections. Codes and Designs,
Proceedings of a Conference Honoring Professor D. K. Ray-Chaudhuri, Columbus, OH, 2000, Ohio
State University, volume 10, pp. 73–85, 2002. See page 404.
[445] J. F. Dillon. APN polynomials and related codes. Banff Conference, November 2006. See pages 408
and 412.
[446] J. F. Dillon. More DD difference sets. Designs, Codes and Cryptography 49 (1–2), pp. 23–32, 2008.
[447] J. F. Dillon. On the dimension of an APN code. Cryptography and Communications 3 (4) (Special
issue in honor of Jacques Wolfmann), pp. 275–279, 2011. See page 379.
[448] J. F. Dillon and H. Dobbertin. New cyclic difference sets with Singer parameters. Finite Fields and
Their Applications 10, pp. 342–389, 2004. See pages 173, 178, 230, 231, 252, 382, 395, 400, 415,
and 416.
[449] J. F. Dillon and G. McGuire. Near bent functions on a hyperplane. Finite Fields and Their
Applications 14 (3), pp. 715–720, 2008. See page 232.
[450] J. F. Dillon and J. R. Schatz. Block designs with the symmetric difference property. Proc. NSA
Mathematical Sciences Meetings (R. L. Ward, ed.), pp. 159–164, U.S. Govt. Printing Office, 1987.
www.openmathtexts.org/papers/dillon-shatz-designs.pdf. See page 201.
[451] C. Ding. Optimal constant composition codes from zero-difference balanced functions. IEEE
Transactions on Information Theory 54 (12), pp. 5766–5770, 2008. See page 394.
[452] C. Ding. Cyclic codes from some monomials and trinomials. SIAM Journal on Discrete Mathemat-
ics 27 (4), pp. 1977–1994, 2013. See page 161.
[453] C. Ding. Linear codes from some 2-designs. IEEE Transactions on Information Theory 60 (6),
pp. 3265–3275, 2015. See pages 160, 189, and 263.
[454] C. Ding. A construction of binary linear codes from Boolean functions. Discrete Mathematics 339
(9), pp. 2288–2303, 2016. See page 159.
[455] C. Ding. A sequence construction of cyclic codes over finite fields. Cryptography and Communica-
tions 10 (2), pp. 319–341, 2018. See page 159.
[456] C. Ding, Z. Heng, and Z. Zhou. Minimal binary linear codes. IEEE Transactions on Information
Theory 64 (10), pp. 6536–6545, 2018. See pages 147 and 149.
[457] C. Ding, C. Li, and Y. Xia. Another generalisation of the binary Reed–Muller codes and its
applications. Finite Fields and Their Applications 53, pp. 144–174, 2018. See page 151.
[458] C. Ding, S. Mesnager, C. Tang, and M. Xiong. Cyclic bent functions and their applications in codes,
codebooks, designs, MUBs and sequences. arXiv:1811.07725, 2019. See page 207.
[459] C. Ding, A. Munemasa, and V. D. Tonchev. Bent vectorial functions, codes and designs. IEEE
Transactions on Information Theory 65 (11), pp. 7533–7541, 2019. https://arxiv.org/abs/1808.08487.
See page 270.
[460] C. Ding and H. Niederreiter. Systematic authentication codes from highly nonlinear functions. IEEE
Transactions on Information Theory 50 (10), pp. 2421–2428, 2004. See page 150.
[461] C. Ding, D. Pei, and A. Salomaa. Chinese Remainder Theorem: Applications in Computing, Coding,
Cryptography. World Scientific, 1996. See pages 147 and 284.
[462] C. Ding and Y. Tan. Zero-difference balanced functions with applications. Journal of Statistical
Theory and Practice 6 (1), pp. 3–19, 2012. See page 394.
[463] C. Ding, G. Z. Xiao, and W. Shan. The Stability Theory of Stream Ciphers, Lecture Notes in
Computer Science 561, 1991. See page 3.
[464] C. Ding and Z. Zhou. Binary cyclic codes from explicit polynomials over GF(2^m). Discrete
Mathematics 321, pp. 76–89, 2014. See page 161.
[465] I. Dinur and A. Shamir. Breaking Grain-128 with dynamic cube attacks. Proceedings of Fast
Software Encryption FSE 2011, Lecture Notes in Computer Science 6733, pp. 167–187, 2011.
See pages 97 and 114.
[466] H. Dobbertin. Construction of bent functions and balanced Boolean functions with high nonlinear-
ity. Proceedings of Fast Software Encryption FSE 1995, Lecture Notes in Computer Science 1008,
pp. 61–74, 1995. See pages 81, 105, 228, and 388.
[467] H. Dobbertin. One-to-one highly nonlinear power functions on GF(2^n). Applicable Algebra in
Engineering, Communication and Computing (AAECC) 9 (2), pp. 139–152, 1998. See pages 402
and 418.
[468] H. Dobbertin. Kasami power functions, permutation polynomials and cyclic difference sets.
Proceedings of the NATO-A.S.I. Workshop “Difference Sets, Sequences and Their Correlation
Properties,” Bad Windsheim, Kluwer Verlag, pp. 133–158, 1998. See page 400.
[469] H. Dobbertin. Private communication, 1998. See page 385.
[470] H. Dobbertin. Another proof of Kasami’s theorem. Designs, Codes and Cryptography 17,
pp. 177–180, 1999. See pages 229 and 394.
[471] H. Dobbertin. Almost perfect nonlinear power functions on GF(2^n): the Welch case. IEEE
Transactions on Information Theory 45 (4), pp. 1271–1275, 1999. See page 395.
[472] H. Dobbertin. Almost perfect nonlinear power functions on GF(2^n): the Niho case. Information
and Computation 151, pp. 57–72, 1999. See page 395.
[473] H. Dobbertin. Almost perfect nonlinear power functions on GF(2^n): a new case for n divisible by
5. Proceedings of Finite Fields and Applications Fq5, Augsburg, Germany. Springer, pp. 113–121,
2000. See page 401.
[474] H. Dobbertin. Uniformly representable permutation polynomials. Proceedings of International
Conference on Sequences and Their Applications SETA 2001 (International Conference on
Sequences and Their Applications), Discrete Mathematics and Theoretical Computer Science.
Springer, pp. 1–22, 2002. See pages 395 and 400.
[475] H. Dobbertin, P. Felke, T. Helleseth, and P. Rosenthal. Niho type cross-correlation functions via
Dickson polynomials and Kloosterman sums. IEEE Transactions on Information Theory 52 (2),
pp. 613–627, 2006. See pages 222 and 247.
[476] H. Dobbertin and G. Leander. Cryptographer’s toolkit for construction of 8-bit bent functions, IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2005/089, 2005. See page 239.
[477] H. Dobbertin and G. Leander. Bent functions embedded into the recursive framework of Z-bent
functions. Designs, Codes and Cryptography 49 (1–3), pp. 3–22, 2008. See pages 232, 239, and 240.
[478] H. Dobbertin and G. Leander. A survey of some recent results on bent functions. Proceedings
of International Conference on Sequences and Their Applications SETA 2004, Lecture Notes in
Computer Science 3486, pp. 1–29, 2005. See page 231.
[479] H. Dobbertin, G. Leander, A. Canteaut, C. Carlet, P. Felke, and P. Gaborit. Construction of
bent functions via Niho power functions. Journal of Combinatorial Theory, Series A 113 (5),
pp. 779–798, 2006. See pages 221, 222, 231, and 271.
[480] C. Dobraunig, M. Eichlseder, L. Grassi, et al. Rasta: a cipher with low ANDdepth and few ANDs
per Bit. Proceedings of CRYPTO 2018 (1), Lecture Notes in Computer Science 10991, pp. 662–692,
2018. See page 454.
[481] Y. Dodis, J. Katz, L. Reyzin, and A. Smith. Robust fuzzy extractors and authenticated key agreement
from close secrets. Proceedings of CRYPTO 2006, Lecture Notes in Computer Science 4117,
pp. 232–250, 2006. See page 450.
[482] S. M. Dodunekov and V. A. Zinoviev. A note on Preparata codes. Proceedings of Sixth Intern. Symp.
on Information Theory, Moscow – Tashkent Part 2, pp. 78–80, 1984. See pages 379 and 380.
[483] D. Dong, X. Zhang, L. Qu and S. Fu. A note on vectorial bent functions. Information Processing
Letters 113 (22–24), pp. 866–870, 2013. See page 272.
[484] B. Dravie, J. Parriaux, P. Guillot, and G. Millérioux. Matrix representations of vectorial Boolean
functions and eigenanalysis. Cryptography and Communications 8 (4), pp. 555–577, 2016. See
pages 51 and 52.
[485] Y. Du and F. Zhang. On the existence of Boolean functions with optimal resistance against fast
algebraic attacks. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2012/210, 2012. See
page 322.
[486] S. Dubuc. Characterization of linear structures. Designs, Codes and Cryptography 22, pp. 33–45,
2001. See page 100.
[487] A. Duc, S. Dziembowski, and S. Faust. Unifying leakage models: from probing attacks to
noisy leakage. Proceedings of EUROCRYPT 2014, Lecture Notes in Computer Science 8441,
pp. 423–440, 2014. See page 429.
[488] I. Dumer and O. Kapralova. Spherically punctured Reed–Muller Codes. IEEE Transactions on
Information Theory 63 (5), pp. 2773–2780, 2017. See page 460.
[489] O. Dunkelman and S. Huang. Reconstructing an S-box from its difference distribution table. IACR
Transactions on Symmetric Cryptology 2019 (2), pp. 193–217, 2019. See also IACR Cryptology
ePrint Archive (http://eprint.iacr.org/) 2018/811. See page 376.
[490] S. Duval, V. Lallemand, and Y. Rotella. Cryptanalysis of the FLIP family of stream ciphers.
Proceedings of CRYPTO (1) 2016, Lecture Notes in Computer Science 9814, pp. 457–475, 2016.
See page 456.
[491] S. Dziembowski and S. Faust. Leakage-resilient cryptography from the inner-product extractor.
Proceedings of ASIACRYPT 2011, Lecture Notes in Computer Science 7073, pp. 702–721, 2011.
See page 446.
[492] Y. Edel. Quadratic APN functions as subspaces of alternating bilinear forms. Proceedings of the
Contact Forum Coding Theory and Cryptography III, Belgium 2009, pp. 11–24, 2011. See
page 399.
[493] Y. Edel, G. Kyureghyan, and A. Pott. A new APN function which is not equivalent to a power
mapping. IEEE Transactions on Information Theory 52 (2), pp. 744–747, 2006. See pages 397,
399, and 405.
[494] Y. Edel and A. Pott. A new almost perfect nonlinear function which is not quadratic. Advances in
Mathematics of Communications 3 (1), pp. 59–81, 2009. See pages 399, 402, 403, 407, 408, and 478.
[495] eSTREAM Project. www.ecrypt.eu.org/stream/. See pages 3, 22, 23, and 93.
[496] J. H. Evertse. Linear structures in block ciphers. Proceedings of EUROCRYPT 1987, Lecture Notes
in Computer Science 304, pp. 249–266, 1988. See page 99.
[497] J.-C. Faugère and G. Ars. An algebraic cryptanalysis of nonlinear filter generators using Gröbner
bases. Rapport de Recherche INRIA 4739, 2003. See pages 89 and 90.
[498] X. Feng and G. Gong. On algebraic immunity of trace inverse functions over finite fields with
characteristic two. Journal of Systems Science and Complexity 29 (1), pp. 272–288, 2016 and IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2013/585, 2013. See page 324.
[499] T. Feng, K. Leung, and Q. Xiang. Binary cyclic codes with two primitive nonzeros. Science China
Mathematics 56 (7), pp. 1403–1412, 2013.
[500] K. Feng, Q. Liao, and J. Yang. Maximal values of generalized algebraic immunity. Designs, Codes
and Cryptography 50, pp. 243–252, 2009. See pages 128, 337, 346, 348, and 350.
[501] K. Feng and J. Yang. Vectorial Boolean functions with good cryptographic properties. International
Journal of Foundations of Computer Science 22 (6), pp. 1271–1282, 2011. See page 340.
[502] T. Feulner, L. Sok, P. Solé, and A. Wassermann. Towards the classification of self-dual bent
functions in eight variables. Designs, Codes and Cryptography 68 (1–3), pp. 395–406, 2013. See
page 198.
[503] E. Filiol and C. Fontaine. Highly nonlinear balanced Boolean functions with a good correlation-
immunity. Proceedings of EUROCRYPT 1998, Lecture Notes in Computer Science 1403,
pp. 475–488, 1998. See pages 248, 292, and 360.
[504] Y. Filmus. Friedgut–Kalai–Naor theorem for slices of the Boolean cube. Chicago J. Theor. Comput.
Sci., 2016. See pages 456 and 457.
[505] Y. Filmus. An orthogonal basis for functions over a slice of the Boolean hypercube. The Electronic
Journal of Combinatorics 23 (1), p. P1.23, 2016. See page 456.
[506] Y. Filmus and F. Ihringer. Boolean degree 1 functions on some classical association schemes.
Journal of Combinatorial Theory, Series A 162, pp. 241–270, 2019. See page 163.
[507] Y. Filmus, G. Kindler, E. Mossel, and K. Wimmer. Invariance principle on the slice. 31st Conference
on Computational Complexity, CCC 2016, pp. 15:1–15:10, 2016. See page 457.
[508] Y. Filmus and E. Mossel. Harmonicity and invariance on slices of the Boolean cube. 31st
Conference on Computational Complexity, CCC 2016, pp. 16:1–16:13, 2016. See page 457.
[509] S. Fischer and W. Meier. Algebraic immunity of S-boxes and augmented functions. Proceedings of
Fast Software Encryption FSE 2007. Lecture Notes in Computer Science 4593, pp. 366–381, 2007.
See page 95.
[510] R. W. Fitzgerald. Trace forms over finite fields of characteristic 2 with prescribed invariants. Finite
Fields and Their Applications 15 (1), pp. 69–81, 2009. See page 178.
[511] J. P. Flori and S. Mesnager. Dickson polynomials, hyperelliptic curves and hyper-bent functions.
Proceedings of International Conference on Sequences and Their Applications SETA 2012, Lecture
Notes in Computer Science 7780, pp. 40–52, Springer, 2012. See page 247.
[512] J.-P. Flori and S. Mesnager. An efficient characterization of a family of hyper-bent functions with
multiple trace terms. Journal of Mathematical Cryptology 7 (1), pp. 43–68, 2013. See also IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2011/373. See page 247.
[513] J. P. Flori, H. Randriambololona, G. Cohen, and S. Mesnager. On a conjecture about binary strings
distribution. Proceedings of International Conference on Sequences and Their Applications SETA
2010, Lecture Notes in Computer Science 6338, pp. 346–358, 2010. See page 339.
[514] D. G. Fon-Der-Flaass. A bound on correlation immunity. Sib. Elektron. Mat. Izv. 4, pp. 133–135,
2007. (http://semr.math.nsc.ru/v4/p133-135.pdf). See page 285.
[515] C. Fontaine. On some cosets of the first-order Reed–Muller code with high minimum weight. IEEE
Transactions on Information Theory 45 (4), pp. 1237–1243, 1999. See pages 81, 248, and 292.
[516] R. Forré. The strict avalanche criterion: spectral properties of Boolean functions and an extended
definition. Proceedings of CRYPTO 1988, Lecture Notes in Computer Science 403, pp. 450–468,
1989. See pages 97 and 318.
[517] R. Forré. A fast correlation attack on nonlinearly feedforward filtered shift register sequences.
Proceedings of EUROCRYPT 1989, Lecture Notes in Computer Science 434, pp. 586–595, 1990.
See page 78.
[518] R. Fourquet and C. Tavernier. List decoding of second order Reed–Muller and its covering radius
implications. Proceedings of Workshop on Coding and Cryptography WCC 2007, pp. 147–156,
2007. See page 84.
[519] E. Friedgut. Boolean functions with low average sensitivity depend on few coordinates.
Combinatorica 18 (1), pp. 27–36, 1998. See page 102.
[520] J. Friedman. On the bit extraction problem. Proceedings of the 33rd IEEE Symposium on
Foundations of Computer Science, pp. 314–319, 1992. See pages 87, 312, and 313.
[521] S. Fu and X. Feng. Involutory differentially 4-uniform permutations from known constructions.
Designs, Codes and Cryptography 87 (1), pp. 31–56, 2018. See also IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2017/292. See pages 417 and 421.
[522] S. Fu, X. Feng, Q. Wang, and C. Carlet. On the derivative imbalance and ambiguity of functions.
IEEE Transactions on Information Theory 65 (9), pp. 5833–5845, 2019. See pages 138 and 139.
[523] S. Fu, X. Feng, and B. Wu. Differentially 4-uniform permutations with the best known nonlinearity
from butterflies. IACR Transactions on Symmetric Cryptology, 2017 (2), pp. 228–249, 2017. See
page 421.
[524] R. G. Gallager. Low Density Parity Check Codes. MIT Press, 1963. See page 78.
[525] S. Gangopadhyay, A. K. Gangopadhyay, S. Pollatos, and P. Stănică. Cryptographic Boolean
functions with biased inputs. Cryptography and Communications 9(2), pp. 301–314, 2017. See
page 457.
[526] S. Gangopadhyay, A. Joshi, G. Leander, and R. K. Sharma. A new construction of bent functions
based on Z-bent functions. Designs, Codes and Cryptography 66 (1–2), pp. 243–256, 2013. See
page 240.
[527] S. Gangopadhyay, P. H. Keskar, and S. Maitra. Patterson–Wiedemann construction revisited.
Discrete Mathematics 306, pp. 1540–1556, 2002 (selected papers from R. C. Bose Centennial
Symposium on Discrete Mathematics and Applications). See page 320.
[528] S. Gangopadhyay, B. Mandal, and P. Stănică. Gowers U3 norm of some classes of bent Boolean
functions. Designs, Codes and Cryptography 86 (5), pp. 1131–1148, 2018. See pages 473 and 474.
[529] S. Gangopadhyay, E. Pasalic, and P. Stănică. A note on generalized bent criteria for Boolean
functions. IEEE Transactions on Information Theory 59 (5), pp. 3233–3236, 2013. See page 266.
[530] G. Gao, Y. Guo, and Y. Zhao. Recent results on balanced symmetric Boolean functions. IEEE
Transactions on Information Theory 62 (9), pp. 5199–5203, 2015. See page 354.
[531] G. Gao, X. Zhang, W. Liu, and C. Carlet. Constructions of quadratic and cubic rotation symmetric
bent functions. IEEE Transactions on Information Theory 58, pp. 4908–4913, 2012. See pages 249
and 251.
[532] J. von zur Gathen and J. R. Roche. Polynomials with two values. Combinatorica 17 (3), pp. 345–
362, 1997. See pages 354 and 357.
[533] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 2013
(third edition). See page 21.
[534] S. Ge, Z. Wang, P. Luo, and M. Karpovsky. Reliable and secure memories based on algebraic
manipulation detection codes and robust error correction. Proceedings of Int. Depend Symp.
Citeseer, 2013. See page 453.
[535] S. Ge, Z. Wang, P. Luo, and M. Karpovsky. Secure memories resistant to both random errors and
fault injection attacks using nonlinear error correction codes. Proceedings of HASP 2013, ACM
2013, pp. 1–8, 2013. See page 452.
[536] C. Gentry. Fully homomorphic encryption using ideal lattices. Proceedings of ACM STOC 2009,
pp. 169–178, 2009. See page 453.
[537] C. Gentry, A. Sahai, and B. Waters. Homomorphic encryption from learning with errors:
conceptually-simpler, asymptotically-faster, attribute-based. Proceedings of CRYPTO 2013, Part
I, Lecture Notes in Computer Science 8042, pp. 75–92, 2013. See pages 453 and 454.
[538] R. Gode and S. Gangopadhyay. Third-order nonlinearities of a subclass of Kasami functions.
Cryptography and Communications 2, pp. 69–83, 2010. See page 85.
[539] C. Godsil and A. Roy. Two characterizations of crooked functions. IEEE Transactions on
Information Theory 54 (2), pp. 864–866, 2008. See page 279.
[540] R. Gold. Maximal recursive sequences with 3-valued recursive crosscorrelation functions. IEEE
Transactions on Information Theory 14, pp. 154–156, 1968. See pages 394, 400, and 401.
[541] O. Goldreich. Candidate one-way functions based on expander graphs. Electronic Colloquium on
Computational Complexity (ECCC) 7(90), 2000. See also IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2000/063, 2000. See page 467.
[542] O. Goldreich. Introduction to Property Testing. Cambridge University Press, 2017. See page 469.
[543] O. Goldreich and R. Izsak. Monotone circuits: one-way functions versus pseudorandom generators.
Theory of Computing 8 (1), pp. 231–238, 2012. See page 467.
[544] J. Golić. Fast low order approximation of cryptographic functions. Proceedings of EUROCRYPT
1996, Lecture Notes in Computer Science 1070, pp. 268–282, 1996. See page 83.
[545] J. Golić. On the security of nonlinear filter generators. Proceedings of Fast Software Encryption
FSE 1996, Lecture Notes in Computer Science 1039, pp. 173–188, 1996. See pages 89 and 344.
[546] F. Göloğlu. Almost bent and almost perfect nonlinear functions, exponential sums, geometries and
sequences. PhD dissertation, University of Magdeburg, 2009. See page 247.
[547] F. Göloğlu. Almost perfect nonlinear trinomials and hexanomials. Finite Fields and Their
Applications 33, pp. 258–282, 2015. See pages 168 and 408.
[548] F. Göloğlu and A. Pott. Results on the crosscorrelation and autocorrelation of sequences.
Proceedings of International Conference on Sequences and Their Applications SETA 2008, Lecture Notes
in Computer Science 5203, pp. 95–105, 2008. See pages 178, 232, and 396.
[549] S. W. Golomb. On the classification of Boolean functions. IEEE Transactions on Information
Theory 5 (5), pp. 176–186, 1959. See pages 52, 87, and 143.
[550] S. W. Golomb. Shift Register Sequences. Aegean Park Press, 1982. See pages 20 and 384.
[551] S. W. Golomb. Shift register sequences – a retrospective account. Proceedings of International
Conference on Sequences and Their Applications SETA 2006, Lecture Notes in Computer Science
4086, pp. 1–4, 2006. See page 384.
[552] S. W. Golomb and G. Gong. Signal Design for Good Correlation. Cambridge University Press,
2005. See page 384.
[553] G. Gong. Sequences, DFT and resistance against fast algebraic attacks. Proceedings of International
Conference on Sequences and Their Applications SETA 2008, Lecture Notes in Computer Science
5203, pp. 197–218, 2008. See page 94.
[554] G. Gong and S. W. Golomb. Transform domain analysis of DES. IEEE Transactions on Information
Theory 45 (6), pp. 2065–2073, 1999. See pages 243 and 244.
[555] G. Gong, T. Helleseth, and P. V. Kumar. Solomon W. Golomb – mathematician, engineer and
pioneer. IEEE Transactions on Information Theory 64 (4), pp. 2844–2857, 2018. See pages 52
and 384.
[556] G. Gong, S. Rønjom, T. Helleseth, and H. Hu. Fast discrete Fourier spectra attacks on stream
ciphers. IEEE Transactions on Information Theory 57 (8), pp. 5555–5565, 2011. See pages 95
and 96.
[557] K. Gopalakrishnan, D. G. Hoffman and D. R. Stinson. A note on a conjecture concerning symmetric
resilient functions. Information Processing Letters 47 (3), pp. 139–143, 1993. See pages 356
and 357.
[558] S. D. Gordon, Y. Ishai, T. Moran, R. Ostrovsky, and A. Sahai. On complete primitives for fairness.
Proceedings of Theory of Cryptography, TCC 2010, Lecture Notes in Computer Science 5978,
pp. 91–108, 2010. See page 452.
[559] M. Goresky and A. Klapper. Fibonacci and Galois representation of feedback with carry shift
registers. IEEE Transactions on Information Theory 48, pp. 2826–2836, 2002. See page 23.
[560] M. Goresky and A. Klapper. Periodicity and distribution properties of combined FCSR sequences.
Proceedings of International Conference on Sequences and Their Applications SETA 2006, Lecture
Notes in Computer Science 4086, pp. 334–341, 2006. See page 23.
[561] A. Gorodilova. On differential equivalence of APN functions. Cryptography and Communications
11 (4), pp. 793–813, 2019. Preliminary version in IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2017/907. See also: On a remarkable property of APN Gold functions, IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2016/286. See page 376.
[562] L. Goubin and A. Martinelli. Protecting AES with Shamir’s secret sharing scheme. Proceedings
of International Workshop Cryptographic Hardware and Embedded Systems CHES 2011, Lecture
Notes in Computer Science 6917, pp. 79–94, 2011. See pages 428 and 436.
[563] D. Goudarzi, A. Joux, and M. Rivain. How to securely compute with noisy leakage in quasi-
linear complexity. Proceedings of ASIACRYPT 2018, Lecture Notes in Computer Science 11273,
pp. 547–574, 2018. See page 431.
[564] D. Goudarzi, A. Martinelli, A. Passelegue, and T. Prest. Unifying leakage models on a Rényi day.
IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2019/138. See page 429.
[565] D. Goudarzi and M. Rivain. On the multiplicative complexity of Boolean functions and bitsliced
higher-order masking. Proceedings of International Workshop Cryptographic Hardware and
Embedded Systems CHES 2016, Lecture Notes in Computer Science 9813, pp. 457–478, 2016. See
page 434.
[566] A. Gouget. On the propagation criterion of Boolean functions. Proceedings of the Workshop on
Coding, Cryptography and Combinatorics 2003, Birkhäuser Verlag, pp. 153–168, 2004. See
page 355.
[567] A. Gouget and H. Sibert. Revisiting correlation-immunity in filter generators. Proceedings of SAC
2007, Lecture Notes in Computer Science 4876, pp. 378–395, 2007. See page 89.
[568] W. T. Gowers. A new proof of Szemerédi’s theorem. Geom. Funct. Anal. 11 (3), pp. 465–588, 2001.
See pages 469, 470, 471, and 473.
[569] W. T. Gowers and L. Milicevic. A quantitative inverse theorem for the U4 norm over finite fields,
2017 (https://arxiv.org/pdf/1712.00241.pdf). See page 470.
[570] M. Grassl. Code tables: bounds on the parameters of various types of codes. Available at
www.codetables.de/, Universität Karlsruhe. See pages 6 and 316.
[571] B. Green. Finite field models in additive combinatorics. Proceedings of British Combinatorial
Conference 2005, Surveys in Combinatorics, pp. 1–27, https://arxiv.org/pdf/math/0409420.pdf,
2005. See pages 469 and 470.
[572] B. Green and T. Tao. An inverse theorem for the Gowers U3 norm (arXiv:math/0503014
[math.NT]), 2006. See page 473.
[573] B. Green and T. Tao. The distribution of polynomials over finite fields, with applications to the
Gowers norms. arXiv/0711.3191, 2007. See page 474.
[574] H. Gross and S. Mangard. Reconciling d + 1 masking in hardware and software. Proceedings of
International Workshop Cryptographic Hardware and Embedded Systems CHES 2017, Lecture
Notes in Computer Science 10529, pp. 115–136, 2017. See page 444.
[575] H. Gross, S. Mangard, and T. Korak. Domain-oriented masking: compact masked hardware
implementations with arbitrary protection order. Proceedings of the 2016 ACM Workshop on
Theory of Implementation Security (TIS) and IACR Cryptology ePrint Archive (http://eprint.iacr.org/)
2016/486, 2016. See page 444.
[576] L. Grover. A fast quantum mechanical algorithm for database search. Proceedings of ACM STOC
1996, pp. 212–219, 1996 (also Bell Labs, New Jersey, Tech. Rep., 1996). See page 3.
[577] S. Guilley, A. Heuser, and O. Rioul. Codes for side-channel attacks and protections. Proceedings of
C2SI, Lecture Notes in Computer Science 10194, pp. 35–55, 2017. See page 429.
[578] P. Guillot. Partial bent functions. Proceedings of the World Multiconference on Systemics, Cyber-
netics and Informatics, SCI 2000, 2000. See page 258.
[579] P. Guillot. Completed GPS covers all bent functions. Journal of Combinatorial Theory, Series A
93, pp. 242–260, 2001. See pages 69 and 242.
[580] K. Gupta and P. Sarkar. Improved construction of nonlinear resilient S-boxes. Proceedings of
ASIACRYPT 2002, Lecture Notes in Computer Science 2501, pp. 466–483, 2002. See pages 315
and 316.
[581] K. Gupta and P. Sarkar. Construction of perfect nonlinear and maximally nonlinear multiple-
output Boolean functions satisfying higher order strict avalanche criteria. IEEE Transactions on
Information Theory 50, pp. 2886–2894, 2004. See page 318.
[582] K. C. Gupta and P. Sarkar. Computing partial Walsh transform from the algebraic normal form of
a Boolean function. IEEE Transactions on Information Theory 55 (3), pp. 1354–1359, 2009. See
page 57.
[583] V. Guruswami and M. Wootters. Repairing Reed–Solomon codes. IEEE Transactions on Informa-
tion Theory 63 (9), pp. 5684–5698, 2017. See page 146.
[584] S. Halevi and V. Shoup. Algorithms in HeLib. Proceedings of CRYPTO 2014, Lecture Notes in
Computer Science 8616, pp. 554–571, 2014. See page 454.
[585] R. W. Hamming. Error detecting and error correcting codes. The Bell System Technical Journal 29
(2), pp. 147–160, 1950. See page 5.
[586] A. R. Hammons Jr., P. V. Kumar, A. R. Calderbank, N. J. A. Sloane, and P. Solé. The $\mathbb{Z}_4$-linearity
of Kerdock, Preparata, Goethals and related codes. IEEE Transactions on Information Theory 40,
pp. 301–319, 1994. See page 255.
[587] H. Han and C. Tang. New classes of even-variable Boolean functions with optimal algebraic
immunity and very high nonlinearity. International Journal of Advanced Computer Technology
5 (2), pp. 419–428, 2013. See page 340.
[588] M. A. Harrison. On the classification of Boolean functions by the general linear and affine groups.
Journal of the Society for Industrial and Applied Mathematics 12 (2), pp. 285–299, 1964. See
page 143.
[589] P. Hawkes and L. O’Connor. XOR and Non-XOR differential probabilities. Proceedings of
EUROCRYPT 1999, Lecture Notes in Computer Science 1592, pp. 272–285, 1999. See page 136.
[590] P. Hawkes and G. Rose. Rewriting variables: the complexity of fast algebraic attacks on stream
ciphers. Proceedings of CRYPTO 2004, Lecture Notes in Computer Science 3152, pp. 390–406,
2004. See page 94.
[591] A. S. Hedayat, N. J. A. Sloane, and J. Stufken. Orthogonal Arrays, Theory and Applications.
Springer Series in Statistics, 1999. See pages 87, 303, and 304.
[592] T. Helleseth. Some results about the cross-correlation function between two maximal linear
sequences. Discrete Mathematics 16 (3), pp. 209–232, 1976. See pages 72 and 372.
[593] T. Helleseth. Open problems on the cross-correlation of m-sequences. Proceedings of the Conference
Open Problems in Mathematical and Computational Sciences, September 18–20, 2013, Istanbul,
Turkey. Springer, pp. 163–179, 2014. See page 384.
[594] T. Helleseth and A. Kholosha. On the equation $x^{2^l+1} + x + a = 0$ over $GF(2^k)$. Finite Fields and
Their Applications 14 (1), pp. 159–176, 2008. See page 495.
[595] T. Helleseth and A. Kholosha. $x^{2^l+1} + x + a$ and related affine polynomials over $GF(2^k)$.
Cryptography and Communications 2 (1), pp. 85–109, 2010. See page 495.
[596] T. Helleseth, A. Kholosha, and S. Mesnager. Niho bent functions and Subiaco hyperovals.
Proceedings of the 10-th International Conference on Finite Fields and Their Applications (Fq’10),
Contemporary Mathematics, 579, pp. 91–101, 2012. See pages 221 and 222.
[597] T. Helleseth, T. Kløve, and J. Mykkelveit. On the covering radius of binary codes. IEEE
Transactions on Information Theory 24 (5), pp. 627–628, 1978. See page 81.
[598] T. Helleseth and P. V. Kumar. Sequences with low correlation. Handbook of Coding Theory, V.
Pless and W. C. Huffman, eds. Elsevier, vol. II, pp. 1765–1854, 1998. See page 384.
[599] T. Helleseth and H. F. Mattson Jr. On the cosets of the simplex code. Discrete Mathematics 56,
pp. 169–189, 1985. See page 262.
[600] T. Helleseth and S. Rønjom. Simplifying algebraic attacks with univariate analysis. Proceedings of
Information Theory and Applications Workshop, ITA 2011, San Diego, California, USA, February
6–11, 2011, pp. 153–159, 2011. See pages 95, 326, and 327.
[601] T. Helleseth and V. Zinoviev. On $\mathbb{Z}_4$-linear Goethals codes and Kloosterman sums. Designs, Codes
and Cryptography 17, pp. 269–288, 1999. See page 402.
[602] Z. Heng, C. Ding, and Z. Zhou. Minimal linear codes over finite fields. Finite Fields and Their
Applications 54, pp. 176–196, 2018. See page 149.
[603] F. Hernando and G. McGuire. Proof of a conjecture on the sequence of exceptional numbers,
classifying cyclic codes and APN functions. Journal of Algebra 343 (1), pp. 78–92, 2011. See
page 404.
[604] D. Hertel and A. Pott. Two results on maximum nonlinear functions. Designs, Codes and
Cryptography 47 (1–3), pp. 225–235, 2008. See page 418.
[605] S. Hirose and K. Ikeda. Complexity of Boolean functions satisfying the propagation criterion.
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 78
(4), pp. 470–478, 1995. Presented at 1995 Symposium on Cryptography and Information Security,
SCIS95-B3.3, 1995. See page 318.
[606] J. J. Hoch and A. Shamir. Fault analysis of stream ciphers. Proceedings of International Workshop
Cryptographic Hardware and Embedded Systems CHES 2004, Lecture Notes in Computer Science
3156, Springer, pp. 240–253, 2004. See page 427.
[607] R. Hofer and A. Winterhof. r-th order nonlinearity, correlation measure and least significant bit of
the discrete logarithm. Cryptography and Communications 11 (5), pp. 993–998, 2019. See page 85.
[608] H. Hollmann and Q. Xiang. A proof of the Welch and Niho conjectures on crosscorrelations of
binary m-sequences. Finite Fields and Their Applications 7, pp. 253–286, 2001. See page 395.
[609] K. Horadam. Hadamard Matrices and Their Applications. Princeton University Press, 2006. See
page 53.
[610] X.-D. Hou. Some results on the covering radii of Reed–Muller codes. IEEE Transactions on
Information Theory 39 (2), pp. 366–378, 1993. See page 158.
[611] X.-D. Hou. Classification of cosets of the Reed–Muller code R(m − 3, m). Discrete Mathematics,
128, pp. 203–224, 1994. See page 155.
[612] X.-D. Hou. The covering radius of R(1, 9) in R(4, 9). Designs, Codes and Cryptography 8 (3),
pp. 285–292, 1995. See page 158.
[613] X.-D. Hou. AGL(m, 2) acting on R(r, m)/R(s, m). Journal of Algebra 171, pp. 921–938, 1995.
See pages 144 and 155.
[614] X.-D. Hou. Covering radius of the Reed–Muller code R(1, 7) – a simpler proof. Journal of
Combinatorial Theory, Series A 74, pp. 337–341, 1996. See page 158.
[615] X.-D. Hou. GL(m, 2) acting on R(r, m)/R(r −1, m). Discrete Mathematics 149, pp. 99–122, 1996.
See page 155.
[616] X.-D. Hou. On the covering radius of R(1, m) in R(3, m). IEEE Transactions on Information Theory
42 (3), pp. 1035–1037, 1996. See pages 108 and 158.
[617] X.-D. Hou. On the norm and covering radius of the first-order Reed–Muller codes. IEEE
Transactions on Information Theory 43 (3), pp. 1025–1027, 1997. See pages 81 and 157.
[618] X.-D. Hou. Cubic bent functions. Discrete Mathematics 189, pp. 149–161, 1998. See page 208.
[619] X.-D. Hou. On the coefficients of binary bent functions. Proc. Amer. Math. Soc. 128 (4),
pp. 987–996, 2000. See page 200.
[620] X.-D. Hou. New constructions of bent functions. Proceedings of the International Conference
on Combinatorics, Information Theory and Statistics; Journal of Combinatorics, Information and
System Sciences 25 (1–4), pp. 173–189, 2000. See pages 200, 234, and 236.
[621] X.-D. Hou. On binary resilient functions. Designs, Codes and Cryptography 28 (1), pp. 93–112,
2003. See page 287.
[622] X.-D. Hou. Group actions on binary resilient functions. Applicable Algebra in Engineering,
Communication and Computing (AAECC) 14 (2), pp. 97–115, 2003. See page 88.
[623] X.-D. Hou. A note on the proof of a theorem of Katz. Finite Fields and Their Applications 11,
pp. 316–319, 2005. See page 156.
[624] X.-D. Hou. Affinity of permutations of $\mathbb{F}_2^n$. Proceedings of Workshop on Coding and Cryptography
WCC 2003, pp. 273–280, 2003. Completed version in Discrete Applied Mathematics 154 (2),
pp. 313–325, 2006. See page 411.
[625] X.-D. Hou. Explicit evaluation of certain exponential sums of binary quadratic functions. Finite
Fields and Their Applications 13, pp. 843–868, 2007. See pages 173, 174, 176, and 178.
[626] X.-D. Hou. Classification of self dual quadratic bent functions. Designs, Codes and Cryptography
63 (2), pp. 183–198, 2012. See pages 198, 199, and 476.
[627] X.-D. Hou and P. Langevin. Results on bent functions. Journal of Combinatorial Theory, Series A
80, pp. 232–246, 1997. See pages 195 and 235.
[628] X.-D. Hou, G. L. Mullen, J. A. Sellers, and J. Yucas. Reversed Dickson polynomials over finite
fields. Finite Fields and Their Applications 15, pp. 748–773, 2009. See page 389.
[629] H. Hu and D. Feng. On quadratic bent functions in polynomial forms. IEEE Transactions on
Information Theory 53, pp. 2610–2615, 2007. See pages 178, 206, 230, and 231.
[630] H. Hu and G. Gong. Periods on two kinds of nonlinear feedback shift registers with time
varying feedback functions. International Journal of Foundations of Computer Science 22 (6),
pp. 1317–1329, 2011. See page 23.
[631] H. Huang. Induced subgraphs of hypercubes and a proof of the sensitivity conjecture. Annals of
Mathematics 190 (3), pp. 949–955, 2019. See also arXiv preprint arXiv:1907.00847, 2019. See
page 320.
[632] D. Huang, C. Tang, Y. Qi, and M. Xu. New quadratic bent functions in polynomial forms with
coefficients in extension fields. Applicable Algebra in Engineering, Communication and Computing
(AAECC) 30, pp. 333–347, 2019. See also C. Tang, Y. Qi and M. Xu, IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2013/405, 2013. See page 206.
[633] J. Y. Hyun, H. Lee, and Y. Lee. MacWilliams duality and Gleason-type theorem on self-dual bent
functions. Designs, Codes and Cryptography 63 (3), pp. 295–304, 2012. See page 196.
[634] J. Y. Hyun, H. Lee, and Y. Lee. Boolean functions with MacWilliams duality. Designs, Codes and
Cryptography 72 (2), pp. 273–287, 2014. See page 196.
[635] K. Ireland and M. Rosen. A Classical Introduction to Modern Number Theory. Graduate Texts in
Mathematics (Book 84) (second edition). Springer-Verlag, 2010. See page 486.
[636] Y. Ishai, E. Kushilevitz, R. Ostrovsky, and A. Sahai. Cryptography with constant computational
overhead. Proceedings of ACM STOC 2008, pp. 433–442, ACM Press, 2008. See page 467.
[637] Y. Ishai, A. Sahai, and D. Wagner. Private circuits: securing hardware against probing attacks.
Proceedings of CRYPTO 2003, Lecture Notes in Computer Science 2729, pp. 463–481, 2003. See
pages 428 and 430.
[638] T. Iwata and K. Kurosawa. Probabilistic higher order differential attack and higher order bent
functions. Proceedings of ASIACRYPT 1999, Lecture Notes in Computer Science 1716, pp. 62–74,
1999. See pages 83, 85, and 114.
[639] T. Jakobsen and L. R. Knudsen. The interpolation attack on block ciphers. Proceedings of Fast
Software Encryption FSE 1997, Lecture Notes in Computer Science 1267, pp. 28–40, 1997. See
page 142.
[640] C. J. A. Jansen and D. E. Boekee. The shortest feedback shift register that can generate a given
sequence. Proceedings of CRYPTO 1989, Lecture Notes in Computer Science 435, pp. 90–99, 1990.
See page 23.
[641] H. Janwa and R. Wilson. Hyperplane sections of Fermat varieties in $P^3$ in char. 2 and some
applications to cyclic codes. Proceedings of AAECC-10 Conference, Lecture Notes in Computer
Science 673, pp. 180–194, 1993. See pages 378 and 400.
[642] H. Janwa, G. McGuire, and R. Wilson. Double-error-correcting codes and absolutely irreducible
polynomials over GF(2). Journal of Algebra 178, pp. 665–676, 1995. See pages 378 and 404.
[643] D. Jedlicka. APN monomials over $GF(2^n)$ for infinitely many n. Finite Fields and Their
Applications 13, pp. 1006–1028, 2007. See page 136.
[644] Q. Jin, Z. Liu, B. Wu, and X. Zhang. A general conjecture similar to T-D conjecture and its
applications in constructing Boolean functions with optimal algebraic immunity. IACR Cryptology
ePrint Archive (http://eprint.iacr.org/) 2011/515, 2011. See page 340.
[645] T. Johansson and F. Jönsson. Improved fast correlation attack on stream ciphers via convolutional
codes. Proceedings of EUROCRYPT 1999, Lecture Notes in Computer Science 1592, pp. 347–362,
1999. See page 78.
[646] T. Johansson and F. Jönsson. Fast correlation attacks based on turbo code techniques. Proceedings
of CRYPTO 1999, Lecture Notes in Computer Science 1666, pp. 181–197, 1999. See page 78.
[647] T. Johansson and F. Jönsson. Fast correlation attacks through reconstruction of linear polynomials.
Proceedings of CRYPTO 2000, Lecture Notes in Computer Science 1880, pp. 300–315, 2000.
See page 78.
[648] T. Johansson and E. Pasalic. A construction of resilient functions with high nonlinearity. Proceed-
ings of the IEEE International Symposium on Information Theory, Sorrento, Italy, pp. 494–501,
2000. See pages 314 and 316.
[649] N. Johnson, V. Jha, and M. Biliotti. Handbook of Finite Translation Planes. Pure and Applied
Mathematics 289. Chapman & Hall/CRC, 2007. See pages 224 and 225.
[650] F. Jönsson. Some results on fast correlation attacks. PhD thesis, Lund University, 2002. See page 80.
[651] D. Jungnickel and A. Pott. Difference sets: an introduction. In Difference Sets, Sequences and Their
Autocorrelation Properties. A. Pott, P. V. Kumar, T. Helleseth, and D. Jungnickel, eds. Kluwer,
pp. 259–295, 1999. See page 190.
[652] J. Kahn, G. Kalai, and N. Linial. The influence of variables on Boolean functions. Proceedings
of 29th Annual Symposium on Foundations of Computer Science (IEEE), pp. 68–80, 1988. See
page 68.
[653] G. Kalai. Boolean functions: influence, threshold and noise. European Congress of Mathematics,
pp. 85–110, 2018. See page 68.
[654] N. Kaleyski. Changing APN functions at two points. Special Issue on Boolean Functions and Their
Applications 2018, Cryptography and Communications 11 (6), pp. 1165–1184, 2019. See page 373.
[655] N. Kaleyski. An update on known invariants of vectorial Boolean functions. Proceedings of
International Workshop on Signal Design and its Applications in Communications (IWSDA) 2019,
pp. 1–3, 2019. See page 392.
[656] W. M. Kantor. Symplectic groups, symmetric designs, and line ovals. Journal of Algebra 33,
pp. 43–58, 1975. See page 201.
[657] W. M. Kantor. An exponential number of generalized Kerdock codes. Information and Control 53,
pp. 74–80, 1982. See page 255.
[658] W. M. Kantor. Spreads, translation planes and Kerdock sets II. SIAM Journal on Algebraic and
Discrete Methods 3, pp. 308–318, 1982. See page 255.
[659] W. M. Kantor. Exponential numbers of two-weight codes, difference sets and symmetric designs.
Discrete Mathematics 46 (1), pp. 95–98, 1983. See pages 192 and 226.
[660] W. M. Kantor. Commutative semifields and symplectic spreads. Journal of Algebra 270,
pp. 96–114, 2003. See page 217.
[661] W. M. Kantor. Finite semifields. Finite Geometries, Groups, and Computation (Proc. of Conf. at
Pingree Park, CO Sept. 2005), pp. 103–114, de Gruyter, 2006. See pages 224 and 225.
[662] W. M. Kantor. Bent functions generalizing Dillon’s partial spread functions. arXiv:1211.2600,
2012. See page 225.
[663] W. M. Kantor. Bent functions and spreads. (https://pages.uoregon.edu/kantor/PAPERS/Bent+spreadsFinal.pdf),
(not meant to be published), 2015. See page 217.
[664] M. G. Karpovsky, K. J. Kulikowski, and Z. Wang. Robust error detection in communication
and computational channels. Proceedings of Spectral Methods and Multirate Signal Processing,
SMMSP2007. Citeseer, 2007. See page 447.
[665] M. G. Karpovsky, K. J. Kulikowski, and Z. Wang. On-line self error detection with equal protection
against all errors. Int. J. High. Reliab. Electron. Syst. Des., 2008. See page 448.
[666] M. G. Karpovsky and P. Nagvajara. Optimal codes for the minimax criterion on error detection.
IEEE Transactions on Information Theory 35 (6), pp. 1299–1305, 1989. See pages 447, 448,
and 449.
[667] M. G. Karpovsky and A. Taubin. A new class of nonlinear systematic error detecting codes. IEEE
Transactions on Information Theory 50 (8), pp. 1818–1820, 2004. See pages 447 and 448.
[668] M. Karpovsky and Z. Wang. Design of strongly secure communication and computation channels
by nonlinear error detecting codes. IEEE Transactions on Computers 63 (11), pp. 2716–2728, 2014.
See page 452.
[669] T. Kasami. The weight enumerators for several classes of subcodes of the second order binary
Reed–Muller codes. Information and Control 18, pp. 369–394, 1971. See pages 230, 394, 401,
and 418.
[670] T. Kasami and N. Tokura. On the weight structure of the Reed–Muller codes. IEEE Transactions
on Information Theory 16, pp. 752–759, 1970. See pages 157 and 181.
[671] T. Kasami, N. Tokura, and S. Azumi. On the weight enumeration of weights less than 2.5d of
Reed–Muller Codes. Information and Control 30, pp. 380–395, 1976. See page 157.
[672] C. Kaşıkcı, W. Meidl, and A. Topuzoğlu. Spectra of a class of quadratic functions: average
behavior and counting functions. Cryptography and Communications 8 (2), pp. 191–214, 2016.
See page 178.
[673] D. J. Katz. Weil sums of binomials, three-level cross-correlation, and a conjecture of Helleseth.
Journal of Combinatorial Theory, Series A 119 (8), pp. 1644–1659, 2012. See page 73.
[674] D. J. Katz. Divisibility of Weil sums of binomials. Proc. Amer. Math. Soc. 143 (11), pp. 4623–4632,
2015. See page 65.
[675] D. J. Katz and P. Langevin. New open problems related to old conjectures by Helleseth.
Cryptography and Communications 8 (2), pp. 175–189, 2016. See page 73.
[676] D. J. Katz, P. Langevin, S. Lee, and Y. Sapozhnikov. The p-adic valuations of Weil sums of
binomials. Journal of Number Theory 181, pp. 1–26, 2017. See page 65.
[677] N. Katz. On a theorem of Ax. American Journal of Mathematics 93, pp. 485–499, 1971. See
page 156.
[678] T. Kaufman and S. Lovett. New extension of the Weil bound for character sums with applications
to coding. Proceedings of IEEE 52nd Annual Symposium on Foundations of Computer Science,
pp. 788–796, 2011. See page 188.
[679] T. Kaufman, S. Lovett, and E. Porat. Weight distribution and list-decoding size of Reed–Muller
codes. IEEE Transactions on Information Theory 58 (5), pp. 2689–2696, 2012. See page 156.
[680] S. Kavut. Results on rotation-symmetric s-boxes. Information Sciences 201, pp. 93–113, 2012. See
page 362.
[681] S. Kavut. Correction to the paper: Patterson–Wiedemann construction revisited. Discrete Applied
Mathematics 202, pp. 185–187, 2016. See page 320.
[682] S. Kavut and S. Baloğlu. Results on symmetric S-boxes constructed by concatenation of RSSBs.
Cryptography and Communications 11 (4), pp. 641–660, 2019. See page 145.
[683] S. Kavut, S. Maitra, and D. Tang. Searching balanced Boolean functions on even number of
variables with excellent autocorrelation profile. Designs, Codes and Cryptography 87 (2–3),
pp. 261–276, 2019. See page 320.
[684] S. Kavut, S. Maitra, and M. D. Yücel. Search for Boolean functions with excellent profiles in the
rotation symmetric class. IEEE Transactions on Information Theory 53 (5), pp. 1743–1751, 2007.
See pages 81, 157, 320, and 360.
[685] S. Kavut and M. D. Yücel. Generalized rotation symmetric and dihedral symmetric Boolean
functions–9 variable Boolean functions with nonlinearity 242. Applied Algebra, Algebraic Algo-
rithms and Error-Correcting Codes. Springer Berlin Heidelberg, pp. 321–329, 2007. See pages 82
and 362.
[686] S. Kavut and M. D. Yücel. 9-variable Boolean functions with nonlinearity 242 in the generalized
rotation symmetric class. Information and Computation 208 (4), pp. 341–350, 2010. See pages 81,
157, 158, and 360.
[687] European Telecommunications Standards Institute. Technical Specification 135 202 V9.0.0: uni-
versal mobile telecommunications system (UMTS); LTE; specification of the 3GPP confidentiality
and integrity algorithms; Document 2: KASUMI specification (3GPP TS 35.202 V9.0.0 Release 9).
See page 410.
[688] A. Kerckhoffs. La Cryptographie Militaire. Journal des Sciences Militaires, 1883. See page 1.
[689] A. M. Kerdock. A class of low-rate non linear codes. Information and Control 20, pp. 182–187,
1972. See pages 177 and 254.
[690] O. Keren, I. Shumsky, and M. G. Karpovsky. Robustness of security-oriented binary codes under
non-uniform distribution of codewords. Proceedings of 6th Int. Conf. on Dependability 2013,
pp. 25–30, 2013. See page 450.
[691] J. D. Key, T. P. McDonough, and V. C. Mavron. Information sets and partial permutation decoding
for codes from finite geometries. Finite Fields and Their Applications 12 (2), pp. 232–247, 2006.
See page 335.
[692] A. V. Khalyavin, M. S. Lobanov, and Y. V. Tarannikov. On plateaued Boolean functions with the
same spectrum support. Sib. Elektron. Mat. Izv. 13, pp. 1346–1368, 2016. See pages 259 and 264.
[693] J. Kahn, G. Kalai, and N. Linial. The influence of variables on Boolean functions. IEEE 29th Symp.
on Foundations of Computer Science, pp. 68–80, 1988. See page 59.
[694] M. A. Khan and F. Özbudak. Improvement in non-linearity of Carlet–Feng infinite class of Boolean
functions. Cryptology and Network Security, Lecture Notes in Computer Science 7712, pp. 280–295,
2012. See pages 144 and 335.
[695] A. Kholosha and A. Pott. Bent and related functions. Handbook of Finite Fields, CRC Press Book,
Subsection 9.3, pp. 262–273, 2013. See page 197.
[696] K. Khoo and G. Gong. New constructions for resilient and highly nonlinear Boolean functions.
Proceedings of 8th Australasian Conference, ACISP 2003, Lecture Notes in Computer Science
2727, pp. 498–509, 2003. See pages 81, 299, and 317.
[697] K. Khoo, G. Gong, and D. R. Stinson. A new family of Gold-like sequences. Proceedings of IEEE
International Symposium on Information Theory (ISIT) 2002, p. 181, 2002. See page 178.
[698] K. Khoo, G. Gong, and D. Stinson. Highly nonlinear S-boxes with reduced bound on maximum
correlation. Proceedings of 2003 IEEE International Symposium on Information Theory. 2003.
www.cacr.math.uwaterloo.ca/techreports/2003/corr2003-12.ps. See page 133.
[699] K. Khoo, G. Gong, and D. Stinson. A new characterization of semi-bent and bent functions on
finite fields. Designs, Codes and Cryptography 38 (2), pp. 279–295, 2006. See pages 178, 206, 230,
and 362.
[700] K. H. Kim and S. Mesnager. Solving $x^{2^k+1} + x + a = 0$ in $\mathbb{F}_{2^n}$ with gcd(n, k) = 1. IACR Cryptology
ePrint Archive (http://eprint.iacr.org/) 2019/307, 2019. See page 495.
[701] S. H. Kim and J. S. No. New families of binary sequences with low correlation. IEEE Transactions
on Information Theory 49 (11), pp. 3059–3065, 2003. See page 230.
[702] K. Kjeldsen. On the cycle structure of a set of nonlinear shift registers with symmetric feedback
functions. Journal of Combinatorial Theory, Series A 20 (2), pp. 154–169, 1976. See page 23.
[703] A. Klapper and M. Goresky. Feedback shift registers, 2-adic span, and combiners with memory.
Journal of Cryptology 10, pp. 111–147, 1997. See page 23.
[704] A. Klapper and M. Goresky. Arithmetic correlations and Walsh transforms. IEEE Transactions on
Information Theory 58 (1), pp. 479–492, 2012. See page 57.
[705] A. Klimov and A. Shamir. Cryptographic applications of T-functions. Proceedings of Selected Areas
in Cryptography 2003, Lecture Notes in Computer Science 3006, pp. 248–261, 2004. See page 26.
[706] L. Knudsen. Truncated and higher order differentials. Proceedings of Fast Software Encryption FSE
1995, Lecture Notes in Computer Science 1008, pp. 196–211, 1995. See pages 114, 136, and 142.
[707] L. R. Knudsen and M. P. J. Robshaw. Non-linear approximations in linear cryptanalysis. Proceed-
ings of EUROCRYPT 1996, Lecture Notes in Computer Science 1070, pp. 224–236, 1996. See
page 83.
[708] L. R. Knudsen and M. P. J. Robshaw. The Block Cipher Companion. Information Security and
Cryptography. Springer, 2011. See page 26.
[709] L. R. Knudsen and D. Wagner. Integral cryptanalysis. Proceedings of Fast Software Encryption FSE
2002, Lecture Notes in Computer Science 2365, pp. 112–127, 2002. See page 114.
[710] D. E. Knuth. Finite semifields and projective planes. Journal of Algebra 2, pp. 182–217, 1965. See
page 226.
[711] N. Koçak, S. Mesnager, and F. Özbudak. Bent and semi-bent functions via linear translators.
Proceedings of IMA Conference on Cryptography and Coding 2015, Lecture Notes in Computer
Science 9496, pp. 205–224, 2015. See pages 237 and 263.
[712] P. Kocher. Timing attacks on implementations of Diffie–Hellman, RSA, DSS, and other systems.
Proceedings of CRYPTO 1996, Lecture Notes in Computer Science 1109, pp. 104–113, 1996. See
page 425.
[713] P. Kocher, J. Jaffe, and B. Jun. Differential power analysis. Proceedings of CRYPTO 1999, Lecture
Notes in Computer Science 1666, pp. 388–397, 1999. See pages 425, 426, and 427.
[714] N. Kolokotronis and K. Limniotis. Maiorana–McFarland Functions with high second-order
nonlinearity. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2011/212. See pages 85 and 158.
[715] N. Kolokotronis, K. Limniotis, and N. Kalouptsidis. Best affine and quadratic approximations
of particular classes of Boolean functions. IEEE Transactions on Information Theory 55 (11),
pp. 5211–5222, 2009. See pages 157 and 158.
[716] N. A. Kolomeec. The graph of minimal distances of bent functions and its properties. Designs,
Codes and Cryptography 85 (3), pp. 1–16, 2017. See page 205.
[717] K. J. Kulikowski, M. G. Karpovsky, and A. Taubin. Robust codes and robust, fault-tolerant
architectures of the advanced encryption standard. Journal of Systems Architecture 53 (2–3),
pp. 139–149, 2007. See pages 448, 449, and 450.
[718] P. V. Kumar, R. A. Scholtz, and L. R. Welch. Generalized bent functions and their properties. Journal
of Combinatorial Theory, Series A 40, pp. 90–107, 1985. See page 193.
[719] K. Kurosawa, T. Iwata, and T. Yoshiwara. New covering radius of Reed–Muller codes for t-resilient
functions. Proceedings of Selected Areas in Cryptography, 8th Annual International Workshop,
Lecture Notes in Computer Science 2259, pp. 75 ff, 2001, and IEEE Transactions on Information
Theory 50, pp. 468–475, 2004. See page 287.
[720] K. Kurosawa, T. Johansson, and D. Stinson. Almost k-wise independent sample spaces and their
applications. Journal of Cryptology 14 (4), pp. 231–253, 2001. See page 290.
[721] K. Kurosawa and R. Matsumoto. Almost security of cryptographic Boolean functions. IEEE
Transactions on Information Theory 50 (11), pp. 2752–2761, 2004. See pages 97 and 290.
[722] K. Kurosawa and T. Satoh. Design of SAC/$PC(\ell)$ of order k Boolean functions and three other
cryptographic criteria. Proceedings of EUROCRYPT 1997, Lecture Notes in Computer Science
1233, pp. 434–449, 1997. See pages 319 and 320.
[723] K. Kurosawa, T. Satoh, and K. Yamamoto. Highly nonlinear t-resilient functions. Journal of
Universal Computer Science 3 (6), pp. 721–729, 1997. See pages 314 and 316.
[724] S. Kutzner, P. Ha Nguyen, and A. Poschmann. Enabling 3-share threshold implementations for all
4-bit s-boxes. Proceedings of ICISC 2013, Lecture Notes in Computer Science 8565, pp. 91–108,
2013. See page 441.
[725] A. S. Kuzmin, V. T. Markov, A. A. Nechaev, and A. B. Shishkov. Approximation of Boolean
functions by monomial ones. Discrete Mathematics and Applications 16 (1), pp. 7–28, 2006. See
page 246.
[726] G. Kyureghyan. Differentially affine maps. Proceedings of the Workshop on Coding and Cryptog-
raphy, WCC 2005, pp. 296–305, 2005. See page 373.
[727] G. Kyureghyan. Crooked maps in finite fields. 2005 European Conference on Combinatorics,
Graph Theory and Applications (EuroComb ’05), Discrete Mathematics & Theoretical Computer
Science Proceedings, pp. 167–170, 2005. See pages 278 and 373.
[728] G. Kyureghyan. The only crooked power functions are $x^{2^k+2^l}$. European Journal of Combinatorics
28 (4), pp. 1345–1350, 2007. See pages 278 and 373.
[729] G. Kyureghyan. Crooked maps in $\mathbb{F}_{2^n}$. Finite Fields and Their Applications 13 (3), pp. 713–726,
2007. See pages 278 and 373.
[730] G. Kyureghyan. Special mappings of finite fields. Finite Fields and Their Applications, Radon Ser.
Comput. Appl. Math. 11, pp. 117–144, De Gruyter, Berlin, 2013. See page 278.
[731] G. Kyureghyan and V. Suder. On inversion in $\mathbb{Z}_{2^n-1}$. Finite Fields and Their Applications 25,
pp. 234–254, 2014. See pages 177 and 394.
[732] P. Lacharme. Post processing functions for a physical random number generator. Proceedings of
Fast Software Encryption FSE 2008, Lecture Notes in Computer Science 5086, pp. 334–342, 2008.
See page 290.
[733] G. Lachaud and J. Wolfmann. The weights of the orthogonals of the extended quadratic binary
Goppa codes. IEEE Transactions on Information Theory 36, pp. 686–692, 1990. See pages 72, 215,
and 402.
[734] J. Lahtonen, G. McGuire, and H. Ward. Gold and Kasami–Welch functions, quadratic forms and
bent functions. Advances in Mathematics of Communications 1, pp. 243–250, 2007. See pages 178,
231, 395, and 396.
[735] X. Lai. Higher order derivatives and differential cryptanalysis. Proceedings of the “Symposium on
Communication, Coding and Cryptography”, in honor of J. L. Massey on the Occasion of his 60’th
birthday, pp. 227–233, 1994. See pages 38, 114, and 142.
[736] X. Lai. Additive and linear structures of cryptographic functions. Proceedings of Fast Software
Encryption FSE 1995, Lecture Notes in Computer Science 1008, pp. 75–85, 1995. See page 100.
[737] P. Langevin. Covering radius of RM(1, 9) in RM(3, 9). Eurocode 1990, Lecture Notes in Computer
Science 514, pp. 51–59, 1991. See page 63.
[738] P. Langevin. On the orphans and covering radius of the Reed–Muller codes. Proceedings of AAECC-
9 Conference, Lecture Notes in Computer Science 539, pp. 234–240, 1991. See page 262.
[739] P. Langevin and G. Leander. Classification of Boolean quartic forms in eight variables. NATO
Science for Peace and Security Series – D: Information and Communication Security, IOS Press,
vol. 18: Boolean Functions in Cryptology and Information Security, pp. 139–147, 2008. See
page 144.
[740] P. Langevin and G. Leander. Monomial bent functions and Stickelberger’s theorem. Finite Fields
and Their Applications 14 (3), pp. 727–742, 2008. See page 156.
[741] P. Langevin and G. Leander. Counting all bent functions in dimension eight
99270589265934370305785861242880. Designs, Codes and Cryptography 59 (1–3), pp. 193–205,
2011 (see also the proceedings of WCC 2009). See pages 144 and 208.
[742] P. Langevin, G. Leander, G. McGuire, and E. Zalinescu. Analysis of Kasami–Welch functions in
odd dimension using Stickelberger’s theorem. Journal of Combinatorics and Number Theory 2 (1),
pp. 55–72, 2010. See page 396.
[743] P. Langevin, G. Leander, P. Rabizzoni, P. Veron, and J.-P. Zanotti. Webpage
http://langevin.univ-tln.fr/project/quartics/. See pages 208 and 243.
[744] P. Langevin, P. Rabizzoni, P. Veron, and J.-P. Zanotti. On the number of bent functions with 8
variables. Proceedings of the Conference BFCA 2006, Publications des universités de Rouen et du
Havre, pp. 125–136, 2007. See page 243.
[745] P. Langevin and P. Solé. Kernels and defaults. (Proceedings of the Conference Finite Fields and
Applications Fq4) Contemporary Mathematics 225, pp. 77–85, 1999. See page 181.
[746] P. Langevin and P. Véron. On the nonlinearity of power functions. Designs, Codes and Cryptogra-
phy 37 (1), pp. 31–43, 2005. See pages 73 and 156.
[747] P. Langevin and J.-P. Zanotti. Nonlinearity of some invariant Boolean functions. Designs, Codes
and Cryptography 36, pp. 131–146, 2005. See page 81.
[748] C. Lauradoux and M. Videau. Matriochka symmetric Boolean functions. Proceedings of IEEE
International Symposium on Information Theory (ISIT) 2008, pp. 1631–1635, 2008. See page 362.
[749] G. Leander. Bent functions with $2^r$ Niho exponents. Proceedings of the Workshop on Coding and
Cryptography 2005, pp. 454–461, 2005. See pages 221 and 231.
[750] G. Leander. Monomial bent functions. Proceedings of the Workshop on Coding and Cryptogra-
phy 2005, Bergen, pp. 462–470, 2005. And IEEE Transactions on Information Theory 52 (2),
pp. 738–743, 2006. See pages 215, 230, and 246.
[751] G. Leander. Another class of non-normal bent functions. Proceedings of the Conference BFCA
2006, Publications des universités de Rouen et du Havre, pp. 87–98, 2006.
[752] G. Leander and A. Kholosha. Bent functions with $2^r$ Niho exponents. IEEE Transactions on
Information Theory 52 (12), pp. 5529–5532, 2006. See pages 221 and 231.
[753] G. Leander and P. Langevin. On exponents with highly divisible Fourier–Hadamard coefficients and
conjectures of Niho and Dobbertin. Proceedings of “The First Symposium on Algebraic Geometry
and Its Applications” Dedicated to Gilles Lachaud (SAGA’07), 2007, World Scientific, Series on
Number Theory and Its Applications 5, pp. 410–418, 2008. See page 390.
[754] G. Leander and G. McGuire. Spectra of functions, subspaces of matrices, and going up versus going
down. Proceedings of AAECC-17 Conference, Lecture Notes in Computer Science 4851, pp. 51–66,
2007. See page 232.
[755] G. Leander and G. McGuire. Construction of bent functions from near-bent functions. Journal of
Combinatorial Theory, Series A 116, pp. 960–970, 2009. See page 232.
[756] G. Leander and A. Poschmann. On the classification of 4 bit S-boxes. Proceedings of International
Workshop on the Arithmetic of Finite Fields WAIFI 2007, Lecture Notes in Computer Science 4547,
pp. 159–176, 2007. See pages 144 and 417.
[757] G. Leander and F. Rodier. Bounds on the degree of APN polynomials: the case of $x^{-1} + g(x)$.
Designs, Codes and Cryptography 59 (1–3), pp. 207–222, 2011. See page 401.
[758] J.-M. Le Bars and A. Viola. Equivalence classes of Boolean functions for first-order correlation.
IEEE Transactions on Information Theory 56 (3), pp. 1247–1261, 2010. See page 313.
[759] R. J. Lechner. Harmonic analysis of switching functions. Recent Developments in Switching Theory,
Academic Press, pp. 121–228, 1971. See page 58.
[760] V. I. Levenshtein. Split orthogonal arrays and maximum independent resilient systems of functions.
Designs, Codes and Cryptography 12 (2), pp. 131–160, 1997. See page 129.
[761] J. Li, C. Carlet, X. Zeng, C. Li, L. Hu, and J. Shan. Two constructions of balanced Boolean functions
with optimal algebraic immunity, high nonlinearity and good behavior against fast algebraic attacks.
Designs, Codes and Cryptography 76 (2), pp. 279–305, 2015. See page 340.
[762] K. Li, L. Qu, B. Sun, and C. Li. New results about the boomerang uniformity of permutation
polynomials. IEEE Transactions on Information Theory 65 (11), pp. 7542–7553, 2019. Also: arXiv
preprint arXiv:1901.10999, 2019. See page 142.
[763] N. Li, T. Helleseth, A. Kholosha, and X. Tang. On the Walsh transform of a class of functions
from Niho exponents. IEEE Transactions on Information Theory 59 (7), pp. 4662–4667, 2013. See
page 222.
[764] N. Li, T. Helleseth, X. Tang and A. Kholosha. Several new classes of bent functions from Dillon
exponents. IEEE Transactions on Information Theory 59 (3), pp. 1818–1831, 2013. See pages 231
and 247.
[765] N. Li and W.-F. Qi. Symmetric Boolean functions depending on an odd number of variables with
maximum algebraic immunity. IEEE Transactions on Information Theory 52 (5), pp. 2271–2273,
2006. See page 357.
[766] N. Li and W.-F. Qi. Construction and analysis of Boolean functions of 2t + 1 variables with
maximum algebraic immunity. Proceedings of ASIACRYPT 2006, Lecture Notes in Computer
Science 4284, pp. 84–98, 2006. See page 336.
[767] N. Li, L. Qu, W.-F. Qi, G. Feng, C. Li, and D. Xie. On the construction of Boolean functions
with optimal algebraic immunity. IEEE Transactions on Information Theory 54 (3), pp. 1330–1334,
2008. See page 336.
[768] N. Li, X. Tang and T. Helleseth. New Constructions of Quadratic Bent Functions in Polynomial
Form. IEEE Transactions on Information Theory 60 (9), pp. 5760–5767, 2014. See page 206.
[769] N. Li and X. Zeng. A survey on the applications of Niho exponents. Cryptography and Communi-
cations 11 (3), pp. 1–40, 2018. See pages 169, 220, and 222.
[770] Y. Li. Characterization of robust immune symmetric boolean functions. Cryptography and Com-
munications 7 (3), pp. 297–315, 2015. See page 91.
[771] Y. Li and M. Wang. The nonexistence of permutations EA-equivalent to certain AB functions. IEEE
Transactions on Information Theory 59 (1), pp. 672–679, 2013. See pages 30 and 397.
[772] Y. Li and M. Wang. Constructing differentially 4-uniform permutations over $GF(2^{2m})$ from
quadratic APN permutations over $GF(2^{2m+1})$. Designs, Codes and Cryptography 72 (2),
pp. 249–264, 2014. See page 419.
[773] Y. Li, M. Wang, and Y. Yu. Constructing differentially 4-uniform permutations over $GF(2^{2k})$ from
the inverse function revisited. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2013/731,
2013. See page 421.
[774] Q. Liao, F. Liu, and K. Feng. On $(2^m+1)$-variable symmetric Boolean functions with submaximum
algebraic immunity $2^{m-1}$. Science China Mathematics 52 (1), pp. 17–28, 2009. See page 358.
[775] R. Lidl and H. Niederreiter. Finite Fields. Cambridge University Press, vol. 20, 1997. See pages 41,
112, 187, 248, 254, and 480.
[776] K. Limniotis and N. Kolokotronis. Boolean functions with maximum algebraic immunity:
further extensions of the Carlet–Feng construction. Designs, Codes and Cryptography 86 (8),
pp. 1685–1706, 2018. See page 341.
[777] K. Limniotis, N. Kolokotronis, and N. Kalouptsidis. Secondary constructions of Boolean functions
with maximum algebraic immunity. Cryptography and Communications 5, pp. 179–199, 2013. See
page 340.
[778] S. J. Lin, Y. S. Han and N. Yu. New Locally Correctable Codes Based on Projective Reed-Muller
Codes. IEEE Transactions on Information Theory 67 (6), pp. 3834–3841, 2019. See page 151.
[779] N. Linial, Y. Mansour, and N. Nisan. Constant depth circuits, Fourier transform, and learnability.
Journal of the Association for Computing Machinery 40 (3), pp. 607–620, 1993. See pages 59
and 62.
[780] J. H. van Lint. Introduction to Coding Theory. Springer, 1982. See page 4.
[781] P. Lisoněk. On the connection between Kloosterman sums and elliptic curves. Proceedings of
International Conference on Sequences and Their Applications SETA 2008, Lecture Notes in
Computer Science 5203, pp. 182–187, 2008. See page 215.
[782] P. Lisoněk. An efficient characterization of a family of hyperbent functions. IEEE Transactions on
Information Theory 57 (9), pp. 6010–6014, 2011. See page 247.
[783] P. Lisoněk and M. Marko. On zeros of Kloosterman sums. Designs, Codes and Cryptography 59,
pp. 223–230, 2011. See page 251.
[784] S. Litsyn and A. Shpunt. On the distribution of Boolean function nonlinearity. SIAM Journal on
Discrete Mathematics 23 (1), pp. 79–95, 2008. See page 80.
[785] F. Liu and K. Feng. On the $2^m$-variable symmetric Boolean functions with maximum algebraic
immunity $2^{m-1}$. Proceedings of Workshop on Coding and Cryptography WCC 2007, pp. 225–232,
2007. See page 358.
[786] F. Liu and K. Feng. Efficient computation of algebraic immunity of symmetric Boolean functions.
Proceedings of TAMC 2007, Lecture Notes in Computer Science 4484, pp. 318–329, 2007. See
page 358.
[787] J. Liu and S. Mesnager. Weightwise perfectly balanced functions with high weightwise nonlin-
earity profile. Designs, Codes and Cryptography 87 (8), pp. 1797–1813, 2019; see also CoRR
abs/1709.02959 (2017). See page 458.
[788] J. Liu, S. Mesnager, and L. Chen. On the nonlinearity of S-boxes and linear codes. Cryptography
and Communications 9 (3), pp. 345–361, 2017. See pages 122 and 161.
[789] M. Liu and D. Lin. Almost perfect algebraic immune functions with good nonlinearity. Proceedings
of IEEE International Symposium on Information Theory (ISIT) 2014, pp. 1837–1841, 2014, and
IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2012/498. See pages 322 and 340.
[790] M. Liu and D. Lin. Results on highly nonlinear Boolean functions with provably good immunity to
fast algebraic attacks. Information Sciences 421, pp. 181–203, 2017. See page 340.
[791] M. Liu, D. Lin, and D. Pei. Fast algebraic attacks and decomposition of symmetric Boolean
functions. IEEE Transactions on Information Theory 57, pp. 4817–4821, 2011. See also
arXiv:0910.4632v1 [cs.CR] (http://arxiv.org/abs/0910.4632). See pages 94, 322, and 358.
[792] M. Liu, D. Lin, and D. Pei. Results on the immunity of Boolean functions against probabilistic
algebraic attacks. Proceedings of ACISP 2011, Lecture Notes in Computer Science 6812, pp. 34–46,
2011. See page 331.
[793] M. Liu, Y. Zhang, and D. Lin. Perfect algebraic immune functions. Proceedings of ASIACRYPT
2012, Lecture Notes in Computer Science 7658, pp. 172–189, 2012. See pages 322, 331, and 338.
[794] M. Liu, Y. Zhang, and D. Lin. On the immunity of Boolean functions against fast algebraic attacks
using bivariate polynomial representation. IACR Cryptology ePrint Archive (http://eprint.iacr.org/)
2012/498, 2012. See pages 334 and 340.
[795] Y. Liu, V. Rijmen, and G. Leander. Nonlinear diffusion layers. Designs, Codes and Cryptography
86 (11), pp. 2469–2484, 2018. See page 162.
[796] Z. Liu and B. Wu. Recent results on constructing Boolean functions with (potentially) optimal
algebraic immunity based on decompositions of finite fields. J. Systems Science & Complexity 32
(1), pp. 356–374, 2019. See page 340.
[797] S. Lloyd. Properties of binary functions. Proceedings of EUROCRYPT 1990, Lecture Notes in
Computer Science 473, pp. 124–139, 1991. See pages 58 and 320.
[798] M. Lobanov. Tight bound between nonlinearity and algebraic immunity. IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2005/441, 2005. See page 331.
[799] M. Lobanov. Exact relation between nonlinearity and algebraic immunity. Discrete Mathematics
and Applications 16 (5), pp. 453–460, 2006. See page 331.
[800] M. Lobanov. Tight bounds between algebraic immunity and nonlinearities of high orders. NATO
Science for Peace and Security Series – D: Information and Communication Security, IOS Press,
vol. 18: Boolean Functions in Cryptology and Information Security, pp. 296–306, 2008, and IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2007/444, 2007 and Journal of Applied and
Industrial Mathematics 3 (3), pp. 367–376, 2009 (title: Exact relations between nonlinearity and
algebraic immunity) and private communication. See pages 328, 329, and 331.
[801] M. Lobanov. A method for obtaining lower bounds on the higher order nonlinearity. IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2013/332, 2013. See pages 329 and 332.
[802] O. A. Logachev, A. A. Salnikov and V. V. Yashchenko. Bent functions on a finite Abelian group.
Discrete Mathematics and Applications 7 (6), pp. 547–564, 1997. See pages 193 and 212.
[803] V. Lomné, E. Prouff, and T. Roche. Behind the scene of side channel attacks. Proceedings of
ASIACRYPT 2013, Lecture Notes in Computer Science 8269, pp. 506–525, 2013. See page 427.
[804] Y. Lou, H. Han, C. Tang, Z. Wu, and M. Xu. Constructing vectorial Boolean functions with
high algebraic immunity based on group decomposition. International Journal of Computer
Mathematics 92 (3), pp. 451–462, 2015 (preliminary version by Y. Lou, H. Han, C. Tang, and
M. Xu, IACR Cryptology ePrint Archive http://eprint.iacr.org/, 2012/335, 2012). See pages 168,
271, and 351.
[805] S. Lovett, R. Meshulam, and A. Samorodnitsky. Inverse conjecture for the Gowers norm is false.
Proceedings of ACM STOC 2008, pp. 547–556, 2008. See page 474.
[806] G. Luo, X. Cao, and S. Mesnager. Several new classes of self-dual bent functions derived from
involutions. Special Issue on Boolean Functions and Their Applications 2018, Cryptography and
Communications 11 (6), pp. 1261–1273, 2019. See page 199.
[807] O. B. Lupanov. On circuits of functional elements with delay. Probl. Kibern. 23, pp. 43–81, 1970.
See page 103.
[808] W. Ma, M. Lee, and F. Zhang. A new class of bent functions. IEICE Trans. Fundamentals E88-A
(7), pp. 2039–2040, 2005. See pages 230 and 250.
[809] F. J. MacWilliams and N. J. Sloane. The Theory of Error-Correcting Codes. North Holland, 1977.
See pages 4, 7, 12, 13, 14, 15, 16, 44, 105, 107, 124, 128, 152, 153, 155, 156, 172, 173, 289, 333, 347,
353, 355, 359, 388, and 487.
[810] H. Maghrebi, C. Carlet, S. Guilley, and J.-L. Danger. Optimal first-order masking with linear and
non-linear bijections. Proceedings of AFRICACRYPT, Lecture Notes in Computer Science 7374,
pp. 360–377, 2012. See pages 431 and 432.
[811] H. Maghrebi, S. Guilley, and J.-L. Danger. Leakage squeezing countermeasure against high-order
attacks. Proceedings of WISTP, Lecture Notes in Computer Science 6633, pp. 208–223, 2011. See
page 431.
[812] J. A. Maiorana. A classification of the cosets of the Reed–Muller code R(1, 6). Mathematics of
Computation 57 (195), pp. 403–414, 1991. See pages 144 and 155.
[813] S. Maitra. Highly nonlinear balanced Boolean functions with very good autocorrelation property.
Proceedings of the Workshop on Coding and Cryptography 2001, Electronic Notes in Discrete
Mathematics, Elsevier, vol. 6, pp. 355–364, 2001. See page 99.
[814] S. Maitra. Autocorrelation properties of correlation immune Boolean functions. Proceedings of
INDOCRYPT 2001, Lecture Notes in Computer Science 2247, pp. 242–253, 2001. See pages 288,
290, and 292.
[815] S. Maitra. Boolean functions on odd number of variables having nonlinearity greater than the
bent concatenation bound. NATO Science for Peace and Security Series – D: Information and
Communication Security, IOS Press, vol. 18: Boolean Functions in Cryptology and Information
Security, pp. 173–182, 2008. See pages 81 and 82.
[816] S. Maitra, S. Kavut, and M. Yücel. Balanced Boolean function on 13-variables having nonlinearity
greater than the Bent concatenation bound. Proceedings of the Conference BFCA 2008, Copen-
hagen, pp. 109–118, 2008. See pages 81 and 82.
[817] S. Maitra and E. Pasalic. Further constructions of resilient Boolean functions with very high
nonlinearity. IEEE Transactions on Information Theory 48 (7), pp. 1825–1834, 2002. See page 295.
[818] S. Maitra and P. Sarkar. Maximum nonlinearity of symmetric Boolean functions on odd number of
variables. IEEE Transactions on Information Theory 48, pp. 2626–2630, 2002. See page 356.
[819] S. Maitra and P. Sarkar. Highly nonlinear resilient functions optimizing Siegenthaler’s inequality.
Proceedings of CRYPTO 1999, Lecture Notes in Computer Science 1666, pp. 198–215, 1999. See
page 292.
[820] S. Maitra and P. Sarkar. Modifications of Patterson–Wiedemann functions for cryptographic
applications. IEEE Transactions on Information Theory 48, pp. 278–284, 2002. See pages 81
and 320.
[821] S. Maity and S. Maitra. Minimum distance between bent and 1-resilient Boolean functions.
Proceedings of Fast Software Encryption FSE 2004, Lecture Notes in Computer Science 3017,
pp. 143–160, 2004. See page 297.
[822] B. Mandal, P. Stănică, S. Gangopadhyay, and E. Pasalic. An analysis of the C class of bent functions.
Fundamenta Informaticae 146 (3), pp. 271–292, 2016. See page 211.
[823] S. Mangard, E. Oswald, and T. Popp. Power Analysis Attacks: Revealing the Secrets of Smart Cards.
Springer, 2006. www.dpabook.org/. See page 425.
[824] H. B. Mann. Addition Theorems. Inderscience, 1965. See page 197.
[825] A. Maschietti. Difference sets and hyperovals. Designs, Codes and Cryptography 14 (1), pp. 89–98,
1998. See page 416.
[826] J. L. Massey. Shift-register analysis and BCH decoding. IEEE Transactions on Information Theory
15, pp. 122–127, 1969. See pages 21 and 76.
[827] J. L. Massey. Minimal codewords and secret sharing. Proceedings of 6th Joint Swedish–Russian
Workshop on Information Theory, Mölle, Sweden, August 22–27, 1993. See pages 146, 148, and 432.
[828] J. L. Massey. Randomness, arrays, differences and duality. IEEE Transactions on Information
Theory 48, pp. 1698–1703, 2002. See page 88.
[829] M. Matsui. Linear cryptanalysis method for DES cipher. Proceedings of EUROCRYPT 1993,
Lecture Notes in Computer Science 765, pp. 386–397, 1994. See pages 79, 115, and 121.
[830] M. Matsui. Block encryption algorithm MISTY. Proceedings of Fast Software Encryption FSE
1997, Lecture Notes in Computer Science 1267, pp. 54–68, 1997. See pages 410 and 442.
[831] U. M. Maurer. New approaches to the design of self-synchronizing stream ciphers. Proceedings of
EUROCRYPT 1991, Lecture Notes in Computer Science 547, pp. 458–471, 1991. See page 83.
[832] V. Mavroudis, K. Vishi, M. D. Zych, and A. Jøsang. The impact of quantum computing on
present cryptography. International Journal of Advanced Computer Science and Applications 9
(3) (https://arxiv.org/pdf/1804.00200), 2018. See page 1.
[833] R. J. McEliece. Weight congruence for p-ary cyclic codes. Discrete Mathematics, 3, pp. 177–192,
1972. See pages 13 and 156.
[834] R. L. McFarland. A family of noncyclic difference sets. Journal of Combinatorial Theory, Series A
15, pp. 1–10, 1973. See pages 165 and 209.
[835] G. McGuire and A. R. Calderbank. Proof of a conjecture of Sarwate and Pursley regarding pairs of
binary m-sequences. IEEE Transactions on Information Theory 41 (4), pp. 1153–1155, 1995. See
page 275.
[836] J. McLaughlin and J. A. Clark. Evolving balanced Boolean functions with optimal resistance to
algebraic and fast algebraic attacks, maximal algebraic degree, and very high nonlinearity. IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2013/011, 2013. See page 335.
[837] A. McLoughlin. The covering radius of the (m − 3)rd-order Reed–Muller codes and a lower bound
on the (m−4)th-order Reed–Muller codes. SIAM Journal on Applied Mathematics 37, pp. 419–422,
1979. See page 158.
[838] P. Méaux, C. Carlet, A. Journault, and F.-X. Standaert. Improved filter permutators for efficient
FHE: better instances and implementations. Proceedings of Indocrypt 2019, Lecture Notes in
Computer Science 11898, pp. 68–91, 2019. See page 469.
[839] P. Méaux, A. Journault, F.-X. Standaert, and C. Carlet. Towards stream ciphers for efficient FHE
with low-noise ciphertexts. Proceedings of EUROCRYPT 2016, Lecture Notes in Computer Science
9665, pp. 311–343, 2016. See pages 232, 321, 363, 453, 454, 455, and 456.
[840] W. Meidl, S. Roy and A. Topuzoğlu. Enumeration of quadratic functions with prescribed Walsh
spectrum. IEEE Transactions on Information Theory 60, pp. 6669–6680, 2014. See page 178.
[841] W. Meidl and A. Topuzoğlu. Quadratic functions with prescribed spectra. Designs, Codes and
Cryptography 66, pp. 257–273, 2013. See page 178.
[842] W. Meier, E. Pasalic, and C. Carlet. Algebraic attacks and decomposition of Boolean functions.
Proceedings of EUROCRYPT 2004, Lecture Notes in Computer Science 3027, pp. 474–491, 2004.
See pages 76, 90, and 91.
[843] W. Meier and O. Staffelbach. Fast correlation attacks on stream ciphers. Proceedings of EURO-
CRYPT 1988, Lecture Notes in Computer Science 330, pp. 301–314, 1988. See pages 76, 78, 103,
and 115.
[844] W. Meier and O. Staffelbach. Nonlinearity criteria for cryptographic functions. Proceedings of
EUROCRYPT 1989, Lecture Notes in Computer Science 434, pp. 549–562, 1990. See pages 76,
101, and 103.
[845] W. Meier and O. Staffelbach. Correlation properties of combiners with memory in stream ciphers.
Proceedings of EUROCRYPT 1990, Lecture Notes in Computer Science 473, pp. 204–213, 1990.
See page 286.
[846] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC Press
Series on Discrete Mathematics and Its Applications, 1996. See page 2.
[847] Q. Meng, L. Chen, and F.-W. Fu. On homogeneous rotation symmetric bent functions. Discrete
Applied Mathematics 158 (10), pp. 1111–1117, 2010. See page 248.
[848] Q. Meng, H. Zhang, M. Yang, and J. Cui. On the degree of homogeneous bent functions. Discrete
Applied Mathematics 155 (5), pp. 665–669, 2007. See page 248.
[849] S. Mesnager. Improving the lower bound on the higher order nonlinearity of Boolean func-
tions with prescribed algebraic immunity. IEEE Transactions on Information Theory 54 (8),
pp. 3656–3662, 2008. Preliminary version available in IACR Cryptology ePrint Archive (http://
eprint.iacr.org/) 2007/117, 2007. See page 331.
[850] S. Mesnager. On the number of resilient Boolean functions. Proceedings of “The First Symposium
on Algebraic Geometry and Its Applications” Dedicated to Gilles Lachaud (SAGA’07), Tahiti, 2007,
Published by World Scientific, Series on Number Theory and Its Applications 5, pp. 419–433, 2008.
See page 312.
[851] S. Mesnager. A new family of hyper-bent Boolean functions in polynomial form. Proceedings of
IMA Conference on Cryptography and Coding 2009, Lecture Notes in Computer Science 5921,
pp. 402–417, 2009. See pages 231 and 246.
[852] S. Mesnager. Hyper-bent Boolean functions with multiple trace terms. Proceedings of International
Workshop on the Arithmetic of Finite Fields WAIFI 2010, Lecture Notes in Computer Science 6087,
pp. 97–113, 2010. See pages 231 and 247.
[853] S. Mesnager. A new class of bent and hyper-bent Boolean functions in polynomial forms. Designs,
Codes and Cryptography 59 (1–3), pp. 265–279, 2011. See pages 231 and 246.
[854] S. Mesnager. Bent and hyper-bent functions in polynomial form and their link with some
exponential sums and Dickson polynomials. IEEE Transactions on Information Theory 57 (9),
pp. 5996–6009, 2011. See pages 189 and 272.
[855] S. Mesnager. Semi-bent functions from Dillon and Niho exponents, Kloosterman sums, and
Dickson polynomials. IEEE Transactions on Information Theory 57, pp. 7443–7458, 2011. See
page 263.
[856] S. Mesnager. Semi-bent functions with multiple trace terms and hyperelliptic curves. Proceedings
of International Conference on Cryptology and Information Security in Latin America (IACR),
Latincrypt 2012, Lecture Notes in Computer Science 7533, pp. 18–36, 2012. See
page 263.
[857] S. Mesnager. Semi-bent functions from oval polynomials. Proceedings of IMA Conference on
Cryptography and Coding 2013, Lecture Notes in Computer Science 8308, pp. 1–15, 2013. See
page 263.
[858] S. Mesnager. Characterizations of plateaued and bent functions in characteristic p. Proceedings
of International Conference on Sequences and Their Applications SETA 2014, Lecture Notes in
Computer Science 8865, pp. 72–82, 2014. See pages 258, 261, and 281.
[859] S. Mesnager. On semi-bent functions and related plateaued functions over the Galois field $\mathbb{F}_{2^n}$.
Proceedings of the Conference Open Problems in Mathematical and Computational Sciences,
September 18–20, 2013, in Istanbul, Turkey, Springer, pp. 243–273, 2014. See pages 262 and 263.
[860] S. Mesnager. Several new infinite families of bent functions and their duals. IEEE Transactions on
Information Theory 60 (7), pp. 4397–4407, 2014. See page 237.
[861] S. Mesnager. Bent functions from spreads. Proceedings of the 11th International Conference on
Finite Fields and Their Applications (Fq’11), Journal of the American Mathematical Society (AMS),
Contemporary Mathematics 632, pp. 295–316, 2015. See page 223.
[862] S. Mesnager. Bent vectorial functions and linear codes from o-polynomials. Designs, Codes and
Cryptography 77 (1), pp. 99–116, 2015. See pages 149, 161, and 271.
[863] S. Mesnager. A note on constructions of bent functions from involutions. IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2015/982, 2015. See page 237.
[864] S. Mesnager. Further constructions of infinite families of bent functions from new permutations and
their duals. Cryptography and Communications 8 (2), pp. 229–246, 2016. See page 237.
[865] S. Mesnager. Bent functions: Fundamentals and Results. Springer, pp. 1–544, 2016. See pages 189
and 190.
[866] S. Mesnager. Linear codes with few weights from weakly regular bent functions based on a generic
construction. Cryptography and Communications 9 (1), pp. 71–84, 2017 (preliminary version
available in IACR Cryptology ePrint Archive, http://eprint.iacr.org/2015/1103). See page 189.
[867] S. Mesnager. Linear codes with few weights from weakly regular bent functions based on a generic
construction. Cryptography and Communications 9 (1), pp. 71–84, 2017. See page 147.
[868] S. Mesnager and G. Cohen. On the link of some semi-bent functions with Kloosterman sums.
Proceedings of International Workshop on Coding and Cryptology, IWCC 2011, Lecture Notes in
Computer Science 6639, Springer, pp. 263–272, 2011. See page 263.
[869] S. Mesnager and G. Cohen. Cyclic codes and algebraic immunity of Boolean functions. Proceed-
ings of ITW 2015, pp. 1–5, 2015. See page 327.
[870] S. Mesnager and G. Cohen. Fast algebraic immunity of Boolean functions. Advances in Mathemat-
ics of Communications 11 (2), pp. 373–377, 2017. See pages 94 and 323.
[871] S. Mesnager and J. P. Flori. Hyper-bent functions via Dillon-like exponents. IEEE Transactions on
Information Theory 59 (5), pp. 3215–3232, 2013. See pages 231, 247, and 362.
[872] S. Mesnager, G. McGrew, J. Davis, D. Steele, and K. Marsten. A comparison of Carlet’s second-
order nonlinearity bounds. International Journal of Computer Mathematics 94 (3), pp. 427–436,
2017. See page 86.
[873] S. Mesnager, P. Ongan, and F. Özbudak. New bent functions from permutations and linear
translators. Proceedings of C2SI 2017, Lecture Notes in Computer Science 10194, pp. 282–297,
2017. See page 237.
[874] S. Mesnager, F. Özbudak, and A. Sınak. Linear codes from weakly regular plateaued functions and
their secret sharing schemes. Designs, Codes and Cryptography 87 (2–3), pp. 463–480, 2019. See
pages 147 and 262.
[875] S. Mesnager, C. Tang, and M. Xiong. On the boomerang uniformity of quadratic permutations
over $\mathbb{F}_{2^n}$. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2019/277 and arXiv preprint
arXiv:1903.00501, 2019. See page 142.
[876] S. Mesnager and F. Zhang. On constructions of bent, semi-bent and five valued spectrum functions
from old bent functions. Advances in Mathematics of Communications 11 (2), pp. 339–345, 2017.
See pages 233 and 263.
[877] S. Mesnager, F. Zhang, C. Tang, and Y. Zhou. Further study on the maximum number of bent
components of vectorial functions. Designs, Codes and Cryptography 87 (11), pp. 2597–2610,
2019. Also: arXiv:1801.06542. See page 243.
[878] S. Mesnager, Z. Zhou, and C. Ding. On the nonlinearity of Boolean functions with restricted input.
Cryptography and Communications 11 (1), pp. 63–76, 2019. See page 460.
[879] T. S. Messerges. Using second-order power analysis to attack DPA resistant software. Proceedings
of International Workshop Cryptographic Hardware and Embedded Systems CHES 2000,
Lecture Notes in Computer Science 1965, pp. 238–251, 2000. See page 427.
[880] S. Micali and L. Reyzin. Physically observable cryptography (extended abstract). Proceedings of
TCC, Lecture Notes in Computer Science 2951, pp. 278–296, 2004. See page 429.
[881] M. J. Mihaljevic, S. Gangopadhyay, G. Paul, and H. Imai. Generic cryptographic weakness of
k-normal boolean functions in certain stream ciphers and cryptanalysis of grain-128. Periodica
Mathematica Hungarica 65 (2), pp. 205–227, 2012. See page 105.
[882] W. Millan. Low order approximation of cipher functions. Proceedings of Cryptographic Policy and
Algorithms, Lecture Notes in Computer Science 1029, pp. 144–155, 1996. See page 83.
[883] W. Millan, L. Burnett, G. Carter, A. Clark, and E. Dawson. Evolutionary heuristics for finding
cryptographically strong S-boxes. Proceedings of Information and Communication Security,
Lecture Notes in Computer Science 1726, pp. 263–274, 1999. See page 145.
[884] W. Millan, A. Clark and E. Dawson. An effective genetic algorithm for finding highly nonlin-
ear boolean functions. Proceedings of ICICS 1997, Lecture Notes in Computer Science 1334,
pp. 149–158, 1997. See page 144.
[885] W. Millan, A. Clark, and E. Dawson. Heuristic design of cryptographically strong balanced
Boolean functions. Proceedings of EUROCRYPT 1998, Lecture Notes in Computer Science 1403,
pp. 489–499, 1998.
[886] M. Minsky and S. A. Papert. Perceptrons: An Introduction to Computational Geometry. Reissue of
the 1988 Expanded Edition. MIT Press, 2017. See page 47.
[887] C. J. Mitchell. Enumerating Boolean functions of cryptographic significance. Journal of Cryptology
2 (3), pp. 155–170, 1990. See pages 356 and 357.
[888] A. Moradi, A. Poschmann, S. Ling, C. Paar, and H. Wang. Pushing the limits: a very compact and a
threshold implementation of AES. Proceedings of EUROCRYPT 2011, Lecture Notes in Computer
Science 6632, pp. 69–88, 2011.
[889] E. Mossel, A. Shpilka, and L. Trevisan. On $\epsilon$-biased generators in $NC^0$. Proceedings of 44th FOCS,
pp. 136–145. IEEE Computer Society Press, 2003. See page 468.
[890] G. Mullen and D. Panario. Handbook of Finite Fields. CRC Press Book, 2013. See pages 20, 41,
76, 248, 254, 389, 480, and 488.
[891] D. E. Muller. Application of boolean algebra to switching circuit design and to error detection.
Trans. I.R.E. Prof. Group on Electronic Computers, 3 (3), pp. 6–12, 1954. See page 151.
[892] A. Muratović-Ribić, E. Pasalic, and S. Bajrić. Vectorial bent functions from multiple terms trace
Functions. IEEE Transactions on Information Theory 60 (2), pp. 1337–1347, 2014. See page 272.
[893] A. Muratović-Ribić, E. Pasalic, and S. Bajrić. Vectorial hyperbent trace functions from the $\mathcal{PS}_{ap}$
Class – Their Exact Number and Specification. IEEE Transactions on Information Theory 60 (7),
pp. 4408–4413, 2014. See page 272.
[894] J. Mykkelveit. The covering radius of the [128,8] Reed–Muller code is 56. IEEE Transactions on
Information Theory 26 (3), pp. 359–362, 1980. See pages 81 and 157.
[895] N. Nakagawa. On equations of finite fields of characteristic 2 and APN functions. AKCE
International Journal of Graphs and Combinatorics 12, pp. 75–93, 2015. See page 383.
[896] N. Nakagawa and S. Yoshiara. A construction of differentially 4-uniform functions from commu-
tative semifields of characteristic 2. Proceedings of International Workshop on the Arithmetic of
Finite Fields WAIFI 2007, Lecture Notes in Computer Science 4547, pp. 134–146, 2007. See
pages 394 and 422.
[897] M. Nassar, S. Guilley, and J.-L. Danger. Formal analysis of the entropy/security trade-off in first-
order masking countermeasures against side-channel attacks. Proceedings of INDOCRYPT 2011,
Lecture Notes in Computer Science 7107, pp. 22–39, 2011. See page 429.
[898] M. Nassar, Y. Souissi, S. Guilley, and J.-L. Danger. RSM: a small and fast countermeasure for AES,
secure against 1st and 2nd-order zero-offset SCAs. Proceedings of 2012 Design, Automation & Test
in Europe Conference & Exhibition (DATE 2012) IEEE 2012, pp. 1173–1178, 2012. See page 432.
[899] Y. Nawaz, G. Gong, and K. Gupta. Upper bounds on algebraic immunity of power functions.
Proceedings of Fast Software Encryption FSE 2006, Lecture Notes in Computer Science 4047,
pp. 375–389, 2006. See page 323.
[900] Y. Nawaz, K. Gupta, and G. Gong. Algebraic immunity of S-boxes based on power mappings:
analysis and construction. IEEE Transactions on Information Theory 55 (9), pp. 4263–4273, 2009
(preliminary version available in IACR Cryptology ePrint Archive http://eprint.iacr.org/2006/322).
See page 323.
[901] NESSIE Project. www.cosic.esat.kuleuven.be/nessie/. See pages 3 and 23.
[902] Y. Niho. Multi-valued cross-correlation functions between two maximal linear recursive sequences.
PhD dissertation, University of Southern California, Los Angeles, 1972. See page 169.
[903] S. Nikova, C. Rechberger, and V. Rijmen. Threshold implementations against side-channel attacks
and glitches. Proceedings of ICICS 2006, Lecture Notes in Computer Science 4307, pp. 529–545,
2006. See pages 437, 440, and 441.
[904] S. Nikova, V. Rijmen, and M. Schläffer. Secure hardware implementation of nonlinear functions in
the presence of glitches. Journal of Cryptology 24 (2), pp. 292–321, 2011. See pages 437 and 440.
[905] N. Nisan and M. Szegedy. On the degree of Boolean functions as real polynomials. Comput.
Complexity 4, pp. 301–313, 1994. See pages 47, 48, 62, 67, 68, and 320.
[906] K. Nyberg. Perfect non-linear S-boxes. Proceedings of EUROCRYPT 1991, Lecture Notes in
Computer Science 547, pp. 378–386, 1992. See pages 135, 190, 269, 270, and 314.
[907] K. Nyberg. On the construction of highly nonlinear permutations. Proceedings of EUROCRYPT
1992, Lecture Notes in Computer Science 658, pp. 92–98, 1993. See pages 28, 117, 135, 177,
and 316.
[908] K. Nyberg. Differentially uniform mappings for cryptography. Proceedings of EUROCRYPT 1993,
Lecture Notes in Computer Science 765, pp. 55–64, 1994. See pages 135, 137, 394, 395, 400, 401,
and 417.
[909] K. Nyberg. New bent mappings suitable for fast implementation. Proceedings of Fast Software
Encryption FSE 1993, Lecture Notes in Computer Science 809, pp. 179–184, 1994. See page 270.
[910] K. Nyberg. S-boxes and round functions with controllable linearity and differential uniformity.
Proceedings of Fast Software Encryption FSE 1994, Lecture Notes in Computer Science 1008,
pp. 111–130, 1995. See pages 136, 372, 391, and 411.
[911] K. Nyberg. Multidimensional Walsh transform and a characterization of bent functions. Proceed-
ings of the IEEE Information Theory Workshop ITW 2007, pp. 1–4, 2007. See page 74.
[912] K. Nyberg and L. R. Knudsen. Provable security against differential cryptanalysis. Journal of
Cryptology 8 (1), pp. 27–37, 1995 (extended version of the Proceedings of CRYPTO ’92, Lecture
Notes in Computer Science 740, pp. 566–574, 1993). See pages 135 and 137.
[913] L. O’Connor. On the distribution of characteristics in bijective mappings. Proceedings of EURO-
CRYPT 1993, Lecture Notes in Computer Science 765, pp. 360–370, 1993. See page 136.
[914] R. O’Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014. See pages 37,
59, 68, 102, and 358.
[915] R. O’Donnell and D. Witmer. Goldreich’s PRG: evidence for near-optimal polynomial stretch. IEEE
Conference on Computational Complexity 2014, pp. 1–12, 2014. See page 468.
[916] W. Ogata and K. Kurosawa. Optimum secret sharing scheme secure against cheating. Proceedings
of EUROCRYPT 1996, Lecture Notes in Computer Science 1070, pp. 200–211, 1996. See page 452.
[917] D. Olejár and M. Stanek. On cryptographic properties of random Boolean functions. Journal of
Universal Computer Science 4 (8), pp. 705–717, 1998. See page 80.
[918] O. Olmez. Plateaued functions and one-and-half difference sets. Designs, Codes and Cryptography
76 (3), pp. 537–549, 2015. See page 259.
[919] J. D. Olsen, R. A. Scholtz, and L. R. Welch. Bent-function sequences. IEEE Transactions on
Information Theory 28 (6), pp. 858–864, 1982. See page 189.
[920] E. Oswald, S. Mangard, C. Herbst, and S. Tillich. Practical second-order DPA attacks for masked
smart card implementations of block ciphers. Proceedings of CT-RSA 2006, Lecture Notes in
Computer Science 3860, pp. 192–207, 2006. See page 427.
[921] D. Panario, A. Sakzad, B. Stevens, D. Thomson, and Q. Wang. Ambiguity and deficiency of
permutations over finite fields with linearized difference map. IEEE Transactions on Information
Theory 59 (9), pp. 5616–5626, 2013. See page 139.
[922] D. Panario, A. Sakzad, B. Stevens, and Q. Wang. Two new measures for permutations: ambiguity
and deficiency. IEEE Transactions on Information Theory 57 (11), pp. 7648–7657, 2011. See
page 139.
[923] D. Panario, A. Sakzad, and D. Thomson. Ambiguity and deficiency of reversed Dickson permuta-
tions. Proceedings of Fq13, Contemporary Mathematics: Topics in Finite Fields 632, pp. 347–358,
2013. See page 139.
[924] D. Panario, D. Santana, and Q. Wang. Ambiguity, deficiency and differential spectrum of
normalized permutation polynomials over finite fields. Finite Fields and Their Applications 47,
pp. 330–350, 2017. See page 139.
[925] D. Panario, B. Stevens, and Q. Wang. Ambiguity and deficiency in Costas arrays and APN
permutations. Proceedings of LATIN 2010, Lecture Notes in Computer Science 6034, pp. 397–406,
2010. See page 139.
[926] S. M. Park, S. Lee, S. H. Sung, and K. Kim. Improving bounds for the number of correlation-
immune Boolean functions. Information Processing Letters 61, pp. 209–212, 1997. See page 312.
[927] M. G. Parker and A. Pott. On Boolean functions which are bent and negabent. Sequences,
Subsequences, and Consequences, Lecture Notes in Computer Science 4893, pp. 9–23, 2007. See
pages 266 and 267.
[928] E. Pasalic. Maiorana–McFarland class: degree optimization and algebraic properties. IEEE Trans-
actions on Information Theory 52 (10), pp. 4581–4594, 2006. See page 295.
[929] E. Pasalic. Almost fully optimized infinite classes of Boolean functions resistant to (fast) algebraic
cryptanalysis. Proceedings of ICISC 2008, Lecture Notes in Computer Science 5461, pp. 399–414,
2008. See pages 322 and 336.
[930] E. Pasalic. A note on nonexistence of vectorial bent functions with binomial trace representation in
the $\mathcal{PS}^-$ class. Information Processing Letters 115 (2), pp. 139–140, 2015. See page 272.
[931] E. Pasalic. Corrigendum to “A note on nonexistence of vectorial bent functions with binomial
trace representation in the $\mathcal{PS}^-$ class” [Information Processing Letters 115 (2) (2015) 139–140].
Information Processing Letters 115 (4), p. 520, 2015. See page 272.
[932] E. Pasalic, S. Hodžić, F. Zhang, and Y. Wei. Bent functions from nonlinear permutations and
conversely. Cryptography and Communications 11 (2), pp. 207–225, 2019. See page 236.
[933] E. Pasalic and S. Maitra. Linear codes in generalized construction of resilient functions with very
high nonlinearity. IEEE Transactions on Information Theory 48, pp. 2182–2191, 2002, completed
version of a paper published in the Proceedings of Selected Areas in Cryptography, SAC 2001,
Lecture Notes in Computer Science 2259, pp. 60–74, 2002. See pages 315 and 316.
[934] E. Pasalic, S. Maitra, T. Johansson, and P. Sarkar. New constructions of resilient and correlation
immune Boolean functions achieving upper bound on nonlinearity. Proceedings of the Workshop on
Coding and Cryptography 2001, published by Electronic Notes in Discrete Mathematics, Elsevier,
vol. 6, pp. 425–434, 2001. See pages 291, 295, and 300.
[935] E. Pasalic and W.-G. Zhang. On multiple output bent functions. Information Processing Letters 112
(21), pp. 811–815, 2012. See page 272.
[936] N. J. Patterson and D. H. Wiedemann. The covering radius of the $[2^{15}, 16]$ Reed–Muller code is at
least 16276. IEEE Transactions on Information Theory 29, pp. 354–356, 1983. See pages 81, 157,
and 543.
[937] N. J. Patterson and D. H. Wiedemann. Correction to [936]. IEEE Transactions on Information
Theory 36 (2), p. 443, 1990. See pages 81 and 157.
[938] J. Peng and C. H. Tan. New explicit constructions of differentially 4-uniform permutations via
special partitions of $\mathbb{F}_{2^{2k}}$. Finite Fields and Their Applications 40, pp. 73–89, 2016. See page 420.
[939] J. Peng and C. H. Tan. New differentially 4-uniform permutations by modifying the inverse function
on subfields. Cryptography and Communications 9 (3), pp. 363–378, 2017. See pages 420 and 421.
[940] J. Peng, C. H. Tan, and Q. Wang. A new family of differentially 4-uniform permutations over $\mathbb{F}_{2^k}$
for odd k. Science China Mathematics 59 (6), pp. 1221–1234, 2016. See pages 420 and 421.
[941] J. Peng, Q. Wu, and H. Kan. On symmetric Boolean functions with high algebraic immunity on even
number of variables. IEEE Transactions on Information Theory 57 (10), pp. 7205–7220, 2011. See
page 358.
[942] T. Penttila (Joint work with L. Budaghyan, C. Carlet, T. Helleseth, and A. Kholosha). Projective
equivalence of ovals and EA-equivalence of Niho bent functions. Invited talk at the Finite
Geometries Fourth Irsee Conference (2014). See pages 220 and 221.
[943] L. Perrin, A. Canteaut, and S. Tian. If a generalised butterfly is APN then it operates on
6 bits. Special Issue on Boolean Functions and Their Applications 2018, Cryptography and
Communications 11 (6), pp. 1147–1164, 2019. See page 411.
[944] L. Perrin, A. Udovenko, and A. Biryukov. Cryptanalysis of a theorem: decomposing the only
known solution to the big APN problem. Proceedings of CRYPTO 2016, Lecture Notes in Computer
Science 9815, part II, pp. 93–122, 2016. See pages 411 and 421.
[945] S. Picek, C. Carlet, S. Guilley, J. F. Miller, and D. Jakobovic. Evolutionary algorithms for Boolean
functions in diverse domains of cryptography. Evolutionary Computation 24 (4), pp. 667–694,
2016. See page 144.
[946] S. Picek, S. Guilley, C. Carlet, D. Jakobovic, and J. F. Miller. Evolutionary approach for finding
correlation immune Boolean functions of order t with minimal Hamming weight. Proceedings of
TPNC 2015, Lecture Notes in Computer Science 9477, pp. 71–82, 2015. See pages 144 and 305.
[947] S. Picek and D. Jakobovic. Evolving algebraic constructions for designing bent Boolean functions.
Proceedings of the Genetic and Evolutionary Computation Conference GECCO 2016, pp. 781–788,
2016. See page 144.
[948] S. Picek, K. Knezevic, and D. Jakobovic. On the evolution of bent (n, m) functions. Proceedings of
CEC 2017, pp. 2137–2144, 2017. See page 145.
[949] S. Picek, K. Knezevic, D. Jakobovic, and C. Carlet. A search for differentially-6 uniform (n, n − 2)
functions. Proceedings of IEEE CEC 2018. See pages 145 and 424.
[950] S. Picek, L. Mariot, B. Yang, D. Jakobovic, and N. Mentens. Design of S-boxes defined with cellular
automata rules. Proceedings of the Computing Frontiers Conference, CF’17, pp. 409–414, 2017.
See page 145.
[951] S. Picek, R. I. McKay, R. Santana, and T. Gedeon. Fighting the symmetries: the structure of cryp-
tographic Boolean function spaces. Proceedings of the Conference on Genetic and Evolutionary
Computation GECCO 2015, pp. 457–464, 2015. See page 144.
[952] S. Picek, D. Sisejkovic, and D. Jakobovic. Immunological algorithms paradigm for construction
of Boolean functions with good cryptographic properties. Engineering Applications of Artificial
Intelligence 62, pp. 320–330, 2017. See page 144.
[953] S. Picek, B. Yang, V. Rozic, and N. Mentens. On the construction of hardware-friendly 4 × 4
and 5 × 5 S-boxes. Proceedings of Selected Areas in Cryptography – SAC 2016, Lecture Notes in
Computer Science 10532, pp. 161–179, 2016. See page 145.
[954] J. Pieprzyk and C. Qu. Fast hashing and rotation symmetric functions. Journal of Universal
Computer Science 5, pp. 20–31, 1999. See page 248.
[955] J. Pieprzyk and X.-M. Zhang. Computing Möbius transforms of Boolean functions and character-
izing coincident Boolean functions. Proceedings of the Conference BFCA 2007, Publications des
universités de Rouen et du Havre, 2007. See page 37.
[956] G. Pirsic and A. Winterhof. Boolean functions derived from pseudorandom binary sequences.
Proceedings of International Conference on Sequences and Their Applications SETA 2012, Lecture
Notes in Computer Science 7280, pp. 101–109, 2012. See page 104.
[957] G. Piret, T. Roche, and C. Carlet. PICARO – a block cipher allowing efficient higher-order
side-channel resistance. Proceedings of ACNS 2012, Lecture Notes in Computer Science 7341,
pp. 311–328, 2012. See pages 26, 112, 142, 189, and 422.
[958] V. Pless. Power moment identities on weight distributions in error-correcting codes. Information
and Control 6, pp. 147–152, 1963. See page 381.
[959] V. S. Pless, W. C. Huffman, eds., R. A. Brualdi, assistant editor. Handbook of Coding Theory.
Elsevier, 1998. See pages 20, 21, and 156.
[960] R. Poussier, Q. Guo, F.-X. Standaert, C. Carlet, and S. Guilley. Connecting and improving direct
sum masking and inner product masking. Proceedings of CARDIS 2017, Lecture Notes in Computer
Science 10728, pp. 123–141, 2017. See pages 445 and 446.
[961] A. Pott, E. Pasalic, A. Muratović-Ribić, and S. Bajrić. Vectorial quadratic bent functions as a
product of two linearized polynomials. Proceedings of Workshop on Coding and Cryptography
WCC, 2015. See page 272.
[962] A. Pott, E. Pasalic, A. Muratović-Ribić, and S. Bajrić. On the maximum number of bent components
of vectorial functions. IEEE Transactions on Information Theory 64 (1), pp. 403–411, 2018. See
page 243.
[963] A. Pott, K.-U. Schmidt, and Y. Zhou. Semifields, relative difference sets, and bent functions.
Proceedings of the Workshop “Emerging Applications of Finite Fields”, Algebraic Curves and
Finite Fields, Radon Series on Computational and Applied Mathematics, de Gruyter, pp. 161–177,
2014. See pages 217 and 269.
[964] A. Pott, Y. Tan, T. Feng, and S. Ling. Association schemes arising from bent functions. Designs,
Codes and Cryptography 59 (1–3), pp. 319–331, 2011. See page 163.
[965] A. Pott, Q. Wang, and Y. Zhou. Sequences and functions derived from projective planes and their
difference sets. Proceedings of International Workshop on the Arithmetic of Finite Fields WAIFI
2012, Lecture Notes in Computer Science 7369, pp. 64–80, 2012. See page 197.
[966] A. Pott and Y. Zhou. CCZ and EA equivalence between mappings over finite Abelian groups.
Designs, Codes and Cryptography 66 (1–3), pp. 99–109, 2013. See page 29.
[967] T. F. Prabowo and C. H. Tan. Implicit quadratic property of differentially 4-uniform permutations.
Proceedings of INDOCRYPT 2016, Lecture Notes in Computer Science 10095, pp. 364–379, 2016.
See page 421.
[968] B. Preneel. Analysis and Design of Cryptographic Hash Functions, PhD thesis, Katholieke
Universiteit Leuven, K. Mercierlaan 94, 3001 Leuven, Belgium, U.D.C. 621.391.7, 1993. See
pages 97, 109, 208, 243, and 319.
[969] B. Preneel, R. Govaerts, and J. Vandewalle. Boolean functions satisfying higher order propagation
criteria. Proceedings of EUROCRYPT 1991, Lecture Notes in Computer Science 547, pp. 141–152,
1991. See pages 97 and 256.
[970] B. Preneel, W. Van Leekwijck, L. Van Linden, R. Govaerts, and J. Vandewalle. Propagation
characteristics of Boolean functions. Proceedings of EUROCRYPT 1990, Lecture Notes in Computer
Science 473, pp. 161–173, 1991. See pages 76, 97, and 320.
[971] E. Prouff. DPA attacks and S-boxes. Proceedings of Fast Software Encryption FSE 2005, Lecture
Notes in Computer Science 3557, pp. 424–442, 2005.
[972] E. Prouff and M. Rivain. A generic method for secure Sbox implementation. Proceedings of WISA
2007, Lecture Notes in Computer Science 4867, pp. 227–244, 2007. See page 429.
[973] E. Prouff and M. Rivain. Masking against side-channel attacks: a formal security proof. Proceedings
of EUROCRYPT 2013, Lecture Notes in Computer Science 7881, pp. 142–159, 2013. See page 429.
[974] E. Prouff and T. Roche. Higher-order glitches free implementation of the AES using secure
multi-party computation protocols. Proceedings of International Workshop Cryptographic Hardware
and Embedded Systems CHES 2011, Lecture Notes in Computer Science 6917, pp. 63–78, 2011.
Extended version (T. Roche and E. Prouff): Journal of Cryptographic Engineering JCEN 2 (2),
pp. 111–127, 2012. See pages 428, 431, 436, and 447.
[975] C. Qu, J. Seberry, and J. Pieprzyk. Homogeneous bent functions. Discrete Applied Mathematics
102 (1–2), pp. 133–139, 2000. See page 248.
[976] L. Qu, K. Feng, L. Feng, and L. Wang. Constructing symmetric Boolean functions with maximum
algebraic immunity. IEEE Transactions on Information Theory 55, pp. 2406–2412, 2009. See
page 358.
[977] L. Qu, S. Fu, Q. Dai, and C. Li. When a Boolean function can be expressed as the sum of two bent
functions. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2014/48, 2014. See page 242.
[978] L. Qu and C. Li. Weight support technique and the symmetric Boolean functions with maximum
algebraic immunity on even number of variables. Proceedings of INSCRYPT 2007, Lecture Notes in
Computer Science 4990, pp. 271–282. See page 358.
[979] L. Qu, C. Li, and K. Feng. A note on symmetric Boolean functions with maximum alge-
braic immunity in odd number of variables. IEEE Transactions on Information Theory 53,
pp. 2908–2910, 2007. See pages 357 and 358.
[980] L. Qu, Y. Tan, and C. Li. On the Walsh spectrum of a family of quadratic APN functions with five
terms. Science China Information Sciences 57 (2), pp. 1–7, 2014. See page 409.
[981] L. Qu, Y. Tan, C. Li, and G. Gong. More constructions of differentially 4-uniform permutations on
$\mathbb{F}_{2^{2k}}$. Designs, Codes and Cryptography 78 (2), pp. 391–408, 2016. See page 420.
[982] L. Qu, Y. Tan, C. H. Tan, and C. Li. Constructing differentially 4-uniform permutations over $\mathbb{F}_{2^{2k}}$
via the switching method. IEEE Transactions on Information Theory 59 (7), pp. 4675–4686, 2013.
See page 420.
[983] L. Qu, H. Xiong, and C. Li. A negative answer to Bracken–Tan–Tan’s problem on differentially
4-uniform permutations over $\mathbb{F}_{2^n}$. Finite Fields and Their Applications 24, pp. 55–65, 2013. See
page 418.
[984] J.-J. Quisquater and D. Samyde. ElectroMagnetic Analysis (EMA): measures and countermea-
sures for smart cards. Proceedings of E-smart 2001, Lecture Notes in Computer Science 2140,
pp. 200–210, 2001. See page 425.
[985] M. Quisquater. Applications of character theory and the Möbius inversion principle to the study of
cryptographic properties of Boolean functions. PhD thesis, Katholieke Universiteit Leuven, 2004.
See page 69.
[986] M. Quisquater, B. Preneel, and J. Vandewalle. A new inequality in discrete Fourier–Hadamard
theory. IEEE Transactions on Information Theory 49, pp. 2038–2040, 2003. See page 256.
[987] M. Quisquater, B. Preneel, and J. Vandewalle. Spectral characterization of cryptographic Boolean
functions satisfying the (extended) propagation criterion of degree l and order k. Information
Processing Letters 93 (1), pp. 25–28, 2005. See page 320.
[988] C. R. Rao. Factorial experiments derivable from combinatorial arrangements of arrays. J. Royal
Statist. Soc. 9, pp. 128–139, 1947. See pages 86, 87, and 129.
[989] I. S. Reed. A class of multiple-error-correcting codes and the decoding scheme. Transactions of the
IRE Professional Group on Information Theory 4 (4), pp. 38–49, 1954. See page 151.
[990] M. Renauld, F.-X. Standaert, and N. Veyrat-Charvillon. Algebraic side-channel attacks on the AES:
why time also matters in DPA. Proceedings of International Workshop Cryptographic Hardware
and Embedded Systems CHES 2009, Lecture Notes in Computer Science 5747, pp. 97–111, 2009.
See page 436.
[991] S. Renner. Protection des algorithmes cryptographiques embarqués. Thesis, Bordeaux.
(https://tel.archives-ouvertes.fr/tel-01149061/document). See page 147.
[992] O. Reparaz, B. Bilgin, S. Nikova, B. Gierlichs, and I. Verbauwhede. Consolidating
masking schemes. Proceedings of CRYPTO (1) 2015, Lecture Notes in Computer Science 9215,
pp. 764–783, 2015. See page 437.
[993] C. Riera and M. G. Parker. Generalised bent criteria for Boolean functions. IEEE Transactions on
Information Theory 52 (9), pp. 4142–4159, 2006. See pages 191 and 266.
[994] V. Rijmen, P. S. L. M. Barreto, and D. L. G. Filho. Rotation symmetry in algebraically generated
cryptographic substitution tables. Information Processing Letters 106 (6), pp. 246–250, 2008. See
page 362.
[995] V. Rijmen, B. Preneel, and E. De Win. On weaknesses of non-surjective round functions. Designs,
Codes and Cryptography 12 (3), pp. 253–266, 1997. See page 142.
[996] M. Rivain and E. Prouff. Provably secure higher-order masking of AES. Proceedings of
International Workshop Cryptographic Hardware and Embedded Systems CHES 2010, Lecture Notes in
Computer Science 6225, pp. 413–427, 2010. See page 430.
[997] P. Rizomiliotis. On the resistance of Boolean functions against algebraic attacks using univariate
polynomial representation. IEEE Transactions on Information Theory 56, pp. 4014–4024, 2010.
See pages 334, 339, and 340.
[998] P. Rizomiliotis. Improving the higher order nonlinearity lower bound for Boolean functions with
given algebraic immunity. Discrete Applied Mathematics 158 (18), pp. 2049–2055, 2010. See
page 331.
[999] P. Rizomiliotis. On the security of the Feng–Liao–Yang Boolean functions with optimal algebraic
immunity against fast algebraic attacks. Designs, Codes and Cryptography 57 (3), pp. 283–292,
2010. See page 334.
[1000] F. Rodier. Asymptotic nonlinearity of Boolean functions. Designs, Codes and Cryptography 40
(1), pp. 59–70, 2006 (preliminary version: Proceedings of WCC 2003, Workshop on Coding and
Cryptography, pp. 397–405, 2003). See pages 80, 84, and 334.
[1001] S. Rønjom. Improving algebraic attacks on stream ciphers based on linear feedback shift register
over $\mathbb{F}_{2^k}$. Designs, Codes and Cryptography 82, pp. 27–41, 2017. See pages 95 and 96.
[1002] S. Rønjom, G. Gong, and T. Helleseth. On attacks on filtering generators using linear subspace
structures. Proceedings of Sequences, Subsequences, and Consequences, Lecture Notes in Com-
puter Science 4893, pp. 204–217, 2007. See page 95.
[1003] S. Rønjom and T. Helleseth. A new attack on the filter generator. IEEE Transactions on Information
Theory 53 (5), pp. 1752–1758, 2007. See pages 95 and 284.
[1004] S. Rønjom and T. Helleseth. Attacking the filter generator over $GF(2^m)$. Proceedings of
International Workshop on the Arithmetic of Finite Fields WAIFI 2007, Lecture Notes in Computer Science
4547, pp. 264–275, June 2007. See page 95.
[1005] O. S. Rothaus. On “bent” functions. Journal of Combinatorial Theory, Series A 20, pp. 300–305,
1976. See pages 144, 190, 197, 200, 208, and 232.
[1006] R. A. Rueppel. Analysis and Design of Stream Ciphers. Com. and Contr. Eng. Series. Springer,
1986. See pages 21 and 77.
[1007] R. A. Rueppel and O. J. Staffelbach. Products of linear recurring sequences with maximum
complexity. IEEE Transactions on Information Theory 33 (1), pp. 124–131, 1987. See page 77.
[1008] B. V. Ryazanov. On the distribution of the spectral complexity of Boolean functions. Discrete
Mathematics and Applications 4 (3), pp. 279–288, 1994. See pages 109 and 110.
[1009] M.-J. O. Saarinen. Cryptographic analysis of all 4 × 4-bit S-boxes. Proceedings of SAC 2011,
Lecture Notes in Computer Science 7118, pp. 118–133, 2011. See page 144.
[1010] A. Samorodnitsky. Low-degree tests at large distances. Proceedings of ACM STOC 2007,
pp. 506–515 (https://arxiv.org/pdf/math/0604353.pdf), 2007. See page 474.
[1011] P. Sarkar and S. Maitra. Construction of nonlinear Boolean functions with important crypto-
graphic properties. Proceedings of EUROCRYPT 2000, Lecture Notes in Computer Science 1807,
pp. 485–506, 2000. See pages 291, 292, and 320.
[1012] P. Sarkar and S. Maitra. Nonlinearity bounds and constructions of resilient Boolean functions.
Proceedings of CRYPTO 2000, Lecture Notes in Computer Science 1880, pp. 515–532, 2000. See
pages 287, 288, and 291.
[1013] P. Sarkar and S. Maitra. Construction of nonlinear resilient Boolean functions using “small” affine
functions. IEEE Transactions on Information Theory 50 (9), pp. 2185–2193, 2004. See page 295.
[1014] P. Sarkar and S. Maitra. Balancedness and correlation immunity of symmetric Boolean functions.
Discrete Mathematics 307, pp. 2351–2358, 2007. See page 357.
[1015] P. Sarkar and S. Maitra. Construction of rotation symmetric Boolean functions on odd number
of variables with maximum algebraic immunity. Proceedings of AAECC 2007, Lecture Notes in
Computer Science 4851, pp. 271–280, 2007. See page 362.
[1016] S. Sarkar and S. Maitra. Idempotents in the neighbourhood of Patterson–Wiedemann functions
having Walsh spectra zeros. Designs, Codes and Cryptography, 49, pp. 95–103, 2008. See pages 81
and 82.
[1017] D. V. Sarwate and M. B. Pursley. Crosscorrelation properties of pseudorandom and related
sequences. Proceedings of the IEEE 68 (5), pp. 593–619, 1980. Correction in Proceedings of the
IEEE 68 (12), p. 1554, 1980.
[1018] T. Satoh, T. Iwata, and K. Kurosawa. On cryptographically secure vectorial Boolean functions.
Proceedings of ASIACRYPT 1999, Lecture Notes in Computer Science 1716, pp. 20–28, 1999. See
page 270.
[1019] P. Savicky. On the bent Boolean functions that are symmetric. European Journal of Combinatorics
15, pp. 407–410, 1994. See page 355.
[1020] J. Schatz. The second-order Reed–Muller code of length 64 has covering radius 18. IEEE
Transactions on Information Theory 27 (4), pp. 529–530, 1981. See page 158.
[1021] K.-U. Schmidt. Nonlinearity measures of random Boolean functions. Cryptography and Communi-
cations 8 (4), pp. 637–645, 2016. See pages 84 and 159.
[1022] K.-U. Schmidt. Asymptotically optimal Boolean functions. Journal of Combinatorial Theory,
Series A 164, pp. 50–59, 2019. See page 157.
[1023] K.-U. Schmidt and Y. Zhou. Planar functions over fields of characteristic two. CoRR abs/1301.6999,
2013. See page 269.
[1024] M. Schneider. A note on the construction and upper bounds of correlation-immune functions.
Proceedings of IMA Conference on Cryptography and Coding 1997, Lecture Notes in Computer
Science 1355, pp. 295–306, 1997. An extended version appeared under the title “On the con-
struction and upper bounds of balanced and correlation-immune functions,” Selected Areas in
Cryptography (SAC), pp. 73–87, 1997. See page 312.
[1025] J. Seberry and X.-M. Zhang. Constructions of bent functions from two known bent functions.
Australasian Journal of Combinatorics 9, pp. 21–35, 1994. See page 234.
[1026] J. Seberry, X.-M. Zhang, and Y. Zheng. On constructions and nonlinearity of correlation immune
Boolean functions. Proceedings of EUROCRYPT 1993, Lecture Notes in Computer Science 765,
pp. 181–199, 1994. See page 294.
[1027] J. Seberry, X.-M. Zhang, and Y. Zheng. Nonlinearly balanced Boolean functions and their
propagation characteristics. Proceedings of CRYPTO 1993, Lecture Notes in Computer Science
773, pp. 49–60, 1994. See page 81.
[1028] J. Seberry, X.-M. Zhang, and Y. Zheng. Nonlinearity characteristics of quadratic substitution boxes.
Proceedings of Selected Areas in Cryptography (SAC 1994). This paper appeared under the title
“Relationship among nonlinearity criteria” in the Proceedings of EUROCRYPT 1994, Lecture Notes
in Computer Science 950, pp. 376–388, 1995. See page 391.
[1029] N. V. Semakov and V. A. Zinoviev. Balanced codes and tactical configurations. Problems of
Information Transmission 5 (3), pp. 22–28, 1969. See page 255.
[1030] A. Shamir. How to share a secret. Commun. ACM 22 (11), pp. 612–613, 1979. See page 145.
[1031] J. Shan, L. Hu, and X. Zeng. Cryptographic properties of nested functions and algebraic immunity
of the Boolean function in Hitag2 stream cipher. Cryptography and Communications 6 (3),
pp. 233–254, 2014. See page 335.
[1032] A. Shanbhag, V. Kumar, and T. Helleseth. An upper bound for the extended Kloosterman sums over
Galois rings. Finite Fields and Their Applications 4, pp. 218–238, 1998. See page 188.
[1033] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27,
pp. 379–423, 1948. See page 4.
[1034] C. E. Shannon. Communication theory of secrecy systems. Bell System Technical Journal, 28,
pp. 656–715, 1949. See pages 19, 76, and 89.
[1035] C. E. Shannon. The synthesis of two-terminal switching circuits. Bell System Technical Journal, 28,
pp. 59–98, 1949. See page 103.
[1036] R. Shaltiel. Dispersers for affine sources with sub-polynomial entropy. Proceedings of 52nd Annual
Symposium on Foundations of Computer Science, FOCS 2011, pp. 247–256, 2011. See pages 105
and 108.
[1037] T. Shimoyama and T. Kaneko. Quadratic relation of S-box and its application to the linear attack
of full round DES. Proceedings of CRYPTO 1998, Lecture Notes in Computer Science 1462,
pp. 200–211, 1998. See page 122.
[1038] I. Shparlinski. On the singularity of generalised Vandermonde matrices over finite fields. Finite
Fields and Their Applications 11, pp. 193–199, 2005. See page 96.
[1039] L. Shuai, L. Wang, L. Miao, and X. Zhou. Differential uniformity of the composition of two
functions. Cryptography and Communications 12 (2), pp. 205–220, 2020. See page 421.
[1040] V. M. Sidelnikov. On the mutual correlation of sequences. Soviet Math. Dokl. 12, pp. 197–201,
1971. See page 118.
[1041] T. Siegenthaler. Correlation-immunity of nonlinear combining functions for cryptographic applica-
tions. IEEE Transactions on Information Theory 30 (5), pp. 776–780, 1984. See pages 76, 86, 284,
285, 297, and 298.
[1042] T. Siegenthaler. Decrypting a class of stream ciphers using ciphertext only. IEEE Transactions on
Computers C-34 (1), pp. 81–85, 1985. See pages 76 and 87.
[1043] R. Singh, B. Sarma, and A. Saikia. Public key cryptography using permutation p-polynomials over
finite fields. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2009/208, 2009. See page 250.
[1044] S. Smyshlyaev. Perfectly balanced Boolean functions and Golić conjecture. Journal of Cryptology
25, pp. 464–48, 2012. See page 344.
[1045] J. Søreng. The periods of the sequences generated by some symmetric shift registers. Journal of
Combinatorial Theory, Series A 21 (2), pp. 164–187, 1976. See page 23.
[1046] J. Søreng. Symmetric shift registers. Pacific J. Math. 85 (1), pp. 201–229, 1979. See page 23.
[1047] F.-X. Standaert, N. Veyrat-Charvillon, E. Oswald, et al. The world is not enough: another look on
second-order DPA. IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2010/180, 2010. See
page 427.
[1048] P. Stănică, S. Maitra, and J. Clark. Results on rotation symmetric bent and correlation immune
Boolean functions. Proceedings of Fast Software Encryption FSE 2004, Lecture Notes in Computer
Science 3017, pp. 161–177, 2004. See page 360.
[1049] D. R. Stinson. Resilient functions and large sets of orthogonal arrays. Congressus Numer. 92,
pp. 105–110, 1993. See page 314.
[1050] D. R. Stinson and J. L. Massey. An infinite class of counterexamples to a conjecture concerning
nonlinear resilient functions. Journal of Cryptology 8 (3), pp. 167–173, 1995. See page 314.
[1051] K. Stoffelen. Optimizing S-box implementations for several criteria using SAT solvers. Proceedings
of Fast Software Encryption FSE 2016, Lecture Notes in Computer Science 9783, pp. 140–160,
2016. See page 144.
[1052] V. Strassen. Gaussian elimination is not optimal. Numerische Math. 13, pp. 354–356, 1969. See
page 90.
[1053] I. Strazdins. Universal affine classification of Boolean functions. Acta Applicandae Mathematicae
46, pp. 147–167, 1997. See page 155.
[1054] S. Su and X. Tang. Systematic constructions of rotation symmetric bent functions, 2-rotation
symmetric bent functions, and bent idempotent functions. IEEE Transactions on Information
Theory 63 (7), pp. 4658–4667, 2017. See also: On the systematic constructions of rotation
symmetric bent functions with any possible algebraic degrees. IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2015/451, 2015. See page 252.
[1055] W. Su, X. Zeng, and L. Hu. Construction of 1-resilient Boolean functions with optimum algebraic
immunity. International Journal of Computer Mathematics 88 (2), pp. 222–238, 2011. See
page 344.
[1056] T. Sugawara. 3-share threshold implementation of AES S-box without fresh randomness. IACR
Transactions on Cryptographic Hardware and Embedded Systems 2019 (1), pp. 123–145, 2019.
See page 443.
[1057] T. Sugita, T. Kasami, and T. Fujiwara. Weight distributions of the third and fifth order Reed–Muller
codes of length 512. Nara Inst. Sci. Tech. Report, 1996. See page 155.
[1058] T.-F. Sun, B. Hu, Y. Liu, and L.-P. Xu. On primary construction of plateaued functions. Proceedings
of 3rd International Conference on Material, Mechanical and Manufacturing Engineering IC3ME
2015, Atlantis Press, 2015. See page 265.
[1059] S. H. Sung, S. Chee, and C. Park. Global avalanche characteristics and propagation criterion of
balanced Boolean functions. Information Processing Letters 69, pp. 21–24, 1999.
[1060] Y. Tan, G. Gong, and B. Zhu. Enhanced criteria on differential uniformity and nonlinearity of
cryptographically significant functions. Cryptography and Communications 8 (2), pp. 291–311,
2016. See page 136.
[1061] Y. Tan, L. Qu, S. Ling, and C. H. Tan. On the Fourier spectra of new APN functions. SIAM Journal
on Discrete Mathematics 27 (2), pp. 791–801, 2013. See page 412.
[1062] C. Tang and Y. Qi. Constructing hyper-bent functions from Boolean functions with the Walsh
spectrum taking the same value twice. Proceedings of International Conference on Sequences
and Their Applications SETA 2014, Lecture Notes in Computer Science, pp. 60–71, 2014. See
page 247.
[1063] C. Tang and Y. Qi. A class of hyper-bent functions and Kloosterman sums. Cryptography and
Communications 9 (5), pp. 647–664, 2017. See page 247.
[1064] C. Tang, Y. Qi, and M. Xu. Multiple output bent functions characterized by families of bent
functions. Journal of Cryptologic Research 1 (4), pp. 321–326, 2014. See page 272.
[1065] C. Tang, Y. Qi, Z. Zhou, and C. Fan. Two infinite classes of rotation symmetric bent functions
with simple representation. Applicable Algebra in Engineering, Communication and Computing
(AAECC) 29 (3), pp. 197–208, 2018. See pages 249 and 251.
[1066] D. Tang, C. Carlet, and X. Tang. On the second-order nonlinearities of some bent functions.
Information Sciences 223, pp. 322–330, 2013. See page 85.
[1067] D. Tang, C. Carlet, and X. Tang. Highly nonlinear Boolean functions with optimal algebraic
immunity and good behavior against fast algebraic attacks. IEEE Transactions on Information
Theory 59 (1), pp. 653–664, 2013 (preliminary version available in IACR Cryptology ePrint
Archive, http://eprint.iacr.org/2011/366, 2011). See pages 332 and 340.
[1068] D. Tang, C. Carlet, and X. Tang. A class of 1-resilient Boolean functions with optimal algebraic
immunity and good behavior against fast algebraic attacks. International Journal of Foundations of
Computer Science 25 (6), pp. 763–780, 2014. See page 340.
[1069] D. Tang, C. Carlet, and X. Tang. Differentially 4-uniform bijections by permuting the inverse
function. Designs, Codes and Cryptography 77 (1), pp. 117–141, 2015. See page 421.
[1070] D. Tang, C. Carlet, X. Tang, and Z. Zhou. Construction of highly nonlinear 1-resilient Boolean
functions with optimal algebraic immunity and provably high fast algebraic immunity. IEEE
Transactions on Information Theory 63 (9), pp. 6113–6125, 2017. See page 344.
[1071] D. Tang, C. Carlet, and Z. Zhou. Binary linear codes from vectorial Boolean functions and their
weight distribution. Discrete Mathematics 340 (12), pp. 3055–3072, 2017. See page 161.
[1072] D. Tang, S. Kavut, B. Mandal, and S. Maitra. Modifying Maiorana–McFarland type bent functions
for good cryptographic properties and efficient implementation. SIAM Journal on Discrete
Mathematics 33 (1), pp. 238–256, 2019. See page 99.
[1073] D. Tang and J. Liu. A family of weightwise (almost) perfectly balanced Boolean functions with
optimal algebraic immunity. Special Issue on Boolean Functions and Their Applications 2018,
Cryptography and Communications 11 (6), pp. 1185–1197, 2019. See page 459.
[1074] D. Tang and S. Maitra. Construction of n-variable (n ≡ 2 mod 4) balanced Boolean functions
with maximum absolute value in autocorrelation spectra $< 2^{n/2}$. IEEE Transactions on Information
Theory 64 (1), pp. 393–402, 2018. See page 320.
[1075] C. Tang, Z. Zhou, Y. Qi, X. Zhang, C. Fan, and T. Helleseth. Generic construction of bent functions
and bent idempotents with any possible algebraic degrees. IEEE Transactions on Information
Theory 63 (10), pp. 6149–6157, 2017. See page 251.
[1076] X. H. Tang, D. Tang, X. Zeng, and L. Hu. Balanced Boolean functions with (almost) optimal
algebraic immunity and very high nonlinearity. IACR Cryptology ePrint Archive
(http://eprint.iacr.org/) 2010/443, 2010. See pages 340 and 344.
[1077] H. Taniguchi. On some quadratic APN functions. Designs, Codes and Cryptography 87 (9),
pp. 1973–1983, 2019. See pages 407 and 410.
[1078] T. Tao. Structure and randomness in combinatorics. Proceedings of FOCS 2007, pp. 3–15 (also
arXiv:0707.4269v2[math.CO], 2007).
[1079] H. Tapia-Recillas and G. Vega. An upper bound on the number of iterations for transforming a
Boolean function of degree greater than or equal than 4 to a function of degree 3. Designs, Codes
and Cryptography 24, pp. 305–312, 2001.
[1080] Y. V. Tarannikov. On resilient Boolean functions with maximum possible nonlinearity. Proceedings
of INDOCRYPT 2000, Lecture Notes in Computer Science 1977, pp. 19–30, 2000. See pages 287,
288, 291, and 300.
[1081] Y. V. Tarannikov. New constructions of resilient Boolean functions with maximum nonlinearity.
Proceedings of Fast Software Encryption FSE 2001, Lecture Notes in Computer Science 2355,
pp. 66–77, 2001. See pages 287 and 291.
[1082] Y. V. Tarannikov and D. Kirienko. Spectral analysis of high order correlation immune functions.
Proceedings of 2001 IEEE International Symposium on Information Theory, p. 69, 2001. Preliminary
version available in IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2000/050, 2000.
See pages 101 and 313.
[1083] Y. V. Tarannikov, P. Korolev, and A. Botev. Autocorrelation coefficients and correlation immunity
of Boolean functions. Proceedings of ASIACRYPT 2001, Lecture Notes in Computer Science 2248,
pp. 460–479, 2001. See pages 287, 290, and 313.
[1084] A. Tardy-Corfdir and H. Gilbert. A known plaintext attack on FEAL-4 and FEAL-6. Proceedings of
CRYPTO 1991, Lecture Notes in Computer Science 576, pp. 172–181, 1991. See page 115.
[1085] E. Thomé. Subquadratic computation of vector generating polynomials and improvement of the
block Wiedemann algorithm. Journal of Symbolic Computation 33 (5), pp. 757–775, 2002. See
page 21.
[1086] Y. Todo. Structural evaluation by generalized integral property. Proceedings of EUROCRYPT 2015,
Part I. Lecture Notes in Computer Science 9056, pp. 287–314, 2015. See page 114.
[1087] N. Tokareva. On the number of bent functions from iterative constructions: lower bounds and
hypotheses. Advances in Mathematics of Communications 5 (4), pp. 609–621, 2011. See page 242.
[1088] N. Tokareva. Duality between bent functions and affine functions. Discrete Mathematics 312,
pp. 666–670, 2012. See page 191.
[1089] N. Tokareva. Bent Functions: Results and Applications to Cryptography. Elsevier, 2015. See
page 190.
[1090] S. Tsai. Lower bounds on representing Boolean functions as polynomials in $\mathbb{Z}_m^*$. SIAM Journal on
Discrete Mathematics 9 (1), pp. 55–62, 1996. See page 37.
[1091] Z. Tu and Y. Deng. A conjecture on binary string and its applications on constructing Boolean
functions of optimal algebraic immunity. Designs, Codes and Cryptography 60 (1), pp. 1–14, 2011.
See pages 332, 337, and 339.
[1092] Z. Tu and Y. Deng. Boolean functions optimizing most of the cryptographic criteria. Discrete
Applied Mathematics 160 (4), pp. 427–435, 2012. See page 344.
[1093] Z. Tu, D. Zheng, X. Zeng, and L. Hu. Boolean functions with two distinct Walsh coefficients.
Applicable Algebra in Engineering, Communication and Computing (AAECC) 22 (5–6),
pp. 359–366, 2011. See page 190.
[1094] K. Varici, S. Nikova, V. Nikov, and V. Rijmen. Constructions of S-boxes with uniform sharing.
Cryptography and Communications 11 (3), pp. 385–398, 2019. Preliminary version in IACR
Cryptology ePrint Archive (http://eprint.iacr.org/) 2018/92, 2018. See pages 442 and 479.
[1095] S. Vaudenay. On the need for multipermutations: cryptanalysis of MD4 and SAFER. Proceedings of
Fast Software Encryption FSE 1995, Lecture Notes in Computer Science 1008, pp. 286–297, 1995.
See page 129.
[1096] I. Villa. On APN functions $L_1(x^3) + L_2(x^9)$ with linear $L_1$ and $L_2$. Cryptography and
Communications 11 (1), pp. 3–20, 2019. See page 406.
[1097] S. F. Vinokurov and N. A. Peryazev. An expansion of Boolean function into a sum of products of
subfunctions. Discrete Mathematics and Applications 3 (5), pp. 531–533, 1993.
[1098] J. F. Voloch. Symmetric cryptography and algebraic curves. Proceedings of “The First Symposium
on Algebraic Geometry and Its Applications” Dedicated to Gilles Lachaud (SAGA’07), Tahiti, 2007,
Published by World Scientific, Series on Number Theory and Its Applications 5, pp. 135–141, 2008.
See pages 136 and 369.
[1099] T. Wadayama, T. Hada, K. Wagasugi, and M. Kasahara. Upper and lower bounds on the maximum
nonlinearity of n-input m-output Boolean functions. Designs, Codes and Cryptography 23,
pp. 23–33, 2001. See pages 160, 370, and 378.
[1100] J. Waddle and D. Wagner. Towards efficient second-order power analysis. Proceedings of
International Workshop on Cryptographic Hardware and Embedded Systems CHES 2004, Lecture Notes in
Computer Science 3156, pp. 1–15, 2004. See page 427.
[1101] D. Wagner. The boomerang attack. Proceedings of Fast Software Encryption FSE 1999, Lecture
Notes in Computer Science 1636, pp. 156–170, 1999. See page 141.
[1102] Q. Wang. The covering radius of the Reed–Muller code RM(2, 7) is 40. Discrete Mathematics 342
(12), p. 111625, 2019. See pages 157 and 158.
[1103] H. Wang, J. Peng, Y. Li, and H. Kan. On 2k-variable symmetric Boolean functions with maximum
algebraic immunity k. IEEE Transactions on Information Theory 58 (8), pp. 5612–5624, 2012. See
page 358.
[1104] Q. Wang. Hadamard matrices, d-linearly independent sets and correlation-immune Boolean
functions with minimum Hamming weights. Designs, Codes and Cryptography 87 (10),
pp. 2321–2333, 2019. See also IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2018/284.
See pages 291, 304, 305, and 306.
[1105] Q. Wang, C. Carlet, P. Stănică, and C. H. Tan. Cryptographic properties of the hidden weighted bit
function. Discrete Applied Mathematics 174, pp. 1–10, 2014. See page 343.
[1106] Q. Wang and T. Johansson. A note on fast algebraic attacks and higher order nonlinearities.
Proceedings of INSCRYPT 2010, Lecture Notes in Computer Science 6584, pp. 84–98, 2010. See
pages 94, 332, and 340.
[1107] Q. Wang and H. Kan. A note on the construction of differentially uniform permutations using
extension fields. IEICE Transactions 95-A (11), pp. 2080–2083, 2012. See page 419.
[1108] Q. Wang and P. Stănică. A trigonometric sum sharp estimate and new bounds on the nonlinearity of
some cryptographic Boolean functions. Designs, Codes and Cryptography 87 (8), pp. 1749–1763,
2019. See page 340.
[1109] Q. Wang, C. H. Tan, and T. F. Prabowo. On the covering radius of the third order Reed–Muller code
RM(3, 7). Designs, Codes and Cryptography 86 (1), pp. 151–159, 2018. See page 158.
[1110] Z. Wang and G. Gong. Discrete Fourier transform of Boolean functions over the complex field
and its applications. IEEE Transactions on Information Theory 64 (4) (Special Issue in Honor of
Solomon Golomb), pp. 3000–3009, 2018. See pages 44, 52, 53, and 88.
[1111] T. Wang, M. Liu, and D. Lin. Construction of resilient and nonlinear Boolean functions with almost
perfect immunity to algebraic and fast algebraic attacks. Information Security and Cryptology.
Inscrypt 2012, M. Kutyłowski and M. Yung, eds. Lecture Notes in Computer Science, vol. 7763,
Springer, pp. 276–293, 2013. See page 344.
[1112] Z. Wang and M. Karpovsky. Algebraic manipulation detection codes and their applications for
design of secure cryptographic devices. Proceedings of On-Line Testing Symposium (IOLTS), 2011,
pp. 234–239, 2011. See pages 451 and 452.
[1113] Z. Wang and M. Karpovsky. New error detecting codes for the design of hardware resistant to strong
fault injection attacks. Proceedings of International Conference on Security and Management, SAM,
2012. See pages 451, 452, and 453.
[1114] Z. Wang, M. Karpovsky, and K. J. Kulikowski. Design of memories with concurrent error detection
and correction by nonlinear SEC-DED codes. Journal of Electronic Testing 26 (5), pp. 559–580,
2010. See page 453.
[1115] Z. Wang, X. Zhang, S. Wang, Z. Zheng, and W. Wang. Construction of Boolean functions with
excellent cryptographic criteria using bivariate polynomial representation. International Journal of
Computer Mathematics, 93 (3), pp. 425–444, 2016. See page 344.
[1116] A. F. Webster and S. E. Tavares. On the design of S-boxes. Proceedings of CRYPTO 1985, Lecture
Notes in Computer Science 219, pp. 523–534, 1985. See page 97.
[1117] I. Wegener. The Complexity of Boolean Functions. John Wiley & Sons, 1987. See pages 19 and 352.
[1118] G. Weng, Y. Tan, and G. Gong. On almost perfect nonlinear functions and their related algebraic
objects. Proceedings of International Workshop on Coding and Cryptography, pp. 48–57, 2013.
See pages 393 and 399.
[1119] K. S. Williams. Note on cubics over $GF(2^n)$ and $GF(3^n)$. Journal of Number Theory 7 (4),
pp. 361–365, 1975. See page 495.
[1120] J. Wolfmann. Bent functions and coding theory. Difference Sets, Sequences and their Correlation
Properties, A. Pott, P. V. Kumar, T. Helleseth, and D. Jungnickel, eds., Kluwer, pp. 393–417, 1999.
See pages 160, 195, and 196.
[1121] J. Wolfmann. Cyclic code aspects of bent functions. Finite Fields Theory and Applications,
Contemporary Mathematics Series of the AMS, Amer. Math Soc., vol. 518, pp. 363–384, 2010.
See page 262.
[1122] J. Wolfmann. Sequences of bent and near-bent functions. Cryptography and Communications 9 (6),
pp. 729–736, 2017. See page 240.
[1123] B. Wu. PS bent functions constructed from finite pre-quasifield spreads.
http://arxiv.org/abs/1308.3355, 2013. See pages 217, 225, and 226.
[1124] B. Wu, J. Zheng, and D. Lin. Constructing Boolean functions with (potentially) optimal algebraic
immunity based on multiplicative decompositions of finite fields. Proceedings of IEEE International
Symposium on Information Theory (ISIT) 2015, pp. 491–495, 2015. See page 340.
[1125] C.-K. Wu and D. Feng. Boolean Functions and Their Applications in Cryptography. Springer, 2016.
See page 76.
[1126] T. Xia, J. Seberry, J. Pieprzyk, and C. Charnes. Homogeneous bent functions of degree n in 2n
variables do not exist for n > 3. Discrete Applied Mathematics 142 (1–3), pp. 127–132, 2004. See
page 248.
[1127] Y. Xia, N. Li, X. Zeng, and T. Helleseth. An open problem on the distribution of a Niho type cross-
correlation function. IEEE Transactions on Information Theory 62 (12), pp. 7546–7554, 2016. See
page 384.
[1128] G.-Z. Xiao and J. L. Massey. A spectral characterization of correlation-immune combining
functions. IEEE Transactions on Information Theory 34 (3), pp. 569–571, 1988. See page 87.
[1129] M. Xiong, H. Yan, and P. Yuan. On a conjecture of differentially 8-uniform power functions.
Designs, Codes and Cryptography 86 (8), pp. 1601–1621, 2018. See page 422.
[1130] G. Xu and X. Cao. Constructing new piecewise differentially 4-uniform permutations from known
APN functions. International Journal of Foundations of Computer Science 26 (5), pp. 599–609,
2015. See page 420.
[1131] G. Xu, X. Cao, and S. Xu. Several new classes of Boolean functions with few Walsh transform
values. Applicable Algebra in Engineering, Communication and Computing (AAECC) 28 (2),
pp. 155–176, 2017. See pages 231 and 263.
[1132] G. Xu and L. Qu. Two classes of differentially 4-uniform permutations over $\mathbb{F}_2^n$ with n even. To
appear in Advances in Mathematics of Communications 14 (1), 2020. See page 420.
[1133] Y. Xu, C. Carlet, S. Mesnager, and C. Wu. Classification of bent monomials, constructions of bent
multinomials and upper bounds on the nonlinearity of vectorial functions. IEEE Transactions on
Information Theory 64 (1), pp. 367–383, 2018. See pages 122, 140, and 272.
[1134] X. Yang and J. L. Massey. The condition for a cyclic code to have a complementary dual. Discrete
Mathematics 126, pp. 391–393, 1994. See page 446.
[1135] Y. X. Yang and B. Guo. Further enumerating Boolean functions of cryptographic significance.
Journal of Cryptology 8 (3), pp. 115–122, 1995. See pages 356 and 357.
[1136] A. C. Yao. Protocols for secure computations (extended abstract). Proceedings of FOCS 1982,
pp. 160–164, 1982. See page 146.
[1137] R. Yarlagadda and J. E. Hershey. Analysis and synthesis of bent sequences. IEE Proceedings. Part
E. Computers and Digital Techniques 136, pp. 112–123, 1989. See pages 231 and 293.
[1138] K. Yasunaga and T. Fujiwara. On correctable errors of binary linear codes. IEEE Transactions on
Information Theory 56 (6), pp. 2537–2548, 2010. See page 79.
[1139] S. Yoshiara. Equivalences of quadratic APN functions. Journal of Algebraic Combinatorics 35,
pp. 461–475, 2012. See pages 30 and 281.
[1140] S. Yoshiara. Equivalences of power APN functions with power or quadratic APN functions. Journal
of Algebraic Combinatorics 44 (3), pp. 561–585, 2016. See pages 281, 396, 398, 402, 405, and 406.
[1141] S. Yoshiara. Equivalences among plateaued APN functions. Designs, Codes and Cryptography 85
(2), pp. 205–217, 2017. See pages 281 and 402.
[1142] S. Yoshiara. Plateaudness of Kasami APN functions. Finite Fields and Their Applications 47,
pp. 11–32, 2017. See pages 382 and 400.
[1143] A. M. Youssef and G. Gong. Hyper-bent functions. Proceedings of EUROCRYPT 2001, Lecture
Notes in Computer Science 2045, Berlin, pp. 406–419, 2001. See pages 244 and 272.
[1144] N. Y. Yu and G. Gong. Constructions of quadratic bent functions in polynomial forms. IEEE
Transactions on Information Theory 52 (7), pp. 3291–3299, 2006. See pages 206, 230, 231, and 250.
[1145] Y. Yu, M. Wang, and Y. Li. A matrix approach for constructing quadratic APN functions.
Proceedings of International Workshop on Coding and Cryptography, pp. 39–47, 2013, and
Designs, Codes and Cryptography 73 (2), pp. 587–600, 2014. See pages 392, 393, and 399.
[1146] Y. Yu, M. S. Wang, and Y. Q. Li. Constructing low differential uniformity functions from known
ones. Chinese Journal of Electronics 22 (3), pp. 495–499, 2013. See also IACR Cryptology ePrint
Archive (http://eprint.iacr.org/) 2011/47, entitled “Constructing differential 4-uniform permutations
from known ones.” See page 420.
[1147] J. Yuan, C. Carlet, and C. Ding. The weight distribution of a class of linear codes from perfect
nonlinear functions. IEEE Transactions on Information Theory 52 (2), pp. 712–717, 2006. See
page 160.
[1148] X. Zeng, C. Carlet, J. Shan, and L. Hu. More balanced Boolean functions with optimal algebraic
immunity and good nonlinearity and resistance to fast algebraic attacks. IEEE Transactions on
Information Theory 57 (9), pp. 6310–6320, 2011. See pages 337 and 339.
[1149] X. Zeng and L. Hu. Constructing Boolean functions by modifying Maiorana–McFarland’s
superclass functions. IEICE Transactions on Fundamentals of Electronics, Communications and
Computer Sciences 88-A (1), pp. 59–66, 2005. See page 340.
[1150] Z. Zha, L. Hu, and S. Sun. Constructing new differentially 4-uniform permutations from the inverse
function. Finite Fields and Their Applications 25, pp. 64–78, 2014. See page 420.
[1151] Z. Zha, L. Hu, S. Sun, and J. Shan. Further results on differentially 4-uniform permutations over
$\mathbb{F}_{2^{2m}}$. Science China Mathematics 58 (7), pp. 1577–1588, 2015. See page 420.
[1152] F. Zhang, C. Carlet, Y. Hu, and T.-J. Cao. Secondary constructions of highly nonlinear Boolean
functions and disjoint spectra plateaued functions. Information Sciences 283, pp. 94–106, 2014.
See page 266.
[1153] F. Zhang, C. Carlet, Y. Hu, and W. Zhang. New secondary constructions of bent functions.
Applicable Algebra in Engineering, Communication and Computing (AAECC) 27 (5), pp. 413–434,
2016. See page 234.
[1154] F. Zhang, E. Pasalic, N. Cepak, and Y. Wei. Bent functions in C and D outside the completed
Maiorana–McFarland class. Proceedings of C2SI 2017, Lecture Notes in Computer Science 10194,
pp. 298–313, 2017. See page 211.
[1155] M. Zhang. Maximum correlation analysis of nonlinear combining functions in stream ciphers.
Journal of Cryptology 13 (3), pp. 301–313, 2000. See pages 101 and 290.
[1156] M. Zhang and A. Chan. Maximum correlation analysis of nonlinear S-boxes in stream ciphers.
Proceedings of CRYPTO 2000, Lecture Notes in Computer Science 1880, pp. 501–514, 2000. See
pages 131 and 132.
[1157] W. Zhang. High-meets-low: construction of strictly almost optimal resilient Boolean functions via
fragmentary Walsh spectra. IEEE Transactions on Information Theory 65 (9), pp. 5856–5864, 2019.
See page 317.
[1158] W. Zhang, Z. Bao, V. Rijmen, and M. Liu. A new classification of 4-bit optimal S-boxes and
its application to PRESENT, RECTANGLE and SPONGENT. Proceedings of Fast Software
Encryption FSE 2015, Lecture Notes in Computer Science 9054, pp. 494–515, 2015; see also
https://eprint.iacr.org/2015/433. See page 144.
[1159] W. Zhang, L. Li, and E. Pasalic. Construction of resilient S-boxes with higher-dimensional vectorial
outputs and strictly almost optimal non-linearity. IET Information Security 11 (4), pp. 199–203,
2017. See page 317.
[1160] W. Zhang and E. Pasalic. Highly nonlinear balanced S-boxes with good differential properties.
IEEE Transactions on Information Theory 60 (12), pp. 7970–7979, 2014. See page 422.
[1161] W. Zhang and E. Pasalic. Constructions of resilient S-Boxes with strictly almost optimal nonlinearity
through disjoint linear codes. IEEE Transactions on Information Theory 60 (3), pp. 1638–1651,
2014. See page 317.
[1162] W. Zhang, Z. Xing, and K. Feng. A construction of bent functions with optimal algebraic degree
and large symmetric group. Advances in Mathematics of Communications 14 (1), pp. 23–33, 2020.
See also IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2017/197. See page 192.
[1163] W.-G. Zhang and E. Pasalic. Generalized Maiorana–McFarland construction of resilient Boolean
functions with high nonlinearity and good algebraic properties. IEEE Transactions on Information
Theory 60 (10), pp. 6681–6695, 2014. See page 295.
[1164] W. G. Zhang and G. Z. Xiao. Constructions of almost optimal resilient Boolean functions on large
even number of variables. IEEE Transactions on Information Theory 55 (12), pp. 5822–5831, 2009.
See page 295.
[1165] X. Zhang, X. Cao, and R. Feng. A method of evaluation of exponential sum of binary quadratic
functions. Finite Fields and Their Applications 18, pp. 1089–1103, 2012. See page 179.
[1166] X. Zhang and M. Zhou. Construction of CCZ transform for quadratic APN functions. Cognitive
Systems Research 57, pp. 41–45, 2019. See page 404.
[1167] X.-M. Zhang and Y. Zheng. GAC – the criterion for global avalanche characteristics of cryptographic
functions. Journal of Universal Computer Science, 1 (5), pp. 320–337, 1995. See pages 97
and 320.
[1168] X.-M. Zhang and Y. Zheng. On nonlinear resilient functions. Proceedings of EUROCRYPT 1995,
Lecture Notes in Computer Science 921, pp. 274–288, 1995. See pages 129 and 317.
[1169] X.-M. Zhang and Y. Zheng. Auto-correlations and new bounds on the nonlinearity of Boolean
functions. Proceedings of EUROCRYPT 1996, Lecture Notes in Computer Science 1070,
pp. 294–306, 1996. See pages 81, 98, and 318.
[1170] X.-M. Zhang and Y. Zheng. Cryptographically resilient functions. IEEE Transactions on Information
Theory 43, pp. 1740–1747, 1997. See page 317.
[1171] X.-M. Zhang and Y. Zheng. The nonhomomorphicity of Boolean functions. Proceedings of SAC
1998, Lecture Notes in Computer Science 1556, pp. 280–295, 1999. See page 111.
[1172] J. Zheng, B. Wu, Y. Chen, and Z. Liu. Constructing 2m-variable Boolean functions with optimal
algebraic immunity based on polar decomposition of $\mathbb{F}_{2^{2m}}^*$. International Journal of Foundations of
Computer Science 25 (5), pp. 537–552, 2014. See page 340.
[1173] Y. Zheng and X. M. Zhang. Plateaued functions. Proceedings of ICICS 1999, Lecture Notes in
Computer Science 1726, pp. 284–300, 1999. See pages 81, 98, 190, 258, 259, and 264.
[1174] Y. Zheng and X. M. Zhang. Relationships between bent functions and complementary plateaued
functions. Lecture Notes in Computer Science 1787, pp. 60–75, 1999. See page 258.
[1175] Y. Zheng and X.-M. Zhang. On relationships among avalanche, nonlinearity and correlation
immunity. Proceedings of ASIACRYPT 2000, Lecture Notes in Computer Science 1976,
pp. 470–483, 2000. See page 290.
[1176] Y. Zheng and X. M. Zhang. On plateaued functions. IEEE Transactions on Information Theory 47
(3), pp. 1215–1223, 2001. See page 258.
[1177] Y. Zheng and X.-M. Zhang. Improving upper bound on the nonlinearity of high order correlation
immune functions. Proceedings of Selected Areas in Cryptography 2000, Lecture Notes in
Computer Science 2012, pp. 262–274, 2001. See page 287.
[1178] Y. Zheng and X.-M. Zhang. On balanced nonlinear Boolean functions. NATO Science for Peace
and Security Series – D: Information and Communication Security, IOS Press, Vol 18: Boolean
Functions in Cryptology and Information Security, pp. 243–282, 2008. See pages 257, 258, and 320.
[1179] Y. Zheng, X.-M. Zhang, and H. Imai. Restriction, terms and nonlinearity of Boolean functions.
Theoretical Computer Science, 226 (1–2), pp. 207–223, 1999. See pages 71, 106, and 153.
[1180] Y. Zhou. On the distribution of auto-correlation value of balanced Boolean functions. Advances in
Mathematics of Communications 7 (3), pp. 335–347, 2013. See page 98.
[1181] Y. Zhou and C. Li. The Walsh spectrum of a new family of APN functions. Proceedings of WSPC,
2008. See also IACR Cryptology ePrint Archive (http://eprint.iacr.org/) 2008/154. See page 408.
[1182] Y. Zhou and A. Pott. A new family of semifields with 2 parameters. Advances in Mathematics 234,
pp. 43–60, 2013. See pages 407 and 410.
[1183] N. Zierler and W. H. Mills. Products of linear recurring sequences. Journal of Algebra 27,
pp. 147–157, 1973. See page 77.
[1184] M. Zieve. On a theorem of Carlitz. Journal of Group Theory 17, pp. 667–669, 2014. See page 442.
Index

2-design, 201
2-weight, 45, 155, 324, 327
(n, m)-function, 24, 30, 51, 71, 113, 127, 186, 268, 313, 344, 348, 369, 415, 437, 476
(n, m, t)-function, 129, 313
[n, k, d]-code, 7, 292
$[n, k, d]_q$-code, 7
Z-bent function, 239
k-normal function, 105
k-th order derivative, 39, 159
k-weakly normal function, 105
n-variable, 17
p-ary function, 159
p-polynomial, 491
r-th order nonlinearity, 83
s-th order Kronecker sum, 310

AB, 370, 478, 479
absolute indicator, 97
absolute trace function, 42
absolute trace representation, 43
access structure, 147
address function, 68, 264
adjoint operator, 63
Advanced Encryption Standard (AES), 25
affine disperser, 105
affine function, 37, 79, 131, 152, 166, 190, 369
affine invariant, 29, 88, 92, 97, 101, 102, 114, 443
affine plane, 137, 218, 220
affinely equivalent, 28, 103, 165, 335
algebraic attack, 89, 189, 321
algebraic degree, 35, 40, 63, 91, 99, 102, 151, 190, 202, 213, 222, 242, 251, 259, 287, 318, 321, 337, 476
algebraic immunity, 91, 127, 321, 344
algebraic immunity with inputs in E, 464
algebraic manipulation, 450
algebraic manipulation detection codes, 450
algebraic normal form, 31, 39, 103, 213, 248, 375, 476
algebraic thickness, 103
almost bent, 119, 370
almost perfect nonlinear, 137, 371
ambiguity, 139
AMD code, 450
amplitude, 258
ANF, 31, 90, 128, 174, 195, 200, 292, 320, 335, 353, 359, 441
annihilator, 91, 127
APN, 137, 371, 478
APN exponent, 383
arithmetic Walsh transform, 57
asymmetric cryptography, 2
atomic function, 31
attacker model, 20, 425
authentication scheme, 149
autocorrelation function, 61, 71
automorphism group, 88
automorphism group of function, 72, 192
Ax’s theorem, 156

balanced, 16, 76, 112, 252, 475–477
balanced incomplete block design, 201
basic algebraic immunity, 127
BCH bound, 13, 338, 387
bent, 80, 370, 449, 476, 477, 479
bent concatenation bound, 81
bent concatenation construction, 234
bent exponents, 230
bent monomial Boolean univariate functions, 229
bent$_4$, 267
Berlekamp–Massey (BM) algorithm, 21
best affine approximation, 57
bias in the output distribution, 134
big open APN problem, 410
binary entropy function, 289
binary expansion, 45
binary Möbius transform, 33, 37, 353
binomial AB functions, 397
binomial APN functions, 405
bit-probing security, 429, 445
bivariate APN functions, 409
bivariate representation, 47, 207, 209, 213, 231, 249, 270, 361
black box attacker model, 425


block cipher, 3
Boolean functions in dimension n, 17
Boolean masking, 428, 444, 445, 447
boomerang uniformity, 141
bounded moment security model, 429
Brinkmann–Leander–Edel–Pott function, 408
butterfly construction, 411, 421

CAPN, 391, 478
Carlet–Feng function, 337
Cayley graph, 70
CCZ equivalence, 29, 345, 370, 403, 478
CCZ equivalent, 28, 72, 115, 281, 380, 475
CCZ invariant, 29, 117, 136, 138, 275, 403
changing of the guards, 443
characteristic polynomial, 21
ciphertext, 1
classical Singer set, 416
classify, 27, 478
cosupport, 201
code, 5
code design, 201
codebook, 34
codeword, 4
coding theory, 4
coincident functions, 37
collineation, 221
combiner model, 21, 86, 126, 284
complementary information set code (CIS), 432
complete class, 192
completed Maiorana–McFarland class, 165, 173
complexity parameters, 103
component algebraic immunity, 127
component functions, 40, 112, 115, 122, 268, 272, 274, 278, 345, 369, 414, 439, 493
componentwise APN, 391, 478
componentwise Walsh uniform, 414
composite function, 39, 64, 371
concatenated code, 9
confusion, 76
continuous side-channel attacks, 427
conventional cryptography, 2
coordinate functions, 8, 24
Coron–Roy–Vivek (CRV) method, 434
correction capacity, 5
correctness, 437
correlation attack, 87
correlation immune, 86, 129, 313
correlation immunity order, 86
correlation power analysis, 426
coset, 7
coset leader, 7, 43, 79, 475
Courtois–Meier bound, 92, 469
covered, 33, 341
covering radius, 6, 81, 157, 159, 205
covering radius bound, 80, 118, 133, 269
covering sequence, 182
CPA, 426
CPRR method, 435
crooked, 231, 278
cryptanalysis, 1
cryptography, 1
cryptology, 2
cryptosystem, 1
cubic functions, 36, 115, 248
cubic sums, 177
CWU, 414
cyclic code, 12, 155, 215, 245, 249, 262, 326, 341, 387, 446
cyclic difference set with Singer parameters, 415
cyclic-additive difference set, 416
cyclotomic class, 43
cyclotomic method, 434

Data Encryption Standard (DES), 25
DDT, 135
dealer, 146
decomposable functions, 232
decryption, 1
defining set, 12, 387
derivative, 38, 52, 61, 67, 81, 98, 275, 414, 476, 477
derivative imbalance, 138
Desarguesian spread, 213
designed distance, 13
deterministic leak, 426
Dickson form, 173
Dickson polynomial, 389, 409, 491
difference distribution table, 135
difference set, 197, 448
difference set design, 201
difference table, 71
differential, 134
differential attack, 134
differential cryptanalysis, 134
differential power analysis, 426
differential spectrum, 135
differential uniformity, 135, 371
differentially δ-uniform, 135
diffusion, 76
Dillon exponents, 215
Dillon’s functions, 213
Dirac (or Kronecker) symbol, 55
direct sum, 232, 265, 297, 341, 362
direct sum masking, 444
direct sum of bent functions, 274
direct sum vector, 363
distance enumerator, 16, 255
distance invariant, 255
distance to linear structures, 101
distinguisher, 116, 134
distinguishing attack, 89
Dobbertin function, 401, 412
Dobbertin’s conjecture, 297
domain-oriented masking, 444
double simplex code, 10
DPA, 426 global avalanche criterion, 97
DSM, 444 Gold AB functions, 394
dual code, 8, 17, 146 Gold APN functions, 400
dual distance, 16, 88, 162, 284, 292, 314, 432 Gold Boolean functions, 206
dual function, 197 Goldreich’s function, 468
dual-bent vectorial function, 269 Goldreich’s pseudorandom generator, 468
Golomb–Xiao–Massey characterization, 87, 286, 363
EA equivalent, 28, 219, 281, 359, 394, 404 Gowers inverse conjecture, 473
EA invariant, 29, 36, 40, 79, 192, 275 graph, 7, 34, 39, 40, 51, 71, 72, 136, 275, 388
edges, 70 graph algebraic immunity, 127
encryption, 1 graph theory, 70
equivalent codes, 8 gray box attacker model, 425
error detecting/correcting codes, 4 group algebra, 152, 403
error vector, 9 guess and determine, 96
eSTREAM Project, 22
exact repair problem, 146 Hadamard difference set, 197
exceptional, 404 Hamming bound, 6
expander graph, 468 Hamming code, 8, 13, 155, 379
extended code, 6, 380 Hamming distance, 5, 27
extended propagation criterion, 97, 319 Hamming distance leakage model, 426
extended Walsh spectrum, 55, 71 Hamming weight, 4, 27, 112, 130, 154, 180, 195, 215, 285, 303, 475, 477, 479
extended Walsh transform, 244
extension of Maiorana–McFarland type, 235 Hamming weight leakage model, 426
hexadecimal, 417
FAA, 93, 291, 344 hexanomial APN functions, 408
fast algebraic attack, 93, 190, 284, 291, 321, 335, 338 hidden weight bit function, 343, 477
fast algebraic complexity, 94, 322 higher-order differential attack, 114
fast algebraic immunity, 94, 321 higher-order nonlinearity, 83, 114
fast correlation attack, 78, 194, 244 higher-order side-channel attack, 427
fast Fourier–Hadamard transform, 53 HO-SCA, 427
fast Möbius transform, 33 homogeneous function, 201, 209, 248
fault injection attack, 427 HWBF, 343
feedback coefficients, 21 hybrid symmetric-FHE encryption, 453
feedback polynomial, 20 hyper-bent function, 244, 246, 272, 338, 477
feedback shift register, 23 hypergraph, 70
Feistel cipher, 26, 112, 423 hyperoval, 219
FHE, 453 hyper-nonlinearity, 338, 477
FIA, 427
filter model, 22, 89, 126 ideal autocorrelation, 416
filter permutator, 454 idempotent function, 248, 360
flat, 36 imbalance, 114, 268
FLIP cipher, 454 indicator, 51, 58, 71, 103, 127, 154, 181, 275, 345, 375, 416, 432, 448
Fourier–Hadamard spectrum, 53
Fourier–Hadamard support, 53 indirect sum, 233, 235, 265, 300
Fourier–Hadamard transform, 53, 117, 132, 306 influence of variable, 68
Frobenius automorphism, 42, 222, 487 information bits, 7
fully homomorphic encryption, 453 information protection, 17
information set, 161, 314, 328, 432
general affine group, 155 initial functions, 209
generalized correlation attack, 131 inner product, 37, 58, 61, 160, 164, 166, 209, 213
generalized degree, 48 inner product masking, 446
generalized nonlinearity, 132 interpolation attack, 142
generalized partial spread, 242 inverse Fourier–Hadamard transform formula, 59, 65, 88
generator matrix, 7 inverse function, 85, 137, 317, 324, 400, 401, 412, 417, 423, 433, 442, 479
generator polynomial, 12
Gleason theorem, 16 inverse Walsh transform formula, 59, 65, 71, 109, 380, 438
glitches, 436
IPM, 446 master key, 24
ISW algorithm, 430 Mattson–Solomon polynomial, 44
maximal odd weighting, 153
Jacobi symbol, 177 maximum correlation with respect to I, 101
maximum distance separable, 6, 14
Kasami AB functions, 394 maximum length sequence, 21, 383
Kasami APN functions, 400, 478 maximum likelihood decoding, 6
Kasami exponents, 230 McEliece’s theorem, 13, 65, 156
Kasami function, 418 MCM polynomial, 400
Kasami–Welch functions, 394 MDS, 6, 9, 14, 147, 162, 446
Kerdock code, 254, 477 Menon design, 201
key scheduling algorithm, 24 message, 4
keystream, 3, 19, 344, 454 metric complements, 191
Kloosterman sums, 188 minimal code, 149
Knuth–Eve method, 434 minimal codeword, 148
Krawtchouk polynomial, 355 minimum degree, 41
Kronecker sum, 309 minimum distance, 5
Möbius transform over integers, 49
last round attack, 134 modeled leakage, 426
LCD, 446 modified derivatives, 266
LCP, 445 modified planar, 269
leakage, 425 monomial bent, 229
leakage squeezing, 431 monomial Boolean (multivariate) function, 363
leakage trace, 427 monomial Boolean (univariate) function, 66
level of the covering sequence, 182 monomial vectorial function, 383
LFSR, 20 monotone, 363
line ovals, 220 monovariate attack, 425
linear attack, 115 Müller–Cohen–Matthews polynomial, 400
linear code, 7 multidimensional Walsh transform, 74
linear complementary dual, 446 multinomial APN functions, 231, 406
linear complementary pair, 445 multioutput Boolean function, 24
linear complexity, 21, 77 multiparty computation, 146, 436, 453, 454
linear exact repairing code, 428 multipermutation, 129
linear feedback shift register, 20 multiplicative inverse permutation, 400
linear invariant, 29
linear kernel, 99, 383 naive bound, 243
linear leakage model, 426 near-bent, 119, 178, 262
linear secret sharing scheme, 146, 450 nega-bent, 191, 267
linear span, 21 NFSR, 23, 475
linear structure, 99 Niho functions, 395
linearized polynomial, 46 NNF, 47, 156, 195, 326, 352
linearly equivalent, 28, 99, 134, 337, 400 nodes, 70
local pseudorandom generator, 467 noisy leakage model, 429
log-alog, 75 noncompleteness, 439
lookup table, 30 nonhomomorphicity, 111
Lucas’ theorem, 487 noninterference, 430
nonlinearity, 79, 117, 369, 475, 478, 479
m-sequence, 21, 77, 383, 490 nonlinearity profile, 83
MacWilliams’ identity, 15 nonlinearity with inputs in E, 459
Maiorana–McFarland, 165, 263, 293 nontrivial covering sequence, 182
Maiorana–McFarland original class, 209 nonzeros of the cyclic code, 12
Maiorana–McFarland vectorial functions, 167 Nordstrom–Robinson code, 254
majority function, 335, 468 normal basis, 248
masked version of function, 428 normal extension, 253
masking, 428 normal function, 105
masking complexity, 433 numerical degree, 48, 67, 69, 70, 201, 287, 356, 475
masking order, 428 numerical normal form, 47, 199, 286
masks, 428 Nyberg’s bound, 423, 478
ODSM, 446 punctured code, 6, 155, 162
one-way function, 2, 467 puncturing at position i, 6
one time pad, 19
orphan, 159, 262 quadratic bound, 81
orthogonal array, 86 quadratic function, 36, 64, 85, 99, 103, 107, 115, 165, 172, 177, 223, 257, 393, 441, 475, 478
orthogonal direct sum masking, 446
orthogonal space, 58 quadrinomial APN functions, 405
oval polynomial, 219 qualified coalition, 147

pair of metrically regular sets, 191 radical, 152, 171


parity check bits, 7 random local function, 467
parity check matrix, 8 rank of βf, 171
parity check polynomial, 78 Rayleigh quotient, 199
parity code, 5 realization, 437
Parseval’s relation, 60, 61, 68, 79, 118, 258 rectangles, 239
partial bent functions, 258 reduced cipher, 115, 134
partial spread, 212 redundancy, 5
partial spread class, 212 Reed–Muller code, 23, 37, 65, 80, 151, 154, 328, 341, 475, 477
partially defined, 37
partially-bent function, 256 Reed–Solomon code, 14, 45, 151
perfect algebraic immune, 322 relative difference set, 197
perfect code, 6 resiliency order, 86
perfect nonlinear, 135, 193, 449, 477, 478 resilient, 86, 129, 284, 313, 356, 433, 468, 477
perfect robust code, 448 reversed Dickson polynomial, 389
permutation, 10 robust code, 448
permutation equivalent, 28 rotating S-box masking, 432
permutation invariant, 29, 48, 88, 102 rotation symmetric, 248, 292, 360
physical attack, 427 Rothaus construction, 233
plaintext, 1 Rothaus’ bound, 200
planar, 269 round key, 24
plateaued, 258–261, 264, 275, 279, 283, 382, 391, 393 rounds, 24
plateaued with single amplitude, 140, 274, 372 RS code, 14
player, 146
Poisson summation formula, 58, 65, 87, 106, 200, 286, 364 S-box, 24, 112, 127, 129, 134, 369, 475
S-box of the AES, 417
polar representation, 168, 271, 340, 351 Sarkar et al.’s bound, 287
polynomial masking, 436 Sarkar–Maitra’s divisibility bound, 287
polynomial representation of codeword, 12 SCA, 425
power bent, 229 SCV bound, 118
power function, 24, 72, 134, 161, 263, 278, 324, 478 second–order bent function, 257
preferred cross-correlation, 384 second-order bent function, 257
PRG, 19 second-order Poisson summation formula, 62, 65, 365
primary construction, 209, 263, 270, 282, 291, 336, 417, 479 secondary construction, 144, 209, 211, 232, 251, 266, 273, 297, 317, 362, 442, 457, 478
primitive element, 11, 44, 155, 168, 336, 378, 405, 477, 487 secret sharing scheme, 145, 428
self-dual bent function, 198, 476
primitive length, 12 semi-bent function, 178, 262, 268
private-key cryptography, 2 semidirect sum, 234
probing security model, 428 sensitive variable, 426
projective equivalence, 221 sequences, 383
projective plane, 219 session key, 2
propagation criterion, 97, 319 Shannon effect, 103
pseudo-Boolean, 17, 37, 256, 377, 445 sharing, 145
pseudoplanar, 269 shifted bent, 267
pseudorandom generator, 19, 86, 321, 467 shortened code, 6
pseudorandom sequence, 19 side-channel attacks, 425
public-key cryptography, 2 Sidelnikov–Chabaud–Vaudenay, 118
Sidon set, 386, 388 threshold function, 358
Siegenthaler bound, 285, 469 threshold implementation, 437, 478
sign function, 55 threshold secret sharing schemes, 147
simplex code, 8, 13, 155, 381 TI, 437, 478
simplified ANF vector, 353 Titsworth relation, 61
simplified value vector, 352 trace form, 43
Singer set, 389, 416 transmission rate, 6, 449
Singleton bound, 6, 452 triangular function, 363, 455
slices, 456 trinomial APN functions, 408
slide attack, 142 triple construction, 228, 388
source vectors, 4 truth table, 30, 152, 166, 298, 361
spectral complexity, 109 Tu–Deng function, 339
spectral immunity, 96, 327 two-weight code, 149
sphere covering bound, 7
sphere-packing bound, 6 uniformity (of TI), 440
splitting field, 13, 485 uniformly packed code, 10, 381
spread, 212 uniformly robust code, 448
statistical distance, 429 unitary transformation, 266
Stickelberger theorem, 156 univariate attack, 425
stream cipher, 3 univariate representation, 42, 142, 214, 221, 231, 271, 476
stretch, 468
strict avalanche criterion, 97 unrestricted code, 7, 23, 83
strongly plateaued, 274, 278 unrestricted nonlinearity, 131
subfield trace representation, 43 usual inner product, 8, 15, 37, 53, 117, 165, 190, 307
substitution box, 24
substitution permutation network, 26 vectorial bent4 functions, 274
sum-free set, 388 vectorial Boolean functions, 24
sum-of-squares indicator, 97, 288 Vernam cipher, 19
supplementary subspaces, 62, 212 vertices, 70
support of function, 27
support of vector, 27 Walsh functions, 53, 109
switching, 402, 407 Walsh spectrum, 55, 71
symmetric Boolean function, 206, 244, 322, 328, 335, 352, 477 Walsh transform, 55, 71
weakly APN, 383
symmetric cryptography, 2 weakly APN, 383
symplectic matrix, 171, 199 weight distribution, 14
synchronous, 20 weight enumerator, 14
syndrome, 9, 10 weightwise almost perfectly balanced, 459
systematic, 7, 149, 161, 314, 448, 451 weightwise perfectly balanced, 457
systematic form, 147 Weil’s bound, 188
systematic generator matrix, 7 Welch functions, 395
Wiener–Khintchine formula, 61
T-function, 26 worst error-masking probability, 447, 448
Tarannikov et al.’s construction, 300
three-valued almost optimal, 262 zero-difference 2-balanced, 394
three-valued functions, 258 zeros of the cyclic code, 12
