Modern Cryptography book
2005
1
Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA
92093, USA. [email protected], http://www-cse.ucsd.edu/users/mihir
2
Department of Computer Science, Kemper Hall of Engineering, University of California at Davis,
Davis, CA 95616, USA; and Department of Computer Science, Faculty of Science, Chiang Mai University,
Chiang Mai, 50200 Thailand. [email protected], http://www.cs.ucdavis.edu/~rogaway
Preface
This is a set of class notes that we have been developing jointly for some years. We use them for
cryptography courses that we teach at our respective institutions. Each time one of us teaches
the class, he takes the token and updates the notes a bit. The process has resulted in an evolving
document that has lots of gaps, as well as plenty of “unharmonized” parts. One day it will, with
luck, be complete and cogent.
The viewpoint taken throughout these notes is to emphasize the theory of cryptography as it
can be applied to practice. This is an approach that the two of us have pursued in our research,
and it seems to be a pedagogically desirable approach as well.
We would like to thank the following students of past versions of our courses who have pointed
out errors and made suggestions for changes: Andre Barroso, Keith Bell, Kostas Bimpikis, Alexan-
dra Boldyreva, Dustin Boswell, Brian Buesker, Michael Burton, Chris Calabro, Sashka Davis, Alex
Gantman, Bradley Huffaker, Hyun Min Kang, Vivek Manpuria, Chanathip Namprempre, Adriana
Palacio, Wenjing Rao, Fritz Schneider, Juliana Wong. We welcome further corrections, comments
and suggestions.
1 Introduction
1.1 Goals and settings
1.2 Other goals
1.3 What cryptography is about
1.4 Approaches to the study of cryptography
1.5 What background do I need?
1.6 Problems
2 Classical Encryption
2.1 Substitution ciphers
2.2 One-time-pad encryption and perfect security
2.3 Problems
3 Blockciphers
3.1 What is a blockcipher?
3.2 Data Encryption Standard (DES)
3.3 Key recovery attacks on blockciphers
3.4 Iterated-DES and DESX
3.5 Advanced Encryption Standard (AES)
3.6 Limitations of key-recovery based security
3.7 Problems
4 Pseudorandom Functions
4.1 Function families
4.2 Games
4.3 Random functions and permutations
4.4 Pseudorandom functions
4.5 Pseudorandom permutations
4.6 Modeling blockciphers
4.7 Example attacks
4.8 Security against key recovery
4.9 The birthday attack
4.10 The PRP/PRF switching lemma
4.11 Historical notes
4.12 Problems
5 Symmetric Encryption
5.1 Symmetric encryption schemes
5.2 Some symmetric encryption schemes
5.3 Issues in privacy
5.4 Indistinguishability under chosen-plaintext attack
5.5 Example chosen-plaintext attacks
5.6 Semantic security
5.7 Security of CTR modes
5.8 Security of CBC with a random IV
5.9 Indistinguishability under chosen-ciphertext attack
5.10 Example chosen-ciphertext attacks
5.11 Historical notes
5.12 Problems
Introduction
Historically, cryptography arose as a means to enable parties to maintain privacy of the information
they send to each other, even in the presence of an adversary with access to the communication
channel. While providing privacy remains a central goal, the field has expanded to encompass
many others, including not just other goals of communication security, such as guaranteeing
integrity and authenticity of communications, but many more sophisticated and fascinating goals.
Once largely the domain of the military, cryptography is now in widespread use, and you are
likely to have used it even if you don’t know it. When you shop on the Internet, for example to buy
a book at www.amazon.com, cryptography is used to ensure privacy of your credit card number as
it travels from you to the shop’s server. Or, in electronic banking, cryptography is used to ensure
that your checks cannot be forged.
Cryptography has been used almost since writing was invented. For the larger part of its
history, cryptography remained an art, a game of ad hoc designs and attacks. Although the field
retains some of this flavor, the last twenty-five years have brought in something new. The art of
cryptography has now been supplemented with a legitimate science. In this course we shall focus
on that science, which is modern cryptography.
Modern cryptography is a remarkable discipline. It is a cornerstone of computer and
communications security, with end products that are eminently practical. Yet its study touches on
branches of mathematics that may have been considered esoteric, and it brings together fields like
number theory, computational-complexity theory, and probability theory. This course is your
invitation to this fascinating field.
Figure 1.1: Several cryptographic goals aim to imitate some aspect of an ideal channel connecting
a sender S to a receiver R.
it. Nobody else can look inside the pipe or change what’s there. This pipe provides the perfect
medium, available only to the sender and receiver, as though they were alone in the world. It is an
“ideal” communication channel from the security point of view. See Fig. 1.1.
Unfortunately, in real life, there are no ideal channels connecting the pairs of parties that might
like to communicate with each other. Usually such parties are communicating over some public
network like the Internet.
The most basic goal of cryptography is to provide such parties with a means to imbue their
communications with security properties akin to those provided by the ideal channel.
At this point we should introduce the third member of our cast. This is our adversary,
denoted A. An adversary models the source of all possible threats. We imagine the adversary as
having access to the network and wanting to compromise the security of the parties’
communications in some way.
Not all aspects of an ideal channel can be emulated. Instead, cryptographers distill a few central
security goals and try to achieve them. The first such goal is privacy. Providing privacy means
hiding the content of a transmission from the adversary. The second goal is authenticity or integrity.
We want the receiver, upon receiving a communication purporting to be from the sender, to have a
way of assuring itself that it really did originate with the sender, and was not sent by the adversary,
or modified en route from the sender to the receiver.
Trust models. It is not hard to convince yourself that in order to communicate securely, there
must be something that a party knows, or can do, that the adversary does not know, or cannot
do. There has to be some “asymmetry” between the situation in which the parties find themselves
and the situation in which the adversary finds itself.
The trust model specifies who, initially, has what keys. There are two central trust models: the
symmetric (or shared-key) trust model and the asymmetric (or public-key) trust model. We look
at them, and the cryptographic problems they give rise to, in turn.
Bellare and Rogaway 9
We will sometimes use words from the theory of “formal languages.” Here is the
vocabulary you should know.
An alphabet is a finite nonempty set. We usually use the Greek letter Σ to denote
an alphabet. The elements in an alphabet are called characters. So, for example,
Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} is an alphabet having ten characters, and Σ = {0, 1}
is an alphabet, called the binary alphabet, which has two characters. A string
is a finite sequence of characters. The number of characters in a string is called
its length, and the length of a string X is denoted |X|. So X = 1011 is a string
of length four over the binary alphabet, and Y = cryptography is a string of
length 12 over the alphabet of English letters. The string of length zero is called
the empty string and is denoted ε. If X and Y are strings then the concatenation
of X and Y, denoted X ‖ Y, is the characters of X followed by the characters of Y.
So, for example, 1011 ‖ 0 = 10110. We can encode almost anything into a string.
We like to do this because it is as (binary) strings that objects are represented in
computers. Usually the details of how one does this are irrelevant, and so we use
the notation ⟨something⟩ for any fixed, natural way to encode something as a
string. For example, if n is a number and X is a string then Y = ⟨n, X⟩ is some
string which encodes n and X. It is easy to go from n and X to Y = ⟨n, X⟩,
and it is also easy to go from Y = ⟨n, X⟩ back to n and X. A language is a set
of strings, all of the strings being drawn from the same alphabet, Σ. If Σ is an
alphabet then Σ∗ denotes the set of all strings whose characters are drawn from
Σ. For example, {0, 1}∗ = {ε, 0, 1, 00, 01, 10, 11, 000, . . .}.
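As a rough illustration of this vocabulary, here is a minimal Python sketch of concatenation and of one possible way to encode a number n together with a binary string X as a single binary string, and to decode it again. The length-prefixed format and the names concat, encode_pair and decode_pair are assumptions made for this example only; the notes ask merely that some fixed, natural encoding exists.

```python
from typing import Tuple

# Strings over the binary alphabet are modeled as Python str values of '0'/'1'.

def concat(x: str, y: str) -> str:
    """Concatenation: the characters of x followed by the characters of y."""
    return x + y

def encode_pair(n: int, x: str) -> str:
    """Encode a non-negative integer n and a binary string x as one binary string.

    Hypothetical format: k ones, a zero, then n written in k binary digits,
    then x itself. The unary header tells the decoder where n ends.
    """
    if n < 0 or any(c not in "01" for c in x):
        raise ValueError("n must be non-negative and x must be a binary string")
    n_bits = format(n, "b")
    header = "1" * len(n_bits) + "0"
    return header + n_bits + x

def decode_pair(y: str) -> Tuple[int, str]:
    """Invert encode_pair: recover n and x from the encoded string y."""
    k = y.index("0")                 # number of leading ones = digits used for n
    n_bits = y[k + 1 : 2 * k + 1]    # the k binary digits of n
    x = y[2 * k + 1 :]               # everything after that is x
    return int(n_bits, 2), x

# concat("1011", "0") gives "10110", matching the example in the text.
```

Going from n and X to the encoding and back is a few string operations in each direction, which is the sense in which both directions are "easy".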
Symmetric encryption schemes. A protocol used to provide privacy in the symmetric setting
is called a symmetric encryption scheme. When we specify such a scheme Π, we must specify three
algorithms, so that the scheme is a triple of algorithms, Π = (K, E, D). The encapsulation algorithm
we discussed above is, in this context, called an encryption algorithm, and is the algorithm E. The
message M that the sender wishes to transmit is usually referred to as a plaintext. The sender
Figure 1.3: Symmetric encryption. The sender and the receiver share a secret key, K. The adversary
lacks this key. The message M is the plaintext; the message C is the ciphertext.
encrypts the plaintext under the shared key K by applying E to K and M to obtain a ciphertext
C. The ciphertext is transmitted to the receiver. The above-mentioned decapsulation procedure,
in this context, is called a decryption algorithm, and is the algorithm D. The receiver applies D
to K and C. The decryption process might be unsuccessful, indicated by its returning a special
symbol ⊥, but, if successful, it ought to return the message that was originally encrypted. The first
algorithm in Π is the key generation algorithm which specifies the manner in which the key is to
be chosen. In most cases this algorithm simply returns a random string whose length is the key length.
The encryption algorithm E may be randomized, or it might keep some state around. A picture
for symmetric encryption can be found in Figure 1.3.
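To make the syntax of a scheme Π = (K, E, D) concrete, here is a minimal Python sketch. It uses a one-time-pad style XOR as a stand-in for E and D; that choice, the 16-byte key length, and the function names are assumptions made for illustration, and such a scheme is only sensible when each key encrypts a single message. Python's None plays the role of the special symbol ⊥.

```python
import secrets
from typing import Optional

KEY_LEN = 16  # key length in bytes; an arbitrary choice for this sketch

def keygen() -> bytes:
    """K: return a random string whose length is the key length."""
    return secrets.token_bytes(KEY_LEN)

def encrypt(key: bytes, message: bytes) -> Optional[bytes]:
    """E: XOR the plaintext with the key (toy one-time pad).

    Returns None if the message does not have exactly the key's length.
    """
    if len(message) != len(key):
        return None
    return bytes(k ^ m for k, m in zip(key, message))

def decrypt(key: bytes, ciphertext: bytes) -> Optional[bytes]:
    """D: recover the plaintext, or return None (standing in for ⊥)."""
    if len(ciphertext) != len(key):
        return None
    return bytes(k ^ c for k, c in zip(key, ciphertext))

# The sender and receiver share K; the adversary sees only C.
K = keygen()
M = bytes(range(KEY_LEN))
C = encrypt(K, M)
```

Note what the sketch leaves out, exactly as the text says: nothing here describes how K reaches both parties or how C is transmitted.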
The encryption scheme does not tell the adversary what to do. It does not say how the key,
once generated, winds its way into the hands of the two parties. And it does not say how messages
are transmitted. It only says how keys are generated and how the data is processed.
What is privacy? The goal of a symmetric encryption scheme is that an adversary who obtains
the ciphertext be unable to learn anything about the plaintext. What exactly this means, however,
is not clear, and obtaining a definition of privacy will be an important objective in later chapters.
One thing encryption does not do is hide the length of a plaintext string. This is usually
recoverable from the length of the ciphertext string.
As an example of the issues involved in defining privacy, let us ask ourselves whether we could
hope to say that it is impossible for the adversary to figure out M given C. But this cannot be
true, because the adversary could just guess M, by outputting a random sequence of |M| bits. (As
indicated above, the length of the plaintext is usually computable from the length of the ciphertext.)
She would be right with probability 2^{-n}, where n = |M|. Not bad if, say, n = 1! Does that make the
scheme bad? No. But it tells us that security is a probabilistic thing. The scheme is not simply
secure or insecure; there is just some probability of breaking it.
Another issue is a priori knowledge. Before M is transmitted, the adversary might know
something about it. For example, that M is either 0^n or 1^n. Why? Because she knows Alice and Bob
are talking about buying or selling a fixed stock, and this is just a buy or sell message. Now, she
can always get the message right with probability 1/2. How is this factored in?
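The two guessing probabilities mentioned here can be checked by brute-force counting for small n. The sketch below is only an illustration: the function names are ours, and the second function assumes the two candidate messages are equally likely, which the text does not state explicitly.

```python
from fractions import Fraction
from itertools import product

def blind_guess_success(n: int) -> Fraction:
    """Chance that a uniformly random n-bit guess equals a fixed n-bit message."""
    message = "0" * n  # any fixed message gives the same count
    guesses = ["".join(bits) for bits in product("01", repeat=n)]
    hits = sum(1 for g in guesses if g == message)
    return Fraction(hits, len(guesses))

def informed_guess_success(n: int) -> Fraction:
    """Chance of guessing right when M is known to be all zeros or all ones.

    Assumption for this sketch: both candidates are equally likely (n >= 1),
    and the adversary always guesses the first candidate.
    """
    candidates = ["0" * n, "1" * n]
    hits = sum(1 for m in candidates if m == candidates[0])
    return Fraction(hits, len(candidates))
```

For n = 8 the blind guess succeeds with probability 1/256, while the informed guess succeeds with probability 1/2 whatever n is, without the adversary looking at the ciphertext at all. A definition of privacy has to account for this.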
So far one might imagine that an adversary attacking the privacy of an encryption scheme is
passive, merely obtaining and examining ciphertexts. In fact, this might not be the case at all. We
will consider adversaries that are much more powerful than that.
Message Authenticity. In the message-authentication problem the receiver gets some message
which is claimed to have originated with a particular sender. The channel on which this message
Figure 1.4: A message authentication code. The tag σ accompanies the message M. The receiver
R uses it to decide if the message really did originate with the sender S with whom he shares the
key K.
flows is insecure. Thus the receiver R wants to distinguish the case in which the message really
did originate with the claimed sender S from the case in which the message originated with some
imposter, A. In such a case we consider the design of an encapsulation mechanism with the property
that un-authentic transmissions lead to the decapsulation algorithm outputting the special symbol
⊥.
The most common tool for solving the message-authentication problem in the symmetric setting
is a message authentication scheme, also called a message authentication code (MAC). Such a
scheme is specified by a triple of algorithms, Π = (K, T , V). When the sender wants to send a
message M to the receiver she computes a “tag,” σ, by applying T to the shared key K and
the message M , and then transmits the pair (M, σ). (The encapsulation procedure referred to
above thus consists of taking M and returning this pair. The tag is also called a MAC.) The
computation of the MAC might be probabilistic or use state, just as with encryption. Or it may
well be deterministic. The receiver, on receipt of M and σ, uses the key K to check if the tag
is OK by applying the verification algorithm V to K, M and σ. If this algorithm returns 1, he
accepts M as authentic; otherwise, he regards M as a forgery. An appropriate reaction might range
from ignoring the bogus message to tearing down the connection to alerting a responsible party
about the possible mischief. See Figure 1.4.
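The interface of a message authentication scheme Π = (K, T, V) can be sketched in a few lines of Python. HMAC with SHA-256 from the standard library is used here purely as a stand-in for the tagging algorithm; the notes do not prescribe it at this point, and the function names and the 32-byte key length are assumptions of this example.

```python
import hashlib
import hmac
import secrets

def keygen() -> bytes:
    """K: choose the shared key at random (32 bytes is an arbitrary choice)."""
    return secrets.token_bytes(32)

def tag(key: bytes, message: bytes) -> bytes:
    """T: compute the tag sigma for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, sigma: bytes) -> int:
    """V: return 1 if sigma is a valid tag for the message, else 0."""
    expected = tag(key, message)
    # compare_digest avoids leaking how many leading bytes matched
    return 1 if hmac.compare_digest(expected, sigma) else 0

# The sender transmits the pair (M, sigma); the receiver runs verify.
K = keygen()
M = b"pay 10 dollars"
sigma = tag(K, M)
```

This tagging algorithm is deterministic, which is one of the options the text allows. A receiver who gets (M', σ') with verify returning 0 treats M' as a forgery.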
Figure 1.5: Asymmetric encryption. The receiver R has a public key, pk_R, which the sender knows
belongs to R. The receiver also has a corresponding secret key, sk_R.
scheme Π = (K, E, D) is specified by the algorithms for key generation, encryption and decryption.
For a picture of encryption in the public-key setting, see Fig. 1.5.
The idea of public-key cryptography, and the fact that we can actually realize this goal, is
remarkable. You’ve never met the receiver before. But you can send him a secret message by
looking up some information in a phone book and then using this information to help you garble
up the message you want to send. The intended receiver will be able to understand the content of
your message, but nobody else will. The idea of public-key cryptography is due to Whitfield Diffie
and Martin Hellman and was published in 1976 [10].
Digital signatures. The tool for solving the message-authentication problem in the asymmetric
setting is a digital signature. Here the sender has a public key pk_S and a corresponding secret key
sk_S. The receiver is assumed to know the key pk_S and that it belongs to party S. (The adversary
is assumed to know pk_S too.) When the sender wants to send a message M she attaches to it
some extra bits, σ, which is called a signature for the message and is computed as a function of
M and sk_S by applying to them a signing algorithm Sig. The receiver, on receipt of M and σ,
checks if it is OK using the public key of the sender, pk_S, by applying a verification algorithm
V. If this algorithm accepts, the receiver regards M as authentic; otherwise, he regards M as an
attempted forgery. The digital signature scheme Π = (K, Sig, V) is specified by the algorithms for
key generation, signing and verifying. A picture is given in Fig. 1.6.
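The shape of a signature scheme Π = (K, Sig, V) can be sketched with a toy "hash then RSA" construction. Everything specific below is an assumption made for illustration: the fixed primes (two well-known Mersenne primes, far too small and too structured for real use), the exponent 65537, SHA-256 as the hash, and the function names. A real key generation algorithm would pick large random primes, and this sketch makes no security claim. It needs Python 3.8 or later for the modular inverse via pow.

```python
import hashlib
from typing import Tuple

# Toy parameters, hypothetical and insecure: fixed Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
E = 65537

def keygen() -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """K: return (pk, sk). Deterministic here only because P and Q are fixed."""
    n = P * Q
    phi = (P - 1) * (Q - 1)
    d = pow(E, -1, phi)          # secret exponent: inverse of E modulo phi
    return (n, E), (n, d)

def _digest(message: bytes, n: int) -> int:
    """Hash the message to an integer modulo n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(sk: Tuple[int, int], message: bytes) -> int:
    """Sig: computed from the message and the secret key only."""
    n, d = sk
    return pow(_digest(message, n), d, n)

def verify(pk: Tuple[int, int], message: bytes, sigma: int) -> bool:
    """V: uses only the public key, so anyone can check a signature."""
    n, e = pk
    return pow(sigma, e, n) == _digest(message, n)
```

The asymmetry with a MAC is visible in the signatures of the functions: verify never touches the secret key, so being able to verify does not confer the ability to sign.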
One difference between a MAC and a digital signature concerns what is called non-repudiation.
With a MAC anyone who can verify a tagged message can also produce one, and so a tagged message
would seem to be of little use in proving authenticity in a court of law. But with a digitally-signed
message the only party who should be able to produce a message that verifies under public key
pk_S is the party S herself. Thus if the signature scheme is good, party S cannot just maintain that
the receiver, or the one presenting the evidence, concocted it. If signature σ authenticates M with
respect to public key pk_S, then it is only S that should have been able to devise σ. The sender
cannot refute that. The sender S might, of course, claim that the key sk_S was stolen from her.
Perhaps this, if true, might still be construed as the sender’s fault.
Figure 1.6: A digital signature scheme. The signature σ accompanies the message M. The receiver
R uses it to decide if the message really did originate with the sender S who has public key pk_S.
1.1.3 Summary
To summarize, there are two common aims concerned with mimicking an ideal channel: achieving
message privacy and achieving message authenticity. There are two main trust models in which
we are interested in achieving these goals: the symmetric trust model and the asymmetric trust
model. The tools used to achieve these four goals are named as shown in Fig. 1.7.
dom.”
In some applications, people use Linear Congruential Generators (LCGs) for pseudorandom
number generation. But LCGs do not have good properties with regard to the quality of pseudo-
randomness of the bits output. With the ideas and techniques of modern cryptography, one can do
much better. We will say what it means for a pseudorandom number generator to be “good” and
then how to design one that is good in this sense. Our notion of “good” is such that our generators
provably suffice for typical applications.
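One way to see why a linear congruential generator is a poor source of pseudorandom bits is that, if the multiplier, increment and modulus are known and the generator outputs its whole state, a single output determines every later one. The sketch below illustrates this; the constants are illustrative values chosen for this example and are not taken from the notes.

```python
from typing import List

# Illustrative LCG parameters (assumed for this sketch).
A = 1103515245
C = 12345
M = 2**31

def lcg(seed: int, count: int) -> List[int]:
    """Generate count outputs x_{i+1} = (A * x_i + C) mod M starting from seed."""
    outputs = []
    x = seed
    for _ in range(count):
        x = (A * x + C) % M
        outputs.append(x)
    return outputs

def predict_next(last_output: int) -> int:
    """An observer who knows A, C, M predicts the next output from the last one."""
    return (A * last_output + C) % M

outputs = lcg(seed=42, count=10)
# Every output after the first is predicted exactly from its predecessor.
all_predicted = all(predict_next(outputs[i]) == outputs[i + 1] for i in range(9))
```

A generator that is "good" in the sense developed later must not allow this kind of prediction by any efficient observer.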
It should be clarified that pseudorandom generators do not generate pseudorandom bits from
scratch. They need as input a random seed, and their job is to stretch this. Thus, they reduce the
task of random number generation to the task of generating a short random seed. As to how to
do the latter, we must step outside the domain of cryptography. We might wire to our computer
a Geiger counter that generates a “random” bit every second, and run the computer for, say, 200
seconds, to get a 200 bit random seed, which we can then stretch via the pseudorandom number
generator. Sometimes, more ad hoc methods are used; a computer might obtain a “random” seed
by computing some function of various variable system parameters such as the time and system
load.
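The interface of a generator that stretches a short seed can be sketched as follows. SHAKE-256 from Python's hashlib is used as a convenient stand-in for the stretching function; this is an assumption of the example, not the construction these notes develop, and nothing is claimed here about its provable quality. The 25-byte seed corresponds to the 200-bit seed in the text.

```python
import hashlib
import secrets

SEED_LEN = 25  # 25 bytes = 200 bits, as in the Geiger-counter example

def stretch(seed: bytes, out_len: int) -> bytes:
    """Deterministically expand a short seed into out_len output bytes."""
    return hashlib.shake_256(seed).digest(out_len)

# Obtaining the seed is outside cryptography; here the operating system
# is simply asked for 200 bits.
seed = secrets.token_bytes(SEED_LEN)
stream = stretch(seed, 1000)  # 8000 output bits from a 200-bit seed
```

The generator itself uses no randomness: the same seed always yields the same output, so all of the unpredictability has to come from the seed.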
We won’t worry about the “philosophical” question as to whether the bits that form the seed
are random in any real sense. We’ll simply assume that these bits are completely unpredictable to
anything “beyond” the computer which has gathered this data—mathematically, we’ll treat these
bits as random. We will then study pseudorandom number generation under the assumption that
a random seed is available.
[Diagram: A chooses a bit α at random, puts α in an envelope, and sends it. B chooses a bit β at
random and sends it. A opens up the envelope so that B can likewise compute the shared bit, α xor β.]
Bob is not as bright as Alice, but something troubles him about this arrangement.
The telephone-coin-flip problem is to come up with a protocol so that, to the maximal extent
possible, neither Alice nor Bob can cheat the other and, at the same time, each of them learns the
outcome of a fair coin toss.
Here is a solution—sort of. Alice puts a random bit α inside an envelope and sends it to Bob.
Bob announces a random bit β. Now Alice opens the envelope for Bob to see. The shared bit is
defined as α ⊕ β. See Figure 1.8.
To do this over the telephone we need some sort of “electronic envelope” (in cryptography,
this is called a commitment scheme). Alice can put a value in the envelope and Bob can’t see what
the envelope contains. Later, Alice can open the envelope so that Bob can see what the envelope
contains. Alice can’t change her mind about an envelope’s contents—it can only be opened up in
one way.
Here is a simple technique to implement an electronic envelope. To put a “0” inside an envelope
Alice chooses two random 500-bit primes p and q subject to the constraints that p < q and p ≡ 1
(mod 4) and q ≡ 3 (mod 4). The product of p and q, say N = pq, is the commitment to zero;
that is what Alice would send to commit to 0. To put a “1” inside an envelope Alice chooses two
random 500-bit primes p and q subject to the constraints that p < q and p ≡ 3 (mod 4) and q ≡ 1
(mod 4). The product of these, N = pq, is the commitment to 1. Poor Bob, seeing N, would like
to figure out if the smaller of its two prime factors is congruent to 1 or to 3 modulo 4. We have
no idea how to make that determination short of factoring N, and we don’t know how to factor
1000-bit numbers which are the product of random 500-bit primes. Our best algorithms would
take way too long to run. When Alice wants to decommit (open the envelope) N she announces p
and q. Bob verifies that they are prime (this is easy to do) and multiply to N , and then he looks
to see if the smaller factor is congruent to 1 or to 3 modulo 4.
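The envelope and the coin flip built on it can be sketched in Python. To keep the example fast it uses 32-bit primes rather than 500-bit ones, so it hides nothing in practice; the size, the function names, and the choice of a Miller-Rabin primality test with fixed bases are all assumptions of this sketch.

```python
import secrets
from typing import Optional, Tuple

BITS = 32  # toy size only; the text uses 500-bit primes
_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_prime(n: int) -> bool:
    """Miller-Rabin with fixed bases; deterministic for the small n used here."""
    if n < 2:
        return False
    for p in _BASES:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in _BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True

def random_prime(residue: int) -> int:
    """Random BITS-bit prime congruent to residue (1 or 3) modulo 4."""
    while True:
        n = secrets.randbits(BITS) | (1 << (BITS - 1))  # force the top bit
        n = n - (n % 4) + residue
        if is_prime(n):
            return n

def commit(bit: int) -> Tuple[int, Tuple[int, int]]:
    """Return (N, (p, q)): N is sent now, (p, q) is kept to open later."""
    small_res, large_res = (1, 3) if bit == 0 else (3, 1)
    while True:
        p, q = random_prime(small_res), random_prime(large_res)
        if p < q:
            return p * q, (p, q)

def open_commitment(n: int, p: int, q: int) -> Optional[int]:
    """Bob's check: the committed bit, or None if the opening is invalid."""
    if not (is_prime(p) and is_prime(q) and p < q and p * q == n):
        return None
    if p % 4 == 1 and q % 4 == 3:
        return 0
    if p % 4 == 3 and q % 4 == 1:
        return 1
    return None

def coin_flip() -> int:
    """Run the telephone coin flip with both parties honest."""
    alpha = secrets.randbits(1)
    n, (p, q) = commit(alpha)           # Alice sends N to Bob
    beta = secrets.randbits(1)          # Bob announces beta
    revealed = open_commitment(n, p, q) # Alice announces p, q; Bob checks
    assert revealed == alpha
    return alpha ^ beta
```

Bob's check here also inspects the larger factor, which is slightly stricter than the description in the text. Because N has only one factorization into two primes, Alice cannot open the same N as both 0 and 1.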
we should at least be achieving high reliability. After all, if a powerful adversary can’t succeed in
disrupting our endeavors, then neither will noisy lines, transmission errors due to software bugs,
unlucky message delivery times, careless programmers sending improperly formatted messages, and
so forth.
When we formalize adversaries they will be random access machines (RAMs) with access to an
oracle.
Government organizations that deal in cryptography often do not make their mechanisms public.
For them, learning the cryptographic mechanism is one more hoop that the adversary must
jump through. Why give anything away? Some organizations may have other reasons for not
wanting mechanisms to be public, like a fear of disseminating cryptographic know-how, or a fear
that the organization’s abilities, or inabilities, will become better understood.
Problem → Proposed Solution → Bug! → Revised Solution → ... → Implement → Bug! → ...
Figure 1.9: The classical-cryptography approach.
There are some difficulties with the approach of cryptanalysis-driven design. The obvious problem
is that one never knows if things are right, nor when one is finished! The process should iterate
until one feels “confident” that the solution is adequate. But one has to accept that design errors
might come to light at any time. If one is making a commercial product one must eventually say
that enough is enough, ship the product, and hope for the best. With luck, no damaging attacks
will subsequently emerge. But sometimes they do, and when this happens the company that owns
the product may find it difficult or impossible to effectively fix the fielded solution. They might
try to keep secret that there is a good attack, but it is not easy to keep such a thing secret. See
Figure 1.9.
Doing cryptanalysis well takes a lot of cleverness, and it is not clear that insightful cryptanalysis
is a skill that can be effectively taught. Sure, one can study the most famous attacks—but will
they really allow you to produce a new, equally insightful one? Great cleverness and mathematical
prowess seem to be the requisite skills, not any specific piece of knowledge. Perhaps for these
reasons, good cryptanalysts are very valuable. Maybe you have heard of Adi Shamir or Don
Coppersmith, both renowned cryptanalysts.
Sadly, it is hard to base a science on an area where assurance is obtained by knowing that
Coppersmith thought about a mechanism and couldn’t find an attack. We need to pursue things
differently.
Shannon, which we consider in more depth later, is to say that a scheme is perfectly secure if, for
any two messages M_1, M_2, and any ciphertext C, the latter is just as likely to show up when M_1 is
encrypted as when M_2 is encrypted. Here, likelihood means the probability, taken over the choice
of key, and coins tossed by the encryption algorithm, if any.
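For a very small scheme this condition can be checked by exhaustive counting. The sketch below does so for a bitwise-XOR one-time pad on n-bit strings with a uniformly random n-bit key; the choice of scheme and the function names are assumptions of this example.

```python
from collections import Counter
from itertools import product
from typing import List

def xor(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length binary strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def all_strings(n: int) -> List[str]:
    """All binary strings of length n."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def ciphertext_distribution(message: str) -> Counter:
    """Count how often each ciphertext arises over all equally likely keys."""
    return Counter(xor(key, message) for key in all_strings(len(message)))

def perfectly_secure(n: int) -> bool:
    """Check that every n-bit message induces the same ciphertext counts."""
    dists = [ciphertext_distribution(m) for m in all_strings(n)]
    return all(d == dists[0] for d in dists)
```

For each message, every ciphertext arises from exactly one key, so all messages induce the same distribution on ciphertexts. The key here is as long as the message, which is the price of this guarantee.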
Perfect security is a very powerful guarantee; indeed, in some sense, the best one can hope for.
However, it has an important limitation, namely that, to achieve it, the number of message bits
that one can encrypt cannot exceed the number of bits in the key. But if we want to do practical
cryptography, we must be able to use a single short key to encrypt lots of bits. This means that
we will not be able to achieve Shannon’s perfect security. We must seek a different paradigm and
a different notion of security that although “imperfect” is good enough.
Assuming the adversary uses no more than t computing cycles, her probability of
breaking the scheme is at most t/2^{200}.
Notice again the statement is probabilistic. Almost all of our statements will be.
Notice another important thing. Nobody said anything about how the adversary operates.
What algorithm, or technique, does she use? We do not know anything about that. The statement
holds nonetheless. So it is a very strong statement.
It should be clear that, in practice, a statement like the one above would be good enough. As
the adversary works harder, her chance of breaking the scheme increases, and if the adversary had
2^{200} computing cycles at her disposal, we’d have no security left at all. But nobody has that much
computing power.
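A little arithmetic shows how strong such a statement is. The sketch below evaluates the bound, t divided by 2 to the power 200, for a given number of computing cycles. The function name is ours, and capping the bound at 1 is an assumption added here because a probability cannot exceed 1.

```python
from fractions import Fraction

def break_probability_bound(t: int) -> Fraction:
    """Upper bound on the breaking probability for an adversary using t cycles."""
    return min(Fraction(1), Fraction(t, 2**200))

# An adversary investing 2**80 cycles, already an enormous effort,
# succeeds with probability at most 2**-120.
bound_at_2_80 = break_probability_bound(2**80)
```

The bound grows linearly in t, so the adversary's chance improves with effort, but it stays negligible until t approaches 2 to the power 200.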
Now we must ask ourselves how we can hope to get protocols with such properties. The
legitimate parties must be able to efficiently execute the protocol instructions: their effort should
be reasonable. But somehow, the task for the adversary must be harder.
Atomic Primitives
↓
Protocols
What’s the distinction? Perhaps the easiest way to think of it is that the protocols we build
address a cryptographic problem of interest. They say how to encrypt, how to authenticate, how to
distribute a key. We build our protocols out of atomic primitives. Atomic primitives are protocols
in their own right, but they are simpler protocols. Atomic primitives have some sort of “hardness”
or “security” properties, but by themselves they don’t solve any problem of interest. They must
be properly used to achieve some useful end.
In the early days nobody bothered to make such a distinction between protocols and the
primitives that they use. And if you think of the one-time pad encryption method, there is really just
one object, the protocol itself.
Atomic primitives are drawn from two sources: engineered constructs and mathematical prob-
lems. In the first class fall standard blockciphers such as the well-known DES algorithm. In the
second class falls the RSA function. We’ll be looking at both types of primitives later.
The computational nature of modern cryptography means that one must find, and base cryp-
tography on, computationally hard problems. Suitable ones are not so commonplace. Perhaps the
first thought one might have for a source of computationally hard problems is NP-complete prob-
lems. Indeed, early cryptosystems tried to use these, particularly the Knapsack problem. However,
these efforts have mostly failed. One reason is that NP-complete problems, although apparently
hard to solve in the worst-case, may be easy on the average.
An example of a more suitable primitive is a one-way function. This is a function f : D → R
mapping some domain D to some range R with two properties:
(1) f is easy to compute: there is an efficient algorithm that given x ∈ D outputs y = f (x) ∈ R.
(2) f is hard to invert: an adversary I given a random y ∈ R has a hard time figuring out a point
x such that f (x) = y, as long as her computing time is restricted.
The above is not a formal definition. The latter, which we will see later, will talk about probabilities.
The input x will be chosen at random, and we will then talk of the probability an adversary can
invert the function at y = f (x), as a function of the time for which she is allowed to compute.
Can we find objects with this strange asymmetry? It is sometimes said that one-way functions
are obvious from real life: it is easier to break a glass than to put it together again. But we want
concrete mathematical functions that we can implement in systems.
One source of examples is number theory, and this illustrates the important interplay between
number theory and cryptography. A lot of cryptography has been done using number theory. And
there is a very simple one-way function based on number theory—something you already know quite
well. Multiplication! The function f takes as input two numbers, a and b, and multiplies them
together to get N = ab. There is no known algorithm that given a random N = ab, always and
quickly recovers a pair of numbers (not 1 and N , of course!) that are factors of N . This “backwards
direction” is the factoring problem, and it has remained unsolved for hundreds of years.
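The asymmetry between the two directions can be felt even with naive code. In the sketch below the forward direction is a single multiplication, while the backward direction uses trial division, which is the simplest factoring method and far from the best known; it serves only to illustrate the gap, and the function names are ours.

```python
from typing import Optional, Tuple

def multiply(a: int, b: int) -> int:
    """The easy direction: one multiplication."""
    return a * b

def factor_by_trial_division(n: int) -> Optional[Tuple[int, int]]:
    """The hard direction, done naively.

    Returns a pair (d, n // d) with 1 < d <= n // d, or None if n has no
    nontrivial factor. The loop runs up to about sqrt(n) times, which is
    exponential in the number of bits of n.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None
```

Doubling the bit length of the factors roughly squares the number of loop iterations here, while the cost of multiply barely changes. Better factoring algorithms exist, but none is known that runs quickly on products of large random primes.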
Here is another example. Let p be a prime. The set Z_p^* = {1, . . . , p − 1} turns out to be a
group under multiplication modulo p. We fix an element g ∈ Z_p^* which generates the group (that
is, {g^0, g^1, g^2, . . . , g^{p−2}} is all of Z_p^*) and consider the function f : {0, . . . , p − 2} → Z_p^* defined by
f(x) = g^x mod p. This is called the discrete exponentiation function, and its inverse is called the
discrete logarithm function: log_g(y) is the value x such that y = g^x. It turns out there is no known
fast algorithm that computes discrete logarithms, either. This means that for large enough p (say
1000 bits) the task is infeasible, given current computing power, even in thousands of years. So
this is another one-way function.
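To make the asymmetry concrete, here is a minimal Python sketch of discrete exponentiation and a naive discrete logarithm. The toy parameters p = 11 and g = 2 and the function names are assumptions chosen for illustration; they are far too small to be hard, and the exhaustive search shown is only the most obvious inversion strategy, not the best known one.

```python
def discrete_exp(g: int, x: int, p: int) -> int:
    """Forward direction: f(x) = g^x mod p, fast via square-and-multiply."""
    return pow(g, x, p)


def discrete_log_bruteforce(g: int, y: int, p: int) -> int:
    """Backward direction by exhaustive search over x = 0, 1, ..., p-2."""
    acc = 1  # invariant: acc == g^x mod p
    for x in range(p - 1):
        if acc == y:
            return x
        acc = (acc * g) % p
    raise ValueError("y is not a power of g modulo p")


# Toy parameters (assumed for illustration): 2 generates Z_11^*.
p, g = 11, 2
powers = sorted(discrete_exp(g, x, p) for x in range(p - 1))

x = 7
y = discrete_exp(g, x, p)                      # easy direction
recovered = discrete_log_bruteforce(g, y, p)   # takes up to p-1 steps
```

The forward direction needs a number of multiplications proportional to the bit length of x, while this search takes time proportional to p itself, which is what makes it hopeless for a 1000-bit p.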
It should be emphasized though that these functions have not been proven to be hard functions
to invert. Like P versus NP, whether or not there is a good one-way function out there is an open
question. We have some candidate examples, and we work with them. Thus, cryptography is built
on assumptions. If the assumptions are wrong, a lot of protocols might fail. In the meantime we
live with them.
[Figure: the stages Problem, Definition, Protocol, Reduction, Implement, DONE.]
onewayness of RSA to the security of my protocol, I am giving you a transformation with the
following property. Suppose you claim to be able to break my protocol P . Let A be the adversary
that you have that does this. My transformation takes A and turns it into another adversary, A′ ,
that breaks RSA. Conclusion: as long as we believe you can’t break RSA, there could be no such
adversary A. In other words, my protocol is secure.
Those familiar with the theory of NP-completeness will recognize that the basic idea of reduc-
tions is the same. When we provide a reduction from SAT to some computational problem Ξ we are
saying our Ξ is hard unless SAT is easy; when we provide a reduction from RSA to our protocol Π,
we are saying that Π is secure unless RSA is easy to invert. The analogy is further spelled out in
Fig. 1.11, for the benefit of those of you familiar with the notion of NP-Completeness.
Experience has taught us that the particulars of reductions in cryptography are a little harder
to comprehend than they were in elementary complexity theory. Part of the difficulty lies in the
fact that every problem domain will have its own unique notion of what is an “effective attack.”
It’s rather like having a different “version” of the notion of NP-Completeness as you move from
one problem to another. We will also be concerned with the quality of reductions. One could have
concerned oneself with this in complexity theory, but it’s not usually done. For doing practical
work in cryptography, however, paying attention to the quality of reductions is important. Given
these difficulties, we will proceed rather slowly through the ideas. Don’t worry; you will get it (even
if you never heard of NP-Completeness).
The concept of using reductions in cryptography is a beautiful and powerful idea. Some of us
by now are so used to it that we can forget how innovative it was! And for those not used to it,
it can be hard to understand (or, perhaps, believe) at first hearing—perhaps because it delivers so
much. Protocols designed this way truly have superior security guarantees.
In some ways the term “provable security” is misleading. As the above indicates, what is
probably the central step is providing a model and definition, which does not involve proving
anything. And then, one does not “prove a scheme secure:” one provides a reduction of the
security of the scheme to the security of some underlying atomic primitive. For that reason, we
sometimes use the term “reductionist security” instead of “provable security” to refer to this genre
of work.
variables and their expectations. We won’t use anything deep from probability theory, but we will
draw heavily on the language and basic concepts of this field.
You should know about alphabets, strings and formal languages, in the style of an undergraduate
course in the theory of computation.
You should know about algorithms and how to measure their complexity. In particular, you
should have taken and understood at least an undergraduate algorithms class.
Most of all you should have general mathematical maturity, meaning, especially, you need to
be able to understand what is (and what is not) a proper definition.
1.6 Problems
Problem 1 Besides the symmetric and the asymmetric trust models, think of a couple more ways
to “create asymmetry” between the receiver and the adversary. Show how you would encrypt a bit
in your model.
Problem 2 In the telephone coin-flipping protocol, what should happen if Alice refuses to send
her second message? Is this potentially damaging?
Problem 3 Argue that what we have said about keeping the algorithm public but the key secret
is fundamentally meaningless.
Problem 5 Composition of EPT Algorithms. John designs an EPT (expected polynomial time)
algorithm to solve some computational problem Π—but he assumes that he has in hand a black-
box (i.e., a unit-time subroutine) which solves some other computational problem, Π′. Ted soon
discovers an EPT algorithm to solve Π′ . True or false: putting these two pieces together, John and
Ted now have an EPT algorithm for Π. Give a proof or counterexample.
(When we speak of the worst-case running time of machine M we are looking at the function
T(n) which gives, for each n, the maximal time which M might spend on an input of size n:
T(n) = max_{x : |x|=n} [#Steps_M(x)]. When we speak of the expected running time of M we are instead
looking at the function T(n) which gives, for each n, the maximal value among inputs of length n of the
expected value of the running time of M on this input, that is, T(n) = max_{x : |x|=n} E[#Steps_M(x)],
where the expectation is over the random choices made by M.)
Bibliography
[DH] Whitfield Diffie and Martin Hellman. New directions in cryptography. IEEE Trans.
Info. Theory, Vol. IT-22, No. 6, November 1976, pp. 644–654.
Chapter 2
Classical Encryption
In this chapter we take a quick look at some classical encryption techniques, illustrating their
weakness and using these examples to initiate questions about how to define privacy. We then
discuss Shannon’s notion of perfect security.
symmetric encryption scheme in which the output of the key-generation algorithm K is always a
permutation over Σ and the encryption and decryption algorithms are as follows:
Algorithm Eπ(M)
   For i = 1, . . . , |M| do
      C[i] ← π(M[i])
   Return C
Algorithm Dπ(C)
   For i = 1, . . . , |C| do
      M[i] ← π^{-1}(C[i])
   Return M
Above, the plaintext M is a string over Σ, as is the ciphertext C. The key is denoted π and is a
permutation over Σ. We will let Keys(SE) denote the set of all keys that might be output by K.
There are many possible substitution ciphers over Σ, depending on the set Keys(SE). In the
simplest case, this is the set of all permutations over Σ, and K is picking a permutation at random.
But one might consider schemes in which permutations are chosen from a much smaller set.
In our examples, unless otherwise indicated, the alphabet will be the English one defined above,
namely Σ contains the 26 English letters, the blank symbol ⊔, and punctuation symbols. We will,
for simplicity, restrict attention to substitution ciphers that are punctuation respecting. By this we
mean that any key (permutation) π ∈ Keys(SE) leaves blanks and punctuation marks unchanged.
In specifying such a key, we need only say how it transforms each of the 26 English letters.
Example 2.1.1 This is an example of how encryption is performed with a (punctuation respecting)
substitution cipher. An example key (permutation) π is depicted below:
σ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
π(σ) D B U P W I Z L A F N S G K H T J X C M Y O V E Q R
Note every English letter appears exactly once in the second row of the table. That’s why
π is called a permutation. The inverse permutation π^{-1} is obtained by reading the table backwards.
Thus π^{-1}(D) = A and so on. The encryption of the plaintext
M = HI THERE
is
C = π(H)π(I)π(⊔)π(T)π(H)π(E)π(R)π(E) = LA MLWXW
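The example can be checked mechanically. The following Python sketch implements a punctuation-respecting substitution cipher with the key from the table above; the names PI, PI_INV, encrypt and decrypt are our own, chosen for illustration.

```python
import string

LETTERS = string.ascii_uppercase

# The example key: PI[sigma] = pi(sigma), read off the table row by row.
PI = dict(zip(LETTERS, "DBUPWIZLAFNSGKHTJXCMYOVEQR"))
# The inverse permutation, obtained by "reading the table backwards".
PI_INV = {image: letter for letter, image in PI.items()}


def encrypt(key: dict, message: str) -> str:
    """Apply the permutation letter by letter; blanks and punctuation are unchanged."""
    return "".join(key.get(ch, ch) for ch in message)


def decrypt(inverse_key: dict, ciphertext: str) -> str:
    """Decryption is the same loop run with the inverse permutation."""
    return "".join(inverse_key.get(ch, ch) for ch in ciphertext)


ciphertext = encrypt(PI, "HI THERE")
plaintext = decrypt(PI_INV, ciphertext)
```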
Now let SE = (K, E, D) be an arbitrary substitution cipher. We are interested in its security. To
assess this we think about what the adversary has and what it might want to do.
The adversary begins with the disadvantage of not being given the key π. It is assumed however
to come in possession of a ciphertext C. The most basic goal that we can consider for it is that it
wants to recover the plaintext M = D(π, C) underlying C.
The adversary is always assumed to know the “rules of the game.” Meaning, it knows the
algorithms K, E, D. It knows that a substitution cipher is being used, and that it is punctuation
respecting in our case. The only thing it does not know a priori is the key, for that is assumed to
have been shared secretly and privately between the sender and receiver.
So the adversary sees some gibberish, such as the text LA MLWXW. One might imagine that in
the absence of the key π it would have a tough time figuring out that the message was HI THERE.
But in fact, substitution ciphers are not so hard to cryptanalyze. Indeed, breaking a substitution
cipher is a popular exercise in a Sunday newspaper or magazine, and many of you may have done
it. The adversary can use its knowledge of the structure of English text to its advantage. Often a
good way to begin is by making what is called a frequency table. This table shows, for every letter
τ, how often τ occurs in the ciphertext. Now it turns out that the most common letter in English
text is typically E. The next most common are the group T, A, O, I, N, S, H, R. (These letters have
roughly the same frequency, somewhat lower than that of E, but higher than other letters.) So if X
is the most frequent ciphertext symbol, a good guess would be that it represents E. (The guess is
not necessarily true, but one attempts to validate or refute it in further stages.) Another thing to
do is look for words that have few letters. Thus, if the letter T occurs by itself in the ciphertext,
we conclude that it must represent A or I. Two letter words give similar information. And so on.
It is remarkable how quickly you can (usually) figure out the key.
τ          A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
π^{-1}(τ)  - R T - - - - - - - - - - - H - - - - A - - - E - -
π^{-1}(τ)  - R T I - - - - N - - - - - H C - - - A - W - E - -
π^{-1}(τ)  L R T I - - M F N - O - - - H C - S - A - W - E - -
π^{-1}(τ)  L R T I - - M F N - O - - P H C U S - A D W - E - -
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
3 3 7 4 0 0 2 3 9 0 4 0 0 1 8 3 2 4 0 8 3 4 0 13 0 0
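A frequency table like the one above is easy to compute. The sketch below, in Python, counts letters in the ciphertext analyzed in this example (it is written out in full further on); the variable and function names are our own.

```python
from collections import Counter

# The ciphertext analyzed in this example, with ASCII apostrophes.
CIPHERTEXT = (
    "COXBX TBX CVK CDGXR DI T GTI'R ADHX VOXI OX ROKQAU IKC RNXPQATCX: VOXI OX "
    "PTI'C THHKBU DC, TIU VOXI OX PTI."
)


def frequency_table(text: str) -> Counter:
    """Count how often each letter occurs, ignoring blanks and punctuation."""
    return Counter(ch for ch in text if ch.isalpha())


freq = frequency_table(CIPHERTEXT)
# Most frequent ciphertext letters first: candidates for E, then T, A, O, I, N, S, H, R.
for letter, count in freq.most_common(5):
    print(letter, count)
```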
The most common symbol being X, we guess that π^{-1}(X) = E. Now we see the word OX, and,
assuming X represents E, O must represent one of B, H, M, W. We also note that O has a pretty high
frequency count, namely 8. So my guess is that O falls in the second group of letters mentioned
above. But of the letters B, H, M and W, only H is in this group, so let’s guess that π^{-1}(O) = H. Now,
consider the first word in the ciphertext, namely COXBX. We read it as ∗HE∗E. This could be THERE
or THESE. I will guess that π^{-1}(C) = T, keeping in mind that π^{-1}(B) should be either R or S. The
letter T occurs on its own in the ciphertext, so must represent A or I. But the second ciphertext
word can now be read as ∗RE or ∗SE, depending on our guess for B discussed above. We know that
the ∗ (which stands for the letter T in the ciphertext) decodes to either A or I. Even though a few
choices yield English words, my bet is the word is ARE, so I will guess π^{-1}(T) = A and π^{-1}(B) = R.
The first row of the table in Fig. 2.1 shows where we are. Now let us write the ciphertext again,
this time indicating above different letters what we believe them to represent:
THERE ARE T∗∗ T∗∗E∗ ∗∗ A ∗A∗’∗ ∗∗∗E ∗HE∗ HE ∗H∗∗∗∗ ∗∗T ∗∗E∗∗∗ATE: ∗HE∗ HE
COXBX TBX CVK CDGXR DI T GTI’R ADHX VOXI OX ROKQAU IKC RNXPQATCX: VOXI OX
∗A∗’T A∗∗∗R∗ ∗T, A∗∗ ∗HE∗ HE ∗A∗.
PTI’C THHKBU DC, TIU VOXI OX PTI.
Since the last letter of the ciphertext word DC represents T, the first letter must represent A or I.
But we already have something representing A, so we guess that π^{-1}(D) = I. From the ciphertext
word DI it follows that I must be either N, T or S. It can’t be T because C already represents T. But
I is also the last letter of the ciphertext word VOXI, and ∗HEN is a more likely ending than ∗HES so
I will guess π^{-1}(I) = N. To make sense of the ciphertext word VOXI, I then guess that π^{-1}(V) = W.
The ciphertext word PTI’C is now ∗AN’T and so surely π^{-1}(P) = C. The second row of the table of
Fig. 2.1 shows where we are now, and our text looks like:
THERE ARE TW∗ TI∗E∗ IN A ∗AN’∗ ∗I∗E WHEN HE ∗H∗∗∗∗ N∗T ∗∗EC∗∗ATE: WHEN HE
COXBX TBX CVK CDGXR DI T GTI’R ADHX VOXI OX ROKQAU IKC RNXPQATCX: VOXI OX
CAN’T A∗∗∗R∗ IT, AN∗ WHEN HE CAN.
PTI’C THHKBU DC, TIU VOXI OX PTI.
At this point I can decrypt the first 8 words of the ciphertext pretty easily: THERE ARE TWO TIMES
IN A MAN’S LIFE. The third row of the table of Fig. 2.1 shows where we are after I put in the
corresponding guesses. Applying them, our status is:
THERE ARE TWO TIMES IN A MAN’S LIFE WHEN HE SHO∗L∗ NOT S∗EC∗LATE: WHEN HE
COXBX TBX CVK CDGXR DI T GTI’R ADHX VOXI OX ROKQAU IKC RNXPQATCX: VOXI OX
CAN’T AFFOR∗ IT, AN∗ WHEN HE CAN.
PTI’C THHKBU DC, TIU VOXI OX PTI.
The rest is easy. The decryption is:
THERE ARE TWO TIMES IN A MAN’S LIFE WHEN HE SHOULD NOT SPECULATE: WHEN HE
COXBX TBX CVK CDGXR DI T GTI’R ADHX VOXI OX ROKQAU IKC RNXPQATCX: VOXI OX
CAN’T AFFORD IT, AND WHEN HE CAN.
PTI’C THHKBU DC, TIU VOXI OX PTI.
The last row of the table of Fig. 2.1 shows our final knowledge of the key π. The text, by the
way, is a quotation from Mark Twain.
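The bookkeeping in this example (apply the guesses made so far, leave the rest blank) can be sketched in Python as follows. The dictionaries STAGE_1 and FINAL hold the inverse-key entries guessed at the first stage and at the end of the analysis; these names, and the use of * for letters not yet determined, are our own choices for illustration.

```python
CIPHERTEXT = (
    "COXBX TBX CVK CDGXR DI T GTI'R ADHX VOXI OX ROKQAU IKC RNXPQATCX: VOXI OX "
    "PTI'C THHKBU DC, TIU VOXI OX PTI."
)

# Partial knowledge of the inverse key: ciphertext letter -> guessed plaintext letter.
STAGE_1 = dict(zip("BCOTX", "RTHAE"))
FINAL = dict(zip("ABCDGHIKNOPQRTUVX", "LRTIMFNOPHCUSADWE"))


def partial_decrypt(inverse_key: dict, ciphertext: str, unknown: str = "*") -> str:
    """Decode the letters we have guesses for; mark the others as unknown."""
    out = []
    for ch in ciphertext:
        if ch.isalpha():
            out.append(inverse_key.get(ch, unknown))
        else:
            out.append(ch)  # punctuation respecting: blanks etc. pass through
    return "".join(out)


after_first_guesses = partial_decrypt(STAGE_1, CIPHERTEXT)
fully_decrypted = partial_decrypt(FINAL, CIPHERTEXT)
```

Note that only 17 of the 26 entries of the key are ever determined: letters that do not occur in the ciphertext give no information about their preimages.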
Some people argue that this type of cryptanalysis is not possible if the ciphertext is short, and
thus that substitution ciphers work fine if, say, one changes the key quite frequently. Other people
argue for other kinds of variants and extensions. And in fact, these types of systems have been the
basis for encryption until relatively modern times. We could spend a lot of time on this subject,
and many books do, but we won’t. The reason is that, as we will explain, the idea of a substitution
cipher is flawed at a quite fundamental level, and the flaw remains in the various variations and
enhancements proposed. It will take some quite different ideas to get systems that deliver quality
privacy.
To illustrate why the idea of a substitution cipher is flawed at a fundamental level, consider the
following example usage of the scheme. A polling station has a list of voters, call them V1 , V2 , . . . , Vn .
Each voter casts a (secret) ballot which is a choice between two values. You could think of them
as YES or NO, being votes on some Proposition, or BUSH and KERRY. In any case, we represent
them as letters: the two choices are Y and N. At the end of the day, the polling station has a list
v1 , . . . , vn of n votes, where vi is Vi ’s vote. Each vote being a letter, either Y or N, we can think of
the list of votes as a string v = v1 . . . vn over the alphabet of English letters. The polling station
wants to transmit this string to a tally center, encrypted in order to preserve anonymity of votes.
The polling station and tally center have agreed on a key π for a substitution cipher. The polling
station encrypts the message string v to get a ciphertext string c = π(v1 ) . . . π(vn ) and transmits
this to the tally center. Our question is, is this secure?
It quickly becomes apparent that it is not. There are only two letters in v, namely Y and N.
This means that c also contains only two letters. Let’s give them names, say A and B. One of these
is π(Y) and the other is π(N). If the adversary knew which is which, it would, from the ciphertext,
know the votes of all voters. But we claim that it is quite easy for the adversary to know which is
which. Consider for example that the adversary is one of the voters, say V1 . So it knows its own
vote v1 . Say this is Y. It now looks at the first symbol in the ciphertext. If this is A, then it knows
that A = π(Y) and thus that B = π(N), and can now immediately recover all of v2 , . . . , vn from the
ciphertext. (If the first symbol is B, it is the other way around, but again it recovers all votes.)
This attack works even when the ciphertext is short (that is, when n is small). The weakness
it exhibits is in the very nature of the cipher, namely that a particular letter is always encrypted
in the same way, and thus repetitions can be detected.
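The attack by voter V1 takes only a few lines. Here is a minimal Python sketch, under the assumption that the adversary knows the position and value of one vote; the function names and the sample vote string are hypothetical.

```python
import random
import string


def encrypt_votes(key: dict, votes: str) -> str:
    """Substitution-cipher encryption of a string of Y/N votes."""
    return "".join(key[v] for v in votes)


def recover_votes(ciphertext: str, known_index: int, known_vote: str) -> str:
    """Recover every vote from the ciphertext and a single known vote."""
    known_symbol = ciphertext[known_index]          # this symbol is pi(known_vote)
    other_vote = "N" if known_vote == "Y" else "Y"  # any other symbol is the other vote
    return "".join(known_vote if c == known_symbol else other_vote for c in ciphertext)


# A random key, unknown to the adversary (seeded only to make the run repeatable).
rng = random.Random(2005)
letters = list(string.ascii_uppercase)
shuffled = letters[:]
rng.shuffle(shuffled)
key = dict(zip(letters, shuffled))

votes = "YNNYYNYN"                                   # sample votes v1 ... v8
ciphertext = encrypt_votes(key, votes)
recovered = recover_votes(ciphertext, 0, votes[0])   # V1 knows only v1
```

The function recover_votes never looks at the key, which is the point: the repetition pattern of the ciphertext alone gives the votes away.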
Pinpointing this weakness illustrates something of the mode of thought we need to
develop in cryptography. We need to be able to think about usage or application scenarios in which
a scheme that otherwise seems good will be seen to be bad. For example, above, we considered not
only that the encrypted text is votes, but that the adversary could be one of the voters. We need
to always ask, “what if?”
We want symmetric encryption schemes that are not subject to the types of attacks above and,
in particular, would provide security in an application such as the voting one. Towards this end we
now consider one-time-pad encryption.
that C arises as the ciphertext is the same whether M1 or M2 was chosen to be encrypted.
Let us now show that a substitution cipher fails to have this property, even if the message
encrypted is very short, say three letters.
Claim 2.2.2 Let SE = (K, E, D) be a substitution cipher over the alphabet Σ consisting of the 26
English letters. Assume that K picks a random permutation over Σ as the key. (That is, its code
is π ←$ Perm(Σ); return π.) Let Plaintexts be the set of all three letter English words. Assume we
use SE to encrypt a single message from Plaintexts. Then SE is not perfectly secure.
Intuitively, this is due to the weakness we saw above, namely that if a letter appears twice in a
plaintext, it is encrypted the same way in the ciphertext, and thus repetitions can be detected.
If M1 , M2 are messages such that M1 contains repeated letters but M2 does not, then, if the
adversary sees a ciphertext with repeated letters it knows that M1 was encrypted. This means that
this particular ciphertext has different probabilities of showing up in the two cases. We now make
all this formal.
Proof of Claim 2.2.2: We are asked to show that the condition of Definition 2.2.1 does not hold,
so the first thing to do is refer to the definition and write down what it means for the condition
to not hold. It is important here to be careful with the logic. The negation of “for all
M1, M2, C some condition holds” is “there exist M1, M2, C such that the condition does not hold.”
Accordingly, we need to show there exist M1, M2 ∈ Plaintexts, and there exists C, such that
Pr[Eπ(M1) = C] ≠ Pr[Eπ(M2) = C].   (2.2)
We have replaced K with π because the key here is a permutation.
We establish the above by picking M1 , M2 , C in a clever way. Namely we set M1 to some three
letter word that contains a repeated letter; specifically, let us set it to FEE. We set M2 to a three
letter word that does not contain any repeated letter; specifically, let us set it to FAR. We set C to
XYY, a ciphertext that has the same “pattern” as FEE in the sense that the last two letters are the
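same and the first is different from these. Now we evaluate the probabilities in question:
Pr[Eπ(M1) = C] = Pr[Eπ(FEE) = XYY]
               = |{ π ∈ Perm(Σ) : Eπ(FEE) = XYY }| / |Perm(Σ)|
               = 24! / 26!
               = 1/650.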
Recall that the probability is over the choice of key, here π, which is chosen at random from
Perm(Σ), and over the coins of E, if any. In this case, E does not toss coins, so the probability
is over π alone. The probability can be expressed as the ratio of the number of choices of π for
which the stated event, namely that Eπ (FEE) = XYY, is true, divided by the total number of possible
choices of π, namely the size of the set Perm(Σ) from which π is drawn. The second term is 26!.
For the first, we note that the condition means that π(F) = X and π(E) = Y, but the value of π on
any of the other 24 input letters may still be any value other than X or Y. There are 24! different
ways to assign distinct values to the remaining 24 inputs to π, so this is the numerator above. Now,
we proceed similarly for M2:
Pr[Eπ(M2) = C] = Pr[Eπ(FAR) = XYY]
               = |{ π ∈ Perm(Σ) : Eπ(FAR) = XYY }| / |Perm(Σ)|
               = 0 / 26!
               = 0.
In this case, the numerator asks us to count the number of permutations π with the property that
π(F) = X, π(A) = Y and π(R) = Y. But no permutation can have the same output Y on two different
inputs. So the number of permutations meeting this condition is zero.
In conclusion, we have Equation (2.2) because the two probabilities we computed above are differ-
ent.
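The counting argument can be sanity-checked by brute force. Enumerating all 26! permutations is infeasible, so the Python sketch below runs the same experiment over a six-letter toy alphabet (an assumption made purely so that the 6! = 720 keys can be listed) and evaluates the closed form for the full alphabet separately.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def prob_encrypts_to(alphabet: str, message: str, ciphertext: str) -> Fraction:
    """Pr over a uniformly random permutation pi of `alphabet` that E_pi(message) = ciphertext."""
    hits = total = 0
    for image in permutations(alphabet):
        pi = dict(zip(alphabet, image))
        total += 1
        if "".join(pi[ch] for ch in message) == ciphertext:
            hits += 1
    return Fraction(hits, total)


# Toy alphabet (assumed): just enough letters to contain FEE, FAR and XYY.
SMALL = "AEFRXY"
p_fee = prob_encrypts_to(SMALL, "FEE", "XYY")   # (6-2)!/6!
p_far = prob_encrypts_to(SMALL, "FAR", "XYY")   # no permutation sends A and R both to Y

# The same counting argument for the 26-letter alphabet of the claim.
p_fee_26 = Fraction(factorial(24), factorial(26))
```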
Let us now show that the OTP scheme with key-length m does have the perfect security property.
Intuitively, the reason is as follows. Say m = 3, and consider two messages, say M1 = 010 and
M2 = 001. Say the adversary receives the ciphertext C = 101. It asks itself whether it was M1 or M2
that was encrypted. Well, it reasons, if it was M1 , then the key must have been K = M1 ⊕ C = 111,
while if M2 was encrypted then the key must have been K = M2 ⊕ C = 100. But either of these
two was equally likely as the key, so how do I know which of the two it was? Here now is the formal
statement and proof.
Claim 2.2.3 Let SE = (K, E, D) be the OTP scheme with key-length m ≥ 1. Assume we use it to
encrypt a single message drawn from {0, 1}^m. Then SE is perfectly secure.
Proof of Claim 2.2.3: As per Definition 2.2.1, for any M1, M2 ∈ {0, 1}^m and any C we need to
show that Equation (2.1) is true. So let M1, M2 be any m-bit strings. We can assume C is also an
m-bit string, since otherwise both sides of Equation (2.1) are zero and thus equal. Now
Pr[EK(M1) = C] = Pr[K ⊕ M1 = C]
               = |{ K ∈ {0, 1}^m : K ⊕ M1 = C }| / |{0, 1}^m|
               = 1 / 2^m.
Above, the probability is over the random choice of K from {0, 1}^m, with M1, C fixed. We write
the probability as the ratio of two terms: the first is the number of keys K for which K ⊕ M1 = C,
and the second is the total possible number of keys. The first term is one, because K can only be
the string M1 ⊕ C, while the second term is 2^m. Similarly we have
Pr[EK(M2) = C] = Pr[K ⊕ M2 = C]
               = |{ K ∈ {0, 1}^m : K ⊕ M2 = C }| / |{0, 1}^m|
               = 1 / 2^m.
In this case the numerator of the fraction is one because only the key K = M2 ⊕ C has the
property that K ⊕ M2 = C. Now, since the two probabilities we have computed above are equal,
Equation (2.1) is true, and thus our proof is complete.
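For small m the proof can be replayed by enumeration. The following Python sketch uses the values m = 3, M1 = 010, M2 = 001 and C = 101 from the intuition given before the claim; the helper names are our own.

```python
from fractions import Fraction
from itertools import product


def xor(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def prob_ciphertext(m: int, message: str, ciphertext: str):
    """Pr over a uniform m-bit key K that K xor message = ciphertext, plus the keys achieving it."""
    keys = ["".join(bits) for bits in product("01", repeat=m)]
    hits = [k for k in keys if xor(k, message) == ciphertext]
    return Fraction(len(hits), len(keys)), hits


p1, keys1 = prob_ciphertext(3, "010", "101")   # the only consistent key is M1 xor C
p2, keys2 = prob_ciphertext(3, "001", "101")   # the only consistent key is M2 xor C
```

In both cases exactly one of the 2^3 keys is consistent with the ciphertext, so the two probabilities coincide, as the proof says.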
Perfect security is great in terms of security, but comes at a hefty price. It turns out that in any
perfectly secure scheme, the length of the key must be at least the length of the (single) message
encrypted. This means that in practice a perfectly secure scheme (like the OTP) is prohibitively
expensive, requiring parties to exchange very long keys before they can communicate securely.
In practice we want parties to be able to hold a short key, for example 128 bits, and then be
able to securely encrypt essentially any amount of data. To achieve this, we need to make a switch
regarding what kinds of security attributes we seek. As we discussed in the Introduction, we will
ask for security that is not perfect but good enough, the latter interpreted in a computational sense.
Visualizing an adversary as someone running programs to break our system, we will say something
like, yes, in principle you can break my scheme, but it would take more than 100 years running on
the world’s fastest computers to break it with a probability greater than 2^{−60}. In practice, this is
good enough.
To get schemes like that we need some tools, and this is what we turn to next.
2.3 Problems
Problem 6 Suppose that you want to encrypt a single message M ∈ {0, 1, 2} using a random
shared key K ∈ {0, 1, 2}. Suppose you do this by representing K and M using two bits (00, 01,
or 10), and then XORing the two representations. Does this seem like a good protocol to you?
Explain.
Problem 7 Suppose that you want to encrypt a single message M ∈ {0, 1, 2} using a random
shared key K ∈ {0, 1, 2}. Explain a good way to do this.
Problem 8 Symmetric encryption with a deck of cards. Alice shuffles a deck of cards and deals
it all out to herself and Bob (each of them gets half of the 52 cards). Alice now wishes to send a
secret message M to Bob by saying something aloud. Eavesdropper Eve is listening in: she hears
everything Alice says (but Eve can’t see the cards).
Part A. Suppose Alice’s message M is a string of 48 bits. Describe how Alice can communicate M
to Bob in such a way that Eve will have no information about what M is.
Part B. Now suppose Alice’s message M is 49 bits. Prove that there exists no protocol which allows
Alice to communicate M to Bob in such a way that Eve will have no information about M .
Chapter 3
Blockciphers
Blockciphers are the central tool in the design of protocols for shared-key (a.k.a. symmetric)
cryptography. They are the main available “technology” we have at our disposal. This
chapter will take a look at these objects and describe the state of the art in their construction.
It is important to stress that blockciphers are just tools—raw ingredients for cooking up some-
thing more useful. Blockciphers don’t, by themselves, do something that an end-user would care
about. As with any powerful tool, one has to learn to use this one. Even an excellent blockcipher
won’t give you security if you don’t use it right. But used well, these are powerful tools indeed.
Accordingly, an important theme in several upcoming chapters will be on how to use blockciphers
well. We won’t be emphasizing how to design or analyze blockciphers, as this remains very much
an art.
This chapter gets you acquainted with some typical blockciphers, and discusses attacks on them.
In particular we’ll look at two examples, DES and AES. DES is the “old standby.” It is currently
the most widely-used blockcipher in existence, and it is of sufficient historical significance that every
trained cryptographer needs to have seen its description. AES is a modern blockcipher, and it is
expected to supplant DES in the years to come.
K, C we can readily compute E −1 (K, C). By “readily compute” we mean that there are public and
relatively efficient programs available for these tasks.
In typical usage, a random key K is chosen and kept secret between a pair of users. The function
EK is then used by the two parties to process data in some way before they send it to each other.
Typically, we will assume the adversary will be able to obtain some input-output examples for EK ,
meaning pairs of the form (M, C) where C = EK (M ). But, ordinarily, the adversary will not be
shown the key K. Security relies on the secrecy of the key. So, as a first cut, you might think
of the adversary’s goal as recovering the key K given some input-output examples of EK . The
blockcipher should be designed to make this task computationally difficult. (Later we will refine
the view that the adversary’s goal is key-recovery, seeing that security against key-recovery is a
necessary but not sufficient condition for the security of a blockcipher.)
We emphasize that we’ve said absolutely nothing about what properties a blockcipher should
have. A function like EK (M ) = M is a blockcipher (the “identity blockcipher”), but we shall not
regard it as a “good” one.
How do real blockciphers work? Let’s take a look at some of them to get a sense of this.
3.2.2 Construction
The DES algorithm is depicted in Fig. 3.1. It takes as input a 56-bit key K and a 64-bit plaintext
M. The key-schedule KeySchedule produces from the 56-bit key K a sequence of 16 subkeys, one
for each of the rounds that follows. Each subkey is 48 bits long. We postpone the discussion of the
KeySchedule algorithm.
The initial permutation IP simply permutes the bits of M , as described by the table of Fig. 3.2.
The table says that bit 1 of the output is bit 58 of the input; bit 2 of the output is bit 50 of the
input; . . . ; bit 64 of the output is bit 7 of the input. Note that the key is not involved in this
permutation. The initial permutation does not appear to affect the cryptographic strength of the
algorithm. It might have been included to slow down software implementations.
Figure 3.1: The DES blockcipher. The text and other figures describe the subroutines
KeySchedule, f, IP, IP^{-1}.
IP
58 50 42 34 26 18 10  2
60 52 44 36 28 20 12  4
62 54 46 38 30 22 14  6
64 56 48 40 32 24 16  8
57 49 41 33 25 17  9  1
59 51 43 35 27 19 11  3
61 53 45 37 29 21 13  5
63 55 47 39 31 23 15  7
IP^{-1}
40  8 48 16 56 24 64 32
39  7 47 15 55 23 63 31
38  6 46 14 54 22 62 30
37  5 45 13 53 21 61 29
36  4 44 12 52 20 60 28
35  3 43 11 51 19 59 27
34  2 42 10 50 18 58 26
33  1 41  9 49 17 57 25
Figure 3.2: Tables describing the DES initial permutation IP and its inverse IP^{-1}.
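The way such a table is read ("bit i of the output is bit table[i] of the input", with bits numbered from 1) can be sketched in Python as below. The list-of-bits representation and the helper names are assumptions made for readability; real implementations work on machine words.

```python
# The IP table of Fig. 3.2, read row by row.
IP = [
    58, 50, 42, 34, 26, 18, 10, 2,
    60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6,
    64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1,
    59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5,
    63, 55, 47, 39, 31, 23, 15, 7,
]


def permute(bits: list, table: list) -> list:
    """Output position i (1-based) receives input position table[i-1]."""
    return [bits[j - 1] for j in table]


def invert_table(table: list) -> list:
    """Build the table of the inverse permutation."""
    inverse = [0] * len(table)
    for out_pos, in_pos in enumerate(table, start=1):
        inverse[in_pos - 1] = out_pos
    return inverse


IP_INV = invert_table(IP)

# Label each input position with its own index so the movement is visible.
block = list(range(1, 65))
permuted = permute(block, IP)
restored = permute(permuted, IP_INV)
```

Computing the inverse table this way reproduces the IP^{-1} table of Fig. 3.2, which is a useful check that the table has been read correctly.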
The permuted plaintext is now input to a loop, which operates on it in 16 rounds. Each round
takes a 64-bit input, viewed as consisting of a 32-bit left half and a 32-bit right half, and, under the
influence of the sub-key K_r, produces a 64-bit output. The input to round r is L_{r−1} ‖ R_{r−1}, and
the output of round r is L_r ‖ R_r. Each round is what is called a Feistel round, named after Horst
Feistel, one of the IBM designers of a precursor of DES. Fig. 3.1 shows how it works, meaning how
L_r ‖ R_r is computed as a function of L_{r−1} ‖ R_{r−1}, by way of the function f, the latter depending
on the sub-key K_r associated to the r-th round.
One of the reasons to use this round structure is that it is reversible, which is important to ensure that
DES_K is a permutation for each key K, as it should be to qualify as a blockcipher. Indeed, given
L_r ‖ R_r (and K_r) we can recover L_{r−1} ‖ R_{r−1} via R_{r−1} ← L_r and L_{r−1} ← f(K_r, L_r) ⊕ R_r.
Following the 16 rounds, the inverse of the permutation IP, also depicted in Fig. 3.2, is applied
to the 64-bit output of the 16-th round, and the result of this is the output ciphertext.
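The reversibility of a Feistel round does not depend on what f is. The Python sketch below illustrates this with a made-up round function toy_f (explicitly not the DES f) and made-up subkeys; the forward rule used, L_r = R_{r−1} and R_r = f(K_r, R_{r−1}) ⊕ L_{r−1}, is the one consistent with the inversion formulas above.

```python
MASK32 = 0xFFFFFFFF


def toy_f(subkey: int, half: int) -> int:
    """Stand-in round function (NOT the DES f); it need not be invertible."""
    return ((half * 0x9E3779B1) ^ subkey) & MASK32


def feistel_round(subkey: int, left: int, right: int, f=toy_f):
    """Forward: L_r = R_{r-1},  R_r = f(K_r, R_{r-1}) xor L_{r-1}."""
    return right, f(subkey, right) ^ left


def feistel_round_inverse(subkey: int, left: int, right: int, f=toy_f):
    """Backward: R_{r-1} = L_r,  L_{r-1} = f(K_r, L_r) xor R_r."""
    return f(subkey, left) ^ right, left


# Hypothetical subkeys and input halves, for illustration only.
subkeys = [0x0F0F0F0F + r for r in range(16)]
L0, R0 = 0x01234567, 0x89ABCDEF

L, R = L0, R0
for k in subkeys:                 # 16 rounds forward
    L, R = feistel_round(k, L, R)
encrypted = (L, R)

for k in reversed(subkeys):       # undo the rounds in reverse order
    L, R = feistel_round_inverse(k, L, R)
decrypted = (L, R)
```

Inversion only ever evaluates f in the forward direction, which is why f itself is free to be a many-to-one function.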
A sequence of Feistel rounds is a common high-level design for a blockcipher. For a closer look
we need to see how the function f (·, ·) works. It is shown in Fig. 3.3. It takes a 48-bit subkey J
and a 32-bit input R to return a 32-bit output. The 32-bit R is first expanded into 48 bits via the
function E described by the table of Fig. 3.4. This says that bit 1 of the output is bit 32 of the
input; bit 2 of the output is bit 1 of the input; . . . ; bit 48 of the output is bit 1 of the input.
Note the E function is quite structured. In fact barring that 1 and 32 have been swapped (see
top left and bottom right) it looks almost sequential. Why did they do this? Who knows. That’s
the answer to most things about DES.
Figure 3.3: The f-function of DES. The text and other figures describe the subroutines used.
E
32  1  2  3  4  5
 4  5  6  7  8  9
 8  9 10 11 12 13
12 13 14 15 16 17
16 17 18 19 20 21
20 21 22 23 24 25
24 25 26 27 28 29
28 29 30 31 32  1
P
16  7 20 21
29 12 28 17
 1 15 23 26
 5 18 31 10
 2  8 24 14
32 27  3  9
19 13 30  6
22 11  4 25
Figure 3.4: Tables describing the expansion function E and final permutation P of the DES f-function.
Now the sub-key J is XORed with the output of the E function to yield a 48-bit result that we
continue to denote by R. This is split into 8 blocks, each 6 bits long. To the i-th block we apply
the function S_i, called the i-th S-box. Each S-box is a function taking 6 bits and returning 4 bits.
The result is that the 48-bit R is compressed to 32 bits. These 32 bits are permuted according to
the P permutation described in the usual way by the table of Fig. 3.4, and the result is the output
of the f function. Let us now discuss the S-boxes.
Each S-box is described by a table as shown in Fig. 3.5. Read these tables as follows. S_i takes
a 6-bit input. Write it as b1 b2 b3 b4 b5 b6. Read the middle bits b2 b3 b4 b5 as an integer in the range
0, . . . , 15, naming a column in the table describing S_i. Let the outer bits b1 b6 name a row in the
table describing S_i. Take the row b1 b6, column b2 b3 b4 b5 entry of the table of S_i to get an integer
in the range 0, . . . , 15. The output of S_i on input b1 b2 b3 b4 b5 b6 is the 4-bit string corresponding
to this table entry.
The S-boxes are the heart of the algorithm, and much effort was put into designing them to
achieve various security goals and resistance to certain attacks.
Finally, we discuss the key schedule. It is shown in Fig. 3.6. Each round sub-key K_r is formed
by taking some 48 bits of K. Specifically, a permutation called PC-1 is first applied to the 56-bit
key to yield a permuted version of it. This is then divided into two 28-bit halves and denoted
C_0 ∥ D_0. The algorithm now goes through 16 rounds. The r-th round takes input C_{r−1} ∥ D_{r−1},
computes C_r ∥ D_r, and applies a function PC-2 that extracts 48 bits from this 56-bit quantity. This
is the sub-key K_r for the r-th round. The computation of C_r ∥ D_r is quite simple. The bits of C_{r−1}
are rotated to the left j positions to get C_r, and D_r is obtained similarly from D_{r−1}, where j is
either 1 or 2, depending on r.
           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00   14  4 13  1  2 15 11  8  3 10  6 12  5  9  0  7
S1:  01    0 15  7  4 14  2 13  1 10  6 12 11  9  5  3  8
     10    4  1 14  8 13  6  2 11 15 12  9  7  3 10  5  0
     11   15 12  8  2  4  9  1  7  5 11  3 14 10  0  6 13

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00   15  1  8 14  6 11  3  4  9  7  2 13 12  0  5 10
S2:  01    3 13  4  7 15  2  8 14 12  0  1 10  6  9 11  5
     10    0 14  7 11 10  4 13  1  5  8 12  6  9  3  2 15
     11   13  8 10  1  3 15  4  2 11  6  7 12  0  5 14  9

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00   10  0  9 14  6  3 15  5  1 13 12  7 11  4  2  8
S3:  01   13  7  0  9  3  4  6 10  2  8  5 14 12 11 15  1
     10   13  6  4  9  8 15  3  0 11  1  2 12  5 10 14  7
     11    1 10 13  0  6  9  8  7  4 15 14  3 11  5  2 12

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00    7 13 14  3  0  6  9 10  1  2  8  5 11 12  4 15
S4:  01   13  8 11  5  6 15  0  3  4  7  2 12  1 10 14  9
     10   10  6  9  0 12 11  7 13 15  1  3 14  5  2  8  4
     11    3 15  0  6 10  1 13  8  9  4  5 11 12  7  2 14

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00    2 12  4  1  7 10 11  6  8  5  3 15 13  0 14  9
S5:  01   14 11  2 12  4  7 13  1  5  0 15 10  3  9  8  6
     10    4  2  1 11 10 13  7  8 15  9 12  5  6  3  0 14
     11   11  8 12  7  1 14  2 13  6 15  0  9 10  4  5  3

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00   12  1 10 15  9  2  6  8  0 13  3  4 14  7  5 11
S6:  01   10 15  4  2  7 12  9  5  6  1 13 14  0 11  3  8
     10    9 14 15  5  2  8 12  3  7  0  4 10  1 13 11  6
     11    4  3  2 12  9  5 15 10 11 14  1  7  6  0  8 13

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00    4 11  2 14 15  0  8 13  3 12  9  7  5 10  6  1
S7:  01   13  0 11  7  4  9  1 10 14  3  5 12  2 15  8  6
     10    1  4 11 13 12  3  7 14 10 15  6  8  0  5  9  2
     11    6 11 13  8  1  4 10  7  9  5  0 15 14  2  3 12

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     00   13  2  8  4  6 15 11  1 10  9  3 14  5  0 12  7
S8:  01    1 15 13  8 10  3  7  4 12  5  6 11  0 14  9  2
     10    7 11  4  1  9 12 14  2  0  6 10 13 15  3  5  8
     11    2  1 14  7  4 10  8 13 15 12  9  0  3  5  6 11
Figure 3.5: The DES S-boxes.
The functions PC-1 and PC-2 are tabulated in Fig. 3.7. The first table needs to be read in a
strange way. It contains 56 integers, these being all integers in the range 1, . . . , 64 barring multiples
of 8. Given a 56-bit string K = K[1] . . . K[56] as input, the corresponding function returns the
56-bit string L = L[1] . . . L[56] computed as follows. Suppose 1 ≤ i ≤ 56, and let a be the i-th
entry of the table. Write a = 8q + r where 1 ≤ r ≤ 7. Then let L[i] = K[a − q]. As an example, let
us determine the first bit, L[1], of the output of the function on input K. We look at the first entry
in the table, which is 57. We divide it by 8 to get 57 = 8(7) + 1. So L[1] equals K[57 − 7] = K[50],
meaning the 1st bit of the output is the 50-th bit of the input. On the other hand PC-2 is read in
the usual way as a map taking a 56-bit input to a 48-bit output: bit 1 of the output is bit 14 of
the input; bit 2 of the output is bit 17 of the input; . . . ; bit 48 of the output is bit 32 of the input.
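The unusual reading rule for PC-1 can be sketched in Python as follows. The table is copied from Fig. 3.7; the helper names pc1_source_index and pc1 are assumptions made for this illustration.

```python
# PC-1 table of Fig. 3.7: all integers in 1..64 except multiples of 8.
PC1_TABLE = [
    57, 49, 41, 33, 25, 17,  9,
     1, 58, 50, 42, 34, 26, 18,
    10,  2, 59, 51, 43, 35, 27,
    19, 11,  3, 60, 52, 44, 36,
    63, 55, 47, 39, 31, 23, 15,
     7, 62, 54, 46, 38, 30, 22,
    14,  6, 61, 53, 45, 37, 29,
    21, 13,  5, 28, 20, 12,  4,
]


def pc1_source_index(a):
    """Write a = 8q + r with 1 <= r <= 7 and return a - q."""
    q, r = divmod(a, 8)
    assert 1 <= r <= 7
    return a - q


def pc1(key_bits):
    """Apply PC-1 to a list of 56 values K[1..56] (stored 0-indexed)."""
    assert len(key_bits) == 56
    return [key_bits[pc1_source_index(a) - 1] for a in PC1_TABLE]


# Which key position feeds each output position?
sources = [pc1_source_index(a) for a in PC1_TABLE]
print(sources[:7])
```

The first entry of sources reproduces the worked example in the text (table entry 57 selects key bit 50), and the full list is a rearrangement of 1, . . . , 56, which is why the rule defines a permutation of the 56 key bits.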
Figure 3.6: The key schedule of DES. Here leftshift_j denotes the function that rotates its input to
the left by j positions.
        PC-1                      PC-2
57 49 41 33 25 17  9       14 17 11 24  1  5
 1 58 50 42 34 26 18        3 28 15  6 21 10
10  2 59 51 43 35 27       23 19 12  4 26  8
19 11  3 60 52 44 36       16  7 27 20 13  2
63 55 47 39 31 23 15       41 52 31 37 47 55
 7 62 54 46 38 30 22       30 40 51 45 33 48
14  6 61 53 45 37 29       44 49 39 56 34 53
21 13  5 28 20 12  4       46 42 50 36 29 32
Figure 3.7: Tables describing the PC-1 and PC-2 functions used by the DES key schedule of Fig. 3.6.
Well now you know how DES works. Of course, the main questions about the design are:
why, why and why? What motivated these design choices? We don’t know too much about this,
although we can guess a little. And one of the designers of DES, Don Coppersmith, has written a
short paper which provides some information.
3.2.3 Speed
One of the design goals of DES was that it would have fast implementations relative to the technology
of its time. How fast can you compute DES? In roughly current technology (well, nothing
is current by the time one writes it down!) one can get well over 1 Gbit/sec on high-end VLSI.
Specifically at least 1.6 Gbits/sec, maybe more. That’s pretty fast. Perhaps a more interesting
figure is that one can implement each DES S-box with at most 50 two-input gates, where the circuit
has depth of only 3. Thus one can compute DES by a combinatorial circuit of about 8 · 16 · 50 = 6400
gates and depth of 3 · 16 = 48 gates.
In software, on a fairly modern processor, DES takes something like 80 cycles per byte. This
is disappointingly slow—not surprisingly, since DES was optimized for hardware and was designed
before the days in which software implementations were considered feasible or desirable.
We fix a blockcipher E: {0, 1}^k × {0, 1}^n → {0, 1}^n having key-size k and block size n. It is
assumed that the attacker knows the description of E and can compute it. For concreteness, you
can think of E as being DES.
Historically, cryptanalysis of blockciphers has focused on key-recovery. The cryptanalyst may
think of the problem to be solved as something like this. A k-bit key T, called the target key, is
chosen at random. Let q ≥ 0 be some integer parameter.
Given: The adversary has a sequence of q input-output examples of E_T, say
(M_1, C_1), . . . , (M_q, C_q)
where C_i = E_T(M_i) for i = 1, . . . , q and M_1, . . . , M_q are all distinct n-bit strings.
Find: The adversary wants to find the target key T.
Let us say that a key K is consistent with the input-output examples (M_1, C_1), . . . , (M_q, C_q) if
E_K(M_i) = C_i for all 1 ≤ i ≤ q. We let
Cons_E((M_1, C_1), . . . , (M_q, C_q))
be the set of all keys consistent with the input-output examples (M_1, C_1), . . . , (M_q, C_q). Of course
the target key T is in this set. But the set might be larger, containing other keys. Without
asking further queries, a key-recovery attack cannot hope to differentiate the target key from other
members of Cons_E((M_1, C_1), . . . , (M_q, C_q)). Thus, the goal is sometimes viewed as simply being to
find some key in this set. For practical blockciphers we expect that, if a few input-output examples
are used, the size of the above set will be one, so the adversary can indeed find the target key. We
will exemplify this when we consider specific attacks.
Some typical kinds of “attack” that are considered within this framework:
Known-message attack: M_1, . . . , M_q are any distinct points; the adversary has no control over
them, and must work with whatever it gets.
Chosen-message attack: M_1, . . . , M_q are chosen by the adversary, perhaps even adaptively.
That is, imagine it has access to an “oracle” for the function E_T. It can feed the oracle M_1 and
get back C_1 = E_T(M_1). It can then decide on a value M_2, feed the oracle this, and get back C_2,
and so on.
Clearly a chosen-message attack gives the adversary more power, but it may be less realistic in
practice.
The most obvious attack strategy is exhaustive key search. The adversary goes through all
possible keys K′ ∈ {0, 1}^k until it finds one that explains the input-output pairs. Here is the attack
in detail, using q = 1, meaning one input-output example. For i = 1, . . . , 2^k let T_i denote the i-th
k-bit string (in lexicographic order).
algorithm A^eks_E(M_1, C_1)
   for i = 1, . . . , 2^k do
      if E_{T_i}(M_1) = C_1 then return T_i
This attack always returns a key consistent with the given input-output example (M_1, C_1). Whether
or not it is the target key depends on the blockcipher. If one imagines the blockcipher to be random,
then the blockcipher’s key length and block length are relevant in assessing if the above attack will
find the “right” key. The likelihood of the attack returning the target key can be increased by
testing against more input-output examples:
algorithm A^eks_E((M_1, C_1), . . . , (M_q, C_q))
   for i = 1, . . . , 2^k do
      if E_{T_i}(M_1) = C_1 then
         if (E_{T_i}(M_2) = C_2 and · · · and E_{T_i}(M_q) = C_q) then return T_i
A fairly small value of q, say somewhat more than k/n, is enough that this attack will usually return
the target key itself. For DES, q = 1 or q = 2 seems to be enough.
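To make exhaustive key search concrete, here is a minimal Python sketch run against a toy blockcipher. Everything about the toy cipher (toy_enc, its 8-bit key, 16-bit block, and round constants) is a hypothetical stand-in chosen so the search finishes instantly; it is not DES and has no security value.

```python
MASK = 0xFFFF          # toy block size n = 16 bits
KEY_BITS = 8           # toy key size k = 8 bits
MULT = 0x6487          # odd multiplier, so multiplication mod 2**16 is invertible


def toy_enc(key, m):
    """Toy blockcipher E_key(m): three rounds of xor, multiply, rotate."""
    x = m
    for r in range(3):
        x ^= (key * 0x0101 + r) & MASK
        x = (x * MULT) & MASK
        x = ((x << 5) | (x >> 11)) & MASK
    return x


def exhaustive_key_search(enc, key_bits, examples):
    """Return the first key (in lexicographic order) consistent with all examples."""
    for candidate in range(2 ** key_bits):
        if all(enc(candidate, m) == c for m, c in examples):
            return candidate
    return None


target = 0xA7
examples = [(m, toy_enc(target, m)) for m in (0x0001, 0x1234)]
found = exhaustive_key_search(toy_enc, KEY_BITS, examples)
print(hex(found))
```

The returned key is guaranteed only to be consistent with the examples; whether it equals the target depends on the cipher and on how many examples are supplied, as discussed above.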
Thus, no blockcipher is perfectly secure. It is always possible for an attacker to recover a consistent
key. A good blockcipher, however, is designed to make this task computationally prohibitive.
How long does exhaustive key-search take? Since we will choose q to be small we can neglect the
difference in running time between the two versions of the attack above, and focus for simplicity on
the first attack. In the worst case, it uses 2^k computations of the blockcipher. However it could be
less since one could get lucky. For example if the target key is in the first half of the search space,
only 2^{k−1} computations would be used. So a better measure is how long it takes on the average.
This is
\[
\sum_{i=1}^{2^k} i \cdot \Pr[T = T_i]
\;=\; \sum_{i=1}^{2^k} \frac{i}{2^k}
\;=\; \frac{1}{2^k} \cdot \sum_{i=1}^{2^k} i
\;=\; \frac{1}{2^k} \cdot \frac{2^k (2^k + 1)}{2}
\;=\; \frac{2^k + 1}{2}
\;\approx\; 2^{k-1}
\]
computations of the blockcipher. This is because the target key T is chosen at random, so with
probability 1/2^k it equals T_i, and in that case the attack uses i E-computations to find it.
Thus to make key-recovery by exhaustive search computationally prohibitive, one must make
the key-length k of the blockcipher large enough.
Let’s look at DES. We noted above that there is a VLSI chip that can compute it at the rate of
1.6 Gbits/sec. How long would key-recovery via exhaustive search take using this chip? Since a
DES plaintext is 64 bits, the chip enables us to perform (1.6 · 10^9)/64 = 2.5 · 10^7 DES computations
per second. To perform 2^55 computations (here k = 56) we thus need 2^55/(2.5 · 10^7) ≈ 1.44 · 10^9
seconds, which is about 45.7 years. This is clearly prohibitive.
It turns out that DES has a property called key-complementation that one can exploit to
reduce the size of the search space by one-half, so that the time to find a key by exhaustive search
comes down to 22.8 years. But this is still prohibitive.
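The arithmetic behind these estimates is easy to reproduce. The short Python check below uses the chip rate quoted in the text and assumes a 365-day year; the variable names are illustrative.

```python
CHIP_RATE_BITS_PER_SEC = 1.6e9      # rate quoted in the text
BLOCK_BITS = 64                     # DES block size
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year

des_per_second = CHIP_RATE_BITS_PER_SEC / BLOCK_BITS
average_trials = 2 ** 55            # about half of the 2**56 keys
seconds = average_trials / des_per_second
years = seconds / SECONDS_PER_YEAR
years_with_complementation = years / 2

print(f"{des_per_second:.3g} DES/sec, {seconds:.3g} s, "
      f"{years:.1f} years, {years_with_complementation:.1f} years halved")
```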
Yet, the conclusion that DES is secure against exhaustive key search is actually too hasty. We
will return to this later and see why.
Exhaustive key search is a generic attack in the sense that it works against any blockcipher.
It only involves computing the blockcipher and makes no attempt to analyze the cipher and find
and exploit weaknesses. Cryptanalysts also need to ask themselves if there is some weakness in the
structure of the blockcipher they can exploit to obtain an attack performing better than exhaustive
key search.
For DES, the discovery of such attacks waited until 1990. Differential cryptanalysis is capable
of finding a DES key using about 2^47 input-output examples (that is, q = 2^47) in a chosen-message
attack [1, 2]. Linear cryptanalysis [4] improved on differential cryptanalysis in two ways. The number
of input-output examples required is reduced to 2^44, and only a known-message attack is required. (An
alternative version uses 2^42 chosen plaintexts [24].)
These were major breakthroughs in cryptanalysis that required careful analysis of the DES
construction to find and exploit weaknesses. Yet, the practical impact of these attacks is small.
Why? Ordinarily it would be impossible to obtain 2^44 input-output examples. Furthermore, the
storage requirement for these examples is prohibitive. A single input-output pair, consisting of a
64-bit plaintext and 64-bit ciphertext, takes 16 bytes of storage. When there are 2^44 such pairs, we
need 16 · 2^44 ≈ 2.81 · 10^14 bytes, or about 281 terabytes of storage, which is enormous.
Linear and differential cryptanalysis were however more devastating when applied to other
ciphers, some of which succumbed completely to the attack.
So what’s the best possible attack against DES? The answer is exhaustive key search. What
we ignored above is that the DES computations in this attack can be performed in parallel. In
1993, Wiener argued that one can design a $1 million machine that does the exhaustive key search
for DES in about 3.5 hours on the average [7]. His machine would have about 57,000 chips, each
performing numerous DES computations. More recently, a DES key search machine was actually
built by the Electronic Frontier Foundation, at a cost of $250,000 [5]. It finds the key in 56 hours,
or about 2.5 days on the average. The builders say it will be cheaper to build more machines now
that this one is built.
Thus DES is feeling its age. Yet, it would be a mistake to take away from this discussion the
impression that DES is a weak algorithm. Rather, what the above says is that it is an impressively
strong algorithm. After all these years, the best practical attack known is still exhaustive key
search. That says a lot for its design and its designers.
Later we will see that we would like security properties from a blockcipher that go beyond
resistance to key-recovery attacks. It turns out that from that point of view, a limitation of DES
is its block size. Birthday attacks “break” DES with about q = 2^32 input-output examples. (The
meaning of “break” here is very different from above.) Here 2^32 is the square root of 2^64, meaning
that to resist these attacks we must have a bigger block size. The next generation of ciphers, such as
AES, took this into account.
3.4.1 Double-DES
Let K_1, K_2 be 56-bit DES keys and let M be a 64-bit plaintext. Let
2DES(K_1 ∥ K_2, M) = DES(K_2, DES(K_1, M)) .
This defines a blockcipher 2DES: {0, 1}^112 × {0, 1}^64 → {0, 1}^64 that we call Double-DES. It has a
112-bit key, viewed as consisting of two 56-bit DES keys. Note that it is reversible, as required to
be a blockcipher:
2DES^{−1}(K_1 ∥ K_2, C) = DES^{−1}(K_1, DES^{−1}(K_2, C))
for any 64-bit C.
The key length of 112 is large enough that there seems little danger of 2DES succumbing
to an exhaustive key search attack, even while exploiting the potential for parallelism and special-
purpose hardware. On the other hand, 2DES also seems secure against the best known cryptanalytic
techniques, namely differential and linear cryptanalysis, since the iteration effectively increases the
number of Feistel rounds. This would indicate that 2DES is a good way to obtain a DES-based
cipher more secure than DES itself.
However, although 2DES has a key-length of 112, it turns out that it can be broken using
about 2^57 DES and DES^{−1} computations by what is called a meet-in-the-middle attack, as we now
illustrate. Let K_1 ∥ K_2 denote the target key and let C_1 = 2DES(K_1 ∥ K_2, M_1). The attacker, given
M_1, C_1, is attempting to find K_1 ∥ K_2. We observe that
C_1 = DES(K_2, DES(K_1, M_1)) ⇒ DES^{−1}(K_2, C_1) = DES(K_1, M_1) .
This leads to the following attack. Below, for i = 1, . . . , 2^56 we let T_i denote the i-th 56-bit string
(in lexicographic order):
algorithm A^MinM_2DES(M_1, C_1)
   for i = 1, . . . , 2^56 do L[i] ← DES(T_i, M_1)
   for j = 1, . . . , 2^56 do R[j] ← DES^{−1}(T_j, C_1)
   S ← { (i, j) : L[i] = R[j] }
   Pick some (l, r) ∈ S and return T_l ∥ T_r
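The meet-in-the-middle idea can be tried out at toy scale. The Python sketch below doubles a hypothetical toy cipher (8-bit keys, 16-bit blocks; toy_enc and toy_dec are stand-ins invented for this illustration, not DES) and recovers all key pairs consistent with one input-output example using about 2 · 2^8 cipher computations instead of 2^16. It uses pow with a negative exponent, which assumes Python 3.8 or later.

```python
from collections import defaultdict

MASK = 0xFFFF
KEY_BITS = 8
MULT = 0x6487                       # odd, hence invertible modulo 2**16
MULT_INV = pow(MULT, -1, 1 << 16)   # modular inverse (Python 3.8+)


def toy_enc(key, m):
    x = m
    for r in range(3):
        x ^= (key * 0x0101 + r) & MASK
        x = (x * MULT) & MASK
        x = ((x << 5) | (x >> 11)) & MASK
    return x


def toy_dec(key, c):
    x = c
    for r in reversed(range(3)):
        x = ((x >> 5) | (x << 11)) & MASK
        x = (x * MULT_INV) & MASK
        x ^= (key * 0x0101 + r) & MASK
    return x


def meet_in_the_middle(enc, dec, key_bits, m1, c1):
    """Return the set S of all (i, j) with enc(i, m1) == dec(j, c1)."""
    forward = defaultdict(list)            # plays the role of the table L
    for i in range(2 ** key_bits):
        forward[enc(i, m1)].append(i)
    matches = []
    for j in range(2 ** key_bits):
        middle = dec(j, c1)                # plays the role of R[j]
        for i in forward.get(middle, ()):
            matches.append((i, j))
    return matches


k1, k2 = 0x3C, 0xE5
m1 = 0x0123
c1 = toy_enc(k2, toy_enc(k1, m1))
S = meet_in_the_middle(toy_enc, toy_dec, KEY_BITS, m1, c1)
print(len(S), (k1, k2) in S)
```

Storing the forward values in a dictionary is one design choice for finding the collisions L[i] = R[j]; sorting both tables would serve equally well. Any false positives in S can be weeded out by testing against a second input-output example.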
3.4.2 Triple-DES
The triple-DES ciphers use three iterations of DES or DES^{−1}. The three-key variant is defined by
3DES3(K_1 ∥ K_2 ∥ K_3, M) = DES(K_3, DES^{−1}(K_2, DES(K_1, M))) ,
so that 3DES3: {0, 1}^168 × {0, 1}^64 → {0, 1}^64. The two-key variant is defined by
3DES2(K_1 ∥ K_2, M) = DES(K_2, DES^{−1}(K_1, DES(K_2, M))) ,
so that 3DES2: {0, 1}^112 × {0, 1}^64 → {0, 1}^64. You should check that these functions are reversible
so that they do qualify as blockciphers. The term “triple” refers to there being three applications
of DES or DES^{−1}. The rationale for the middle application being DES^{−1} rather than DES is that
DES is easily recovered via
DES(K, M) = 3DES3(K ∥ K ∥ K, M)     (3.1)
DES(K, M) = 3DES2(K ∥ K, M) .       (3.2)
As with 2DES, the key length of these ciphers appears long enough to make exhaustive key
search prohibitive, even with the best possible engines, and, additionally, differential and linear
cryptanalysis are not particularly effective because iteration effectively increases the number of
Feistel rounds.
3DES3 is subject to a meet-in-the-middle attack that finds the 168-bit key using about 2^112
computations of DES or DES^{−1}, so that it has an effective key length of 112. There does not
appear to be a meet-in-the-middle attack on 3DES2 however, so that its key length of 112 is also
its effective key length.
The 3DES2 cipher is popular in practice and functions as a canonical and standard replacement
for DES. 2DES, although having the same effective key length as 3DES2 and offering what appears
to be the same or at least adequate security, is not popular in practice. It is not entirely apparent
why 3DES2 is preferred over 2DES, but the reason might be Equation (3.2).
3.4.3 DESX
Although 2DES, 3DES3 and 3DES2 appear to provide adequate security, they are slow. The first is
twice as slow as DES and the other two are three times as slow. It would be nice to have a DES-based
blockcipher that had a longer key than DES but was not significantly more costly. Interestingly,
there is a simple design that does just this. Let K be a 56-bit DES key, let K_1, K_2 be 64-bit strings,
and let M be a 64-bit plaintext. Let
DESX(K ∥ K_1 ∥ K_2, M) = K_2 ⊕ DES(K, K_1 ⊕ M) .
This defines a blockcipher DESX: {0, 1}^184 × {0, 1}^64 → {0, 1}^64. It has a 184-bit key, viewed as
consisting of a 56-bit DES key plus two auxiliary keys, each 64 bits long. Note that it is reversible,
as required to be a blockcipher:
DESX^{−1}(K ∥ K_1 ∥ K_2, C) = K_1 ⊕ DES^{−1}(K, K_2 ⊕ C) .
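The DESX construction wraps any blockcipher, so its reversibility can be checked with a small stand-in. In the Python sketch below, toy_enc and toy_dec form a hypothetical 16-bit toy cipher playing the role of DES (an assumption made purely so the example is self-contained), and x_enc / x_dec apply the pre- and post-whitening exactly as in the two formulas above. The call to pow with exponent −1 assumes Python 3.8 or later.

```python
MASK = 0xFFFF
MULT = 0x6487                       # odd, hence invertible modulo 2**16
MULT_INV = pow(MULT, -1, 1 << 16)   # modular inverse (Python 3.8+)


def toy_enc(key, m):
    """Hypothetical toy cipher standing in for DES (8-bit key, 16-bit block)."""
    x = m
    for r in range(3):
        x ^= (key * 0x0101 + r) & MASK
        x = (x * MULT) & MASK
        x = ((x << 5) | (x >> 11)) & MASK
    return x


def toy_dec(key, c):
    x = c
    for r in reversed(range(3)):
        x = ((x >> 5) | (x << 11)) & MASK
        x = (x * MULT_INV) & MASK
        x ^= (key * 0x0101 + r) & MASK
    return x


def x_enc(k, k1, k2, m):
    """DESX-style encryption: K2 xor E(K, K1 xor M)."""
    return k2 ^ toy_enc(k, k1 ^ m)


def x_dec(k, k1, k2, c):
    """DESX-style decryption: K1 xor E^{-1}(K, K2 xor C)."""
    return k1 ^ toy_dec(k, k2 ^ c)


k, k1, k2 = 0x42, 0x1357, 0x9BDF
round_trips = all(x_dec(k, k1, k2, x_enc(k, k1, k2, m)) == m
                  for m in range(0, 1 << 16, 257))
print(round_trips)
```

Note that the wrapper costs only two extra xors per block, which is the source of the speed advantage discussed in the text.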
The key length of 184 is certainly enough to preclude exhaustive key search attacks. DESX is no
more secure than DES against linear or differential cryptanalysis, but we already saw that these are
not really practical attacks.
There is a meet-in-the-middle attack on DESX. It finds a 184-bit DESX key using 2^120 DES and
DES^{−1} computations. So the effective key length of DESX seems to be 120, which is large enough
for security.
DESX is less secure than Double or Triple DES because the latter are more resistant than
DES to linear and differential cryptanalysis while DESX is only as good as DES itself in this regard.
However, this is good enough; we saw that in practice the weakness of DES was not these attacks
but rather the short key length leading to successful exhaustive search attacks. DESX fixes this,
and very cheaply. In summary, DESX is popular because it is much cheaper than Double or Triple
DES while providing adequate security.
function AES_K(M)
   (K_0, . . . , K_10) ← expand(K)
   s ← M ⊕ K_0
   for r = 1 to 10 do
      s ← S(s)
      s ← shift-rows(s)
      if r ≤ 9 then s ← mix-cols(s) fi
      s ← s ⊕ K_r
   end for
   return s
Figure 3.8: The function AES128. See the accompanying text and figures for definitions of the
maps expand, S, shift-rows, mix-cols.
Refer to Fig. 3.8. The value s is called the state. One initializes the state to M and the final
state is the ciphertext C one gets by enciphering M. What happens in each iteration of the for
loop is called a round. So AES consists of ten rounds. The rounds are identical except that each
uses a different subkey K_i and, also, round 10 omits the call to mix-cols.
To understand what goes on in S and mix-cols we will need to review a bit of algebra. Let
us pause to do that. We describe a way to do arithmetic on bytes. Identify each byte
a = a_7 a_6 a_5 a_4 a_3 a_2 a_1 a_0 with the formal polynomial
a_7 x^7 + a_6 x^6 + a_5 x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0.
We can add two bytes by taking their bitwise xor (which is the same as the mod-2 sum of the
corresponding polynomials). We can multiply two bytes to get a degree 14 (or less) polynomial,
and then take the remainder of this polynomial by the fixed irreducible polynomial
m(x) = x^8 + x^4 + x^3 + x + 1 .
This remainder polynomial is a polynomial of degree at most seven which, as before, can be regarded
as a byte. In this way, we can add and multiply any two bytes. The resulting algebraic structure
has all the properties necessary to be called a finite field. In particular, this is one representation
of the finite field known as GF(2^8), the Galois field on 2^8 = 256 points. As a finite field, you can
find the inverse of any nonzero field point (the zero-element is the zero byte) and you can distribute
multiplication over addition, for example.
There are some useful tricks when you want to multiply two bytes. Since m(x) is another name
for zero, x^8 = x^4 + x^3 + x + 1 = {1b}. (Here the curly brackets simply indicate a hexadecimal
number.) So it is easy to multiply a byte a by the byte x = {02}: namely, shift the 8-bit byte a
one position to the left, letting the first bit “fall off” (but remember it!) and shifting a zero into
the last bit position. We write this operation a ⋘ 1. If that first bit of a was a 0, we are done.
If the first bit was a 1, we need to add in (that is, xor in) x^8 = {1b}. In summary, for a a byte,
a · x = a · {02} is a ⋘ 1 if the first bit of a is 0, and it is (a ⋘ 1) ⊕ {1b} if the first bit of a is 1.
Knowing how to multiply by x = {02} lets you conveniently multiply by other quantities. For
example, to compute {a1} · {03} compute {a1} · ({02} ⊕ {01}) = {a1} · {02} ⊕ {a1} · {01} =
{42} ⊕ {1b} ⊕ {a1} = {f8}. Try some more examples on your own.
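The multiply-by-x trick extends to general byte multiplication by processing the second operand one bit at a time. The following Python sketch illustrates this; the function names xtime and gf_mul are conventional choices made here rather than names from the text.

```python
def xtime(a):
    """Multiply the byte a by x = {02} modulo m(x) = x^8 + x^4 + x^3 + x + 1."""
    shifted = (a << 1) & 0xFF          # shift left, dropping the top bit
    if a & 0x80:                       # the bit that "fell off" was a 1
        shifted ^= 0x1B                # add in x^8 = {1b}
    return shifted


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using repeated xtime and xor."""
    result = 0
    while b:
        if b & 1:
            result ^= a                # add the current multiple of a
        a = xtime(a)                   # move on to the next power of x
        b >>= 1
    return result


print(hex(xtime(0xA1)), hex(gf_mul(0xA1, 0x03)))
```

Running it on the example from the text gives {a1} · {02} = {59} (that is, {42} ⊕ {1b}) and {a1} · {03} = {f8}.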
As we said, each nonzero byte a has a multiplicative inverse, inv(a) = a^{−1}. The mapping we will
denote S : {0, 1}^8 → {0, 1}^8 is obtained from the map inv : a ↦ a^{−1}. First, patch this map to make
it total on {0, 1}^8 by setting inv({00}) = {00}. Then, to compute S(a), first replace a by inv(a),
number the bits of a by a = a_7 a_6 a_5 a_4 a_3 a_2 a_1 a_0, and return the value
a′ = a′_7 a′_6 a′_5 a′_4 a′_3 a′_2 a′_1 a′_0 where (listing bits from index 0 at the top to index 7 at the bottom)
\[
\begin{pmatrix} a'_0 \\ a'_1 \\ a'_2 \\ a'_3 \\ a'_4 \\ a'_5 \\ a'_6 \\ a'_7 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}
\cdot
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \\ a_6 \\ a_7 \end{pmatrix}
+
\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}
\]
All arithmetic is in GF(2), meaning that addition of bits is their xor and multiplication of bits is
the conjunction (and).
All together, the map S is given by Fig. 3.9, which lists the values of
S(0), S(1), . . . , S(255) .
In fact, one could forget how this table is produced, and just take it for granted. But the fact is
that it is made in the simple way we have said.
63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16
Figure 3.9: The AES S-box, which is a function S : {0, 1}^8 → {0, 1}^8 specified by the list above.
All values in hexadecimal. The meaning is: S(00) = 63, S(01) = 7c, . . ., S(ff) = 16.
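To see that the S-box table really does arise from inversion followed by a bitwise affine map, one can recompute a few entries. The Python sketch below assumes the bit ordering of the AES standard (FIPS 197), in which output bit i is the xor of input bits i, i+4, i+5, i+6, i+7 (indices mod 8) and bit i of the constant {63}; the function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = ((a << 1) & 0xFF) ^ (0x1B if a & 0x80 else 0)
        b >>= 1
    return result


def gf_inv(a):
    """Patched inverse: inv({00}) = {00}, otherwise a^254 = a^(-1)."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def affine(a):
    """Apply the GF(2) matrix and add the constant {63}, bit by bit."""
    out = 0
    for i in range(8):
        bit = (a >> i) & 1
        for offset in (4, 5, 6, 7):
            bit ^= (a >> ((i + offset) % 8)) & 1
        bit ^= (0x63 >> i) & 1
        out |= bit << i
    return out


def sbox(a):
    return affine(gf_inv(a))


print([format(sbox(v), "02x") for v in (0x00, 0x01, 0x53, 0xFF)])
```

Computing the inverse as a^254 is a simple (if slow) choice that relies on every nonzero element satisfying a^255 = 1; a table-driven implementation would of course just store the 256 values.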
Now that we have the function S, let us extend it (without bothering to change the name) to
a function with domain ({0, 1}^8)^+. Namely, given an m-byte string A = A[1] . . . A[m], set S(A) to
be S(A[1]) . . . S(A[m]). In other words, just apply S bytewise.
Now we’re ready to understand the first map, S(s). One takes the 16-byte state s and applies
the 8-bit lookup table to each of its bytes to get the modified state s.
Moving on, the shift-rows operation works like this. Imagine plastering the 16 bytes of s =
s_0 s_1 . . . s_15 going top-to-bottom, then left-to-right, to make a 4 × 4 table:
s_0  s_4  s_8   s_12
s_1  s_5  s_9   s_13
s_2  s_6  s_10  s_14
s_3  s_7  s_11  s_15
For the shift-rows step, left circularly shift the second row by one position; the third row by two
positions; and the fourth row by three positions. The first row is not shifted at all. Somewhat
less colorfully, the mapping is simply
shift-rows(s_0 s_1 s_2 · · · s_15) = s_0 s_5 s_10 s_15 s_4 s_9 s_14 s_3 s_8 s_13 s_2 s_7 s_12 s_1 s_6 s_11 .
function expand(K)
   K_0 ← K
   for i ← 1 to 10 do
      K_i[0] ← K_{i−1}[0] ⊕ S(K_{i−1}[3] ⋘ 8) ⊕ C_i
      K_i[1] ← K_{i−1}[1] ⊕ K_i[0]
      K_i[2] ← K_{i−1}[2] ⊕ K_i[1]
      K_i[3] ← K_{i−1}[3] ⊕ K_i[2]
   od
   return (K_0, . . . , K_10)
Figure 3.10: The AES128 key-expansion algorithm maps a 128-bit key K into eleven 128-bit sub-keys,
K_0, . . . , K_10. Constants (C_1, . . . , C_10) are ({01000000}, {02000000}, {04000000}, {08000000},
{10000000}, {20000000}, {40000000}, {80000000}, {1B000000}, {36000000}). All other notation
is described in the accompanying text.
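The shift-rows step described above is a fixed rearrangement of the 16 state bytes, which a few lines of Python make explicit. The sketch assumes the column-by-column layout used in the text (byte s_i sits in row i mod 4, column i div 4); the function name is an illustrative choice.

```python
def shift_rows(state):
    """Rotate row r of the 4x4 column-major state left by r positions."""
    assert len(state) == 16
    out = [None] * 16
    for col in range(4):
        for row in range(4):
            # The byte landing at (row, col) comes from column (col + row) mod 4.
            out[4 * col + row] = state[4 * ((col + row) % 4) + row]
    return out


# Label each byte by its index to read off the permutation.
permuted = shift_rows(list(range(16)))
print(permuted)
```

Applying the function to the index labels 0, . . . , 15 yields the order in which the original bytes appear in the shifted state.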
Using the same convention as before, the mix-cols step takes each of the four columns in the
4×4 table and applies the (same) transformation to it. Thus we define mix-cols(s) on 4-byte words,
and then extend this to a 16-byte quantity wordwise. The value of mix-cols(a_0 a_1 a_2 a_3) = a′_0 a′_1 a′_2 a′_3
is defined by:
\[
\begin{pmatrix} a'_0 \\ a'_1 \\ a'_2 \\ a'_3 \end{pmatrix}
=
\begin{pmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{pmatrix}
\cdot
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix}
\]
An equivalent way to explain this step is to say that we are multiplying a(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0
by the fixed polynomial c(x) = {03}x^3 + {01}x^2 + {01}x + {02} and taking the result modulo x^4 + 1.
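A single column of the mix-cols step can be computed directly from the matrix form. The Python sketch below assumes the circulant matrix of the AES standard, whose rows are (02 03 01 01), (01 02 03 01), (01 01 02 03), (03 01 01 02); the helper names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = ((a << 1) & 0xFF) ^ (0x1B if a & 0x80 else 0)
        b >>= 1
    return result


# Assumed mix-cols matrix (circulant, each row a right rotation of the first).
MIX_MATRIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]


def mix_column(column):
    """Multiply one 4-byte column by the mix-cols matrix over GF(2^8)."""
    assert len(column) == 4
    out = []
    for row in MIX_MATRIX:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)   # addition in GF(2^8) is xor
        out.append(acc)
    return out


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
print([format(v, "02x") for v in mixed])
```

Because every row of the matrix sums (by xor) to {01}, a column of four equal bytes is left unchanged, which makes a handy sanity check.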
At this point we have described everything but the key-expansion map, expand. That map is
given in Fig. 3.10.
We have now completed the definition of AES. One key property is that AES is a blockcipher:
the map is invertible. This follows because every round is invertible. That a round is invertible
follows from each of its steps being invertible, which is a consequence of S being a permutation and
the matrix used in mix-cols having an inverse.
In the case of DES, the rationale for the design was not made public. Some explanations for
different aspects of the design have become more apparent over time as we have watched the effects
on DES of new attack strategies, but fundamentally, the question of why the design is as it is has
not received a satisfying answer. In the case of AES there was significantly more documentation of
the rationale for design choices. (See the book The Design of Rijndael by the designers [9].)
Nonetheless, the security of blockciphers, including DES and AES, eventually comes down to
the statement that “we have been unable to find effective attacks, and we have tried attacks along
the following lines . . ..” If people with enough smarts and experience utter this statement, then
it suggests that the blockcipher is good. Beyond this, it’s hard to say much. Yet, by now, our
community has become reasonably experienced designing these things. It wouldn’t even be that
hard a game were it not for the fact we tend to be aggressive in optimizing the blockcipher’s
speed. (Some may come to the opposite opinion, that it’s a very hard game, seeing just how many
reasonable-looking blockciphers have been broken.) Later we give some vague sense of the sort of
cleverness that people muster against blockciphers.
Such a long list of necessary but not sufficient properties is no way to treat security. What
we need is a single “master” property of a blockcipher which, if met, guarantees security of lots of
natural usages of the cipher.
Such a property is that the blockcipher be a pseudorandom permutation (PRP), a notion
explored in another chapter.
3.7 Problems
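Problem 9 Show that for all K ∈ {0, 1}^56 and all x ∈ {0, 1}^64
\[
\mathrm{DES}_{\overline{K}}(\overline{x}) = \overline{\mathrm{DES}_K(x)} ,
\]
where the overline denotes bitwise complement. This is called the key-complementation property of DES.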
Problem 10 Explain how to use the key-complementation property of DES to speed up exhaustive
key search by about a factor of two. Explain any assumptions that you make.
Problem 12 As with AES, suppose we are working in the finite field with 2^8 elements, representing
field points using the irreducible polynomial m(x) = x^8 + x^4 + x^3 + x + 1. Compute the byte that
is the result of multiplying bytes:
{e1} · {05}
Problem 13 For AES, we have given two different descriptions of mix-cols: one using matrix
multiplication (in GF(2^8)) and one based on multiplying by a fixed polynomial c(x) modulo a
second fixed polynomial, d(x) = x^4 + 1. Show that these two methods are equivalent.
Problem 14 Verify that the matrix used for mix-cols has as its inverse the matrix
0e 0b 0d 09
09 0e 0b 0d
0d 09 0e 0b
0b 0d 09 0e
Explain why it is that all of the entries in this matrix begin with a zero.
Problem 15 How many different permutations are there from 128 bits to 128 bits? How many
different functions are there from 128 bits to 128 bits?
Problem 16 Upper and lower bound, as best you can, the probability that a random function
from 128 bits to 128 bits is actually a permutation.
Problem 18 Justify and then refute the following proposition: enciphering under AES can be
implemented faster than deciphering.
Problem 19 Choose a random DES key K ∈ {0, 1}^56. Let (M, C), where C = DES_K(M), be a
single plaintext/ciphertext pair that an adversary knows. Suppose the adversary does an exhaustive
key search to locate the lexicographically first key T such that C = DES_T(M). Estimate the
probability that T = K. Discuss any assumptions you must make to answer this question.
Bibliography
[2] E. Biham and A. Shamir. Differential cryptanalysis of the Full 16-round DES. Advances
in Cryptology – CRYPTO ’92, Lecture Notes in Computer Science Vol. 740, E. Brickell ed.,
Springer-Verlag, 1992.
[4] M. Matsui. Linear cryptanalysis method for DES cipher. Advances in Cryptology – EURO-
CRYPT ’93, Lecture Notes in Computer Science Vol. 765, T. Helleseth ed., Springer-Verlag,
1993.
[6] L. Knudsen and J. E. Mathiassen. A Chosen-Plaintext Linear Attack on DES. Fast Soft-
ware Encryption ’00, Lecture Notes in Computer Science Vol. 1978, B. Schneier ed., Springer-
Verlag, 2000.
[7] M. Wiener. Efficient DES Key Search. In Practical Cryptography for Data Internetworks, W.
Stallings editor, IEEE Computer Society Press, 1996, pp. 31–79.
http://www3.sympatico.ca/wienerfamily/Michael/MichaelPapers/dessearch.pdf.
Chapter 4
Pseudorandom Functions
Pseudorandom functions (PRFs) and their cousins, pseudorandom permutations (PRPs), figure as
central tools in the design of protocols, especially those for shared-key cryptography. At one level,
PRFs and PRPs can be used to model blockciphers, and they thereby enable the security analysis
of protocols based on blockciphers. But PRFs and PRPs are also a useful conceptual starting point
in contexts where blockciphers don’t quite fit the bill because of their fixed block-length. So in this
chapter we will introduce PRFs and PRPs and investigate their basic properties.
Here the key length is k = 56 and the input length and output length are ℓ = L = 64. Similarly
AES (when “AES” refers to “AES128”) is a family of permutations AES: K × D → R with
K = {0, 1}^128 and D = {0, 1}^128 and R = {0, 1}^128 .
Here the key length is k = 128 and the input length and output length are ℓ = L = 128.
4.2 Games
We will use code-based games [4] in definitions and some proofs. We recall some background here.
A game (see Fig. 4.1 for an example) has an Initialize procedure, procedures to respond to
adversary oracle queries, and a Finalize procedure. A game G is executed with an adversary
A as follows. First, Initialize executes and its outputs are the inputs to A. Then, A executes,
its oracle queries being answered by the corresponding procedures of G. When A terminates, its
output becomes the input to the Finalize procedure. The output of the latter, denoted G^A, is
called the output of the game, and we let “G^A ⇒ y” denote the event that this game output takes
value y. Variables not explicitly initialized or assigned are assumed to have value ⊥, except for
booleans which are assumed initialized to false. Games G_i, G_j are identical until bad if their code
differs only in statements that follow the setting of the boolean flag bad to true. The following is
the Fundamental Lemma of game-playing:
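Lemma 4.2.1 [4] Let $G_i$, $G_j$ be identical until bad games, and A an adversary. Let $\mathrm{BAD}_i$ (resp.
$\mathrm{BAD}_j$) denote the event that the execution of $G_i$ (resp. $G_j$) with A sets bad. Then
\[ \Pr\left[G_i^A \wedge \overline{\mathrm{BAD}_i}\right] = \Pr\left[G_j^A \wedge \overline{\mathrm{BAD}_j}\right] \quad\text{and}\quad \Pr\left[G_i^A\right] - \Pr\left[G_j^A\right] \le \Pr\left[\mathrm{BAD}_j\right] . \]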
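The execution model described before the lemma (Initialize, then the adversary with oracle access, then Finalize) can be sketched in Python. This is only an illustration: the class names, the `run` helper and the toy game `GuessByte` are hypothetical and do not appear in the notes.

```python
import secrets

class Game:
    """Base class: subclasses override initialize, fn and finalize."""
    def initialize(self):
        return None                      # value handed to the adversary as input
    def fn(self, x):
        raise NotImplementedError
    def finalize(self, adversary_output):
        return adversary_output

def run(game, adversary):
    """Execute `game` with `adversary` and return the game output G^A."""
    adv_input = game.initialize()
    adv_output = adversary(adv_input, game.fn)   # oracle queries go to game.fn
    return game.finalize(adv_output)

class GuessByte(Game):
    """Hypothetical toy game: the adversary wins if it outputs the hidden byte."""
    def initialize(self):
        self.secret = secrets.randbelow(256)
        self.queries = 0
    def fn(self, x):
        self.queries += 1
        return x == self.secret          # oracle: equality test against the secret
    def finalize(self, guess):
        return guess == self.secret

def exhaustive_adversary(_input, fn):
    """Try every candidate until the oracle confirms one."""
    for candidate in range(256):
        if fn(candidate):
            return candidate
    return None
```

Calling `run(GuessByte(), exhaustive_adversary)` plays the game once; the returned value is the output of Finalize, which is what the notation $G^A \Rightarrow y$ refers to.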
Example 4.3.1 Let’s do some simple probabilistic computations to understand random functions.
In all of the following, we refer to $\mathrm{Rand}_R$ where $R = \{0,1\}^L$.
1. Fix $X \in \{0,1\}^\ell$ and $Y \in \{0,1\}^L$. Let A be
Adversary A
Z ← Fn(X)
Return (Y = Z)
Then:
\[ \Pr\left[\mathrm{Rand}_R^A \Rightarrow \mathsf{true}\right] = 2^{-L} . \]
Notice that the probability doesn’t depend on ℓ. Nor does it depend on the values of X, Y.
2. Fix $X_1, X_2 \in \{0,1\}^\ell$ and $Y \in \{0,1\}^L$. Let A be
Adversary A
Z1 ← Fn(X1 )
Z2 ← Fn(X2 )
Return (Y = Z1 ∧ Y = Z2 )
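Then:
\[ \Pr\left[\mathrm{Rand}_R^A \Rightarrow \mathsf{true}\right] = \begin{cases} 2^{-2L} & \text{if } X_1 \ne X_2 \\ 2^{-L} & \text{if } X_1 = X_2 \end{cases} \]
3. Fix $X_1, X_2 \in \{0,1\}^\ell$ and $Y \in \{0,1\}^L$. Let A be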
Adversary A
Z1 ← Fn(X1 )
Z2 ← Fn(X2 )
Return (Y = Z1 ⊕ Z2 )
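Then:
\[ \Pr\left[\mathrm{Rand}_R^A \Rightarrow \mathsf{true}\right] = \begin{cases} 2^{-L} & \text{if } X_1 \ne X_2 \\ 0 & \text{if } X_1 = X_2 \text{ and } Y \ne 0^L \\ 1 & \text{if } X_1 = X_2 \text{ and } Y = 0^L \end{cases} \]
4. Suppose $l \le L$ and let $\tau : \{0,1\}^L \to \{0,1\}^l$ denote the function that on input $Y \in \{0,1\}^L$
returns the first l bits of Y. Fix $X_1 \in \{0,1\}^\ell$ and $Y_1 \in \{0,1\}^l$. Let A be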
Adversary A
Z1 ← Fn(X1 )
Return (τ (Z1 ) = Y1 )
Then:
\[ \Pr\left[\mathrm{Rand}_R^A \Rightarrow \mathsf{true}\right] = 2^{-l} . \]
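The four probabilities of Example 4.3.1 can be checked exactly for tiny parameters by enumerating every function from ℓ bits to L bits. The sketch below assumes ℓ = L = 2 (so there are only 256 functions) and encodes bit strings as integers; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ELL, L = 2, 2                        # toy input and output lengths (assumed values)
DOMAIN_SIZE = 2 ** ELL
RANGE_SIZE = 2 ** L

def probability(event):
    """Exact Pr[event(Fn)] when Fn is uniform over all functions {0,1}^ELL -> {0,1}^L."""
    tables = list(product(range(RANGE_SIZE), repeat=DOMAIN_SIZE))
    hits = sum(1 for table in tables if event(table))
    return Fraction(hits, len(tables))

def item1(x, y):
    return probability(lambda fn: fn[x] == y)

def item2(x1, x2, y):
    return probability(lambda fn: fn[x1] == y and fn[x2] == y)

def item3(x1, x2, y):
    return probability(lambda fn: (fn[x1] ^ fn[x2]) == y)

def item4(x1, y1, l):
    # tau keeps the first (most significant) l bits of an L-bit value
    return probability(lambda fn: (fn[x1] >> (L - l)) == y1)
```

For instance, `item2(0, 1, 2)` evaluates to 1/16, matching $2^{-2L}$, while `item2(1, 1, 2)` evaluates to 1/4, matching $2^{-L}$.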
Example 4.3.2 In all of the following we refer to game $\mathrm{Perm}_D$ where $D = \{0,1\}^\ell$.
1. Fix $X, Y \in \{0,1\}^\ell$. Let A be
Adversary A
Z ← Fn(X)
Return (Y = Z)
Then
\[ \Pr\left[\mathrm{Perm}_D^A \Rightarrow \mathsf{true}\right] = 2^{-\ell} . \]
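\begin{align*}
\Pr\left[\mathrm{Fn}(X_1) \oplus \mathrm{Fn}(X_2) = Y\right]
&= \sum_{Y_1} \Pr\left[\mathrm{Fn}(X_1) = Y_1 \wedge \mathrm{Fn}(X_2) = Y_1 \oplus Y\right] \\
&= \sum_{Y_1} \frac{1}{2^\ell - 1} \cdot \frac{1}{2^\ell} \\
&= 2^\ell \cdot \frac{1}{2^\ell - 1} \cdot \frac{1}{2^\ell} \\
&= \frac{1}{2^\ell - 1} .
\end{align*}
Above, the sum is over all $Y_1 \in \{0,1\}^\ell$. In obtaining the second equality, we used item 2 above
and the assumption that $Y \ne 0^\ell$.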
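The value $1/(2^\ell - 1)$ derived above can be confirmed by brute force for a small block length. The sketch below assumes ℓ = 3 and enumerates all 8! permutations; it is a sanity check, not part of the argument.

```python
from fractions import Fraction
from itertools import permutations

ELL = 3                                   # toy block length (assumed value)
N = 2 ** ELL

def xor_probability(x1, x2, y):
    """Exact Pr[Fn(x1) XOR Fn(x2) = y] for a uniformly random permutation Fn."""
    total = hits = 0
    for perm in permutations(range(N)):   # perm[x] plays the role of Fn(x)
        total += 1
        hits += (perm[x1] ^ perm[x2]) == y
    return Fraction(hits, total)
```

With distinct inputs and a nonzero target, `xor_probability(0, 1, 5)` gives 1/7, which is $1/(2^3 - 1)$. With distinct inputs and target 0 the probability is 0, since a permutation never maps two distinct points to the same value.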
Definition 4.4.1 Let $F : K \times D \to R$ be a family of functions, and let A be an algorithm that takes
an oracle and returns a bit. We consider two games as described in Fig. 4.1. The prf-advantage of
A is defined as
\[ \mathbf{Adv}^{\mathrm{prf}}_F(A) = \Pr\left[\mathrm{Real}_F^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] . \]
It should be noted that the family F is public. The adversary A, and anyone else, knows the
description of the family and is capable, given values K, X, of computing F (K, X).
Game $\mathrm{Real}_F$ picks a random instance $F_K$ of family F and then runs adversary A with oracle
Fn = $F_K$. Adversary A interacts with its oracle, querying it and getting back answers, and
eventually outputs a “guess” bit. The game returns the same bit. Game $\mathrm{Rand}_R$ implements Fn as
a random function with range R. Again, adversary A interacts with the oracle, eventually returning
a bit that is the output of the game. Each game has a certain probability of returning 1. The
probability is taken over the random choices made in the game. Thus, for the first game, the
probability is over the choice of K and any random choices that A might make, for A is allowed to
be a randomized algorithm. In the second game, the probability is over the random choice made
by the game in implementing Fn and any random choices that A makes. These two probabilities
should be evaluated separately; the two games are completely distinct.
To see how well A does at determining which world it is in, we look at the difference in the
probabilities that the two games return 1. If A is doing a good job at telling which world it is in,
it would return 1 more often in the first game than in the second. So the difference is a measure of
how well A is doing. We call this measure the prf-advantage of A. Think of it as the probability
that A “breaks” the scheme F , with “break” interpreted in a specific, technical way based on the
definition.
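To make the definition concrete, the two probabilities can be computed exactly when everything is tiny. The sketch below assumes a hypothetical toy family $F_K(x) = K \oplus x$ on 2-bit strings and a fixed deterministic adversary; neither comes from the notes. It averages the adversary's output over all keys (game Real) and over all functions (game Rand).

```python
from fractions import Fraction
from itertools import product

ELL = L = 2                                   # toy lengths (assumed values)
KEYS = range(4)                               # toy key space {0,1}^2

def F(key, x):
    """Hypothetical toy family, chosen only to have something to attack."""
    return key ^ x

def adversary(fn):
    """Outputs 1 iff Fn(0) XOR Fn(1) = 1, which always holds for F_K."""
    return int((fn(0) ^ fn(1)) == 1)

def pr_real():
    """Pr[Real_F^A => 1]: average over the uniformly chosen key."""
    outputs = [adversary(lambda x, k=k: F(k, x)) for k in KEYS]
    return Fraction(sum(outputs), len(outputs))

def pr_rand():
    """Pr[Rand_R^A => 1]: average over all functions {0,1}^ELL -> {0,1}^L."""
    tables = list(product(range(2 ** L), repeat=2 ** ELL))
    outputs = [adversary(lambda x, t=t: t[x]) for t in tables]
    return Fraction(sum(outputs), len(outputs))

def prf_advantage():
    return pr_real() - pr_rand()
```

Here `pr_real()` is 1 and `pr_rand()` is 1/4, so this adversary has prf-advantage 3/4 against the toy family using two queries.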
Different adversaries will have different advantages. There are two reasons why one adversary
may achieve a greater advantage than another. One is that it is more “clever” in the questions it
asks and the way it processes the replies to determine its output. The other is simply that it asks
more questions, or spends more time processing the replies. Indeed, we expect that as an adversary
sees more and more input-output examples of Fn, or spends more computing time, its ability to
tell which world it is in should go up.
The “security” of family F as a pseudorandom function must thus be thought of as depending
on the resources allowed to the attacker. We may want to know, for any given resource limitations,
what is the prf-advantage achieved by the most “clever” adversary amongst all those who are
restricted to the given resource limits.
The choice of resources to consider can vary. One resource of interest is the time-complexity t
of A. Another resource of interest is the number of queries q that A asks of its oracle. Another
resource of interest is the total length µ of all of A’s queries. When we state results, we will pay
attention to such resources, showing how they influence maximal adversarial advantage.
Let us explain more about the resources we have mentioned, giving some important conventions
underlying their measurement. The first resource is the time-complexity of A. To make sense of this
we first need to fix a model of computation. We fix some RAM model, as discussed in Chapter 1.
Think of the model used in your algorithms courses, often implicitly, so that you could measure the
running time. However, we adopt the convention that the time-complexity of A refers not just to
the running time of A, but to the maximum of the running times of the two games in the definition,
plus the size of the code of A. In measuring the running time of the first game, we must count the
time to choose the key K at random, and the time to compute the value $F_K(x)$ for any query x
made by A to its oracle. In measuring the running time of the second game, we count the execution
time of Fn over the calls made to it by A.
The number of queries made by A captures the number of input-output examples it sees. In
general, not all strings in the domain must have the same length, and hence we also measure the
sum of the lengths of all queries made.
The strength of this definition lies in the fact that it does not specify anything about the kinds
of strategies that can be used by an adversary; it only limits its resources. An adversary can use
whatever means desired to distinguish the function as long as it stays within the specified resource
bounds.
What do we mean by a “secure” PRF? Definition 4.4.1 does not have any explicit condition or
statement regarding when F should be considered “secure.” It only associates to any adversary
A attacking F a prf-advantage function. Intuitively, F is “secure” if the value of the advantage
function is “low” for all adversaries whose resources are “practical.”
This is, of course, not formal. However, we wish to keep it this way because it better reflects
reality. In real life, security is not some absolute or boolean attribute; security is a function of the
resources invested by an attacker. All modern cryptographic systems are breakable in principle; it
is just a question of how long it takes.
This is our first example of a cryptographic definition, and it is worth spending time to study
and understand it. We will encounter many more as we go along. Towards this end let us summarize
the main features of the definitional framework as we will see them arise later. First, there are
games, involving an adversary. Then, there is some advantage function associated to an adversary
which returns the probability that the adversary in question “breaks” the scheme. These two
components will be present in all definitions. What varies is the games; this is where we pin down
how we measure security.
The intuition is similar to that for Definition 4.4.1. The difference is that here the “ideal” object
that F is being compared with is no longer a random function, but rather a random permutation.
In game $\mathrm{Real}_F$, the probability is over the random choice of key K and also over the coin tosses
of A if the latter happens to be randomized. The game returns the same bit that A returns. In game
$\mathrm{Perm}_D$, a permutation Fn: D → D is chosen at random, and the result bit of A’s computation with
oracle Fn is returned. The probability is over the choice of Fn and the coins of A if any. As before,
the measure of how well A did at telling the two worlds apart, which we call the prp-cpa-advantage
of A, is the difference between the probabilities that the games return 1.
Conventions regarding resource measures also remain the same as before. Informally, a family
F is a secure PRP under CPA if $\mathbf{Adv}^{\mathrm{prp\text{-}cpa}}_F(A)$ is “small” for all adversaries using a “practical”
amount of resources.
The intuition is similar to that for Definition 4.4.1. The difference is that here the adversary
has more power: not only can it query Fn, but it can directly query $\mathrm{Fn}^{-1}$. Conventions regarding
resource measures also remain the same as before. However, we will be interested in some additional
resource parameters. Specifically, since there are now two oracles, we can count separately the
number of queries, and total length of these queries, for each. As usual, informally, a family F is a
secure PRP under CCA if $\mathbf{Adv}^{\mathrm{prp\text{-}cca}}_F(A)$ is “small” for all adversaries using a “practical” amount
of resources.
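One way to picture the "ideal" object in the CCA setting is a random permutation that answers both Fn and $\mathrm{Fn}^{-1}$ queries and is sampled lazily, one point at a time. The following is a minimal sketch under that assumption; the class name and the integer encoding of the domain are hypothetical.

```python
import secrets

class LazyRandomPermutation:
    """Random permutation on {0, ..., n-1}, defined point by point as queries arrive."""
    def __init__(self, n):
        self.n = n
        self.forward = {}     # x -> Fn(x)
        self.backward = {}    # y -> Fn^{-1}(y)

    def _fresh(self, used):
        # Rejection sampling: uniform over the values not yet in `used`.
        while True:
            v = secrets.randbelow(self.n)
            if v not in used:
                return v

    def fn(self, x):
        if x not in self.forward:
            y = self._fresh(self.backward)
            self.forward[x], self.backward[y] = y, x
        return self.forward[x]

    def fn_inv(self, y):
        if y not in self.backward:
            x = self._fresh(self.forward)
            self.forward[x], self.backward[y] = y, x
        return self.backward[y]
```

Keeping both dictionaries in step is what makes answers to the two oracles consistent with a single permutation, whichever oracle first touches a point.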
Though the technical result is easy, it is worth stepping back to explain its interpretation. The
proposition says that if you have an adversary A that breaks F in the PRP-CPA sense, then you have
some other adversary B that breaks F in the PRP-CCA sense. Furthermore, the adversary B will
be just as efficient as the adversary A was. As a consequence, if you think there is no reasonable
adversary B that breaks F in the PRP-CCA sense, then you have no choice but to believe that
there is no reasonable adversary A that breaks F in the PRP-CPA sense. The nonexistence of a
reasonable adversary B that breaks F in the PRP-CCA sense means that F is PRP-CCA secure,
while the nonexistence of a reasonable adversary A that breaks F in the PRP-CPA sense means
that F is PRP-CPA secure. So PRP-CCA security implies PRP-CPA security, and a proposition
like the one above is how, precisely, one makes such a claim.
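The adversary B referred to here can be pictured as a thin wrapper around A: it is handed both oracles, passes Fn through to A, and never calls the inverse oracle. The sketch below is one plausible rendering of that idea; the function names and the toy CPA adversary are hypothetical.

```python
def make_cca_adversary(cpa_adversary):
    """Wrap a CPA adversary A as a CCA adversary B that returns the same bit."""
    def cca_adversary(fn, fn_inv):
        # B forwards A's queries to Fn and simply ignores Fn^{-1}.
        return cpa_adversary(fn)
    return cca_adversary

def toy_cpa_adversary(fn):
    """Hypothetical CPA adversary: outputs 1 iff the oracle fixes the point 0."""
    return int(fn(0) == 0)

B = make_cca_adversary(toy_cpa_adversary)
```

Since B makes exactly A's queries and returns A's bit, its running time and query count match those of A, which is the sense in which B is "just as efficient".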
\[ \mathbf{Adv}^{\mathrm{prp\text{-}cpa}}_{\mathrm{DES}}(A_{t,q}) \le c_1 \cdot \frac{t/T_{\mathrm{DES}}}{2^{55}} + c_2 \cdot \frac{q}{2^{40}} \]
for any adversary $A_{t,q}$ that runs in time at most t and asks at most q 64-bit oracle queries. Here
$T_{\mathrm{DES}}$ is the time to do one DES computation on our fixed RAM model of computation, and $c_1, c_2$
are some constants depending only on this model. In other words, we are conjecturing that the best
attacks are either exhaustive key search or linear cryptanalysis. We might be bolder with regard
to AES and conjecture something like
\[ \mathbf{Adv}^{\mathrm{prp\text{-}cpa}}_{\mathrm{AES}}(B_{t,q}) \le c_1 \cdot \frac{t/T_{\mathrm{AES}}}{2^{128}} + c_2 \cdot \frac{q}{2^{128}} \]
for any adversary $B_{t,q}$ that runs in time at most t and asks at most q 128-bit oracle queries. We
could also make similar conjectures regarding the strength of blockciphers as PRPs under CCA
rather than CPA.
More interesting is the PRF security of blockciphers. Here we cannot do better than assume
that
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_{\mathrm{DES}}(A_{t,q}) &\le c_1 \cdot \frac{t/T_{\mathrm{DES}}}{2^{55}} + \frac{q^2}{2^{64}} \\
\mathbf{Adv}^{\mathrm{prf}}_{\mathrm{AES}}(B_{t,q}) &\le c_1 \cdot \frac{t/T_{\mathrm{AES}}}{2^{128}} + \frac{q^2}{2^{128}}
\end{align*}
for any adversaries $A_{t,q}$, $B_{t,q}$ running in time at most t and making at most q oracle queries. This is
due to the birthday attack discussed later. The second term in each formula arises simply because
the object under consideration is a family of permutations.
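To get a feel for the second term in these conjectured bounds, one can plug in a few query counts. The values of q below are chosen only for illustration:
\begin{align*}
\text{DES:}\quad & q = 2^{20} \;\Rightarrow\; \frac{q^2}{2^{64}} = 2^{-24}, & q = 2^{32} \;\Rightarrow\; \frac{q^2}{2^{64}} = 1, \\
\text{AES:}\quad & q = 2^{32} \;\Rightarrow\; \frac{q^2}{2^{128}} = 2^{-64}, & q = 2^{64} \;\Rightarrow\; \frac{q^2}{2^{128}} = 1 .
\end{align*}
So the birthday term alone makes the DES bound vacuous at about $2^{32}$ queries and the AES bound vacuous at about $2^{64}$ queries, regardless of the running time t.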
We stress that these are all conjectures. There could exist highly effective attacks that break
DES or AES as a PRF without recovering the key. So far, we do not know of any such attacks, but
the amount of cryptanalytic effort that has focused on this goal is small. Certainly, to assume that
a blockcipher is a PRF is a much stronger assumption than that it is secure against key recovery.
Nonetheless, the motivation and arguments we have outlined in favor of the PRF assumption stay,
and our view is that if a blockcipher is broken as a PRF then it should be considered insecure, and
a replacement should be sought.
Example 4.7.1 We define a family of functions $F : \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^L$ as follows. We let
$k = L\ell$ and view a k-bit key K as specifying an L row by ℓ column matrix of bits. (To be concrete,
assume the first L bits of K specify the first column of the matrix, the next L bits of K specify
the second column of the matrix, and so on.) The input string X = X[1] . . . X[ℓ] is viewed as a
sequence of bits, and the value of F(K, X) is the corresponding matrix vector product. That is
\[
F_K(X) =
\begin{bmatrix}
K[1,1] & K[1,2] & \cdots & K[1,\ell] \\
K[2,1] & K[2,2] & \cdots & K[2,\ell] \\
\vdots & \vdots & & \vdots \\
K[L,1] & K[L,2] & \cdots & K[L,\ell]
\end{bmatrix}
\cdot
\begin{bmatrix} X[1] \\ X[2] \\ \vdots \\ X[\ell] \end{bmatrix}
=
\begin{bmatrix} Y[1] \\ Y[2] \\ \vdots \\ Y[L] \end{bmatrix}
\]
where
\begin{align*}
Y[1] &= K[1,1] \cdot X[1] \oplus K[1,2] \cdot X[2] \oplus \cdots \oplus K[1,\ell] \cdot X[\ell] \\
Y[2] &= K[2,1] \cdot X[1] \oplus K[2,2] \cdot X[2] \oplus \cdots \oplus K[2,\ell] \cdot X[\ell] \\
&\;\;\vdots \\
Y[L] &= K[L,1] \cdot X[1] \oplus K[L,2] \cdot X[2] \oplus \cdots \oplus K[L,\ell] \cdot X[\ell] .
\end{align*}
Here the bits in the matrix are the bits in the key, and arithmetic is modulo two. The question
we ask is whether F is a “secure” PRF. We claim that the answer is no. The reason is that one
can design an adversary algorithm A that achieves a high advantage (close to 1) in distinguishing
between the two worlds.
We observe that for any key K we have $F_K(0^\ell) = 0^L$. This is a weakness since a random
function of ℓ-bits to L-bits is very unlikely to return $0^L$ on input $0^\ell$, and thus this fact can be the
basis of a distinguishing adversary. Let us now show how the adversary works. Remember that
as per our model it is given an oracle Fn for $\mathrm{Fn}: \{0,1\}^\ell \to \{0,1\}^L$ and will output a bit. Our
adversary A works as follows:
Adversary A
Y ← Fn(0^ℓ)
if Y = 0^L then return 1 else return 0
This adversary queries its oracle at the point $0^\ell$, and denotes by Y the L-bit string that is returned.
If $Y = 0^L$ it bets that Fn was an instance of the family F, and if $Y \ne 0^L$ it bets that Fn was a
random function. Let us now see how well this adversary does. Let $R = \{0,1\}^L$. We claim that
\begin{align*}
\Pr\left[\mathrm{Real}_F^A \Rightarrow 1\right] &= 1 \\
\Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] &= 2^{-L} .
\end{align*}
Why? Look at Game $\mathrm{Real}_F$ as defined in Definition 4.4.1. Here Fn = $F_K$ for some K. In that case
it is certainly true that $\mathrm{Fn}(0^\ell) = 0^L$ so by the code we wrote for A the latter will return 1. On
the other hand look at Game $\mathrm{Rand}_R$ as defined in Definition 4.4.1. Here Fn is a random function.
As we saw in Example 4.3.1, the probability that $\mathrm{Fn}(0^\ell) = 0^L$ will be $2^{-L}$, and hence this is the
probability that A will return 1. Now as per Definition 4.4.1 we subtract to get
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_F(A) &= \Pr\left[\mathrm{Real}_F^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] \\
&= 1 - 2^{-L} .
\end{align*}
Now let t be the time complexity of A. This is O(ℓ + L) plus the time for one computation of F,
coming to O(ℓL). The number of queries made by A is just one, and the total length of all queries
is ℓ. Our conclusion is that there exists an extremely efficient adversary whose prf-advantage is
very high (almost one). Thus, F is not a secure PRF.
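The attack of Example 4.7.1 is easy to run. The sketch below represents bit strings as Python lists of 0/1 and uses ℓ = L = 8 as assumed toy sizes; the function names and the lazily sampled random function are illustrative choices, not taken from the notes.

```python
import secrets

def F(key_bits, x_bits, L):
    """Matrix-vector product over GF(2); key_bits lists the matrix column by column."""
    ell = len(x_bits)
    assert len(key_bits) == L * ell
    y = [0] * L
    for j in range(ell):                  # column j is key_bits[j*L : (j+1)*L]
        if x_bits[j]:
            column = key_bits[j * L:(j + 1) * L]
            y = [a ^ b for a, b in zip(y, column)]
    return y

def adversary(fn, ell, L):
    """Query the all-zero input; output 1 iff the answer is the all-zero string."""
    return int(fn([0] * ell) == [0] * L)

def real_world_output(ell=8, L=8):
    key = [secrets.randbelow(2) for _ in range(L * ell)]
    return adversary(lambda x: F(key, x, L), ell, L)

def random_world_output(ell=8, L=8):
    table = {}                            # lazily sampled random function
    def fn(x):
        return table.setdefault(tuple(x), [secrets.randbelow(2) for _ in range(L)])
    return adversary(fn, ell, L)
```

With these sizes `real_world_output()` always returns 1, while `random_world_output()` returns 1 only with probability $2^{-8}$, in line with the advantage $1 - 2^{-L}$ computed above.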
Example 4.7.2 Suppose we are given a secure PRF $F: \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^L$. We want to
use F to design a PRF $G: \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^{2L}$. The input length of G is the same as that of
F but the output length of G is twice that of F. We suggest the following candidate construction:
for every k-bit key K and every ℓ-bit input x
\[ G_K(x) = F_K(x) \,\|\, F_K(\overline{x}) . \]
Here “$\|$” denotes concatenation of strings, and $\overline{x}$ denotes the bitwise complement of the string x.
We ask whether this is a “good” construction. “Good” means that under the assumption that F
is a secure PRF, G should be too. However, this is not true. Regardless of the quality of F, the
construct G is insecure. Let us demonstrate this.
We want to specify an adversary attacking G. Since an instance of G maps ℓ bits to 2L bits,
the adversary A will get an oracle for a function Fn that maps ℓ bits to 2L bits. In the random
world, Fn will be chosen as a random function of ℓ bits to 2L bits, while in the real world, Fn will
be set to $G_K$ where K is a random k-bit key. The adversary must determine in which world it is
placed. Our adversary works as follows:
Adversary A
y1 ← Fn(1^ℓ)
y2 ← Fn(0^ℓ)
Parse y1 as y1 = y1,1 ∥ y1,2 with |y1,1| = |y1,2| = L
Parse y2 as y2 = y2,1 ∥ y2,2 with |y2,1| = |y2,2| = L
if y1,1 = y2,2 then return 1 else return 0
This adversary queries its oracle at the point $1^\ell$ to get back $y_1$ and then queries its oracle at the
point $0^\ell$ to get back $y_2$. Notice that $1^\ell$ is the bitwise complement of $0^\ell$. The adversary checks
whether the first half of $y_1$ equals the second half of $y_2$, and if so bets that it is in the real world.
Let us now see how well this adversary does. Let $R = \{0,1\}^{2L}$. We claim that
\begin{align*}
\Pr\left[\mathrm{Real}_G^A \Rightarrow 1\right] &= 1 \\
\Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] &= 2^{-L} .
\end{align*}
Why? Look at Game $\mathrm{Real}_G$ as defined in Definition 4.4.1. Here Fn = $G_K$ for some K. In that case
we have
\begin{align*}
G_K(1^\ell) &= F_K(1^\ell) \,\|\, F_K(0^\ell) \\
G_K(0^\ell) &= F_K(0^\ell) \,\|\, F_K(1^\ell)
\end{align*}
by definition of the family G. Notice that the first half of $G_K(1^\ell)$ is the same as the second half of
$G_K(0^\ell)$. So A will return 1. On the other hand look at Game $\mathrm{Rand}_R$ as defined in Definition 4.4.1.
Here Fn is a random function. So the values $\mathrm{Fn}(1^\ell)$ and $\mathrm{Fn}(0^\ell)$ are both random and independent
2L-bit strings. What is the probability that the first half of the first string equals the second half of
the second string? It is exactly the probability that two randomly chosen L-bit strings are equal,
and this is $2^{-L}$. So this is the probability that A will return 1. Now as per Definition 4.4.1 we
subtract to get
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_G(A) &= \Pr\left[\mathrm{Real}_G^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] \\
&= 1 - 2^{-L} .
\end{align*}
Now let t be the time complexity of A. This is O(ℓ + L) plus the time for two computations of G,
coming to O(ℓ + L) plus the time for four computations of F. The number of queries made by A
is two, and the total length of all queries is 2ℓ. Thus we have exhibited an efficient adversary with
a very high prf-advantage, showing that G is not a secure PRF.
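The attack on G does not depend on what F is, which the following sketch tries to make visible. As an assumption made only for this illustration, HMAC-SHA256 stands in for the secure PRF F (so L is 256 bits), and inputs are 4-byte strings; the notes do not prescribe either choice.

```python
import hashlib
import hmac
import secrets

L_BYTES = 32          # output length of the stand-in F, in bytes
ELL_BYTES = 4         # input length in bytes (assumed toy value)

def F(key, x):
    """Stand-in for the secure PRF F (an assumption): HMAC-SHA256."""
    return hmac.new(key, x, hashlib.sha256).digest()

def complement(x):
    return bytes(b ^ 0xFF for b in x)

def G(key, x):
    """The flawed construction G_K(x) = F_K(x) || F_K(complement of x)."""
    return F(key, x) + F(key, complement(x))

def adversary(fn):
    y1 = fn(b"\xff" * ELL_BYTES)          # query 1^ell
    y2 = fn(b"\x00" * ELL_BYTES)          # query 0^ell
    return int(y1[:L_BYTES] == y2[L_BYTES:])

def real_world_output():
    key = secrets.token_bytes(16)
    return adversary(lambda x: G(key, x))

def random_world_output():
    table = {}                            # lazily sampled random function
    def fn(x):
        if x not in table:
            table[x] = secrets.token_bytes(2 * L_BYTES)
        return table[x]
    return adversary(fn)
```

Swapping in any other keyed function for `F` leaves `real_world_output()` equal to 1, since the check only exploits the structure of G.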
This definition has been made general enough to capture all types of key-recovery attacks. Any of
the classical attacks such as exhaustive key search, differential cryptanalysis or linear cryptanalysis
correspond to different, specific choices of adversary B. They fall in this framework because all have
the goal of finding the key K based on some number of input-output examples of an instance $F_K$
of the cipher. To illustrate let us see what the implications of the classical key-recovery attacks
on DES are for the value of the key-recovery advantage function of DES. Assuming the exhaustive
key-search attack is always successful based on testing two input-output examples leads to the fact
that there exists an adversary B such that $\mathbf{Adv}^{\mathrm{kr}}_{\mathrm{DES}}(B) = 1$ and B makes two oracle queries and
has running time about $2^{55}$ times the time $T_{\mathrm{DES}}$ for one computation of DES. On the other hand,
linear cryptanalysis implies that there exists an adversary B such that $\mathbf{Adv}^{\mathrm{kr}}_{\mathrm{DES}}(B) \ge 1/2$ and B
makes $2^{44}$ oracle queries and has running time about $2^{44}$ times the time $T_{\mathrm{DES}}$ for one computation
of DES.
Game KR_F
procedure Initialize
    K ←$ Keys(F)
procedure Fn(x)
    return F_K(x)
procedure Finalize(K′)
    return (K = K′)
For a more concrete example, let us look at the key-recovery advantage of the family of
Example 4.7.1.
Example 4.8.2 Let $F : \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^L$ be the family of functions from Example 4.7.1.
We saw that its prf-advantage was very high. Let us now compute its kr-advantage. The following
adversary B recovers the key. We let $e_j$ be the ℓ-bit binary string having a 1 in position j and zeros
everywhere else. We assume that the manner in which the key K defines the matrix is that the
first L bits of K form the first column of the matrix, the next L bits of K form the second column
of the matrix, and so on.
Adversary B
K′ ← ε // ε is the empty string
for j = 1, . . . , ℓ do
    yj ← Fn(ej)
    K′ ← K′ ∥ yj
return K′
The adversary B invokes its oracle to compute the output of the function on input $e_j$. The result,
$y_j$, is exactly the j-th column of the matrix associated to the key K. The matrix entries are
concatenated to yield K′, which is returned as the key. Since the adversary always finds the key
we have
\[ \mathbf{Adv}^{\mathrm{kr}}_F(B) = 1 . \]
The time-complexity of this adversary is $t = O(\ell^2 L)$ since it makes q = ℓ calls to its oracle and each
computation of Fn takes O(ℓL) time. The parameters here should still be considered small: ℓ is 64
or 128, which is small for the number of queries. So F is insecure against key-recovery.
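The key-recovery adversary B translates almost line for line into code. The sketch below again represents bit strings as lists of 0/1 and assumes toy sizes ℓ = 16, L = 8; `kr_game` is a hypothetical helper that plays game KR once.

```python
import secrets

def F(key_bits, x_bits, L):
    """Matrix family of Example 4.7.1; key_bits lists the matrix column by column."""
    y = [0] * L
    for j, bit in enumerate(x_bits):
        if bit:
            y = [a ^ b for a, b in zip(y, key_bits[j * L:(j + 1) * L])]
    return y

def key_recovery_adversary(fn, ell):
    """Query the unit vectors e_1, ..., e_ell and concatenate the answers."""
    recovered = []
    for j in range(ell):
        e_j = [1 if i == j else 0 for i in range(ell)]
        recovered += fn(e_j)              # Fn(e_j) is column j of the matrix
    return recovered

def kr_game(ell=16, L=8):
    """Play game KR_F once; True iff the adversary outputs the key."""
    key = [secrets.randbelow(2) for _ in range(L * ell)]
    guess = key_recovery_adversary(lambda x: F(key, x, L), ell)
    return guess == key
```

Every run of `kr_game()` returns True, reflecting a kr-advantage of 1 with ℓ oracle queries.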
Note that the F of the above example is less secure as a PRF than against key-recovery: its
advantage function as a PRF had a value close to 1 for parameter values much smaller than those
above. This leads into our next claim, which says that for any given parameter values, the kr-
advantage of a family cannot be significantly more than its prf or prp-cpa advantage.
\[ \mathbf{Adv}^{\mathrm{kr}}_F(B) \le \mathbf{Adv}^{\mathrm{prf}}_F(A) + \frac{1}{|R|} . \tag{4.1} \]
Furthermore if D = R then there also exists a PRP CPA adversary A against F such that A has
running time at most t plus the time for one computation of F, makes at most q + 1 oracle queries,
and
\[ \mathbf{Adv}^{\mathrm{kr}}_F(B) \le \mathbf{Adv}^{\mathrm{prp\text{-}cpa}}_F(A) + \frac{1}{|D| - q} . \tag{4.2} \]
The Proposition implies that if a family of functions is a secure PRF or PRP then it is also
secure against all key-recovery attacks. In particular, if a blockcipher is modeled as a PRP or PRF,
we are implicitly assuming it to be secure against key-recovery attacks.
Before proceeding to a formal proof let us discuss the underlying ideas. The problem that
adversary A is trying to solve is to determine whether its given oracle Fn is a random instance of
F or a random function of D to R. A will run B as a subroutine and use B’s output to solve its
own problem.
B is an algorithm that expects to be in a world where it gets an oracle Fn for some random key
K ∈ K, and it tries to find K via queries to its oracle. For simplicity, first assume that B makes no
oracle queries. Now, when A runs B, it produces some key K′. A can test K′ by checking whether
F(K′, x) agrees with Fn(x) for some value x. If so, it bets that Fn was an instance of F, and if
not it bets that Fn was random.
If B does make oracle queries, we must ask how A can run B at all. The oracle that B wants
is not available. However, B is a piece of code, communicating with its oracle via a prescribed
interface. If you start running B, at some point it will output an oracle query, say by writing this
to some prescribed memory location, and stop. It awaits an answer, to be provided in another
prescribed memory location. When that appears, it continues its execution. When it is done
making oracle queries, it will return its output. Now when A runs B, it will itself supply the
answers to B’s oracle queries. When B stops, having made some query, A will fill in the reply in
the prescribed memory location, and let B continue its execution. B does not know the difference
between this “simulated” oracle and the real oracle except in so far as it can glean this from the
values returned.
The value that B expects in reply to query x is $F_K(x)$ where K is a random key from K.
However, A returns to it as the answer to query x the value Fn(x), where Fn is A’s oracle. When
A is in the real world, Fn is an instance of F and so B is functioning as it would in its usual
environment, and will return the key K with a probability equal to its kr-advantage. However
when A is in the random world, Fn is a random function, and B is getting back values that bear
little relation to the ones it is expecting. That does not matter. B is a piece of code that will run
to completion and produce some output. When we are in the random world, we have no idea what
properties this output will have. But it is some key in K, and A will test it as indicated above. It
will fail the test with high probability as long as the test point x was not one that B queried, and
A will make sure the latter is true via its choice of x. Let us now proceed to the actual proof.
Proof of Proposition 4.8.3: We prove the first equation and then briefly indicate how to alter
the proof to prove the second equation.
As per Definition 4.4.1, adversary A will be provided an oracle Fn for a function Fn: D → R, and
will try to determine which world it is in. To do so, it will run adversary B as a subroutine. We
provide the description followed by an explanation and analysis.
Adversary A
i ← 0
Run adversary B, replying to its oracle queries as follows
    When B makes an oracle query x do
        i ← i + 1 ; xi ← x
        yi ← Fn(xi)
        Return yi to B as the answer
Until B stops and outputs a key K′
Let x be some point in D − {x1, . . . , xq}
y ← Fn(x)
if F(K′, x) = y then return 1 else return 0
As indicated in the discussion preceding the proof, A is running B and itself providing answers
to B’s oracle queries via the oracle Fn. When B has run to completion it returns some K′ ∈ K,
which A tests by checking whether F(K′, x) agrees with Fn(x). Here x is a value different from
any that B queried, and it is to ensure that such a value can be found that we require q < |D| in
the statement of the Proposition. Now we claim that
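\begin{align}
\Pr\left[\mathrm{Real}_F^A \Rightarrow 1\right] &\ge \mathbf{Adv}^{\mathrm{kr}}_F(B) \tag{4.3} \\
\Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] &= \frac{1}{|R|} . \tag{4.4}
\end{align}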
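We will justify these claims shortly, but first let us use them to conclude. Subtracting, as per
Definition 4.4.1, we get
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_F(A) &= \Pr\left[\mathrm{Real}_F^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] \\
&\ge \mathbf{Adv}^{\mathrm{kr}}_F(B) - \frac{1}{|R|}
\end{align*}
as desired. It remains to justify Equations (4.3) and (4.4).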
Equation (4.3) is true because in $\mathrm{Real}_F$ the oracle Fn is a random instance of F, which is the oracle
that B expects, and thus B functions as it does in $\mathrm{KR}_F^B$. If B is successful, meaning the key K′ it
outputs equals K, then certainly A returns 1. (It is possible that A might return 1 even though B
was not successful. This would happen if K′ ≠ K but F(K′, x) = F(K, x). It is for this reason that
Equation (4.3) is an inequality rather than an equality.) Equation (4.4) is true because in $\mathrm{Rand}_R$
the function Fn is random, and since x was never queried by B, the value Fn(x) is unpredictable
to B. Imagine that Fn(x) is chosen only when x is queried to Fn. At that point, K′, and thus
F(K′, x), is already defined. So Fn(x) has a 1/|R| chance of hitting this fixed point. Note this is
true regardless of how hard B tries to make F(K′, x) be the same as Fn(x).
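For the proof of Equation (4.2), the adversary A is the same. For the analysis we see that
\begin{align*}
\Pr\left[\mathrm{Real}_F^A \Rightarrow 1\right] &\ge \mathbf{Adv}^{\mathrm{kr}}_F(B) \\
\Pr\left[\mathrm{Perm}_D^A \Rightarrow 1\right] &\le \frac{1}{|D| - q} .
\end{align*}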
Subtracting yields Equation (4.2). The first inequality above is true for the same reason as before.
The second inequality is true because in game $\mathrm{Perm}_D$ the map Fn is a random permutation of D to
D. So Fn(x) assumes, with equal probability, any value in D except $y_1, \ldots, y_q$, meaning there are
at least |D| − q things it could be. (Remember R = D in this case.)
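The adversary A built in this proof is generic: it only needs B as a black box, the public family F, and a way to pick an unqueried point. The following sketch shows one way to write that wrapper. The toy family `toy_F` on 8-bit blocks and the trivial key-recovery adversary are hypothetical, introduced only so the reduction can be exercised.

```python
import secrets

def make_prf_adversary(kr_adversary, F, domain):
    """Build the distinguisher A of the proof from a key-recovery adversary B."""
    def prf_adversary(fn):
        queried = []
        def recording_oracle(x):          # A answers B's queries with its own oracle
            queried.append(x)
            return fn(x)
        guess = kr_adversary(recording_oracle)
        # Pick a test point that B never queried (this needs q < |D|).
        x = next(p for p in domain if p not in queried)
        return int(F(guess, x) == fn(x))
    return prf_adversary

def toy_F(key, x):
    """Hypothetical toy family on 8-bit blocks, used only to exercise the reduction."""
    return (x + key) % 256

def toy_kr_adversary(fn):
    return fn(0)                          # F_K(0) = K for this toy family

A = make_prf_adversary(toy_kr_adversary, toy_F, range(256))

def real_world_output():
    key = secrets.randbelow(256)
    return A(lambda x: toy_F(key, x))
```

In the real world B recovers the key, so the test at the fresh point passes and A returns 1. Against an oracle unrelated to the family, the test passes only if the oracle's value at the fresh point happens to match.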
The following example illustrates that the converse of the above claim is far from true. The kr-
advantage of a family can be significantly smaller than its prf or prp-cpa advantage, meaning that
a family might be very secure against key recovery yet very insecure as a prf or prp, and thus not
useful for protocol design.
Example 4.8.4 Define the blockcipher $E: \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^\ell$ by $E_K(x) = x$ for all k-bit
keys K and all ℓ-bit inputs x. We claim that it is very secure against key-recovery but very insecure
as a PRP under CPA. More precisely, we claim that for any adversary B,
\[ \mathbf{Adv}^{\mathrm{kr}}_E(B) = 2^{-k} , \]
regardless of the running time and number of queries made by B. On the other hand there is an
adversary A, making only one oracle query and having a very small running time, such that
\[ \mathbf{Adv}^{\mathrm{prp\text{-}cpa}}_E(A) \ge 1 - 2^{-\ell} . \]
In other words, given an oracle for $E_K$, you may make as many queries as you want, and spend
as much time as you like, before outputting your guess as to the value of K, yet your chance
of getting it right is only $2^{-k}$. On the other hand, using only a single query to a given oracle
$\mathrm{Fn}: \{0,1\}^\ell \to \{0,1\}^\ell$, and very little time, you can tell almost with certainty whether Fn is an
instance of E or is a random permutation on ℓ-bit strings. Why are these claims true? Since $E_K$ does
not depend on K, an adversary with oracle $E_K$ gets no information about K by querying it, and
hence its guess as to the value of K can be correct only with probability $2^{-k}$. On the other hand,
an adversary can test whether $\mathrm{Fn}(0^\ell) = 0^\ell$, and by returning 1 if and only if this is true, attain a
prp-cpa-advantage of $1 - 2^{-\ell}$.
Proof of Proposition 4.9.1: Adversary A is given an oracle $\mathrm{Fn}: \{0,1\}^\ell \to \{0,1\}^\ell$ and works
like this:
Adversary A
for i = 1, . . . , q do
    Let xi be the i-th ℓ-bit string in lexicographic order
    yi ← Fn(xi)
if y1, . . . , yq are all distinct then return 1, else return 0
Here C(N, q), as defined in the appendix on the birthday problem, is the probability that some bin
gets two or more balls in the experiment of randomly throwing q balls into N bins. We will justify
these claims shortly, but first let us use them to conclude. Subtracting, we get
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_E(A) &= \Pr\left[\mathrm{Real}_E^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_{\{0,1\}^\ell}^A \Rightarrow 1\right] \\
&= 1 - [1 - C(N, q)] \\
&= C(N, q) \\
&\ge 0.3 \cdot \frac{q(q-1)}{2^\ell} .
\end{align*}
The last line is by Theorem A.1 in the appendix on the birthday problem. It remains to justify
Equations (4.6) and (4.7).
Equation (4.6) is clear because in the real world, Fn = $E_K$ for some key K, and since E is a family
of permutations, Fn is a permutation, and thus $y_1, \ldots, y_q$ are all distinct. Now, suppose A is in
the random world, so that Fn is a random function of ℓ bits to ℓ bits. What is the probability
that $y_1, \ldots, y_q$ are all distinct? Since Fn is a random function and $x_1, \ldots, x_q$ are distinct, $y_1, \ldots, y_q$
are random, independently distributed values in $\{0,1\}^\ell$. Thus we are looking at the birthday
problem. We are throwing q balls into $N = 2^\ell$ bins and asking what is the probability of there
being no collisions, meaning no bin contains two or more balls. This is 1 − C(N, q), justifying
Equation (4.7).
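The quantity C(N, q) can be computed exactly as one minus a product, which gives a quick way to see how the birthday adversary's advantage grows with q. The sketch below uses exact rational arithmetic. The function names are hypothetical, and the range $q \le \sqrt{2N}$ in the usage note is the condition under which the 0.3 lower bound is stated in the birthday appendix, as far as we recall.

```python
from fractions import Fraction

def no_collision_probability(N, q):
    """Exact probability that q balls thrown into N bins all land in distinct bins."""
    p = Fraction(1)
    for i in range(q):
        p *= Fraction(N - i, N)
    return p

def birthday_advantage(ell, q):
    """Exact prf-advantage C(N, q) of the distinctness test, with N = 2^ell."""
    return 1 - no_collision_probability(2 ** ell, q)

def birthday_adversary(fn, q):
    """Query the first q points; output 1 iff all answers are distinct."""
    answers = [fn(x) for x in range(q)]
    return int(len(set(answers)) == q)
```

For ℓ = 8 one can check that `birthday_advantage(8, q)` is at least 0.3 · q(q − 1)/256 for every q from 1 to 22, that is, up to about $\sqrt{2N}$.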
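Lemma 4.10.1 [PRP/PRF Switching Lemma] Let $E: K \times \{0,1\}^n \to \{0,1\}^n$ be a function
family. Let $R = \{0,1\}^n$. Let A be an adversary that asks at most q oracle queries. Then
\[ \left| \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] - \Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] \right| \le \frac{q(q-1)}{2^{n+1}} . \tag{4.8} \]
As a consequence, we have that
\[ \left| \mathbf{Adv}^{\mathrm{prf}}_E(A) - \mathbf{Adv}^{\mathrm{prp}}_E(A) \right| \le \frac{q(q-1)}{2^{n+1}} . \tag{4.9} \]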
The proof introduces a technique that we shall use repeatedly: a game-playing argument. We
are trying to compare what happens when an adversary A interacts with one kind of object—a
random permutation oracle—to what happens when the adversary interacts with a different kind
of object—a random function oracle. So we set up each of these two interactions as a kind of game,
writing out the game in pseudocode. The two games are written in a way that highlights when
they have differing behaviors. In particular, any time that the behavior in the two games differs,
we set a flag bad. The probability that the flag bad gets set in one of the two games is then used
to bound the difference between the probability that the adversary outputs 1 in one game and the
probability that the adversary outputs 1 in the other game.
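Proof: Let’s begin with Equation (4.8), as Equation (4.9) follows from that. We need to establish
that
\[ -\frac{q(q-1)}{2^{n+1}} \le \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] - \Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] \le \frac{q(q-1)}{2^{n+1}} . \]
Let’s show the right-hand inequality, since the left-hand inequality works in exactly the same way.
So we are trying to establish that
\[ \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] - \Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] \le \frac{q(q-1)}{2^{n+1}} . \tag{4.10} \]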
We can assume that A never asks an oracle query that is not an n-bit string. You can assume that
such an invalid oracle query would generate an error message. The same error message would be
generated on any invalid query, regardless of A’s oracle, so asking invalid queries is pointless for A.
We can also assume that A never repeats an oracle query: if it asks a question X it won’t later ask
the same question X. It’s not interesting for A to repeat a question, because it’s going to get the
same answer as before, independent of the type of oracle to which A is speaking. More precisely,
with a little bit of bookkeeping the adversary can remember the answer to each oracle
query it has already asked, so it never needs to repeat a query: it can
just as well look up the prior answer.
Let’s look at Games G0 and G1 of Fig. 4.5. Notice that the adversary never sees the flag bad. The
flag bad will play a central part in our analysis, but it is not something that the adversary A can
get hold of. It’s only for our bookkeeping.
Suppose that the adversary asks a query X. By our assumptions about A, the string X is an n-bit
string that the adversary has not yet asked about. In line 10, we choose a random n-bit string Y .
Lines 11 and 12 are the most interesting. If the point Y that we just chose is already in the
range of the function we are defining then we set a flag bad. In such a case, if we are playing game
G0 , then we now make a fresh choice of Y , this time from the co-range of the function. If we are
playing game G1 then we stick with our original choice of Y . Either way, we return Y , effectively
growing the domain of our function.
procedure Initialize        // G0, G1
    UR ← ∅

procedure Fn(x)
10  Y ←$ R
11  if Y ∈ UR then
12      bad ← true; [ Y ←$ R \ UR ]
13  UR ← UR ∪ {Y}
14  return Y

Figure 4.5: Games used in the proof of the Switching Lemma. Game G0 includes the boxed code
(shown here in square brackets) while game G1 does not.
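The two games can be sketched in Python as follows. This is only an illustration under our own assumptions: the set R = {0,1}^n is modelled as the integers 0 .. 2^n − 1, the class name `SwitchingGame` and the flag `include_boxed` are hypothetical, and the adversary is assumed never to repeat a query, as in the proof.

```python
import secrets


class SwitchingGame:
    """Games G0 and G1 of Fig. 4.5, with R = {0,1}^n modelled as range(2**n)."""

    def __init__(self, n, include_boxed):
        self.size = 2 ** n                  # |R|
        self.include_boxed = include_boxed  # True gives game G0, False gives game G1
        self.used = set()                   # UR: the values returned so far
        self.bad = False                    # bookkeeping only; never shown to the adversary

    def fn(self, x):
        # The adversary is assumed never to repeat a query, so x itself is not used.
        y = secrets.randbelow(self.size)                 # line 10
        if y in self.used:                               # line 11
            self.bad = True                              # line 12
            if self.include_boxed:                       # boxed code: G0 only
                unused = [v for v in range(self.size) if v not in self.used]
                y = secrets.choice(unused)
        self.used.add(y)                                 # line 13
        return y                                         # line 14
```

With `include_boxed=True` the answers are always distinct, as for a random permutation; with `include_boxed=False` they are independent uniform values, as for a random function on distinct queries.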
Now let’s think about what A sees as it plays Game G1 . Whatever query X is asked, we just
return a random n-bit string Y . So game G1 perfectly simulates a random function. Remember
that the adversary isn’t allowed to repeat a query, so what the adversary would get if it had a
random function oracle is a random n-bit string in response to each query—just what we are giving
it. Hence
\[
\Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] = \Pr\left[G_1^A \Rightarrow 1\right] . \tag{4.11}
\]
Now if we’re in game G0 then what the adversary gets in response to each query X is a random
point Y that has not already been returned to A. Thus
\[
\Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] = \Pr\left[G_0^A \Rightarrow 1\right] . \tag{4.12}
\]
But games $G_0$ and $G_1$ are identical-until-bad, and hence the Fundamental Lemma of game playing
implies that
\[
\Pr\left[G_0^A \Rightarrow 1\right] - \Pr\left[G_1^A \Rightarrow 1\right] \;\le\; \Pr\left[G_1^A \text{ sets bad}\right] . \tag{4.13}
\]
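Bounding $\Pr[G_1^A \text{ sets bad}]$ is simple. Line 11 is executed q times. The first time it is executed UR
contains 0 points; the second time it is executed UR contains 1 point; the third time it is executed
UR contains at most 2 points; and so forth. Each time line 11 is executed we have just selected
a random value Y that is independent of the contents of UR. By the sum bound, the probability
that a Y will ever be in UR at line 11 is therefore at most
\[
\frac{0}{2^n} + \frac{1}{2^n} + \frac{2}{2^n} + \cdots + \frac{q-1}{2^n} \;=\; \frac{1 + 2 + \cdots + (q-1)}{2^n} \;=\; \frac{q(q-1)}{2^{n+1}} .
\]
This completes the proof of Equation (4.10). To go on and show that
$\mathbf{Adv}^{\mathrm{prf}}_E(A) - \mathbf{Adv}^{\mathrm{prp}}_E(A) \le q(q-1)/2^{n+1}$, note that
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_E(A) - \mathbf{Adv}^{\mathrm{prp}}_E(A)
  &= \left( \Pr\left[\mathrm{Real}_E^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] \right)
   - \left( \Pr\left[\mathrm{Real}_E^A \Rightarrow 1\right] - \Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] \right) \\
  &= \Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1\right] \\
  &\le \frac{q(q-1)}{2^{n+1}} .
\end{align*}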
The PRP/PRF switching lemma is one of the central tools for understanding block-cipher based
protocols, and the game-playing method will be one of our central techniques for doing proofs.
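To get a feel for the size of the bound, the following sketch compares q(q − 1)/2^{n+1} with the exact probability that q independent uniform n-bit values are not all distinct, which is the probability that game G1 sets bad. The function names are our own, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction


def switching_bound(n, q):
    """The bound q(q-1)/2^(n+1) of the Switching Lemma."""
    return Fraction(q * (q - 1), 2 ** (n + 1))


def collision_probability(n, q):
    """Exact probability that q uniform draws from {0,1}^n are not all distinct."""
    size = 2 ** n
    no_collision = Fraction(1)
    for i in range(q):
        no_collision *= Fraction(size - i, size)
    return 1 - no_collision


if __name__ == "__main__":
    for n, q in [(8, 10), (64, 2 ** 20), (128, 2 ** 32)]:
        print(n, q, float(switching_bound(n, q)))
```

For q = 2 the two quantities coincide; for larger q the sum bound overcounts slightly, so the exact probability is smaller.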
4.12 Problems
Problem 20 Let $E: \{0,1\}^k \times \{0,1\}^n \to \{0,1\}^n$ be a secure PRP. Consider the family of
permutations $E': \{0,1\}^k \times \{0,1\}^{2n} \to \{0,1\}^{2n}$ defined for all $x, x' \in \{0,1\}^n$ by
\[
E'_K(x \,\|\, x') = E_K(x) \,\|\, E_K(x \oplus x') .
\]
Problem 21 Consider the following blockcipher $E: \{0,1\}^3 \times \{0,1\}^2 \to \{0,1\}^2$:
key 0 1 2 3
0 0 1 2 3
1 3 0 1 2
2 2 3 0 1
3 1 2 3 0
4 0 3 2 1
5 1 0 3 2
6 2 1 0 3
7 3 2 1 0
(The eight possible keys are the eight rows, and each row shows the points to which 0, 1,
2, and 3 map.) Compute the maximal prp-advantage an adversary can get (a) with one query,
(b) with four queries, and (c) with two queries.
Problem 22 Present a secure construction for the problem of Example 4.7.2. That is, given a
PRF F : {0, 1}k × {0, 1}n → {0, 1}n , construct a PRF G: {0, 1}k × {0, 1}n → {0, 1}2n which is a
secure PRF as long as F is secure.
Problem 23 Design a blockcipher E : {0, 1}k × {0, 1}128 → {0, 1}128 that is secure (up to a
large number of queries) against non-adaptive adversaries, but is completely insecure (even for
two queries) against an adaptive adversary. (A non-adaptive adversary readies all her questions
M1 , . . . , Mq , in advance, getting back EK (M1 ), ..., EK (Mq ). An adaptive adversary is the sort we
have dealt with throughout: each query may depend on prior answers.)
Problem 24 Let a[i] denote the i-th bit of a binary string a, where 1 ≤ i ≤ |a|. The inner product
of n-bit binary strings a, b is
\[
\langle a, b \rangle = a[1]b[1] \oplus a[2]b[2] \oplus \cdots \oplus a[n]b[n] .
\]
[Figure 4.6: Game G and Game H.]
A family of functions $F: \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^L$ is said to be inner-product preserving if for
every $K \in \{0,1\}^k$ and every distinct $x_1, x_2 \in \{0,1\}^\ell - \{0^\ell\}$ we have
\[
\langle F(K, x_1), F(K, x_2) \rangle = \langle x_1, x_2 \rangle .
\]
Prove that if F is inner-product preserving then there exists an adversary A, making at most two
oracle queries and having running time 2 · TF + O(ℓ), where TF denotes the time to perform one
computation of F , such that
\[
\mathbf{Adv}^{\mathrm{prf}}_F(A) \;\ge\; \frac{1}{2} \cdot \left( 1 + \frac{1}{2^L} \right) .
\]
Explain in a sentence why this shows that if F is inner-product preserving then F is not a secure
PRF.
Problem 25 Let $E: \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^\ell$ be a blockcipher. The two-fold cascade of E is the
blockcipher $E^{(2)}: \{0,1\}^{2k} \times \{0,1\}^\ell \to \{0,1\}^\ell$ defined by
\[
E^{(2)}_{K_1 \| K_2}(x) = E_{K_1}(E_{K_2}(x))
\]
for all $K_1, K_2 \in \{0,1\}^k$ and all $x \in \{0,1\}^\ell$. Prove that if E is a secure PRP then so is $E^{(2)}$.
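The cascade construction can be tried out on a toy cipher. The sketch below reuses the 3-bit-key, 2-bit-block table of Problem 21 purely as an example blockcipher; the names `TABLE`, `E` and `cascade` are ours, and nothing about the security of the cascade can be read off from such a small instance.

```python
# Rows are keys 0..7, columns are inputs 0..3 (the table of Problem 21).
TABLE = [
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
    [1, 2, 3, 0],
    [0, 3, 2, 1],
    [1, 0, 3, 2],
    [2, 1, 0, 3],
    [3, 2, 1, 0],
]


def E(key, x):
    """Toy blockcipher: look up where input x maps under the given key."""
    return TABLE[key][x]


def cascade(k1, k2, x):
    """Two-fold cascade: apply E under K2 first, then E under K1."""
    return E(k1, E(k2, x))


if __name__ == "__main__":
    for k1 in range(8):
        for k2 in range(8):
            print(k1, k2, [cascade(k1, k2, x) for x in range(4)])
```

Each cascade key K1 ‖ K2 again defines a permutation of the four points, since it is a composition of two permutations.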
Problem 26 Let A be an adversary that makes at most q total queries to its two oracles, f and g,
where $f, g: \{0,1\}^n \to \{0,1\}^n$. Assume that A never asks the same query X to both of its oracles.
Define
\[
\mathbf{Adv}(A) = \Pr[G^A = 1] - \Pr[H^A = 1]
\]
where games G, H are defined in Fig. 4.6. Prove a good upper bound for $\mathbf{Adv}(A)$, say
$\mathbf{Adv}(A) \le q^2/2^n$.
Problem 27 Let $F: \{0,1\}^k \times \{0,1\}^\ell \to \{0,1\}^\ell$ be a family of functions and r ≥ 1 an integer.
The r-round Feistel cipher associated to F is the family of permutations
$F^{(r)}: \{0,1\}^{rk} \times \{0,1\}^{2\ell} \to \{0,1\}^{2\ell}$ defined as follows for any
$K_1, \ldots, K_r \in \{0,1\}^k$ and input $x \in \{0,1\}^{2\ell}$:
(a) Prove that there exists an adversary A, making at most two oracle queries and having running
time about that to do two computations of F, such that
\[
\mathbf{Adv}^{\mathrm{prf}}_{F^{(2)}}(A) \;\ge\; 1 - 2^{-\ell} .
\]
(b) Prove that there exists an adversary A, making at most two queries to its first oracle and one
to its second oracle, and having running time about that to do three computations of F or
$F^{-1}$, such that
\[
\mathbf{Adv}^{\mathrm{prp\text{-}cca}}_{F^{(3)}}(A) \;\ge\; 1 - 3 \cdot 2^{-\ell} .
\]
Problem 28 Let $E: \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a function family and let A be an adversary that
asks at most q queries. In trying to construct a proof that
$|\mathbf{Adv}^{\mathrm{prp}}_E(A) - \mathbf{Adv}^{\mathrm{prf}}_E(A)| \le q^2/2^{n+1}$,
Michael and Peter put forward an argument, a fragment of which is as follows:
    Consider an adversary A that asks at most q oracle queries to an oracle Fn for a function
    from R to R, where $R = \{0,1\}^n$. Let C (for “collision”) be the event that A asks some
    two distinct queries X and X′ and the oracle returns the same answer. Then clearly
    \[
    \Pr\left[\mathrm{Perm}_R^A \Rightarrow 1\right] = \Pr\left[\mathrm{Rand}_R^A \Rightarrow 1 \mid C\right] .
    \]
Show that Michael and Peter have it all wrong: prove that the quantities above are not necessarily
equal. Do this by selecting a number n and constructing an adversary A for which the left and
right sides of the equation above are unequal.
Bibliography
[1] M. Bellare and P. Rogaway. The Security of Triple Encryption and a Framework for
Code-Based Game-Playing Proofs. Advances in Cryptology – EUROCRYPT ’06, Lecture
Notes in Computer Science Vol. , ed., Springer-Verlag, 2006
[2] M. Bellare, J. Kilian and P. Rogaway. The security of the cipher block chaining message
authentication code. Journal of Computer and System Sciences , Vol. 61, No. 3, Dec 2000,
pp. 362–399.
[3] O. Goldreich, S. Goldwasser and S. Micali. How to construct random functions. Jour-
nal of the ACM, Vol. 33, No. 4, 1986, pp. 210–217.
[4] M. Luby and C. Rackoff. How to construct pseudorandom permutations from pseudoran-
dom functions. SIAM J. Comput, Vol. 17, No. 2, April 1988.
Chapter 5
Symmetric Encryption
The symmetric setting considers two parties who share a key and will use this key to imbue commu-
nicated data with various security attributes. The main security goals are privacy and authenticity
of the communicated data. The present chapter looks at privacy. A later chapter looks at authen-
ticity. Chapters 3 and 4 describe tools we shall use here.
The key-generation algorithm, as the definition indicates, is randomized. It takes no inputs. When
it is run, it flips coins internally and uses these to select a key K. Typically, the key is just a random
string of some length, in which case this length is called the key length of the scheme. When two
parties want to use the scheme, it is assumed they are in possession of a key K generated via K.
How they came into joint possession of this key K in such a way that the adversary did not get
to know K is not our concern here, and will be addressed later. For now we assume the key has
been shared.
Once in possession of a shared key, the sender can run the encryption algorithm with key K and
input message M to get back a string we call the ciphertext. The latter can then be transmitted
to the receiver.
The encryption algorithm may be either randomized or stateful. If randomized, it flips coins
and uses those to compute its output on a given input K, M . Each time the algorithm is invoked,
it flips coins anew. In particular, invoking the encryption algorithm twice on the same inputs may
not yield the same response both times.
We say the encryption algorithm is stateful if its operation depends on a quantity called the
state that is initialized in some pre-specified way. When the encryption algorithm is invoked on
inputs K, M , it computes a ciphertext based on K, M and the current state. It then updates the
state, and the new state value is stored. (The receiver does not maintain matching state and, in
particular, decryption does not require access to any global variable or call for any synchronization
between parties.) Usually, when there is state to be maintained, the state is just a counter. If there
is no state maintained by the encryption algorithm the encryption scheme is said to be stateless.
The encryption algorithm might be both randomized and stateful, but in practice this is rare: it
is usually one or the other but not both.
When we talk of a randomized symmetric encryption scheme we mean that the encryption
algorithm is randomized. When we talk of a stateful symmetric encryption scheme we mean that
the encryption algorithm is stateful.
The receiver, upon receiving a ciphertext C, will run the decryption algorithm with the same
key used to create the ciphertext, namely compute DK (C). The decryption algorithm is neither
randomized nor stateful.
Many encryption schemes restrict the set of strings that they are willing to encrypt. (For
example, perhaps the algorithm can only encrypt plaintexts of length a positive multiple of some
block length n, and can only encrypt plaintexts of length up to some maximum length.) These
kinds of restrictions are captured by having the encryption algorithm return the special symbol ⊥
when fed a message not meeting the required restriction. In a stateless scheme, there is typically a
set of strings M, called the plaintext space, such that
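\[
M \in \mathcal{M} \quad\text{iff}\quad \Pr\left[ K \stackrel{\$}{\leftarrow} \mathcal{K} ;\; C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M) \;:\; C \ne \bot \right] = 1
\]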
In a stateful scheme, whether or not EK (M ) returns ⊥ depends not only on M but also possibly on
the value of the state variable. For example, when a counter is being used, it is typical that there
is a limit to the number of encryptions performed, and when the counter reaches a certain value
the encryption algorithm returns ⊥ no matter what message is fed to it.
The correct decryption requirement simply says that decryption works: if a message M is
encrypted under a key K to yield a ciphertext C, then one can recover M by decrypting C under
K. This holds, however, only if C ≠ ⊥. The condition thus says that, for each key K ∈ Keys(SE)
and message M ∈ {0, 1}∗ , with probability one over the coins of the encryption algorithm, either
the latter outputs ⊥ or it outputs a ciphertext C which upon decryption yields M . If the scheme
is stateful, this condition is required to hold for every value of the state.
Correct decryption is, naturally, a requirement before one can use a symmetric encryption
scheme in practice, for if this condition is not met, the scheme fails to communicate information
accurately. In analyzing the security of symmetric encryption schemes, however, we will see that
it is sometimes useful to be able to consider ones that do not meet this condition.
Here X[i .. j] denotes the i-th through j-th bits of the binary string X. By ⟨ctr, C⟩ we mean a string
that encodes the number ctr and the string C. The most natural encoding is to encode ctr using
some fixed number of bits, at least lg k, and to prepend this to C. Conventions are established so
that every string Y is regarded as encoding some pair ctr, C. The encryption algorithm
XORs the message bits with key bits, starting with the key bit indicated by one plus the current
counter value. The counter is then incremented by the length of the message. Key bits are not
reused, and thus if not enough key bits are available to encrypt a message, the encryption algorithm
returns ⊥. Note that the ciphertext returned includes the value of the counter. This is to enable
decryption. (Recall that the decryption algorithm, as per Definition 5.1.1, must be stateless and
deterministic, so we do not want it to have to maintain a counter as well.)
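The stateful scheme just described can be sketched as follows. This is an illustration under our own assumptions: bit strings are modelled as Python strings of '0' and '1', the pair ⟨ctr, C⟩ is modelled as a tuple rather than an encoded string, ⊥ is modelled as `None`, and the names `OneTimePadEncryptor` and `otp_decrypt` are hypothetical. Python indexes from 0, so "key bit ctr + 1" in the text is `key[ctr]` here.

```python
def xor_bits(a, b):
    """Bitwise XOR of two equal-length strings of '0'/'1' characters."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


class OneTimePadEncryptor:
    """Stateful encryptor: the counter records how many key bits have been used."""

    def __init__(self, key_bits):
        self.key = key_bits
        self.ctr = 0

    def encrypt(self, message_bits):
        m = len(message_bits)
        if self.ctr + m > len(self.key):
            return None                       # not enough unused key bits: return ⊥
        pad = self.key[self.ctr:self.ctr + m]
        ciphertext = (self.ctr, xor_bits(message_bits, pad))
        self.ctr += m                         # key bits are never reused
        return ciphertext


def otp_decrypt(key_bits, ciphertext):
    """Stateless, deterministic decryption: the counter travels in the ciphertext."""
    ctr, c = ciphertext
    if ctr + len(c) > len(key_bits):
        return None
    return xor_bits(c, key_bits[ctr:ctr + len(c)])
```

Because the counter value is part of the ciphertext, the receiver needs only the key, matching the requirement that decryption keep no state.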
algorithm E_K(M)
    M[1] · · · M[m] ← M
    for i ← 1 to m do
        C[i] ← E_K(M[i])
    C ← C[1] · · · C[m]
    return C

algorithm D_K(C)
    C[1] · · · C[m] ← C
    for i ← 1 to m do
        M[i] ← E_K^{-1}(C[i])
    M ← M[1] · · · M[m]
    return M
algorithm E_K(M)
    M[1] · · · M[m] ← M
    C[0] ←$ {0, 1}^n
    for i = 1, . . . , m do
        C[i] ← E_K(M[i] ⊕ C[i − 1])
    return C

algorithm D_K(C)
    C[0] · · · C[m] ← C
    for i = 1, . . . , m do
        M[i] ← E_K^{-1}(C[i]) ⊕ C[i − 1]
    return M
X[i] . . . X[m] ← X
the operation of parsing string X into m − i + 1 blocks, each block of length n. Here i ≤ m and X
is assumed to have length (m − i + 1) · n. Thus, X[j] consists of bits (j − i)n + 1 to (j − i + 1)n of
X, for i ≤ j ≤ m.
The first scheme we consider is ECB (Electronic Codebook Mode), whose security is considered
in Section 5.5.1.
Scheme 5.2.2 [ECB mode] Let $E: \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a blockcipher. Operating it in
ECB (Electronic Code Book) mode yields a stateless symmetric encryption scheme
$\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$.
The key-generation algorithm simply returns a random key for the blockcipher, meaning it picks a
random string $K \stackrel{\$}{\leftarrow} \mathcal{K}$ and returns it. The encryption and decryption algorithms are depicted in
Fig. 5.1. Notice that this time the encryption algorithm did not make any random choices. (That
does not mean it is not, technically, a randomized algorithm; it is simply a randomized algorithm
that happened not to make any random choices.)
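A minimal sketch of ECB mode follows, under loud assumptions: the blockcipher is a toy with n = 8 (one byte per block) built by shuffling the 256 byte values with a key-seeded `random.Random`, which is in no way secure; the names `toy_blockcipher`, `ecb_encrypt` and `ecb_decrypt` are ours.

```python
import random


def toy_blockcipher(key):
    """Return (E_K, E_K^{-1}) for a toy 8-bit blockcipher: a key-seeded shuffle of 0..255."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    inverse = [0] * 256
    for x, y in enumerate(table):
        inverse[y] = x
    return table.__getitem__, inverse.__getitem__


def ecb_encrypt(key, message):
    """Encrypt each one-byte block independently under E_K."""
    enc, _ = toy_blockcipher(key)
    return bytes(enc(block) for block in message)


def ecb_decrypt(key, ciphertext):
    """Decrypt each one-byte block independently under E_K^{-1}."""
    _, dec = toy_blockcipher(key)
    return bytes(dec(block) for block in ciphertext)
```

Since encryption is deterministic and blockwise, equal plaintext blocks always produce equal ciphertext blocks; encrypting `b"abab"` yields a ciphertext whose first and third bytes agree.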
The next scheme, cipher-block chaining (CBC) with random initial vector, is the most popular
block-cipher mode of operation, used pervasively in practice.
algorithm E_K(M)
    M[1] · · · M[m] ← M
    C[0] ← ctr
    for i = 1, . . . , m do
        C[i] ← E_K(M[i] ⊕ C[i − 1])
    return C

algorithm D_K(C)
    C[0] · · · C[m] ← C
    for i = 1, . . . , m do
        M[i] ← E_K^{-1}(C[i]) ⊕ C[i − 1]
    return M
Scheme 5.2.3 [CBC$ mode] Let $E: \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a blockcipher. Operating it in
CBC mode with random IV yields a stateless symmetric encryption scheme,
$\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$. The
key generation algorithm simply returns a random key for the blockcipher,
$K \stackrel{\$}{\leftarrow} \mathcal{K}$. The encryption
and decryption algorithms are depicted in Fig. 5.2. The IV (“initialization vector”) is C[0], which
is chosen at random by the encryption algorithm. This choice is made independently each time the
algorithm is invoked.
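CBC mode with a random IV can be sketched in the same toy setting. As before, the 8-bit "blockcipher" built from a key-seeded shuffle is only a stand-in of our own choosing and is not secure; `cbc_encrypt` and `cbc_decrypt` are hypothetical names.

```python
import random
import secrets


def toy_blockcipher(key):
    """Return (E_K, E_K^{-1}) for a toy 8-bit blockcipher: a key-seeded shuffle of 0..255."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    inverse = [0] * 256
    for x, y in enumerate(table):
        inverse[y] = x
    return table.__getitem__, inverse.__getitem__


def cbc_encrypt(key, message):
    """CBC$ encryption: C[0] is a fresh random block, C[i] = E_K(M[i] xor C[i-1])."""
    enc, _ = toy_blockcipher(key)
    blocks = [secrets.randbelow(256)]          # the IV, chosen anew on every call
    for m_i in message:
        blocks.append(enc(m_i ^ blocks[-1]))
    return bytes(blocks)


def cbc_decrypt(key, ciphertext):
    """CBC decryption: M[i] = E_K^{-1}(C[i]) xor C[i-1]; the IV is read from C[0]."""
    _, dec = toy_blockcipher(key)
    return bytes(dec(ciphertext[i]) ^ ciphertext[i - 1]
                 for i in range(1, len(ciphertext)))
```

The ciphertext is one block longer than the message because it carries the IV, and two encryptions of the same message will usually differ because the IVs differ.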
For the following schemes it is useful to introduce some notation. With n fixed, we let ⟨i⟩ denote
the n-bit string that is the binary representation of the integer i mod 2^n. If we use a number i ≥ 0 in
a context for which a string I ∈ {0, 1}^n is required, it is understood that we mean to replace i by
I = ⟨i⟩. The following is a counter-based version of CBC mode, whose security is considered in
Section 5.5.3.
Scheme 5.2.4 [CBCC mode] Let $E: \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a blockcipher. Operating it in
CBC mode with counter IV yields a stateful symmetric encryption scheme,
$\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$. The
key generation algorithm simply returns a random key for the blockcipher,
$K \stackrel{\$}{\leftarrow} \mathcal{K}$. The encryptor
maintains a counter ctr which is initially zero. The encryption and decryption algorithms are
depicted in Fig. 5.3. The IV (“initialization vector”) is C[0], which is set to the current value of
the counter. The counter is then incremented each time a message is encrypted. The counter is a
static variable, meaning that its value is preserved across invocations of the encryption algorithm.
The CTR (counter) modes that follow are not much used, to the best of our knowledge, but
perhaps wrongly so. We will see later that they have good privacy properties. In contrast to CBC,
the encryption procedure is parallelizable, which can be exploited to speed up the process in the
presence of hardware support. It is also the case that the methods work for strings of arbitrary
bit lengths, without doing anything “special” to achieve this end. There are two variants of CTR
mode, one random and the other stateful, and, as we will see later, their security properties are
different. For security analyses see Section 5.7 and Section 5.10.1.
Scheme 5.2.5 [CTR$ mode] Let $F: \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a family of functions. (Possibly
a blockcipher, but not necessarily.) Then CTR mode over F with a random starting point is a
probabilistic, stateless symmetric encryption scheme,
$\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$. The key-generation algorithm
simply returns a random key for F. The encryption and decryption algorithms are depicted
in Fig. 5.4. The starting point C[0] is used to define a sequence of values on which F_K is applied
to produce a “pseudo one-time pad” to which the plaintext is XORed. The starting point C[0]
chosen by the encryption algorithm is a random n-bit string. To add an n-bit string C[0] to an
integer i (as when we write F_K(C[0] + i)), convert the n-bit string C[0] into an integer in the range
algorithm E_K(M)
    M[1] · · · M[m] ← M
    C[0] ←$ {0, 1}^n
    for i = 1, . . . , m do
        P[i] ← F_K(C[0] + i)
        C[i] ← P[i] ⊕ M[i]
    return C

algorithm D_K(C)
    C[0] · · · C[m] ← C
    for i = 1, . . . , m do
        P[i] ← F_K(C[0] + i)
        M[i] ← P[i] ⊕ C[i]
    return M

Figure 5.4: CTR$ mode using a family of functions F: K × {0, 1}^n → {0, 1}^n. This version of
counter mode is randomized and stateless.
algorithm E_K(M)
    M[1] · · · M[m] ← M
    C[0] ← ctr
    for i = 1, . . . , m do
        P[i] ← F_K(ctr + i)
        C[i] ← P[i] ⊕ M[i]
    ctr ← ctr + m
    return C

algorithm D_K(C)
    C[0] · · · C[m] ← C
    ctr ← C[0]
    for i = 1, . . . , m do
        P[i] ← F_K(ctr + i)
        M[i] ← P[i] ⊕ C[i]
    return M

Figure 5.5: CTRC mode using a family of functions F: K × {0, 1}^n → {0, 1}^n. This version of
counter mode uses stateful (but deterministic) encryption.
[0 .. 2^n − 1] in the usual way, add this number to i, take the result modulo 2^n, and then convert
this back into an n-bit string. Note that the starting point C[0] is included in the ciphertext, to
enable decryption.
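A sketch of CTR$ mode follows. The function family is an assumption of ours: HMAC-SHA256 truncated to n = 64 bits stands in for F_K, which the notes leave abstract. Blocks are modelled as integers in [0 .. 2^n − 1], and the names `F`, `ctr_encrypt` and `ctr_decrypt` are hypothetical.

```python
import hashlib
import hmac
import secrets

N_BYTES = 8                     # block length n = 64 bits (an arbitrary choice)
MOD = 2 ** (8 * N_BYTES)        # 2^n


def F(key, x):
    """Stand-in function family: HMAC-SHA256(key, x) truncated to n bits."""
    digest = hmac.new(key, x.to_bytes(N_BYTES, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:N_BYTES], "big")


def ctr_encrypt(key, blocks):
    """CTR$ encryption: pick a random start C[0], then C[i] = F_K(C[0] + i) xor M[i]."""
    start = secrets.randbelow(MOD)
    out = [start]                                  # C[0] is sent so decryption can recompute the pad
    for i, m_i in enumerate(blocks, start=1):
        pad = F(key, (start + i) % MOD)            # addition is modulo 2^n
        out.append(pad ^ m_i)
    return out


def ctr_decrypt(key, ciphertext):
    """CTR$ decryption: recompute the same pad from C[0] and XOR it off."""
    start = ciphertext[0]
    return [F(key, (start + i) % MOD) ^ c_i
            for i, c_i in enumerate(ciphertext[1:], start=1)]
```

Note that decryption never inverts F, which is why F need not be a blockcipher, and that each pad block depends only on C[0] and i, so the loop iterations are independent of one another.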
Scheme 5.2.6 [CTRC mode] Let F : K × {0, 1}n → {0, 1}n be a family of functions (possibly
a blockcipher, but not necessarily). Operating it in CTR mode with a counter starting point is a
stateful symmetric encryption scheme, SE = (K, E, D), which we call CTRC. The key-generation
algorithm simply returns a random key for F . The encryptor maintains a counter ctr which is
initially zero. The encryption and decryption algorithms are depicted in Fig. 5.5. Position index
ctr is not allowed to wrap around: the encryption algorithm returns ⊥ if this would happen. The
position index is included in the ciphertext in order to enable decryption. The encryption algorithm
updates the position index upon each invocation, and begins with this updated value the next time
it is invoked.
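The stateful variant can be sketched the same way. Again HMAC-SHA256 truncated to 64 bits is our own stand-in for F_K, ⊥ is modelled as `None`, and the class name `CtrcEncryptor` is hypothetical. The counter lives in the encryptor object; decryption reads it from the ciphertext.

```python
import hashlib
import hmac

N_BYTES = 8                     # block length n = 64 bits (an arbitrary choice)
MOD = 2 ** (8 * N_BYTES)        # 2^n


def F(key, x):
    """Stand-in function family: HMAC-SHA256(key, x) truncated to n bits."""
    digest = hmac.new(key, x.to_bytes(N_BYTES, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:N_BYTES], "big")


class CtrcEncryptor:
    """CTRC encryption: the counter persists across calls to encrypt."""

    def __init__(self, key):
        self.key = key
        self.ctr = 0

    def encrypt(self, blocks):
        m = len(blocks)
        if self.ctr + m >= MOD:
            return None                            # the counter would wrap around: return ⊥
        out = [self.ctr]                           # C[0] carries the position index
        for i, m_i in enumerate(blocks, start=1):
            out.append(F(self.key, self.ctr + i) ^ m_i)
        self.ctr += m                              # the next call starts after the used positions
        return out


def ctrc_decrypt(key, ciphertext):
    """Stateless decryption; assumes a ciphertext produced by CtrcEncryptor (no wrap-around)."""
    ctr = ciphertext[0]
    return [F(key, ctr + i) ^ c_i
            for i, c_i in enumerate(ciphertext[1:], start=1)]
```

Because the counter only moves forward, no input to F_K is ever used for two different message blocks.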
We will return to the security of these schemes after we have developed the appropriate notions.
now want to explore the issue of what the privacy of the scheme might mean. For this chapter,
security is privacy, and we are trying to get to the heart of what security is about.
The adversary is assumed able to capture any ciphertext that flows on the channel between
the two parties. It can thus collect ciphertexts, and try to glean something from them. Our first
question is: what exactly does “glean” mean? What tasks, were the adversary to accomplish them,
would make us declare the scheme insecure? And, correspondingly, what tasks, were the adversary
unable to accomplish them, would make us declare the scheme secure?
It is easier to think about insecurity than security, because we can certainly identify adversary
actions that indubitably imply the scheme is insecure. So let us begin here.
For example, if the adversary can, from a few ciphertexts, derive the underlying key K, it can
later decrypt anything it sees, so a scheme that allows easy key recovery from a few ciphertexts
is definitely insecure.
Now, the mistake that is often made is to go on to reverse this, saying that if key recovery is
hard, then the scheme is secure. This is certainly not true, for there are other possible weaknesses.
For example, what if, given the ciphertext, the adversary could easily recover the plaintext M
without finding the key? Certainly the scheme is insecure then too.
So should we now declare a scheme secure if it is hard to recover a plaintext from the ciphertext?
Many people would say yes. Yet, this would be wrong too.
One reason is that the adversary might be able to figure out partial information about M . For
example, even though it might not be able to recover M , the adversary might, given C, be able
to recover the first bit of M , or the sum of all the bits of M . This is not good, because these bits
might carry valuable information.
For a concrete example, say I am communicating to my broker a message which is a sequence
of “buy” or “sell” decisions for a pre-specified sequence of stocks. That is, we have certain stocks,
numbered 1 through m, and bit i of the message is 1 if I want to buy stock i and 0 otherwise. The
message is sent encrypted. But if the first bit leaks, the adversary knows whether I want to buy
or sell stock 1, which may be something I don’t want to reveal. If the sum of the bits leaks, the
adversary knows how many stocks I am buying.
Granted, this might not be a problem at all if the data were in a different format. However,
making assumptions, or requirements, on how users format data, or how they use it, is a bad and
dangerous approach to secure protocol design. An important principle of good cryptographic design
is that the encryption scheme should provide security regardless of the format of the plaintext. Users
should not have to worry about how they format their data: they format it as they like, and
encryption should provide privacy nonetheless.
Put another way, as designers of security protocols, we should not make assumptions about
data content or formats. Our protocols must protect any data, no matter how formatted. We view
it as the job of the protocol designer to ensure this is true.
At this point it should start becoming obvious that there is an infinite list of insecurity proper-
ties, and we can hardly attempt to characterize security as their absence. We need to think about
security in a different and more direct way and arrive at some definition of it.
This important task is surprisingly neglected in many treatments of cryptography, which will
provide you with many schemes and attacks, but never actually define the goal by saying what
an encryption scheme is actually trying to achieve and when it should be considered secure rather
than merely not known to be insecure. This is the task that we want to address.
One might want to say something like: the encryption scheme is secure if given C, the adversary
has no idea what M is. This however cannot be true, because of what is called a priori information.
Often, something about the message is known. For example, it might be a packet with known
headers. Or, it might be an English word. So the adversary, and everyone else, has some information
5.4.1 Definition
The basic idea behind indistinguishability (or, more fully, left-or-right indistinguishability under
a chosen-plaintext attack ) is to consider an adversary (not in possession of the secret key) who
chooses two messages of the same length. Then one of the two messages is encrypted, and the
ciphertext is given to the adversary. The scheme is considered secure if the adversary has a hard
time telling which of the two messages was the one encrypted.
We will actually give the adversary a little more power, letting her choose a whole sequence of
pairs of equal-length messages. Let us now detail the game.
The adversary chooses a sequence of pairs of messages, (M0,1 , M1,1 ), . . . , (M0,q , M1,q ), where,
in each pair, the two messages have the same length. We give to the adversary a sequence of
ciphertexts C1 , . . . , Cq where either (1) Ci is an encryption of M0,i for all 1 ≤ i ≤ q or, (2) Ci is
an encryption of M1,i for all 1 ≤ i ≤ q. In doing the encryptions, the encryption algorithm uses
the same key but fresh coins, or an updated state, each time. The adversary gets the sequence of
ciphertexts and now it must guess whether M0,1 , . . . , M0,q were encrypted or M1,1 , . . . , M1,q were
encrypted.
To further empower the adversary, we let it choose the sequence of message pairs via a chosen
plaintext attack. This means that the adversary chooses the first pair, then receives C1 , then
chooses the second pair, receives C2 , and so on. (Sometimes this is called an adaptive chosen-
plaintext attack, because the adversary can adaptively choose each query in a way responsive to
the earlier answers.)
Let us now formalize this. We fix some encryption scheme SE = (K, E, D). It could be either
stateless or stateful. We consider an adversary A. It is a program which has access to an oracle
that we call LR (left or right) oracle. A can provide as input any pair of equal-length messages.
The oracle will return a ciphertext. We will consider two possible ways in which this ciphertext
is computed by the oracle, corresponding to two possible “worlds” in which the adversary “lives”.
In the “right” world, the oracle, given query M0 , M1 , runs E with key K and input M1 to get a
ciphertext C which it returns. In the “left” world, the oracle, given M0 , M1 , runs E with key K and
input M0 to get a ciphertext C which it returns. The problem for the adversary is, after talking
to its oracle for some time, to tell which of the two oracles it was given. The formalization uses
the games Left_SE and Right_SE of Fig. 5.6. Each game begins by picking an encryption key
K at random. (The key is not returned to the adversary.) Each game then defines an LR oracle, to which
the adversary may make multiple queries, each query being a pair of equal-length strings. The
adversary outputs a bit, which becomes the output of the game.
First assume the given symmetric encryption scheme SE is stateless. The oracle, in either
world, is probabilistic, because it calls the encryption algorithm. Recall that this algorithm is
probabilistic. Above, when we say $C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M_b)$, it is implicit that the oracle picks its own random
coins and uses them to compute ciphertext C.
The random choices of the encryption function are somewhat “under the rug” here, not being
explicitly represented in the notation. But these random bits should not be forgotten. They are
central to the meaningfulness of the notion and the security of the schemes.
If the given symmetric encryption scheme SE is stateful, the oracles, in either world, become
stateful, too. (Think of a subroutine that maintains a “static” variable across successive calls.)
An oracle begins with a state value initialized to a value specified by the encryption scheme. For
example, in CTRC mode, the state is an integer ctr that is initialized to 0. Now, each time the
oracle is invoked, it computes EK (Mb ) according to the specification of algorithm E. The algorithm
may, as a side-effect, update the state, and upon the next invocation of the oracle, the new state
value will be used.
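Definition 5.4.1 Let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be a symmetric encryption scheme, and let A be an algorithm
that has access to an oracle. We consider two games as described in Fig. 5.6. The ind-cpa advantage
of A is defined as
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = \Pr\left[\mathrm{Right}^A_{\mathcal{SE}} \Rightarrow 1\right] - \Pr\left[\mathrm{Left}^A_{\mathcal{SE}} \Rightarrow 1\right] .
\]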
As the above indicates, the choice of which world we are in is made just once, at the beginning,
before the adversary starts to interact with the oracle. In the left world, all message pairs sent to
the oracle are answered by the oracle encrypting the left message in the pair, while in the right
world, all message pairs are answered by the oracle encrypting the right message in the pair. The
choice of which does not flip-flop from oracle query to oracle query.
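The left and right worlds can be sketched as a small harness. Everything scheme-specific here is an assumption made for illustration: the "encryption scheme" is a deliberately weak deterministic toy (XOR with a repeating 4-byte key), and `run_game` and `adversary` are hypothetical names. The point is only to show how the world is fixed once per game and how an adversary's output bits give its advantage.

```python
import secrets


def keygen():
    """Toy key generation: a random 4-byte key."""
    return secrets.token_bytes(4)


def encrypt(key, message):
    """Toy deterministic 'encryption' (insecure): XOR with the repeating key."""
    return bytes(m ^ key[i % len(key)] for i, m in enumerate(message))


def run_game(world, adversary):
    """Run the Left or Right game: the world is chosen once, before any query."""
    key = keygen()                                   # never given to the adversary

    def lr(m0, m1):
        if len(m0) != len(m1):
            raise ValueError("queries must be pairs of equal-length messages")
        return encrypt(key, m1 if world == "right" else m0)

    return adversary(lr)                             # the adversary's bit is the game's output


def adversary(lr):
    """Two queries: the left messages are equal, the right messages differ."""
    zeros, ones = bytes(4), b"\xff" * 4
    c1 = lr(zeros, zeros)
    c2 = lr(zeros, ones)
    return 1 if c1 != c2 else 0
```

Against this toy scheme the adversary outputs 1 in the right world and 0 in the left world on every run, so its ind-cpa advantage is 1 − 0 = 1. The attack uses nothing but the fact that the toy encryption is deterministic and stateless.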
If $\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A)$ is small (meaning close to zero), it means that A is outputting 1 about as often
in world 0 as in world 1, meaning it is not doing a good job of telling which world it is in. If this
quantity is large (meaning close to one—or at least far from zero) then the adversary A is doing
well, meaning our scheme SE is not secure, at least to the extent that we regard A as “reasonable.”
Informally, for symmetric encryption scheme SE to be secure against chosen plaintext attack,
the ind-cpa advantage of an adversary must be small, no matter what strategy the adversary tries.
However, we have to be realistic in our expectations, understanding that the advantage may grow
as the adversary invests more effort in its attack. Security is a measure of how large the advantage
of the adversary might be when compared against the adversary’s resources.
We consider an encryption scheme to be “secure against chosen-plaintext attack” if an adversary
restricted to using a “practical” amount of resources (computing time, number of queries) cannot
obtain “significant” advantage. The technical notion is called left-or-right indistinguishability under
chosen-plaintext attack, denoted IND-CPA.
We discuss some important conventions regarding the resources of adversary A. The running
time of an adversary A is the worst case execution time of A over all possible coins of A and all
conceivable oracle return values (including return values that could never arise in the experiments
used to define the advantage). Oracle queries are understood to return a value in unit time, but
it takes the adversary one unit of time to read any bit that it chooses to read. By convention, the
running time of A also includes the size of the code of the adversary A, in some fixed RAM model
of computation. This convention for measuring time complexity is the same as used in other parts
of these notes, for all kinds of adversaries.
Other resource conventions are specific to the IND-CPA notion. When the adversary asks its
left-or-right encryption oracle a query (M0, M1) we say that the length of this query is max(|M0|, |M1|).
(This will equal |M0 | since we require that any query consist of equal-length messages.) The total
length of queries is the sum of the length of each query. We can measure query lengths in bits or
in blocks, with a block having some understood number of bits n.
The resources of the adversary we will typically care about are three. First, its time-complexity,
measured according to the convention above. Second, the number of oracle queries, meaning the
number of message pairs the adversary asks of its oracle. These messages may have different lengths,
and our third resource measure is the sum of all these lengths, denoted µ, again measured according
to the convention above.
The Proposition says that this rescaled advantage is exactly the same measure as before.
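Proof of Proposition 5.4.2: We let Pr [·] be the probability of event “·” in the game
\(\mathrm{IND\text{-}CPA}^{A}_{\mathcal{SE}}\), and refer below to quantities in this game. The claim
of the Proposition follows by a straightforward calculation:
\begin{align*}
\Pr\left[\mathrm{IND\text{-}CPA}^{A}_{\mathcal{SE}} \Rightarrow \mathsf{true}\right]
  &= \Pr[b = b'] \\
  &= \Pr[b = b' \mid b = 1] \cdot \Pr[b = 1] + \Pr[b = b' \mid b = 0] \cdot \Pr[b = 0] \\
  &= \Pr[b = b' \mid b = 1] \cdot \frac{1}{2} + \Pr[b = b' \mid b = 0] \cdot \frac{1}{2} \\
  &= \Pr[b' = 1 \mid b = 1] \cdot \frac{1}{2} + \Pr[b' = 0 \mid b = 0] \cdot \frac{1}{2} \\
  &= \Pr[b' = 1 \mid b = 1] \cdot \frac{1}{2} + \left(1 - \Pr[b' = 1 \mid b = 0]\right) \cdot \frac{1}{2} \\
  &= \frac{1}{2} + \frac{1}{2} \cdot \left(\Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0]\right) \\
  &= \frac{1}{2} + \frac{1}{2} \cdot \left(\Pr\left[\mathrm{Right}^{A}_{\mathcal{SE}} \Rightarrow 1\right]
     - \Pr\left[\mathrm{Left}^{A}_{\mathcal{SE}} \Rightarrow 1\right]\right) \\
  &= \frac{1}{2} + \frac{1}{2} \cdot \mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) \,.
\end{align*}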
We began by expanding the quantity of interest via standard conditioning. The term of 1/2 in the
third line emerged because the choice of b is made at random. In the fourth line we noted that if we
are asking whether b = b′ given that we know b = 1, it is the same as asking whether b′ = 1 given b =
1, and analogously for b = 0. In the fifth and sixth lines we just manipulated the probabilities
and simplified. The next line is important; here we observed that the conditional probabilities in
question are exactly the probabilities that A returns 1 in the games of Definition 5.4.1.
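As a sanity check on the calculation above, here is a minimal Python sketch that evaluates both sides of the identity with exact rational arithmetic. The two probabilities `p_right` and `p_left` (the chance that a hypothetical adversary outputs 1 in the right and left worlds) are made-up illustrative values, not taken from any scheme in these notes.

```python
from fractions import Fraction

def guess_probability(p_right, p_left):
    """Pr[b = b'] when the challenge bit b is a fair coin.

    p_right = Pr[b' = 1 | b = 1], p_left = Pr[b' = 1 | b = 0].
    """
    half = Fraction(1, 2)
    return half * p_right + half * (1 - p_left)

def advantage(p_right, p_left):
    """ind-cpa advantage: Pr[Right => 1] - Pr[Left => 1]."""
    return p_right - p_left

# Hypothetical adversary behaviour, chosen only for illustration.
p_right, p_left = Fraction(3, 4), Fraction(1, 8)

lhs = guess_probability(p_right, p_left)
rhs = Fraction(1, 2) + Fraction(1, 2) * advantage(p_right, p_left)
print(lhs, rhs)
```

Both sides agree for any choice of the two probabilities, which is all the Proposition asserts.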
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = 1 \,.
\]
Adversary A runs in time O(m) and asks just two queries, each of length m.
The requirement being made on the message space is minimal; typical schemes have message spaces
containing all strings of lengths between some minimum and maximum length, possibly restricted
to strings whose length is a multiple of some given value. Note that this Proposition applies to
ECB and is enough to show the latter is insecure.
Proof of Proposition 5.5.1: We must describe the adversary A. Remember that A is given an
lr-encryption oracle LR(·, ·) that takes input a pair of messages and returns an encryption of either
the left or the right message in the pair. The goal of A is to determine which. Our adversary works
like this:
Adversary A
Let X, Y be distinct, m-bit strings in the plaintext space
C1 ← LR(X, Y )
C2 ← LR(Y, Y )
If C1 = C2 then return 1 else return 0
Why? In the right world, the oracle returns C1 = EK (Y ) and C2 = EK (Y ), and since the encryption
function is deterministic and stateless, C1 = C2 , so A returns 1. In the left world, the oracle returns
C1 = EK (X) and C2 = EK (Y ), and since it is required that decryption be able to recover the
message, it must be that C1 ≠ C2 . So A returns 0.
Subtracting, we get \(\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = 1 - 0 = 1\). And A
achieved this advantage by making two oracle queries, each of whose length, which as per our
conventions is just the length of the first message, is m bits.
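The attack can be tried out on a toy instance. The sketch below is an illustration under stated assumptions, not the scheme of these notes: the "blockcipher" is a keyed random permutation on single bytes built with Python's `random` module (useless for real security), and `ecb_encrypt`, `make_lr_oracle` and `adversary` are hypothetical names. What it shares with the Proposition is the only property that matters: encryption is deterministic and stateless.

```python
import random

def make_block_permutation(key):
    """Toy keyed permutation on bytes (a stand-in for a blockcipher)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table

def ecb_encrypt(key, message):
    """Deterministic, stateless encryption: permute every byte independently."""
    table = make_block_permutation(key)
    return bytes(table[byte] for byte in message)

def make_lr_oracle(key, b):
    """LR oracle: encrypts the left message if b == 0, the right one if b == 1."""
    def lr(m0, m1):
        assert len(m0) == len(m1)
        return ecb_encrypt(key, m1 if b == 1 else m0)
    return lr

def adversary(lr, m=4):
    """The adversary A from the proof: two queries, compare the ciphertexts."""
    x = bytes(m)           # X = 00...0
    y = bytes([1]) * m     # Y differs from X in every byte
    c1 = lr(x, y)
    c2 = lr(y, y)
    return 1 if c1 == c2 else 0

key = 2005
right_output = adversary(make_lr_oracle(key, 1))
left_output = adversary(make_lr_oracle(key, 0))
print(right_output, left_output)
```

The adversary never looks at the key, so the same two outputs arise for every key, which is why the advantage is exactly 1.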
Adversary A
    M0,1 ← 0^n ; M1,1 ← 0^n
    M0,2 ← 0^n ; M1,2 ← 0^{n−1}1
    C1 ←$ LR(M0,1 , M1,1 )
    C2 ←$ LR(M0,2 , M1,2 )
    If C1 [1] = C2 [1] then return 1 else return 0
We claim that
\[
\Pr\left[\mathrm{Right}^{A}_{\mathcal{SE}} \Rightarrow 1\right] = 1
\quad\text{and}\quad
\Pr\left[\mathrm{Left}^{A}_{\mathcal{SE}} \Rightarrow 1\right] = 0 \,.
\]
Why? First consider the left world. In that case C1 [0] = 0 and C2 [0] = 1 and C1 [1] = EK (0) and
C2 [1] = EK (1), and so C1 [1] ≠ C2 [1] and A returns 0. On the other hand, if we are in the right world,
then C1 [0] = 0 and C2 [0] = 1 and C1 [1] = EK (0) and C2 [1] = EK (0), so C1 [1] = C2 [1] and A returns 1.
(The blocks C1 [0] and C2 [0] differ in both worlds, which is why A compares only C1 [1] and C2 [1].)
Subtracting, we get \(\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = 1 - 0 = 1\), showing
that A has a very high advantage. Moreover, A is practical, using very few resources. So the scheme
is insecure.
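The two-query attack can also be run on a toy instance. The sketch below assumes the scheme under attack is CBC with a counter used as the IV, which is what the values C1[0] = 0 and C2[0] = 1 in the argument suggest; the class `CBCCounter`, the 8-bit block size and the table-based permutation are illustrative assumptions only.

```python
import random

N = 8  # toy block length in bits (assumption; a real blockcipher uses e.g. 128)

class CBCCounter:
    """Stateful CBC encryption whose IV is a counter starting at 0."""

    def __init__(self, key):
        # Toy keyed permutation on N-bit blocks, standing in for E_K.
        self.table = list(range(2 ** N))
        random.Random(key).shuffle(self.table)
        self.ctr = 0

    def encrypt(self, blocks):
        c = [self.ctr]                           # C[0] is the counter value
        for m in blocks:
            c.append(self.table[c[-1] ^ m])      # C[i] = E_K(C[i-1] xor M[i])
        self.ctr += 1
        return c

def adversary(lr):
    """Queries (0^n, 0^n) then (0^n, 0^{n-1}1) and compares block 1."""
    c1 = lr([0], [0])
    c2 = lr([0], [1])
    return 1 if c1[1] == c2[1] else 0

def run(b, key=7):
    """Run A in the left world (b = 0) or the right world (b = 1)."""
    scheme = CBCCounter(key)
    return adversary(lambda m0, m1: scheme.encrypt(m1 if b == 1 else m0))

print(run(1), run(0))
```

In the right world the second plaintext block cancels the counter increment, so both queries feed the same input to the permutation; in the left world the inputs differ, and a permutation maps them to different outputs.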
random, and then let Ci be a random encryption of Mi . Adaptively querying, the adversary
obtains the vector of ciphertexts (C1 , . . . , Cq ). Now the adversary tries to find a function f such
that it can do a good job at predicting f (M1 , . . . , Mq ). Doing a good job means predicting this value
significantly better than how well the adversary would predict it had it been given no information
about M1 , . . . , Mq , that is, had each Ci been the encryption not of Mi but of a random point Mi′
drawn from the message space ℳi . The formal definition now follows.
    (f, Y ) ←$ A(s)                              (f, Y ) ←$ A(s)
    return f (M1 , . . . , Mq ) = Y              return f (M1 , . . . , Mq ) = Y
In the definition above, each experiment initializes its oracle by choosing a random key K. A
total of q times, the adversary chooses a message space ℳi . The message space is specified by an
always-halting probabilistic algorithm, written in some fixed programming language. The code for
this algorithm is what the adversary actually outputs. Each time a message space is output, two
random samples, Mi and Mi′ , are drawn from it. We expect Mi and Mi′ to have
the same length, and if they don’t we “erase” both strings. The encryption of one of these messages
will be returned to the adversary. Which string gets encrypted depends on the experiment: Mi for
experiment 1 and Mi′ for experiment 0. By f we denote a deterministic function. It is described by
an always-halting program and, as before, it is actually the program for f that the adversary outputs.
By Y we denote a string. The string s represents saved state that the adversary may wish to retain.
In speaking of the running time of A, we include, beyond the actual running time, the maximal
time to draw two samples from each message space ℳ that A outputs, and we include the maximal
time to compute f (M1 , . . . , Mq ) over any vector of strings. In speaking of the length of A’s queries
we sum, over all the message spaces output by A, the maximal length of a string M output with
nonzero probability by ℳ, and we sum also over the lengths of the encodings of each message
space, function f , and string Y output by A.
We emphasize that the above would seem to be an exceptionally strong notion of security. We
have given the adversary the ability to choose the message spaces from which each message will be
drawn. We have let the adversary choose the partial information about the messages that it finds
convenient to predict. We have let the adversary be fully adaptive. We have built in the ability to
perform a chosen-message attack (simply by producing an algorithm M that samples one and only
one point). Despite all this, we now show that security in the indistinguishability sense implies
semantic security.
and where B runs in time t + O(µ) and asks at most q queries, these queries totaling µ bits.
algorithm B^g
    s ← ε
    for i ← 1 to q do
        (ℳi , s) ←$ A(s)
        Mi , Mi′ ←$ ℳi
Suppose first that g is instantiated by a right encryption oracle, that is, an oracle that returns
C ←$ EK (M ) in response to a query (M ′ , M ). Then the algorithm above coincides with experiment
\(\mathbf{Exp}^{\mathrm{ss\text{-}cpa\text{-}1}}_{\mathcal{SE}}(A)\).
Similarly, if g is instantiated by a left encryption oracle, that is, an oracle that returns
C ←$ EK (M ′ ) in response to a query (M ′ , M ), then the algorithm above coincides with experiment
\(\mathbf{Exp}^{\mathrm{ss\text{-}cpa\text{-}0}}_{\mathcal{SE}}(A)\).
It follows that \(\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(B) =
\mathbf{Adv}^{\mathrm{sem\text{-}cpa}}_{\mathcal{SE}}(A)\). To complete the proof, note that B’s running
time is A’s running time plus O(µ) and B asks a total of q queries, these having total length at
most the total length of A’s queries, under our convention.
Theorem 5.7.1 [Security of CTRC mode] Let \(F : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n\) be a family
of functions and let SE = (K, E, D) be the corresponding CTRC symmetric encryption scheme as described
in Scheme 5.2.6. Let A be an adversary (for attacking the IND-CPA security of SE) that runs in
time at most t and asks at most q queries, these totaling at most σ n-bit blocks. Then there exists
an adversary B (attacking the PRF security of F ) such that
Furthermore B runs in time at most t′ = t + O(q + nσ) and asks at most q ′ = σ oracle queries.
Theorem 5.7.2 [Security of CTR$ mode] Let \(F : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n\) be a blockcipher
and let SE = (K, E, D) be the corresponding CTR$ symmetric encryption scheme as described in
Scheme 5.2.5. Let A be an adversary (for attacking the IND-CPA security of SE) that runs in time
at most t and asks at most q queries, these totaling at most σ n-bit blocks. Then there exists an
adversary B (attacking the PRF security of F ) such that
The above theorems exemplify the kinds of results that the provable-security approach is about.
Namely, we are able to provide provable guarantees of security of some higher level cryptographic
construct (in this case, a symmetric encryption scheme) based on the assumption that some building
block (in this case an underlying blockcipher) is secure. The above results are the first example of the
“punch-line” we have been building towards. So it is worth pausing at this point and trying to
make sure we really understand what these theorems are saying and what their implications are.
If we want to entrust our data to some encryption mechanism, we want to know that this
encryption mechanism really provides privacy. If it is ill-designed, it may not. We saw this happen
with ECB. Even if we used a secure blockcipher, the flaws of ECB mode make it an insecure
encryption scheme.
Flaws are not apparent in CTR at first glance. But maybe they exist. It is very hard to see how
one can be convinced they do not exist, when one cannot possibly exhaust the space of all possible
attacks that could be tried. Yet this is exactly the difficulty that the above theorems circumvent.
They are saying that CTR mode does not have design flaws. They are saying that as long as you use
a good blockcipher, you are assured that nobody will break your encryption scheme. One cannot
ask for more, since if one does not use a good blockcipher, there is no reason to expect security of
your encryption scheme anyway. We are thus getting a conviction that all attacks fail even though
we do not even know exactly how these attacks might operate. That is the power of the approach.
Now, one might appreciate that the ability to make such a powerful statement takes work. It
is for this that we have put so much work and time into developing the definitions: the formal
notions of security that make such results meaningful. For readers who have less experience with
definitions, it is worth knowing, at least, that the effort is worth it. It takes time and work to
understand the notions, but the payoffs are big: you get significant guarantees of security.
How, exactly, are the theorems saying this? The above discussion has pushed under the rug
the quantitative aspect that is an important part of the results. It may help to look at a concrete
example.
Example 5.7.3 Let us suppose that F is the blockcipher AES, so that n = 128. Suppose I want
to encrypt q = \(2^{30}\) messages, each being one kilobyte (\(2^{13}\) bits) long. I am thus encrypting a total
of \(2^{43}\) bits, which is to say σ = \(2^{36}\) blocks. (This is about one terabyte.) Can I do this securely
using CTR$? Let A be an adversary attacking the privacy of my encryption. Theorem 5.7.2 says
that there exists a B satisfying the stated conditions. How large can
\(\mathbf{Adv}^{\mathrm{prf}}_{\mathrm{AES}}(B)\) be? It makes q = \(2^{36}\) queries, and it is consistent
with our state of knowledge of the security of AES to assume that such an adversary cannot do better
than mount a birthday attack, meaning its advantage is no more than \(q^2/2^{128}\). Under such an
assumption, the theorem tells us that \(\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A)\) is at
most \(\sigma^2/2^{128} + 0.5\,\sigma^2/2^{128} = 1.5 \cdot 2^{72}/2^{128} \le 1/2^{55}\). This is a very
small number indeed, saying that our encryption is secure, at least under the assumption that the best
attack on the PRF security of AES is a birthday attack. Note however that if we encrypt \(2^{64}\) blocks
of data, our provable-security bound becomes meaningless.
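The arithmetic of this example is easy to reproduce. The sketch below uses exact fractions; the function name `ctr_dollar_bound` is a hypothetical helper, and the formula it evaluates is just the expression used in the example, including the assumed birthday bound σ²/2ⁿ on the PRF advantage.

```python
from fractions import Fraction

def ctr_dollar_bound(sigma, n):
    """sigma^2/2^n (assumed birthday bound on the PRF advantage)
    plus 0.5 * sigma^2/2^n, as in the example."""
    return Fraction(sigma ** 2, 2 ** n) + Fraction(sigma ** 2, 2 ** (n + 1))

n = 128                                  # AES block length in bits
q_messages = 2 ** 30                     # number of messages
bits_per_message = 2 ** 13               # one kilobyte
total_bits = q_messages * bits_per_message
sigma = total_bits // n                  # total number of n-bit blocks

bound = ctr_dollar_bound(sigma, n)
print(sigma == 2 ** 36, bound <= Fraction(1, 2 ** 55))

# With 2^64 blocks the same expression exceeds 1, so it says nothing.
print(float(ctr_dollar_bound(2 ** 64, n)))
```

Since an advantage can never exceed 1, a bound of 1.5 carries no information, which is the sense in which the guarantee becomes meaningless at 2⁶⁴ blocks.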
The example illustrates how to use the theorems to figure out how much security you will get from
the CTR encryption scheme in a given application.
Note that as per the above theorems, encrypting more than σ = \(2^{n/2}\) blocks of data with CTR$
is not secure regardless of the quality of F as a PRF. On the other hand, with CTRC, it might be
secure, as long as F can withstand σ queries. This is an interesting and possibly useful distinction.
Yet, in the setting in which such modes are usually employed, the distinction all but vanishes.
For usually F is a blockcipher and in that case, we know from the birthday attack that the
prf-advantage of B may itself be as large as \(\Theta(\sigma^2/2^n)\), and thus, again, encrypting more
than σ = \(2^{n/2}\) blocks of data is not secure. However, we might be able to find or build function
families F that are not families of permutations and preserve PRF security against adversaries making
more than \(2^{n/2}\) queries.
Proof of Theorem 5.7.1: We consider the games G0 , G1 , G2 of Fig. 5.8. We now build
prf-adversary B so that
\[
\Pr\left[G_0^A \Rightarrow 1\right] - \Pr\left[G_1^A \Rightarrow 1\right]
  = \mathbf{Adv}^{\mathrm{prf}}_{F}(B) . \tag{5.1}
\]
Additionally, B’s resource usage (running time and query count) is as claimed in the theorem.
Remember that B has access to an oracle \(\mathrm{Fn} : \{0,1\}^n \to \{0,1\}^n\). B runs A, answering any queries
that A makes to its LR oracle via the subroutine LRSim. This subroutine uses B’s own oracle Fn.
Adversary B
    b ←$ {0, 1}; ctr ← 0
    b′ ←$ A^LRSim
    If (b = b′ ) then return 1
    Else return 0

subroutine LRSim(M0 , M1 )
    C[0] ← ctr; m ← ‖Mb ‖n
    for i = 1, ..., m do
        T [ctr + i] ← Fn(ctr + i)
        C[i] ← T [ctr + i] ⊕ Mb [i]
    ctr ← ctr + m
    return C
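The simulation performed by B can be sketched in Python as follows. This is an illustration under assumptions, not the construction itself: blocks are modelled as small integers, the counter is assumed never to wrap, the table T is dropped because each counter value is used only once, and `make_lr_sim`, `prf_adversary_b` and `toy_fn` are hypothetical names.

```python
import random

def make_lr_sim(fn, b):
    """Build B's subroutine LRSim around B's own oracle fn and B's bit b."""
    state = {"ctr": 0}

    def lr_sim(m0, m1):
        mb = m1 if b == 1 else m0            # the message B actually encrypts
        ctr = state["ctr"]
        c = [ctr]                            # C[0] <- ctr
        for i, block in enumerate(mb, start=1):
            c.append(fn(ctr + i) ^ block)    # C[i] <- Fn(ctr + i) xor Mb[i]
        state["ctr"] = ctr + len(mb)         # ctr <- ctr + m
        return c

    return lr_sim

def prf_adversary_b(fn, adversary_a, rng):
    """Adversary B: pick b, run A against LRSim, output 1 iff A guesses b."""
    b = rng.randrange(2)
    b_prime = adversary_a(make_lr_sim(fn, b))
    return 1 if b == b_prime else 0

# Example oracle: an arbitrary fixed function on 8-bit blocks (a stand-in for Fn).
toy_fn = lambda x: (3 * x + 1) % 256

lr_sim = make_lr_sim(toy_fn, b=1)
first = lr_sim([5, 6], [7, 8])    # encrypts the right message [7, 8]
second = lr_sim([1], [2])         # counter has advanced by two blocks
print(first, second)
```

Note that B never needs the key: every block of pad comes from its oracle, so the same code runs unchanged whether the oracle is an instance of F or a random function.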
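Thus
\begin{align*}
\mathbf{Adv}^{\mathrm{prf}}_{F}(B)
  &= \Pr\left[\mathrm{Real}^{B}_{F} \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}^{B}_{F} \Rightarrow 1\right] \\
  &= \Pr\left[G_0^A \Rightarrow 1\right] - \Pr\left[G_1^A \Rightarrow 1\right] ,
\end{align*}
which is Equation (5.1). Writing
\[
\Pr[G_0^A \Rightarrow 1] = \Pr[G_1^A \Rightarrow 1] + \left(\Pr[G_0^A \Rightarrow 1] - \Pr[G_1^A \Rightarrow 1]\right)
\]
and using \(\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = 2 \cdot \Pr[G_0^A \Rightarrow 1] - 1\)
(as in Proposition 5.4.2, since B returns 1 exactly when b = b′ ), we get
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A)
  = 2 \cdot \mathbf{Adv}^{\mathrm{prf}}_{F}(B) + 2 \Pr[G_1^A \Rightarrow 1] - 1 .
\]
Finally,
\[
\Pr[G_1^A \Rightarrow 1] = \frac{1}{2} .
\]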
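Proof: The proof uses games G0 , G1 of Fig. 5.9 and games G2 , G3 , G4 of Fig. 5.10. Then we have
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = 2 \cdot \Pr\left[G_0^A \Rightarrow \mathsf{true}\right] - 1 .
\]
But
\[
\Pr\left[G_0^A \Rightarrow \mathsf{true}\right]
  = \Pr\left[G_1^A \Rightarrow \mathsf{true}\right]
  + \left(\Pr\left[G_0^A \Rightarrow \mathsf{true}\right] - \Pr\left[G_1^A \Rightarrow \mathsf{true}\right]\right) .
\]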
Additionally, B’s resource usage (query count and running time) is as claimed in the theorem.
The prf-adversary B works as follows:
Adversary B
    b ←$ {0, 1} ; S ← ∅
    b′ ←$ A^LRSim
    if (b = b′ ) then return 1
    else return 0

subroutine LRSim(M0 , M1 )
    m ← ‖Mb ‖n ; C[0] ←$ {0, 1}^n
    for i = 1, ..., m do
        P ← C[i − 1] ⊕ Mb [i]
        if P ∉ S then T [P ] ← Fn(P )
        C[i] ← T [P ]
        S ← S ∪ {P }
    return C
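So we have
\[
\Pr\left[G_0^A \Rightarrow \mathsf{true}\right] - \Pr\left[G_1^A \Rightarrow \mathsf{true}\right]
  = \Pr\left[\mathrm{Real}^{B}_{E} \Rightarrow 1\right] - \Pr\left[\mathrm{Rand}^{B}_{E} \Rightarrow 1\right]
  \le \mathbf{Adv}^{\mathrm{prf}}_{E}(B) .
\]
We are going to prove that
\[
\Pr\left[G_1^A \Rightarrow \mathsf{true}\right] \le \frac{1}{2} + \sigma^2 \cdot 2^{-n-1} .
\]
Assuming this for now, we will have
\begin{align*}
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A)
  &\le 2 \cdot \left(\frac{1}{2} + \frac{\sigma^2}{2^{n+1}}\right) - 1 + 2\,\mathbf{Adv}^{\mathrm{prf}}_{E}(B) \\
  &= \frac{\sigma^2}{2^{n}} + 2\,\mathbf{Adv}^{\mathrm{prf}}_{E}(B) .
\end{align*}
In the following, we use games G2 , G3 , G4 of Fig. 5.10.
\begin{align*}
\Pr[G_1^A \Rightarrow \mathsf{true}]
  &= \Pr[G_2^A \Rightarrow \mathsf{true}] \\
  &= \Pr[G_3^A \Rightarrow \mathsf{true}]
   + \left(\Pr[G_2^A \Rightarrow \mathsf{true}] - \Pr[G_3^A \Rightarrow \mathsf{true}]\right)
\end{align*}
• \(\Pr[G_3^A \Rightarrow \mathsf{true}] = \frac{1}{2}\)
• \(\Pr[G_2^A \Rightarrow \mathsf{true}] - \Pr[G_3^A \Rightarrow \mathsf{true}] \le \frac{\sigma^2}{2^{n+1}}\)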
Game G2 , G3 Game G4
procedure Finalize(b′ )
return (b = b′ )
Figure 5.10: Games used to prove Theorem 5.8.1.
\[
\Pr\left[G_3^A \text{ sets } \mathsf{bad}\right] = \Pr\left[G_4^A \text{ sets } \mathsf{bad}\right]
\]
Proposition 5.8.2 Let n ≥ 1, let \(E : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n\) be a function family,
and let \(\sigma \in [0 \,..\, \sqrt{2}\, 2^{n/2} - 1]\). Then there is an adversary A that asks a single
query, the query consisting of σ blocks, runs in time O(nσ lg(σ)), and achieves advantage
\(\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathrm{CBC}[E]}(A) \ge 0.15\, \sigma^2/2^n\) and
given a left encryption-oracle that oracle is encrypting the zero-string and any time
\(C_i = C_I\) we must have that \(C_{i+1} = C_{I+1}\) as well. Thus
\begin{align*}
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathrm{CBC}[E]}(A)
  &= \Pr\left[A^{\mathrm{Right}(\cdot,\cdot)} \Rightarrow 1\right] \\
  &= \Pr\big[\, R \xleftarrow{\$} \{0,1\}^{n\sigma};\ K \xleftarrow{\$} \mathcal{K};\
       \mathrm{IV} \xleftarrow{\$} \{0,1\}^{n};\ C \xleftarrow{\$} \mathrm{CBC}^{\mathrm{IV}}_{K}(R) : \\
  &\qquad \exists\, i < I \text{ s.t. } C_i = C_I \text{ and } C_{i+1} \ne C_{I+1}
       \text{ on the first such } (i, I) \,\big]
\end{align*}
By the structure of CBC mode with a random IV it is easy to see that when you encrypt a
random string \(R \in \{0,1\}^{n\sigma}\) you get a random string \(C \in \{0,1\}^{n(\sigma+1)}\). To see
this, note that to make block \(C_i\), for i ≥ 1, you xor the random block \(R_i\) with \(C_{i-1}\) and
apply the blockcipher. The random block \(R_i\) is independent of \(C_{i-1}\), since it wasn’t even
consulted in making \(C_{i-1}\), and it is independent of all of \(C_0, \ldots, C_{i-1}\), too. The image
of a uniformly selected value under a permutation is uniform. The very first block of ciphertext,
\(C_0\), is uniform. This makes the entire string \(C_0 C_1 \cdots C_\sigma\) uniform. So
the probability in question is
\begin{align*}
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathrm{CBC}[E]}(A)
  &= \Pr\big[\, C \xleftarrow{\$} \{0,1\}^{n(\sigma+1)} : \\
  &\qquad \exists\, i < I \text{ s.t. } C_i = C_I \text{ and } C_{i+1} \ne C_{I+1}
       \text{ on the first such } (i, I) \,\big]
\end{align*}
Now the birthday bound (Appendix A, Theorem A.0.1) tells us that the probability there will be
an i < I such that \(C_i = C_I\) is at least \(C(2^n, \sigma + 1) \ge 0.3\,\sigma^2/2^n\). When there is
such an i, I and we fix the lexicographically first such i, I, note that \(C_{I+1}\) is still uniform and
independent of \(C_{i+1}\). Independence is assured because \(C_{I+1}\) is obtained as
\(E_K(R_{I+1} \oplus C_I)\) for a permutation \(E_K\) and a uniform random value \(R_{I+1}\) that is
independent of \(C_I\) and \(C_{i+1}\). Because of this probabilistic independence, the probability of
the conjunct is just the product of the probabilities and we have that
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathrm{CBC}[E]}(A)
  \ge 0.3\, \sigma^2/2^n \cdot (1 - 2^{-n}) \ge 0.15\, \sigma^2/2^n ,
\]
completing the proof.
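The two numerical inequalities used at the end of this proof can be checked exactly for small parameters. The sketch below computes the collision probability C(N, q) of q uniform throws into N bins directly from its definition; the toy values n = 8 and σ = 20 are assumptions chosen so that σ lies in the range allowed by the Proposition.

```python
from fractions import Fraction

def collision_probability(num_bins, num_balls):
    """Exact C(N, q): chance that q uniform throws into N bins contain a collision."""
    no_collision = Fraction(1)
    for i in range(num_balls):
        no_collision *= Fraction(num_bins - i, num_bins)
    return 1 - no_collision

n = 8                # toy block length (assumption)
N = 2 ** n
sigma = 20           # number of blocks; sigma <= sqrt(2) * 2^(n/2) - 1, about 21.6

exact = collision_probability(N, sigma + 1)          # sigma + 1 ciphertext blocks
lower = Fraction(3, 10) * sigma ** 2 / N             # 0.3 * sigma^2 / 2^n
final = lower * (1 - Fraction(1, N))                 # times (1 - 2^-n)
target = Fraction(15, 100) * sigma ** 2 / N          # 0.15 * sigma^2 / 2^n

print(float(exact), float(lower), float(final), float(target))
```

For these parameters the exact collision probability comfortably exceeds the 0.3σ²/2ⁿ estimate, and the factor (1 − 2⁻ⁿ) costs far less than the factor of two the final inequality gives away.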
How might such a situation arise? One situation one could imagine is that an adversary at some
point gains temporary access to the equipment performing decryption. It can feed the equipment
ciphertexts and see what plaintexts emerge. (We assume it cannot directly extract the key from
the equipment, however.)
If an adversary has access to a decryption oracle, security at first seems moot, since after all it
can decrypt anything it wants. To create a meaningful notion of security, we put a restriction on
the use of the decryption oracle. To see what this is, let us look closer at the formalization. As in
the case of chosen-plaintext attacks, we consider two worlds. In the left world, all message pairs
sent to the oracle are answered by the oracle encrypting the left message in the pair, while in the
right world, all message pairs are answered by the oracle encrypting the right message in the pair.
The choice of which does not flip-flop from oracle query to oracle query.
The adversary’s goal is the same as in the case of chosen-plaintext attacks: it wants to figure out
which world it is in. There is one easy way to do this. Namely, query the lr-encryption oracle on
two distinct, equal length messages M0 , M1 to get back a ciphertext C, and now call the decryption
oracle on C. If the message returned by the decryption oracle is M0 then the adversary is in world 0,
and if the message returned by the decryption oracle is M1 then the adversary is in world 1. The
restriction we impose is simply that this call to the decryption oracle is not allowed. More generally,
call a query C to the decryption oracle illegitimate if C was previously returned by the lr-encryption
oracle; otherwise a query is legitimate. We insist that only legitimate queries are allowed. In the
formalization below, the experiment simply returns 0 if the adversary makes an illegitimate query.
(We clarify that a query C is legitimate if C is returned by the lr-encryption oracle after C was
queried to the decryption oracle.)
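The bookkeeping behind the legitimacy rule can be sketched as follows. The class `CCAGame` and the one-byte XOR "scheme" are illustrative assumptions (the XOR scheme is trivially insecure and serves only to exercise the oracles); the point is how the game records ciphertexts returned by the lr-encryption oracle and forces the output to 0 after an illegitimate decryption query.

```python
class CCAGame:
    """Left-or-right game with a decryption oracle and the legitimacy rule."""

    def __init__(self, encrypt, decrypt, b):
        self.encrypt, self.decrypt, self.b = encrypt, decrypt, b
        self.returned = set()        # ciphertexts handed out by the LR oracle so far
        self.illegitimate = False

    def lr(self, m0, m1):
        assert len(m0) == len(m1)
        c = self.encrypt(m1 if self.b == 1 else m0)
        self.returned.add(c)
        return c

    def dec(self, c):
        if c in self.returned:       # C was previously returned by the LR oracle
            self.illegitimate = True
            return None
        return self.decrypt(c)

    def finalize(self, guess):
        """The experiment returns 0 if any illegitimate query was made."""
        return 0 if self.illegitimate else guess

# Toy scheme, for exercising the game only: XOR every byte with a fixed key byte.
toy_encrypt = lambda m: bytes(x ^ 0x5A for x in m)
toy_decrypt = toy_encrypt

cheater = CCAGame(toy_encrypt, toy_decrypt, b=1)
c = cheater.lr(b"a", b"b")
cheater.dec(c)                       # the forbidden query
print(cheater.finalize(1))           # forced to 0

honest = CCAGame(toy_encrypt, toy_decrypt, b=1)
c = honest.lr(b"a", b"b")
print(honest.dec(bytes([c[0] ^ 1])), honest.finalize(1))
```

Because the check happens at the time of the decryption query, a ciphertext that the lr-encryption oracle happens to return only later does not retroactively make an earlier query illegitimate.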
This restriction still leaves the adversary with a lot of power. Typically, a successful chosen-
ciphertext attack proceeds by taking a ciphertext C returned by the lr-encryption oracle, modifying
it into a related ciphertext C ′ , and querying the decryption oracle with C ′ . The attacker seeks to
create C ′ in such a way that its decryption tells the attacker what the underlying message M was.
We will see this illustrated in Section 5.10 below.
The model we are considering here might seem quite artificial. If an adversary has access to a
decryption oracle, how can we prevent it from calling the decryption oracle on certain messages?
The restriction might arise due to the adversary’s having access to the decryption equipment for
a limited period of time. We imagine that after it has lost access to the decryption equipment, it
sees some ciphertexts, and we are capturing the security of these ciphertexts in the face of previous
access to the decryption oracle. Further motivation for the model will emerge when we see how
encryption schemes are used in protocols. We will see that when an encryption scheme is used
in many authenticated key-exchange protocols the adversary effectively has the ability to mount
chosen-ciphertext attacks of the type we are discussing. For now let us just provide the definition
and exercise it.
The conventions with regard to resource measures are the same as those used in the case of chosen-
plaintext attacks. In particular, the length of a query M0 , M1 to the lr-encryption oracle is defined
as the length of M0 .
We consider an encryption scheme to be “secure against chosen-ciphertext attack” if a “reason-
able” adversary cannot obtain “significant” advantage in distinguishing the cases b = 0 and b = 1
given access to the oracles, where reasonable reflects its resource usage. The technical notion is
called indistinguishability under chosen-ciphertext attack, denoted IND-CCA.
Proposition 5.10.1 Let \(F : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n\) be a family of functions and let
SE = (K, E, D) be the corresponding CTR$ symmetric encryption scheme as described in Scheme 5.2.5. Then
The advantage of this adversary is 1 even though it uses hardly any resources: just one query to
each oracle. That is clearly an indication that the scheme is insecure.
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cca}}_{\mathcal{SE}}(A) = 1 \,.
\]
Adversary A^{LR(·,·), Dec(·)}
    M0 ← 0^n ; M1 ← 1^n
    ⟨r, C⟩ ← LR(M0 , M1 )
    C ′ ← C ⊕ 1^n
    M ← Dec(⟨r, C ′ ⟩)
    If M = M0 then return 1 else return 0
The adversary’s single lr-encryption oracle query is the pair of distinct messages M0 , M1 , each one
block long. It is returned a ciphertext ⟨r, C⟩. It flips the bits of C to get C ′ and then feeds the
ciphertext ⟨r, C ′ ⟩ to the decryption oracle. It bets on world 1 if it gets back M0 , and otherwise on
world 0. Notice that ⟨r, C ′ ⟩ ≠ ⟨r, C⟩, so the decryption query is legitimate. Now, we claim that
\begin{align*}
\Pr\left[\mathrm{Right}^{A}_{\mathcal{SE}} \Rightarrow 1\right] &= 1 \\
\Pr\left[\mathrm{Left}^{A}_{\mathcal{SE}} \Rightarrow 1\right] &= 0 .
\end{align*}
Hence \(\mathbf{Adv}^{\mathrm{ind\text{-}cca}}_{\mathcal{SE}}(A) = 1 - 0 = 1\). And A achieved this
advantage by making just one lr-encryption oracle query, whose length, which as per our conventions
is just the length of M0 , is n bits, and just one decryption oracle query, whose length is 2n bits
(assuming an encoding of ⟨r, X⟩ as n + |X| bits).
So \(\mathbf{Adv}^{\mathrm{ind\text{-}cca}}_{\mathcal{SE}}(t, 1, n, 1, 2n) = 1\).
Why are the two equations claimed above true? You have to return to the definitions of the
quantities in question, as well as the description of the scheme itself, and walk it through. In
world 1, meaning b = 1, let ⟨r, C⟩ denote the ciphertext returned by the lr-encryption oracle. Then
\[
C = F_K(r + 1) \oplus M_1 = F_K(r + 1) \oplus 1^n .
\]
Now notice that
\begin{align*}
M &= \mathcal{D}_K(\langle r, C' \rangle) \\
  &= F_K(r + 1) \oplus C' \\
  &= F_K(r + 1) \oplus C \oplus 1^n \\
  &= F_K(r + 1) \oplus (F_K(r + 1) \oplus 1^n) \oplus 1^n \\
  &= 0^n \\
  &= M_0 .
\end{align*}
Thus, the decryption oracle will return M0 , and A will return 1. In world 0, meaning b = 0, let
⟨r, C⟩ denote the ciphertext returned by the lr-encryption oracle. Then
\[
C = F_K(r + 1) \oplus M_0 = F_K(r + 1) \oplus 0^n .
\]
Now notice that
\begin{align*}
M &= \mathcal{D}_K(\langle r, C' \rangle) \\
  &= F_K(r + 1) \oplus C' \\
  &= F_K(r + 1) \oplus C \oplus 1^n \\
  &= F_K(r + 1) \oplus (F_K(r + 1) \oplus 0^n) \oplus 1^n \\
  &= 1^n \\
  &= M_1 .
\end{align*}
Thus, the decryption oracle will return M1 , and A will return 0, meaning will return 1 with
probability zero.
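The walk-through above can be replayed on a toy instance of CTR$. In this sketch the function family is a random table on 8-bit blocks, messages are a single block, and `make_ctr_scheme`, `adversary` and `run` are hypothetical names; none of this is meant as a faithful implementation of Scheme 5.2.5.

```python
import random

N = 8                   # toy block length in bits (assumption)
MASK = 2 ** N - 1       # the block 1^n

def make_ctr_scheme(key):
    """One-block CTR$ over a toy function F_K given by a random table."""
    rng = random.Random(key)
    f = [rng.randrange(2 ** N) for _ in range(2 ** N)]

    def encrypt(m, coins):
        r = coins.randrange(2 ** N)              # fresh random starting point
        return (r, f[(r + 1) & MASK] ^ m)        # <r, F_K(r+1) xor M>

    def decrypt(ct):
        r, c = ct
        return f[(r + 1) & MASK] ^ c

    return encrypt, decrypt

def adversary(lr, dec):
    """The adversary A of the proof: flip every bit of C, then decrypt."""
    m0, m1 = 0, MASK
    r, c = lr(m0, m1)
    m = dec((r, c ^ MASK))
    return 1 if m == m0 else 0

def run(b, key=11, seed=0):
    encrypt, decrypt = make_ctr_scheme(key)
    coins = random.Random(seed)
    returned = []

    def lr(m0, m1):
        ct = encrypt(m1 if b == 1 else m0, coins)
        returned.append(ct)
        return ct

    def dec(ct):
        assert ct not in returned                # the query must be legitimate
        return decrypt(ct)

    return adversary(lr, dec)

print(run(1), run(0))
```

The attack works because XORing a mask into the ciphertext XORs the same mask into the decrypted plaintext, whatever the pad F_K(r + 1) happens to be.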
An attack on CTRC (cf. Scheme 5.2.6) is similar, and is left to the reader.
Proposition 5.10.2 Let \(E : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n\) be a blockcipher and let
SE = (K, E, D) be the corresponding CBC$ encryption scheme as described in Scheme 5.2.3. Then
The advantage of this adversary is 1 even though it uses hardly any resources: just one query to
each oracle. That is clearly an indication that the scheme is insecure.
Proof of Proposition 5.10.2: We will present an adversary A, having time-complexity t, making
1 query to its lr-encryption oracle, this query being of length n, making 1 query to its decryption
oracle, this query being of length 2n, and having
\[
\mathbf{Adv}^{\mathrm{ind\text{-}cca}}_{\mathcal{SE}}(A) = 1 \,.
\]
Adversary A^{LR(·,·), Dec(·)}
    M0 ← 0^n ; M1 ← 1^n
    ⟨IV, C[1]⟩ ← LR(M0 , M1 )
    IV′ ← IV ⊕ 1^n
    M ← Dec(⟨IV′ , C[1]⟩)
    If M = M0 then return 1 else return 0
The adversary’s single lr-encryption oracle query is the pair of distinct messages M0 , M1 , each one
block long. It is returned a ciphertext ⟨IV, C[1]⟩. It flips the bits of the IV to get a new IV, IV′ ,
and then feeds the ciphertext ⟨IV′ , C[1]⟩ to the decryption oracle. It bets on world 1 if it gets back
M0 , and otherwise on world 0. It is important that ⟨IV′ , C[1]⟩ ≠ ⟨IV, C[1]⟩ so the decryption oracle
query is legitimate. Now, we claim that
\begin{align*}
\Pr\left[\mathrm{Right}^{A}_{\mathcal{SE}} \Rightarrow 1\right] &= 1 \\
\Pr\left[\mathrm{Left}^{A}_{\mathcal{SE}} \Rightarrow 1\right] &= 0 .
\end{align*}
Hence \(\mathbf{Adv}^{\mathrm{ind\text{-}cca}}_{\mathcal{SE}}(A) = 1 - 0 = 1\). And A achieved this
advantage by making just one lr-encryption oracle query, whose length, which as per our conventions
is just the length of M0 , is n bits, and just one decryption oracle query, whose length is 2n bits.
So \(\mathbf{Adv}^{\mathrm{ind\text{-}cca}}_{\mathcal{SE}}(t, 1, n, 1, 2n) = 1\).
Why are the two equations claimed above true? You have to return to the definitions of the
quantities in question, as well as the description of the scheme itself, and walk it through. In
world 1, meaning b = 1, the lr-encryption oracle returns ⟨IV, C[1]⟩ with
\[
C[1] = E_K(\mathrm{IV} \oplus M_1) = E_K(\mathrm{IV} \oplus 1^n) .
\]
Now notice that
\begin{align*}
M &= \mathcal{D}_K(\langle \mathrm{IV}', C[1] \rangle) \\
  &= E_K^{-1}(C[1]) \oplus \mathrm{IV}' \\
  &= E_K^{-1}(E_K(\mathrm{IV} \oplus 1^n)) \oplus \mathrm{IV}' \\
  &= (\mathrm{IV} \oplus 1^n) \oplus \mathrm{IV}' \\
  &= (\mathrm{IV} \oplus 1^n) \oplus (\mathrm{IV} \oplus 1^n) \\
  &= 0^n \\
  &= M_0 .
\end{align*}
Thus, the decryption oracle will return M0 , and A will return 1. In world 0, meaning b = 0, the
lr-encryption oracle returns ⟨IV, C[1]⟩ with
\[
C[1] = E_K(\mathrm{IV} \oplus M_0) = E_K(\mathrm{IV} \oplus 0^n) .
\]
Now notice that
\begin{align*}
M &= \mathcal{D}_K(\langle \mathrm{IV}', C[1] \rangle) \\
  &= E_K^{-1}(C[1]) \oplus \mathrm{IV}' \\
  &= E_K^{-1}(E_K(\mathrm{IV} \oplus 0^n)) \oplus \mathrm{IV}' \\
  &= (\mathrm{IV} \oplus 0^n) \oplus \mathrm{IV}' \\
  &= (\mathrm{IV} \oplus 0^n) \oplus (\mathrm{IV} \oplus 1^n) \\
  &= 1^n \\
  &= M_1 .
\end{align*}
Thus, the decryption oracle will return M1 , and A will return 0, meaning will return 1 with
probability zero.
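The same kind of experiment can be run against a toy one-block CBC$. The sketch below is only an illustration: the blockcipher is a shuffled table on 8-bit blocks with its inverse table, and `make_cbc_scheme`, `adversary` and `run` are hypothetical names rather than anything defined in Scheme 5.2.3.

```python
import random

N = 8                   # toy block length in bits (assumption)
MASK = 2 ** N - 1       # the block 1^n

def make_cbc_scheme(key):
    """One-block CBC$ over a toy keyed permutation E_K and its inverse."""
    table = list(range(2 ** N))
    random.Random(key).shuffle(table)
    inverse = [0] * (2 ** N)
    for x, y in enumerate(table):
        inverse[y] = x

    def encrypt(m, coins):
        iv = coins.randrange(2 ** N)
        return (iv, table[iv ^ m])               # <IV, E_K(IV xor M)>

    def decrypt(ct):
        iv, c = ct
        return inverse[c] ^ iv                   # E_K^{-1}(C[1]) xor IV

    return encrypt, decrypt

def adversary(lr, dec):
    """The adversary A of the proof: flip every bit of the IV, then decrypt."""
    m0, m1 = 0, MASK
    iv, c = lr(m0, m1)
    m = dec((iv ^ MASK, c))
    return 1 if m == m0 else 0

def run(b, key=13, seed=0):
    encrypt, decrypt = make_cbc_scheme(key)
    coins = random.Random(seed)
    returned = []

    def lr(m0, m1):
        ct = encrypt(m1 if b == 1 else m0, coins)
        returned.append(ct)
        return ct

    def dec(ct):
        assert ct not in returned                # the query must be legitimate
        return decrypt(ct)

    return adversary(lr, dec)

print(run(1), run(0))
```

Here the malleability sits in the IV rather than in the ciphertext block: the blockcipher output is left untouched, yet flipping IV bits flips the corresponding bits of the recovered first block.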
5.12 Problems
Problem 29 Formalize a notion of security against key-recovery for symmetric encryption schemes.
Then prove that IND-CPA security implies key-recovery security.
That is, a scheme is IND0-CPA secure if the encryption of every string looks like the encryption
of an equal number of zeros. Here we assume that whenever M is in the message space, so is \(0^{|M|}\).
Prove that this notion of security is equivalent to IND-CPA security, carefully stating a pair of
theorems and proving them.
Problem 31 The definition above for IND0-CPA provides the adversary with no method to get,
with certitude, the encryption of a given message: when the adversary asks a query M , it might
get answered with \(C \xleftarrow{\$} \mathcal{E}_K(M)\) or it might get answered with
\(C \xleftarrow{\$} \mathcal{E}_K(0^{|M|})\). Consider providing
the adversary an additional, “reference” oracle that always encrypts the queried string. Consider
defining the corresponding advantage notion in the natural way: for an encryption scheme SE =
(K, E, D), let
\[
\mathbf{Adv}^{\mathrm{ind0\text{-}cpa}*}_{\mathcal{SE}}(A)
  = \Pr\left[K \xleftarrow{\$} \mathcal{K} : A^{\mathcal{E}_K(\cdot),\, \mathcal{E}_K(\cdot)} \Rightarrow 1\right]
  - \Pr\left[K \xleftarrow{\$} \mathcal{K} : A^{\mathcal{E}_K(0^{|\cdot|}),\, \mathcal{E}_K(\cdot)} \Rightarrow 1\right] .
\]
State and prove a theorem that shows that this notion of security is equivalent to our original
IND0-CPA notion (and therefore to IND-CPA).
algorithm E^(m)_K (M )
    M [1] . . . M [m] ← M
    for i ← 1 to m do
        C[i] ← EK (M [i])
    return C ← ⟨C[1], . . . , C[m]⟩

algorithm D^(m)_K (C)
    C[1] · · · C[m] ← C
    for i ← 1 to m do
        M [i] ← DK (C[i])
        if M [i] = ⊥ then return ⊥
    return M ← ⟨M [1], . . . , M [m]⟩
m ≥ 1, we define a new symmetric encryption scheme SE^(m) = (K, E^(m) , D^(m) ) having the same
key-generation algorithm as that of SE, plaintext space \(\{0,1\}^{mn}\), and encryption and decryption
algorithms as depicted in Fig. 5.11.
(a) Show that
for any t, q.
Part (a) says that SE^(m) is insecure against chosen-ciphertext attack. Note this is true regardless
of the security properties of SE, which may itself be secure against chosen-ciphertext attack. Part
(b) says that if SE is secure against chosen-plaintext attack, then so is SE^(m) .
Problem 33 The CBC-Chain mode of operation is a CBC variant in which the IV that is used for
the very first message to be encrypted is random, while the IV used for each subsequent encrypted
message is the last block of ciphertext that was generated. The scheme is probabilistic and stateful.
Show that CBC-Chain is insecure by giving a simple and efficient adversary that breaks it in the
IND-CPA sense.
Problem 34 Using the proof of Theorem 5.7.1 as a template, prove Theorem 5.7.2 assuming
Lemma ??.
Problem 35 Define a notion for indistinguishability from random bits, IND$-CPA. Your notion
should capture the idea that the encryption of each message M looks like a string of random bits.
Pay careful attention to the number of random bits that one outputs. Then formalize and prove
that IND$-CPA security implies IND-CPA security—but that IND-CPA security does not imply
IND$-CPA security.
Problem 36 Using a game-based argument, prove that CBC$[Func(n,n)] achieves IND$-CPA se-
curity. Assume that one encodes hR, Ci as R k C.
Problem 37 Devise a secure extension to CBC$ mode that allows messages of any bit length to
be encrypted. Clearly state your encryption and decryption algorithm. Your algorithm should
resemble CBC mode as much as possible, and should coincide with CBC mode when the message
being encrypted is a multiple of the blocklength. It should increase the length of the message being
encrypted by exactly n bits, where n is the length of the underlying blockcipher. How would you
prove your algorithm secure?
Problem 38 An IND-CPA secure encryption scheme might not conceal identities, in the following
sense: given a pair of ciphertexts C, C ′ for equal-length messages, it might be “obvious” if the
ciphertexts were encrypted using the same random key or were encrypted using two different random
keys. Give an example of a (plausibly) IND-CPA secure encryption scheme that is identity-revealing.
Then give a definition for “identity-concealing” encryption. Your definition should imply
IND-CPA security but a scheme meeting your definition can’t be identity-revealing.
Bibliography
[3] S. Goldwasser and S. Micali. Probabilistic encryption. J. of Computer and System Sci-
ences, Vol. 28, April 1984, pp. 270–299.
[4] S. Micali, C. Rackoff and R. Sloan. The notion of security for probabilistic cryptosys-
tems. SIAM J. of Computing, April 1988.
[5] M. Naor and M. Yung. Public-key cryptosystems provably secure against chosen ciphertext
attacks. Proceedings of the 22nd Annual Symposium on the Theory of Computing, ACM,
1990.
[6] C. Rackoff and D. Simon. Non-interactive zero-knowledge proof of knowledge and chosen
ciphertext attack. Advances in Cryptology – CRYPTO ’91, Lecture Notes in Computer Science
Vol. 576, J. Feigenbaum ed., Springer-Verlag, 1991.
116 BIBLIOGRAPHY
Chapter 6
Hash Functions
A hash function usually means a function that compresses, meaning the output is shorter than the
input. Often, such a function takes an input of arbitrary or almost arbitrary length to one whose
length is a fixed number, like 160 bits. Hash functions are used in many parts of cryptography,
and there are many different types of hash functions, with differing security properties. We will
consider them in this chapter.
Figure 6.1: The SHA1 hash function and the underlying SHF1 family.
strings M and M ′ that hash to the same value. This is just by the pigeonhole principle: if \(2^{256}\)
pigeons (the 256-bit messages) roost in \(2^{160}\) holes (the 160-bit hash values) then some two pigeons
(two distinct strings) roost in the same hole (have the same hash). Indeed countless pigeons must
share the same hole. The difficulty is only that nobody has as yet identified (meaning, explicitly
provided) even two such pigeons (strings).
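The pigeonhole argument becomes tangible once the number of holes is small. The sketch below truncates SHA1 to its first 16 bits, an assumption made purely so that a collision can be found by exhaustive search in a fraction of a second; it says nothing about the difficulty of colliding the full 160-bit function. The helper names `truncated_sha1` and `find_collision` are hypothetical; `hashlib.sha1` is the standard-library implementation.

```python
import hashlib

def truncated_sha1(message, out_bits=16):
    """The first out_bits bits of SHA1(message), as an integer."""
    digest = hashlib.sha1(message).digest()
    return int.from_bytes(digest, "big") >> (160 - out_bits)

def find_collision(out_bits=16):
    """Hash 4-byte counters until two of them land in the same hole.

    By the pigeonhole principle this stops after at most 2^out_bits + 1 tries.
    """
    seen = {}
    counter = 0
    while True:
        message = counter.to_bytes(4, "big")
        h = truncated_sha1(message, out_bits)
        if h in seen:
            return seen[h], message
        seen[h] = message
        counter += 1

m1, m2 = find_collision()
print(m1.hex(), m2.hex(), truncated_sha1(m1) == truncated_sha1(m2))
```

With only 2¹⁶ holes a collision typically appears after a few hundred messages, which is the birthday effect at work; the same search against all 160 bits is what the text describes as out of reach.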
In trying to define this collision-resistance property of SHA1 we immediately run into “foun-
dational” problems. We would like to say that it is computationally infeasible to output a pair
of distinct strings M and M ′ that collide under SHA1. But in what sense could it be infeasible?
There is a program (indeed a very short and simple one, having just two “print” statements) whose
output specifies a collision. It’s not computationally hard to output a collision; it can’t be. The
only difficulty is our human problem of not knowing what this program is.
It seems very hard to make a mathematical definition that captures the idea that human beings
can’t find collisions in SHA1. In order to reach a mathematically precise definition we are going to
have to change the very nature of what we conceive to be a hash function. Namely, rather than it
being a single function, it will be a family of functions. This is unfortunate in some ways, because
it distances us from concrete hash functions like SHA1. But no alternative is known.
Figure 6.3: Framework for security notions for collision-resistant hash functions. The three choices
of s ∈ {0, 1, 2} give rise to three notions of security.
In measuring resource usage of an adversary we use our usual conventions. Although there is
formally no definition of a “secure” hash function, we will talk of a hash function being CR2, CR1
or CR0 with the intended meaning that its associated advantage function is small for all adversaries
of practical running time.
Note that the running time of the adversary is not really relevant for CR0, because we can
always imagine that hardwired into its code is a “best” choice of distinct points x1 , x2 , meaning a
Figure 6.4: Games defining security notions for three kinds of collision-resistant hash functions
under known-key attack.
Figure 6.5: Types of hash functions, with names in our framework and corresponding names found
in the literature.
The above value equals $\mathbf{Adv}^{\mathrm{cr0}}_{H}(A)$ and is the maximum advantage attainable.
Clearly, a CR2 hash function is also CR1 and a CR1 hash function is also CR0. The following
states the corresponding relations formally. The proof is trivial and is omitted.
Proposition 6.2.2 Let H: K × D → R be a hash function. Then for any adversary A0 there
exists an adversary A1 having the same running time as A0 and
\[ \mathbf{Adv}^{\mathrm{cr0}}_{H}(A_0) \;\le\; \mathbf{Adv}^{\mathrm{cr1\text{-}kk}}_{H}(A_1) . \]
Also for any adversary A1 there exists an adversary A2 having the same running time as A1 and
\[ \mathbf{Adv}^{\mathrm{cr1\text{-}kk}}_{H}(A_1) \;\le\; \mathbf{Adv}^{\mathrm{cr2\text{-}kk}}_{H}(A_2) . \]
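The three notions differ only in when the adversary must commit to its points relative to learning the key. The following is a minimal sketch, assuming the games have the shape suggested by the surrounding text (CR2: both points chosen after seeing K; CR1: the first point before, the second after; CR0: both before). The family `toy_hash`, the adversary interfaces, and the helper names are hypothetical and introduced only for illustration.

```python
import hashlib
import secrets

def toy_hash(key: bytes, x: bytes) -> bytes:
    """Hypothetical keyed family H_K with a 1-byte range (truncated SHA-256)."""
    return hashlib.sha256(key + x).digest()[:1]

def wins(key, x1, x2):
    """Winning condition shared by all three games: distinct points that collide."""
    return x1 != x2 and toy_hash(key, x1) == toy_hash(key, x2)

def game_cr2(adversary):
    key = secrets.token_bytes(16)
    x1, x2 = adversary(key)          # both points chosen after seeing K
    return wins(key, x1, x2)

def game_cr1(pre_key, post_key):
    x1 = pre_key()                   # first point chosen before K is drawn
    key = secrets.token_bytes(16)
    x2 = post_key(key, x1)           # second point chosen after seeing K
    return wins(key, x1, x2)

def game_cr0(adversary):
    x1, x2 = adversary()             # both points chosen before K is drawn
    key = secrets.token_bytes(16)
    return wins(key, x1, x2)

def cr0_to_cr1(a0):
    """Build A1 from A0: run A0 once, reveal x1 early and x2 late."""
    state = {}
    def pre_key():
        state["x1"], state["x2"] = a0()
        return state["x1"]
    def post_key(key, x1):
        return state["x2"]
    return pre_key, post_key

def cr1_to_cr2(pre_key, post_key):
    """Build A2 from A1: simply ignore the key during the pre-key phase."""
    def a2(key):
        x1 = pre_key()
        return x1, post_key(key, x1)
    return a2

def brute_force_cr2(key):
    """A CR2 adversary for the toy family: 257 inputs must collide in 256 outputs."""
    seen = {}
    for i in range(257):
        x = i.to_bytes(2, "big")
        y = toy_hash(key, x)
        if y in seen:
            return seen[y], x
        seen[y] = x
```

The two wrappers add no work beyond running the given adversary, which is why the running times in the proposition are unchanged.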
We believe that SHF1 is CR2, meaning that there is no practical algorithm A for which
$\mathbf{Adv}^{\mathrm{cr2\text{-}kk}}_{H}(A)$
is appreciably large. This is, however, purely a belief, based on the current inability to find such
an algorithm. Perhaps, later, such an algorithm will emerge.
It is useful, for any integer n, to let $\mathrm{SHF1}^{n}\colon \{0,1\}^{n} \to \{0,1\}^{160}$ denote the restriction of SHF1
to the domain $\{0,1\}^{n}$. Note that a collision for $\mathrm{SHF1}^{n}_{K}$ is also a collision for $\mathrm{SHF1}_{K}$, and it is often
convenient to think of attacking $\mathrm{SHF1}^{n}$ for some fixed n rather than SHF1 itself.
Figure 6.6: Birthday attack on a hash function H: K × D → R. The attack is successful in finding
a collision if it does not return FAIL.
A particular trial finds a collision with probability (about) 1 in |R|, so we expect to find a collision
in about q = |R| trials. This is much better than the |D| trials used by our first attempt. In
particular, a collision for shf1 would be found in time around $2^{160}$ rather than $2^{672}$. But this is still
far from practical. Our conclusion is that as long as the range size of the hash function is large
enough, this attack is not a threat.
We now consider another strategy, called a birthday attack, that turns out to be much better
than the above. It is illustrated in Fig. 6.6. It picks at random q points from the domain, and
applies HK to each of them. If it finds two distinct points yielding the same output, it has found
a collision for HK. The question is how large q need be to find a collision. The answer may seem
surprising at first. Namely, $q = O(\sqrt{|R|})$ trials suffice.
We will justify this later, but first let us note the impact. Consider $\mathrm{SHA1}^{n}$ with n ≥ 161. As we
indicated, the random-input collision-finding attack takes about $2^{160}$ trials to find a collision. The
birthday attack on the other hand takes around $\sqrt{2^{160}} = 2^{80}$ trials. This is MUCH less than $2^{160}$.
Similarly, the birthday attack finds a collision in shf1 in around $2^{80}$ trials while random-input
collision-finding takes about $2^{160}$ trials.
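The birthday attack is short enough to write out. The sketch below runs it against a toy hash with a 12-bit range, so that it finishes instantly; `truncated_hash` is a hypothetical stand-in (SHA-256 cut down to a few bits) and the key is omitted, so this illustrates the attack's structure, not an attack on shf1 or SHA1.

```python
import hashlib
import secrets

def truncated_hash(x: bytes, out_bits: int = 12) -> int:
    """Toy hash with a tiny range: the top out_bits bits of SHA-256(x)."""
    digest = int.from_bytes(hashlib.sha256(x).digest(), "big")
    return digest >> (256 - out_bits)

def birthday_attack(hash_fn, points):
    """Hash each point; return the first pair of distinct colliding points, else None (FAIL)."""
    seen = {}                      # hash value -> a point that produced it
    for x in points:
        y = hash_fn(x)
        if y in seen and seen[y] != x:
            return seen[y], x      # two distinct points with the same hash
        seen[y] = x
    return None

# Usage: q random 8-byte points, with q a small multiple of sqrt(2 * |R|).
q = 4 * int((2 * 2**12) ** 0.5)
random_points = (secrets.token_bytes(8) for _ in range(q))
result = birthday_attack(truncated_hash, random_points)
```

The dictionary lookup is what makes the attack cost about q hash computations and not q² comparisons.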
To see why the birthday attack performs as well as we claimed, we recall the following game.
Suppose we have q balls. View them as numbered, 1, . . . , q. We also have N bins, where N ≥ q.
We throw the balls at random into the bins, one by one, beginning with ball 1. At random means
that each ball is equally likely to land in any of the N bins, and the probabilities for all the balls
are independent. A collision is said to occur if some bin ends up containing at least two balls. We
are interested in C(N, q), the probability of a collision. As shown in the Appendix,
\[ C(N, q) \;\approx\; \frac{q^{2}}{2N} \tag{6.1} \]
for $1 \le q \le \sqrt{2N}$. Thus $C(N, q) \approx 1$ for $q \approx \sqrt{2N}$.
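One direction of this estimate can be seen directly from a union bound over pairs of balls. The following is only a sketch of the upper bound; the matching lower bound is the part that needs the Appendix.

```latex
\[
C(N,q) \;\le\; \sum_{1 \le i < j \le q} \Pr\bigl[\text{balls } i \text{ and } j \text{ land in the same bin}\bigr]
       \;=\; \binom{q}{2}\cdot\frac{1}{N}
       \;=\; \frac{q(q-1)}{2N}
       \;\le\; \frac{q^{2}}{2N} .
\]
```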
The relation to birthdays arises from the question of how many people need be in a room before
the probability of there being two people with the same birthday is close to one. We imagine each
person has a birthday that is a random one of the 365 days in a year. This means we can think
of a person as a ball being thrown at random into one of 365 bins, where the i-th bin represents
having birthday the i-th day of the year. So we can apply the Proposition from the Appendix
with N = 365 and q the number of people in the room. The Proposition says that when the room
contains $q \approx \sqrt{2 \cdot 365} \approx 27$ people, the probability that there are two people with the same birthday
is close to one. This number (27) is quite small and may be hard to believe at first hearing, which
is why this is sometimes called the birthday paradox.
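For small N the collision probability can also be computed exactly, which is a useful sanity check on the estimate. A minimal sketch (the function name is ours): the probability of no collision is the product of the chances that each successive ball misses all bins already occupied.

```python
def collision_probability(n_bins: int, q: int) -> float:
    """Exact C(N, q): probability that q uniform, independent balls collide in N bins."""
    no_collision = 1.0
    for i in range(q):
        # Ball i+1 must avoid the i bins already occupied by earlier balls.
        no_collision *= (n_bins - i) / n_bins
    return 1.0 - no_collision

# Usage: compare the exact value with the estimate q^2 / (2N) for birthdays.
for q in (10, 23, 27):
    print(q, collision_probability(365, q), q * q / (2 * 365))
```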
To see how this applies to the birthday attack of Fig. 6.6, let us enumerate the points in the
range as R1 , . . . , RN , where N = |R|. Each such point defines a bin. We view xi as a ball, and
imagine that it is thrown into bin yi , where yi = HK (xi ). Thus, a collision of balls (two balls in
the same bin) occurs precisely when two values xi , xj have the same output under HK . We are
interested in the probability that this happens as a function of q. (We ignore the probability that
xi = xj , counting a collision only when HK (xi ) = HK (xj ). It can be argued that since D is larger
than R, the probability that xi = xj is small enough to neglect.)
However, we cannot apply the birthday analysis directly, because the latter assumes that each
ball is equally likely to land in each bin. This is not, in general, true for our attack. Let P(Rj)
denote the probability that a ball lands in bin Rj, namely the probability that HK(x) = Rj taken
over a random choice of x from D. Then
\[ P(R_j) \;=\; \frac{|H_K^{-1}(R_j)|}{|D|} . \]
In order for P(R1) = P(R2) = · · · = P(RN) to be true, as required to apply the birthday analysis,
it must be the case that
\[ |H_K^{-1}(R_1)| \;=\; |H_K^{-1}(R_2)| \;=\; \cdots \;=\; |H_K^{-1}(R_N)| . \]
A function HK with this property is called regular, and H is called regular if HK is regular for
every K. Our conclusion is that if H is regular, then the probability that the attack succeeds is
roughly C(N, q). So the above says that in this case we need about $q \approx \sqrt{2N} = \sqrt{2 \cdot |R|}$ trials to
find a collision with probability close to one.
If H is not regular, it turns out the attack succeeds even faster, telling us that we ought to
design hash functions to be as “close” to regular as possible [2].
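For a function with a small domain, regularity can be checked by simply counting pre-images. The sketch below does this for two toy functions on the integers 0..15; the helper names are ours and the functions are illustrative only.

```python
from collections import Counter

def preimage_sizes(f, domain):
    """Map each output value y to the number of domain points x with f(x) = y."""
    return Counter(f(x) for x in domain)

def is_regular(f, domain, range_points):
    """True if every point of the range has the same number of pre-images."""
    sizes = preimage_sizes(f, domain)
    return len({sizes.get(y, 0) for y in range_points}) == 1

# x mod 4 spreads 16 inputs evenly over 4 outputs; x*x mod 4 never outputs 2 or 3.
print(is_regular(lambda x: x % 4, range(16), range(4)))
print(is_regular(lambda x: (x * x) % 4, range(16), range(4)))
```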
In summary, there is a $2^{l/2}$ or better time attack to find collisions in any hash function outputting
l bits. This leads designers to choose l large enough that $2^{l/2}$ is prohibitive. In the case of SHF1
and shf1, the choice is l = 160 because $2^{80}$ is indeed a prohibitive number of trials. These functions
cannot thus be considered vulnerable to birthday attack. (Unless they turn out to be extremely
non-regular, for which there is no evidence so far.)
Ensuring, by appropriate choice of output length, that a function is not vulnerable to a birthday
attack does not, of course, guarantee it is collision resistant. Consider the family
$H\colon \mathcal{K} \times \{0,1\}^{161} \to \{0,1\}^{160}$ defined as follows. For any K and any x, function HK(x) returns the first 160 bits of x.
The output length is 160, so a birthday attack takes $2^{80}$ time and is not feasible, but it is still easy
to find collisions. Namely, on input K, an adversary can just pick some 160-bit y and output the pair
y0, y1 (that is, y followed by a 0 bit and y followed by a 1 bit).
This tells us that to ensure collision-resistance it is not only important to have a long enough output
but also to design the hash function so that there are no clever “shortcuts” to finding a collision, meaning
no attacks that exploit some weakness in the structure of the function to quickly find collisions.
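The counterexample is easy to make concrete. In this sketch bit strings are represented as Python strings of the characters 0 and 1 (a representation chosen here for readability), and the key plays no role, exactly as in the family described above.

```python
def first_160_bits(key, x: str) -> str:
    """H_K(x) for the counterexample family: drop the last bit of a 161-bit input."""
    assert len(x) == 161 and set(x) <= {"0", "1"}
    return x[:160]

def shortcut_adversary(key):
    """Pick any 160-bit y and output y0 and y1; no search is needed."""
    y = "0" * 160
    return y + "0", y + "1"

x1, x2 = shortcut_adversary(key=None)
collision_found = x1 != x2 and first_160_bits(None, x1) == first_160_bits(None, x2)
```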
We believe that shf1 is well-designed in this regard. Nobody has yet found an adversary that
finds a collision in shf1 using less than $2^{80}$ trials. Even if a somewhat better adversary, say one
finding a collision for shf1 in $2^{65}$ trials, were found, it would not be devastating, since this is still a
very large number of trials, and we would still consider shf1 to be collision-resistant.
If we believe shf1 is collision-resistant, Theorem 6.5.2 tells us that SHF1, as well as $\mathrm{SHF1}^{n}$, can
also be considered collision-resistant, for all n.
We let
\[ \mathbf{Adv}^{\mathrm{ow}}_{H}(A) \;=\; \Pr\bigl[\mathrm{OW}^{A}_{H} \Rightarrow \mathsf{true}\bigr] . \]
We now ask ourselves whether collision-resistance implies one-wayness. It is easy to see, however,
that, in the absence of additional assumptions about the hash function beyond collision-resistance,
the answer is “no.” For example, let H be a family of functions every instance of which is the
identity function. Then H is highly collision-resistant (the advantage of an adversary in finding
a collision is zero regardless of its time-complexity since collisions simply don’t exist) but is not
one-way.
However, we would expect that “genuine” hash functions, meaning ones that perform some
non-trivial compression of their data (i.e., the size of the domain is more than the size of the range)
are one-way. This turns out to be true, but needs to be carefully quantified. To understand the
issues, it may help to begin by considering the natural argument one would attempt to use to show
that collision-resistance implies one-wayness.
Suppose we have an adversary A that has a significant advantage in attacking the one-wayness
of hash function H. We could try to use A to find a collision via the following strategy. In the
pre-key phase (we consider a type-1 attack) we pick and return a random point x1 from D. In the
post-key phase, having received the key K, we compute y = HK (x1 ) and give K, y to A. The latter
returns some x2 , and, if it was successful, we know that HK (x2 ) = y. So HK (x2 ) = HK (x1 ) and
we have a collision.
Not quite. The catch is that we only have a collision if x2 ≠ x1. The probability that this
happens turns out to depend on the quantity:
\[ \mathrm{PreIm}_{H}(1) \;=\; \Pr\Bigl[\, K \xleftarrow{\$} \mathcal{K} \,;\; x \xleftarrow{\$} D \,;\; y \leftarrow H_K(x) \;:\; |H_K^{-1}(y)| = 1 \,\Bigr] . \]
This is the probability that the size of the pre-image set of y is exactly 1, taken over y generated
as shown. The following Proposition says that a collision-resistant function H is one-way as long
as PreImH (1) is small.
Proposition 6.4.2 Let H: K × D → R be a hash function. Then for any A there exists a B such
that
\[ \mathbf{Adv}^{\mathrm{ow\text{-}kk}}_{H}(A) \;\le\; 2 \cdot \mathbf{Adv}^{\mathrm{cr1\text{-}kk}}_{H}(B) + \mathrm{PreIm}_{H}(1) . \]
Furthermore the running time of B is that of A plus the time to sample a domain point and compute
H once.
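The adversary B of the Proposition follows the strategy described before it: commit to a random x1, then hand K and y = HK(x1) to the inverter A. Below is a minimal sketch on a hypothetical toy family (8-bit inputs, 4-bit outputs, every output having 16 pre-images); `brute_force_inverter` stands in for A and all names are ours.

```python
import secrets

def toy_hash(key: int, x: int) -> int:
    """Hypothetical regular family: D = {0..255}, R = {0..15}."""
    return (x ^ key) & 0xF

def brute_force_inverter(key: int, y: int) -> int:
    """One-wayness adversary A: returns the smallest pre-image of y under H_K."""
    return next(x for x in range(256) if toy_hash(key, x) == y)

def make_cr1_adversary(inverter, sample_domain_point):
    """Adversary B: x1 is chosen before the key, x2 comes from inverting H_K(x1)."""
    def pre_key():
        return sample_domain_point()
    def post_key(key, x1):
        y = toy_hash(key, x1)        # one computation of H
        return inverter(key, y)      # one run of A
    return pre_key, post_key

# Usage: one run of the CR1 game against B.
pre_key, post_key = make_cr1_adversary(brute_force_inverter, lambda: secrets.randbelow(256))
x1 = pre_key()
key = secrets.randbelow(256)
x2 = post_key(key, x1)
b_wins = x2 != x1 and toy_hash(key, x2) == toy_hash(key, x1)
```

B fails only when A happens to return x1 itself. In this toy family that occurs for exactly one of the 16 pre-images of each y, so B succeeds for 240 of the 256 possible choices of x1.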
The result is about the CR1 type of collision-resistance. However Proposition 6.2.2 implies that
the same is true for CR2.
A general and widely-applicable corollary of the above Proposition is that collision-resistance
implies one-wayness as long as the domain of the hash function is significantly larger than its range.
The following quantifies this.
Corollary 6.4.3 Let H: K × D → R be a hash function. Then for any A there exists a B such
that
\[ \mathbf{Adv}^{\mathrm{ow\text{-}kk}}_{H}(A) \;\le\; 2 \cdot \mathbf{Adv}^{\mathrm{cr1\text{-}kk}}_{H}(B) + \frac{|R|}{|D|} . \]
Furthermore the running time of B is that of A plus the time to sample a domain point and compute
H once.
Proof of Corollary 6.4.3: For any key K, the number of points in the range of HK that have
exactly one pre-image certainly cannot exceed |R|. This implies that
\[ \mathrm{PreIm}_{H}(1) \;\le\; \frac{|R|}{|D|} . \]
The corollary follows from Proposition 6.4.2.
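Spelled out, the counting step reads as follows. Here $S_K$ denotes the set of domain points whose hash value has exactly one pre-image, and $\Pr[K]$ is the probability that key generation selects K; on $S_K$ the function $H_K$ is injective, so $|S_K| \le |R|$.

```latex
\[
\mathrm{PreIm}_{H}(1)
  \;=\; \sum_{K \in \mathcal{K}} \Pr[K] \cdot \frac{|S_K|}{|D|}
  \;\le\; \sum_{K \in \mathcal{K}} \Pr[K] \cdot \frac{|R|}{|D|}
  \;=\; \frac{|R|}{|D|} ,
\qquad
S_K = \{\, x \in D \;:\; |H_K^{-1}(H_K(x))| = 1 \,\} .
\]
```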
Corollary 6.4.3 says that if H is collision-resistant, and performs enough compression that |R|
is much smaller than |D|, then it is also one-way. Why? Let A be a practical adversary that
attacks the one-wayness of H. Then B is also practical, and since H is collision-resistant we know
$\mathbf{Adv}^{\mathrm{cr1\text{-}kk}}_{H}(B)$ is low. Equation (6.2) then tells us that as long as |R|/|D| is small,
$\mathbf{Adv}^{\mathrm{ow\text{-}kk}}_{H}(A)$ is
low, meaning H is one-way.
As an example, let H be the compression function shf1. In that case $R = \{0,1\}^{160}$ and
$D = \{0,1\}^{672}$ so $|R|/|D| = 2^{-512}$, which is tiny. We believe shf1 is collision-resistant, and the above thus
says it is also one-way.
There are some natural hash functions, however, for which Corollary 6.4.3 is not useful.
Consider a hash function H every instance of which is two-to-one. The ratio of range size to
domain size is 1/2, so the right hand side of the equation of Corollary 6.4.3 is at least 1/2, meaning the
bound is too weak to say anything useful. However, such a function is a special case of the one considered in the following
Corollary.
Corollary 6.4.4 Suppose 1 ≤ r < d and let $H\colon \mathcal{K} \times \{0,1\}^{d} \to \{0,1\}^{r}$ be a hash function which
is regular, meaning $|H_K^{-1}(y)| = 2^{d-r}$ for every $y \in \{0,1\}^{r}$ and every K ∈ K. Then for any A there
exists a B such that
\[ \mathbf{Adv}^{\mathrm{ow\text{-}kk}}_{H}(A) \;\le\; 2 \cdot \mathbf{Adv}^{\mathrm{cr1\text{-}kk}}_{H}(B) . \]
Furthermore the running time of B is that of A plus the time to sample a domain point and compute
H once.
Proof of Corollary 6.4.4: The assumption d > r implies that PreImH (1) = 0. Now apply
Proposition 6.4.2.
Let Pr[·] denote the probability of event “·” in experiment $\mathbf{Exp}^{\mathrm{cr1\text{-}kk}}_{H}(B)$. For any K ∈ K let
\[ S_K \;=\; \{\, x \in D \;:\; |H_K^{-1}(H_K(x))| = 1 \,\} . \]
H(K, M)
  y ← pad(M)
  Parse y as M1 ‖ M2 ‖ · · · ‖ Mn where |Mi| = b (1 ≤ i ≤ n)
  V ← IV
  for i = 1, . . . , n do
    V ← h(K, Mi ‖ V)
  Return V

Adversary Ah(K)
  Run AH(K) to get its output (x1, x2)
  y1 ← pad(x1) ; y2 ← pad(x2)
  Parse y1 as M1,1 ‖ M1,2 ‖ · · · ‖ M1,n[1] where |M1,i| = b (1 ≤ i ≤ n[1])
  Parse y2 as M2,1 ‖ M2,2 ‖ · · · ‖ M2,n[2] where |M2,i| = b (1 ≤ i ≤ n[2])
  V1,0 ← IV ; V2,0 ← IV
  for i = 1, . . . , n[1] do V1,i ← h(K, M1,i ‖ V1,i−1)
  for i = 1, . . . , n[2] do V2,i ← h(K, M2,i ‖ V2,i−1)
  if (V1,n[1] ≠ V2,n[2] OR x1 = x2) return FAIL
  if |x1| ≠ |x2| then return (M1,n[1] ‖ V1,n[1]−1 , M2,n[2] ‖ V2,n[2]−1)
  n ← n[1] // n = n[1] = n[2] since |x1| = |x2|
  for i = n downto 1 do
    if M1,i ‖ V1,i−1 ≠ M2,i ‖ V2,i−1 then return (M1,i ‖ V1,i−1 , M2,i ‖ V2,i−1)
Figure 6.7: Hash function H defined from compression function h via the MD paradigm, and
adversary Ah for the proof of Theorem 6.5.2.
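The construction and the collision-extracting adversary can be exercised end to end on a toy instance. The sketch below makes several assumptions for illustration: lengths are in bytes, not bits; the compression function `h` is truncated SHA-256 with a 1-byte output (so that collisions for H can be found by brute force); `pad` is a simple length-appending padding; and `find_H_collision` is a hypothetical stand-in for AH. None of this is shf1 or shapad.

```python
import hashlib

B = 4          # block length b, in bytes (toy value)
V = 1          # chaining-variable length v, in bytes (tiny, so collisions are easy)
IV = bytes(V)  # all-zero initial vector (arbitrary choice)

def h(key: bytes, data: bytes) -> bytes:
    """Toy compression function on (b+v)-byte inputs; a stand-in, not shf1."""
    assert len(data) == B + V
    return hashlib.sha256(key + data).digest()[:V]

def pad(msg: bytes) -> bytes:
    """Append 0x80, zero-fill to a block boundary, then add one block encoding len(msg)."""
    body = msg + b"\x80"
    body += bytes(-len(body) % B)
    return body + len(msg).to_bytes(B, "big")

def chain(key: bytes, msg: bytes):
    """Return the blocks [M_1..M_n] and the chaining values [V_0..V_n] for msg."""
    y = pad(msg)
    blocks = [y[i:i + B] for i in range(0, len(y), B)]
    values = [IV]
    for block in blocks:
        values.append(h(key, block + values[-1]))
    return blocks, values

def H(key: bytes, msg: bytes) -> bytes:
    """The iterated hash: the final chaining value."""
    return chain(key, msg)[1][-1]

def extract_collision(key: bytes, x1: bytes, x2: bytes):
    """Post-processing of A_h: turn a collision (x1, x2) for H into one for h."""
    m1, v1 = chain(key, x1)
    m2, v2 = chain(key, x2)
    if v1[-1] != v2[-1] or x1 == x2:
        return None                                  # FAIL
    if len(x1) != len(x2):
        return m1[-1] + v1[-2], m2[-1] + v2[-2]      # final inputs already differ
    for i in range(len(m1), 0, -1):                  # i = n downto 1
        in1, in2 = m1[i - 1] + v1[i - 1], m2[i - 1] + v2[i - 1]
        if in1 != in2:
            return in1, in2
    return None  # not reached when x1 != x2

def find_H_collision(key: bytes, messages):
    """Brute-force search standing in for A_H (feasible only because V is tiny)."""
    seen = {}
    for msg in messages:
        digest = H(key, msg)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg
    return None
```

Feeding `find_H_collision` any 257 distinct messages guarantees a collision for H by the pigeonhole principle, and `extract_collision` then returns two distinct (b+v)-byte inputs on which `h` agrees.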
Let b be an integer parameter called the block length, and v another integer parameter called
the chaining-variable length. Let h: K × {0, 1}b+v → {0, 1}v be a family of functions that we call
the compression function. We assume it is collision-resistant.
Let B denote the set of all strings whose length is a positive multiple of b bits, and let D be
some subset of $\{0,1\}^{<2^{b}}$.
Definition 6.5.1 A function pad: D → B is called a MD-compliant padding function if it has the
following properties for all M, M1 , M2 ∈ D:
(1) M is a prefix of pad(M )
(2) If |M1 | = |M2 | then |pad(M1 )| = |pad(M2 )|
(3) If |M1| ≠ |M2| then the last block of pad(M1) is different from the last block of pad(M2).
A block, above, consists of b bits. Remember that the output of pad is in B, meaning is a sequence
of b-bit blocks. Condition (3) of the definition is saying that if two messages have different lengths then,
when we apply pad to them, we end up with strings that differ in their final blocks.
An example of a MD-compliant padding function is shapad. However, there are other examples
as well.
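To see what such a padding function looks like, here is a byte-oriented sketch in the style of the padding used by the SHA family (a 1 bit, zeros, then a 64-bit length field, with 512-bit blocks). That shapad has exactly this form is an assumption here, and the checker is ours. It tests conditions (1) and (2) directly, and tests the last-block condition on pairs of messages of different lengths, which is the case the proof of Theorem 6.5.2 relies on.

```python
from itertools import combinations

def sha_style_pad(msg: bytes, block_bytes: int = 64) -> bytes:
    """Append 0x80, zero-fill, then the message length in bits as 8 big-endian bytes."""
    bit_len = 8 * len(msg)
    padded = msg + b"\x80"
    padded += bytes((block_bytes - 8 - len(padded)) % block_bytes)
    return padded + bit_len.to_bytes(8, "big")

def check_padding(pad_fn, messages, block_bytes: int = 64) -> bool:
    """Check the padding conditions on a finite sample of messages."""
    for m in messages:
        p = pad_fn(m)
        if not p.startswith(m) or len(p) == 0 or len(p) % block_bytes != 0:
            return False                                   # prefix / block structure
    for m1, m2 in combinations(messages, 2):
        p1, p2 = pad_fn(m1), pad_fn(m2)
        if len(m1) == len(m2) and len(p1) != len(p2):
            return False                                   # equal lengths stay equal
        if len(m1) != len(m2) and p1[-block_bytes:] == p2[-block_bytes:]:
            return False                                   # last blocks must differ
    return True

sample = [b"a" * n for n in range(200)] + [b"b" * n for n in range(1, 200)]
ok = check_padding(sha_style_pad, sample)
```

A finite check like this can refute a padding function but cannot prove compliance; here compliance follows because the final 8 bytes encode the length.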
Now let IV be a v-bit value called the initial vector. We build a family H: K × D → {0, 1}v
from h and pad as illustrated in Fig. 6.7. Notice that SHF1 is such a family, built from h = shf1
and pad = shapad. The main fact about this method is the following.
Theorem 6.5.2 Let h: K×{0, 1}b+v → {0, 1}v be a family of functions and let H: K×D → {0, 1}v
be built from h as described above. Suppose we are given an adversary AH that attempts to find
collisions in H. Then we can construct an adversary Ah that attempts to find collisions in h, and
\[ \mathbf{Adv}^{\mathrm{cr2\text{-}kk}}_{H}(A_H) \;\le\; \mathbf{Adv}^{\mathrm{cr2\text{-}kk}}_{h}(A_h) . \tag{6.9} \]
Furthermore, the running time of Ah is that of AH plus the time to perform (|pad(x1 )|+|pad(x2 )|)/b
computations of h where (x1 , x2 ) is the collision output by AH .
This theorem says that if h is collision-resistant then so is H. Why? Let AH be a practical adversary
attacking H. Then Ah is also practical, because its running time is that of AH plus the time to
do some extra computations of h. But since h is collision-resistant we know that $\mathbf{Adv}^{\mathrm{cr2\text{-}kk}}_{h}(A_h)$
is low. Equation (6.9) then tells us that $\mathbf{Adv}^{\mathrm{cr2\text{-}kk}}_{H}(A_H)$ is low, meaning H is collision-resistant as
well.
Proof of Theorem 6.5.2: Adversary Ah , taking input a key K ∈ K, is depicted in Fig. 6.7. It
runs AH on input K to get a pair (x1 , x2 ) of messages in D. We claim that if x1 , x2 is a collision
for HK then Ah will return a collision for hK .
Adversary Ah computes V1,n[1] = HK (x1 ) and V2,n[2] = HK (x2 ). If x1 , x2 is a collision for HK then
we know that V1,n[1] = V2,n[2] . Let us assume this. Now, let us look at the inputs to the application
of hK that yielded these outputs. If these inputs are different, they form a collision for hK .
The inputs in question are M1,n[1] ‖ V1,n[1]−1 and M2,n[2] ‖ V2,n[2]−1. We now consider two cases. The
first case is that x1, x2 have different lengths. Item (3) of Definition 6.5.1 tells us that M1,n[1] ≠
M2,n[2]. This means that M1,n[1] ‖ V1,n[1]−1 ≠ M2,n[2] ‖ V2,n[2]−1, and thus these two points form a
collision for hK that can be output by Ah.
The second case is that x1 , x2 have the same length. Item (2) of Definition 6.5.1 tells us that y1 , y2
have the same length as well. We know this length is a positive multiple of b since the range of pad
is the set B, so we let n be the number of b-bit blocks that comprise y1 and y2 . Let Vn denote the
value V1,n, which by assumption equals V2,n. We compare the inputs M1,n ‖ V1,n−1 and M2,n ‖ V2,n−1
that under hK yielded Vn. If they are different, they form a collision for hK and can be returned
by Ah. If, however, they are the same, then we know that V1,n−1 = V2,n−1. Denoting this value by
Vn−1, we now consider the inputs M1,n−1 ‖ V1,n−2 and M2,n−1 ‖ V2,n−2 that under hK yield Vn−1.
The argument repeats itself: if these inputs are different we have a collision for hK , else we can step
back one more time.
Can we get stuck, continually stepping back and not finding our collision? No, because y1 ≠ y2.
Why is the latter true? We know that x1 ≠ x2. But item (1) of Definition 6.5.1 says that x1 is a
prefix of y1 and x2 is a prefix of y2. So y1 ≠ y2.
We have argued that on any input K, adversary Ah finds a collision in hK exactly when AH finds a
collision in HK . This justifies Equation (6.9). We now justify the claim about the running time of
Ah . The main component of the running time of Ah is the time to run AH . In addition, it performs
a number of computations of h equal to the number of blocks in y1 plus the number of blocks in
y2 . There is some more overhead, but small enough to neglect.
Bibliography
[2] M. Bellare and T. Kohno. Hash function balance and its impact on birthday attacks.
Advances in Cryptology – EUROCRYPT ’04, Lecture Notes in Computer Science Vol. 3027,
C. Cachin and J. Camenisch ed., Springer-Verlag, 2004.
[3] I. Damgård. A Design Principle for Hash Functions. Advances in Cryptology – CRYPTO
’89, Lecture Notes in Computer Science Vol. 435, G. Brassard ed., Springer-Verlag, 1989.
[4] B. den Boer and A. Bosselaers, Collisions for the compression function of MD5. Advances
in Cryptology – EUROCRYPT ’93, Lecture Notes in Computer Science Vol. 765, T. Helleseth
ed., Springer-Verlag, 1993.
[6] H. Dobbertin, Cryptanalysis of MD5. Rump Session of Eurocrypt 96, May 1996,
http://www.iacr.org/conferences/ec96/rump/index.html.
[8] M. Naor and M. Yung, Universal one-way hash functions and their cryptographic appli-
cations. Proceedings of the 21st Annual Symposium on the Theory of Computing, ACM,
1989.
[9] National Institute of Standards. FIPS 180-2, Secure hash standard, August 2000.
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf.
[10] R. Merkle. One way hash functions and DES. Advances in Cryptology – CRYPTO ’89,
Lecture Notes in Computer Science Vol. 435, G. Brassard ed., Springer-Verlag, 1989.
[11] R. Rivest, The MD4 message-digest algorithm, Advances in Cryptology – CRYPTO ’90,
Lecture Notes in Computer Science Vol. 537, A. J. Menezes and S. Vanstone ed., Springer-
Verlag, 1990, pp. 303–311. Also IETF RFC 1320 (April 1992).
[12] R. Rivest, The MD5 message-digest algorithm. IETF RFC 1321 (April 1992).
Chapter 7
Message Authentication
In most people’s minds, privacy is the goal most strongly associated with cryptography. But message
authentication is arguably even more important. Indeed you may or may not care if some particular
message you send out stays private, but you almost certainly do want to be sure of the originator
of each message that you act on. Message authentication is what buys you that guarantee.
Message authentication allows one party—the sender—to send a message to another party—
the receiver—in such a way that if the message is modified en route, then the receiver will almost
certainly detect this. Message authentication is also called data-origin authentication. Message
authentication is said to protect the integrity of a message, ensuring that each message that is
received and deemed acceptable is arriving in the same condition that it was sent out, with no
bits inserted, missing, or modified.
Here we’ll be looking at the shared-key setting for message authentication (remember that
message authentication in the public-key setting is the problem addressed by digital signatures). In
this case the sender and the receiver share a secret key, K, which they’ll use to authenticate their
transmissions. We’ll define the message authentication goal and we’ll describe some different ways
to achieve it. As usual, we’ll be careful to pin down the problem we’re working to solve.
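[Diagram: the sender applies E under key K to M and transmits C; adversary A sits on the channel; the receiver applies D under key K to the received C′ and obtains M′ or ⊥.]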
Figure 7.1: An authenticated-encryption scheme. Here we are authenticating messages with what
is, syntactically, just an encryption scheme. The sender transmits a transformed version C of M
and the receiver is able to recover M ′ = M or else obtain indication of failure. Adversary A controls
the communication channel and may even influence messages sent by the sender.
example, the message M might be tagged by an identifier which somehow names S. Or it might
be that the manner in which M arrives is a route dedicated to servicing traffic from S.
Here we’re going to be looking at the case when S and R already share some secret key, K.
How S and R came to get this shared secret key is a separate question, one that we deal with later.
There are several high-level approaches for authenticating transmissions.
1. The most general approach works like this. To authenticate a message M using the key K,
the sender will apply some encryption algorithm E to K and M, giving rise to a ciphertext C. When
we speak of encrypting M in this context, we are using the word in the broadest possible
sense, as any sort of keyed transformation on the message that obeys our earlier definition for
the syntax of an encryption scheme; in particular, we are not suggesting that C conceals M.
The sender S will transmit C to the receiver R. Maybe the receiver will receive C, or maybe
it will not. The problem is that an adversary A may control the channel on which messages
are being sent. Let C ′ be the message that the receiver actually gets. The receiver R, on
receipt of C ′ , will apply some decryption algorithm D to K and C ′ . We want that this
should yield one of two things: (1) a message M ′ that is the original message M ; or (2) an
indication ⊥ that C ′ be regarded as inauthentic. Viewed in this way, message authentication
is accomplished by an encryption scheme. We are no longer interested in the privacy of
the encryption scheme but, functionally, it is still an encryption scheme. See Fig. 7.1. We
sometimes use the term authenticated encryption to indicate that we are using an encryption
scheme to achieve authenticity.
2. Since our authenticity goal is not about privacy, most often the ciphertext C that the sender
transmits is simply the original message M together with a tag T; that is, C = ⟨M, T⟩.
When the ciphertext is of this form, we call the mechanism a message-authentication scheme.
A message-authentication scheme will be specified by a tag-generation algorithm TG and
a tag-verification algorithm VF. The former may be probabilistic or stateful; the latter is
neither. The tag-generation algorithm TG produces a tag $T \xleftarrow{\$} \mathrm{TG}_K(M)$ from a key K and
the message. The tag-verification algorithm VF produces a bit $\mathrm{VF}_K(M', T')$ from a key
K, a message M′, and a tag T′. The intent is that the bit 1 tells the receiver to accept M′,
while the bit 0 tells the receiver to reject M′. See Fig. 7.2.
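[Diagram: the sender computes a tag T from M using TG under key K and transmits M together with T; adversary A controls the channel; the receiver runs VF under key K on the received M′ and T′ and outputs 0 or 1.]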
Figure 7.2: A message authentication scheme. This is a special case of the more general framework
from the prior diagram. The authenticated message C is now understood to be the original message
M together with a tag T . Separate algorithms generate the tag and verify the pair.
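[Diagram: the sender computes T = MAC_K(M) and transmits M together with T; adversary A controls the channel; the receiver computes T* = MAC_K(M′) and outputs 1 if T* equals the received T′, and 0 otherwise.]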
Figure 7.3: A message authentication code. This is a special case of a message authentication
scheme. The authenticated message C is now understood to be the original message M together
with a tag T that is computed as a deterministic and stateless function of M and K. The receiver
verifies the authenticity of messages using the same MACing algorithm.
3. The most common possibility of all occurs when the tag-generation algorithm TG is deterministic
and stateless. In this case we call the tag-generation algorithm, and the scheme itself,
a message authentication code, or MAC. When authentication is accomplished using a MAC,
we do not need to specify a separate tag-verification algorithm, for tag-verification always
works the same way: the receiver, having received ⟨M′, T′⟩, computes T* = MAC_K(M′). If
this computed tag T* is identical to the received tag T′ then the receiver regards the message
M′ as authentic; otherwise, the receiver regards M′ as inauthentic. We write T = MAC_K(M)
for the tag generated by the specified MAC. See Fig. 7.3.
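The MAC case, where verification is just recomputation of the tag, can be sketched as follows. HMAC-SHA256 from the Python standard library is used purely as a stand-in for MAC_K; the text above does not single out any particular MAC, and the function names are ours.

```python
import hashlib
import hmac

def mac(key: bytes, message: bytes) -> bytes:
    """Deterministic, stateless tag generation (HMAC-SHA256 as a stand-in for MAC_K)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute T* = MAC_K(M') and accept exactly when T* equals the received tag T'."""
    return hmac.compare_digest(mac(key, message), tag)

# Usage: the sender transmits (M, T); the receiver checks whatever pair arrives.
key = b"shared secret key"
M = b"transfer 10 to Bob"
T = mac(key, M)
accepted = verify(key, M, T)
rejected = verify(key, b"transfer 99 to Bob", T)
```

`hmac.compare_digest` compares the two tags in a way that does not leak how many leading bytes matched, which is the usual choice when comparing secrets.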
When the receiver decides that a message he has received is inauthentic what should he do?
The receiver might want to just ignore the bogus message. Perhaps it was just noise on the channel;
or perhaps taking action will do more harm than good, opening up new possibilities for denial-of-
service attacks. Alternatively, the receiver may want to take more decisive actions, like tearing
down the channel on which the message was received and informing some human being of apparent
mischief. The proper course of action is dictated by the circumstances and the security policy of
the receiver.
We point out that adversarial success in violating authenticity demands an active attack: to
succeed, the adversary has to do more than listen—it has to get some bogus message to the receiver.
In some communication scenarios it may be difficult for the adversary to get its messages to the
receiver. For example, it may be tricky for an adversary to drop its own messages onto a physically
secure phone line or fiber-optic channel. In other environments it may be trivial for the adversary
to put messages onto the channel. Since we don’t know what the characteristics of the
sender-receiver channel are, it is best to assume the worst and think that the adversary has plenty of power
over this channel. We will actually assume even more than that, giving the adversary the power of
creating legitimately authenticated messages.
We wish to emphasize that the message-authentication problem is very different from the privacy
problem. We are not worried about secrecy of the message M ; our concern is in whether the
adversary can profit by injecting new messages into the communications stream. Not only is the
problem conceptually different but, as we shall now see, privacy-providing encryption does nothing
to ensure message authenticity.
ciphertext he checks whether the decrypted string ends in 128 zeros. He rejects the transmission if
it does not. Such an approach can, and almost always will, fail. For example, the added redundancy
does absolutely nothing for our one-time-pad example.
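The failure of added redundancy is easy to demonstrate for a one-time pad. In the sketch below (an illustration with names of our choosing, assuming the pad is simply XORed into the message followed by 128 zero bits), an adversary who knows or guesses the plaintext changes it to a message of its choice without touching the redundancy, and the receiver's check passes.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two strings (truncated to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))

REDUNDANCY = bytes(16)   # 128 zero bits appended before encryption

def encrypt(pad_key: bytes, message: bytes) -> bytes:
    plaintext = message + REDUNDANCY
    assert len(pad_key) == len(plaintext)
    return xor(pad_key, plaintext)

def decrypt(pad_key: bytes, ciphertext: bytes):
    plaintext = xor(pad_key, ciphertext)
    if plaintext[-16:] != REDUNDANCY:
        return None                      # receiver rejects the transmission
    return plaintext[:-16]

def tamper(ciphertext: bytes, delta: bytes) -> bytes:
    """Adversary: XOR delta into the message part and leave the redundancy alone."""
    return xor(ciphertext[:len(delta)], delta) + ciphertext[len(delta):]

message, target = b"PAY 100", b"PAY 900"
pad_key = bytes(range(1, len(message) + 16 + 1))   # fixed pad, for the demo only
ciphertext = encrypt(pad_key, message)
forged = tamper(ciphertext, xor(message, target))
received = decrypt(pad_key, forged)
```

The adversary never needs the key: flipping ciphertext bits flips the same plaintext bits, and the zero block decrypts unchanged.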
What you should conclude is that privacy-providing encryption was never an appropriate approach
for protecting the authenticity of data. With hindsight, this is pretty clear. The fact that data is
encrypted need not prevent an adversary from being able to make the receiver recover data different
from that which the sender had intended. Indeed with most encryption schemes any ciphertext will
decrypt to something, so even a random transmission will cause the receiver to receive something
different from what the sender intended, which was not to send any message at all. Now perhaps
the random ciphertext will look like garbage to the receiver, or perhaps not. Since we do not know
what the receiver intends to do with his data it is impossible to say.
Since the encryption schemes we have discussed were not designed for authenticating messages,
they don’t. We emphasize this because the belief that good encryption, perhaps after adding re-
dundancy, already provides authenticity, is not only voiced, but even printed in books or embedded
into security systems.
Good cryptographic design is goal-oriented. One must understand and formalize the goal. Only
then do we have the basis on which to design and evaluate potential solutions. Accordingly, our
next step is to come up with a definition for a message-authentication scheme and its security.
To verify ⟨M, T⟩ the receiver checks if T = MAC_K(M). If so, message M is viewed as authentic;
otherwise, the message is viewed as being a forgery.
Note that our definitions don’t permit stateful message-recovery / verification. Stateful func-
tions for the receiver can be problematic because of the possibility of messages not reaching their
destination—it is too easy for the receiver to be in a state different from the one that we’d like. All
the same, stateful MAC verification functions are essential for detecting “replay attacks.”
Recall that it was essential for the IND-CPA security of an encryption scheme that the en-
cryption algorithm be probabilistic or stateful—you couldn’t achieve IND-CPA security with a
deterministic encryption algorithm. But we will see that probabilism and state are not necessary
for achieving secure message authentication. This realization is built into the fact that we deal
with MACs.
again.
What we have just described is called a replay attack. The adversary sees a valid (M, T ) from
the sender, and at some later point in time it re-transmits it. Since the receiver accepted it the
first time, he’ll do so again.
Should a replay attack count as a valid forgery? In real life it usually should. Say the first
message was “Transfer $1000 from my account to the account of party A.” Then party A may have
a simple way of enriching herself: she just keeps replaying this same authenticated message, happily
watching her bank balance grow.
It is important to protect against replay attacks. But for the moment we will not try to do
this. We will say that a replay is not a valid forgery; to be valid a forgery must be of a message
M which was not already produced by the sender. We will see later that we can always achieve
security against replay attacks by simple means; that is, we can take any message authentication
mechanism which is not secure against replay attacks and modify it—after making the receiver
stateful—so that it will be secure against replay attacks. At this point, not worrying about replay
attacks results in a cleaner problem definition. And it leads us to a more modular protocol-design
approach—that is, we cut up the problem into sensible parts (“basic security” and then “replay
security”) solving them one by one.
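The "stateful receiver" idea can be sketched as follows. This is only an illustration under our own assumptions, not the construction the notes develop later: the sender MACs a counter together with the message, and the receiver remembers the largest counter it has accepted. HMAC-SHA256 stands in for the MAC, and the names tag, Receiver and accept are hypothetical.

```python
import hashlib
import hmac


def tag(key: bytes, counter: int, message: bytes) -> bytes:
    """MAC the counter together with the message (HMAC-SHA256 is a stand-in MAC)."""
    data = counter.to_bytes(8, "big") + message
    return hmac.new(key, data, hashlib.sha256).digest()


class Receiver:
    """Stateful receiver: accepts a counter only if it exceeds every counter accepted so far."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.last_counter = -1

    def accept(self, counter: int, message: bytes, t: bytes) -> bool:
        expected = tag(self.key, counter, message)
        if not hmac.compare_digest(expected, t):
            return False  # invalid tag: a forgery attempt
        if counter <= self.last_counter:
            return False  # valid tag but stale counter: a replay
        self.last_counter = counter
        return True


key = b"k" * 32
receiver = Receiver(key)
msg = b"Transfer $1000 from my account to the account of party A."
t = tag(key, 0, msg)
first = receiver.accept(0, msg, t)   # fresh message
second = receiver.accept(0, msg, t)  # the same pair, replayed
print(first, second)
```

The underlying MAC is used as a black box, which is the modularity the text argues for: basic unforgeability comes from the MAC, and replay protection comes from the receiver's state.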
Of course there is no reason to think that the adversary will be limited to seeing only one
example message. Realistic adversaries may see millions of authenticated messages, and still it
should be hard for them to forge.
For some message authentication schemes the adversary’s ability to forge will grow with the
number $q_s$ of legitimate message-tag pairs it sees. Likewise, in some security systems the number
of valid $(M, T)$ pairs that the adversary can obtain may be architecturally limited. (For example,
a stateful Signer may be unwilling to MAC more than a certain number of messages.) So when we
give our quantitative treatment of security we will treat $q_s$ as an important adversarial resource.
How exactly do all these tagged messages arise? We could think of there being some distribution
on messages that the sender will authenticate, but in some settings it is even possible for the
adversary to influence which messages are tagged. In the worst case, imagine that the adversary
itself chooses which messages get authenticated. That is, the adversary chooses a message, gets its
tag, chooses another message, gets its tag, and so forth. Then it tries to forge. This is called an
adaptive chosen-message attack. It wins if it succeeds in forging the MAC of a message which it
has not queried to the sender.
At first glance it may seem like an adaptive chosen-message attack is unrealistically generous to
our adversary; after all, if an adversary could really obtain a valid tag for any message it wanted,
wouldn’t that make moot the whole point of authenticating messages? In fact, there are several
good arguments for allowing the adversary such a strong capability. First, we will see examples—
higher-level protocols that use MACs—where adaptive chosen-message attacks are quite realistic.
Second, recall our general principles. We want to design schemes which are secure in any usage.
This requires that we make worst-case notions of security, so that when we err in realistically
modeling adversarial capabilities, we err on the side of caution, allowing the adversary more power
than it might really have. Since eventually we will design schemes that meet our stringent notions
of security, we only gain when we assume our adversary to be strong.
As an example of a simple scenario in which an adaptive chosen-message attack is realistic,
imagine that the sender S is forwarding messages to a receiver R. The sender receives messages
from any number of third parties, $A_1, \ldots, A_n$. The sender gets a piece of data $M$ from party $A_i$
along a secure channel, and then the sender transmits to the receiver $\langle i\rangle \,\|\, M \,\|\, \mathrm{MAC}_K(\langle i\rangle \,\|\, M)$.
This is the sender’s way of attesting to the fact that he has received message $M$ from party $A_i$.
Now if one of these third parties, say $A_1$, wants to play an adversarial role, it will ask the sender
to forward its adaptively-chosen messages $M_1, M_2, \ldots$ to the receiver. If, based on what it sees, it
can learn the key $K$, or even if it can learn to forge messages of the form $\langle 2\rangle \,\|\, M$, so as to produce
a valid $\langle 2\rangle \,\|\, M \,\|\, \mathrm{MAC}_K(\langle 2\rangle \,\|\, M)$, then the intent of the protocol will have been defeated.

Figure 7.4: The model for a message authentication code. Adversary A has access to a tag-
generation oracle $\mathrm{TG}_K$, which on input $M$ returns $T = \mathrm{MAC}_K(M)$, and a tag-verification oracle
$\mathrm{VF}_K$, which on input $(M, T)$ returns 0 or 1. The adversary wants to get the verification oracle
to answer 1 to some $(M, T)$ for which it didn’t earlier ask the tag-generation oracle $M$. The verification
oracle returns 1 if $T = \mathrm{MAC}_K(M)$ and 0 if $T \neq \mathrm{MAC}_K(M)$.
So far we have said that we want to give our adversary the ability to obtain MACs for messages
of its choosing, and then we want to look at whether or not it can forge: produce a valid $(M, T)$
pair where it never asked the sender to MAC $M$. But we should recognize that a realistic adversary
might be able to produce lots of candidate forgeries, and it may be content if any of these turn
out to be valid. We can model this possibility by giving the adversary the capability to tell if a
prospective $(M, T)$ pair is valid, and saying that the adversary forges if it ever finds an $(M, T)$ pair
that is valid but for which $M$ was not MACed by the sender.
Whether or not a real adversary can try lots of possible forgeries depends on the context.
Suppose the receiver is going to tear down a connection the moment he detects an invalid tag.
Then it is unrealistic to try to use this receiver to help you determine if a candidate pair (M, T ) is
valid—one mistake, and you’re done for. In this case, thinking of there being a single attempt to
forge a message is quite adequate.
On the other hand, suppose that a receiver just ignores any improperly tagged message, while
it responds in some noticeably different way if it receives a properly authenticated message. In this
case a quite reasonable adversarial strategy may be to ask the verifier about the validity of a large
number of candidate $(M, T)$ pairs. The adversary hopes to find at least one that is valid. When
the adversary finds such an $(M, T)$ pair, we’ll say that it has won.
Let us summarize. To be fully general, we will give our adversary two different capabilities.
The first adversarial capability is to obtain a MAC for any message $M$ that it chooses. We will call
this a signing query. The adversary will make some number of them, $q_s$. The second adversarial
capability is to find out if a particular pair $(M, T)$ is valid. We will call this a verification query.
The adversary will make some number of them, $q_v$. Our adversary is said to succeed (to forge) if
it ever makes a verification query $(M, T)$ and gets a return value of 1 (accept) even though the
message $M$ is not a message that the adversary already knew a tag for by virtue of an earlier signing
query. Let us now proceed more formally.
Let $\mathrm{MAC}\colon \mathcal{K} \times \{0,1\}^* \to \{0,1\}^*$ be an arbitrary message authentication code. We will formalize
a quantitative notion of security against adaptive chosen-message attack. We begin by describing
the model.
We distill the model from the intuition we have described above. There is no need, in the model,
to think of the sender and the verifier as animate entities. The purpose of the sender, from the
adversary’s point of view, is to authenticate messages. So we will embody the sender as an oracle
that the adversary can use to authenticate any message $M$. This tag-generation oracle, as we will
call it, is our way to provide the adversary black-box access to the function $\mathrm{MAC}_K(\cdot)$. Likewise,
the purpose of the verifier, from the adversary’s point of view, is to test attempted
forgeries. So we will embody the verifier as an oracle that the adversary can use to see if a candidate
pair $(M, T)$ is valid. This verification oracle, as we will call it, is our way to provide the adversary
black-box access to the function $\mathrm{VF}_K(\cdot,\cdot)$, which is 1 if $T = \mathrm{MAC}_K(M)$ and 0 otherwise. Thus, when
we become formal, the cast of characters (the sender, the receiver, and the adversary) gets reduced
to just the adversary, running with its oracles.
Definition 7.4.1 [MAC security] Let $\mathrm{MAC}\colon \mathcal{K} \times \{0,1\}^* \to \{0,1\}^*$ be a message authentication
code and let $A$ be an adversary. We consider the following experiment:
Experiment $\mathbf{Exp}^{\text{uf-cma}}_{\mathrm{MAC}}(A)$
    $K \xleftarrow{\$} \mathcal{K}$
    Run $A^{\mathrm{MAC}_K(\cdot),\,\mathrm{VF}_K(\cdot,\cdot)}$ where $\mathrm{VF}_K(M, T)$ is 1 if $\mathrm{MAC}_K(M) = T$ and 0 otherwise
    if $A$ made a $\mathrm{VF}_K$ query $(M, T)$ such that
        – The oracle returned 1, and
        – $A$ did not, prior to making verification query $(M, T)$,
          make tag-generation query $M$
    then return 1 else return 0
The uf-cma advantage of $A$ is defined as
$$\mathbf{Adv}^{\text{uf-cma}}_{\mathrm{MAC}}(A) = \Pr\left[\mathbf{Exp}^{\text{uf-cma}}_{\mathrm{MAC}}(A) \Rightarrow 1\right].$$
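To make the experiment concrete, here is a minimal Python sketch of it. It assumes HMAC-SHA256 as a stand-in for MAC and a 32-byte key space; the function names (uf_cma_experiment, replay_adversary, guessing_adversary) are ours and purely illustrative. An adversary is modelled as a function that receives the two oracles.

```python
import hashlib
import hmac
import os


def mac(key: bytes, message: bytes) -> bytes:
    """Stand-in for MAC_K(M); HMAC-SHA256 is an assumption, not the notes' scheme."""
    return hmac.new(key, message, hashlib.sha256).digest()


def uf_cma_experiment(adversary) -> int:
    """Run the adversary with its two oracles; return 1 iff it forged."""
    key = os.urandom(32)   # K <-$ K
    queried = set()        # messages already sent to the tag-generation oracle
    forged = False

    def tag_oracle(message: bytes) -> bytes:
        queried.add(message)
        return mac(key, message)

    def verify_oracle(message: bytes, t: bytes) -> int:
        nonlocal forged
        ok = hmac.compare_digest(mac(key, message), t)
        # A forgery is an accepted pair whose message had no prior tag-generation query.
        if ok and message not in queried:
            forged = True
        return 1 if ok else 0

    adversary(tag_oracle, verify_oracle)
    return 1 if forged else 0


def replay_adversary(tag_oracle, verify_oracle) -> None:
    """Replays a tag it was given; accepted by the oracle but not counted as a forgery."""
    t = tag_oracle(b"hello")
    verify_oracle(b"hello", t)


def guessing_adversary(tag_oracle, verify_oracle) -> None:
    """Guesses the all-zero tag; succeeds only with negligible probability."""
    verify_oracle(b"hello", bytes(32))


print(uf_cma_experiment(replay_adversary), uf_cma_experiment(guessing_adversary))
```

The advantage of an adversary is the probability that this function returns 1, taken over the key and the adversary's own coins.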
Let us discuss the above definition. Fix a message authentication code MAC. Then we associate to
any adversary $A$ its “advantage,” or “success probability.” We denote this value as $\mathbf{Adv}^{\text{uf-cma}}_{\mathrm{MAC}}(A)$.
It’s just the chance that $A$ manages to forge. The probability is over the choice of key $K$ and the
probabilistic choices, if any, that the adversary $A$ makes.
As usual, the advantage that can be achieved depends both on the adversary strategy and the
resources it uses. Informally, MAC is secure if the advantage of a practical adversary is low.
As usual, there is a certain amount of arbitrariness as to which resources we measure. Certainly
it is important to separate the oracle queries (qs and qv ) from the time. In practice, signing queries
correspond to messages sent by the legitimate sender, and obtaining these is probably more difficult
than just computing on one’s own. Verification queries correspond to messages the adversary hopes
the verifier will accept, so finding out if it does accept these queries again requires interaction. Some
system architectures may effectively limit qs and qv . No system architecture can limit t; that is
limited primarily by the adversary’s budget.
We emphasize that there are contexts in which you are happy with a MAC that makes forgery
impractical when $q_v = 1$ and $q_s = 0$ (an “impersonation attack”) and there are contexts in which
you are happy when forgery is impractical when $q_v = 1$ and $q_s = 1$ (a “substitution attack”). But
it is perhaps more common that you’d like for forgery to be impractical even when $q_s$ is large, like
$2^{50}$, and when $q_v$ is large, too.
Naturally the key $K$ is not directly given to the adversary, and neither are any random choices
or counters used by the MAC-generation algorithm. The adversary sees these things only to the
extent that they are reflected in the answers to her oracle queries.
With a definition for MAC security in hand, it is not hard for us to similarly define authenticity
for encryption schemes and message-authentication schemes. Let us do the former; we will explore
the latter in exercises. We have an encryption scheme Π = (K, E, D) and we want to measure how
effective an adversary is at attacking its authenticity.
$$\mathbf{Adv}^{\text{auth}}_{\Pi}(A) = \Pr\left[\mathbf{Exp}^{\text{auth}}_{\Pi}(A) \Rightarrow 1\right].$$
We note that we could just as well have provided $A$ with a decryption oracle $\mathcal{D}_K(\cdot)$ instead of
a verification oracle $\mathrm{VF}_K(\cdot)$, giving the adversary credit if it ever manages to ask this oracle a
query $C$ that decrypts to something other than ⊥ and where $C$ was not already returned by the
encryption oracle.
7.5 Examples
Let us examine some example message authentication codes and use the definition to assess their
strengths and weaknesses. We fix a PRF $F\colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$. Our first scheme
$\mathrm{MAC1}\colon \mathcal{K} \times \{0,1\}^* \to \{0,1\}^*$ works as follows:
algorithm $\mathrm{MAC1}_K(M)$
    if ($|M| \bmod n \neq 0$ or $|M| = 0$) then return ⊥
    Break $M$ into $n$-bit blocks $M = M_1 \ldots M_m$
    for $i \leftarrow 1$ to $m$ do $Y_i \leftarrow F_K(M_i)$
    $T \leftarrow Y_1 \oplus \cdots \oplus Y_m$
    return $T$
Now let us try to assess the security of this message authentication code.
Suppose the adversary wants to forge the tag of a certain given message $M$. A priori it is unclear
this can be done. The adversary is not in possession of the secret key $K$, so cannot compute $F_K$
and use it to compute $T$. But remember that the notion of security we have defined says that the
adversary is successful as long as it can produce a correct tag for some message, not necessarily
a given one. We now note that even without a chosen-message attack (in fact without seeing any
examples of correctly tagged data) the adversary can do this. It can choose a message $M$ consisting
of two equal blocks, say $M = X \,\|\, X$ where $X$ is some $n$-bit string, set $T \leftarrow 0^n$, and make verification
query $(M, T)$. Notice that $\mathrm{VF}_K(M, T) = 1$ because $F_K(X) \oplus F_K(X) = 0^n = T$. In more detail,
the adversary is as follows.
algorithm $A_1^{\mathrm{MAC}_K(\cdot),\,\mathrm{VF}_K(\cdot,\cdot)}$
    Let $X$ be some $n$-bit string
    $M \leftarrow X \,\|\, X$
    $T \leftarrow 0^n$
    $d \leftarrow \mathrm{VF}_K(M, T)$
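The forgery against MAC1 can be checked with a short Python sketch. The assumptions are ours: $n = 128$ bits, and HMAC-SHA256 truncated to 16 bytes plays the role of the PRF $F$; the names F, mac1 and verify1 are illustrative. The attack itself never evaluates $F$.

```python
import hashlib
import hmac
import os

N_BYTES = 16  # block length n = 128 bits (an assumption for this sketch)


def F(key: bytes, block: bytes) -> bytes:
    """Stand-in PRF on n-bit blocks: truncated HMAC-SHA256."""
    return hmac.new(key, block, hashlib.sha256).digest()[:N_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def mac1(key: bytes, message: bytes):
    """MAC1: xor of F_K applied to each n-bit block; None plays the role of the error symbol."""
    if len(message) == 0 or len(message) % N_BYTES != 0:
        return None
    t = bytes(N_BYTES)
    for i in range(0, len(message), N_BYTES):
        t = xor(t, F(key, message[i:i + N_BYTES]))
    return t


def verify1(key: bytes, message: bytes, t: bytes) -> int:
    expected = mac1(key, message)
    return 1 if expected is not None and hmac.compare_digest(expected, t) else 0


key = os.urandom(32)           # unknown to the adversary
x = b"A" * N_BYTES             # any n-bit string X
forged_message = x + x         # M = X || X
forged_tag = bytes(N_BYTES)    # T = 0^n
d = verify1(key, forged_message, forged_tag)
print(d)
```

The forgery is accepted for every key because the two identical blocks cancel under xor, so the adversary needs no tag-generation queries at all.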
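algorithm $\mathrm{MAC2}_K(M)$
    $\eta \leftarrow n - \iota$
    if ($|M| \bmod \eta \neq 0$ or $|M| = 0$ or $|M|/\eta \geq 2^{\iota}$) then return ⊥
    Break $M$ into $\eta$-bit blocks $M = M_1 \ldots M_m$
    for $i \leftarrow 1$ to $m$ do $Y_i \leftarrow F_K([i]_\iota \,\|\, M_i)$
    $T \leftarrow Y_1 \oplus \cdots \oplus Y_m$
    return $T$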
As the code indicates, we divide $M$ into blocks, but the size of each block is smaller than in
our previous scheme: it is now only $\eta = n - \iota$ bits. Then we prefix the $i$-th message block with the
value $i$ itself, the block index, written in binary as a string of length exactly $\iota$ bits. It is to this
padded block that we apply $F_K$ before taking the xor.
Note that encoding of the block index $i$ as an $\iota$-bit string is only possible if $i < 2^{\iota}$. This means
that we cannot authenticate a message $M$ having $2^{\iota}$ or more blocks. This explains the conditions under
which the MAC returns ⊥. However this is a feasible restriction in practice, since a reasonable value
of $\iota$, like $\iota = 32$, is large enough that very long messages will be in the message space.
Anyway, the question we are really concerned with is the security. Has this improved from
scheme MAC1? Begin by noticing that the attacks we found on MAC1 no longer work. For
example if $X$ is an $\eta$-bit string and we let $M = X \,\|\, X$ then its tag is not likely to be $0^n$. Similarly,
the second attack discussed above, namely that based on permuting of message blocks, also has
low chance of success against the new scheme. Why? In the new scheme, if $M_1, M_2$ are strings of
length $\eta$, then
$$\mathrm{MAC2}_K(M_1 M_2) = F_K([1]_\iota \,\|\, M_1) \oplus F_K([2]_\iota \,\|\, M_2)$$
$$\mathrm{MAC2}_K(M_2 M_1) = F_K([1]_\iota \,\|\, M_2) \oplus F_K([2]_\iota \,\|\, M_1).$$
These are unlikely to be equal. As an exercise, a reader might upper bound the probability that
these values are equal in terms of the value of the advantage of $F$ at appropriate parameter values.
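All the same, MAC2 is still insecure. The attack, however, requires a more non-trivial usage of
the chosen-message attacking ability. The adversary will query the tagging oracle at several related
points and combine the responses into the tag of a new message. We call it $A_2$:
algorithm $A_2^{\mathrm{MAC}_K(\cdot),\,\mathrm{VF}_K(\cdot,\cdot)}$
    Let $A_1, B_1$ be distinct $\eta$-bit strings
    Let $A_2, B_2$ be distinct $\eta$-bit strings
    $T_1 \leftarrow \mathrm{MAC}_K(A_1 A_2)$ ; $T_2 \leftarrow \mathrm{MAC}_K(A_1 B_2)$ ; $T_3 \leftarrow \mathrm{MAC}_K(B_1 A_2)$
    $T \leftarrow T_1 \oplus T_2 \oplus T_3$
    $d \leftarrow \mathrm{VF}_K(B_1 B_2, T)$
The message $B_1 B_2$ was never queried to the tagging oracle, so it remains to check that the
verification query is answered by 1. By the definition of MAC2, the three tags are
$$T_1 = F_K([1]_\iota \,\|\, A_1) \oplus F_K([2]_\iota \,\|\, A_2)$$
$$T_2 = F_K([1]_\iota \,\|\, A_1) \oplus F_K([2]_\iota \,\|\, B_2)$$
$$T_3 = F_K([1]_\iota \,\|\, B_1) \oplus F_K([2]_\iota \,\|\, A_2).$$
Now look how $A_2$ defined $T$ and do the computation; due to cancellations we get
$$T = T_1 \oplus T_2 \oplus T_3 = F_K([1]_\iota \,\|\, B_1) \oplus F_K([2]_\iota \,\|\, B_2).$$
This is indeed the correct tag of $B_1 B_2$, meaning the value $T'$ that $\mathrm{VF}_K(B_1 B_2, T)$ would compute,
so the latter algorithm returns 1, as claimed. In summary we have shown that this scheme is
insecure.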
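The three-query attack on MAC2 can likewise be sketched in Python. The assumptions are ours: $n = 128$ bits, $\iota = 32$ bits, so $\eta = 96$ bits, with truncated HMAC-SHA256 standing in for the PRF $F$; the names F, mac2 and verify2 are illustrative.

```python
import hashlib
import hmac
import os

N_BYTES = 16                        # n = 128 bits (assumption)
IOTA_BYTES = 4                      # iota = 32 bits (assumption)
ETA_BYTES = N_BYTES - IOTA_BYTES    # eta = n - iota


def F(key: bytes, block: bytes) -> bytes:
    """Stand-in PRF on n-bit inputs: truncated HMAC-SHA256."""
    return hmac.new(key, block, hashlib.sha256).digest()[:N_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def mac2(key: bytes, message: bytes):
    """MAC2: xor of F_K([i] || M_i) over the eta-bit blocks M_i; None is the error symbol."""
    if len(message) == 0 or len(message) % ETA_BYTES != 0:
        return None
    m = len(message) // ETA_BYTES
    if m >= 2 ** (8 * IOTA_BYTES):
        return None
    t = bytes(N_BYTES)
    for i in range(1, m + 1):
        block = message[(i - 1) * ETA_BYTES:i * ETA_BYTES]
        t = xor(t, F(key, i.to_bytes(IOTA_BYTES, "big") + block))
    return t


def verify2(key: bytes, message: bytes, t: bytes) -> int:
    expected = mac2(key, message)
    return 1 if expected is not None and hmac.compare_digest(expected, t) else 0


key = os.urandom(32)                              # hidden from the adversary
a1, b1 = b"a" * ETA_BYTES, b"b" * ETA_BYTES       # distinct first blocks
a2, b2 = b"c" * ETA_BYTES, b"d" * ETA_BYTES       # distinct second blocks

# Three tag-generation queries on related messages.
t1 = mac2(key, a1 + a2)
t2 = mac2(key, a1 + b2)
t3 = mac2(key, b1 + a2)

# Combine the answers into a tag for the unqueried message B1 || B2.
forged_tag = xor(xor(t1, t2), t3)
d = verify2(key, b1 + b2, forged_tag)
print(d)
```

Only the oracle answers are combined; the key and the internals of F are never touched, which matches the point that the weakness lies in the mode rather than in the PRF.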
It turns out that a slight modification of the above, based on use of a counter or random number
chosen by the MAC algorithm, actually yields a secure scheme. For the moment however we want
to stress a feature of the above attacks. Namely that these attacks did not cryptanalyze the PRF F .
The attacks did not care anything about the structure of F ; whether it was DES, AES, or anything
else. They found weaknesses in the message authentication schemes themselves. In particular, the
attacks work just as well when FK is a random function, or a “perfect” cipher. This illustrates
again the point we have been making, about the distinction between a tool (here the PRF) and
its usage. We need to make better usage of the tool, and in fact to tie the security of the scheme
to that of the underlying tool in such a way that attacks like those illustrated here are provably
impossible under the assumption that the tool is secure.
algorithm $\mathrm{MAC}_K(M)$
    if ($M \notin D$) then return ⊥
    $T \leftarrow F_K(M)$
    return $T$
Note that when we think of a PRF as a MAC it is important that the domain of the PRF be
whatever one wants as the domain of the MAC. So such a PRF probably won’t be realized as a
blockcipher. It may have to be realized by a PRF that allows for inputs of many different lengths,
since you might want to MAC messages of many different lengths. As yet we haven’t demonstrated
that we can make such PRFs. But we will. Let us first relate the security of the above MAC to
that of the PRF.
Proposition 7.6.1 Let $F\colon \mathcal{K} \times D \to \{0,1\}^{\tau}$ be a family of functions and let MAC be the associated
message authentication code as defined above. Let $A$ be any adversary attacking MAC, making $q_s$
MAC-generation queries of total length $\mu_s$, $q_v$ MAC-verification queries of total length $\mu_v$, and
having running time $t$. Then there exists an adversary $B$ attacking $F$ such that
$$\mathbf{Adv}^{\text{uf-cma}}_{\mathrm{MAC}}(A) \leq \mathbf{Adv}^{\text{prf}}_{F}(B) + \frac{q_v}{2^{\tau}}. \qquad (7.1)$$
Furthermore $B$ makes $q_s + q_v$ oracle queries of total length $\mu_s + \mu_v$ and has running time $t$.
Proof: Remember that $B$ is given an oracle for a function $f\colon D \to \{0,1\}^{\tau}$. It will run $A$, providing
it an environment in which $A$’s oracle queries are answered by $B$.
algorithm $B^f$
    $d \leftarrow 0$ ; $S \leftarrow \emptyset$
    Run $A$
        When $A$ asks its signing oracle some query $M$:
            Answer $f(M)$ to $A$ ; $S \leftarrow S \cup \{M\}$
        When $A$ asks its verification oracle some query $(M, \mathit{Tag})$:
            if $f(M) = \mathit{Tag}$ then
                answer 1 to $A$ ; if $M \notin S$ then $d \leftarrow 1$
            else answer 0 to $A$
    Until $A$ halts
    return $d$
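We claim that
$$\Pr\left[B^{f} \Rightarrow 1\right] = \mathbf{Adv}^{\text{uf-cma}}_{\mathrm{MAC}}(A) \quad \text{when } f = F_K \text{ for a random key } K, \qquad (7.2)$$
$$\Pr\left[B^{f} \Rightarrow 1\right] \leq \frac{q_v}{2^{\tau}} \quad \text{when } f \text{ is a random function from } D \text{ to } \{0,1\}^{\tau}. \qquad (7.3)$$
Subtracting, we get Equation (7.1). Let us now justify the two equations above.
In the first case $f$ is an instance of $F$, so that the simulated environment that $B$ is providing for $A$
is exactly that of experiment $\mathbf{Exp}^{\text{uf-cma}}_{\mathrm{MAC}}(A)$. Since $B$ returns 1 exactly when $A$ makes a successful
verification query, we have Equation (7.2).
In the second case, $A$ is running in an environment that is alien to it, namely one where a random
function is being used to compute MACs. We have no idea what $A$ will do in this environment, but
no matter what, we know that the probability that any particular verification query $(M, \mathit{Tag})$ with
$M \notin S$ will be answered by 1 is at most $2^{-\tau}$, because that is the probability that $\mathit{Tag} = f(M)$.
Since there are at most $q_v$ verification queries, Equation (7.3) follows.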
Figure 7.5: The CBC MAC, here illustrated with a message M of four blocks, M = M1 M2 M3 M4 .
Scheme 7.7.1 [CBC MAC] Let $E\colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a blockcipher. The CBC MAC over
blockcipher $E$ has key space $\mathcal{K}$ and is given by the following algorithm:
algorithm $\mathrm{MAC}_K(M)$
    if $M \notin (\{0,1\}^n)^+$ then return ⊥
    Break $M$ into $n$-bit blocks $M_1 \cdots M_m$
    $C_0 \leftarrow 0^n$
    for $i = 1$ to $m$ do $C_i \leftarrow E_K(C_{i-1} \oplus M_i)$
    return $C_m$
As we will see below, the CBC MAC is secure only if you restrict attention to strings of some
one particular length: the domain is restricted to $\{0,1\}^{mn}$ for some constant $m$. If we apply the
CBC MAC across messages of varying lengths, it will be easy to distinguish this object from a
random function.
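One standard way to see the trouble with varying lengths is the following sketch. It rests on our own assumptions: $n = 128$ bits, and truncated HMAC-SHA256 stands in for $E_K$ (it is a keyed function rather than a true blockcipher, which does not matter for this attack); the names E and cbc_mac are illustrative. Given the tag $T$ of a one-block message $M$, the two-block message $M \,\|\, (M \oplus T)$ has the same tag.

```python
import hashlib
import hmac
import os

N_BYTES = 16  # block length n = 128 bits (assumption)


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for E_K on n-bit blocks: truncated HMAC-SHA256, not a real blockcipher."""
    return hmac.new(key, block, hashlib.sha256).digest()[:N_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_mac(key: bytes, message: bytes):
    """CBC MAC with C_0 = 0^n; None plays the role of the error symbol."""
    if len(message) == 0 or len(message) % N_BYTES != 0:
        return None
    c = bytes(N_BYTES)
    for i in range(0, len(message), N_BYTES):
        c = E(key, xor(c, message[i:i + N_BYTES]))
    return c


key = os.urandom(32)            # hidden from the adversary
m = b"one block message"[:N_BYTES]
t = cbc_mac(key, m)             # a single tag-generation query

# The second block cancels the chaining value, so the chain feeds M into E_K again.
forged_message = m + xor(m, t)
same_tag = cbc_mac(key, forged_message) == t
print(same_tag)
```

The relation $\mathrm{MAC}_K(M \,\|\, (M \oplus T)) = \mathrm{MAC}_K(M)$ holds for every key, which a random function on strings of mixed lengths would satisfy only with tiny probability. Fixing the length at $mn$ bits rules this out, since the queried and forged messages have different lengths.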
Theorem 7.7.2 [2] Fix $n \geq 1$, $m \geq 1$, and $q \geq 2$. Let $A$ be an adversary that asks at most $q$
queries, each of $mn$ bits. Then
$$\mathbf{Adv}^{\text{prf}}_{\mathrm{CBC}[\mathrm{Func}(mn,n)]}(A) \leq \frac{m^2 q^2}{2^n}.$$
Proof: Let $A$ be an adversary that asks exactly $q$ queries and assume without loss of generality
that it never repeats a query. Refer to games C0–C9 in Fig. 7.6. Let us begin by explaining the
notation used there. Each query $M^s$ in the games is required to be a string of blocks, and we silently
parse $M^s$ to $M^s = M^s_1 M^s_2 \cdots M^s_m$ where each $M^s_i$ is a block. Recall that $M^s_{1 \to i} = M^s_1 \cdots M^s_i$. The
function $\pi\colon \{0,1\}^n \to \{0,1\}^n$ is initially undefined at each point. The set $\mathrm{Domain}(\pi)$ grows as we
define points $\pi(X)$, while the complement of $\mathrm{Range}(\pi)$, initially $\{0,1\}^n$, correspondingly shrinks. The table $Y$ stores
blocks and is indexed by strings of blocks $P$ having at most $m$ blocks. A random block will come
to occupy selected entries $Y[X]$ except for $Y[\varepsilon]$, which is initialized to the constant block $0^n$ and
is never changed. The value defined (introduced at line 306) is an arbitrary point of $\{0,1\}^n$, say $0^n$.
Finally, $\mathrm{Prefix}(M^1, \ldots, M^s)$ is the longest string of blocks $P = P_1 \cdots P_p$ that is a prefix of $M^s$ and
is also a prefix of $M^r$ for some $r < s$. If Prefix is applied to a single string the result is the empty
string, $\mathrm{Prefix}(P^1) = \varepsilon$. As an example, letting $A$, $B$, and $C$ be distinct blocks, $\mathrm{Prefix}(ABC) = \varepsilon$,
$\mathrm{Prefix}(ACC, ACB, ABB, ABA) = AB$, and $\mathrm{Prefix}(ACC, ACB, BBB) = \varepsilon$.

Figure 7.6: Games used in the CBC MAC analysis. Let $\mathrm{Prefix}(M^1, \ldots, M^s)$ be $\varepsilon$ if $s = 1$, else
the longest string $P \in (\{0,1\}^n)^*$ s.t. $P$ is a prefix of $M^s$ and $M^r$ for some $r < s$. In each game,
Initialize sets $Y[\varepsilon] \leftarrow 0^n$.
We briefly explain the game chain up until the terminal game. Game C0 is obtained from game C1
by dropping the assignment statements that immediately follow the setting of bad. Game C1 is
a realization of $\mathrm{CBC}^m[\mathrm{Perm}(n)]$ and game C0 is a realization of $\mathrm{Func}(mn,n)$. Games C1 and C0
are designed so that the fundamental lemma applies, so the advantage of $A$ in attacking the CBC
construction is at most $\Pr[A^{\mathrm{C0}} \text{ sets bad}]$.

C0→C2: The C0 → C2 transition is a lossy transition that takes care of bad getting set at line 105,
which clearly happens with probability at most $(0 + 1 + \cdots + (mq - 1))/2^n \leq 0.5\, m^2 q^2 / 2^n$, so
$\Pr[A^{\mathrm{C0}} \text{ sets bad}] \leq \Pr[A^{\mathrm{C2}} \text{ sets bad}] + 0.5\, m^2 q^2 / 2^n$.

C2→C3: Next notice that in game C2 we never actually use the values assigned to $\pi$; all that
matters is that we record that a value had been placed in the domain of $\pi$, and so game C3 does
just that, dropping a fixed value defined $= 0^n$ into $\pi(X)$ when we want $X$ to join the domain of $\pi$.

C3→C4: Now notice that in game C3 the value returned to the adversary, although dropped into
$Y[M^s_1 \cdots M^s_m]$, is never subsequently used in the game, so we could as well choose a random value $Z^s$
and return it to the adversary, doing nothing else with $Z^s$. This is the change made for game C4.
The transition is conservative.

C4→C5: Changing game C4 to C5 is by the “coin-fixing” technique. Coin-fixing in this case amounts
to letting the adversary choose the sequence of queries $M^1, \ldots, M^q$ it asks and the sequence of
answers returned to it. The queries still have to be valid: each $M^s$ is an $mn$-bit string different
from all prior ones; that is the query/response set. For the worst $M^1, \ldots, M^q$, which the coin-fixing
technique fixes, $\Pr[A^{\mathrm{C4}} \text{ sets bad}] \leq \Pr[\mathrm{C5} \text{ sets bad}]$. Remember that, when applicable, coin-fixing
is safe.

C5→C6: Game C6 unrolls the first iteration of the loop at lines 503–507. This transformation is
conservative.

C6→C7: Game C7 is a rewriting of game C6 that omits mention of the variables $C$ and $X$, directly
using values from the $Y$-table instead, whose values are now chosen at the beginning of the game.
The change is conservative.

C7→C8: Game C8 simply re-indexes the for loop at line 705. The change is conservative.

C8→C9: Game C9 restructures the setting of bad inside the loop at 802–807 to set bad in a single
statement. Points were put into the domain of $\pi$ at lines 804 and 807 and we checked if any of these
points coincide with specified other points at lines 803 and 806. The change is conservative.
At this point, we have only to bound $\Pr[A^{\mathrm{C9}} \text{ sets bad}]$. We do this using the sum bound and a case
analysis. Fix any $r, i, s, j$ as specified in line 902. Consider the following ways that bad can get set
to true.

Line 903. We first bound $\Pr[Y[P^r] \oplus M^r_{\|P^r\|_n+1} = Y[P^s] \oplus M^s_{\|P^s\|_n+1}]$. If $P^r = P^s = \varepsilon$ then
$\Pr[Y[P^r] \oplus M^r_{\|P^r\|_n+1} = Y[P^s] \oplus M^s_{\|P^s\|_n+1}] = \Pr[M^r_1 = M^s_1] = 0$ because $M^r$ and $M^s$, having only $\varepsilon$ as a
common block prefix, must differ in their first block. If $P^r = \varepsilon$ but $P^s \neq \varepsilon$ then
$\Pr[Y[P^r] \oplus M^r_{\|P^r\|_n+1} = Y[P^s] \oplus M^s_{\|P^s\|_n+1}] = \Pr[M^r_1 = Y[P^s] \oplus M^s_{\|P^s\|_n+1}] = 2^{-n}$ since the probability expression involves the
single random variable $Y[P^s]$ that is uniformly distributed in $\{0,1\}^n$. If $P^r \neq \varepsilon$ and $P^s = \varepsilon$ the
same reasoning applies. If $P^r \neq \varepsilon$ and $P^s \neq \varepsilon$ then $\Pr[Y[P^r] \oplus M^r_{\|P^r\|_n+1} = Y[P^s] \oplus M^s_{\|P^s\|_n+1}] = 2^{-n}$
unless $P^r = P^s$, so assume that to be the case. Then $\Pr[Y[P^r] \oplus M^r_{\|P^r\|_n+1} = Y[P^s] \oplus M^s_{\|P^s\|_n+1}] =
\Pr[M^r_{\|P^r\|_n+1} = M^s_{\|P^s\|_n+1}] = 0$ because $P^r = P^s$ is the longest block prefix that coincides in $M^r$ and $M^s$.

Line 904. We want to bound $\Pr[Y[P^s] \oplus M^s_{\|P^s\|_n+1} = Y[M^r_{1 \to i}] \oplus M^r_{i+1}]$. If $P^s = \varepsilon$ then
$\Pr[Y[P^s] \oplus M^s_{\|P^s\|_n+1} = Y[M^r_{1 \to i}] \oplus M^r_{i+1}] = \Pr[M^s_{\|P^s\|_n+1} = Y[M^r_{1 \to i}] \oplus M^r_{i+1}] = 2^{-n}$ because it involves a single random value
$Y[M^r_{1 \to i}]$. So assume that $P^s \neq \varepsilon$. Then $\Pr[Y[P^s] \oplus M^s_{\|P^s\|_n+1} = Y[M^r_{1 \to i}] \oplus M^r_{i+1}] = 2^{-n}$ unless
$P^s = M^r_{1 \to i}$, in which case we are looking at $\Pr[M^s_{\|P^s\|_n+1} = M^r_{\|P^s\|_n+1}]$. But this is 0 because $P^s = M^r_{1 \to i}$
means that the longest prefix that $M^s$ shares with $M^r$ is $P^s$ and so $M^s_{\|P^s\|_n+1} \neq M^r_{\|P^s\|_n+1}$.

Line 905. What is $\Pr[Y[M^s_{1 \to j}] \oplus M^s_{j+1} = Y[M^r_{1 \to i}] \oplus M^r_{i+1}]$? It is $2^{-n}$ unless $i = j$ and $M^s_{1 \to j} = M^r_{1 \to i}$.
In that case $\|P^s\|_n \geq j$ and $\|P^r\|_n \geq i$, contradicting our choice of allowed values for $i$ and $j$ at
line 902.

Line 906. We must bound $\Pr[Y[P^r] \oplus M^r_{\|P^r\|_n+1} = Y[M^s_{1 \to j}] \oplus M^s_{j+1}]$. As before, this is $2^{-n}$ unless
$P^r = M^s_{1 \to j}$, but we cannot have that $P^r = M^s_{1 \to j}$ because $j \geq \|P^s\|_n + 1$.

There are at most $0.5\, m^2 q^2$ tuples $(r, i, s, j)$ considered at line 902 and we now know that for
each of them bad gets set with probability at most $2^{-n}$. So $\Pr[\text{Game C9 sets bad}] \leq 0.5\, m^2 q^2 / 2^n$.
Combining with the loss from the C0→C2 transition we have that $\Pr[\text{Game C0 sets bad}] \leq m^2 q^2 / 2^n$,
completing the proof.
7.9 Problems
Problem 39 Consider the following variant of the CBC MAC, intended to allow one to MAC
messages of arbitrary length. The construction uses a blockcipher $E\colon \{0,1\}^k \times \{0,1\}^n \to \{0,1\}^n$,
which you should assume to be secure. The domain for the MAC is $(\{0,1\}^n)^+$. To MAC $M$ under
key $K$ compute $\mathrm{CBC}_K(M \,\|\, |M|)$, where $|M|$ is the length of $M$, written in $n$ bits. Of course $K$ has
$k$ bits. Show that this MAC is completely insecure: break it with a constant number of queries.
Problem 40 Consider the following variant of the CBC MAC, intended to allow one to MAC
messages of arbitrary length. The construction uses a blockcipher $E\colon \{0,1\}^k \times \{0,1\}^n \to \{0,1\}^n$,
which you should assume to be secure. The domain for the MAC is $(\{0,1\}^n)^+$. To MAC $M$ under
key $(K, K')$ compute $\mathrm{CBC}_K(M) \oplus K'$. Of course $K$ has $k$ bits and $K'$ has $n$ bits. Show that this
MAC is completely insecure: break it with a constant number of queries.
Problem 41 Let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be a symmetric encryption scheme and let $\mathcal{MA} = (\mathcal{K}', \mathrm{MAC}, \mathrm{VF})$
be a message authentication code. Alice (A) and Bob (B) share a secret key $K = (K1, K2)$ where
$K1 \leftarrow \mathcal{K}$ and $K2 \leftarrow \mathcal{K}'$. Alice wants to send messages to Bob in a private and authenticated way.
Consider her sending each of the following as a means to this end. For each, say whether it is a
secure way or not, and briefly justify your answer. (In the cases where the method is good, you
don’t have to give a proof, just the intuition.)
(g) EK1 (M, A) where A encodes the identity of Alice; B decrypts the received ciphertext C and
checks that the second half of the plaintext is “A”.
In analyzing these schemes, you should assume that the primitives have the properties guaran-
teed by their definitions, but no more; for an option to be good it must work for any choice of a
secure encryption scheme and a secure message authentication scheme.
Now, out of all the ways you deemed secure, suppose you had to choose one to implement for
a network security application. Taking performance issues into account, do all the schemes look
pretty much the same, or is there one you would prefer?
Problem 42 Refer to problem 4.3. Given a blockcipher $E\colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ construct a
cipher $E'\colon \mathcal{K}' \times \{0,1\}^{2n} \to \{0,1\}^{2n}$. Formalize and prove a theorem that shows that $E'$ is a secure
PRP if $E$ is.
Problem 43 Let $H\colon \{0,1\}^k \times D \to \{0,1\}^L$ be a hash function, and let $\Pi = (\mathcal{K}, \mathrm{MAC}, \mathrm{VF})$ be the
message authentication code defined as follows. The key-generation algorithm $\mathcal{K}$ takes no inputs
and returns a random $k$-bit key $K$, and the tagging and verifying algorithms are:
algorithm $\mathrm{MAC}_K(M)$
    $\mathit{Tag} \leftarrow H(K, M)$
    return $\mathit{Tag}$
algorithm $\mathrm{VF}_K(M, \mathit{Tag}')$
    $\mathit{Tag} \leftarrow H(K, M)$
    if $\mathit{Tag} = \mathit{Tag}'$ then return 1
    else return 0
Show that
for any $t, q, \mu$ with $q \geq 2$, where $t'$ is $t + O(\log(q))$. (This says that if $\Pi$ is a secure message
authentication code then $H$ was a CR2-HK secure hash function.)
Bibliography
[1] M. Bellare, J. Kilian and P. Rogaway. The security of the cipher block chaining
message authentication code. Journal of Computer and System Sciences, Vol. 61, No. 3,
Dec 2000, pp. 362–399.
[2] M. Bellare, R. Canetti and H. Krawczyk. Keying hash functions for message
authentication. Advances in Cryptology – CRYPTO ’96, Lecture Notes in Computer Science
Vol. 1109, N. Koblitz ed., Springer-Verlag, 1996.
[4] J. Black and P. Rogaway. CBC MACs for Arbitrary-Length Messages: The Three-Key
Constructions. Advances in Cryptology – CRYPTO ’00, Lecture Notes in Computer Science
Vol. 1880, M. Bellare ed., Springer-Verlag, 2000.
Chapter 8
Authenticated Encryption
Authenticated encryption is the problem of achieving both privacy and authenticity in the shared-
key setting. Historically, this goal was given very little attention by cryptographers. Perhaps people
assumed that there was nothing to be said: combine what we did on encryption in Chapter 5 and
what we did on message authentication in Chapter 7—end of story, no?
The answer is indeed no. First, from a theoretical point of view, achieving privacy and au-
thenticity together is a new cryptographic goal—something different from achieving privacy and
different from achieving authenticity. We need to look at what this goal actually means. Second,
even if we do plan to achieve authenticated encryption using the tools we’ve already looked at, say
an encryption scheme and a MAC, we still have to figure out how to combine these primitives in
a way that is guaranteed to achieve security. Finally, if our goal is to achieve both privacy and
authenticity then we may be able to achieve efficiency that would not be achievable if treating these
goals separately.
Rest of Chapter to be written.
Chapter 9
$$\mathbf{Z}_N = \{0, 1, \ldots, N-1\}$$
$$\mathbf{Z}_N^* = \{\, i \in \mathbf{Z} : 1 \leq i \leq N-1 \text{ and } \gcd(i, N) = 1 \,\}$$
The first set is called the set of integers mod $N$. Its size is $N$, and it contains exactly the integers
that are possible values of $a \bmod N$ as $a$ ranges over $\mathbf{Z}$. We define the Euler Phi (or totient)
function $\varphi\colon \mathbf{Z}^+ \to \mathbf{N}$ by $\varphi(N) = |\mathbf{Z}_N^*|$ for all $N \in \mathbf{Z}^+$. That is, $\varphi(N)$ is the size of the set $\mathbf{Z}_N^*$.
9.1.2 Groups
Let G be a non-empty set, and let · be a binary operation on G. This means that for every two
points a, b ∈ G, a value a · b is defined.
Definition 9.1.1 Let G be a non-empty set and let · denote a binary operation on G. We say
that G is a group if it has the following properties:
1. Closure: For every a, b ∈ G it is the case that a · b is also in G.
We now return to the sets we defined above and remark on their group structure. Let N be a
positive integer. The operation of addition modulo N takes input any two integers a, b and returns
(a + b) mod N . The operation of multiplication modulo N takes input any two integers a, b and
returns ab mod N .
Fact 9.1.2 Let $N$ be a positive integer. Then $\mathbf{Z}_N$ is a group under addition modulo $N$, and $\mathbf{Z}_N^*$
is a group under multiplication modulo $N$.
In $\mathbf{Z}_N$, the identity element is 0 and the inverse of $a$ is $-a \bmod N$, which equals $N - a$ when $a \neq 0$. In $\mathbf{Z}_N^*$, the identity
element is 1 and the inverse of $a$ is a $b \in \mathbf{Z}_N^*$ such that $ab \equiv 1 \pmod{N}$. It may not be obvious
why such a $b$ even exists, but it does. We do not prove the above fact here.
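A small Python sketch can make these sets and inverses concrete. The modulus $N = 15$ and the helper name z_star are our own choices for illustration; the three-argument pow with exponent $-1$ computes a modular inverse and needs Python 3.8 or later.

```python
from math import gcd


def z_star(n: int) -> list:
    """Elements of Z_n^*: integers in [1, n-1] that are coprime to n."""
    return [i for i in range(1, n) if gcd(i, n) == 1]


N = 15
group = z_star(N)
phi = len(group)  # Euler phi of N, by definition the size of Z_N^*

# Multiplicative inverse of each a in Z_N^*.
mul_inverse = {a: pow(a, -1, N) for a in group}

# Additive inverse of each a in Z_N.
add_inverse = {a: (-a) % N for a in range(N)}

print(group, phi, mul_inverse)
```

For $N = 15$ the group is $\{1, 2, 4, 7, 8, 11, 13, 14\}$, so $\varphi(15) = 8$, and for instance the inverse of 2 is 8 because $2 \cdot 8 = 16 \equiv 1 \pmod{15}$.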
In any group, we can define an exponentiation operation which associates to any a ∈ G and
any integer i a group element we denote a^i, defined as follows. If i = 0 then a^i is defined to be 1,
the identity element of the group. If i > 0 then
    a^i = a · a · · · a    (i copies of a).
If i is negative, then we define a^i = (a^{−1})^{−i}. Put another way, let j = −i, which is positive, and
set
    a^i = a^{−1} · a^{−1} · · · a^{−1}    (j copies of a^{−1}).
With these definitions in place, we can manipulate exponents in the way in which we are accustomed
with ordinary numbers. Namely, identities such as the following hold for all a ∈ G and all i, j ∈ Z:
    a^{i+j} = a^i · a^j
    (a^i)^j = a^{ij}
    a^{−i} = (a^i)^{−1}
    a^{−i} = (a^{−1})^i .
Fact 9.1.3 Let G be a group and let m = |G| be its order. Then a^m = 1 for all a ∈ G.
This means that computation in the exponents can be done modulo m:
Proposition 9.1.4 Let G be a group and let m = |G| be its order. Then a^i = a^{i mod m} for all
a ∈ G and all i ∈ Z.
Bellare and Rogaway 157
We leave it to the reader to prove that this follows from Fact 9.1.3.
Example 9.1.5 Let us work in the group Z*_21 under the operation of multiplication modulo 21.
The members of this group are 1, 2, 4, 5, 8, 10, 11, 13, 16, 17, 19, 20, so the order of the group is
m = 12. Suppose we want to compute 5^86 in this group. Applying Proposition 9.1.4 we have
    5^86 mod 21 = 5^{86 mod 12} mod 21 = 5^2 mod 21 = 25 mod 21 = 4 .
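The exponent reduction in this example can be checked mechanically. The sketch below uses Python's built-in three-argument pow; the variable names are ours.

```python
N = 21
m = 12            # order of the group Z*_21
a, i = 5, 86

direct = pow(a, i, N)         # 5^86 mod 21, computed directly
reduced = pow(a, i % m, N)    # 5^(86 mod 12) mod 21 = 5^2 mod 21
print(direct, reduced)        # 4 4
```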
If G is a group, a set S ⊆ G is called a subgroup if it is a group in its own right, under the same
operation as that under which G is a group. If we already know that G is a group, there is a simple
way to test whether a non-empty set S is a subgroup: it is one if and only if x · y^{−1} ∈ S for all
x, y ∈ S. Here y^{−1} is the inverse of y in G.
Fact 9.1.6 Let G be a group and let S be a subgroup of G. Then the order of S divides the order
of G.
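To make the subgroup test concrete, here is a hedged Python sketch for subsets of Z*_N. The helper name is_subgroup is hypothetical, and the inverse is computed with pow(y, -1, N), which assumes Python 3.8 or later.

```python
def is_subgroup(S, N):
    """Test whether a non-empty subset S of Z*_N is a subgroup,
    by checking that x * y^{-1} mod N lies in S for all x, y in S."""
    S = set(S)
    return bool(S) and all((x * pow(y, -1, N)) % N in S for x in S for y in S)

print(is_subgroup({1, 4, 16}, 21))  # True
print(is_subgroup({1, 2}, 21))      # False
```

In line with Fact 9.1.6, the subgroup {1, 4, 16} of Z*_21 has order 3, which divides the group order 12.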
9.2 Algorithms
Fig. 9.1 summarizes some basic algorithms involving numbers. These algorithms are used to
implement public-key cryptosystems, and thus their running time is an important concern. We begin
with a discussion about the manner in which running time is measured, and then go on to discuss
the algorithms, some very briefly, some in more depth.
Algorithm   Input                          Output                                     Running Time
INT-DIV     a, N     (N > 0)               (q, r) with a = Nq + r and 0 ≤ r < N       O(|a| · |N|)
MOD         a, N     (N > 0)               a mod N                                    O(|a| · |N|)
EXT-GCD     a, b     ((a, b) ≠ (0, 0))     (d, ā, b̄) with d = gcd(a, b) = aā + bb̄     O(|a| · |b|)
MOD-ADD     a, b, N  (a, b ∈ Z_N)          (a + b) mod N                              O(|N|)
MOD-MULT    a, b, N  (a, b ∈ Z_N)          ab mod N                                   O(|N|^2)
MOD-INV     a, N     (a ∈ Z*_N)            b ∈ Z*_N with ab ≡ 1 (mod N)               O(|N|^2)
MOD-EXP     a, n, N  (a ∈ Z_N)             a^n mod N                                  O(|n| · |N|^2)
EXP_G       a, n     (a ∈ G)               a^n ∈ G                                    2|n| G-operations
Figure 9.1: Some basic algorithms and their running time. Unless otherwise indicated, an
input value is an integer and the running time is the number of bit operations. G denotes a group.
calling INT-DIV(a, N ) to get (q, r), and then returning just the remainder r.
Example 9.2.1 The gcd of 20 and 12 is d = gcd(20, 12) = 4. We note that 4 = 20(2) + (12)(−3),
so in this case ā = 2 and b̄ = −3.
Besides the gcd itself, we will find it useful to be able to compute these weights ā, b̄. This
is what the extended-gcd algorithm EXT-GCD does: given a, b as input, it returns (d, ā, b̄) such
that d = gcd(a, b) = aā + bb̄. The algorithm itself is an extension of Euclid’s classic algorithm for
computing the gcd, and the simplest description is a recursive one. We now provide it, and then
discuss the correctness and running time. The algorithm takes input any integers a, b, not both
zero.
Algorithm EXT-GCD(a, b)
    If b = 0 then return (a, 1, 0)
    Else
        (q, r) ← INT-DIV(a, b)
        (d, x, y) ← EXT-GCD(b, r)
        ā ← y
        b̄ ← x − qy
        Return (d, ā, b̄)
    EndIf
The base case is when b = 0. If b = 0 then we know by assumption that a ≠ 0, so gcd(a, b) = a, and
since a = a(1) + b(0), the weights are 1 and 0. If b ≠ 0 then we can divide by it, and we divide a by
it to get a quotient q and remainder r. For the recursion, we use the fact that gcd(a, b) = gcd(b, r).
The recursive call thus yields d = gcd(a, b) together with weights x, y such that d = bx + ry. Noting
that a = bq + r we have
    d = bx + ry = bx + (a − bq)y = ay + b(x − qy) = aā + bb̄ ,
confirming that the values assigned to ā, b̄ are correct.
The running time of this algorithm is O(|a| · |b|), or, put a little more simply, the running time
is quadratic in the length of the longer number. This is not so obvious, and proving it takes some
work. We do not provide this proof here.
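The recursive description translates almost line for line into Python. This is a minimal sketch, assuming non-negative inputs that are not both zero; the names ext_gcd, a_bar and b_bar are ours, and divmod stands in for INT-DIV.

```python
def ext_gcd(a, b):
    """Return (d, a_bar, b_bar) with d = gcd(a, b) = a*a_bar + b*b_bar."""
    if b == 0:
        return (a, 1, 0)
    q, r = divmod(a, b)          # plays the role of INT-DIV(a, b)
    d, x, y = ext_gcd(b, r)      # d = b*x + r*y
    return (d, y, x - q * y)     # d = a*y + b*(x - q*y)

d, a_bar, b_bar = ext_gcd(20, 12)
print(d, 20 * a_bar + 12 * b_bar)  # 4 4
```

The weights are not unique, so the pair returned for (20, 12) need not be the pair (2, −3) used in Example 9.2.1; only the identity d = aā + bb̄ is guaranteed.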
We also want to know an upper bound on the lengths of the weights ā, b̄ output by EXT-GCD(a, b).
The running time bound tells us that |ā|, |b̄| = O(|a| · |b|), but this is not good enough for some of
what follows. I would expect that |ā|, |b̄| = O(|a| + |b|). Is this true? If so, can it be proved by
induction based on the recursive algorithm above?
Algorithm MOD-INV(a, N)
    (d, ā, N̄) ← EXT-GCD(a, N)
    b ← ā mod N
    Return b
Correctness is easy to see. Since a ∈ Z*_N we know that gcd(a, N) = 1. The EXT-GCD algorithm
thus guarantees that d = 1 and 1 = aā + N N̄. Since N mod N = 0, we have 1 ≡ aā (mod N),
and thus b = ā mod N is the right value to return.
The cost of the first step is O(|a| · |N|). The cost of the second step is O(|ā| · |N|). If we
assume that |ā| = O(|a| + |N|) then, since |a| ≤ |N| for a ∈ Z*_N, the overall cost is O(|N|^2),
matching Fig. 9.1. See discussion of the EXT-GCD algorithm regarding this assumption on the
length of ā.
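The MOD-INV procedure can be sketched in Python as follows. The helper ext_gcd is repeated so that the block is self-contained; the names and the explicit error for gcd(a, N) ≠ 1 are our additions, not part of the algorithm as stated.

```python
def ext_gcd(a, b):
    """Return (d, a_bar, b_bar) with d = gcd(a, b) = a*a_bar + b*b_bar."""
    if b == 0:
        return (a, 1, 0)
    q, r = divmod(a, b)
    d, x, y = ext_gcd(b, r)
    return (d, y, x - q * y)

def mod_inv(a, N):
    """Return the inverse of a in Z*_N."""
    d, a_bar, _ = ext_gcd(a, N)
    if d != 1:                       # added check: a must lie in Z*_N
        raise ValueError("a has no inverse modulo N")
    return a_bar % N                 # b = a_bar mod N

print(mod_inv(5, 21))  # 17, since 5 * 17 = 85 = 4 * 21 + 1
```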
y←1
For i = 1, . . . , n do y ← y · a EndFor
Return y
This might at first seem like a satisfactory algorithm, but actually it is very slow. The number of
group operations required is n, and the latter can be as large as the order of the group. Since we
are often looking at groups containing about 2^512 elements, exponentiation by this method is not
feasible. In the language of complexity theory, the problem is that we are looking at an exponential
time algorithm. This is because the running time is exponential in the binary length |n| of the input
n. So we seek a better algorithm. We illustrate the idea of fast exponentiation with an example.
Example 9.2.2 Suppose the binary length of n is 5, meaning the binary representation of n has
the form b_4 b_3 b_2 b_1 b_0. Then
    n = 2^4 b_4 + 2^3 b_3 + 2^2 b_2 + 2^1 b_1 + 2^0 b_0
      = 16 b_4 + 8 b_3 + 4 b_2 + 2 b_1 + b_0 .
In general, we let b_{k−1} . . . b_1 b_0 be the binary representation of n, meaning b_0, . . . , b_{k−1} are bits such
that n = 2^{k−1} b_{k−1} + 2^{k−2} b_{k−2} + · · · + 2^1 b_1 + 2^0 b_0. The algorithm proceeds as follows given any
input a ∈ G and n ∈ Z:
The algorithm uses two group operations per iteration of the loop: one to multiply y by itself,
another to multiply the result by a^{b_i}. (The computation of a^{b_i} is without cost, since this is just
a if b_i = 1 and 1 if b_i = 0.) So its total cost is 2k = 2|n| group operations. (We are ignoring the
cost of the one possible inversion in the case n < 0.) (This is the worst case cost. We observe that
it actually takes |n| + W_H(n) group operations, where W_H(n) is the number of ones in the binary
representation of n.)
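The loop described above (square y, then multiply by a^{b_i}, scanning the bits from most significant to least) can be sketched in Python. This is our reconstruction from the cost analysis, not a transcription of the algorithm's listing; the function name exp_g and the way the group is passed in (a multiplication function, an identity and an inversion function) are assumptions made for illustration.

```python
def exp_g(a, n, mul, identity, inv):
    """Compute a^n in a group using at most 2|n| group operations."""
    if n < 0:                    # one inversion: a^n = (a^{-1})^{-n}
        a, n = inv(a), -n
    y = identity
    for bit in bin(n)[2:]:       # bits b_{k-1}, ..., b_0, most significant first
        y = mul(y, y)            # multiply y by itself
        if bit == "1":
            y = mul(y, a)        # multiply by a^{b_i}; skipped when b_i = 0
    return y

# Usage with the group Z*_21 under multiplication modulo 21.
N = 21
mul = lambda x, y: (x * y) % N
inv = lambda x: pow(x, -1, N)    # requires Python 3.8+
print(exp_g(5, 86, mul, 1, inv))  # 4
```

Skipping the multiplication when b_i = 0 is what gives the |n| + W_H(n) count mentioned in the text.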
We will typically use this algorithm when the group G is Z*_N and the group operation is
multiplication modulo N, for some positive integer N. We have denoted this algorithm by MOD-EXP in
Fig. 9.1. (The input a is not required to be relatively prime to N even though it usually will be, so
is listed as coming from Z_N.) In that case, each group operation is implemented via MOD-MULT
and takes O(|N|^2) time, so the running time of the algorithm is O(|n| · |N|^2). Since n is usually
in Z_N, this comes to O(|N|^3). The salient fact to remember is that modular exponentiation is a
cubic time algorithm.
Example 9.3.1 Let p = 11, which is prime. Then Z*_11 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} has order
p − 1 = 10. Let us find the subgroups generated by group elements 2 and 5. We raise them to the
powers i = 0, . . . , 9. We get:
    i             0   1   2   3   4   5   6   7   8   9
    2^i mod 11    1   2   4   8   5  10   9   7   3   6
    5^i mod 11    1   5   3   4   9   1   5   3   4   9
Looking at which elements appear in the rows corresponding to 2 and 5, respectively, we can
determine the subgroups these group elements generate:
    ⟨2⟩ = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    ⟨5⟩ = {1, 3, 4, 5, 9} .
Since ⟨2⟩ equals Z*_11, the element 2 is a generator. Since a generator exists, Z*_11 is cyclic. On the
other hand, ⟨5⟩ ≠ Z*_11, so 5 is not a generator. The order of 2 is 10, while the order of 5 is 5.
Note that these orders divide the order 10 of the group. The table also enables us to determine the
discrete logarithms to base 2 of the different group elements:
    a                   1   2   3   4   5   6   7   8   9  10
    DLog_{Z*_11,2}(a)   0   1   8   2   4   9   7   3   6   5
Later we will see a way of identifying all the generators given that we know one of them.
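The tables in this example can be reproduced with a few lines of Python. The function generated_subgroup is a hypothetical helper of ours that lists successive powers of g until the identity recurs; it is a brute-force sketch suitable only for tiny groups.

```python
def generated_subgroup(g, p):
    """Return [g^0, g^1, ...] in Z*_p, stopping just before the powers cycle back to 1."""
    elems, y = [], 1
    while True:
        elems.append(y)
        y = (y * g) % p
        if y == 1:
            return elems

p = 11
sub2 = generated_subgroup(2, p)
sub5 = generated_subgroup(5, p)
dlog2 = {a: i for i, a in enumerate(sub2)}   # discrete logarithms to base 2

print(sub2)       # [1, 2, 4, 8, 5, 10, 9, 7, 3, 6]
print(sub5)       # [1, 5, 3, 4, 9]
print(dlog2[3])   # 8
```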
The discrete exponentiation function is conjectured to be one-way (meaning the discrete logarithm
function is hard to compute) for some cyclic groups G. For this reason we often seek cyclic
groups for cryptographic usage. Here are three sources of such groups. We will not prove any of
the facts below; their proofs can be found in books on algebra.
The operation here is multiplication modulo p, and the size of this group is ϕ(p) = p − 1. This is
the most common choice of group in cryptography.
Fact 9.3.3 Let G be a group and let m = |G| be its order. If m is a prime number, then G is
cyclic.
In other words, any group having a prime number of elements is cyclic. Note that it is not for this
reason that Fact 9.3.2 is true, since the order of Z*_p (where p is prime) is p − 1, which is 1 if
p = 2 and is an even number greater than 2 if p ≥ 5, and is thus not a prime number except when p = 3.
The following is worth knowing if you have some acquaintance with finite fields. Recall that
a field is a set F equipped with two operations, an addition and a multiplication. The identity
element of the addition is denoted 0. When this is removed from the field, what remains is a group
under multiplication. This group is always cyclic.
Fact 9.3.4 Let F be a finite field, and let F* = F − {0}. Then F* is a cyclic group under the
multiplication operation of F.
A finite field of order m exists if and only if m = p^n for some prime p and integer n ≥ 1. The finite
field of order p is exactly Z_p, so the case n = 1 of Fact 9.3.4 implies Fact 9.3.2. Another interesting
special case of Fact 9.3.4 is when the order of the field is 2^n, meaning p = 2, yielding a cyclic group
of order 2^n − 1.
When we want to use a cyclic group G in cryptography, we will often want to find a generator
for it. The process used is to pick group elements in some appropriate way, and then test each
chosen element to see whether it is a generator. One thus has to solve two problems. One is how to
test whether a given group element is a generator, and the other is what process to use to choose
the candidate generators to be tested.
Let m = |G| and let 1 be the identity element of G. The obvious way to test whether a given
g ∈ G is a generator is to compute the values g^1, g^2, g^3, . . . , stopping at the first j such that g^j = 1.
If j = m then g is a generator. This test however can require up to m group operations, which is
not efficient, given that the groups of interest are large, so we need better tests.
The obvious way to choose candidate generators is to cycle through the entire group in some
way, testing each element in turn. Even with a fast test, this can take a long time, since the group
is large. So we would also like better ways of picking candidates.
We address these problems in turn. Let us first look at testing whether a given g ∈ G is a
generator. One sees quickly that computing all powers of g as in g^1, g^2, g^3, . . . is not necessary. For
example if we computed g^8 and found that this is not 1, then we know that g^4 ≠ 1 and g^2 ≠ 1
and g ≠ 1. More generally, if we know that g^j ≠ 1 then we know that g^i ≠ 1 for all i dividing j.
This tells us that it is better to first compute high powers of g, and use that to cut down the space
of exponents that need further testing. The following Proposition pinpoints the optimal way to do
this. It identifies a set of exponents m_1, . . . , m_n such that one need only test whether g^{m_i} ≠ 1 for
i = 1, . . . , n. As we will argue later, this set is quite small.
Proposition 9.3.5 Let G be a cyclic group and let m = |G| be the size of G. Let p_1^{α_1} · · · p_n^{α_n} be
the prime factorization of m and let m_i = m/p_i for i = 1, . . . , n. Let g ∈ G. Then g is a generator
of G if and only if
    For all i = 1, . . . , n: g^{m_i} ≠ 1 ,    (9.1)
where 1 is the identity element of G.
Proof of Proposition 9.3.5: First suppose that g is a generator of G. Then we know that the
smallest positive integer j such that g^j = 1 is j = m. Since 0 < m_i < m, it must be that g^{m_i} ≠ 1
for all i = 1, . . . , n.
Conversely, suppose g satisfies the condition of Equation (9.1). We want to show that g is a
generator. Let j be the order of g, meaning the smallest positive integer such that g^j = 1. Then we
know that j must divide the order m of the group, meaning m = dj for some integer d ≥ 1. This
implies that j = p_1^{β_1} · · · p_n^{β_n} for some integers β_1, . . . , β_n satisfying 0 ≤ β_i ≤ α_i for all i = 1, . . . , n.
If j < m then there must be some i such that β_i < α_i, and in that case j divides m_i, which in turn
implies g^{m_i} = 1 (because g^j = 1). So the assumption that Equation (9.1) is true implies that j
cannot be strictly less than m, so the only possibility is j = m, meaning g is a generator.
The number n of terms in the prime factorization of m cannot be more than lg(m), the binary
logarithm of m. (This is because p_i ≥ 2 and α_i ≥ 1 for all i = 1, . . . , n.) So, for example, if the
group has size about 2^512, then at most 512 tests are needed. So testing is quite efficient. One
should note however that it requires knowing the prime factorization of m.
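The test of Proposition 9.3.5 is short to write down for the group Z*_p. In the sketch below, the function name is_generator is ours, and the caller is assumed to supply the distinct primes dividing m = p − 1, since the test requires knowing that factorization.

```python
def is_generator(g, p, prime_factors):
    """Proposition 9.3.5 for G = Z*_p: g is a generator iff
    g^(m / p_i) != 1 for every prime p_i dividing m = p - 1."""
    m = p - 1
    return all(pow(g, m // q, p) != 1 for q in prime_factors)

# m = 10 = 2 * 5, so the exponents tested are 10/2 = 5 and 10/5 = 2.
gens = [a for a in range(1, 11) if is_generator(a, 11, [2, 5])]
print(gens)  # [2, 6, 7, 8]
```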
Let us now consider the second problem we discussed above, namely how to choose candidate
group elements for testing. There seems little reason to think that trying all group elements in turn
will yield a generator in a reasonable amount of time. Instead, we consider picking group elements
at random, and then testing them. The probability of success in any trial is |Gen(G)|/|G|. So the
expected number of trials before we find a generator is |G|/|Gen(G)|. To estimate the efficacy of this
method, we thus need to know the number of generators in the group. The following Proposition
gives a characterization of the generator set which in turn tells us its size.
Proposition 9.3.6 Let G be a cyclic group of order m, and let g be a generator of G. Then
Gen(G) = { g^i ∈ G : i ∈ Z*_m } and |Gen(G)| = ϕ(m).
That is, having fixed one generator g, a group element h is a generator if and only if its discrete
logarithm to base g is relatively prime to the order m of the group. As a consequence, the number
of generators is the number of integers in the range 1, . . . , m − 1 that are relatively prime to m.
Proof of Proposition 9.3.6: Given that Gen(G) = { g^i ∈ G : i ∈ Z*_m }, the claim about its size
follows easily:
    |Gen(G)| = |{ g^i ∈ G : i ∈ Z*_m }| = |Z*_m| = ϕ(m) .
We now prove that Gen(G) = { g^i ∈ G : i ∈ Z*_m }. First, we show that if i ∈ Z*_m then g^i ∈ Gen(G).
Second, we show that if i ∈ Z_m − Z*_m then g^i ∉ Gen(G).
So first suppose i ∈ Z*_m, and let h = g^i. We want to show that h is a generator of G. It suffices to
show that the only possible value of j ∈ Z_m such that h^j = 1 is j = 0, so let us now show this. Let
j ∈ Z_m be such that h^j = 1. Since h = g^i we have
    1 = h^j = g^{ij mod m} .
Since g is a generator, it must be that ij ≡ 0 (mod m), meaning m divides ij. But i ∈ Z*_m so
gcd(i, m) = 1. So it must be that m divides j. But j ∈ Z_m and the only member of this set
divisible by m is 0, so j = 0 as desired.
Next, suppose i ∈ Z_m − Z*_m and let h = g^i. To show that h is not a generator it suffices to show that
there is some non-zero j ∈ Z_m such that h^j = 1. Let d = gcd(i, m). Our assumption i ∈ Z_m − Z*_m
implies that d > 1. Let j = m/d, which is a non-zero integer in Z_m because d > 1. Then the
following shows that h^j = 1, completing the proof:
    h^j = g^{ij} = g^{i·m/d} = g^{m·i/d} = (g^m)^{i/d} = 1^{i/d} = 1.
We used here the fact that d divides i and that g^m = 1.
Example 9.3.7 Let us determine all the generators of the group Z*_11. Let us first use Proposition 9.3.5.
The size of Z*_11 is m = ϕ(11) = 10, and the prime factorization of 10 is 2^1 · 5^1. Thus, the test for
whether a given a ∈ Z*_11 is a generator is that a^2 ≢ 1 (mod 11) and a^5 ≢ 1 (mod 11). Let us
compute a^2 mod 11 and a^5 mod 11 for all group elements a. We get:
    a             1   2   3   4   5   6   7   8   9  10
    a^2 mod 11    1   4   9   5   3   3   5   9   4   1
    a^5 mod 11    1  10   1   1   1  10  10  10   1  10
The generators are those a for which the corresponding column has no entry equal to 1, meaning
in both rows, the entry for this column is different from 1. So
    Gen(Z*_11) = {2, 6, 7, 8} .
Now, let us use Proposition 9.3.6 and double-check that we get the same thing. We saw in
Example 9.3.1 that 2 was a generator of Z*_11. As per Proposition 9.3.6, the set of generators
is
    Gen(Z*_11) = { 2^i mod 11 : i ∈ Z*_10 } .
This is because the size of the group is m = 10. Now, Z*_10 = {1, 3, 7, 9}. The values of 2^i mod 11
as i ranges over this set can be obtained from the table in Example 9.3.1 where we computed all
the powers of 2. So
    { 2^i mod 11 : i ∈ Z*_10 } = {2^1 mod 11, 2^3 mod 11, 2^7 mod 11, 2^9 mod 11}
                               = {2, 6, 7, 8} .
This is the same set we obtained above via Proposition 9.3.5. If we try to find a generator by picking
group elements at random and then testing using Proposition 9.3.5, each trial has probability of
success ϕ(10)/10 = 4/10, so we would expect to find a generator in 10/4 trials. We can optimize
slightly by noting that 1 and −1 can never be generators, and thus we only need pick candidates
randomly from Z*_11 − {1, 10}. In that case, each trial has probability of success ϕ(10)/8 = 4/8 = 1/2,
so we would expect to find a generator in 2 trials.
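The second half of this example, which applies Proposition 9.3.6 to the known generator 2, can be sketched in Python as follows. The variable names are ours, and Fraction is used only to display the success probability exactly.

```python
from math import gcd
from fractions import Fraction

p, g = 11, 2
m = p - 1                                              # order of Z*_11
z_star_m = [i for i in range(1, m) if gcd(i, m) == 1]  # Z*_10
gens = sorted(pow(g, i, p) for i in z_star_m)          # { 2^i mod 11 : i in Z*_10 }

# Success probability when sampling from Z*_11 - {1, 10}.
prob = Fraction(len(gens), (p - 1) - 2)

print(z_star_m)  # [1, 3, 7, 9]
print(gens)      # [2, 6, 7, 8]
print(prob)      # 1/2
```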
When we want to work in a cyclic group in cryptography, the most common choice is to work
over Z*_p for a suitable prime p. The algorithm for finding a generator would be to repeat the process
of picking a random group element and testing it, halting when a generator is found. In order to
make this possible we choose p in such a way that the prime factorization of the order p − 1 of
Z*_p is known. In order to make the testing fast, we choose p so that p − 1 has few prime factors.
Accordingly, it is common to choose p to equal 2q + 1 for some prime q. In this case, the prime
factorization of p − 1 is 2^1 · q^1, so we need raise a candidate to only two powers to test whether or not
it is a generator. In choosing candidates, we optimize slightly by noting that 1 and −1 are never
generators, and accordingly pick the candidates from Z*_p − {1, p − 1} rather than from Z*_p. So the
algorithm is as follows:
Algorithm FIND-GEN(p)
    q ← (p − 1)/2
    found ← 0
    While (found ≠ 1) do
        g ←$ Z*_p − {1, p − 1}
        If (g^2 mod p ≠ 1) and (g^q mod p ≠ 1) then found ← 1
    EndWhile
    Return g
Proposition 9.3.5 tells us that the group element g returned by this algorithm is always a generator
of Z*_p. By Proposition 9.3.6, the probability that an iteration of the algorithm is successful in
finding a generator is
    |Gen(Z*_p)| / (|Z*_p| − 2) = ϕ(p − 1)/(p − 3) = ϕ(2q)/(2q − 2) = (q − 1)/(2q − 2) = 1/2 .
Thus the expected number of iterations of the while loop is 2. Above, we used the fact that
ϕ(2q) = q − 1, which is true because q is an odd prime.
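A direct Python rendering of FIND-GEN might look as follows. It is a sketch under the stated assumption that p = 2q + 1 with q an odd prime, which the code does not verify, and it uses the random module, which is not suitable for cryptographic sampling.

```python
import random

def find_gen(p):
    """Return a generator of Z*_p, assuming p = 2q + 1 with q an odd prime."""
    q = (p - 1) // 2
    while True:
        g = random.randrange(2, p - 1)     # uniform over Z*_p - {1, p - 1}
        if pow(g, 2, p) != 1 and pow(g, q, p) != 1:
            return g

g = find_gen(11)
print(g in {2, 6, 7, 8})  # True
```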
for all a ∈ Z. We call J_p(a) the Legendre symbol of a. Thus, the Legendre symbol is simply a
compact notation for telling us whether or not its argument is a square modulo p.
Before we move to developing the theory, it may be useful to look at an example.
Example 9.4.1 Let p = 11, which is prime. Then Z*_11 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} has order
p − 1 = 10. A simple way to determine QR(Z*_11) is to square all the group elements in turn:
    a             1   2   3   4   5   6   7   8   9  10
    a^2 mod 11    1   4   9   5   3   3   5   9   4   1
The squares are exactly those elements that appear in the second row, so
    QR(Z*_11) = {1, 3, 4, 5, 9} .
The number of squares is 5, which we notice equals (p − 1)/2. This is not a coincidence, as we will
see. Also notice that each square has exactly two different square roots. (The square roots of 1 are
1 and 10; the square roots of 3 are 5 and 6; the square roots of 4 are 2 and 9; the square roots of 5
are 4 and 7; the square roots of 9 are 3 and 8.)
Since 11 is prime, we know that Z*_11 is cyclic, and as we saw in Example 9.3.1, 2 is a generator.
(As a side remark, we note that a generator must be a non-square. Indeed, if a = b^2 is a square,
then a^5 = b^10 = 1 modulo 11 because 10 is the order of the group. So a^j = 1 modulo 11 for
some positive j < 10, which means a is not a generator. However, not all non-squares need be
generators.) Below, we reproduce from that example the table of discrete logarithms of the group
elements. We also add below it a row providing the Legendre symbols, which we know because,
above, we identified the squares. We get:
    a                   1   2   3   4   5   6   7   8   9  10
    DLog_{Z*_11,2}(a)   0   1   8   2   4   9   7   3   6   5
    J_11(a)             1  −1   1   1   1  −1  −1  −1   1  −1
We observe that the Legendre symbol of a is 1 if its discrete logarithm is even, and −1 if the discrete
logarithm is odd, meaning the squares are exactly those group elements whose discrete logarithm
is even. It turns out that this fact is true regardless of the choice of generator.
As we saw in the above example, the fact that Z*_p is cyclic is useful in understanding the
structure of the subgroup of quadratic residues QR(Z*_p). The following Proposition summarizes
some important elements of this connection.
so g^y is also a square root of a. Since i is an even number in Z_{p−1} and p − 1 is even, it must be that
0 ≤ x < (p − 1)/2. It follows that (p − 1)/2 ≤ y < p − 1. Thus x ≠ y. This means that a has at
least two square roots. This is true for each of the (p − 1)/2 squares mod p. So the only possibility
is that each of these squares has exactly two square roots.
Suppose we are interested in knowing whether or not a given a ∈ Z*_p is a square mod p, meaning
we want to know the value of the Legendre symbol J_p(a). Proposition 9.4.2 tells us that
    J_p(a) = (−1)^{DLog_{Z*_p,g}(a)} ,
where g is any generator of Z*_p. This however is not very useful in computing J_p(a), because it
requires knowing the discrete logarithm of a, which is hard to compute. The following Proposition
says that the Legendre symbol of a modulo an odd prime p can be obtained by raising a to the
power (p − 1)/2, and helps us compute the Legendre symbol.
Now one can determine whether or not a is a square mod p by running the algorithm MOD-EXP
on inputs a, (p − 1)/2, p. If the algorithm returns 1 then a is a square mod p, and if it returns p − 1
(which is the same as −1 mod p) then a is a non-square mod p. Thus, the Legendre symbol can be
computed in time cubic in the length of p.
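The computation just described amounts to one modular exponentiation. The Python sketch below uses the built-in pow in place of MOD-EXP; the function name legendre is ours, and returning 0 when p divides a is an assumption that follows the usual convention for the Legendre symbol.

```python
def legendre(a, p):
    """Legendre symbol J_p(a) for an odd prime p, via a^((p-1)/2) mod p."""
    a %= p
    if a == 0:
        return 0                                  # assumed convention when p divides a
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

print([legendre(a, 11) for a in range(1, 11)])
# [1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
```

The printed row agrees with the J_11(a) row of the table in Example 9.4.1.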
Towards the proof of Proposition 9.4.3, we begin with the following lemma which is often useful
in its own right.
Proof of Lemma 9.4.4: We begin by observing that 1 and −1 are both square roots of 1 mod
p, and are distinct. (It is clear that squaring either of these yields 1, so they are square roots
of 1. They are distinct because −1 equals p − 1 mod p, and p − 1 ≠ 1 because p ≥ 3.) By
Proposition 9.4.2, these are the only square roots of 1. Now let
    b = g^{(p−1)/2} mod p .
Then b^2 ≡ 1 (mod p), so b is a square root of 1. By the above b can only be 1 or −1. However,
since g is a generator, b cannot be 1. (The smallest positive value of i such that g^i is 1 mod p is
i = p − 1.) So the only choice is that b ≡ −1 (mod p), as claimed.
Proof of Proposition 9.4.3: By definition of the Legendre symbol, we need to show that
    a^{(p−1)/2} ≡ 1 (mod p)    if a is a square mod p
    a^{(p−1)/2} ≡ −1 (mod p)   otherwise.
Let g be a generator of Z*_p and let i = DLog_{Z*_p,g}(a). We consider separately the cases of a being a
square and a being a non-square.
Suppose a is a square mod p. Then Proposition 9.4.2 tells us that i is even. In that case
    a^{(p−1)/2} ≡ (g^i)^{(p−1)/2} ≡ g^{i·(p−1)/2} ≡ (g^{p−1})^{i/2} ≡ 1 (mod p) ,
as desired.
Now suppose a is a non-square mod p. Then Proposition 9.4.2 tells us that i is odd. In that case
    a^{(p−1)/2} ≡ (g^i)^{(p−1)/2} ≡ g^{i·(p−1)/2} ≡ g^{(i−1)·(p−1)/2 + (p−1)/2}
                ≡ (g^{p−1})^{(i−1)/2} · g^{(p−1)/2} ≡ g^{(p−1)/2} (mod p) .
However Lemma 9.4.4 tells us that the last quantity is −1 modulo p, as desired.
The following Proposition says that ab mod p is a square if and only if either both a and b are
squares, or if both are non-squares. But if one is a square and the other is not, then ab mod p is
a non-square. This can be proved by using either Proposition 9.4.2 or Proposition 9.4.3. We use
the latter in the proof. You might try, as an exercise, to reprove the result using Proposition 9.4.2
instead.
A quantity of cryptographic interest is the Diffie-Hellman (DH) key. Having fixed a cyclic group
G and generator g for it, the DH key associated to elements X = g^x and Y = g^y of the group is
the group element g^{xy}. The following Proposition tells us that the DH key is a square if either X
or Y is a square, and otherwise is a non-square.
The above Propositions, combined with Proposition 9.4.3 (which tells us that quadratic residuosity
modulo a prime can be efficiently tested), will later lead us to pinpoint weaknesses in certain
cryptographic schemes in Z*_p.
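The statement about when the DH key is a square can be checked exhaustively in a small group. The following Python sketch does so for Z*_11 with generator 2; the variable names are ours, and the check is an illustration rather than a proof.

```python
p, g = 11, 2
m = p - 1
squares = {pow(a, 2, p) for a in range(1, p)}     # QR(Z*_11)

square_keys = 0
for x in range(m):
    for y in range(m):
        X, Y = pow(g, x, p), pow(g, y, p)
        K = pow(g, x * y, p)                      # the DH key g^{xy}
        # The key is a square exactly when X or Y is a square.
        assert (K in squares) == (X in squares or Y in squares)
        square_keys += K in squares

print(square_keys, m * m)  # 75 100
```

So for uniformly chosen x, y the DH key in this group is a square three quarters of the time, which is far from what a uniform group element would give.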
Proof of Proposition 9.5.1: It suffices to show that the order of h is q. We know that the
order of any group element must divide the order of the group. Since the group has prime order
q, the only possible values for the order of h are 1 and q. But h does not have order 1 since it is
non-trivial, so it must have order q.
A common way to obtain a group of prime order for cryptographic schemes is as a subgroup of a
group of integers modulo a prime. We pick a prime p having the property that q = (p − 1)/2 is also
prime. It turns out that the subgroup of quadratic residues modulo p then has order q, and hence
is a group of prime order. The following proposition summarizes the facts for future reference.
Proposition 9.5.2 Let q ≥ 3 be a prime such that p = 2q + 1 is also prime. Then QR(Z*_p) is a
group of prime order q. Furthermore, if g is any generator of Z*_p, then g^2 mod p is a generator of
QR(Z*_p).
Note that the operation under which QR(Z*_p) is a group is multiplication modulo p, the same
operation under which Z*_p is a group.
Proof of Proposition 9.5.2: We know that QR(Z*_p) is a subgroup, hence a group in its own
right. Proposition 9.4.2 tells us that |QR(Z*_p)| is (p − 1)/2, which equals q in this case. Now let g
be a generator of Z*_p and let h = g^2 mod p. We want to show that h is a generator of QR(Z*_p). As
per Proposition 9.5.1, we need only show that h is non-trivial, meaning h ≠ 1. Indeed, we know
that g^2 ≢ 1 (mod p), because g, being a generator, has order p − 1 and our assumptions imply
p − 1 > 2.
Example 9.5.3 Let q = 5 and p = 2q + 1 = 11. Both p and q are primes. We know from
Example 9.4.1 that
    QR(Z*_11) = {1, 3, 4, 5, 9} .
This is a group of prime order 5. We know from Example 9.3.1 that 2 is a generator of Z*_11.
Proposition 9.5.2 tells us that 4 = 2^2 is a generator of QR(Z*_11). We can verify this by raising 4 to
the powers i = 0, . . . , 4:
    i             0   1   2   3   4
    4^i mod 11    1   4   5   9   3
We see that the elements of the last row are exactly those of the set QR(Z*_11).
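The verification in this example is easy to repeat in code. A minimal Python sketch, with variable names of our choosing:

```python
p, g = 11, 2
q = (p - 1) // 2

qr = sorted({pow(a, 2, p) for a in range(1, p)})   # QR(Z*_11) by squaring everything
h = pow(g, 2, p)                                   # candidate generator g^2 mod p
powers = [pow(h, i, p) for i in range(q)]          # h^0, ..., h^{q-1}

print(qr)      # [1, 3, 4, 5, 9]
print(powers)  # [1, 4, 5, 9, 3]
```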
Let us now explain what we perceive to be the advantage conferred by working in a group of
prime order. Let G be a cyclic group, and g a generator. We know that the discrete logarithms to
base g range in the set Zm where m = |G| is the order of G. This means that arithmetic in these
exponents is modulo m. If G has prime order, then m is prime. This means that any non-zero
exponent has a multiplicative inverse modulo m. In other words, in working in the exponents, we
can divide. It is this that turns out to be useful.
As an example illustrating how we use this, let us return to the problem of the distribution of
the DH key that we looked at in Section 9.4. Recall the question is that we draw x, y independently
at random from Z_m and then ask how g^{xy} is distributed over G. We saw that when G = Z*_p for a
prime p ≥ 3, this distribution was noticeably different from uniform. In a group of prime order, the
distribution of the DH key, in contrast, is very close to uniform over G. It is not quite uniform,
because the identity element of the group has a slightly higher probability of being the DH key than
other group elements, but the deviation is small enough to be negligible for groups of reasonably
large size. The following proposition summarizes the result.
Proposition 9.5.4 Suppose G is a group of order q where q is a prime, and let g be a generator
of G. Then for any Z ∈ G we have
    Pr[ x ←$ Z_q ; y ←$ Z_q : g^{xy} = Z ] = (1/q) · (1 − 1/q)   if Z ≠ 1
    Pr[ x ←$ Z_q ; y ←$ Z_q : g^{xy} = Z ] = (1/q) · (2 − 1/q)   if Z = 1,
where 1 denotes the identity element of G.
Proof of Proposition 9.5.4: First suppose Z = 1. The DH key g^{xy} is 1 if and only if either x
or y is 0 modulo q. Each is 0 with probability 1/q and these events are independent, so the
probability that either x or y is 0 is 2/q − 1/q^2, as claimed.
Now suppose Z ≠ 1. Let z = DLog_{G,g}(Z), meaning z ∈ Z*_q and g^z = Z. We will have g^{xy} = Z
if and only if xy ≡ z (mod q), by the uniqueness of the discrete logarithm. For any fixed
x ∈ Z*_q, there is exactly one y ∈ Z_q for which xy ≡ z (mod q), namely y = x^{−1} z mod q, where
x^{−1} is the multiplicative inverse of x in the group Z*_q. (Here we are making use of the fact that
q is prime, since otherwise the inverse of x modulo q may not exist.) Now, suppose we choose x
at random from Z_q. If x = 0 then, regardless of the choice of y ∈ Z_q, we will not have xy ≡ z
(mod q), because z ≢ 0 (mod q). On the other hand, if x ≠ 0 then there is exactly 1/q probability
that the randomly chosen y is such that xy ≡ z (mod q). So the probability that xy ≡ z (mod q)
when both x and y are chosen at random in Z_q is
    ((q − 1)/q) · (1/q) = (1/q) · (1 − 1/q)
as desired. Here, the first term is because when we choose x at random from Z_q, it has probability
(q − 1)/q of landing in Z*_q.
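The two probabilities in Proposition 9.5.4 can be confirmed by exhaustive counting in a small prime-order group. The Python sketch below uses G = QR(Z*_11), which has order q = 5 and generator 4 by Example 9.5.3; the choice of group and the variable names are ours.

```python
from collections import Counter
from fractions import Fraction

p, q, g = 11, 5, 4          # G = QR(Z*_11), of prime order q = 5, generated by 4

# Count how often each group element arises as g^{xy} over all (x, y) in Z_q x Z_q.
counts = Counter(pow(g, x * y, p) for x in range(q) for y in range(q))
probs = {Z: Fraction(c, q * q) for Z, c in counts.items()}

print(probs[1])   # 9/25, which is (1/q)(2 - 1/q)
print(probs[4])   # 4/25, which is (1/q)(1 - 1/q)
```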
Number-Theoretic Primitives
Number theory is a source of several computational problems that serve as primitives in the design
of cryptographic schemes. Asymmetric cryptography in particular relies on these primitives. As
with other beasts that we have been calling “primitives,” these computational problems exhibit
some intractability features, but by themselves do not solve any cryptographic problem directly
relevant to a user security goal. But appropriately applied, they become useful to this end. In
order to later effectively exploit them it is useful to first spend some time understanding them.
This understanding has two parts. The first is to provide precise definitions of the various
problems and their measures of intractability. The second is to look at what is known or conjectured
about the computational complexity of these problems.
There are two main classes of primitives. The first class relates to the discrete logarithm problem
over appropriate groups, and the second to the factoring of composite integers. We look at them
in turn.
This chapter assumes some knowledge of computational number theory as covered in the chapter
on Computational Number Theory.
Figure 10.1: An informal description of three discrete logarithm related problems over a cyclic
group G with generator g. For each problem we indicate the input to the attacker, and what the
attacker must figure out to “win.” The formal definitions are in the text.
Definition 10.1.1 Let G be a cyclic group of order m, let g be a generator of G, and let A be an
algorithm that returns an integer in Zm . We consider the following experiment:
Recall that the discrete exponentiation function takes input i ∈ Z_m and returns the group element
g^i. The discrete logarithm function is the inverse of the discrete exponentiation function. The
definition above simply measures the one-wayness of the discrete exponentiation function according
to the standard definition of one-way function. It is to emphasize this that certain parts of the
experiment are written the way they are.
The discrete logarithm problem is said to be hard in G if the dl-advantage of any adversary of
reasonable resources is small. Resources here means the time-complexity of the adversary, which
includes its code size as usual.
Definition 10.1.2 Let G be a cyclic group of order m, let g be a generator of G, and let A be an
algorithm that returns an element of G. We consider the following experiment:
Again, the CDH problem is said to be hard in G if the cdh-advantage of any adversary of reasonable
resources is small, where the resource in question is the adversary’s time complexity.
Again, the DDH problem is said to be hard in G if the ddh-advantage of any adversary of reasonable
resources is small, where the resource in question is the adversary’s time complexity.
Proposition 10.1.4 Let G be a cyclic group and let g be a generator of G. Let Adl be an adversary
(against the DL problem). Then there exists an adversary Acdh (against the CDH problem) such
that
$$\mathrm{Adv}^{\mathrm{dl}}_{G,g}(A_{\mathrm{dl}}) \le \mathrm{Adv}^{\mathrm{cdh}}_{G,g}(A_{\mathrm{cdh}}) . \qquad (10.2)$$
Furthermore the running time of Acdh is that of Adl plus the time to do one exponentiation in
G. Similarly let Acdh be an adversary (against the CDH problem). Then there exists an adversary
Addh (against the DDH problem) such that
$$\mathrm{Adv}^{\mathrm{cdh}}_{G,g}(A_{\mathrm{cdh}}) \le \mathrm{Adv}^{\mathrm{ddh}}_{G,g}(A_{\mathrm{ddh}}) + \frac{1}{|G|} . \qquad (10.3)$$
Let x = DLogG,g (X) and y = DLogG,g (Y ). If Adl is successful then its output $\bar{x}$ equals x. In that
case
$$Y^{\bar{x}} = Y^{x} = (g^{y})^{x} = g^{yx} = g^{xy}$$
is the correct output for Acdh . This justifies Equation (10.2).
We now turn to the second inequality in the proposition. Adversary Addh works as follows:
We claim that
$$\Pr\left[\mathrm{DDH1}^{A_{\mathrm{ddh}}}_{G,g} = 1\right] = \mathrm{Adv}^{\mathrm{cdh}}_{G,g}(A_{\mathrm{cdh}})$$
$$\Pr\left[\mathrm{DDH0}^{A_{\mathrm{ddh}}}_{G,g} = 1\right] = \frac{1}{|G|} ,$$
which implies Equation (10.3). To justify the above, let x = DLogG,g (X) and y = DLogG,g (Y ). If
Acdh is successful then its output $\bar{Z}$ equals $g^{xy}$, so in world 1, Addh returns 1. On the other hand
in world 0, Z is uniformly distributed over G and hence has probability 1/|G| of equalling $\bar{Z}$.
$x = nx_1 + x_0$. This means that $g^{nx_1 + x_0} = X$, or $Xg^{-x_0} = (g^n)^{x_1}$. The idea of the algorithm is to
compute two lists:
   $Xg^{-b}$ for b = 0, 1, . . . , n
   $(g^n)^a$ for a = 0, 1, . . . , n
and then find a group element that is contained in both lists. The corresponding values of a, b
satisfy $Xg^{-b} = (g^n)^a$, and thus DLogG,g (X) = an + b. The details follow.
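The two-list idea can be sketched in Python for the specific group Z∗p. This is a minimal illustration, not the algorithm listing from these notes: the function name, the choice n = ⌊√m⌋ + 1, and the use of a dictionary for the first list are assumptions made for the example.

```python
from math import isqrt

def baby_step_giant_step(g, X, p):
    """Return x with g^x = X (mod p), or None if no such x is found.

    Assumes p is prime and works in Z_p^*, whose order is m = p - 1.
    """
    m = p - 1
    n = isqrt(m) + 1                 # n*n > m, so every x < m is a*n + b with a, b <= n
    g_inv = pow(g, -1, p)            # modular inverse (Python 3.8+)

    # First list: X * g^(-b) for b = 0..n, stored as value -> b.
    table = {}
    cur = X % p
    for b in range(n + 1):
        table.setdefault(cur, b)
        cur = cur * g_inv % p

    # Second list: (g^n)^a for a = 0..n; stop at the first common element.
    gn = pow(g, n, p)
    cur = 1
    for a in range(n + 1):
        if cur in table:
            return (a * n + table[cur]) % m
        cur = cur * gn % p
    return None
```

Both loops run about √|G| times, which is where the square-root cost comes from; the dictionary also uses about √|G| memory.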
This algorithm is interesting because it shows that there is a better way to compute the discrete
logarithm of X than to do an exhaustive search for it. However, it does not yield a practical discrete
logarithm computation method, because one can work in groups large enough that an $O(|G|^{1/2})$
algorithm is not really feasible. There are however better algorithms in some specific groups.
Proposition 10.2.1 Let p ≥ 3 be a prime, let G = Z∗p , and let g be a generator of G. Then there
is an adversary A, with running time $O(|p|^3)$, such that
$$\mathrm{Adv}^{\mathrm{ddh}}_{G,g}(A) = \frac{1}{2} .$$
Proof of Proposition 10.2.1: The input to our adversary A is a triple X, Y, Z of group elements,
and the adversary is trying to determine whether Z was chosen as $g^{xy}$ or as a random group element,
where x, y are the discrete logarithms of X and Y , respectively. We know that if we know $J_p(g^x)$
and $J_p(g^y)$, we can predict $J_p(g^{xy})$. Our adversary’s strategy is to compute $J_p(g^x)$ and $J_p(g^y)$ and
then see whether or not the challenge value Z has the Jacobi symbol value that $g^{xy}$ ought to have.
In more detail, it works as follows:
Adversary A(X, Y, Z)
If Jp (X) = 1 or Jp (Y ) = 1
Then s ← 1 Else s ← −1
If Jp (Z) = s then return 1 else return 0
We know that the Jacobi symbol can be computed via an exponentiation modulo p, which we know
takes $O(|p|^3)$ time. Thus, the time-complexity of the above adversary is $O(|p|^3)$. We now claim
that
$$\Pr\left[\mathrm{DDH1}^{A}_{G,g} = 1\right] = 1$$
$$\Pr\left[\mathrm{DDH0}^{A}_{G,g} = 1\right] = \frac{1}{2} .$$
Subtracting, we get
$$\mathrm{Adv}^{\mathrm{ddh}}_{G,g}(A) = \Pr\left[\mathrm{DDH1}^{A}_{G,g} = 1\right] - \Pr\left[\mathrm{DDH0}^{A}_{G,g} = 1\right] = 1 - \frac{1}{2} = \frac{1}{2}$$
as desired. Let us now see why the two equations above are true.
Let x = DLogG,g (X) and y = DLogG,g (Y ). We know that the value s computed by our adversary
A equals $J_p(g^{xy} \bmod p)$. But in World 1, $Z = g^{xy} \bmod p$, so our adversary will always return 1. In
World 0, Z is distributed uniformly over G, so
$$\Pr\left[J_p(Z) = 1\right] = \Pr\left[J_p(Z) = -1\right] = \frac{(p-1)/2}{p-1} = \frac{1}{2} .$$
Since s is distributed independently of Z, the probability that $J_p(Z) = s$ is 1/2.
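The adversary from this proof is short enough to run directly. The sketch below computes the Jacobi (here Legendre) symbol by Euler's criterion and estimates the advantage empirically; the helper `estimate_advantage`, the toy parameters p = 101 and g = 2, and the trial count are assumptions for illustration only.

```python
import random

def legendre(a, p):
    """J_p(a) for an odd prime p and a in Z_p^*, via Euler's criterion."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def ddh_adversary(X, Y, Z, p):
    """Return 1 iff J_p(Z) equals the value s that J_p(g^{xy}) must have."""
    s = 1 if legendre(X, p) == 1 or legendre(Y, p) == 1 else -1
    return 1 if legendre(Z, p) == s else 0

def estimate_advantage(p, g, trials=2000, seed=0):
    """Empirical Pr[world 1 outputs 1] - Pr[world 0 outputs 1]."""
    rng = random.Random(seed)
    wins1 = wins0 = 0
    for _ in range(trials):
        x, y, z = (rng.randrange(p - 1) for _ in range(3))
        X, Y = pow(g, x, p), pow(g, y, p)
        wins1 += ddh_adversary(X, Y, pow(g, x * y, p), p)   # world 1: Z = g^{xy}
        wins0 += ddh_adversary(X, Y, pow(g, z, p), p)       # world 0: Z random
    return (wins1 - wins0) / trials
```

With g a generator of Z∗p the estimate should come out close to 1/2, matching the proposition.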
Now we consider the CDH and DL problems. It appears that the best approach to solving the
CDH problem in Z∗p is via the computation of discrete logarithms. (This has not been proved in
general, but there are proofs for some special classes of primes.) Thus, the main question is how
hard is the computation of discrete logarithms. This depends both on the size and structure of p.
The currently best algorithm is the GNFS (General Number Field Sieve) which has a running
time of the form
$$O\left(e^{(C+o(1)) \cdot \ln(p)^{1/3} \cdot (\ln\ln(p))^{2/3}}\right) \qquad (10.4)$$
where C ≈ 1.92. For certain classes of primes, the value of C is even smaller. These algorithms are
heuristic, in the sense that the run time bounds are not proven, but appear to hold in practice.
If the prime factorization of the order of the group is known, the discrete logarithm problem
over the group can be decomposed into a set of discrete logarithm problems over subgroups. As a
result, if $p - 1 = p_1^{\alpha_1} \cdots p_n^{\alpha_n}$ is the prime factorization of p − 1, then the discrete logarithm problem
in Z∗p can be solved in time on the order of
$$\sum_{i=1}^{n} \alpha_i \cdot \left(\sqrt{p_i} + |p|\right) .$$
If we want the discrete logarithm problem in Z∗p to be hard, this means that it must be the case
that at least one of the prime factors $p_i$ of p − 1 is large enough that $\sqrt{p_i}$ is large.
The prime factorization of p − 1 might be hard to compute given only p, but in fact we usually
choose p in such a way that we know the prime factorization of p − 1, because it is this that gives us
a way to find a generator of the group Z∗p , as discussed in the chapter on Computational Number
Theory. So the above algorithm is quite relevant.
From the above, if we want to make the DL problem in Z∗p hard, it is necessary to choose p so
that it is large and p − 1 has at least one large prime factor. A common choice is p = sq + 1 where s ≥ 2
is some small integer (like s = 2) and q is a prime. In this case, p − 1 has the factor q, which is
large.
Precise estimates of the size of a prime necessary to make a discrete logarithm algorithm infeasi-
ble are hard to make based on asymptotic running times of the form given above. Ultimately, what
actual implementations can accomplish is the most useful data. In April 2001, it was announced
that discrete logarithms had been computed modulo a 120-digit (i.e., about 400-bit) prime (Joux
and Lercier, 2001). The computation took 10 weeks and was done on a 525MHz quadri-processor
Digital Alpha Server 8400 computer. The prime p did not have any special structure that was
exploited, and the algorithm used was the GNFS. A little earlier, discrete logarithms had been
computed modulo a slightly larger prime, namely a 129 digit one, but this had a special structure
that was exploited [35].
Faster discrete logarithm computation can come from many sources. One is exploiting paral-
lelism and the paradigm of distributing work across available machines on the Internet. Another is
algorithmic improvements. A reduction in the constant C of Equation (10.4) has important impact
on the running time. A reduction in the exponents from 1/3, 2/3 to 1/4, 3/4 would have an even
greater impact. There are also threats from hardware approaches such as the design of special
purpose discrete logarithm computation devices. Finally, the discrete logarithm problem can be
solved in polynomial time with a quantum computer. Whether a sufficiently large quantum computer
can be built is not known.
Predictions are hard to make. In choosing a prime p for cryptography over Z∗p , the security
risks must be weighed against the increase in the cost of computations over Z∗p as a function of the
size of p.
Definition 10.3.1 Let N, f ≥ 1 be integers. The RSA function associated to N, f is the function
RSAN,f : Z∗N → Z∗N defined by RSAN,f (w) = $w^f \bmod N$ for all w ∈ Z∗N .
The RSA function associated to N, f is thus simply exponentiation with exponent f in the group
Z∗N , but it is useful in the current context to give it a new name. The following summarizes a basic
property of this function. Recall that ϕ(N ) is the order of the group Z∗N .
Proposition 10.3.2 Let N ≥ 2 and e, d ∈ Z∗ϕ(N ) be integers such that ed ≡ 1 (mod ϕ(N )).
Then the RSA functions RSAN,e and RSAN,d are both permutations on Z∗N and, moreover, are
inverses of each other, i.e., $\mathrm{RSA}^{-1}_{N,e} = \mathrm{RSA}_{N,d}$ and $\mathrm{RSA}^{-1}_{N,d} = \mathrm{RSA}_{N,e}$.
A permutation, above, simply means a bijection from Z∗N to Z∗N , or, in other words, a one-to-one,
onto map. The condition ed ≡ 1 (mod ϕ(N )) says that d is the inverse of e in the group Z∗ϕ(N ) .
Proof of Proposition 10.3.2: For any x ∈ Z∗N , the following hold modulo N :
$$\mathrm{RSA}_{N,d}(\mathrm{RSA}_{N,e}(x)) \equiv (x^e)^d \equiv x^{ed} \equiv x^{ed \bmod \varphi(N)} \equiv x^1 \equiv x .$$
The third equivalence used the fact that ϕ(N ) is the order of the group Z∗N . The fourth used the
assumed condition on e, d. Similarly, we can show that for any y ∈ Z∗N ,
RSAN,e (RSAN,d (y)) ≡ y
modulo N . These two facts justify all the claims of the Proposition.
With N, e, d as in Proposition 10.3.2 we remark that
• For any x ∈ Z∗N : RSAN,e (x) = MOD-EXP(x, e, N ) and so one can efficiently compute
RSAN,e (x) given N, e, x.
• For any y ∈ Z∗N : RSAN,d (y) = MOD-EXP(y, d, N ) and so one can efficiently compute
RSAN,d (y) given N, d, y.
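A small numerical check of these remarks, using Python's built-in three-argument `pow` in the role of MOD-EXP. The primes 61 and 53 and the exponent 17 are toy values assumed for illustration; they are far too small for any real use.

```python
from math import gcd

p, q = 61, 53                      # toy primes (assumption)
N = p * q                          # 3233
phi = (p - 1) * (q - 1)            # 3120, the order of Z_N^*
e = 17                             # gcd(17, 3120) = 1
d = pow(e, -1, phi)                # inverse of e modulo phi(N), Python 3.8+

def rsa(N, f, w):
    """RSA_{N,f}(w) = w^f mod N."""
    return pow(w, f, N)

# Z_N^* as an explicit list, feasible only because N is tiny.
units = [w for w in range(1, N) if gcd(w, N) == 1]
images = [rsa(N, e, w) for w in units]
```

Here `sorted(images) == units` confirms that RSAN,e permutes Z∗N, and applying `rsa(N, d, ·)` to each image recovers the original point.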
We now consider an adversary that is given N, e, y and asked to compute $\mathrm{RSA}^{-1}_{N,e}(y)$. If it had d,
this could be done efficiently by the above, but we do not give it d. It turns out that when the
parameters N, e are properly chosen, this adversarial task appears to be computationally infeasible,
and this property will form the basis of both asymmetric encryption schemes and digital signature
schemes based on RSA. Our goal in this section is to lay the groundwork for these later applications
by showing how RSA parameters can be chosen so as to make the above claim of computational
difficulty true, and formalizing the sense in which it is true.
Proposition 10.3.3 There is an $O(k^2)$ time algorithm that on inputs ϕ(N ), e where e ∈ Z∗ϕ(N )
and $N < 2^k$, returns d ∈ Z∗ϕ(N ) satisfying ed ≡ 1 (mod ϕ(N )).
Proof of Proposition 10.3.3: Since d is the inverse of e in the group Z∗ϕ(N ) , the algorithm
consists simply of running MOD-INV(e, ϕ(N )) and returning the outcome. Recall that the modular
inversion algorithm invokes the extended-gcd algorithm as a subroutine and has running time
quadratic in the bit-length of its inputs.
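For concreteness, modular inversion via the extended Euclidean algorithm can be sketched as follows. The function names `extended_gcd` and `mod_inv` are chosen for this example and are not the notes' MOD-INV listing.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quo = old_r // r
        old_r, r = r, old_r - quo * r
        old_s, s = s, old_s - quo * s
        old_t, t = t, old_t - quo * t
    return old_r, old_s, old_t

def mod_inv(e, m):
    """Return d in [0, m) with e*d = 1 (mod m); e must be coprime to m."""
    g, s, _ = extended_gcd(e, m)
    if g != 1:
        raise ValueError("e is not invertible modulo m")
    return s % m
```

For example, `mod_inv(17, 3120)` gives 2753, and indeed 17 · 2753 = 46801 = 15 · 3120 + 1.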
To choose RSA parameters, one runs a generator. We consider a few types of generators:
An RSA generator with associated security parameter k is a randomized algorithm that takes no
inputs and returns a pair ((N, e), (N, p, q, d)) such that the three conditions above are true, and, in
addition,
4. e, d ∈ Z∗(p−1)(q−1)
5. ed ≡ 1 (mod (p − 1)(q − 1))
We call N an RSA modulus, or just modulus. We call e the encryption exponent and d the decryption
exponent.
Note that (p − 1)(q − 1) = ϕ(N ) is the size of the group Z∗N . So above, e, d are relatively prime to
the order of the group Z∗N . As the above indicates, we are going to restrict attention to numbers
N that are the product of two distinct odd primes. Condition (4) for the RSA generator translates
to 1 ≤ e, d < (p − 1)(q − 1) and gcd(e, (p − 1)(q − 1)) = gcd(d, (p − 1)(q − 1)) = 1.
For parameter generation to be feasible, the generation algorithm must be efficient. There are
many different possible efficient generators. We illustrate a few.
In modulus generation, we usually pick the primes p, q at random, with each being about k/2
bits long. The corresponding modulus generator $K^{\$}_{\mathrm{mod}}$ with associated security parameter k works
as follows:
Algorithm $K^{\$}_{\mathrm{mod}}$
   $\ell_1 \leftarrow \lfloor k/2 \rfloor$ ; $\ell_2 \leftarrow \lceil k/2 \rceil$
   Repeat
      $p \stackrel{\$}{\leftarrow} \{2^{\ell_1 - 1}, \ldots, 2^{\ell_1} - 1\}$ ; $q \stackrel{\$}{\leftarrow} \{2^{\ell_2 - 1}, \ldots, 2^{\ell_2} - 1\}$
   Until the following conditions are all true:
      – TEST-PRIME(p) = 1 and TEST-PRIME(q) = 1
      – p ≠ q
      – $2^{k-1} \le pq$
   N ← pq
   Return (N, p, q)
Above, TEST-PRIME denotes an algorithm that takes input an integer and returns 1 or 0. It is
designed so that, with high probability, the former happens when the input is prime and the latter
when the input is composite.
Sometimes, we may want moduli that are products of primes having a special form, for example primes
p, q such that (p − 1)/2 and (q − 1)/2 are both prime. This corresponds to a different modulus
generator, which works as above but simply adds, to the list of conditions tested to exit the loop, the
conditions TEST-PRIME((p − 1)/2) = 1 and TEST-PRIME((q − 1)/2) = 1. There are numerous
other possible modulus generators too.
An RSA generator, in addition to N, p, q, needs to generate the exponents e, d. There are several
options for this. One is to first choose N, p, q, then pick e at random subject to gcd(e, ϕ(N )) = 1,
and compute d via the algorithm of Proposition 10.3.3. This random-exponent RSA generator,
denoted $K^{\$}_{\mathrm{rsa}}$, is detailed below:
Algorithm $K^{\$}_{\mathrm{rsa}}$
   $(N, p, q) \stackrel{\$}{\leftarrow} K^{\$}_{\mathrm{mod}}$
   M ← (p − 1)(q − 1)
   $e \stackrel{\$}{\leftarrow} Z^*_M$
   d ← MOD-INV(e, M )
   Return ((N, e), (N, p, q, d))
Above, “kea” stands for “known-exponent attack.” We might also allow a chosen-exponent attack,
abbreviated “cea,” in which, rather than having the encryption exponent specified by the instance
of the problem, one allows the adversary to choose it. The only condition imposed is that the
adversary not choose e = 1.
Definition 10.3.6 Let Kmod be a modulus generator with associated security parameter k, and
let A be an algorithm. We consider the following experiment:
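Experiment $\mathrm{Exp}^{\text{ow-cea}}_{K_{\mathrm{mod}}}(A)$
   $(N, p, q) \stackrel{\$}{\leftarrow} K_{\mathrm{mod}}$
   $y \stackrel{\$}{\leftarrow} Z^*_N$
   $(x, e) \stackrel{\$}{\leftarrow} A(N, y)$
   If $x^e \equiv y \pmod{N}$ and e > 1
      then return 1 else return 0.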
[1] T. Denny and D. Weber. The solution of McCurley’s discrete log challenge. Advances in
Cryptology – CRYPTO ’98, Lecture Notes in Computer Science Vol. 1462, H. Krawczyk ed.,
Springer-Verlag, 1998.
Chapter 11
Digital signatures
In the public key setting, the primitive used to provide data integrity is a digital signature scheme.
In this chapter we look at security notions and constructions for this primitive.
Definition 11.1.1 A digital signature scheme DS = (K, Sig, VF) consists of three algorithms, as
follows:
• The randomized key generation algorithm K (takes no inputs and) returns a pair (pk, sk)
of keys, the public key and matching secret key, respectively. We write $(pk, sk) \stackrel{\$}{\leftarrow} K$ for the
operation of executing K and letting (pk, sk) be the pair of keys returned.
• The signing algorithm Sig takes the secret key sk and a message M to return a signature (also
sometimes called a tag) σ ∈ {0, 1}∗ ∪ {⊥}. The algorithm may be randomized or stateful. We
write $\sigma \stackrel{\$}{\leftarrow} \mathrm{Sig}_{sk}(M)$ or $\sigma \stackrel{\$}{\leftarrow} \mathrm{Sig}(sk, M)$ for the operation of running Sig on inputs sk, M and
letting σ be the signature returned.
• The deterministic verification algorithm VF takes a public key pk, a message M , and a
candidate signature σ for M to return a bit. We write d ← VFpk (M, σ) or d ← VF(pk, M, σ)
to denote the operation of running VF on inputs pk, M, σ and letting d be the bit returned.
We require that VFpk (M, σ) = 1 for any key-pair (pk, sk) that might be output by K, any message
M , and any σ ≠ ⊥ that might be output by Sigsk (M ). If Sig is stateless then we associate to
each public key a message space Messages(pk) which is the set of all M for which Sigsk (M ) never
returns ⊥.
Let S be an entity that wants to have a digital signature capability. The first step is key generation:
S runs K to generate a pair of keys (pk, sk) for itself. Note the key generation algorithm is run locally
by S. Now, S can produce a digital signature on some document M ∈ Messages(pk) by running
Sigsk (M ) to return a signature σ. The pair (M, σ) is then the authenticated version of the document.
Upon receiving a document M ′ and tag σ ′ purporting to be from S, a receiver B in possession of
pk verifies the authenticity of the signature by using the specified verification procedure, which
depends on the message, signature, and public key. Namely he computes VFpk (M ′ , σ ′ ), whose
value is a bit. If this value is 1, it is read as saying the data is authentic, and so B accepts it as
coming from S. Else it discards the data as unauthentic.
Note that an entity wishing to verify S’s signatures must be in possession of S’s public key pk,
and must be assured that the public key is authentic, meaning really is S’s key and not someone
else’s key. We will look later into mechanisms for assuring this state of knowledge. But the key
management processes are not part of the digital signature scheme itself. In constructing and
analyzing the security of digital signature schemes, we make the assumption that any prospective
verifier is in possession of an authentic copy of the public key of the signer. This assumption is
made in what follows.
A viable scheme of course requires some security properties. But these are not our concern now.
First we want to pin down what constitutes a specification of a scheme, so that we know what are
the kinds of objects whose security we want to assess.
The key usage is the “mirror-image” of the key usage in an asymmetric encryption scheme. In
a digital signature scheme, the holder of the secret key is a sender, using the secret key to tag its
own messages so that the tags can be verified by others. In an asymmetric encryption scheme, the
holder of the secret key is a receiver, using the secret key to decrypt ciphertexts sent to it by others.
The signature algorithm might be randomized, meaning internally flip coins and use these coins
to determine its output. In this case, there may be many correct tags associated to a single message
M . The algorithm might also be stateful, for example making use of a counter that is maintained
by the sender. In that case the signature algorithm will access the counter as a global variable,
updating it as necessary. The algorithm might even be both randomized and stateful. However,
unlike encryption schemes, whose encryption algorithms must be either randomized or stateful for
the scheme to be secure, a deterministic, stateless signature algorithm is not only possible, but
common.
The signing algorithm might only be willing to sign certain messages and not others. It indicates
its unwillingness to sign a message by returning ⊥. If the scheme is stateless, the message space,
which can depend on the public key, is the set of all messages for which the probability that the
signing algorithm returns ⊥ is zero. If the scheme is stateful we do not talk of such a space since
whether or not the signing algorithm returns ⊥ can depend not only on the message but on its
state.
The last part of the definition says that signatures that were correctly generated will pass the
verification test. This simply ensures that authentic data will be accepted by the receiver. In the
case of a stateful scheme, the requirement holds for any state of the signing algorithm.
Definition 11.2.1 Let DS = (K, Sig, VF) be a digital signature scheme, and let A be an algorithm
that has access to an oracle and returns a pair of strings. We consider the following experiment:
Experiment $\mathrm{Exp}^{\text{uf-cma}}_{DS}(A)$
   $(pk, sk) \stackrel{\$}{\leftarrow} K$
   $(M, \sigma) \leftarrow A^{\mathrm{Sig}_{sk}(\cdot)}(pk)$
   If the following are true return 1 else return 0:
      – VFpk (M, σ) = 1
      – M ∈ Messages(pk)
      – M was not a query of A to its oracle
The uf-cma-advantage of A is defined as
$$\mathrm{Adv}^{\text{uf-cma}}_{DS}(A) = \Pr\left[\mathrm{Exp}^{\text{uf-cma}}_{DS}(A) = 1\right] .$$
In the case of message authentication schemes, we provided the adversary not only with an oracle
for producing tags, but also with an oracle for verifying them. Above, there is no verification oracle.
This is because verification of a digital signature does not depend on any quantity that is secret
from the adversary. Since the adversary has the public key and knows the algorithm VF, it can
verify as much as it pleases by running the latter.
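The experiment of Definition 11.2.1 can be written as a small test harness. In this sketch the scheme is passed in as three functions, and the message-space check is an optional predicate; the function names and the deliberately insecure toy scheme used to exercise the harness are hypothetical, introduced only for illustration.

```python
def exp_uf_cma(keygen, sign, verify, adversary,
               in_message_space=lambda pk, M: True):
    """Run the uf-cma experiment once; return 1 if the adversary forges, else 0."""
    pk, sk = keygen()
    queries = []                       # messages the adversary asked to have signed

    def sign_oracle(M):
        queries.append(M)
        return sign(sk, M)

    M, sigma = adversary(pk, sign_oracle)
    ok = (verify(pk, M, sigma) == 1
          and in_message_space(pk, M)
          and M not in queries)
    return 1 if ok else 0

# Hypothetical toy scheme, insecure by design: the "signature" is the reversed message.
def toy_keygen():
    return "pk", "sk"

def toy_sign(sk, M):
    return M[::-1]

def toy_verify(pk, M, sigma):
    return 1 if sigma == M[::-1] else 0

def replay_adversary(pk, oracle):
    return "abc", oracle("abc")        # outputs a message it queried, so it loses

def forging_adversary(pk, oracle):
    return "abc", "cba"                # makes no queries and still passes verification
```

The replaying adversary returns a valid pair but scores 0 because its message was an oracle query, while the second adversary scores 1, which is exactly the distinction the third condition of the experiment draws.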
When we talk of the time-complexity of an adversary, we mean the worst case total execution
time of the entire experiment. This means the adversary complexity, defined as the worst case
execution time of A plus the size of the code of the adversary A, in some fixed RAM model of
computation (worst case means the maximum over A’s coins or the answers returned in response
to A’s oracle queries), plus the time for other operations in the experiment, including the time for
key generation and the computation of answers to oracle queries via execution of the signing
algorithm.
As adversary resources, we will consider this time complexity t, the message length µ, and the
number of queries q to the sign oracle. We define µ as the sum of the lengths of the oracle queries
plus the length of the message in the forgery output by the adversary. In practice, the queries
correspond to messages signed by the legitimate sender, and it would make sense that getting these
examples is more expensive than just computing on one’s own. That is, we would expect q to be
smaller than t. That is why q, µ are resources separate from t.
operation of computing RSAN,e on the claimed signature and seeing if we get back the message.
More precisely, let Krsa be an RSA generator with associated security parameter k, as per
Definition 10.9. We consider the digital signature scheme DS = (Krsa , Sig, VF) whose signing and
verifying algorithms are as follows:
This is a deterministic stateless scheme, and the message space for public key (N, e) is Messages(N, e) =
Z∗N , meaning the only messages that the signer signs are those which are elements of the group Z∗N .
In this scheme we have denoted the signature of M by x. The signing algorithm simply applies
RSAN,d to the message to get the signature, and the verifying algorithm applies RSAN,e to the
signature and tests whether the result equals the message.
The first thing to check is that signatures generated by the signing algorithm pass the verification
test. This is true because of Proposition 10.7 which tells us that if $x = M^d \bmod N$ then
$x^e \equiv M \pmod{N}$.
Now, how secure is this scheme? As we said above, the intuition behind it is that the signing
operation should be something only the signer can perform, since computing $\mathrm{RSA}^{-1}_{N,e}(M)$ is hard
without knowledge of d. However, what one should remember is that the formal assumed hardness
property of RSA, namely one-wayness under known-exponent attack (we call it just one-wayness
henceforth) as specified in Definition 10.10, is under a very different model and setting than that
of security for signatures. One-wayness tells us that if we select M at random and then feed it
to an adversary (who knows N, e but not d) and ask the latter to find $x = \mathrm{RSA}^{-1}_{N,e}(M)$, then the
adversary will have a hard time succeeding. But the adversary in a signature scheme is not given a
random message M on which to forge a signature. Rather, its goal is to create a pair (M, x) such
that VFN,e (M, x) = 1. It does not have to try to imitate the signing algorithm; it must only do
something that satisfies the verification algorithm. In particular it is allowed to choose M rather
than having to sign a given or random M . It is also allowed to obtain a valid signature on any
message other than the M it eventually outputs, via the signing oracle, corresponding in this case
to having an oracle for $\mathrm{RSA}^{-1}_{N,e}(\cdot)$. These features make it easy for an adversary to forge signatures.
A couple of simple forging strategies are illustrated below. The first is to simply output the
forgery in which the message and signature are both set to 1. The second is to first pick at random
a value that will play the role of the signature, and then compute the message based on it:
Forger $F_1^{\mathrm{Sig}_{N,p,q,d}(\cdot)}(N, e)$
   Return (1, 1)

Forger $F_2^{\mathrm{Sig}_{N,p,q,d}(\cdot)}(N, e)$
   $x \stackrel{\$}{\leftarrow} Z^*_N$ ; $M \leftarrow x^e \bmod N$
   Return (M, x)
These forgers make no queries to their signing oracles. We note that $1^e \equiv 1 \pmod{N}$, and hence
the uf-cma-advantage of F1 is 1. Similarly, the value (M, x) returned by the second forger satisfies
$x^e \bmod N = M$ and hence it has uf-cma-advantage 1 too. The time-complexity in both cases is
very low. (In the second case, the forger uses the $O(k^3)$ time to do its exponentiation modulo N .)
So these attacks indicate the scheme is totally insecure.
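Both forgers can be checked against a toy instance of the trapdoor scheme. The key below (p = 61, q = 53, e = 17) is an assumption chosen so the example runs instantly; note that neither forger touches d or the signing oracle.

```python
import random
from math import gcd

# Toy public key (assumption: tiny primes, for illustration only).
N, e = 61 * 53, 17

def verify(N, e, M, x):
    """VF_{N,e}(M, x): accept iff x^e mod N equals M."""
    return 1 if pow(x, e, N) == M else 0

def forger_f1(N, e):
    """Output the pair (1, 1); valid because 1^e = 1 (mod N)."""
    return 1, 1

def forger_f2(N, e, rng):
    """Pick the signature x first, then derive the message M = x^e mod N."""
    while True:
        x = rng.randrange(1, N)
        if gcd(x, N) == 1:             # keep x inside Z_N^*
            break
    return pow(x, e, N), x
```

Calling `verify(N, e, *forger_f1(N, e))` and `verify(N, e, *forger_f2(N, e, random.Random()))` both return 1.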
The message M whose signature the above forger managed to forge is random. This is enough
to break the scheme as per our definition of security, because we made a very strong definition of
security. Actually for this scheme it is possible to even forge the signature of a given message M ,
but this time one has to use the signing oracle. The attack relies on the multiplicativity of the RSA
function.
Given M the forger wants to compute a valid signature x for M . It creates M1 , M2 as shown, and
obtains their signatures x1 , x2 . It then sets x = x1 x2 mod N . Now the verification algorithm will
check whether xe mod N = M . But note that
$$x^e \equiv (x_1 x_2)^e \equiv x_1^e x_2^e \equiv M_1 M_2 \equiv M \pmod{N} .$$
Here we used the multiplicativity of the RSA function and the fact that xi is a valid signature of
Mi for i = 1, 2. This means that x is a valid signature of M . Since M1 is chosen to not be 1 or M ,
the same is true of M2 , and thus M was not an oracle query of F . So F succeeds with probability
one.
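The multiplicative attack can be sketched as follows. The way M1 and M2 are built here (M1 random in Z∗N other than 1 and M, and M2 = M · M1⁻¹ mod N) is an assumption consistent with the description above, and the toy key and helper names are illustrative.

```python
import random
from math import gcd

# Toy key (assumption: tiny primes, for illustration only).
p, q, e = 61, 53, 17
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def verify(M, x):
    """Accept iff x^e mod N equals M."""
    return 1 if pow(x, e, N) == M % N else 0

def make_oracle():
    """Return a signing oracle together with the log of messages it has signed."""
    queries = []
    def sign(M):
        queries.append(M)
        return pow(M, d, N)
    return sign, queries

def forger(M, sign, rng):
    """Forge a signature on a given M in Z_N^* using two oracle queries."""
    while True:
        M1 = rng.randrange(2, N)                 # excludes M1 = 1
        if gcd(M1, N) == 1 and M1 != M:
            break
    M2 = M * pow(M1, -1, N) % N                  # so that M1 * M2 = M (mod N)
    x1, x2 = sign(M1), sign(M2)
    return M, x1 * x2 % N
```

For any target M in Z∗N, the returned pair verifies although M itself never appears in the oracle's query log.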
These attacks indicate that there is more to signatures than one-wayness of the underlying
function.
forger wins if $x^e \bmod N = \mathrm{Hash}_N(M)$ (rather than merely $x^e \bmod N = M$ as before). The hope is
that with a “good” hash function, it is very unlikely that $x^e \bmod N = \mathrm{Hash}_N(M)$. Consider now
the third attack we presented above, which relied on the multiplicativity of the RSA function. For
this attack to work under the hash-then-invert scheme, it would have to be true that
$$\mathrm{Hash}_N(M_1) \cdot \mathrm{Hash}_N(M_2) \equiv \mathrm{Hash}_N(M) \pmod{N} . \qquad (11.1)$$
Again, with a “good” hash function, we would hope that this is unlikely to be true.
The hash function is thus supposed to “destroy” the algebraic structure that makes attacks like
the above possible. How we might find one that does this is something we have not addressed.
While the hash function might prevent some attacks that worked on the trapdoor scheme, its
use leads to a new line of attack, based on collisions in the hash function. If an adversary can find
two distinct messages M1 , M2 that hash to the same value, meaning HashN (M1 ) = HashN (M2 ),
then it can easily forge signatures, as follows:
This works because M1 , M2 have the same signature. Namely because x1 is a valid signature of
M1 , and because M1 , M2 have the same hash value, we have
$$x_1^e \equiv \mathrm{Hash}_N(M_1) \equiv \mathrm{Hash}_N(M_2) \pmod{N} ,$$
and this means the verification procedure will accept x1 as a signature of M2 . Thus, a necessary
requirement on the hash function Hash is that it be CR2-KK, meaning given N it should be
computationally infeasible to find distinct values M, M ′ such that HashN (M ) = HashN (M ′ ).
Below we will go on to more concrete instantiations of the hash-then-invert paradigm. But
before we do that, it is important to try to assess what we have done so far. Above, we have
pin-pointed some features of the hash function that are necessary for the security of the signature
scheme. Collision-resistance is one. The other requirement is not so well formulated, but roughly
we want to destroy algebraic structure in such a way that Equation (11.1), for example, should
fail with high probability. Classical design focuses on these attacks and associated features of the
hash function, and aims to implement suitable hash functions. But if you have been understanding
the approaches and viewpoints we have been endeavoring to develop in this class and notes, you
should have a more critical perspective. The key point to note is that what we need is not really to
pin-point necessary features of the hash function to prevent certain attacks, but rather to pin-point
sufficient features of the hash function, namely features sufficient to prevent all attacks, even ones
that have not yet been conceived. And we have not done this. Of course, pinning down necessary
features of the hash function is useful to gather intuition about what sufficient features might be,
but it is only that, and we must be careful to not be seduced into thinking that it is enough, that
we have identified all the concerns. Practice proves this complacence wrong again and again.
How can we hope to do better? Return to the basic philosophy of provable security. We want
assurance that the signature scheme is secure under the assumption that its underlying primitives
are secure. Thus we must try to tie the security of the signature scheme to the security of RSA as
a one-way function, and some security condition on the hash function. With this in mind, let us
proceed to examine some suggested solutions.
$$\text{PKCS-Hash}_N(M) = \texttt{00 01 FF FF} \cdots \texttt{FF FF 00} \,\|\, h(M) .$$
Here ‖ denotes concatenation, and enough FF-bytes are inserted that the length of PKCS-HashN (M )
is equal to k bits. Note that the first four bits of the hash output are zero, meaning as an integer it
is certainly at most N , and thus most likely in Z∗N , since most numbers between 1 and N are in
Z∗N . Also note that finding collisions in PKCS-Hash is no easier than finding collisions in h, so if
the latter is collision-resistant then so is the former.
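The padding as written in these notes is easy to reproduce. The sketch below takes h to be SHA-1 from Python's `hashlib` and assumes k is a multiple of 8; the function name is illustrative. It follows the simplified layout above, whereas the actual PKCS #1 standard also places an encoded hash-algorithm identifier before the digest.

```python
import hashlib

def pkcs_hash(M: bytes, k: int) -> bytes:
    """Return 00 01 FF ... FF 00 || SHA-1(M), exactly k bits long."""
    if k % 8 != 0:
        raise ValueError("this sketch assumes k is a multiple of 8")
    digest = hashlib.sha1(M).digest()            # 20 bytes = 160 bits
    n_ff = k // 8 - 3 - len(digest)              # bytes left for FF padding
    if n_ff < 1:
        raise ValueError("modulus too short for this padding")
    return b"\x00\x01" + b"\xff" * n_ff + b"\x00" + digest

# As an integer, the padded value is what the signer would invert under RSA.
y = int.from_bytes(pkcs_hash(b"example message", 1024), "big")
```

Only the final 160 bits of the output vary with the message; the fixed prefix is what makes the set of possible values so small relative to Z∗N.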
Recall that the signature scheme is exactly that of the hash-then-invert paradigm. For con-
creteness, let us rewrite the signing and verifying algorithms:
Now what about the security of this signature scheme? Our first concern is the kinds of algebraic
attacks we saw on trapdoor signatures. As discussed in Section 11.3.3, we would like that relations
like Equation (11.1) fail. This we appear to get; it is hard to imagine how PKCS-HashN (M1 ) ·
PKCS-HashN (M2 ) mod N could have the specific structure required to make it look like the PKCS-
hash of some message. This isn’t a proof that the attack is impossible, of course, but at least it is
not evident.
This is the point where our approach departs from the classical attack-based design one. Under
the latter, the above scheme is acceptable because known attacks fail. But looking deeper there is
cause for concern. The approach we want to take is to see how the desired security of the signature
scheme relates to the assumed or understood security of the underlying primitive, in this case the
RSA function.
We are assuming RSA is one-way, meaning it is computationally infeasible to compute $\mathrm{RSA}^{-1}_{N,e}(y)$
for a randomly chosen point $y \in Z^*_N$. On the other hand, the points to which $\mathrm{RSA}^{-1}_{N,e}$ is applied in
the signature scheme are those in the set $S_N = \{\, \text{PKCS-Hash}_N(M) : M \in \{0,1\}^* \,\}$. The size of
$S_N$ is at most $2^l$ since h outputs l bits and the other bits of PKCS-HashN (·) are fixed. With SHA-1
this means $|S_N| \le 2^{160}$. This may seem like quite a big set, but within the RSA domain $Z^*_N$ it is
tiny. For example when k = 1024, which is a recommended value of the security parameter these
days, we have
$$\frac{|S_N|}{|Z^*_N|} \le \frac{2^{160}}{2^{1023}} = \frac{1}{2^{863}} .$$
This is the probability with which a point chosen randomly from $Z^*_N$ lands in $S_N$. For all practical
purposes, it is zero. So RSA could very well be one-way and still be easy to invert on $S_N$, since
the chance of a random point landing in SN is so tiny. So the security of the PKCS scheme cannot
be guaranteed solely under the standard one-wayness assumption on RSA. Note this is true no
matter how “good” is the underlying hash function h (in this case SHA-1) which forms the basis
for PKCS-Hash. The problem is the design of PKCS-Hash itself, in particular the padding.
The security of the PKCS signature scheme would require the assumption that RSA is hard
to invert on the set SN , a minuscule fraction of its full range. (And even this would be only a
necessary, but not sufficient condition for the security of the signature scheme.)
Let us try to clarify and emphasize the view taken here. We are not saying that we know how
to attack the PKCS scheme. But we are saying that an absence of known attacks should not be
deemed a good reason to be satisfied with the scheme. We can identify “design flaws,” such as the
way the scheme uses RSA, which is not in accordance with our understanding of the security of
RSA as a one-way function. And this is cause for concern.
function must always “look random”. Yet, even this only highlights a necessary condition, not (as
far as we know) a sufficient one.
We now ask ourselves the following question. Suppose we had a “perfect” hash function Hash.
In that case, at least, is the hash-then-invert signature scheme secure? To address this we must first
decide what is a “perfect” hash function. The answer is quite natural: one that is random, namely
returns a random answer to any query except for being consistent with respect to past queries. (We
will explain more how this “random oracle” works later, but for the moment let us continue.) So
our question becomes: in a model where Hash is perfect, can we prove that the signature scheme
is secure if RSA is one-way?
This is a basic question indeed. If the hash-then-invert paradigm is in any way viable, we really
must be able to prove security in the case the hash function is perfect. Were it not possible to prove
security in this model it would be extremely inadvisable to adopt the hash-then-invert paradigm; if
it doesn’t work for a perfect hash function, how can we expect it to work in any real world setting?
Accordingly, we now focus on this “thought experiment” involving the use of the signature
scheme with a perfect hash function. It is a thought experiment because no specific hash function
is perfect. Our “hash function” is no longer fixed, it is just a box that flips coins. Yet, this thought
experiment has something important to say about the security of our signing paradigm. It is not
only a key step in our understanding but will lead us to better concrete schemes as we will see
later.
Now let us say more about perfect hash functions. We assume that Hash returns a random
member of $Z_N^*$ every time it is invoked, except that if twice invoked on the same message, it returns
the same thing both times. In other words, it is an instance of a random function with domain
$\{0,1\}^*$ and range $Z_N^*$. We have seen such objects before, when we studied pseudorandomness:
The only change with respect to the way we wrote the algorithms for the generic hash-then-invert
scheme of Section 11.3.3 is notational: we write H as a superscript to indicate that it is an oracle
accessible only via the specified oracle interface. The instruction y ← H(M ) is implemented by
making the query (hash, M ) and letting y denote the answer returned, as discussed above.
We now ask ourselves whether the above signature scheme is secure under the assumption that
RSA is one-way. To consider this question we first need to extend our definitions to encompass
the new model. The key difference is that the success probability of an adversary is taken over
the random choice of H in addition to the random choices previously considered. The forger F as
before has access to a signing oracle, but now also has access to H. Furthermore, Sig and VF now
have access to H. Let us first write the experiment that measures the success of forger F and then
discuss it more.
Note that the forger is given oracle access to H in addition to the usual access to the sign oracle
that models a chosen-message attack. After querying its oracles some number of times the forger
outputs a message M and candidate signature x for it. We say that F is successful if the verification
process would accept M, x, but F never asked the signing oracle to sign M . (F is certainly allowed
to make hash query M , and indeed it is hard to imagine how it might hope to succeed in forgery
otherwise, but it is not allowed to make sign query M .) The uf-cma-advantage of F is defined as
$$\mathbf{Adv}^{\text{uf-cma}}_{DS}(F) \;=\; \Pr\left[\, \mathbf{Exp}^{\text{uf-cma}}_{DS}(F) = 1 \,\right] .$$
We will want to consider adversaries with time-complexity at most t, making at most qsig sign oracle
queries and at most qhash hash oracle queries, and with total query message length µ. Resources
refer again to those of the entire experiment. We first define the execution time as the time taken
by the entire experiment $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$. This means it includes the time to compute answers to
oracle queries, to generate the keys, and even to verify the forgery. Then the time-complexity t
is supposed to upper bound the execution time plus the size of the code of F . In counting hash
queries we again look at the entire experiment and ask that the total number of queries to H here
be at most qhash . Included in the count are the direct hash queries of F , the indirect hash queries
made by the signing oracle, and even the hash query made by the verification algorithm in the
last step. This latter means that qhash is always at least the number of hash queries required for
a verification, which for FDH-RSA is one. In fact for FDH-RSA we will have qhash ≥ qsig + 1,
something to be kept in mind when interpreting later results. Finally µ is the sum of the lengths
of all messages in sign queries plus the length of the final output message M .
However, there is one point that needs to be clarified here, namely that if time-complexity refers
to that of the entire experiment, how do we measure the time to pick H at random? It is an infinite
object and thus cannot be actually chosen in finite time. The answer is that although we write H
as being chosen at random upfront in the experiment, this is not how it is implemented. Instead,
imagine H as being chosen dynamically. Think of the process implementing the table we described,
so that random choices are made only at the time the H oracle is called, and the cost is that of
maintaining and updating a table that holds the values of H on inputs queried so far. Namely
when a query M is made to H, we charge the cost of looking up the table, checking whether H(M )
was already defined and returning it if so, else picking a random point from Z∗N , putting it in the
table with index M , and returning it as well.
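The table-based view of H described above can be sketched in a few lines. This is a minimal illustration under our own naming (the class `LazyRandomOracle` and its methods are not from the notes): values of H are sampled only when first queried and are remembered afterwards.

```python
import secrets
from math import gcd

class LazyRandomOracle:
    """Random function from byte strings to Z_N^*, defined on demand."""

    def __init__(self, N: int):
        self.N = N
        self.table = {}          # holds H(M) for every M queried so far

    def _sample_unit(self) -> int:
        # Rejection sampling: uniform over {1, ..., N-1} with gcd(y, N) = 1.
        while True:
            y = secrets.randbelow(self.N - 1) + 1
            if gcd(y, self.N) == 1:
                return y

    def query(self, M: bytes) -> int:
        # Look up the table; pick a fresh random point only for a new query.
        if M not in self.table:
            self.table[M] = self._sample_unit()
        return self.table[M]

H = LazyRandomOracle(3233)       # toy modulus 61 * 53, for illustration only
assert H.query(b"msg") == H.query(b"msg")   # consistent on repeated queries
```

The cost charged per query is then the cost of one table lookup and, for new queries, one random choice and one table insertion, which matches the accounting in the text.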
In this setting we claim that the FDH-RSA scheme is secure. The following theorem upper
bounds its uf-cma-advantage solely in terms of the ow-kea advantage of the underlying RSA gen-
erator.
Theorem 11.3.1 Let Krsa be an RSA generator with associated security parameter k, and let DS
be the FDH-RSA scheme associated to Krsa . Let F be an adversary making at most qhash queries
to its hash oracle and at most qsig queries to its signing oracle where qhash ≥ 1 + qsig . Then there
exists an adversary I such that
$$\mathbf{Adv}^{\text{uf-cma}}_{DS}(F) \;\le\; q_{hash} \cdot \mathbf{Adv}^{\text{ow-kea}}_{K_{rsa}}(I) . \qquad (11.2)$$
The theorem says that the only way to forge signatures in the FDH-RSA scheme is to try to invert
the RSA function on random points. There is some loss in security: it might be that the chance of
breaking the signature scheme is larger than that of inverting RSA in comparable time, by a factor
of the number of hash queries made in the forging experiment. But we can make $\mathbf{Adv}^{\text{ow-kea}}_{K_{rsa}}(t')$
small enough that even $q_{hash} \cdot \mathbf{Adv}^{\text{ow-kea}}_{K_{rsa}}(t')$ is small, by choosing a larger modulus size k.
One must remember the caveat: this is in a model where the hash function is random. Yet,
even this tells us something, namely that the hash-then-invert paradigm itself is sound, at least for
“perfect” hash functions. This puts us in a better position to explore concrete instantiations of the
paradigm.
Let us now proceed to the proof of Theorem 11.3.1. Remember that inverter I takes as input
(N, e), describing $\mathrm{RSA}_{N,e}$, and also a point $y \in Z_N^*$. Its job is to try to output
$\mathrm{RSA}^{-1}_{N,e}(y) = y^d \bmod N$, where d is the decryption exponent corresponding to encryption exponent e. Of course,
neither d nor the factorization of N is available to I. The success of I is measured under a random
choice of ((N, e), (N, p, q, d)) as given by Krsa , and also a random choice of y from $Z_N^*$. In order to
accomplish its task, I will run F as a subroutine, on input public key (N, e), hoping somehow to
use F ’s ability to forge signatures to find $\mathrm{RSA}^{-1}_{N,e}(y)$. Before we discuss how I might hope to use
the forger to determine the inverse of point y, we need to take a closer look at what it means to
run F as a subroutine.
Recall that F has access to two oracles, and makes calls to them. At any point in its execution
it might output (hash, M ). It will then wait for a return value, which it interprets as H(M ). Once
this is received, it continues its execution. Similarly it might output (sign, M ) and then wait to
receive a value it interprets as $\mathrm{Sig}^{H(\cdot)}_{N,p,q,d}(M)$. Having got this value, it continues. The important
thing to understand is that F , as an algorithm, merely communicates with oracles via an interface.
It does not control what these oracles return. You might think of an oracle query like a system
call. Think of F as writing an oracle query M at some specific prescribed place in memory. Some
process is expected to put in another prescribed place a value that F will take as the answer. F
reads what is there, and goes on.
When I executes F , no oracles are actually present. F does not know that. It will at some
point make an oracle query, assuming the oracles are present, say query (hash, M ). It then waits
for an answer. If I wants to run F to completion, it is up to I to provide some answer to F as
the answer to this oracle query. F will take whatever it is given and go on executing. If I cannot
provide an answer, F will not continue running; it will just sit there, waiting. We have seen this
idea of “simulation” before in several proofs: I is creating a “virtual reality” under which F can
believe itself to be in its usual environment.
The strategy of I will be to take advantage of its control over responses to oracle queries. It
will choose them in strange ways, not quite the way they were chosen in Experiment $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$.
Since F is just an algorithm, it processes whatever it receives, and eventually will halt with some
output, a claimed forgery (M, x). By clever choices of replies to oracle queries, I will ensure that
F is fooled into not knowing that it is not really in $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$, and furthermore x will be the
desired inverse of y. Not always, though; I has to be lucky. But it will be lucky often enough.
We begin by considering the case of a very simple forger F . It makes no sign queries and exactly
one hash query (hash, M ). It then outputs a pair (M, x) as the claimed forgery, the message M
being the same in the hash query and the forgery. (In this case we have qsig = 0 and qhash = 2,
the last due to the hash query of F and the final verification query in the experiment.) Now if
F is successful then x is a valid signature of M , meaning $x^e \equiv H(M) \pmod N$, or, equivalently,
$x \equiv H(M)^d \pmod N$. Somehow, F has found the inverse of H(M ), the value returned to it as the
response to oracle query M . Now remember that I’s goal had been to compute $y^d \bmod N$ where
y was its given input. A natural thought suggests itself: If F can invert $\mathrm{RSA}_{N,e}$ at H(M ), then
I will “set” H(M ) to y, and thereby obtain the inverse of y under $\mathrm{RSA}_{N,e}$. I can set H(M ) in
this way because it controls the answers to oracle queries. When F makes query (hash, M ), the
inverter I will simply return y as the response. If F then outputs a valid forgery (M, x), we have
$x = y^d \bmod N$, and I can output x, its job done.
But why would F return a valid forgery when it got y as its response to hash query M ? Maybe
it will refuse this, saying it will not work on points supplied by an inverter I. But this will not
happen. F is simply an algorithm and works on whatever it is given. What is important is solely the
distribution of the response. In Experiment $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$ the response to (hash, M ) is a random
element of $Z_N^*$. But y has exactly the same distribution, because that is how it is chosen in the
experiment defining the success of I in breaking RSA as a one-way function. So F cannot behave
any differently in this virtual reality than it could in its real world; its probability of returning a
valid forgery is still $\mathbf{Adv}^{\text{uf-cma}}_{DS}(F)$. Thus for this simple F the success probability of the inverter
in finding $y^d \bmod N$ is exactly the same as the success probability of F in forging signatures.
Equation (11.2) claims less, so we certainly satisfy it.
However, most forgers will not be so obliging as to make no sign queries, and just one hash
query consisting of the very message in their forgery. I must be able to handle any forger.
Inverter I will define a pair of subroutines, H -Sim (called the hash oracle simulator) and
Sign-Sim (called the sign oracle simulator) to play the role of the hash and sign oracles respectively.
Namely, whenever F makes a query (hash, M ) the inverter I will return H -Sim(M ) to F as the
answer, and whenever F makes a query (sign, M ) the inverter I will return Sign-Sim(M ) to F as
the answer. (The Sign-Sim routine will additionally invoke H -Sim.) As it executes, I will build up
various tables (arrays) that “define” H. For j = 1, . . . , qhash , the j-th string on which H is called
in the experiment (either directly due to a hash query by F , indirectly due to a sign query by F ,
or due to the final verification query) will be recorded as Msg[j]; the response returned by the hash
oracle simulator to Msg[j] is stored as Y [j]; and if Msg[j] is a sign query then the response returned
to F as the “signature” is X[j]. Now the question is how I defines all these values.
Suppose the j-th hash query in the experiment arises indirectly, as a result of a sign query
(sign, Msg[j]) by F . In Experiment $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$ the forger will be returned $H(\mathrm{Msg}[j])^d \bmod N$.
If I wants to keep F running it must return something plausible. What could I do? It could
attempt to directly mimic the signing process, setting Y [j] to a random value (remember Y [j]
plays the role of H(Msg[j])) and returning $(Y[j])^d \bmod N$. But it won’t be able to compute the
latter since it is not in possession of the secret signing exponent d. The trick, instead, is that I first
picks a value X[j] at random in $Z_N^*$ and sets $Y[j] = (X[j])^e \bmod N$. Now it can return X[j] as
the answer to the sign query, and this answer is accurate in the sense that the verification relation
(which F might check) holds: we have $Y[j] \equiv (X[j])^e \pmod N$.
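The trick of choosing the signature first and the hash value second can be tried out on toy numbers. The sketch below is illustrative only: the helper names are ours and the parameters (N = 61 · 53, e = 17, d = 2753) are far too small to be secure. The exponent d appears solely to confirm that the simulated answer coincides with the real signature; the simulator itself never uses it.

```python
import secrets
from math import gcd

def sample_unit(N: int) -> int:
    """Uniform element of Z_N^* by rejection sampling."""
    while True:
        x = secrets.randbelow(N - 1) + 1
        if gcd(x, N) == 1:
            return x

def simulate_signature(N: int, e: int):
    """Return (X, Y) with Y = X^e mod N, computed without the secret d."""
    X = sample_unit(N)          # pick the "signature" first
    Y = pow(X, e, N)            # then define the hash value to match it
    return X, Y

N, e, d = 3233, 17, 2753        # toy RSA key, for illustration only
X, Y = simulate_signature(N, e)
assert pow(X, e, N) == Y        # the relation F might check
assert pow(Y, d, N) == X        # X is exactly what the real signer would return
```

Since $\mathrm{RSA}_{N,e}$ is a permutation on $Z_N^*$, a uniform X gives a uniform Y, so the hash value handed to F has the right distribution.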
This leaves a couple of loose ends. One is that we assumed above that I has the liberty of
defining Y [j] at the point the sign query was made. But perhaps Msg[j] = Msg[l] for some l < j
due to there having been a hash query involving this same message in the past. Then the hash
value Y [j] is already defined, as Y [l], and cannot be changed. This can be addressed quite simply
however: for any hash query Msg[l], the hash simulator can follow the above strategy of setting the
reply $Y[l] = (X[l])^e \bmod N$ at the time the hash query is made, meaning it prepares itself ahead of
time for the possibility that Msg[l] is later a sign query. Maybe it will not be, but nothing is lost.
Well, almost. Something is lost, actually. A reader who has managed to stay awake so far may
notice that we have solved two problems: how to use F to find $y^d \bmod N$ where y is the input
to I, and how to simulate answers to sign and hash queries of F , but that these processes are in
conflict. The way we got $y^d \bmod N$ was by returning y as the answer to query (hash, M ) where M
is the message in the forgery. However, we do not know beforehand which message in a hash query
will be the one in the forgery. So it is difficult to know how to answer a hash query Msg[j]; do we
return y, or do we return $(X[j])^e \bmod N$ for some X[j]? If we do the first, we will not be able to
answer a sign query with message Msg[j]; if we do the second, and if Msg[j] equals the message
in the forgery, we will not find the inverse of y. The answer is to take a guess as to which to do.
There is some chance that this guess is right, and I succeeds in that case.
Specifically, notice that Msg[qhash ] = M is the message in the forgery by definition since
Msg[qhash ] is the message in the final verification query. The message M might occur more than
once in the list, but it occurs at least once. Now I will choose a random i in the range 1 ≤ i ≤ qhash
and respond by y to hash query (hash, Msg[i]). To all other queries j it will respond by first picking
X[j] at random in $Z_N^*$ and setting $H(\mathrm{Msg}[j]) = (X[j])^e \bmod N$. The forged message M will equal
Msg[i] with probability at least 1/qhash and this will imply Equation (11.2). Below we summarize
these ideas as a proof of Theorem 11.3.1.
It is tempting from the above description to suggest that we always choose i = qhash , since
Msg[qhash ] = M by definition. Why won’t that work? Because M might also have been equal to
Msg[j] for some j < qhash , and if we had set i = qhash then at the time we want to return y as the
answer to M we find we have already defined H(M ) as something else and it is too late to change
our minds.
Proof of Theorem 11.3.1: We first describe I in terms of two subroutines: a hash oracle
simulator H -Sim(·) and a sign oracle simulator Sign-Sim(·). It takes inputs N, e, y where $y \in Z_N^*$
and maintains three tables, Msg, X and Y , each an array with index in the range from 1 to qhash .
It picks a random index i. All these are global variables which will also be used by the subroutines.
The intended meaning of the array entries is the following, for j = 1, . . . , qhash –
Msg[j] – The j-th hash query in the experiment
Y [j] – The reply of the hash oracle simulator to the above, meaning
the value playing the role of H(Msg[j]). For j = i it is y.
X[j] – For j ≠ i, the response to sign query Msg[j], meaning it satisfies
$(X[j])^e \equiv Y[j] \pmod N$. For j = i it is undefined.
The code for the inverter is below.
Inverter I(N, e, y)
The inverter responds to oracle queries by using the appropriate subroutines. Once it has the
claimed forgery, it makes the corresponding hash query and then returns the signature x.
We now describe the hash oracle simulator. It makes reference to the global variables instantiated
in the main code of I. It takes as argument a value v which is simply some message whose hash
is requested either directly by F or by the sign simulator below when the latter is invoked by F .
We will make use of a subroutine Find that given an array A, a value v and index m, returns 0 if
v ∉ {A[1], . . . , A[m]}, and else returns the smallest index l such that v = A[l].
Subroutine H-Sim(v)
  l ← Find(Msg, v, j) ; j ← j + 1 ; Msg[j] ← v
  If l = 0 then
    If j = i then Y[j] ← y
    Else X[j] ←$ Z_N^* ; Y[j] ← (X[j])^e mod N
    EndIf
    Return Y[j]
  Else
    If j = i then abort
    Else X[j] ← X[l] ; Y[j] ← Y[l] ; Return Y[j]
    EndIf
  EndIf
The manner in which the hash queries are answered enables the following sign simulator.
Subroutine Sign-Sim(M )
h ← H -Sim(M )
If j = i then abort
Else return X[j]
EndIf
Inverter I might abort execution due to the “abort” instruction in either subroutine. The first such
situation is that the hash oracle simulator is unable to return y as the response to the i-th hash
query because this query equals a previously replied to query. The second case is that F asks for
the signature of the message which is the i-th hash query, and I cannot provide that since it is
hoping the i-th message is the one in the forgery and has returned y as the hash oracle response.
Now we need to lower bound the ow-kea-advantage of I with respect to Krsa . There are a few
observations involved in verifying the bound claimed in Equation (11.2). First, the “view” of
F at any time at which I has not aborted is the “same” as in Experiment $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$. This
means that the answers being returned to F by I are distributed exactly as they would be in the
real experiment. Second, F gets no information about the value i that I chooses at random. Now
remember that the last hash simulator query made by I is the message M in the forgery, so M is
certainly in the array Msg at the end of the execution of I. Let l = Find(Msg, M, qhash ) be the
first index at which M occurs, meaning Msg[l] = M but no previous message is M . The random
choice of i then means that there is a 1/qhash chance that i = l, which in turn means that Y [i] = y
and the hash oracle simulator won’t abort. If x is a correct signature of M we will have
$x^e \equiv Y[i] \pmod N$ because Y [i] is H(M ) from the point of view of F . So I is successful whenever this
happens.
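To see the pieces of the reduction working together, here is a small Python sketch of an inverter in the style of the proof, run against a toy forger. Everything about the harness is an assumption made for illustration: the names `inverter`, `toy_forger` and `Abort` are ours, the toy forger "cheats" by knowing d (a real forger would not), and the RSA parameters are insecurely small. One liberty is taken with the sign simulator: it aborts whenever no signature is stored for the queried message, which also covers the case where that message was first hashed at position i.

```python
import secrets
from math import gcd

class Abort(Exception):
    """Raised when the simulation cannot continue (the "abort" instruction)."""

def sample_unit(N):
    """Uniform element of Z_N^* by rejection sampling."""
    while True:
        x = secrets.randbelow(N - 1) + 1
        if gcd(x, N) == 1:
            return x

def inverter(N, e, y, forger, q_hash):
    """Run `forger` on simulated oracles, hoping its output x has x^e = y mod N."""
    i = secrets.randbelow(q_hash) + 1      # guessed position of the forgery message
    msg, X, Y = {}, {}, {}                 # the tables Msg, X, Y (indices 1..q_hash)
    j = 0                                  # number of hash queries so far

    def h_sim(v):
        nonlocal j
        l = next((t for t in range(1, j + 1) if msg[t] == v), 0)  # Find(Msg, v, j)
        j += 1
        msg[j] = v
        if l == 0:
            if j == i:
                Y[j] = y                   # plant the challenge point
            else:
                X[j] = sample_unit(N)
                Y[j] = pow(X[j], e, N)     # signature prepared in advance
        else:
            if j == i:
                raise Abort("guessed query repeats an earlier message")
            X[j], Y[j] = X.get(l), Y[l]    # X[l] is missing exactly when l == i
        return Y[j]

    def sign_sim(M):
        h_sim(M)
        if X.get(j) is None:               # hash value is y: no signature known
            raise Abort("sign query on the guessed message")
        return X[j]

    M, x = forger(N, e, h_sim, sign_sim)
    h_sim(M)                               # final verification query
    return x

def toy_forger(N, e, hash_oracle, sign_oracle, d=2753):
    """Stand-in forger for testing: one sign query, one hash query, then a forgery."""
    sign_oracle(b"other message")
    h = hash_oracle(b"target")
    return b"target", pow(h, d, N)

N, e, y = 3233, 17, 1234                   # toy parameters; y is the point to invert
for _ in range(100):
    try:
        x = inverter(N, e, y, toy_forger, q_hash=3)
    except Abort:
        continue                           # unlucky guess of i; try again
    assert pow(x, e, N) == y
    break
```

With this particular forger there are three hash queries in the experiment, and the inverter succeeds exactly when it guesses i = 2, the position of the first hash query on the forged message. This is the 1/qhash factor in Equation (11.2).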
oracle. Additionally it has a parameter s which is the length of the random value chosen by the
signing algorithm. We write the signing and verifying algorithms as follows:
Algorithm Sig^{H(·)}_{N,p,q,d}(M)
  r ←$ {0,1}^s
  y ← H(r ∥ M)
  x ← y^d mod N
  Return (r, x)

Algorithm VF^{H(·)}_{N,e}(M, σ)
  Parse σ as (r, x) where |r| = s
  y ← H(r ∥ M)
  If x^e mod N = y
  Then return 1 else return 0
Obvious “range checks” are for simplicity not written explicitly in the verification code; for example
in a real implementation the latter should check that 1 ≤ x < N and gcd(x, N ) = 1.
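The two algorithms translate almost line by line into code. The sketch below is a toy illustration, not a usable signature implementation: the function `H` is a stand-in for the random oracle built from SHA-256 (our assumption, not something the notes prescribe), the length s is measured in bytes for convenience, and the RSA parameters are insecurely small. The range checks mentioned above are written out in `verify`.

```python
import hashlib
import secrets
from math import gcd

def H(data: bytes, N: int) -> int:
    """Stand-in for the random oracle: hash with a counter until we land in Z_N^*.

    Illustrative only; this is not a vetted full-domain hash.
    """
    counter = 0
    while True:
        digest = hashlib.sha256(counter.to_bytes(4, "big") + data).digest()
        y = int.from_bytes(digest, "big") % N
        if y != 0 and gcd(y, N) == 1:
            return y
        counter += 1

def sign(N: int, d: int, M: bytes, s_bytes: int = 20):
    r = secrets.token_bytes(s_bytes)       # fresh random value for every signature
    y = H(r + M, N)                        # y <- H(r || M)
    x = pow(y, d, N)                       # x <- y^d mod N
    return r, x

def verify(N: int, e: int, M: bytes, sigma, s_bytes: int = 20) -> bool:
    r, x = sigma
    # Range checks: |r| = s, 1 <= x < N, and gcd(x, N) = 1.
    if len(r) != s_bytes or not (1 <= x < N) or gcd(x, N) != 1:
        return False
    return pow(x, e, N) == H(r + M, N)

N, e, d = 3233, 17, 2753                   # toy RSA key, for illustration only
sigma = sign(N, d, b"hello")
assert verify(N, e, b"hello", sigma)
```

Signing the same message twice will usually give different pairs (r, x), since r is drawn afresh each time.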
This scheme may still be viewed as being in the “hash-then-invert” paradigm, except that
the hash is randomized via a value chosen by the signing algorithm. If you twice sign the same
message, you are likely to get different signatures. Notice that the random value r must be included
in the signature since otherwise it would not be possible to verify the signature. Thus unlike the
previous schemes, the signature is not a member of $Z_N^*$; it is a pair one of whose components is an
s-bit string and the other is a member of $Z_N^*$. The length of the signature is s + k bits, somewhat
longer than signatures for deterministic hash-then-invert signature schemes. It will usually suffice
to set s to, say, 160, and given that k could be 1024, the length increase may be tolerable.
The success probability of a forger F attacking DS is measured in the random oracle model,
via experiment $\mathbf{Exp}^{\text{uf-cma}}_{DS}(F)$. Namely the experiment is the same experiment as in the FDH-RSA
case; only the scheme DS we plug in is now the one above. Accordingly we have the insecurity
function associated to the scheme. Now we can summarize the security property of the PSS0
scheme.
Theorem 11.3.2 Let DS be the PSS0 scheme with security parameters k and s. Let F be an
adversary making qsig signing queries and qhash ≥ 1 + qsig hash oracle queries. Then there exists an
adversary I such that
randomly and independently each time. Now the fact that $\mathrm{RSA}_{N,e}$ is a permutation means that
all the different Y [j] values are randomly and independently distributed. Furthermore, suppose
(M, (r, x)) is a forgery for which hash oracle query r ∥ M has been made and got the response
$Y[l] = y \cdot (X[l])^e \bmod N$. Then we have $(x \cdot X[l]^{-1})^e \equiv y \pmod N$, and thus the inverse of y is
$x \cdot X[l]^{-1} \bmod N$.
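The algebra in this step is easy to confirm numerically. The following check is a sketch on toy values: the function name `unblind` and the specific numbers are ours, and d is used only to play the part of a successful forger. It relies on Python 3.8 or later, where `pow` with exponent -1 computes a modular inverse.

```python
def unblind(x: int, X: int, N: int) -> int:
    """Return x * X^{-1} mod N (requires Python 3.8+ for pow(X, -1, N))."""
    return (x * pow(X, -1, N)) % N

N, e, d = 3233, 17, 2753        # toy RSA key, for illustration only
y, X = 2021, 1000               # challenge point and blinding value, both in Z_N^*

Y = (y * pow(X, e, N)) % N      # the hash reply Y = y * X^e mod N
x = pow(Y, d, N)                # what a successful forger would return
w = unblind(x, X, N)

assert pow(w, e, N) == y        # w is the e-th root of y
assert w == pow(y, d, N)        # and agrees with direct inversion
```

Because every direct hash reply embeds y in this way, a forgery on any direct hash query yields the inverse of y, which is why no guessing of an index is needed here.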
The second problem however, cannot be resolved for FDH. That is exactly why PSS0 prepends
the random value r to the message before hashing. This effectively “separates” the two kinds of
hash queries: the direct queries of F to the hash oracle, and the indirect queries to the hash oracle
arising from the sign oracle. The direct hash oracle queries have the form r ∥ M for some s-bit
string r and some message M . The sign query is just a message M . To answer it, a value r is first
chosen at random. But then the value r ∥ M has low probability of having been a previous hash
query. So at the time any new direct hash query is made, I can assume it will never be an indirect
hash query, and thus reply via the above trick.
Here now is the full proof.
Proof of Theorem 11.3.2: We first describe I in terms of two subroutines: a hash oracle
simulator H -Sim(·) and a sign oracle simulator Sign-Sim(·). It takes input N, e, y where $y \in Z_N^*$,
and maintains four tables, R, V , X and Y , each an array with index in the range from 1 to qhash .
All these are global variables which will also be used by the subroutines. The intended meaning of
the array entries is the following, for j = 1, . . . , qhash –
V [j] – The j-th hash query in the experiment, having the form R[j] ∥ Msg[j]
R[j] – The first s bits of V [j]
Y [j] – The value playing the role of H(V [j]), chosen either by the hash simulator
or the sign simulator
X[j] – If V [j] is a direct hash oracle query of F this satisfies $Y[j] \cdot X[j]^{-e} \equiv y \pmod N$.
If V [j] is an indirect hash oracle query this satisfies $X[j]^e \equiv Y[j] \pmod N$,
meaning it is a signature of Msg[j].
Note that we don’t actually need to store the array Msg; it is only referred to above in the expla-
nation of terms.
We will make use of a subroutine Find that given an array A, a value v and index m, returns 0 if
v ∉ {A[1], . . . , A[m]}, and else returns the smallest index l such that v = A[l].
Inverter I(N, e, y)
  Initialize arrays R[1 . . . qhash ], V [1 . . . qhash ], X[1 . . . qhash ], Y [1 . . . qhash ], to empty
  j ← 0
  Run F on input N, e
    If F makes oracle query (hash, v)
      then h ← H-Sim(v) ; return h to F as the answer
    If F makes oracle query (sign, M)
      then σ ← Sign-Sim(M) ; return σ to F as the answer
  Until F halts with output (M, (r, x))
  h ← H-Sim(r ∥ M) ; l ← Find(V, r ∥ M, qhash)
  w ← x · X[l]^{-1} mod N ; Return w
We now describe the hash oracle simulator. It makes reference to the global variables instantiated
in the main code of I. It takes as argument a value v which is assumed to be at least s bits long,
meaning of the form r ∥ M for some s-bit string r. (There is no need to consider hash queries not
of this form since they are not relevant to the signature scheme.)
Subroutine H-Sim(v)
  Parse v as r ∥ M where |r| = s
  l ← Find(V, v, j) ; j ← j + 1 ; R[j] ← r ; V[j] ← v
  If l = 0 then
    X[j] ←$ Z_N^* ; Y[j] ← y · (X[j])^e mod N ; Return Y[j]
  Else
    X[j] ← X[l] ; Y[j] ← Y[l] ; Return Y[j]
  EndIf
Every string v queried of the hash oracle is put by this routine into a table V , so that V [j] is the
j-th hash oracle query in the execution of F . The following sign simulator does not invoke the hash
simulator, but if necessary fills in the necessary tables itself.
Subroutine Sign-Sim(M)
  r ←$ {0,1}^s
  l ← Find(R, r, j)
  If l ≠ 0 then abort
  Else
    j ← j + 1 ; R[j] ← r ; V[j] ← r ∥ M ; X[j] ←$ Z_N^* ; Y[j] ← (X[j])^e mod N
    Return X[j]
  EndIf
This is justified as follows. Remember that the last hash simulator query made by I is r ∥ M where
M is the message in the forgery, so r ∥ M is certainly in the array V at the end of the execution of
I. So l = Find(V, r ∥ M, qhash ) ≠ 0. We know that r ∥ M was not put in V by the sign simulator,
because F is not allowed to have made sign query M . This means the hash oracle simulator has
been invoked on r ∥ M . This means that $Y[l] = y \cdot (X[l])^e \bmod N$ because that is the way the
hash oracle simulator chooses its replies. The correctness of the forgery means that
$x^e \equiv H(r \,\|\, M) \pmod N$, and the role of the H value here is played by Y [l], so we get
$x^e \equiv Y[l] \equiv y \cdot (X[l])^e \pmod N$. Solving this gives $(x \cdot X[l]^{-1})^e \bmod N = y$, and thus the inverter is correct in returning
$x \cdot X[l]^{-1} \bmod N$.
It may be worth adding some words of caution about the above. It is tempting to think that
$$\mathbf{Adv}^{\text{ow-kea}}_{K_{rsa}}(I) \;\ge\; \left[\, 1 - \frac{(q_{hash} - 1) \cdot q_{sig}}{2^s} \,\right] \cdot \mathbf{Adv}^{\text{uf-cma}}_{DS}(F) ,$$
which would imply Equation (11.3) but is actually stronger. This however is not true, because the
bad events and success events as defined above are not independent.
Chapter 12
This chapter assumes that the reader has background in basic computational complexity theory.
You should know about complexity classes like P, NP, PSPACE, RP, BPP. It also assumes
background in basic cryptography, including computational number theory.
We fix the alphabet Σ = {0, 1}. A member of Σ∗ is called a string, and the empty string is
denoted ε. Objects upon which computation is performed, be they numbers, graphs or sets, are
assumed to be appropriately encoded as strings. A language is a set of strings.
14.1 Introduction
Consider two parties, whom we will call the prover and the verifier, respectively. They have a
common input, denoted x. They also might have individual, private inputs, called auxiliary inputs,
with that of the prover denoted w, and that of the verifier denoted a. Each party also has access to
a private source of random bits, called the party’s coins. The parties exchange messages, and, at the
end of the interaction, the verifier either accepts or rejects. Each party computes the next message
it sends as a function of the common input, its auxiliary input, its coins, and the conversation so
far.
The computational powers of the parties are important. The verifier must be “efficient.” (As
per complexity-theoretic conventions, this means it must be implementable by an algorithm running
in time polynomial in the length of the common input.) Accordingly, it is required that the number
of moves (meaning, message exchanges) be bounded by a polynomial in the length of the common
input x. The computational power of the prover varies according to the setting and requirements,
as we will see below.
We are interested in various goals for the interaction. Some are more important in complexity-
theory, others in cryptography.
two conditions. The first, called “completeness,” asks that the verifier be open to being convinced
of true claims, meaning that there exists a prover strategy P such that the interaction between
P and V on common input x ∈ L leads the verifier to accept. The second, called “soundness,”
asks that the verifier be able to protect itself against being convinced of false claims, meaning that
for any prover strategy $\hat{P}$, the interaction between $\hat{P}$ and V on common input x ∉ L leads the
verifier to reject except with some “small” probability. (This is called the error-probability and is
a parameter of the system.)
As an example, suppose L is the language SAT of satisfiable, boolean formulae. In that case the
common input x is a boolean formula. The protocol consists of a single move, in which the prover
supplies a string y that the verifier expects to be an assignment to the variables of the formula that
makes the formula true. The verifier program simply evaluates x at y, accepting iff this value is
one. If the formula is satisfiable, the prover can prove this by sending the verifier a satisfying
truth assignment y. If the formula x is unsatisfiable, there is no string y that can make the verifier
accept.
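The verifier in this example is just a formula evaluator. As a minimal sketch, assume the formula is given in conjunctive normal form as a list of clauses, each a list of nonzero integers where k stands for variable k and -k for its negation; this encoding and the function name are our assumptions, not part of the notes.

```python
def verifier_accepts(formula, assignment) -> bool:
    """Accept iff `assignment` (dict: variable -> bool) satisfies the CNF `formula`."""
    try:
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in formula
        )
    except KeyError:
        # The prover's message does not assign every variable: reject.
        return False

# (x1 or not x2) and (x2 or x3)
formula = [[1, -2], [2, 3]]
assert verifier_accepts(formula, {1: True, 2: True, 3: False})
assert not verifier_accepts(formula, {1: False, 2: True, 3: False})
```

The verifier runs in time linear in the size of the formula and uses no randomness. For an unsatisfiable formula such as `[[1], [-1]]`, no message from the prover makes it accept, which is the soundness condition with error-probability zero.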
The above is a very simple proof system in that the interaction consists of a single message
from prover to verifier, and randomness is not used. This is called an NP-proof system. There are
several reasons for which we are interested in proof systems that go beyond this, comprising many
rounds of exchange and allowing randomness. On the complexity-theoretic side, they enable one to
prove membership in languages outside NP. On the cryptographic side, they enable one to have
properties like “zero-knowledge,” to be discussed below.
Within the definitional template outlined above, there are various variants, depending on the
computational power allowed to the prover in the two conditions. When no computational re-
strictions are put on the prover in the completeness and soundness conditions, one obtains what
is actually called an interactive proof in the literature, a notion due to Goldwasser, Micali and
Rackoff [20]. A remarkable fact is that IP, the class of languages possessing interactive proofs of
membership, equals PSPACE [27, 34], showing that interaction and randomness extend language
membership-proof capability well beyond NP.
Thinking of the prover’s computation as one to be actually implemented in a cryptographic
protocol, however, one must require that it be feasible. A stronger completeness requirement,
that we call poly-completeness, is considered. It asks that there exist a prover P , running in time
polynomial in the length of the common input, such that for every x ∈ L there exists some auxiliary
input w which, when given to P , enables the latter to make the verifier accept. The NP-proof
system outlined above satisfies this condition because a satisfying assignment to x can play the role
of the auxiliary input, and the prover simply transmits it.
The auxiliary input is important, since without it one would not expect a polynomial-time
prover to be able to prove anything of interest. Why is it realistic? We imagine that the input
x, for the sake of example a boolean formula, did not appear out of thin air, but was perhaps
constructed by the prover in such a way that the latter knows a satisfying assignment, and makes
claims about the satisfiability of the formula based on this knowledge.
The soundness condition can analogously be weakened to ask that it hold only with respect to
polynomial-time provers $\widehat{P}$, having no auxiliary input. This poly-soundness condition is usually
enough in cryptographic settings.
A proof system satisfying poly-completeness and poly-soundness is sometimes called a compu-
tationally sound proof system, or an argument [6], with a proof system satisfying completeness
and soundness called a statistically sound proof system, or, as mentioned above, an interactive
proof. The class of languages possessing proofs of membership satisfying poly-completeness and
poly-soundness is a subset of IP and a superset of NP. We do not know any simple characterization
of it. For cryptographic applications, it is typically enough that it contains NP.
Although poly-soundness suffices for applications, it would be a mistake to think that soundness
is not of cryptographic interest. It often holds, meaning natural protocols have this property, and
is technically easier to work with than poly-soundness. An example illustrating this is the fact that
soundness is preserved under parallel composition, while poly-soundness is not [5].
The focus of these notes being cryptography, we will neglect the large and beautiful area of
the computational complexity of interactive proof systems. But let us at least try to note some
highlights. The first evidence as to the power of interactive proofs was provided by the fact that the
language of non-isomorphic graphs, although not known to be in NP, possesses an interactive proof
of membership [15]. Eventually, as noted above, it was found that IP = PSPACE. The related
model of probabilistically checkable proofs has been applied to derive strong non-approximability
results for NP-optimization problems, solving age-old open questions in algorithms.
If a prover sends a verifier a truth assignment to their common input formula x, it does more
than convince the verifier that x is satisfiable: it gives that verifier useful information regarding
the satisfiability of the formula. In particular, it enables the verifier to then prove to a third party
that the same formula is satisfiable. The goal of a zero-knowledge protocol in this setting would
be to not reveal to the verifier anything other than the fact that x is satisfiable, and in particular
not enable the verifier, after having completed its interaction with the prover, to turn around and
convince another polynomial-time party that x is satisfiable.
Although the above discussion was about proofs of language membership, the same applies
to proofs of knowledge. One can have zero-knowledge proofs of language membership, or zero-
knowledge proofs of knowledge.
The prover claims that some fact related to x is true, namely that x is a member of some fixed
and understood underlying language L. The verifier is skeptical, but willing to be convinced. It
asks the prover to supply evidence as to the truth of the claim, and eventually makes a decision,
based on this evidence, regarding whether or not the claimed fact about x is true. We require that
if x ∈ L then it should be possible for the prover to convince the verifier, but if x ∉ L, then, no
matter what the prover does, the verifier will not be convinced. This is what we call a proof system.
The simplest type of proof system is an NP one. Here, the evidence supplied by the prover is
a witness string y, and the verifier can check some relation between x and y in time polynomial in
the length of x. For example if the underlying language is SAT, the input x would be a formula,
and the witness could be a satisfying truth assignment. The verifier would evaluate x at the given
truth assignment, and accept if and only if the answer is 1.
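The SAT verifier just described can be sketched in a few lines. This is only an illustration: the CNF encoding (lists of signed integers) and the names `evaluate_cnf` and `np_verifier` are assumptions made for the example, not notation from these notes.

```python
# Hypothetical encoding (an assumption for this sketch): a CNF formula is a
# list of clauses; each clause is a list of non-zero integers, where k stands
# for variable k and -k for its negation.

def evaluate_cnf(formula, assignment):
    """Return 1 if `assignment` (dict: variable -> bool) satisfies `formula`, else 0."""
    for clause in formula:
        # A clause is true when at least one of its literals is true.
        if not any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause):
            return 0
    return 1

def np_verifier(formula, witness):
    """Evaluate the formula x at the witness y; accept iff the value is 1."""
    return "accept" if evaluate_cnf(formula, witness) == 1 else "reject"

# Example formula: (x1 OR NOT x2) AND (x2 OR x3)
phi = [[1, -2], [2, 3]]
print(np_verifier(phi, {1: True, 2: True, 3: False}))
print(np_verifier(phi, {1: False, 2: True, 3: False}))
```

The verifier's running time is linear in the size of the formula, which is what makes the witness efficiently checkable.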
Interactive proofs extend NP-proof systems in two ways: they allow an exchange of messages,
and they allow both parties to be randomized. We require that if x ∈ L then it should be possible
for the prover to convince the verifier, but if x ∉ L, then, no matter what the prover does, the
probability that the verifier is convinced is low. As we can see, “conviction” has become a probabilistic
attribute, because there is some small probability that the verifier accepts when x ∉ L.
Prover P                                      Verifier V
Coins: $R_P$                                  Coins: $R_V$
Auxiliary input: $a_P$                        Auxiliary input: $a_V$
Initial state: $S_P^0 = (x, a_P, R_P)$        Initial state: $S_V^0 = (x, a_V, R_V)$
A current state is maintained by each party, and, when this party receives a message, it uses its
interactive function to determine the next message to the other party, as well as a new, updated
state to maintain, as a function of the message it received and its current state. The exchange con-
tinues for some pre-determined number of moves m(·), the latter being a polynomially-bounded,
polynomial-time computable function of the length n of the common input. The last message is
sent by the prover, and at the end of it, V must enter one of the special states accept or reject.
Fig. 14.1 illustrates the message exchange process for the case of a prover-initiated interaction.
A verifier-initiated interaction proceeds analogously with the first message being sent by the ver-
ifier. The number of moves m is odd when the interaction is prover-initiated, and even when the
interaction is verifier-initiated.
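The message exchange can be sketched as a small driver loop. This is a minimal sketch under stated assumptions: an interactive function is modeled as a Python function taking (incoming message, current state) and returning (outgoing message, new state), the empty string stands for ε, and `run_interaction`, `toy_prover` and `toy_verifier` are hypothetical names. The toy protocol only exercises the driver and is not a proof system for any interesting language.

```python
def run_interaction(P, V, x, a_P, a_V, R_P, R_V, moves):
    """Run a `moves`-move interaction and return the verifier's final state."""
    state_P = (x, a_P, R_P)            # initial state of the prover
    state_V = (x, a_V, R_V)            # initial state of the verifier
    message = ""                       # the empty string plays the role of ε
    prover_turn = (moves % 2 == 1)     # prover-initiated iff the number of moves is odd
    for _ in range(moves):
        if prover_turn:
            message, state_P = P(message, state_P)
        else:
            message, state_V = V(message, state_V)
        prover_turn = not prover_turn
    # The last message was sent by the prover; the verifier reads it and decides.
    _, state_V = V(message, state_V)
    return state_V

def toy_verifier(message, state):
    """Toy 2-move verifier: send the coins as a challenge, expect them reversed."""
    if state[0] == "waiting":          # second call: check the prover's reply
        _, challenge = state
        return "", ("accept" if message == challenge[::-1] else "reject")
    _x, _a_V, R_V = state              # first call: send the challenge
    return R_V, ("waiting", R_V)

def toy_prover(message, state):
    """Toy prover: answer a challenge with the reversed string."""
    return message[::-1], state

print(run_interaction(toy_prover, toy_verifier, "x", "", "", "", "011", moves=2))
```

Because both parties are deterministic functions of their states, fixing the random tapes fixes the whole transcript and the decision.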
The random tape of a party is the only source of randomness for this party’s interactive function,
and the manner in which randomness enters the party’s computation. The length of the random
tape RI for an interactive function I is a function rI (·) of the length n of the common input.
Notice that once the interactive functions P, V have been chosen and their initial states (which
include their random tapes) have been chosen, the sequence of messages between them, and the
verifier’s decision, are determined, meaning these are deterministic functions of the initial states.
We let
$$\mathrm{Decision}^{P,a_P,R_P}_{V,a_V,R_V}(x)$$
denote this decision. This is the value, either accept or reject, given by the final state of V in
the interaction with P in which the initial state of V is (x, aV , RV ) and the initial state of P is
(x, aP , RP ). This leads to an important quantity, the accepting probability, defined as
$$\mathrm{Acc}^{P,a_P}_{V,a_V}(x) = \Pr\left[\, \mathrm{Decision}^{P,a_P,R_P}_{V,a_V,R_V}(x) = \text{accept} \,\right] , \qquad (14.1)$$
the probability being over the random choices of RP and RV . In more detail, the quantity of
Equation (14.1) is the probability that the following experiment returns 1:
$n \leftarrow |x|$
$R_P \xleftarrow{\$} \{0,1\}^{r_P(n)}$ ; $R_V \xleftarrow{\$} \{0,1\}^{r_V(n)}$
$\mathit{decision} \leftarrow \mathrm{Decision}^{P,a_P,R_P}_{V,a_V,R_V}(x)$
If $\mathit{decision}$ = accept then return 1 else return 0
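For very short random tapes, the accepting probability can be computed exactly by running the experiment over every pair of coin strings. The sketch below assumes the decision is available as a Python function `decision(x, R_P, R_V)` with the auxiliary inputs already fixed; `accepting_probability` and `toy_decision` are hypothetical names used only for illustration.

```python
from fractions import Fraction
from itertools import product

def accepting_probability(decision, x, r_P, r_V):
    """Exact accepting probability: fraction of coin pairs (R_P, R_V) leading to accept."""
    n = len(x)
    accepts = 0
    total = 0
    for R_P in product("01", repeat=r_P(n)):
        for R_V in product("01", repeat=r_V(n)):
            total += 1
            if decision(x, "".join(R_P), "".join(R_V)) == "accept":
                accepts += 1
    return Fraction(accepts, total)

def toy_decision(x, R_P, R_V):
    """Toy decision: accept iff the prover's coin equals the verifier's first coin."""
    return "accept" if R_P[0] == R_V[0] else "reject"

# One prover coin and two verifier coins, independent of the input length.
print(accepting_probability(toy_decision, "abc", lambda n: 1, lambda n: 2))
```

Enumeration takes time exponential in the tape lengths, so this is useful only for checking small examples by hand.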
The time complexity of an interactive function is measured as a function of the length n of the
common input. In particular, interactive function I is said to be polynomial time if it is computable
in time polynomial in the length of the common input. Henceforth, a verifier is a polynomial-time
interactive function. When I is polynomial-time, it is assumed that rI is polynomially-bounded
and polynomial-time computable.
Definition 14.3.1 Let V be a polynomial-time interactive function, which we call the verifier. Let
L be a language, and let δ: N → [0, 1] be a function. We consider the following conditions on V
relative to L and δ:
1. True-completeness: There exists an interactive function P such that
$$\forall x \in L \ \ \forall a_V \in \Sigma^* \; : \; \mathrm{Acc}^{P,\varepsilon}_{V,a_V}(x) = 1 .$$
The prover P of a completeness condition is called the honest prover, while a prover being considered
for a soundness condition is called a cheating prover.
Definition 14.3.2 Let L be a language. We say that L has an interactive true-proof (of mem-
bership) if there exists a polynomial-time interactive function V such that the following conditions
hold: true-completeness, and true-soundness with error-probability 1/2. We let IP denote the class
of all languages possessing interactive true-proofs of membership.
Definition 14.3.3 Let L be a language. We say that L has an interactive poly-proof (of mem-
bership) if there exists a polynomial-time interactive function V such that the following conditions
hold: poly-completeness, and poly-soundness with error-probability 1/2. We let pIP denote the
class of all languages possessing interactive poly-proofs of membership.
Before we discuss the definitions further, let us look at a simple class of examples.
Definition 14.3.4 An NP-relation is a boolean-valued function ρ(·, ·), computable in time poly-
nomial in the length of its first input. A language L is in the class NP if there exists an NP-relation
ρ such that
L = { x ∈ Σ∗ : ∃y ∈ Σ∗ such that ρ(x, y) = 1 } .
In this case, ρ is called an NP-relation for L. When ρ is understood, a string y such that ρ(x, y) = 1
is called a certificate, or witness, to the membership of x in L.
An example is the language SAT consisting of all satisfiable boolean formulae. The associated
NP-relation is ρ(ϕ, y) = 1 iff y is a satisfying assignment to formula ϕ.
If L ∈ NP then it has a very simple associated proof system. The protocol has just one move,
from prover to verifier. The prover is expected to supply a certificate for the membership of the
common input in the language, and the verifier checks it. Let us detail this, to make sure we see
how to fit the models and definitions we have provided above.
Proposition 14.3.5 Suppose L ∈ NP. Then there exists a verifier V defining a one-move protocol
satisfying poly-completeness, and true-soundness with error-probability zero, with respect to L.
Note that this shows that NP ⊆ IP and also NP ⊆ pIP. Why? Because poly-completeness
implies true-completeness, and true-soundness implies poly-soundness, as we will see below.
Proof of Proposition 14.3.5: We need to specify the interactive function V . We set rV (·) = 0,
meaning the verifier uses no coins. We let ρ be an NP-relation for L. Now, the description of V is
the following:
Verifier V (M ; S)
Parse S as (x, aV , ε)
If ρ(x, M ) = 1 then decision ← accept else decision ← reject
Return (ε, decision)
The initial state of the verifier is the only state it can maintain, and this contains the common input
x, an auxiliary input aV , and the emptystring representing a random tape of length 0. The verifier
treats the incoming message as a witness for the membership of x in L, and evaluates ρ(x, M )
to verify that this is indeed so. The outgoing message is ε, since the verifier takes its decision
after receiving its first message. The verifier is polynomial-time because ρ is computable in time
polynomial in the length of its first input.
Let us now show that poly-completeness holds. To do this we must specify an interactive function
P meeting condition 2 of Definition 14.3.1. It is deterministic, meaning $r_P(\cdot) = 0$. Its action is
simply to transmit its auxiliary input to the verifier. Formally:
Prover P (ε; S)
Parse S as (x, aP , ε)
Return (aP , ε)
Now we must check that this satisfies the conditions. Assume x ∈ L. Then, since ρ is an NP-
relation for L, there exists a y such that ρ(x, y) = 1. Setting aP = y, we see that
$$\mathrm{Acc}^{P,a_P}_{V,a_V}(x) = 1$$
for all $a_V \in \Sigma^*$.
Finally, we show that true-soundness with error-probability zero holds. Let $\widehat{P}$ be an arbitrary
interactive function, and assume x ∉ L. In that case, there is no string y such that ρ(x, y) = 1,
and thus no message M that could lead V to accept.
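The one-move protocol from the proof can be mirrored directly in code. The sketch below keeps the (message; state) convention of the pseudocode above, with the empty string standing for ε. The relation `rho` is a toy assumption chosen for the example (its language is the set of composite integers), and `one_move_protocol` is a hypothetical helper name.

```python
def rho(x, y):
    """Toy NP-relation (an assumption): y is a nontrivial divisor of the integer x."""
    return 1 if isinstance(y, int) and 1 < y < x and x % y == 0 else 0

def V(M, S):
    """Verifier V(M; S): treat the incoming message M as a witness for x."""
    x, a_V, R_V = S                    # R_V is empty: the verifier uses no coins
    decision = "accept" if rho(x, M) == 1 else "reject"
    return "", decision                # (outgoing message ε, final state)

def P(M, S):
    """Honest prover P(ε; S): transmit the auxiliary input."""
    x, a_P, R_P = S
    return a_P, ""                     # (outgoing message a_P, new state ε)

def one_move_protocol(x, a_P, a_V=""):
    message, _ = P("", (x, a_P, ""))
    _, decision = V(message, (x, a_V, ""))
    return decision

print(one_move_protocol(91, 7))        # 91 = 7 * 13, so 7 is a witness
print(one_move_protocol(97, 7))        # 97 is prime: no witness exists
```

For x outside the language every possible message is rejected, which is the error-probability-zero soundness argued in the proof.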
Be careful with your arguments above. In particular, how exactly are you using the poly-soundness
condition?
Another worthwhile exercise at this point is to prove the following relation, which says that
poly-completeness is a stronger requirement than true-completeness.
The reason this is true is of course the fact that the prover in the true-completeness condition is
not computationally restricted, but the reason it is worth looking at more closely is to make sure
that you take into account the auxiliary inputs of both parties and the roles they play.
We have denied the prover an auxiliary input in both the soundness conditions. True-soundness
is not affected by whether or not the prover gets an auxiliary input, but poly-soundness might be,
and we have chosen a simple formulation. Notice that poly-soundness is a weaker requirement than
true-soundness in the sense that true-soundness implies poly-soundness.
The auxiliary input of the verifier will be used to model its history, meaning information from
prior interactions. This is important for zero-knowledge, but in the interactive proof context you
can usually ignore it, and imagine that aV = ε.
Many natural protocols have an error-probability of δ(n) = 1/2 in the soundness condition.
One way to lower this is by independent repetitions of the protocol. As we will see later, the extent
to which this is effective depends on the type of soundness condition (whether true or poly).
Figure 14.2: Known containment relations amongst some interactive proof related complexity
classes. A line indicates that the class at the lower end of the line is contained in the class at
the upper end of the line.
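Prover P                                        Verifier V
Pick $r \xleftarrow{\$} Z_N^*$
Let $y \leftarrow r^2 \bmod N$
                    ------ y ----->
                                                Pick $c \xleftarrow{\$} \{0, 1\}$
                    <----- c ------
$z \leftarrow r s^c \bmod N$
                    ------ z ----->
                                                Accept iff $z^2 \equiv y x^c \pmod N$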
We must first recall some number theory. Let N be a positive integer. An element $S \in Z_N^*$ is
a square, or quadratic residue, if it has a square root modulo N, namely there is an $s \in Z_N^*$ such
that $s^2 \equiv S \pmod N$. If not, it is a non-square or non-quadratic-residue. The quadratic residuosity
language is
QR = { (N, S) : S is a quadratic residue modulo N } .
Note a number may have lots of square roots modulo N. Recall that there exists a polynomial time
algorithm to compute the gcd, meaning on inputs N, s it returns gcd(N, s). There also exists a
polynomial time algorithm that on inputs N, S, s can check that $s^2 \equiv S \pmod N$. However, there
is no known polynomial-time algorithm, even randomized, that on input N, S returns a square root
of S modulo N.
Now imagine that the common input x to the prover and verifier is a pair (N, S). We consider
various possible requirements of the protocol between the parties, and the motivations for these
requirements.
The prover claims that S is a quadratic residue modulo N , meaning it claims that (N, S) is a
member of the language QR.
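One round of the three-move protocol shown in the figure can be sketched as follows. This is a sketch under assumptions: the common input is taken to be (N, S), the prover's auxiliary input is a square root s of S modulo N, and the figure's x is read as S. The helper names `sample_unit` and `qr_round` are hypothetical, and the optional argument `c` exists only so that the challenge can be fixed when experimenting.

```python
import math
import secrets

def sample_unit(N):
    """Sample r uniformly from Z_N^* by rejection sampling."""
    while True:
        r = secrets.randbelow(N - 1) + 1          # uniform in 1..N-1
        if math.gcd(r, N) == 1:
            return r

def qr_round(N, S, s, c=None):
    """Run one round; return True iff the verifier accepts."""
    # Prover: commit to a random square y = r^2 mod N.
    r = sample_unit(N)
    y = (r * r) % N
    # Verifier: pick a random challenge bit (unless fixed by the caller).
    if c is None:
        c = secrets.randbelow(2)
    # Prover: respond with z = r * s^c mod N.
    z = (r * pow(s, c, N)) % N
    # Verifier: accept iff z^2 = y * S^c (mod N).
    return (z * z) % N == (y * pow(S, c, N)) % N

# Toy parameters: 9^2 = 81 = 4 (mod 77), so s = 9 is a square root of S = 4.
print(all(qr_round(77, 4, 9) for _ in range(20)))
```

With a correct square root the check always passes, since $z^2 = r^2 s^{2c} \equiv y S^c \pmod N$. The modulus 77 is far too small for any security and is used only to keep the arithmetic readable.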
14.4 NP proof-systems
Example 14.4.1 We claim that QR is in NP. To justify this we must present an NP-relation ρ
for QR. The relation in question is defined by
$$\rho((N, S), s) = \begin{cases} 1 & \text{if } S, s \in Z_N^* \text{ and } s^2 \equiv S \pmod N \\ 0 & \text{otherwise.} \end{cases}$$
The facts recalled above tell us that ρ is computable in time polynomial in the length of the pair
N, S. This involves two gcd computations and one squaring operation. So ρ is an NP-relation.
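The relation ρ of Example 14.4.1 translates almost literally into code. In this sketch elements of $Z_N^*$ are represented as integers between 1 and N − 1 that are coprime to N, and the function name `rho_qr` is an assumption made for illustration.

```python
import math

def rho_qr(N, S, s):
    """Return 1 iff S and s lie in Z_N^* and s^2 = S (mod N); otherwise 0."""
    def in_group(a):
        # Membership in Z_N^*: one gcd computation per element.
        return 0 < a < N and math.gcd(a, N) == 1
    if in_group(S) and in_group(s) and (s * s) % N == S:
        return 1
    return 0

print(rho_qr(77, 4, 9))    # 9^2 = 81 = 4 (mod 77)
print(rho_qr(77, 4, 3))    # 3^2 = 9, which is not 4 (mod 77)
```

The two `in_group` calls are the two gcd computations and `s * s` is the single squaring mentioned in the example.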
Now suppose L ∈ NP and let ρ be an NP relation for L. Imagine a party that has an input x and
wants to know whether or not this input is in L. It probably cannot do this efficiently, since NP is
probably different from P. Now imagine that there is another party that is willing to help out. We
call this party the prover. It has, for some reason, the ability to determine whether or not x is in L.
Our original party is willing to take the prover’s help, but does not trust the prover. It asks that if
the prover claims that x ∈ L, it should supply evidence to this effect, and the evidence should be
efficiently verifiable, where “efficient” means in time polynomial in the length of x. Accordingly, we
call our original party a verifier. Given that L ∈ NP, the evidence can take the form of a witness
y satisfying ρ(x, y) = 1, where ρ is an NP-relation for L. The verifier would compute ρ to check
this evidence.
We thus visualize a game involving two parties, a prover and a verifier, having a common input
x. The verifier is computationally restricted, specifically to run in time polynomial in the length n
of the input x, but no computation restrictions are put on the prover. The prover claims that x is
a member of the underlying language L. To prove its claim, the prover transmits to the verifier a
string y. The latter evaluates ρ(x, y), and accepts if and only if this value is 1. This is an NP-proof
system.
Let us formalize this. We say that a language L has an NP-proof system if there exists an
algorithm V , called the verifier, that is computable in time polynomial in its first input, and for
which two conditions hold. The first, called completeness, says that there exists a function P such
that, if the verifier is supplied the message y = P (x), then it accepts. The second condition, called
soundness
testing whether x is in L by running some decision procedure, but, being restricted to polynomial
time, this will only lend it certainty if L ∈ P. However, the verifier is willing to allow the prover
to supply evidence, or “proof”, of its claim. The prover is asked to supply a string y
The setting is that we have q balls. View them as numbered, 1, . . . , q. We also have N bins, where
N ≥ q. We throw the balls at random into the bins, one by one, beginning with ball 1. At random
means that each ball is equally likely to land in any of the N bins, and the probabilities for all the
balls are independent. A collision is said to occur if some bin ends up containing at least two balls.
We are interested in C(N, q), the probability of a collision.
The birthday paradox is the case where N = 365. We are asking what is the chance that, in a
group of q people, there are two people with the same birthday, assuming birthdays are randomly
and independently distributed over the days of the year. It turns out that when q hits $\sqrt{365}$ the
chance of a birthday collision is already quite high, around 1/2.
This fact can seem surprising when first heard. The reason it is true is that the collision
probability C(N, q) grows roughly proportional to $q^2/N$. This is the fact to remember. The
following gives a more exact rendering, providing both upper and lower bounds on this probability.
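For concrete numbers, C(N, q) can be computed exactly from the fact that the (i + 1)-st ball avoids the i occupied bins with probability 1 − i/N when no collision has occurred so far. The sketch below tabulates a few values for N = 365 next to the rule-of-thumb quantity q(q − 1)/2N; the function name `collision_probability` is an assumption made for this example.

```python
def collision_probability(N, q):
    """Exact probability of at least one collision among q balls in N bins."""
    no_collision = 1.0
    for i in range(1, q):
        # Ball i+1 must avoid the i distinct bins already occupied.
        no_collision *= 1 - i / N
    return 1 - no_collision

for q in (10, 19, 23, 30, 40):
    estimate = q * (q - 1) / (2 * 365)
    print(q, round(collision_probability(365, q), 4), round(estimate, 4))
```

Running the loop shows how closely the quadratic estimate tracks the exact value for small q and where the two begin to diverge.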
Theorem A.0.1 [Birthday bound] Let C(N, q) denote the probability of at least one collision
when we throw q ≥ 1 balls at random into N ≥ q buckets. Then
$$C(N, q) \le \frac{q(q-1)}{2N}$$
and
$$C(N, q) \ge 1 - e^{-q(q-1)/2N} .$$
Also if $1 \le q \le \sqrt{2N}$ then
$$C(N, q) \ge 0.3 \cdot \frac{q(q-1)}{N} .$$
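The three bounds of Theorem A.0.1 can be sanity-checked numerically against the exact collision probability. This is only a spot check on a few parameter choices, not a proof; the names `exact_collision` and `check_birthday_bounds` and the small floating-point tolerance are assumptions of the sketch.

```python
import math

def exact_collision(N, q):
    """Exact C(N, q), as one minus the probability that all q balls land in distinct bins."""
    return 1 - math.prod(1 - i / N for i in range(1, q))

def check_birthday_bounds(N, q, tol=1e-12):
    """Return True iff all applicable bounds of the theorem hold for (N, q)."""
    c = exact_collision(N, q)
    x = q * (q - 1) / (2 * N)
    upper_ok = c <= x + tol
    lower_ok = c >= 1 - math.exp(-x) - tol
    linear_ok = True
    # For integers q, the condition q <= sqrt(2N) is the same as q <= floor(sqrt(2N)).
    if q <= math.isqrt(2 * N):
        linear_ok = c >= 0.3 * q * (q - 1) / N - tol
    return upper_ok and lower_ok and linear_ok

print(all(check_birthday_bounds(365, q) for q in range(1, 366)))
```

The check covers every q from 1 to N for the birthday case N = 365; other values of N can be tried the same way.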
In the proof we will find the following inequalities useful to make estimates.
Proof of Theorem A.0.1: Let $C_i$ be the event that the i-th ball collides with one of the previous
ones. Then $\Pr[C_i]$ is at most (i − 1)/N, since when the i-th ball is thrown in, there are at most
i − 1 different occupied slots and the i-th ball is equally likely to land in any of the N slots. Now
$$\begin{aligned}
C(N, q) &= \Pr[C_1 \vee C_2 \vee \cdots \vee C_q] \\
&\le \Pr[C_1] + \Pr[C_2] + \cdots + \Pr[C_q] \\
&\le \frac{0}{N} + \frac{1}{N} + \cdots + \frac{q-1}{N} \\
&= \frac{q(q-1)}{2N} .
\end{aligned}$$
This proves the upper bound. For the lower bound we let Di be the event that there is no collision
after having thrown in the i-th ball. If there is no collision after throwing in i balls then they must
all be occupying different slots, so the probability of no collision upon throwing in the (i + 1)-st
ball is exactly (N − i)/N . That is,
$$\Pr[D_{i+1} \mid D_i] = \frac{N-i}{N} = 1 - \frac{i}{N} .$$
Also note Pr [D1 ] = 1. The probability of no collision at the end of the game can now be computed
via
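$$\begin{aligned}
1 - C(N, q) &= \Pr[D_q] \\
&= \Pr[D_q \mid D_{q-1}] \cdot \Pr[D_{q-1}] \\
&\;\;\vdots \\
&= \prod_{i=1}^{q-1} \Pr[D_{i+1} \mid D_i] \\
&= \prod_{i=1}^{q-1} \left( 1 - \frac{i}{N} \right) .
\end{aligned}$$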
Note that i/N ≤ 1. So we can use the inequality $1 - x \le e^{-x}$ for each term of the above expression.
This means the above is not more than
$$\prod_{i=1}^{q-1} e^{-i/N} = e^{-1/N - 2/N - \cdots - (q-1)/N} = e^{-q(q-1)/2N} .$$
Putting all this together we get
$$C(N, q) \ge 1 - e^{-q(q-1)/2N} ,$$
which is the second inequality in Theorem A.0.1. To get the last one, we need to make some
more estimates. We know $q(q-1)/2N \le 1$ because $q \le \sqrt{2N}$, so we can use the inequality
$1 - e^{-x} \ge (1 - e^{-1})x$ to get
$$C(N, q) \ge \left( 1 - \frac{1}{e} \right) \cdot \frac{q(q-1)}{2N} .$$
Since $(1 - 1/e)/2 \approx 0.316 \ge 0.3$, this completes the proof.
Appendix B
Information-Theoretic Security
We imagine that the sender chooses a message at random according to D, meaning that a specific
message M ∈ Plaintexts has probability D(M ) of being chosen. In our example, the sender would
choose 00 with probability 1/6, and so on.
The message distribution, and the fact that the sender chooses according to it, are known to
the adversary. Before any ciphertext is transmitted, the adversary’s state of knowledge about the
message chosen by the sender is given by D. That is, it knows that the message was 00 with
probability 1/6, and so on.
We say that the encryption scheme is perfectly secure if the possession of the ciphertext does
not impart any information about the message beyond what was known a priori via the fact
that it was chosen according to D. The setup is like this. After the sender has chosen the message
according to D, a key K is also chosen, according to the key generation algorithm, meaning $K \xleftarrow{\$} \mathcal{K}$,
and the message is encrypted to get a ciphertext, via $C \leftarrow \mathcal{E}_K(M)$. The adversary is given C. We
ask the adversary: given that you know C is the ciphertext produced, for each possible value of
the message, what is the probability that that particular value was actually the message chosen?
If the adversary can do no better than say that the probability that M was chosen was D(M ), it
means that the possession of the ciphertext is not adding any new information to what is already
known. This is perfect security.
To state this more formally, we first let
$$S = \mathrm{Keys}(\mathcal{SE}) \times \mathrm{Plaintexts} \times \{0, 1\}^r$$
denote the sample space underlying our experiment. Here r is the number of coins the encryption
algorithm tosses. (This is zero if the encryption algorithm is deterministic, as is the case for the
one-time pad.) We introduce the following random variables:
Definition B.0.1 Let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be a symmetric encryption scheme with associated message
space Plaintexts. Let D: Plaintexts → [0, 1] be a message distribution on Plaintexts. We say that
$\mathcal{SE}$ is perfectly secure with respect to D if for every M ∈ Plaintexts and every possible ciphertext C
it is the case that
$$\Pr_{D,\mathcal{SE}}[\mathbf{M} = M \mid \mathbf{C} = C] = D(M) . \qquad (B.1)$$
We say that $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ is perfectly secure if it is perfectly secure with respect to every message
distribution on Plaintexts.
Here “$\mathbf{M} = M$” is the event that the message chosen by the sender was M, and “$\mathbf{C} = C$” is the event
that the ciphertext computed by the sender and received by the adversary was C. The definition
considers the conditional probability that the message was M given that the ciphertext was C. It
says that this probability is exactly the a priori probability of the message M, namely D(M).
In considering the one-time pad encryption scheme (cf. Scheme 5.2.1) we omit the counter as
part of the ciphertext since only a single message is being encrypted. Thus, the ciphertext is a
k-bit string where k is the length of the key and also of the message. Also note that in this scheme
r = 0 since the encryption algorithm is not randomized.

D(M)   M \ C    00      01      10      11
1/6    00       0.25    0.25    0.25    0.25
1/3    01       0.25    0.25    0.25    0.25
1/4    10       0.25    0.25    0.25    0.25
1/4    11       0.25    0.25    0.25    0.25

D(M)   M \ C    00      01      10      11
1/6    00       1/6     1/6     1/6     1/6
1/3    01       1/3     1/3     1/3     1/3
1/4    10       1/4     1/4     1/4     1/4
1/4    11       1/4     1/4     1/4     1/4

Figure B.1: In the first table, the entry corresponding to row M and column C shows the value of
$\Pr_{D,\mathcal{SE}}[\mathbf{C} = C \mid \mathbf{M} = M]$, for the one-time-pad scheme of Example B.0.2. Here the key and message
length are both k = 2. In the second table, the entry corresponding to row M and column C shows
the value of $\Pr_{D,\mathcal{SE}}[\mathbf{M} = M \mid \mathbf{C} = C]$, for the same scheme.
Example B.0.2 Let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be the one-time-pad encryption scheme with the key length
(and thus also message length and ciphertext length) set to k = 2 bits and the message space set
to Plaintexts = $\{0, 1\}^k$. Let D be the message distribution on Plaintexts defined by D(00) = 1/6,
D(01) = 1/3, D(10) = 1/4 and D(11) = 1/4. For each possible ciphertext $C \in \{0, 1\}^k$, the
first table of Fig. B.1 shows the value of $\Pr_{D,\mathcal{SE}}[\mathbf{C} = C \mid \mathbf{M} = M]$, the probability of obtaining this
particular ciphertext if you encrypt M with the one-time pad scheme. As the table indicates,
this probability is always 0.25. Why? Having fixed M, the possible ciphertexts are M ⊕ K as K
ranges over $\{0, 1\}^k$. So, regardless of the value of M, all different k-bit strings are equally likely as
ciphertexts. The corresponding general statement is stated and proved in Lemma B.0.3 below. The
second table shows the value of $\Pr_{D,\mathcal{SE}}[\mathbf{M} = M \mid \mathbf{C} = C]$, the probability that the message was M
given that an adversary sees ciphertext C. Notice that this always equals the a priori probability
D(M).
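The two tables of Example B.0.2 can be recomputed by brute force over the four keys. The sketch below uses exact rational arithmetic; the names `joint`, `likelihood` and `posterior`, and the representation of bit strings as Python strings, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

k = 2
strings = ["".join(bits) for bits in product("01", repeat=k)]
# Message distribution D of the example.
D = {"00": Fraction(1, 6), "01": Fraction(1, 3), "10": Fraction(1, 4), "11": Fraction(1, 4)}

def xor(a, b):
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if u != v else "0" for u, v in zip(a, b))

def joint():
    """Joint probability of (message, ciphertext) pairs under a uniformly random key."""
    table = {(M, C): Fraction(0) for M in strings for C in strings}
    for M in strings:
        for K in strings:
            table[(M, xor(M, K))] += D[M] * Fraction(1, 2 ** k)
    return table

def likelihood(M, C):
    """Probability of ciphertext C given message M (first table)."""
    return joint()[(M, C)] / D[M]

def posterior(M, C):
    """Probability of message M given ciphertext C (second table)."""
    j = joint()
    pr_C = sum(j[(X, C)] for X in strings)
    return j[(M, C)] / pr_C

for M in strings:
    print(M, [str(likelihood(M, C)) for C in strings], [str(posterior(M, C)) for C in strings])
```

Each printed row lists the first-table entries followed by the second-table entries for that message.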
The following lemma captures the basic security property of the one-time-pad scheme: no matter
what is the message, each possible k-bit ciphertext is produced with probability $2^{-k}$, due to the
random choice of the key.
Lemma B.0.3 Let k ≥ 1 be an integer and let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be the one-time-pad encryption
scheme of Scheme 5.2.1 with the key length set to k bits and the message space set to Plaintexts =
$\{0, 1\}^k$. Let D be a message distribution on Plaintexts. Then
$$\Pr_{D,\mathcal{SE}}[\mathbf{C} = Y \mid \mathbf{M} = X] = 2^{-k}$$
for any X ∈ Plaintexts and any $Y \in \{0, 1\}^k$.
Proof of Lemma B.0.3: If X is fixed and known, what’s the probability that we see Y? Since
Y = K ⊕ X for the one-time-pad scheme, it only happens if K = Y ⊕ X. The probability that K
is this particular string is exactly $2^{-k}$ since K is a randomly chosen k-bit string.
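The counting argument in the proof can be checked exhaustively for small k: for each pair (X, Y) exactly one of the $2^k$ keys maps X to Y. In this sketch bit strings are tuples of 0/1 values, and the function names `otp_encrypt` and `ciphertext_probability` are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def otp_encrypt(K, X):
    """One-time-pad encryption of X under key K: bitwise XOR."""
    return tuple(a ^ b for a, b in zip(K, X))

def ciphertext_probability(k, X, Y):
    """Probability over a uniform k-bit key K that encrypting X yields Y."""
    keys = list(product((0, 1), repeat=k))
    hits = sum(1 for K in keys if otp_encrypt(K, X) == Y)   # keys with K xor X = Y
    return Fraction(hits, len(keys))

k = 3
all_strings = list(product((0, 1), repeat=k))
print(all(ciphertext_probability(k, X, Y) == Fraction(1, 2 ** k)
          for X in all_strings for Y in all_strings))
```

The unique key for a given pair is K = Y ⊕ X, which is why `hits` equals 1 in every case.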
This enables us to show that the one-time-pad scheme meets the notion of perfect security we
considered above.
Theorem B.0.4 Let k ≥ 1 be an integer and let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be the one-time-pad encryption
scheme of Scheme 5.2.1 with the key length set to k bits and the message space set to Plaintexts =
$\{0, 1\}^k$. Let D be a message distribution on Plaintexts. Then $\mathcal{SE}$ is perfectly secure with respect to
D.
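Proof of Theorem B.0.4: Let M ∈ Plaintexts be a message and let $C \in \{0, 1\}^k$ be a possible
ciphertext. We need to show that Equation (B.1) is true. We have
$$\begin{aligned}
\Pr_{D,\mathcal{SE}}[\mathbf{M} = M \mid \mathbf{C} = C] &= \Pr_{D,\mathcal{SE}}[\mathbf{C} = C \mid \mathbf{M} = M] \cdot \frac{\Pr_{D,\mathcal{SE}}[\mathbf{M} = M]}{\Pr_{D,\mathcal{SE}}[\mathbf{C} = C]} \\
&= 2^{-k} \cdot \frac{\Pr_{D,\mathcal{SE}}[\mathbf{M} = M]}{\Pr_{D,\mathcal{SE}}[\mathbf{C} = C]} .
\end{aligned}$$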
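The first equality was by Bayes’ rule. The second equality was obtained by applying Lemma B.0.3
with X = M and Y = C. By definition
$$\Pr_{D,\mathcal{SE}}[\mathbf{M} = M] = D(M)$$
is the a priori probability of M. Now for the last term:
$$\begin{aligned}
\Pr_{D,\mathcal{SE}}[\mathbf{C} = C] &= \sum_{X} \Pr_{D,\mathcal{SE}}[\mathbf{M} = X] \cdot \Pr_{D,\mathcal{SE}}[\mathbf{C} = C \mid \mathbf{M} = X] \\
&= \sum_{X} D(X) \cdot 2^{-k} \\
&= 2^{-k} \cdot \sum_{X} D(X) \\
&= 2^{-k} \cdot 1 .
\end{aligned}$$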
The sum here was over all possible messages X ∈ Plaintexts, and we used Lemma B.0.3. Plugging
all this into the above we get
$$\Pr_{D,\mathcal{SE}}[\mathbf{M} = M \mid \mathbf{C} = C] = 2^{-k} \cdot \frac{D(M)}{2^{-k}} = D(M)$$
as desired.
The one-time-pad scheme is not the only scheme possessing perfect security, but it seems to be the
simplest and most natural one.
Bibliography
[2] M. Bellare, J. Kilian and P. Rogaway. The security of the cipher block chaining
message authentication code. Journal of Computer and System Sciences , Vol. 61, No. 3,
Dec 2000, pp. 362–399.
[5] M. Bellare, R. Impagliazzo and M. Naor. Does parallel repetition lower the error in
computationally sound protocols? Proceedings of the 38th Symposium on Foundations of
Computer Science, IEEE, 1997.
[7] Data Encryption Standard. FIPS PUB 46, Appendix A, Federal Information Processing
Standards Publication, January 15, 1977, US Dept. of Commerce, National Bureau of Stan-
dards.
[10] W. Diffie and M. Hellman. New directions in cryptography. IEEE Trans. Info. Theory,
Vol. IT-22, No. 6, November 1976, pp. 644–654.
[11] U. Feige, A. Fiat, and A. Shamir. Zero-Knowledge Proofs of Identity. Journal of Cryp-
tology, Vol. 1, 1988, pp. 77–94.
[12] U. Feige, and A. Shamir. Witness Indistinguishability and Witness Hiding Protocols.
Proceedings of the 22nd Annual Symposium on the Theory of Computing, ACM, 1990.
[15] O. Goldreich, S. Micali, and A. Wigderson. Proofs that Yield Nothing but Their
Validity, or All Languages in NP Have Zero-Knowledge Proof Systems. Journal of the ACM,
Vol. 38, No. 1, July 1991, pp. 691–729.
[16] O. Goldreich and Y. Oren. Definitions and Properties of Zero-Knowledge Proof Systems.
Journal of Cryptology, Vol. 7, No. 1, 1994, pp. 1–32.
[19] S. Landau. Standing the test of time: The Data Encryption Standard. Notices of the AMS,
March 2000.
[21] S. Goldwasser, S. Micali and R. Rivest. A digital signature scheme secure against
adaptive chosen-message attacks. SIAM Journal of Computing, Vol. 17, No. 2, pp. 281–308,
April 1988.
[22] A. Joux and R. Lercier. Computing a discrete logarithm in GF(p), p a 120 digits prime,
http://www.medicis.polytechnique.fr/~lercier/english/dlog.html.
[23] D. Kahn. The Codebreakers; The Comprehensive History of Secret Communication from
Ancient Times to the Internet. Scribner, Revised edition, December 1996.
[25] M. Luby and C. Rackoff. How to construct pseudorandom permutations from pseudo-
random functions. SIAM J. Comput, Vol. 17, No. 2, April 1988.
[27] C. Lund, L. Fortnow, H. Karloff and N. Nisan. Algebraic Methods for Interactive
Proof Systems. Journal of the ACM, Vol. 39, No. 4, 1992, pp. 859–868.
[28] S. Micali, C. Rackoff and R. Sloan. The notion of security for probabilistic cryptosys-
tems. SIAM J. of Computing, April 1988.
[29] M. Naor and M. Yung. Public-key cryptosystems provably secure against chosen cipher-
text attacks. Proceedings of the 22nd Annual Symposium on the Theory of Computing,
ACM, 1990.
[30] A. Odlyzko. The rise and fall of knapsack cryptosystems. Available via
http://www.research.att.com/~amo/doc/cnt.html.
[31] C. Rackoff and D. Simon. Non-interactive zero-knowledge proof of knowledge and chosen
ciphertext attack. Advances in Cryptology – CRYPTO ’91, Lecture Notes in Computer
Science Vol. 576, J. Feigenbaum ed., Springer-Verlag, 1991.
[32] R. Rivest, M. Robshaw, R. Sidney, and Y. Yin. The RC6 Block Cipher.
Available via http://theory.lcs.mit.edu/~rivest/publications.html.
[33] R. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures
and Public-Key Cryptosystems. Communications of the ACM, Vol. 21, No. 2, February 1978,
pp. 120–126.
[34] A. Shamir. IP = PSPACE. Journal of the ACM, Vol. 39, No. 4, 1992, pp. 869–877.
[35] D. Weber and T. Denny. The solution of McCurley’s discrete log challenge. Advances in
Cryptology – CRYPTO ’98, Lecture Notes in Computer Science Vol. 1462, H. Krawczyk ed.,
Springer-Verlag, 1998.