
LDPC Codes in the McEliece Cryptosystem: January 2007

This document discusses using low-density parity-check (LDPC) codes in the McEliece cryptosystem. It proposes modifying the system to use quasi-cyclic (QC) LDPC codes, which allow for large code families with equivalent error correction and reduced key sizes. However, cryptanalysis is needed to ensure the modified system is secure against attacks. The document analyzes previous attacks on McEliece and presents two new attacks - one that breaks codes using circulant matrices, and another serious threat targeting the dual code.


LDPC Codes in the McEliece Cryptosystem

Article · January 2007


Source: DBLP


2 authors:

Marco Baldi, Università Politecnica delle Marche
Franco Chiaraluce, Università Politecnica delle Marche

All content following this page was uploaded by Marco Baldi on 21 May 2014.



On the Usage of LDPC Codes
in the McEliece Cryptosystem
Marco Baldi
Dipartimento di Elettronica, Intelligenza Artificiale e Telecomunicazioni
Università Politecnica delle Marche
Ancona, Italy
Email: [email protected]

Abstract—In this paper, a new variant of the McEliece cryptosystem, based on Low-Density Parity-Check (LDPC) codes, is studied. Random-based techniques allow the design of large families of LDPC codes with equivalent error correction capability; therefore, in principle, such codes can replace the Goppa codes originally used by McEliece in his cryptosystem. Furthermore, Quasi-Cyclic (QC) LDPC codes can be adopted in order to reduce the key length, thus overcoming the main drawbacks of the original cryptosystem. Their usage, however, must be subject to cryptanalytic evaluation to ensure sufficient system robustness. The author proves that some widespread families of QC-LDPC codes, based on circulant permutation matrices, are inapplicable in this context due to security issues, whilst other families of codes, based on the “difference families” approach, are not exposed to the same risk. However, another attack is presented that forces the adoption of very large codes in order to ensure a good level of security against intrusions.

I. INTRODUCTION

For many years, error correcting codes have held an important place in cryptography. In particular, as early as 1978, McEliece proposed a public-key cryptosystem based on algebraic coding theory [1] that has proved to be very secure. The rationale of the McEliece algorithm, which adopts a generator matrix as the private key and a linear transformation of it as the public key, lies in the difficulty of decoding a large linear code with no visible structure, which is known to be an NP-complete problem [2]. As a matter of fact, the original McEliece cryptosystem is still unbroken, in the sense that no algorithm able to realize a total break in an acceptable time has been presented up to now. On the other hand, a vast body of literature exists on local deduction attacks, i.e., attacks aimed at finding the plaintext of intercepted ciphertexts without knowing the secret key. Despite the advances in the field, however, the work factors required for this kind of violation remain very high, and quite intractable in practice. Moreover, the system is two or three orders of magnitude faster than RSA, the latter being, probably, the most popular public key algorithm. A variant of the McEliece cryptosystem, due to Niederreiter [3], is even faster.

On the other hand, the McEliece system also shows some drawbacks, which can justify the limited interest most cryptographers have devoted to it to date; among them, the large length of the key and the low transmission rate.

The current scenario of error correcting codes is dominated by schemes, like turbo codes or low density parity check (LDPC) codes, whose decoding exploits iterative exchange of messages among constituent components, based on soft-input soft-output (SISO) modules. Thus, it seems interesting to investigate the possible application of this kind of codes in the framework of the McEliece cryptosystem. The idea of adopting LDPC codes in the public-key cryptosystem was first explored in [4]; however, the main task of that paper was to demonstrate that the usage of LDPC codes in place of Goppa codes does not allow the key length to be reduced.

In this paper, the system proposed in [4] is slightly modified and the possible application of Quasi-Cyclic (QC) LDPC codes in that framework is studied. These codes are easily encodable, and have been included as an option in the IEEE 802.16e standard for Mobile Wireless MAN [5], thus providing a valuable reference in this kind of applications. An algebraic technique is suggested to design a very large number of equivalent codes with fixed length and rate, which is the prerequisite for their application in cryptosystems. Such codes should, in principle, overcome both the major drawbacks of the original McEliece cryptosystem, but their adoption must be subject to cryptanalytic evaluation. Previous attacks on the McEliece cryptosystem and new attacks tailored to LDPC codes are presented: the first new attack is a total break attack that can be conducted on codes based on circulant permutation matrices, whilst it does not affect those designed with the proposed approach; the second new attack targets the dual of the secret code and represents a serious threat for the considered cryptosystem, to the point that it can compromise its practical feasibility.

The details of the code design method are given in Section II, where QC codes are described in general terms and an overview of different design techniques is reported. In Section III, the McEliece system using LDPC codes is reviewed, and the role of its matrix components is discussed. In Section IV, cryptanalysis of the new system is carried out, considering both classic attacks on the McEliece cryptosystem and new attacks specifically targeted to LDPC codes.

II. QUASI-CYCLIC LDPC CODES

Quasi-Cyclic codes have been studied for many years [6], but they did not find great success in the past because of their inherent decoding complexity in classic implementations. Nowadays, however, the encoding facility of Quasi-Cyclic
codes can be combined with new efficient LDPC decoding techniques, thus yielding QC-LDPC codes, which have recently appeared even in telecommunication standards [5], [7].

The dimension $k$ and the length $n$ of a QC code are both multiples of a positive integer $p$, i.e. $k = p \cdot k_0$ and $n = p \cdot n_0$; the information vector $\mathbf{u} = [u_0, u_1, \ldots, u_{k-1}]$ and the codeword vector $\mathbf{c} = [c_0, c_1, \ldots, c_{n-1}]$ can be divided into $p$ sub-vectors of size $k_0$ and $n_0$, respectively, so that $\mathbf{u} = [\mathbf{u}_0, \mathbf{u}_1, \ldots, \mathbf{u}_{p-1}]$ and $\mathbf{c} = [\mathbf{c}_0, \mathbf{c}_1, \ldots, \mathbf{c}_{p-1}]$.

The distinctive characteristic of QC codes is that every cyclic shift of $n_0$ positions of a codeword yields another codeword; since every cyclic shift of $n_0$ positions of a codeword is induced by a cyclic shift of $k_0$ positions of the corresponding information word, it can be easily shown that Quasi-Cyclic codes are characterized by the following form of the generator matrix $\mathbf{G}$, where each block $\mathbf{G}_i$ has size $k_0 \times n_0$:

\[
\mathbf{G} =
\begin{bmatrix}
\mathbf{G}_0 & \mathbf{G}_1 & \cdots & \mathbf{G}_{p-1} \\
\mathbf{G}_{p-1} & \mathbf{G}_0 & \cdots & \mathbf{G}_{p-2} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{G}_1 & \mathbf{G}_2 & \cdots & \mathbf{G}_0
\end{bmatrix}
\tag{1}
\]

This leads to an efficient encoder implementation, consisting of a barrel shift register of size $p$, followed by a combinatorial network and an adder, as shown in Fig. 1.

[Figure 1. Encoder circuit for a QC code]

It can be easily proved that the parity check matrix $\mathbf{H}$ is also characterized by the same “circulant of blocks” form, in which each block $\mathbf{H}_i$ has size $(n_0 - k_0) \times n_0 = r_0 \times n_0$. A row and column rearrangement can be applied that yields the alternative “block of circulants” form, here shown for the parity check matrix $\mathbf{H}$:

\[
\mathbf{H} =
\begin{bmatrix}
\mathbf{H}^c_{0,0} & \mathbf{H}^c_{0,1} & \cdots & \mathbf{H}^c_{0,n_0-1} \\
\mathbf{H}^c_{1,0} & \mathbf{H}^c_{1,1} & \cdots & \mathbf{H}^c_{1,n_0-1} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{H}^c_{r_0-1,0} & \mathbf{H}^c_{r_0-1,1} & \cdots & \mathbf{H}^c_{r_0-1,n_0-1}
\end{bmatrix}
\tag{2}
\]

In this expression, each block $\mathbf{H}^c_{i,j}$ is a $p \times p$ circulant matrix:

\[
\mathbf{H}^c_{i,j} =
\begin{bmatrix}
a^{i,j}_0 & a^{i,j}_1 & a^{i,j}_2 & \cdots & a^{i,j}_{p-1} \\
a^{i,j}_{p-1} & a^{i,j}_0 & a^{i,j}_1 & \cdots & a^{i,j}_{p-2} \\
a^{i,j}_{p-2} & a^{i,j}_{p-1} & a^{i,j}_0 & \cdots & a^{i,j}_{p-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a^{i,j}_1 & a^{i,j}_2 & a^{i,j}_3 & \cdots & a^{i,j}_0
\end{bmatrix}
\tag{3}
\]

and can be, therefore, associated with a polynomial $a^{i,j}(x) \in GF(2)[x] \bmod (x^p + 1)$, with maximum degree $p - 1$ and coefficients taken from the first row of $\mathbf{H}^c_{i,j}$:

\[
a^{i,j}(x) = a^{i,j}_0 + a^{i,j}_1 x + a^{i,j}_2 x^2 + a^{i,j}_3 x^3 + \cdots + a^{i,j}_{p-1} x^{p-1}
\tag{4}
\]

A. QC-LDPC codes based on Circulant Permutation Matrices

A well-known family of QC-LDPC codes has parity-check matrices in which each block $\mathbf{H}^c_{i,j} = \mathbf{P}_{i,j}$ is a circulant permutation matrix or the null matrix of size $p$; circulant permutation matrices can be represented through the value of their first row shift $p_{i,j}$. Many authors have proposed code construction techniques based on this approach, like Tanner et al. in 2001 [8] and Fossorier in 2004 [9], and LDPC codes based on permutation matrices have also been included in the IEEE 802.16e standard [5], thus confirming their recognized error correction performance. Another reason for the success of these codes is their implementation simplicity [10], [11].

The parity-check matrix of these codes can be represented through a “model” matrix $\mathbf{H}_m$, of size $r_0 \times n_0$, containing the shift values $p_{i,j}$ ($p_{i,j} = 0$ represents the identity matrix while, conventionally, $p_{i,j} = -1$ represents the null matrix). The code rate of such codes is $R = k_0/n_0$ and it can be varied arbitrarily through a suitable choice of $r_0$ and $n_0$. On the other hand, the local girth length for these codes cannot exceed twelve [8], [9], and the imposition of a lower bound on the local girth length reflects on a lower bound on the code length [9].

The rows of a permutation matrix sum to the all-one vector, so these parity-check matrices cannot have full rank. Precisely, every parity-check matrix contains at least $r_0 - 1$ rows that are linearly dependent on the others, and the maximum rank is $r_0(p - 1) + 1$. A common solution to ensure full rank is to impose the lower triangular (or quasi-lower triangular) form of the matrices, similarly to what is done in the IEEE 802.16e standard [5].

B. QC-LDPC codes based on Difference Families

When designing a QC-LDPC code, the parity check matrix $\mathbf{H}$ should have some properties that optimize the behavior of the belief propagation-based decoder. First of all, $\mathbf{H}$ must be sparse, and this reflects on the maximum number of non-zero coefficients of each polynomial $a^{i,j}(x)$. Then, short length cycles must be avoided in the Tanner graph associated with the code.

The latter requirement can be ensured, as demonstrable through algebraic considerations, when $\mathbf{H}$ is a single row of
circulants (i.e. $r_0 = 1$), and the corresponding code rate is $R = (n_0 - 1)/n_0$:

\[
\mathbf{H} =
\begin{bmatrix}
\mathbf{H}^c_0 & \mathbf{H}^c_1 & \cdots & \mathbf{H}^c_{n_0-1}
\end{bmatrix}
\tag{5}
\]

Let $(G, +)$ be a finite group, and $D$ a subset of $G$; then, $\Delta D \equiv [x - y : x, y \in D, x \neq y]$ is the collection of all differences of distinct elements of $D$. Given a positive integer $s$ and a multi-set $M$, $sM \equiv \bigcup_{i=1}^{s} M$ is defined as $s$ copies of $M$.

Let $v$ be the order of the group $G$, and $\mu$ and $\lambda$ positive integers, with $v > \mu \geq 2$. A $(v, \mu, \lambda)$-difference family (DF) is a collection $[D_1, \ldots, D_t]$ of $\mu$-subsets of $G$, called “base blocks”, such that

\[
\bigcup_{i=1}^{t} \Delta D_i = \lambda \left( G \setminus \{0\} \right)
\tag{6}
\]

In other terms, every non-zero element of $G$ appears exactly $\lambda$ times as a difference of two elements from a base block.

As already shown in the literature [12], difference families can be used to construct QC-LDPC codes. In particular, if we consider the difference family $[D_1, \cdots, D_{n_0}]$, a code based on it has the following form for the polynomials of the circulant matrices $\mathbf{H}^c_0, \cdots, \mathbf{H}^c_{n_0-1}$:

\[
a_i(x) = \sum_{j=1}^{\mu} x^{d_{ij}}, \quad i \in [1; n_0]
\tag{7}
\]

where $d_{ij}$ is the $j$-th element of $D_i$, whose dimension is $\mu$. With this choice, the designed matrix $\mathbf{H}$ is regular and all the elements in the difference family are used. It can be shown [12] that, by using difference families with $\lambda = 1$ in construction (7), the resulting code has a Tanner graph free of 4-length cycles.

Some theorems ensure the existence of difference families with $\lambda = 1$, but they apply only for particular values of the group order, thus putting heavy constraints on the code length. Such constraints can be overcome by relaxing part of the hypotheses of these theorems and then refining the outputs through simple computer searches, in the so-called “Pseudo Difference Families” approach [13]; other authors have proposed an alternative technique based on “Extended Difference Families”, which ensures great flexibility in the code length [14]. Finally, a multi-set with the properties of a difference family can be constructed by (constrained) random choice of its elements; let us call this a “Random Difference Family” or RDF.

C. Equivalent codes based on Difference Families

A first requirement when using an error correcting code in a cryptosystem concerns the possibility of choosing it at random among a very large class of equivalent codes. This way, an opponent, even if aware of the code parameters (i.e. length and rate), neither knows (obviously) the private key nor is able to obtain it through a brute force attack.

When considering LDPC codes, their equivalence needs to be verified under message passing decoding, whose performance does not depend only on the weight spectrum. Generally speaking, two codes exhibit almost identical performance when they have equal (or very similar): i) code length and rate, ii) parity check matrix density, iii) nodes degree distributions and iv) girth length distribution in the Tanner graph associated with the code.

Let us adopt a family of codes with fixed length $n$ and parity-check matrices in the form (5), so property i) is ensured. An integer $d_v > 2$ is then set as the column weight in matrix $\mathbf{H}$, so that property ii) is verified too. $d_v$ integers are then chosen in such a way as to ensure the difference family properties, and they are used to construct, through Eq. (7), a parity check matrix $\mathbf{H}$ in the form (5), with rate $R = (n_0 - 1)/n_0$. Since circulant matrices are regular, the resulting parity check matrix is regular in both row and column weights; so all codes have the same nodes degree distributions and property iii) is verified. Finally, when a circulant matrix has row (column) weight greater than 2, and it does not induce 4-length cycles, it can be shown that it corresponds to a Tanner graph with constant local girth length equal to 6, so property iv) is satisfied as well.

In a recent paper [15] the author has proposed, for the same objective of McEliece cryptosystem implementation, to design an EDF larger than needed, through a randomized version of the algorithm reported in [14], and then randomly select a subset of each block in order to have a QC-LDPC code. Each generated EDF hence produces as many different codes as the possible different random choices of its elements; in order to ensure large cardinality of the code family, however, this approach can require large code length.

Therefore, it is preferable to adopt Random Difference Families (RDFs), which do not derive from EDFs, but rather are constructed by direct (constrained) random selection of their elements. The constraints imposed serve to ensure the difference family character of each set, namely that the designed codes do not have 4-length cycles. A lower bound on the cardinality $C\{RDF(n_0, d_v, p)\}$ of a set of RDFs with fixed parameters $n_0$, $d_v$ and $p$ can be evaluated, through probabilistic arguments, as follows [16]:

\[
C\{RDF(n_0, d_v, p)\} \geq \frac{1}{p} \binom{p}{d_v} \prod_{l=0}^{n_0-1} \prod_{j=1}^{d_v-1} \left\{ \frac{p}{p-j} - \frac{j \left[ 2 - p \bmod 2 + (j^2 - 1)/2 + l \cdot d_v \cdot (d_v - 1) \right]}{p-j} \right\}
\tag{8}
\]

According to this expression, very high values of $C\{RDF(n_0, d_v, p)\}$ can be obtained, even for (relatively) short codes. As an example, by assuming $n = 1060$, $n_0 = 4$ (which implies rate $R = 3/4$ and $p = 265$) and $d_v = 5$, we have $C\{RDF(4, 5, 210)\} \simeq 2^{111}$. From the cryptanalysis point of view, however, it should be observed that an attack could be made on each $\mathbf{H}^c_{i,j}$ block, which is equivalent to saying that the maximum number of trials for an eavesdropper results
from setting $n_0 = 1$ in expression (8). In order to preserve high cardinalities, longer codes should therefore be considered, which would however remain feasible for application. As an example, by setting, for the same $n_0$, $n = 8000$ (which implies $p = 2000$) and $d_v = 13$, we have $C\{RDF(1, 13, 2000)\} \simeq 2^{97}$, high enough to discourage a brute force attack on a single submatrix.

III. MCELIECE CRYPTOSYSTEM WITH LDPC CODES

A. System description

A possible implementation of the McEliece system using LDPC codes is as follows [4]. Bob, who must receive a message from Alice, chooses a parity check matrix $\mathbf{H}$ among a family of a given class of LDPC codes. Let us suppose that the chosen code, like any other in the same class, is known to be able to correct $t$ errors with high probability, under belief propagation decoding. Bob also chooses an $r \times r$ non-singular dense circulant “transformation” matrix, $\mathbf{T}$, and obtains the new matrix $\mathbf{H}' = \mathbf{T} \cdot \mathbf{H}$, which, obviously, has the same null space as $\mathbf{H}$.

Bob then computes a generator matrix, $\mathbf{G}$, corresponding to $\mathbf{H}'$, in row reduced echelon form and makes it available in the public directory: $\mathbf{G}$ is Bob’s public key and it is completely described by a single row of its non-systematic part, that is a $k$-bit vector. On the contrary, $\mathbf{H}$ and $\mathbf{T}$ form the private (or secret) key, which is owned by Bob only. The system also requires a $k \times k$ non-singular “scrambling” matrix $\mathbf{S}$, which is suitably chosen and publicly available (it can even be embedded in the algorithm implementation). Matrix $\mathbf{S}$ also has the “circulants block” form, and its role is to cause propagation of residual errors at the eavesdropper’s receiver, leaving the opponent in the most uncertain condition (which is equivalent to guessing the plaintext at random). For this purpose, it must be sufficiently dense in its turn.

When Alice wants to send an encrypted message to Bob, she fetches $\mathbf{G}$ from the public directory and calculates $\mathbf{G}' = \mathbf{S}^{-1} \cdot \mathbf{G}$. Then she divides her message into $k$-bit blocks and encrypts each block $\mathbf{u}$ as follows:

\[
\mathbf{x} = \mathbf{u} \cdot \mathbf{G}' + \mathbf{e}
\tag{9}
\]

where $\mathbf{x}$ is the encrypted version of $\mathbf{u}$ and $\mathbf{e}$ is a random vector of $t$ intentional errors. The choice of $t$ will be discussed next.

At the receiver side, Bob uses his private key for decoding. In the ideal case of a channel that does not introduce additional errors, by a suitable choice of $t$, all errors can be corrected with high probability (in the extremely rare case of an LDPC decoder failure, message resending is requested, as discussed in Subsection IV-A). Belief propagation decoding, however, works only on sparse Tanner graphs, free of short-length cycles; therefore, in order to exploit the actual correction capability, the knowledge of the sparse parity-check matrix $\mathbf{H}$ is essential.

By applying the decoding algorithm to the ciphertext $\mathbf{x}$, Bob can derive $\mathbf{u} \cdot \mathbf{G}' = \mathbf{u} \cdot \mathbf{S}^{-1} \cdot \mathbf{G}$. On the other hand, as $\mathbf{G}$ is in row reduced echelon form, the first $k$ coordinates of this product reveal $\mathbf{u} \cdot \mathbf{S}^{-1}$ directly, and right multiplication by $\mathbf{S}$ allows the plaintext $\mathbf{u}$ to be extracted as desired.

The eavesdropper Eve, who wishes to intercept the message from Alice to Bob and is not able to derive matrix $\mathbf{H}$ from the knowledge of $\mathbf{G}$, is, as expected, in a much less favorable position. Even if $\mathbf{H}'$ can be derived from $\mathbf{G}$, it is made unsuitable for LDPC decoding through the action of matrix $\mathbf{T}$. In addition, $\mathbf{H}'$ is dense and this implies, for Eve, large decoding complexity.

B. Choice of matrix S

The residual errors after decoding are “propagated” by the subsequent product by $\mathbf{S}$, so that, at the output of the eavesdropper’s decoder, not only is the FER equal to 1 (which means that all decoded sequences are in error) but the BER is also practically equal to 1/2 (which is the most uncertain condition for an opponent). Similarly to $\mathbf{T}$, $\mathbf{S}$ also has to be rather dense, in order to perform this error propagating action at its best.

The results of a numerical transmission simulation over the “McEliece channel” (a channel that introduces $t$ errors on each frame) are shown in Fig. 2, which reports the error correction performance achieved by the private matrix $\mathbf{H}$ and by a public matrix $\mathbf{H}'$. The considered code has parameters $n = 8000$, $k = 6000$ ($p = 2000$) and $d_v = 13$. For further evidence, a sparse $\mathbf{T}$ matrix (with the same column weight as $\mathbf{H}$) has been used to obtain $\mathbf{H}'$; employing denser $\mathbf{T}$ matrices (as those required for security reasons), the performance of $\mathbf{H}'$ would be even worse.

[Figure 2. Performance attainable over the “McEliece channel” by $\mathbf{H}$ and by $\mathbf{H}'$. Plot of BER and FER (logarithmic scale, $10^{0}$ down to $10^{-8}$) versus $t$ (logarithmic scale, 10 to 100); curves: decoding by $\mathbf{H}$, decoding by $\mathbf{H}'$, uncoded.]

As mentioned, the value of $t$ must be upper bounded by the code error correction capability. In the considered example, it is possible to foresee that authorized users have negligible error probability for $t \leq 40$, while, by assuming $t = 40$, unauthorized users have useless decoders (they achieve the same BER as the uncoded transmission).

Despite the effect of matrix $\mathbf{T}$, the error correction performance achievable by unauthorized users is still good. Fig. 2 shows that, in our example, the uncoded transmission for $t = 40$ achieves BER of the order of $5 \cdot 10^{-3}$, which is quite unacceptable from the security viewpoint. A proper security
level is restored by the action of the scrambling matrix, which causes “propagation” of the residual errors. This can be explained in the following terms.

If we consider the information part of the vector decoded by an opponent, $\tilde{\mathbf{u}}$, it can be expressed as $\tilde{\mathbf{u}} = \mathbf{u} \cdot \mathbf{S}^{-1} + \tilde{\mathbf{e}}$, where $\tilde{\mathbf{e}}$ is the corresponding part of the residual errors vector. After the descrambling process, the decoded message is $\hat{\mathbf{u}} = \tilde{\mathbf{u}} \cdot \mathbf{S} = \mathbf{u} + \tilde{\mathbf{e}} \cdot \mathbf{S}$; therefore the scrambling matrix $\mathbf{S}$ operates directly on the residual errors. Indeed, the extent of the error propagation effect can be predicted through simple combinatorial arguments, under the hypothesis that $\mathbf{S}$ is randomly generated. At the same time, this prediction makes it possible to design the features of matrix $\mathbf{S}$ (its density, in particular, for a given size) that are compatible with the achievement of a satisfactory security level, according to the procedure described below.

Let $\eta$ be the Hamming weight of vector $\tilde{\mathbf{e}}$ and $v$ the unknown value of the row (column) weight of matrix $\mathbf{S}$. Moreover, let us suppose that the value of $k$ is fixed. Under the hypothesis that $\mathbf{S}$ is randomly generated, the probability that $j$ symbols 1 in $\tilde{\mathbf{e}}$ coincide with as many symbols 1 in a generic column of $\mathbf{S}$ can be easily calculated as follows:

\[
P_j(v) = \frac{\binom{v}{j} \binom{k-v}{\eta-j}}{\binom{k}{\eta}}
\]

Now, the $i$-th element of vector $\tilde{\mathbf{e}} \cdot \mathbf{S}$ is equal to 1 (which means that a residual error is present) any time the number of coincidences $j$ is odd. Consequently, the conditional probability that the $i$-th bit of $\hat{\mathbf{u}}$ is in error, when $\tilde{\mathbf{e}}$ has weight $\eta$ and the row (column) weight of $\mathbf{S}$ is $v$, equals:

\[
P(\mathrm{err} \mid \eta, v) = \sum_{m=0}^{\lfloor (v-1)/2 \rfloor} P_{2m+1}(v)
\tag{10}
\]

Finally, the probability that the $i$-th bit of $\hat{\mathbf{u}}$ is in error, for the same value $v$, can be obtained as:

\[
P(\mathrm{err} \mid v) = \sum_{\eta=1}^{k} P(\mathrm{err} \mid \eta, v) \, P_{\tilde{e}}(\eta)
\tag{11}
\]

where $P_{\tilde{e}}(\eta)$ represents the probability that vector $\tilde{\mathbf{e}}$ has weight $\eta$, and, for the McEliece cryptosystem, can be estimated as follows:

\[
P_{\tilde{e}}(\eta) = \frac{\binom{t}{\eta} \binom{n-t}{k-\eta}}{\binom{n}{k}}
\tag{12}
\]

Expression (12) can be used, together with expression (10), in expression (11). This way, it can be calculated, as an example, that, for $n = 8000$, $k = 6000$ and $t = 30$, an $\mathbf{S}$ matrix with density of about 18% suffices to ensure maximum error “propagation” action (which consists in obtaining BER equal to 1/2). On the other hand, it is expected that the scrambling action also has an impact on the authorized user (Bob). In other words, a penalty in the error correction performance must appear, which is compensated by the possibility of making the system secure against unauthorized intrusions.

IV. SYSTEM CRYPTANALYSIS

Cryptanalysis is carried out considering both attacks already developed for the original McEliece cryptosystem and new attacks targeted to the proposed instance of it.

A. Classic Attacks

The first kind of known attacks are brute force attacks. As already shown, enumeration of the code set is too demanding; therefore a brute force attack on $\mathbf{H}$ (or a square sub-block of $\mathbf{H}$) should be excluded. Even the number of randomly chosen (dense) $\mathbf{T}$ matrices is extremely high. Other kinds of brute force attacks (e.g. trying to decode the ciphertext, even using coset leaders) are unfeasible as well, as in the original McEliece cryptosystem.

A low complexity attack, in the form of a local deduction, already mentioned in [1], has been further investigated and improved in the subsequent literature. Lee and Brickell, in [17], propose a generalization of such attack, characterized by reduced complexity. The work factor of Lee and Brickell’s algorithm can be evaluated as follows:

\[
W_j = T_j \left( \alpha k^3 + N_j \beta k \right)
\tag{13}
\]

where $T_j = \left[ \sum_{i=0}^{j} \binom{t}{i} \binom{n-t}{k-i} \Big/ \binom{n}{k} \right]^{-1}$, $N_j = \sum_{i=0}^{j} \binom{k}{i}$ and $\alpha$, $\beta$ and $j$ are integers $\geq 1$. Assuming $\alpha = \beta = 1$ (the most favorable condition for an opponent), $n = 8000$ and $k = 6000$, we have that $W_j$ is minimum for $j = 2$ and, for $t = 40$, $W_2 \simeq 2^{105.8}$, which would be high enough to discourage the attack.

Berson [18] proved that attacks based on Information Set Decoding become very easy when the same message is encrypted more than once (with different error vectors), or in the case of messages with a known linear relationship among them. When considering LDPC codes, it is not easy to determine the decoding radius under belief propagation; therefore, for a fixed value of $t$, Bob’s LDPC decoder may be occasionally unable to correct all the errors. In this case, which however occurs with extremely low probability, the parity check fails, and the uncorrected errors are detected. So, message resending could be necessary. It should be noted, however, that the same circumstance can occur in the original McEliece cryptosystem, though originated by different causes.

Berson-like attacks, however, can be avoided through a simple modification of the cryptosystem [19]: for example, the encryption map can be modified as follows: $\mathbf{x} = [\mathbf{u} + h(\mathbf{e})] \cdot \mathbf{G}' + \mathbf{e}$, where $h$ is a one-way hash function with $\mathbf{e}$ as input and a $k$-bit output. Its contribution must obviously be removed by Bob, after successful decoding: $\mathbf{u} = [\mathbf{u} + h(\mathbf{e})] + h(\mathbf{e})$.

Another attack can be derived considering that the problem of finding $\mathbf{e}$ translates into that of finding the minimum weight codeword of the $(n, k + 1)$ linear block code generated by $\mathbf{G}'' = \begin{bmatrix} \mathbf{G}' \\ \mathbf{x} \end{bmatrix}$. The problem of finding the minimum weight codeword is mostly unsolved for the case of LDPC codes. Actually, some algorithms have been studied [20], but they rarely succeed on long codes, like those adopted here.
Moreover, matrix H0 is not suitable for iterative decoding, circulant, is equivalent to calculate the periodic autocorrelation
while these algorithms exploit belief propagation. of the first row of H0i . When H0i is sparse (i.e. T is sparse)
An adaptation of the probabilistic algorithm originally pro- the autocorrelation is everywhere null (or very small), except
posed by Stern [21], that does not rely on iterative decoding, for a limited number of peaks that reveal the couple of rows
has been applied to LDPC codes by Hirotomo et al. [22]. of H0i able to give information on the structure of Hi . On the
Even if this approach seems to outperform that based on contrary, when H0i is dense (suppose with one half of symbols
iterative decoding, it is hard to estimate the minimum distance 1), the autocorrelation is always high, and no information is
using Stern’s algorithm for code length n ≥ 2048, as the available for the opponent. In this case, the eavesdropper Eve is
authors themselves acknowledge. For n = 4096 and k = 2048 in the same condition as to guess at random. The relevant point
(that are smaller than those here considered), the work factor is that to have a dense matrix H0 , when the proposed system
reaches 298.3 . Even considering the quasi-cyclic property, such adopts QC-LDPC codes based on RDFs, does not affect the
values of the work factor should discourage the attack. public key length: matrix G remains described completely by
a column of its non-systematic part.
B. Attacks to LDPC Codes 2) Attack to Circulant Permutation Matrices-based QC-
1) Density Reduction Attacks: These attacks, already con- LDPC codes : In the previous subsection it has been explained
jectured in [4], are specifically targeted to LDPC codes.

Let $h_i$ be the $i$-th row of matrix $H$ and $h'_j$ the $j$-th row of matrix $H' = T \cdot H$, and let $(GF_2^n, +, \times)$ be the vector space of all possible binary $n$-tuples with the operations of addition (i.e. the logical “XOR”) and multiplication (i.e. the logical “AND”). Let us define “orthogonality” in the following sense: two binary vectors $u$ and $v$ are orthogonal, i.e. $u \perp v$, iff $u \times v = 0$. From the cryptosystem description, it follows that $h'_j = h_{i_1} + h_{i_2} + \ldots + h_{i_z}$, where $z$ is the Hamming weight of each row (column) of $T$.

Let us suppose that many $h_i$ are mutually orthogonal, due to the sparsity of matrix $H$. Let $h'_{j^a} = h_{i^a_1} + h_{i^a_2} + \ldots + h_{i^a_z}$ and $h'_{j^b} = h_{i^b_1} + h_{i^b_2} + \ldots + h_{i^b_z}$ be two distinct rows of $H'$, with $h_{i^a_1} = h_{i^b_1} = h_{i_1}$ (this happens when $T$ has two non-zero entries in the same column $i_1$, at rows $j^a$ and $j^b$). In this case, it may happen that $h'_{j^a} \times h'_{j^b} = h_{i_1}$ (this occurs, for example, when $h_{i_1} \perp h_{i^b_2}, \ldots, h_{i_1} \perp h_{i^b_z}, \ldots, h_{i^a_z} \perp h_{i_1}, \ldots, h_{i^a_z} \perp h_{i^b_z}$). Therefore, a row of $H$ could be derived as the product of two rows of $H'$. At this point, if the code is quasi-cyclic with the considered form, its whole parity-check matrix can be obtained, because the other rows of $H$ are simply block-wise circularly shifted versions of the one obtained through the attack.

Even when the analysis of all possible pairs of rows of $H'$ does not reveal a row of $H$, it may produce a new matrix, $H''$, sparser than $H'$, that allows efficient LDPC decoding. Alternatively, the attack can be iterated on $H''$ and can succeed after more than one iteration; in general, the attack requires $\rho - 1$ iterations when at least $\rho$ rows of $H'$ have a single row of $H$ in common. This attack procedure can even be applied to a single circulant block of $H'$, say $H'_i$, to derive its corresponding block $H_i$ of $H$, from which $T = H'_i \cdot H_i^{-1}$ can be obtained.

The author has verified elsewhere [16] that the attack can be avoided through a proper selection of matrix $T$, but this approach forces constraints on the code parameters. In this work, following [4], it is proposed instead to rely only on the density of matrix $T$.

Let us suppose that the attack is carried out on the single block $H'_i$ (it can be easily generalized to the whole $H'$). The first iteration of the attack, for the considered case of $H'_i$

how the assumption of a matrix $T$ sufficiently dense may guarantee inapplicability of an attack that aims at finding single rows of $H$. In this subsection it is shown that, independently of the density of $T$, QC-LDPC codes having the form described in Section II-A cannot be used in the considered cryptosystem, since a total-break attack exists that recovers the private key with high probability and low complexity.

Let the private key $H$ be formed by $r_0 \times n_0$ circulant permutation or null matrices $P_{i,j}$ of size $p$ and have lower triangular form, to ensure full rank. For the sake of clarity, let us consider a simple example with $r_0 = 3$ and $n_0 = 6$, and where all the blocks $P_{i,j}$ are non-null:

$$
H = \begin{bmatrix}
P_{1,1} & P_{1,2} & P_{1,3} & P_{1,4} & 0 & 0 \\
P_{2,1} & P_{2,2} & P_{2,3} & P_{2,4} & P_{2,5} & 0 \\
P_{3,1} & P_{3,2} & P_{3,3} & P_{3,4} & P_{3,5} & P_{3,6}
\end{bmatrix} \tag{14}
$$

Let $T$ have the generic form of $r_0 \times r_0$ square circulant dense blocks $T_{i,j}$, each of size $p$:

$$
T = \begin{bmatrix}
T_{1,1} & T_{1,2} & T_{1,3} \\
T_{2,1} & T_{2,2} & T_{2,3} \\
T_{3,1} & T_{3,2} & T_{3,3}
\end{bmatrix} \tag{15}
$$

The public key is obtained as $H' = T \cdot H$. Although exact knowledge of the private key would certainly allow correct decoding, for the eavesdropper it suffices to find a pair of matrices $(T_d, H_d)$, with the same dimensions as $(T, H)$, such that $H' = T_d \cdot H_d$ and $H_d$ is sparse enough to allow efficient belief propagation decoding. This can be accomplished considering that, given an invertible square matrix $Z$ of size $r = p \cdot r_0$, the following relationship holds:

$$
H' = T \cdot H = T \cdot Z \cdot Z^{-1} \cdot H = T_d \cdot H_d \tag{16}
$$

where $T_d = T \cdot Z$ and $H_d = Z^{-1} \cdot H$. If we separate $H$ into its left ($r \times k$) and right ($r \times r$) parts, $H = [H_l | H_r]$, a particular choice of $Z$ coincides with the lower triangular part of $H$, i.e., for the considered example:

$$
Z = H_r = \begin{bmatrix}
P_{1,4} & 0 & 0 \\
P_{2,4} & P_{2,5} & 0 \\
P_{3,4} & P_{3,5} & P_{3,6}
\end{bmatrix} \tag{17}
$$
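The factorization of Eq. (16), with $Z$ chosen as in Eq. (17), can be checked numerically on a toy instance. The following is a minimal sketch in plain Python, not the author's implementation: the block size `p = 7`, the shift table, the random seed, and all helper names are illustrative assumptions. It builds $H$ with the block structure of Eq. (14), a dense block-circulant $T$, and confirms that $T_d \cdot H_d$ reproduces the public matrix.

```python
import random

def circulant(first_row):
    """Binary circulant matrix: row i is first_row cyclically shifted right by i."""
    p = len(first_row)
    return [[first_row[(j - i) % p] for j in range(p)] for i in range(p)]

def perm_block(p, shift):
    """Circulant permutation matrix of size p (identity shifted by `shift`)."""
    return circulant([1 if j == shift % p else 0 for j in range(p)])

def zero_block(p):
    return [[0] * p for _ in range(p)]

def assemble(blocks):
    """Stitch a 2-D grid of p x p blocks into a single matrix."""
    rows = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            rows.append([bit for blk in block_row for bit in blk[i]])
    return rows

def matmul(a, b):
    """Matrix product over GF(2)."""
    cols = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) & 1 for col in cols] for row in a]

def inverse(m):
    """Gauss-Jordan inversion over GF(2); raises ValueError if m is singular."""
    n = len(m)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c]), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c and aug[r][c]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Toy parameters (assumed for illustration; real codes use much larger p).
random.seed(1)
p, r0, n0 = 7, 3, 6
r, k = p * r0, p * (n0 - r0)

# shifts[i][j] is the cyclic shift of block P_{i+1,j+1}; None marks a null block.
shifts = [[1, 2, 3, 4, None, None],
          [5, 6, 0, 1, 2, None],
          [3, 4, 5, 6, 0, 1]]
H = assemble([[zero_block(p) if s is None else perm_block(p, s) for s in row]
              for row in shifts])
T = assemble([[circulant([random.randint(0, 1) for _ in range(p)])
               for _ in range(r0)] for _ in range(r0)])

H_pub = matmul(T, H)            # public matrix H' = T * H
Z = [row[k:] for row in H]      # Z = H_r, the right r x r part of H
T_d = matmul(T, Z)
H_d = matmul(inverse(Z), H)

print(matmul(T_d, H_d) == H_pub)                 # factorization holds
print(max(sum(row) for row in H_d), "max row weight of H_d")
```

In this sketch the right-hand $r$ columns of `H_d` come out as the identity and `T_d` coincides with the right-hand $r$ columns of `H_pub`, so both factors can be computed from the public matrix alone.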
With this choice of $Z$, $H_d$ assumes a particular form:

$$
H_d = [H_{dl} | H_{dr}] = H_r^{-1} \cdot [H_l | H_r] = \left[ H_r^{-1} \cdot H_l \,|\, I \right] \tag{18}
$$

where the right part of $H_d$ is an identity matrix of size $r = p \cdot r_0$. From this, it follows that:

$$
H' = T_d \cdot H_d = [T_d \cdot H_{dl} \,|\, T_d] \tag{19}
$$

Eq. (19) is the starting point for the eavesdropper: looking at $H'$, she immediately knows $T_d$ and, hence, $H_d$, both corresponding to the choice of $Z$ expressed by Eq. (17). Moreover, it can be proved that $H_d$, so found, can be sparse enough to allow efficient belief propagation decoding. In fact, because of the definition of $H_d$, its sparsity depends on that of $Z^{-1}$. For the considered example, it is:

$$
Z^{-1} = H_r^{-1} = \begin{bmatrix}
Z_{1,1} & 0 & 0 \\
Z_{2,1} & Z_{2,2} & 0 \\
Z_{3,1} & Z_{3,2} & Z_{3,3}
\end{bmatrix} \tag{20}
$$

where

$$
\begin{aligned}
Z_{1,1} &= P_{1,4}^T \\
Z_{2,2} &= P_{2,5}^T \\
Z_{3,3} &= P_{3,6}^T \\
Z_{2,1} &= P_{2,5}^T P_{2,4} P_{1,4}^T \\
Z_{3,2} &= P_{3,6}^T P_{3,5} P_{2,5}^T \\
Z_{3,1} &= P_{3,6}^T P_{3,4} P_{1,4}^T + P_{3,6}^T P_{3,5} P_{2,5}^T P_{2,4} P_{1,4}^T
\end{aligned} \tag{21}
$$

and the property of permutation matrices that inversion coincides with transposition has been used.

It follows that the main diagonal of $Z^{-1}$ contains permutation matrices; the diagonal below it contains products of permutation matrices, i.e. permutation matrices again; the following one contains sums of permutation matrices. The same analysis holds for an arbitrary $r_0$: the blocks $Z_{i+j,1+j}$, $i \in [2; r_0]$, $j \in [0; r_0 - i]$, have column (row) weight $2^{i-2}$. It follows that, when $r_0$ is small, $Z^{-1}$ is sparse and, consequently, $H_d$ is sparse as well; so it could be used by Eve for efficient decoding.

Furthermore, the attack can continue and aim at obtaining another matrix, $H_b$, that has the same density as $H$ and, therefore, produces a total break of the cryptosystem. This new matrix corresponds to another choice of $Z$, namely $Z = Z^*$, that, for the considered example, has the following form:

$$
Z^* = \begin{bmatrix}
P_{1,4} & 0 & 0 \\
0 & P_{2,5} & 0 \\
0 & 0 & P_{3,6}
\end{bmatrix} \tag{22}
$$

For $Z = Z^*$, $H_b$ (that is in row reduced echelon form) has the same density as $H$, since each of its rows is a permuted version of the corresponding row of $H$. An attack that aims at finding $H_b$ can be conceived by analyzing the structure of $H_d$ in the form (18), with $Z^{-1}$ expressed by Eqs. (20) and (21). We notice that the first row of $H_d$ equals that of $H$ multiplied by $P_{1,4}^T$, so it corresponds to the first row of $H_b$. The second row of $H_d$ equals that of $H$ multiplied by $P_{2,5}^T$ plus a shifted version of the first row of $H_b$. The third row of $H_d$ equals that of $H$ multiplied by $P_{3,6}^T$ plus two shifted versions of the first row of $H_b$ and a shifted version of the second row of $H_b$. Therefore, a recursive attack can be conceived.

When sparse matrices are added, it is highly probable that their symbols “1” do not overlap. For this reason, in a sum of rows of $H_b$, the contributions of shifted versions of known rows can be isolated through a correlation operation and, therefore, eliminated. This makes it possible to deduce each row of $H_b$ from the previous ones and, in this way, to obtain the entire $H_b$. Obviously, the hypothesis of non-overlapping elements is most likely verified when the blocks of $H$ are sparse and $r_0$ is not too high. For example, it has been verified that matrices with $r_0 = 6$ and $p = 40$ (i.e. density of the blocks 0.025) are highly exposed to total break. This is not a trivial case; for example, the codes included in the IEEE 802.16e standard [5] have $p \in [24; 96]$ and $r_0 = 6$ when the code rate is 3/4.

An attack of this kind is addressed to LDPC codes based on circulant permutation matrices, whilst it is not applicable to LDPC codes based on difference families (described in Section II-B). For this reason, the latter appear more secure, at least at the present stage of cryptanalysis.

3) Attacks to the Dual Code: The most dangerous attack for every instance of the McEliece cryptosystem based on LDPC codes arises from the fact that an opponent knows that the dual of the secret code contains very low weight codewords and can directly search for them, thus recovering $H$.

The dual of the secret code can be generated by $H$; therefore it has $A_{d_c} \geq r$ codewords with weight $d_c$. Each of them completely describes $H$ and, if known, allows the opponent to break the system by recovering the private key. From a cryptographic point of view, $A_{d_c}$ should be known in order to precisely evaluate the work factor of the attack, but this is not, in general, a simple task. However, it can be considered that $d_c \ll n$ and that sparse vectors most likely sum into vectors of higher weight. Therefore, $A_{d_c} = r$ will be assumed in the following.

The best known probabilistic algorithm for finding low weight codewords is due to Stern [21], and it has been recently applied to LDPC codes by Hirotomo et al. [22]. The algorithm, which exploits an iterative procedure, works on the parity-check matrix of a code and has two parameters, $p$ and $l$, that represent the number of matrix columns and rows considered at each iteration, respectively. Optimal values for $p$ and $l$ can be derived considering their influence on the total number of binary operations needed for finding a codeword of given weight. If we suppose that the algorithm is performed on the dual of the secret code, with length $n$ and dimension $k_d$ (i.e. redundancy $r_d = n - k_d$), the probability of finding, in one iteration, a codeword with weight $w$, supposing that it is unique, is $P_w = P_1 \cdot P_2 \cdot P_3$, with:

$$
\begin{aligned}
P_1 &= \binom{w}{p} \binom{n-w}{k_d/2-p} \Big/ \binom{n}{k_d/2} \\
P_2 &= \binom{w-p}{p} \binom{n-k_d/2-w+p}{k_d/2-p} \Big/ \binom{n-k_d/2}{k_d/2} \\
P_3 &= \binom{n-k_d-w+2p}{l} \Big/ \binom{n-k_d}{l}
\end{aligned}
$$
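As a rough numerical sketch, the probabilities above can be combined with the per-iteration cost $N$ and the parameter sets quoted in the remainder of this subsection to estimate the work factor $W = mN$. The code below is illustrative only: the function names are hypothetical, binomial coefficients are evaluated through log-gamma to avoid very large integers, and $m$ is taken at its bound $1/(A_w P_w)$ without clamping it to 1, which is an assumption about how the quoted figures were obtained.

```python
from math import comb, lgamma, log, log2

def log2_binom(n, k):
    """log2 of the binomial coefficient C(n, k), computed via log-gamma."""
    if k < 0 or k > n:
        return float("-inf")
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / log(2)

def stern_log2_work_factor(n, k_d, w, A_w, p, l):
    """Return log2(W), with W = m * N and m = 1 / (A_w * P_w)."""
    r_d = n - k_d          # redundancy of the dual code
    h = k_d // 2           # k_d / 2, assumed integer
    # Success probabilities of one iteration, in the log2 domain.
    log2_P1 = log2_binom(w, p) + log2_binom(n - w, h - p) - log2_binom(n, h)
    log2_P2 = (log2_binom(w - p, p) + log2_binom(n - h - w + p, h - p)
               - log2_binom(n - h, h))
    log2_P3 = log2_binom(n - k_d - w + 2 * p, l) - log2_binom(n - k_d, l)
    log2_m = -(log2(A_w) + log2_P1 + log2_P2 + log2_P3)
    # Binary operations per iteration (the expression for N given in the text).
    c = comb(h, p)
    N = r_d**3 / 2 + k_d * r_d**2 + 2 * p * l * c + 2 * p * r_d * c * c / 2.0**l
    return log2_m + log2(N)

# Parameter sets taken from the text; the source quotes 2^35.65 and 2^81.3.
small = stern_log2_work_factor(n=8000, k_d=2000, w=52, A_w=2000, p=3, l=38)
large = stern_log2_work_factor(n=84000, k_d=28000, w=123, A_w=28000, p=3, l=54)
print(f"log2(W), n = 8000:  {small:.2f}")
print(f"log2(W), n = 84000: {large:.2f}")
```

Scanning `p` and `l` over small ranges with the same function is one way to look for the pair that minimizes the work factor for a given code.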
If the code contains $A_w$ codewords with weight $w$, it is $P_{w,A_w} \leq A_w P_w$; therefore, the average number of iterations needed in order to find one of them is $m \geq P_{w,A_w}^{-1}$. It can be considered that each iteration of the algorithm requires

$$
N = \frac{r_d^3}{2} + k_d r_d^2 + 2pl \binom{k_d/2}{p} + \frac{2 p r_d \binom{k_d/2}{p}^2}{2^l}
$$

binary operations, so the total work factor can be estimated as $W = mN$.

If we consider the following choice of the system parameters: $n = 8000$, $k_d = 2000$, $w = d_c = n_0 \cdot d_v = 52$, $A_w = 2000$, the minimum work factor, which corresponds to $(p, l) = (3, 38)$, is $2^{35.65}$. It is evident that such a system would be highly exposed to a total break.

This attack is particularly insidious since the work factor of Stern's algorithm mainly depends on the relative weight searched and decreases with the code rate (it is desirable, for the dual code, to have rate as low as possible, i.e. highest rate for the private and the public code). In order to increase the work factor, denser parity-check matrices and lower code rate should be adopted. On the other hand, however, such matrices must be sparse enough to ensure the absence of length-4 cycles and allow efficient belief propagation decoding. This means that it is possible to obtain high work factors only by employing relatively large codes. For example, by choosing $n = 84000$, $r = k_d = 28000$, $n_0 = 3$ ($R = 2/3$) and $d_v = 41$ ($d_c = 123$) it is $W = 2^{81.3}$ (minimal for $p = 3$, $l = 54$), which ensures satisfactory system robustness. With this choice of the parameters, the number of equivalent codes is still high$^1$. However, the complexity of such a system is high, and it could be very hard to implement. Alternatively, the cryptosystem should be modified in order not to expose the secret code, thus preventing the attack from even being attempted. Further work is in progress in this direction.

$^1$ The lower bound given in Subsection II-C becomes too loose, but such number can be estimated through different arguments.

V. CONCLUSIONS

An instance of the McEliece cryptosystem based on QC-LDPC codes has been studied. Such codes, in principle, could overcome the main drawbacks of the original McEliece cryptosystem, namely large keys and low transmission rate.

The new system has been cryptanalyzed both by considering classic attacks and by introducing new threats to its security. It has been shown that some structured configurations of the parity-check matrix, like those based on circulant permutation matrices, cannot be used in this system since they are highly vulnerable to total breaks. The attack proposed for demonstrating this conclusion is not applicable to different design methods, like that based on the use of difference families. However, another attack has been presented, targeted to the dual of the secret code, that seriously threatens the system security. This attack forces the adoption of larger codes, even if such a choice can jeopardize the system feasibility.

ACKNOWLEDGMENTS

The author is greatly indebted to Professor F. Chiaraluce for his continued interest and his precious insights during the progress of this research.

REFERENCES

[1] R. J. McEliece, “A public-key cryptosystem based on algebraic coding theory,” DSN Progress Report, pp. 114–116, 1978.
[2] E. Berlekamp, R. McEliece, and H. van Tilborg, “On the inherent intractability of certain coding problems,” IEEE Trans. Inform. Theory, vol. 24, pp. 384–386, May 1978.
[3] H. Niederreiter, “Knapsack-type cryptosystems and algebraic coding theory,” Probl. Contr. and Inform. Theory, vol. 15, pp. 159–166, 1986.
[4] C. Monico, J. Rosenthal, and A. Shokrollahi, “Using low density parity check codes in the McEliece cryptosystem,” in Proc. IEEE ISIT 2000, Sorrento, Italy, Jun. 2000, p. 215.
[5] 802.16e (2005), IEEE Standard for Local and Metropolitan Area Networks - Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems - Amendment for Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands, IEEE Std., Dec. 2005.
[6] R. Townsend and E. J. Weldon, “Self-orthogonal quasi-cyclic codes,” IEEE Trans. Inform. Theory, vol. 13, pp. 183–195, Apr. 1967.
[7] IEEE P802.11, Wireless LANs WWiSE Proposal: High throughput extension to the 802.11 Standard, IEEE Std. IEEE 11-04-0886-00-000n, Aug. 2004.
[8] R. Tanner, D. Sridhara, and T. Fuja, “A class of group-structured LDPC codes,” in Proc. ISCTA 2001, Ambleside, UK, Jul. 2001.
[9] M. P. C. Fossorier, “Quasi-cyclic low-density parity-check codes from circulant permutation matrices,” IEEE Trans. Inform. Theory, vol. 50, no. 8, pp. 1788–1793, Aug. 2004.
[10] D. Hocevar, “LDPC code construction with flexible hardware implementation,” in Proc. IEEE ICC 2003, vol. 4, Anchorage, Alaska, May 2003, pp. 2708–2712.
[11] D. Hocevar, “Efficient encoding for a family of quasi-cyclic LDPC codes,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’03), vol. 7, 2003, pp. 3996–4000.
[12] S. Johnson and S. Weller, “A family of irregular LDPC codes with low encoding complexity,” IEEE Commun. Lett., vol. 7, pp. 79–81, Feb. 2003.
[13] M. Baldi and F. Chiaraluce, “New quasi cyclic low density parity check codes based on difference families,” in Proc. 8th Int. Symp. Commun. Theory and Appl., ISCTA 05, Ambleside, UK, Jul. 2005, pp. 244–249.
[14] T. Xia and B. Xia, “Quasi-cyclic codes from extended difference families,” in Proc. IEEE Wireless Commun. and Networking Conf., vol. 2, New Orleans, USA, Mar. 2005, pp. 1036–1040.
[15] M. Baldi, F. Chiaraluce, and R. Garello, “On the usage of quasi-cyclic low-density parity-check codes in the McEliece cryptosystem,” in Proc. First Int. Conf. on Commun. and Electron. (ICCE’06), Hanoi, Vietnam, Oct. 2006, pp. 305–310.
[16] M. Baldi, “Quasi-cyclic low-density parity-check codes and their application to cryptography,” Ph.D. dissertation, Università Politecnica delle Marche, Ancona, Italy, Nov. 2006.
[17] P. Lee and E. Brickell, “An observation on the security of McEliece’s public-key cryptosystem,” in Advances in Cryptology - EUROCRYPT 88, Springer-Verlag, Ed., 1988, pp. 275–280.
[18] T. A. Berson, “Failure of the McEliece public-key cryptosystem under message-resend and related-message attack,” Advances in Cryptology - CRYPTO ’97, Lecture Notes in Computer Science, vol. 1294, pp. 213–220, Aug. 1997.
[19] H. M. Sun, “Improving the security of the McEliece public-key cryptosystem,” in ASIACRYPT, 1998, pp. 200–213.
[20] X.-Y. Hu, M. Fossorier, and E. Eleftheriou, “On the computation of the minimum distance of low-density parity-check codes,” in Proc. IEEE ICC 2004, vol. 2, Paris, France, Jun. 2004, pp. 767–771.
[21] J. Stern, “A method for finding codewords of small weight,” in G. Cohen and J. Wolfmann, Coding Theory and Applications, Springer-Verlag, Ed., no. 388 in Lecture Notes in Computer Science, 1989, pp. 106–113.
[22] M. Hirotomo, M. Mohri, and M. Morii, “A probabilistic computation method for the weight distribution of low-density parity-check codes,” in Proc. IEEE ISIT 2005, Adelaide, Australia, Sep. 2005, pp. 2166–2170.
