Abstract
RC4 was designed by Ron Rivest of RSA Security in 1987. While it is officially termed "Rivest Cipher 4", [4] the RC acronym is alternatively understood to stand for "Ron's Code" (see also RC2, RC5 and RC6).

RC4 generates a pseudorandom stream of bits (a keystream). As with any stream cipher, the keystream can be used for encryption by combining it with the plaintext using bitwise exclusive-or; decryption is performed the same way (since exclusive-or is a symmetric operation). (This is similar to the Vernam cipher except that generated pseudorandom bits, rather than a prepared stream, are used.) To generate the keystream, the cipher makes use of a secret internal state which consists of two parts:
1. A permutation of all 256 possible bytes (denoted "S" below).
2. Two 8-bit index-pointers (denoted "i" and "j").

The permutation is initialized with a variable-length key, typically between 40 and 256 bits, using the key-scheduling algorithm (KSA). Once this has been completed, the stream of bits is generated using the pseudo-random generation algorithm (PRGA).

The key-scheduling algorithm (KSA)
The key-scheduling algorithm is used to initialize the permutation in the array "S". "keylength" is defined as the number of bytes in the key and can be in the range 1 ≤ keylength ≤ 256, typically between 5 and 16, corresponding to a key length of 40 to 128 bits. First, the array "S" is initialized to the identity permutation. S is then processed for 256 iterations in a similar way to the main PRGA, but with bytes of the key mixed in at the same time.

    for i from 0 to 255
        S[i] := i
    endfor
    j := 0
    for i from 0 to 255
        j := (j + S[i] + key[i mod keylength]) mod 256
        swap values of S[i] and S[j]
    endfor
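The key-scheduling pseudocode translates almost line for line into Python. The following is a minimal sketch rather than a reference implementation; the function name `ksa` and the `ValueError` check are choices made for this illustration.

```python
def ksa(key: bytes) -> list:
    """Key-scheduling algorithm: return the initial permutation S for `key`."""
    keylength = len(key)
    if not 1 <= keylength <= 256:
        raise ValueError("key must be between 1 and 256 bytes long")
    S = list(range(256))  # identity permutation
    j = 0
    for i in range(256):
        # Mix one key byte into j, then swap S[i] and S[j].
        j = (j + S[i] + key[i % keylength]) % 256
        S[i], S[j] = S[j], S[i]
    return S


S = ksa(b"Key")
# Only swaps are applied, so S remains a permutation of 0..255.
print(sorted(S) == list(range(256)))
```

Because the loop only ever swaps entries, the result is always a permutation of the 256 byte values, whatever the key.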
The lookup stage of RC4. The output byte is selected by looking up the values of S(i) and S(j), adding them together modulo 256, and then looking up the sum in S; S(S(i) + S(j)) is used as a byte of the key stream, K.
The pseudo-random generation algorithm (PRGA)
For as many iterations as are needed, the PRGA modifies the state and outputs a byte of the keystream. In each iteration, the PRGA increments i, looks up the i-th element of S, S[i], and adds it to j, exchanges the values of S[i] and S[j], and then uses the sum S[i] + S[j] (modulo 256) as an index to fetch a third element of S, which is the output of the algorithm. Each element of S is swapped with another element at least once every 256 iterations.
    i := 0
    j := 0
    while GeneratingOutput:
        i := (i + 1) mod 256
        j := (j + S[i]) mod 256
        swap values of S[i] and S[j]
        K := S[(S[i] + S[j]) mod 256]
        output K
    endwhile
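Putting the two stages together, a compact Python sketch of the whole cipher might look as follows. The names `rc4_keystream` and `rc4_crypt` are made up for this example, and the code is meant to illustrate the pseudocode, not to be used in production.

```python
from itertools import islice


def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes for `key` (KSA followed by the PRGA)."""
    # Key scheduling: build the initial permutation S.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation: one keystream byte per iteration.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """XOR `data` with the keystream; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))


print(bytes(islice(rc4_keystream(b"Key"), 10)).hex())  # eb9f7781b734ca72a719
ciphertext = rc4_crypt(b"Key", b"Plaintext")
print(ciphertext.hex())               # bbf316e8d940af0ad3
print(rc4_crypt(b"Key", ciphertext))  # b'Plaintext'
```

The printed values agree with the test vectors listed in this article for the key "Key". Since encryption is a plain XOR with the keystream, applying `rc4_crypt` twice with the same key returns the original data.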
Implementation
Many stream ciphers are based on linear feedback shift registers (LFSRs), which, while efficient in hardware, are less so in software. The design of RC4 avoids the use of LFSRs and is ideal for software implementation, as it requires only byte manipulations. It uses 256 bytes of memory for the state array, S[0] through S[255], k bytes of memory for the key, key[0] through key[k-1], and integer variables i, j, and K. A reduction modulo 256 can be performed with a bitwise AND with 255 (which is equivalent to taking the low-order byte of the value in question).

Test vectors
These test vectors are not official, but are convenient for anyone testing their own RC4 program. The keys and plaintext are ASCII; the keystream and ciphertext are in hexadecimal.
Key       Keystream                  Plaintext    Ciphertext
Key       eb9f7781b734ca72a719...    Plaintext    BBF316E8D940AF0AD3
Wiki      6044db6d41b7...            pedia        1021BF0420
Secret    04d46b053ca87b59...
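The first two test vectors can be checked without running the cipher at all, because each ciphertext is simply the plaintext XORed with the start of the listed keystream. The sketch below does that check; the helper name `xor_bytes` is invented for this example.

```python
def xor_bytes(data: bytes, keystream: bytes) -> bytes:
    """XOR `data` with the first len(data) bytes of `keystream`."""
    if len(keystream) < len(data):
        raise ValueError("keystream prefix is shorter than the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


vectors = [
    # (keystream prefix as listed, ASCII plaintext, ciphertext as listed)
    ("eb9f7781b734ca72a719", b"Plaintext", "BBF316E8D940AF0AD3"),
    ("6044db6d41b7", b"pedia", "1021BF0420"),
]

for keystream_hex, plaintext, ciphertext_hex in vectors:
    computed = xor_bytes(plaintext, bytes.fromhex(keystream_hex)).hex().upper()
    print(plaintext.decode(), computed, computed == ciphertext_hex)
```

Both rows print `True`: for example, "P" is 0x50, and 0x50 XOR 0xeb gives 0xbb, the first ciphertext byte of the first vector.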
Security
Unlike a modern stream cipher (such as those in eSTREAM), RC4 does not take a separate nonce alongside the key. This means that if a single long-term key is to be used to securely encrypt multiple streams, the cryptosystem must specify how to combine the nonce and the long-term key to generate the stream key for RC4. One approach to addressing this is to generate a "fresh" RC4 key by hashing a long-term key with a nonce. However, many applications that use RC4 simply concatenate key and nonce; RC4's weak key schedule then gives rise to a variety of serious problems.

Because RC4 is a stream cipher, it is more malleable than common block ciphers. If not used together with a strong message authentication code (MAC), then encryption is vulnerable to a bit-flipping attack. It is noteworthy, however, that RC4, being a stream cipher, is the only common cipher which is immune to the 2011 BEAST attack on TLS 1.0, which exploits a known weakness in the way cipher block chaining mode is used with all of the other ciphers supported by TLS 1.0, which are all block ciphers. [7]

Biased Outputs of the RC4
The keystream generated by RC4 is biased to varying degrees towards certain sequences. The best such attack is due to Itsik Mantin and Adi Shamir, who showed that the second output byte of the cipher is zero with probability 1/128 (instead of 1/256). This is because if the third byte of the original state is zero, and the second byte is not equal to 2, then the second output byte is always zero. [15] Such bias can be detected by observing only 256 bytes.

Souradyuti Paul and Bart Preneel of COSIC showed that the first and the second bytes of RC4 were also biased. [16] The number of samples required to detect this bias is 2^25 bytes.

Scott Fluhrer and David McGrew also showed attacks which distinguished the keystream of RC4 from a random stream given a gigabyte of output. [17]
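The second-byte bias described by Mantin and Shamir is large enough to observe with a small experiment. The sketch below generates the second keystream byte for many random keys and counts how often it is zero. The 16-byte key length, the 20,000 trials, the fixed seed, and the function names are all choices made for this illustration, not part of the published analysis.

```python
import random


def second_output_byte(key: bytes) -> int:
    """Return the second keystream byte RC4 produces for `key`."""
    S = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = 0
    for _ in range(2):  # two PRGA steps; only the second output is kept
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out = S[(S[i] + S[j]) % 256]
    return out


def zero_frequency(trials: int, seed: int = 0) -> float:
    """Fraction of random 16-byte keys whose second keystream byte is zero."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    zeros = 0
    for _ in range(trials):
        key = bytes(rng.getrandbits(8) for _ in range(16))
        if second_output_byte(key) == 0:
            zeros += 1
    return zeros / trials


freq = zero_frequency(20000)
print(f"observed P(second byte = 0): {freq:.4f}")
print(f"uniform: {1 / 256:.4f}, Mantin-Shamir prediction: {1 / 128:.4f}")
```

With this many trials the observed frequency should land near 1/128, roughly twice the value expected from a uniform byte, although the exact figure depends on the random keys drawn.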
The complete characterization of a single step of RC4 PRGA was performed by Riddhipratim Basu, Shirshendu Ganguly, Subhamoy Maitra, and Goutam Paul. [18] Considering all the permutations, they prove that the distribution of the output is not uniform given i and j, and as a consequence, information about j is always leaked into the output.
RC4 variants
As mentioned above, the most important weakness of RC4 comes from the insufficient key schedule; the first bytes of output reveal information about the key. This can be corrected by simply discarding some initial portion of the output stream. [25] This is known as RC4-dropN, where N is typically a multiple of 256, such as 768 or 1024.

RC4A uses two state arrays S1 and S2, and two indexes j1 and j2. Each time i is incremented, two bytes are generated:
1. First, the basic RC4 algorithm is performed using S1 and j1, but in the last step, S1[i] + S1[j1] is looked up in S2.
2. Second, the operation is repeated (without incrementing i again) on S2 and j2, and S1[S2[i] + S2[j2]] is output.

Thus, the algorithm is:

    All arithmetic is performed modulo 256
    i := 0
    j1 := 0
    j2 := 0
    while GeneratingOutput:
        i := i + 1
        j1 := j1 + S1[i]
        swap values of S1[i] and S1[j1]
        output S2[S1[i] + S1[j1]]
        j2 := j2 + S2[i]
        swap values of S2[i] and S2[j2]
        output S1[S2[i] + S2[j2]]
    endwhile
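The RC4A output loop can be sketched in Python as a generator that yields two bytes per increment of i. The text above does not say how S1 and S2 are initialized, so this example simply assumes that each array is filled by the ordinary RC4 key schedule from its own made-up key. That is an assumption for illustration only and not necessarily the key schedule RC4A specifies.

```python
from itertools import islice


def ksa(key: bytes) -> list:
    """Plain RC4 key schedule, used here only to obtain two permutations."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S


def rc4a_keystream(S1: list, S2: list):
    """Yield RC4A output bytes; S1 and S2 are modified in place."""
    i = j1 = j2 = 0
    while True:
        i = (i + 1) % 256
        # Step on S1 with j1, but look the sum up in S2.
        j1 = (j1 + S1[i]) % 256
        S1[i], S1[j1] = S1[j1], S1[i]
        yield S2[(S1[i] + S1[j1]) % 256]
        # Step on S2 with j2 (same i), but look the sum up in S1.
        j2 = (j2 + S2[i]) % 256
        S2[i], S2[j2] = S2[j2], S2[i]
        yield S1[(S2[i] + S2[j2]) % 256]


# Hypothetical keys, chosen only so the example has something to run on.
S1 = ksa(b"hypothetical key 1")
S2 = ksa(b"hypothetical key 2")
stream = bytes(islice(rc4a_keystream(S1, S2), 16))
print(stream.hex())
```

The two half-steps touch different arrays for their swaps, which is where the extra parallelism mentioned in the text comes from.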
Although the algorithm requires the same number of operations per output byte, it offers greater parallelism than RC4, providing a possible speed improvement. Although stronger than RC4, this algorithm has also been attacked, with Alexander Maximov [27] [28] and a team from NEC [29] developing ways to distinguish its output from a truly random sequence.
VMPC "Variably Modified Permutation Composition" is another RC4 variant. [30] It uses the same key schedule as RC4, but iterating 768 times rather than 256 (it is not the same as RC4-drop512 because all iterations incorporate key material), and with an optional additional 768 iterations to incorporate an initial vector. Written to highlight the similarity to RC4 as much as possible, the output generation function operates as follows: