NAND Dump Analysis, Bit Errors Fixing with ECC, UBI Image Analysis, and Firmware Extraction Demystified
on 25/05/2023
1 - Introduction
2 - NAND Dump Analysis
3 - Bit Errors Fixing with ECC
4 - UBI Image Analysis
5 - Firmware Extraction
6 - Conclusion
1 - Introduction
This paper describes how a NAND dump can be processed from a hacker's point
of view in order to obtain all the files contained in the dump file. For
each step of the process, the applied method is explained in detail
together with an example.
The NAND dump discussed in depth here is the physical NAND dump, which is
the dump file obtained from a universal programmer. A dump file obtained
through a bootloader such as u-boot is what I call a logical NAND dump. For
a logical NAND dump, the correctness of the data is ensured by the Flash
Translation Layer (FTL). In other words, the FTL does all the bit error
fixing with the Error Correcting Code (ECC) for you. However, for a
physical NAND dump, the data comes along with the ECC, and you are on your
own to work out how to use the ECC to ensure the correctness of the data.
If bit errors exist, the ECC should be used to fix them accordingly. But it
is not easy to guess how the ECC is associated with the data, and if that
association is not known, it is impossible to use the ECC to fix bit errors
in the data. So, it is necessary to analyze the NAND dump thoroughly and
systematically to uncover the secret association between ECC and data.
Uncovering the secret by blind brute forcing is not a good idea. Instead,
by making use of the results of a thorough analysis, the blind brute
forcing can be transformed into guided brute forcing, which maximizes the
chance of recovering the secret association between ECC and data.
Once the bit errors in the data are fixed and the ECCs are removed, the
NAND dump has been transformed from physical into logical, and it is ready
for the actual firmware image analysis. As the real case scenario for this
paper, a UBI image is dealt with. The analysis of the UBI image is
discussed in considerable detail. Based on the knowledge gained from the
UBI image analysis, a creative approach is proposed to recover the file
system and extract all the files hosted inside it. It is important to note
that the entire process discussed in this paper cannot be replicated with
automated tools such as binwalk or unblob. Besides, the entire analysis
process is demonstrated manually, step by step, to make sure everything is
explained clearly. Without wasting more time on mere talk, let's get
started with the actual NAND dump analysis in detail.
First of all, let's start with a little bit of fundamental stuff. A NAND
flash comprises a large number of so-called "page" of a certain size, and a
certain number of "page" make up a "block". Since the sample NAND dump used
for the demonstration was obtained from an actual NAND chip with the part
number MT29F2G08ABAEAWP, that chip is used as the example to illustrate the
hacking-related technical specification. So, for MT29F2G08ABAEAWP, the size
of a "page" is 2048+64=2112 bytes, a group of 64 "page" makes up a "block",
and 2048 "block" make up the entire storage of the NAND flash, which
contains 2048*64=131072 "page". For each "page" of 2112 bytes, the first
2048 bytes are data and the remaining 64 bytes are the spare area, which
hosts the ECC or some kind of vendor specific metadata. Sometimes, the
spare area is also known as Out Of Band (OOB).
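The geometry above translates directly into file offsets inside a physical
dump. The following is a minimal sketch, assuming the dump stores every
"page" back to back as 2048 data bytes followed by its 64 OOB bytes. The
helper names (page_offset, oob_offset, split_page) are made up for
illustration and do not come from any tool.

```python
# Geometry of the MT29F2G08ABAEAWP as quoted in the text.
PAGE_DATA_SIZE = 2048                        # data bytes per "page"
PAGE_OOB_SIZE = 64                           # spare (OOB) bytes per "page"
PAGE_SIZE = PAGE_DATA_SIZE + PAGE_OOB_SIZE   # 2112 bytes per "page" in the dump
PAGES_PER_BLOCK = 64
BLOCK_COUNT = 2048


def page_offset(page_num):
    """File offset of the first data byte of a page (hypothetical helper)."""
    return page_num * PAGE_SIZE


def oob_offset(page_num):
    """File offset of the first OOB byte of a page (hypothetical helper)."""
    return page_offset(page_num) + PAGE_DATA_SIZE


def split_page(raw_page):
    """Split one raw 2112-byte page into (data, oob)."""
    if len(raw_page) != PAGE_SIZE:
        raise ValueError("expected %d bytes, got %d" % (PAGE_SIZE, len(raw_page)))
    return raw_page[:PAGE_DATA_SIZE], raw_page[PAGE_DATA_SIZE:]


total_pages = PAGES_PER_BLOCK * BLOCK_COUNT
print("total pages        :", total_pages)
print("expected dump size :", total_pages * PAGE_SIZE, "bytes")
print("OOB of page 0 at   : 0x%X" % oob_offset(0))
print("OOB of page 1 at   : 0x%X" % oob_offset(1))
```

The OOB offsets printed for the first two "page" are the addresses to look
at in a hex view of the dump when inspecting their spare areas.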
00000260 00 00 9f e5 00 f0 a0 e1 54 06 00 00 a0 ac 00 00 |........T.......|
00000270 a0 ac 00 00 a0 ac 00 00 00 00 a0 e3 17 0f 07 ee |................|
00000280 17 0f 08 ee 10 0f 11 ee 23 0c c0 e3 87 00 c0 e3 |........#.......|
00000290 02 00 80 e3 01 0a 80 e3 10 0f 01 ee 0e c0 a0 e1 |................|
000002a0 0a 00 00 eb 0c e0 a0 e1 0e f0 a0 e1 00 00 a0 e1 |................|
000002b0 e8 d0 1f e5 fe ff ff eb 00 80 00 bc a0 ae 00 00 |................|
000002c0 80 b7 00 00 00 00 a0 e1 00 00 a0 e1 00 00 a0 e1 |................|
000002d0 68 00 9f e5 00 10 e0 e3 00 10 80 e5 00 00 0f e1 |h...............|
000002e0 c0 00 80 e3 00 f0 21 e1 54 00 9f e5 54 10 9f e5 |......!.T...T...|
000002f0 00 10 80 e5 50 00 9f e5 50 10 9f e5 00 10 80 e5 |....P...P.......|
00000300 4c 00 9f e5 05 14 a0 e3 00 10 80 e5 44 00 9f e5 |L...........D...|
00000310 44 10 9f e5 00 10 80 e5 03 2a a0 e3 01 20 52 e2 |D........*... R.|
00000320 fd ff ff 1a 20 00 9f e5 30 10 9f e5 00 10 80 e5 |.... ...0.......|
00000330 01 2b a0 e3 01 20 52 e2 fd ff ff 1a 0e f0 a0 e1 |.+... R.........|
00000340 24 21 00 b8 04 10 00 b0 84 00 04 40 04 02 00 b0 |$!.........@....|
00000350 ff 0f 00 00 08 02 00 b0 0c 02 00 b0 24 4f 00 00 |............$O..|
00000360 fc 0f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000370 00 00 51 e3 1f 00 00 0a 01 30 a0 e3 00 20 a0 e3 |..Q......0... ..|
00000380 01 00 50 e1 19 00 00 3a 01 02 51 e3 00 00 51 31 |..P....:..Q...Q1|
00000390 01 12 a0 31 03 32 a0 31 fa ff ff 3a 02 01 51 e3 |...1.2.1...:..Q.|
000003a0 00 00 51 31 81 10 a0 31 83 30 a0 31 fa ff ff 3a |..Q1...1.0.1...:|
000003b0 01 00 50 e1 01 00 40 20 03 20 82 21 a1 00 50 e1 |..P...@ . .!..P.|
000003c0 a1 00 40 20 a3 20 82 21 21 01 50 e1 21 01 40 20 |..@ . .!!.P.!.@ |
000003d0 23 21 82 21 a1 01 50 e1 a1 01 40 20 a3 21 82 21 |#!.!..P...@ .!.!|
000003e0 00 00 50 e3 23 32 b0 11 21 12 a0 11 ef ff ff 1a |..P.#2..!.......|
000003f0 02 00 a0 e1 0e f0 a0 e1 04 e0 2d e5 c9 1c 00 eb |..........-.....|
00000400 00 00 a0 e3 00 80 bd e8 03 50 2d e9 d7 ff ff eb |.........P-.....|
00000410 06 50 bd e8 90 02 03 e0 03 10 41 e0 0e f0 a0 e1 |.P........A.....|
00000420 03 50 2d e9 09 00 00 eb 06 50 bd e8 90 02 03 e0 |.P-......P......|
00000430 03 10 41 e0 0e f0 a0 e1 00 00 a0 e1 00 00 a0 e1 |..A.............|
00000440 00 00 a0 e1 00 00 a0 e1 00 00 a0 e1 00 00 a0 e1 |................|
00000450 00 00 51 e3 01 c0 20 e0 42 00 00 0a 00 10 61 42 |..Q... .B.....aB|
00000460 01 20 51 e2 27 00 00 0a 00 30 b0 e1 00 30 60 42 |. Q.'....0...0`B|
00000470 01 00 53 e1 26 00 00 9a 02 00 11 e1 28 00 00 0a |..S.&.......(...|
00000480 0e 02 11 e3 81 11 a0 01 08 20 a0 03 01 20 a0 13 |......... ... ..|
00000490 01 02 51 e3 03 00 51 31 01 12 a0 31 02 22 a0 31 |..Q...Q1...1.".1|
000004a0 fa ff ff 3a 02 01 51 e3 03 00 51 31 81 10 a0 31 |...:..Q...Q1...1|
000004b0 82 20 a0 31 fa ff ff 3a 00 00 a0 e3 01 00 53 e1 |. .1...:......S.|
000004c0 01 30 43 20 02 00 80 21 a1 00 53 e1 a1 30 43 20 |.0C ...!..S..0C |
000004d0 a2 00 80 21 21 01 53 e1 21 31 43 20 22 01 80 21 |...!!.S.!1C "..!|
000004e0 a1 01 53 e1 a1 31 43 20 a2 01 80 21 00 00 53 e3 |..S..1C ...!..S.|
000004f0 22 22 b0 11 21 12 a0 11 ef ff ff 1a 00 00 5c e3 |""..!.........\.|
00000500 00 00 60 42 0e f0 a0 e1 00 00 3c e1 00 00 60 42 |..`B......<...`B|
00000510 0e f0 a0 e1 00 00 a0 33 cc 0f a0 01 01 00 80 03 |.......3........|
00000520 0e f0 a0 e1 01 08 51 e3 21 18 a0 21 10 20 a0 23 |......Q.!..!. .#|
00000530 00 20 a0 33 01 0c 51 e3 21 14 a0 21 08 20 82 22 |. .3..Q.!..!. ."|
00000540 10 00 51 e3 21 12 a0 21 04 20 82 22 04 00 51 e3 |..Q.!..!. ."..Q.|
00000550 03 20 82 82 a1 20 82 90 00 00 5c e3 33 02 a0 e1 |. ... ....\.3...|
00000560 00 00 60 42 0e f0 a0 e1 04 e0 2d e5 6d 1c 00 eb |..`B......-.m...|
00000570 00 00 a0 e3 04 f0 9d e4 00 00 a0 e1 00 00 a0 e1 |................|
00000580 00 00 a0 e1 00 00 a0 e1 00 00 a0 e1 00 00 a0 e1 |................|
00000590 20 30 52 e2 20 c0 62 e2 30 02 a0 41 31 03 a0 51 | 0R. .b.0..A1..Q|
000005a0 11 0c 80 41 31 12 a0 e1 0e f0 a0 e1 20 30 52 e2 |...A1....... 0R.|
000005b0 20 c0 62 e2 11 12 a0 41 10 13 a0 51 30 1c 81 41 | .b....A...Q0..A|
000005c0 10 02 a0 e1 0e f0 a0 e1 20 30 52 e2 20 c0 62 e2 |........ 0R. .b.|
000005d0 30 02 a0 41 51 03 a0 51 11 0c 80 41 51 12 a0 e1 |0..AQ..Q...AQ...|
000005e0 0e f0 a0 e1 2d de 4d e2 00 40 a0 e3 6c 31 9f e5 |....-.M..@..l1..|
000005f0 0d 00 a0 e1 00 30 8d e5 04 30 8d e5 1c 40 8d e5 |.....0...0...@..|
00000600 bc d2 8d e5 30 40 8d e5 50 40 8d e5 d1 01 00 eb |....0@..P@......|
00000610 1c 30 9d e5 04 00 53 e1 02 00 00 0a 04 10 a0 e1 |.0....S.........|
00000620 8a 0f 8d e2 33 ff 2f e1 8a 0f 8d e2 01 10 a0 e3 |....3./.........|
00000630 ca 1a 00 eb 00 00 50 e3 46 00 00 1a 70 04 00 eb |......P.F...p...|
00000640 38 42 9d e5 3c 52 9d e5 04 00 a0 e1 05 10 a0 e1 |8B..<R..........|
00000650 46 ff ff eb 04 10 a0 e1 0e a6 a0 e3 00 b0 a0 e1 |F...............|
00000660 0a 08 a0 e3 41 ff ff eb 04 10 a0 e1 00 70 a0 e1 |....A........p..|
00000670 ec 00 9f e5 3d ff ff eb 04 10 a0 e1 00 90 a0 e1 |....=...........|
00000680 0a 08 a0 e3 5f ff ff eb 01 00 a0 e1 05 10 a0 e1 |...._...........|
00000690 36 ff ff eb 00 60 a0 e1 24 00 00 ea 3c 12 9d e5 |6....`..$...<...|
000006a0 38 02 9d e5 31 ff ff eb 8a 4f 8d e2 bc 52 9d e5 |8...1....O...R..|
000006b0 50 10 a0 e3 00 20 a0 e3 90 07 03 e0 04 00 a0 e1 |P.... ..........|
000006c0 0f e0 a0 e1 34 f0 95 e5 04 00 a0 e1 0f e0 a0 e1 |....4...........|
000006d0 08 f0 95 e5 ff 00 50 e3 01 90 89 12 12 00 00 1a |......P.........|
000006e0 0e 00 00 ea bc 42 9d e5 38 02 9d e5 d4 51 94 e5 |.....B..8....Q..|
000006f0 3c 12 9d e5 00 00 55 e3 05 00 00 0a 1b ff ff eb |<.....U.........|
00000700 04 10 a0 e1 0a 20 a0 e1 90 67 23 e0 8a 0f 8d e2 |..... ...g#.....|
00000710 35 ff 2f e1 3c 32 9d e5 01 60 86 e2 03 a0 8a e0 |5./.<2...`......|
00000720 0b 00 56 e1 ee ff ff 3a 00 60 a0 e3 01 70 87 e2 |..V....:.`...p..|
00000730 09 00 57 e1 d8 ff ff 9a 1c 30 9d e5 00 00 53 e3 |..W......0....S.|
00000740 02 00 00 0a 8a 0f 8d e2 00 10 e0 e3 33 ff 2f e1 |............3./.|
00000750 0e 36 a0 e3 33 ff 2f e1 2d de 8d e2 1e ff 2f e1 |.6..3./.-...../.|
00000760 00 d0 00 b0 ff cf 11 00 f0 40 2d e9 02 60 d3 e5 |.........@-..`..|
00000770 00 40 d3 e5 00 00 d2 e5 01 c0 d3 e5 02 50 d2 e5 |.@...........P..|
00000780 01 30 d2 e5 00 40 24 e0 03 c0 2c e0 05 60 26 e0 |.0...@$...,..`&.|
00000790 ff 00 04 e2 06 30 8c e1 03 30 90 e1 01 70 a0 e1 |.....0...0...p..|
000007a0 03 00 a0 01 f0 80 bd 08 ac 50 a0 e1 0c 30 25 e0 |.........P...0%.|
000007b0 55 30 03 e2 55 00 53 e3 28 00 00 1a a0 30 20 e0 |U0..U.S.(....0 .|
000007c0 55 30 03 e2 55 00 53 e3 24 00 00 1a a6 30 26 e0 |U0..U.S.$....0&.|
000007d0 54 30 03 e2 54 00 53 e3 20 00 00 1a 80 20 a0 e1 |T0..T.S. .... ..|
000007e0 00 31 a0 e1 20 30 03 e2 40 20 02 e2 03 20 82 e1 |.1.. 0..@ ... ..|
000007f0 80 10 04 e2 80 31 a0 e1 01 20 82 e1 10 30 03 e2 |.....1... ...0..|
00000800 ff ff 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
00000810 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
00000820 f6 89 f7 79 e5 60 c9 e0 d6 e3 ed cb 9c b0 f9 f0 |...y.`..........|
00000830 1f da d4 a4 9c d4 1b e0 e0 90 cc 85 d8 d2 e2 80 |................|
00000840
This sample NAND dump is in fact a physical NAND dump from a real industrial
product. As mentioned earlier, this sample is used as a real case scenario
to illustrate each step of the analysis process until the full file system
is extracted and recovered. Let's start with the DumpFlash tool and try to
identify the ID codes of the NAND chip. However, it fails, and the output is
shown below. This might happen because the ID codes are missing from the
NAND dump or have been changed to something strange.
So, just forget about the false output generated by DumpFlash, and go back
to the technical specification provided by the datasheet of
MT29F2G08ABAEAWP. Let's have a brief look at the 64-byte OOB of the first
"page" in particular.
00000800 ff ff 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
00000810 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
00000820 f6 89 f7 79 e5 60 c9 e0 d6 e3 ed cb 9c b0 f9 f0 |...y.`..........|
00000830 1f da d4 a4 9c d4 1b e0 e0 90 cc 85 d8 d2 e2 80 |................|
From this, two assumptions can be made. One, the first 32 bytes of the OOB
might be a constant. Two, the second 32 bytes might be ECCs. Let's verify
whether the first assumption is a fact or a mistake by checking the OOB of
the second "page", as shown below.
00001050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
00001060 8f ce f4 8b 1c 26 38 00 bd 61 a0 c7 48 c4 d3 60 |.....&8..a..H..`|
00001070 d2 1b 46 ab 53 8f 41 f0 8d 18 2b 3b 8d 54 21 50 |..F.S.A...+;.T!P|
Still unchanged. How about the first "page" of the next block then?
Well, this is a blank page that should be ignored. Drawing a conclusion
from just a few samples is really not a good idea. Let's check it properly.
import sys
# Assumption: the dump file path is given as the first argument.
input_file = open(sys.argv[1], "rb")
suspect_const = \
    b'\xff\xff\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' + \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
blank = \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' + \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
page_count = 0
diff_count = 0
while 1:
    data = input_file.read(2112)
    if len(data) == 0:
        break
    oob_first_32_bytes = data[2048:2048+32]
    page_count += 1
    # Ignore blank (erased) pages and any truncated read at the end.
    if len(data) == 2112 and oob_first_32_bytes != blank:
        if oob_first_32_bytes != suspect_const:
            diff_count += 1
input_file.close()
print("Page Count: %d, Diff Count: %d" % (page_count, diff_count))
So, it is convincing enough to say that the first 32 bytes of the OOB are
constant for all the "page". Next, let's verify the second assumption, that
the second 32 bytes of the OOB are ECCs. The suspected ECC portion of the
OOB for the first 4 "page" is shown below.
00000820 f6 89 f7 79 e5 60 c9 e0 d6 e3 ed cb 9c b0 f9 f0 |...y.`..........|
00000830 1f da d4 a4 9c d4 1b e0 e0 90 cc 85 d8 d2 e2 80 |................|
00001060 8f ce f4 8b 1c 26 38 00 bd 61 a0 c7 48 c4 d3 60 |.....&8..a..H..`|
00001070 d2 1b 46 ab 53 8f 41 f0 8d 18 2b 3b 8d 54 21 50 |..F.S.A...+;.T!P|
000018a0 01 8b bb 0a bb 54 88 50 7e 0e b9 9a c2 7b bd 40 |.....T.P~....{.@|
000018b0 dd 63 cb 9a e3 5a bc 70 65 ca 16 7a 50 dc 60 e0 |.c...Z.pe..zP.`.|
000020e0 43 a9 36 70 be b0 5e 90 1c 4f c1 ad 19 54 4d 20 |C.6p..^..O...TM |
000020f0 b8 6a 20 ba 32 c2 74 80 76 73 45 10 64 3e 38 c0 |.j .2.t.vsE.d>8.|
The output looks positive, and it provides extra information about how the
suspected ECC portion of the OOB is used by the system implementation. For
each "page", it seems the 32 bytes of the suspected ECC portion can be
divided into four ECCs of 8 bytes each. The reason is that the last 4 bits
of each 8 bytes of suspected ECC are always zero, as shown below.
f6 89 f7 79 e5 60 c9 e0
d6 e3 ed cb 9c b0 f9 f0
1f da d4 a4 9c d4 1b e0
e0 90 cc 85 d8 d2 e2 80
8f ce f4 8b 1c 26 38 00
bd 61 a0 c7 48 c4 d3 60
d2 1b 46 ab 53 8f 41 f0
8d 18 2b 3b 8d 54 21 50
01 8b bb 0a bb 54 88 50
7e 0e b9 9a c2 7b bd 40
dd 63 cb 9a e3 5a bc 70
65 ca 16 7a 50 dc 60 e0
43 a9 36 70 be b0 5e 90
1c 4f c1 ad 19 54 4d 20
b8 6a 20 ba 32 c2 74 80
76 73 45 10 64 3e 38 c0
                      ^
                      0
Saying that the last 4 bits of each ECC are zero might indicate that the
length of the ECC is 8*8-4=60 bits. As a side note, it is important to note
that the ECC length is normally expressed in bits. Let's confirm that all
the ECCs are 60 bits in size by checking that the last 4 bits of each of
them are always zero.
import sys
# Assumption: the dump file path is given as the first argument.
input_file = open(sys.argv[1], "rb")
blank = \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' + \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
page_count = 0
diff_count = 0
while 1:
    data = input_file.read(2112)
    if len(data) == 0:
        break
    oob_1st_32_bytes = data[2048:2048+32]
    oob_2nd_32_bytes = data[2048+32:2048+64]
    page_count += 1
    if len(data) == 2112 and oob_1st_32_bytes != blank:
        for i in range(4):
            ecc = oob_2nd_32_bytes[i*8:i*8+8]
            # The last 4 bits (low nibble of the 8th byte) should be zero.
            if ecc[7] & 0x0f != 0:
                diff_count += 1
input_file.close()
print("Page Count: %d, Diff Count: %d" % (page_count, diff_count))
With such a convincing result, it is reasonable to say that the ECC length
is 60 bits.
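import bchlib
BCH_POLYNOMIAL = 8219
BCH_BITS = 4
# Positional (polynomial, t) arguments, as in the bchlib release used here.
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
data = bytearray(b'\x00'*512)
ecc = bch.encode(data)
# Two hex digits per byte, so that leading zeros are not dropped.
for i in ecc:
    print("%02X" % i, end='')
print("")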
The bchlib is used for Binary BCH encoding and decoding tasks. Two
parameters have to be specified to make it work, BCH_POLYNOMIAL and
BCH_BITS. BCH_POLYNOMIAL is the primitive polynomial to be used, and
BCH_BITS is the maximum number of bit errors in the data that can be
corrected by the ECC. The details of these two parameters are discussed in
the coming section on Binary BCH implementation, as they are crucial for
uncovering the secret association between ECC and data. Now, let's take a
first glance at bchlib and study the first characteristic of Binary BCH.
The output of test_bchlib_01.py is shown below.
The BCH encoded output of 512 bytes of zeros is indeed all zeros (7 bytes).
How about 512 bytes of 0xFF then? Let's check.
import bchlib
BCH_POLYNOMIAL = 8219
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
data = bytearray(b'\xFF'*512)
ecc = bch.encode(data)
# Two hex digits per byte, so that leading zeros are not dropped.
for i in ecc:
    print("%02X" % i, end='')
print("")
The output is not all 0xFF, and that makes sense. Otherwise, if 512 bytes
of 0xFF were BCH encoded as 7 bytes of 0xFF, it would be hard to
differentiate the result from a blank "page". Now, let's proceed to the
second characteristic, which concerns zero padding. The question now is
what happens if 32 bytes of zeros are prepended to the 512 bytes of 0xFF?
Let's check it.
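import bchlib
BCH_POLYNOMIAL = 8219
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
# Reconstructed from the text: 32 bytes of zeros in front of 512 bytes of 0xFF.
data = bytearray(b'\x00'*32 + b'\xFF'*512)
ecc = bch.encode(data)
for i in ecc:
    print("%02X" % i, end='')
print("")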
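import bchlib
BCH_POLYNOMIAL = 8219
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
# Reconstructed from the text: 32 bytes of zeros in front of 512 bytes of 0xFF.
data1 = bytearray(b'\x00'*32 + b'\xFF'*512)
ecc1 = bch.encode(data1)
data2 = bytearray(b'\xFF'*512)
ecc2 = bch.encode(data2)
print("Zeros Prepended:")
for i in ecc1:
    print("%02X" % i, end='')
print("")
print("Nothing Prepended:")
for i in ecc2:
    print("%02X" % i, end='')
print("")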
As expected, both BCH encoded outputs are exactly the same, and the output
is shown below.
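Why prepended zeros change nothing can be seen without bchlib. In a
systematic binary BCH code, the parity bits are the remainder of the
message polynomial (shifted by the number of parity bits) divided by the
generator polynomial, so leading zero coefficients contribute nothing to
the division. The following is a minimal pure-Python sketch of that idea,
not bchlib's implementation. It assumes the bits of each byte are fed MSB
first, so its parity values are not guaranteed to match bchlib's output
byte for byte, and all function names are made up for illustration.

```python
def gf_mul(a, b, prim, m):
    """Multiply two elements of GF(2^m) defined by the primitive polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> m:              # degree reached m: reduce modulo prim
            a ^= prim
    return result


def minimal_poly(beta, prim, m):
    """Minimal polynomial of beta over GF(2), packed into an integer."""
    conjugates = []
    c = beta
    while c not in conjugates:  # beta, beta^2, beta^4, ...
        conjugates.append(c)
        c = gf_mul(c, c, prim, m)
    poly = [1]                  # coefficients, lowest power first
    for c in conjugates:        # multiply the running product by (x + c)
        nxt = [0] * (len(poly) + 1)
        for i, coef in enumerate(poly):
            nxt[i + 1] ^= coef
            nxt[i] ^= gf_mul(coef, c, prim, m)
        poly = nxt
    return sum(coef << i for i, coef in enumerate(poly))


def clmul(a, b):
    """Carry-less multiplication, i.e. polynomial product over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def generator_poly(prim, m, t):
    """Product of the distinct minimal polynomials of alpha^1 .. alpha^(2t)."""
    g, seen, power = 1, set(), 1
    for _ in range(2 * t):
        power = gf_mul(power, 2, prim, m)   # next power of alpha (alpha = x)
        mp = minimal_poly(power, prim, m)
        if mp not in seen:
            seen.add(mp)
            g = clmul(g, mp)
    return g


def bch_parity(data, g):
    """Remainder of data(x) * x^deg(g) divided by g(x), bits fed MSB first."""
    deg = g.bit_length() - 1
    mask = (1 << deg) - 1
    rem = 0
    for byte in data:
        for bit in range(7, -1, -1):
            feedback = ((rem >> (deg - 1)) & 1) ^ ((byte >> bit) & 1)
            rem = (rem << 1) & mask
            if feedback:
                rem ^= g & mask
    return rem


g = generator_poly(0x201B, 13, 4)
ones = b'\xff' * 512
print("parity bits                 :", g.bit_length() - 1)
print("all-zeros data, zero parity :", bch_parity(bytes(512), g) == 0)
print("zeros prepended, same parity:",
      bch_parity(bytes(32) + ones, g) == bch_parity(ones, g))
```

While only zero bits have been fed, the remainder register stays at zero,
which is why any amount of leading zero padding is invisible in the ECC.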
One important point should be noted here. If the input data is bit order
reversed, the BCH encoded output will be in bit order reversed form as
well. Thanks to bchlib for implementing this in its default mode. Now,
another question arises: is it possible to keep the original bit order of
the input data that is going to be BCH encoded? Yes, it is possible by
reversing the bit order of the input data before passing it to the bchlib
encoder, and of course the bit order of the BCH encoded output has to be
reversed accordingly. Let's show it by example.
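import bchlib
BCH_POLYNOMIAL = 8219
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
# 512 bytes of 0xFF with the last byte changed to 0xAA (see the text below).
data = bytearray(b'\xFF'*511 + b'\xAA')
# Assumption: the two loops below are reconstructed. Walking the bytes back
# to front reverses the whole bit stream; the [::-1] afterwards restores the
# byte order, so each byte ends up bit-reversed in place.
data_reverse_bit = b''
for i in reversed(data):
    data_reverse_bit += bytes([int('{:08b}'.format(i)[::-1], 2)])
data_reverse_bit = data_reverse_bit[::-1]
ecc = bch.encode(data_reverse_bit)
ecc_reverse_bit = b''
for i in reversed(ecc):
    ecc_reverse_bit += bytes([int('{:08b}'.format(i)[::-1], 2)])
ecc_reverse_bit = ecc_reverse_bit[::-1]
for i in ecc_reverse_bit:
    print("%02X" % i, end='')
print("")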
In this test_bchlib_05.py, the last byte of the 512 bytes of input data is
purposely changed from 0xFF to 0xAA to avoid symmetry in the data
( 0b11111111 after bit order reversing is still 0b11111111 ). Now, let's
see the output.
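The symmetry remark is easy to check with a small helper. This is a
minimal sketch, assuming the reversal is applied within each byte (bit 7
swaps with bit 0, and so on); the function names are made up for
illustration.

```python
def reverse_bits(byte):
    """Reverse the bit order of a single byte (bit 7 <-> bit 0, and so on)."""
    return int('{:08b}'.format(byte)[::-1], 2)


def reverse_bits_in_bytes(data):
    """Apply the per-byte bit reversal to every byte, keeping the byte order."""
    return bytes(reverse_bits(b) for b in data)


# 0xFF is symmetric and hides the reversal, while 0xAA is not.
print("0xFF -> 0x%02X" % reverse_bits(0xFF))
print("0xAA -> 0x%02X" % reverse_bits(0xAA))
```

A buffer made only of 0xFF bytes would therefore give the same ECC with or
without the reversal, which is why one byte is changed to 0xAA in the test.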
import sys
# Assumption: the dump file path is given as the first argument.
input_file = open(sys.argv[1], "rb")
oob_const = \
    b'\xff\xff\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' + \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
zeros_ecc = b'\x00\x00\x00\x00\x00\x00\x00\x00'
page_cnt = 0
positive_cnt = 0
while 1:
    data = input_file.read(2112)
    if len(data) == 0:
        break
    oob_1st_32_bytes = data[2048:2048+32]
    oob_2nd_32_bytes = data[2048+32:2048+32+32]
    if len(data) == 2112 and oob_1st_32_bytes == oob_const:
        # Look for a page in which at least one of the four ECCs is all zeros.
        for i in range(0, 4):
            ecc = oob_2nd_32_bytes[i*8:i*8+8]
            if ecc == zeros_ecc:
                positive_cnt += 1
                print("Page Num: %d, Address: 0x%X" % (page_cnt, page_cnt*2112))
                break
    # Stop at the first page found.
    if positive_cnt == 1:
        break
    page_cnt += 1
input_file.close()
print("Completed")
Nice, the first item found is at address 0x84000. Let's display the full
"page" in hex view.
000841d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000841e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000841f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084220 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084230 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084240 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084260 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084270 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084280 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084290 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000842a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000842b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000842c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000842d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000842e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000842f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084300 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084310 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084330 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084340 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084350 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084360 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084370 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084380 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084390 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000843a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000843b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000843c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000843d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000843e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000843f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084400 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084410 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084420 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084430 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084440 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084450 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084460 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084470 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084480 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000844a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000844b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000844c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000844d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000844e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000844f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084500 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084510 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084520 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084530 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084540 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084550 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084560 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084570 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084580 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084590 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000845a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000845b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000845c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000845d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000845e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000845f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084600 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084610 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084620 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084630 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084640 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084650 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084660 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084670 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084680 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084690 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000846a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000846b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000846c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000846d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000846e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000846f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084700 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084710 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084720 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084730 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084740 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084750 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084760 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084770 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084780 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084790 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000847a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000847b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000847c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000847d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000847e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000847f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084800 ff ff 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
00084810 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
00084820 3b 8d c6 e5 19 b2 24 50 00 00 00 00 00 00 00 00 |;.....$P........|
00084830 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00084840
So, what can be deduced from this "page"? Well, it is almost certain that
the 2048-byte data portion of a "page" is divided into four parts of 512
bytes each, which I named "sub-page" at the start of this article. In this
"page", the first "sub-page" spans 0x84000 to 0x841ff and contains non-zero
data, with a BCH encoded ECC of 3b8dc6e519b22450. The following three
"sub-page" contain all-zeros data, each with an all-zeros BCH encoded ECC.
In other words, the 512 bytes of zeros in each of these three "sub-page"
are either BCH encoded directly, or padded with a certain number of zeros
ONLY, in order to generate an all-zeros ECC. Hence, once the other BCH
encoding parameters are unveiled in the following section, recovering the
secret association between ECC and data becomes straightforward. So, the
second, third, and fourth "sub-page" in a "page" are clear now, and the
same usually holds for all the other "page". However, the padding scheme of
the first "sub-page" is still uncertain, unless a "page" with four
all-zeros ECCs can be found. Let's try it.
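The layout deduced above can be written down as a small helper that pairs
each "sub-page" with its ECC. This is a minimal sketch under the working
assumption from the analysis so far, namely that the i-th 8-byte ECC in the
second half of the OOB belongs to the i-th 512-byte "sub-page". The
function name and the synthetic test page are made up for illustration; the
0x01 filler merely stands in for the real non-zero data of the first
"sub-page".

```python
SUBPAGE_SIZE = 512
ECC_SIZE = 8
OOB_ECC_OFFSET = 32     # the ECCs occupy the second 32 bytes of the OOB


def split_subpages(raw_page):
    """Return four (sub_page_data, ecc) pairs for one raw 2112-byte page."""
    data, oob = raw_page[:2048], raw_page[2048:2112]
    pairs = []
    for i in range(4):
        sub = data[i * SUBPAGE_SIZE:(i + 1) * SUBPAGE_SIZE]
        start = OOB_ECC_OFFSET + i * ECC_SIZE
        pairs.append((sub, oob[start:start + ECC_SIZE]))
    return pairs


# Synthetic page shaped like the one found at 0x84000: one non-zero
# sub-page followed by three all-zeros sub-pages.
oob = (b'\xff\xff\x00\x00' + b'\xff' * 28 +
       bytes.fromhex('3b8dc6e519b22450') + bytes(24))
page = b'\x01' * 512 + bytes(1536) + oob

for i, (sub, ecc) in enumerate(split_subpages(page)):
    print("sub-page %d: data all zeros = %s, ecc = %s"
          % (i, not any(sub), ecc.hex()))
```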
import sys
# Assumption: the dump file path is given as the first argument.
input_file = open(sys.argv[1], "rb")
oob_const = \
    b'\xff\xff\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' + \
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
zeros_ecc = \
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' + \
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
page_cnt = 0
while 1:
    data = input_file.read(2112)
    if len(data) == 0:
        break
    oob_1st_32_bytes = data[2048:2048+32]
    oob_2nd_32_bytes = data[2048+32:2048+32+32]
    if len(data) == 2112 and oob_1st_32_bytes == oob_const:
        # All four ECCs of the page have to be zeros.
        if oob_2nd_32_bytes[0:32] == zeros_ecc[0:32]:
            print("Page Num: %d, Address: 0x%X" % (page_cnt, page_cnt*2112))
            break
    page_cnt += 1
input_file.close()
print("Completed")
Let's look for any such "page". However, the output is unexpected, as
shown below.
Anyhow, let's leave the unsolved part for now and get back to it in the
next section. Now, let's have a brief hacker's overview of the Binary BCH
implementation, yes, solely from a hacker's perspective, not an academic
one. In general, the BCH codec needs a primitive polynomial in order to
derive a generator polynomial to be used for code generation. The Galois
Field order determines the number of primitive polynomials that can be used
by the BCH codec. A polynomial can be represented by an integer, or
equivalently in binary form. The set bits of the integer represent the
non-zero coefficients of the selected primitive polynomial, with each bit
position giving the power of the corresponding term. Sounds confusing?
Let's have an example.
0x201B
   |
   V
0b0010000000011011
   |
   V
0b  0  0  1  0  0  0  0  0  0  0  0  1  1  0  1  1
    ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
    |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
   15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Yes, each set bit position gives the exponent of one term, and the
greatest set bit position is defined as the degree of the primitive
polynomial. Again, for the case of 0x201B, the degree is 13. Most of the
time, the degree is written as m and represents the Galois Field order,
so the case of 0x201B can be expressed as m=13. In order to protect a
block of data, its size in bits should be less than 2^m (strictly, the
data bits plus the parity bits must fit in the code length of 2^m - 1
bits). For example, to protect data with a size of 512 bytes, the data
length in bits is 512*8=4096. This number is normally known as k, so it is
more appropriate to write it as k=4096. So, 2^m should be greater than
4096, which means m should be greater than log(4096)/log(2)=12, and so m
should be at least 13. Again, for the case of 0x201B, since its m is 13,
it is suitable for protecting data of 512 bytes in size. What is the hex
number 0x201B in decimal ? It is 8219, sounds familiar ? Yes, it was used
in the "first glance" bchlib section to define the variable
BCH_POLYNOMIAL.
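The mapping between an integer and its polynomial can be checked with a
few lines of Python. This is only a minimal sketch of the bit reading
described above; the helper names (poly_exponents, poly_to_text,
min_degree) are made up for illustration and are not part of bchlib.

```python
def poly_exponents(value):
    """Return the positions of the set bits of value, highest first."""
    top = value.bit_length() - 1
    return [bit for bit in range(top, -1, -1) if (value >> bit) & 1]


def poly_to_text(value):
    """Render the integer as a polynomial string, one term per set bit."""
    terms = ["1" if e == 0 else "x^%d" % e for e in poly_exponents(value)]
    return " + ".join(terms)


def min_degree(data_bits):
    """Smallest m such that 2^m is greater than data_bits."""
    return data_bits.bit_length()


print(poly_to_text(0x201B))      # x^13 + x^4 + x^3 + x^1 + 1
print(poly_exponents(0x201B)[0]) # 13, the degree m
print(0x201B)                    # 8219
print(min_degree(512 * 8))       # 13
```

The highest exponent is the degree m, and min_degree() is the "2^m should
be greater than k" rule in integer form.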
When talking about data protection, one must talk about the protection
strength. The protection strength describes how many bit errors the data
can tolerate, when something goes wrong, and still be recovered back to
the correct state. The strength is normally known as t. So, when someone
mentions t=4, it means the ECC can correct up to 4 bit errors. Alright,
m, k, and t are clear now. Let's proceed to the discussion about the
length of the ECC, which is more commonly named the size of the parity
bits. For BCH, the size of the parity bits is equal to m*t. Thus, given
m=13, k=4096, and t=4, since 2^m=2^13=8192 is greater than k=4096, it is
appropriate, with no discrepancy at all, to generate a BCH encoded ECC
with m*t=13*4=52 parity bits. Remember the ECC size found from the NAND
dump analysis in the previous part ? Yes, it is 60 bits (8 bytes minus
the last 4 bits of zeros). Well, the boring stuff is getting interesting
now. Let's see what can be deduced with this little clue. The data size
to be protected is 512 bytes, which is
4096 bits. The m should be at least 13, and so 2^m=2^13=8192, which is
sufficient to protect the 4096 bits of data. As the number of parity bits
is 60, its factors are 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, and 60.
Given m*t=60 and m>=13, the possible combinations of (m, t) are (15, 4),
(20, 3), (30, 2), and (60, 1). As t=4 is a common choice in the majority
of BCH implementations of ECC, the combination of m=15 and t=4 is the most
probable. The other three combinations, (20, 3), (30, 2), and (60, 1), are
not only unrealistic, but also terrible overkill. At this stage, assuming
m=15 and t=4, which primitive polynomial should be selected ? Let's refer
to the primitive polynomial list as stated in [4]. For degree 15, the
candidates are shown below.
x^15 + x^1 + 1
x^15 + x^4 + 1
x^15 + x^7 + 1
x^15 + x^7 + x^6 + x^3 + x^2 + x^1 + 1
x^15 + x^10 + x^5 + x^1 + 1
x^15 + x^10 + x^5 + x^4 + 1
x^15 + x^10 + x^5 + x^4 + x^2 + x^1 + 1
x^15 + x^10 + x^9 + x^7 + x^5 + x^3 + 1
x^15 + x^10 + x^9 + x^8 + x^5 + x^3 + 1
x^15 + x^11 + x^7 + x^6 + x^2 + x^1 + 1
x^15 + x^12 + x^3 + x^1 + 1
x^15 + x^12 + x^5 + x^4 + x^3 + x^2 + 1
x^15 + x^12 + x^11 + x^8 + x^7 + x^6 + x^4 + x^2 + 1
x^15 + x^14 + x^13 + x^12 + x^11 + x^10 + x^9 + x^8 + x^7 + x^6 + \
x^5 + x^4 + x^3 + x^2 + 1
Let's start with the first candidate, x^15 + x^1 + 1. In bit form it is
0b1000000000000011, which is 32771 in decimal.
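The two bits of arithmetic above, narrowing down (m, t) from the parity
size and turning a candidate polynomial into the integer expected by the
codec, can be sketched as follows. The helper names are hypothetical, and
the rule "2^m must exceed the data bits" is the simplified one used in
this paper.

```python
def mt_candidates(parity_bits, data_bits):
    """List every (m, t) with m*t == parity_bits and 2^m > data_bits."""
    min_m = data_bits.bit_length()  # smallest m with 2^m > data_bits
    return [(m, parity_bits // m)
            for m in range(min_m, parity_bits + 1)
            if parity_bits % m == 0]


def exponents_to_int(exponents):
    """Set one bit per exponent, e.g. x^15 + x^1 + 1 is [15, 1, 0]."""
    value = 0
    for e in exponents:
        value |= 1 << e
    return value


print(mt_candidates(60, 512 * 8))    # [(15, 4), (20, 3), (30, 2), (60, 1)]
print(exponents_to_int([15, 1, 0]))  # 32771
print(exponents_to_int([15, 4, 0]))  # 32785
```

Every other candidate in the list can be converted the same way when the
first one does not produce a matching ECC.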
import bchlib
import binascii
BCH_POLYNOMIAL = 32771
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
page = input_file.read(2112)
ECC = page[2048+32:2048+32+32]
Sub-page: 0
ECC Ori: F689F779E560C9E0
ECC Generated: 8DE136AAF3E03F90
Wrong !
Sub-page: 1
ECC Ori: D6E3EDCB9CB0F9F0
Sub-page: 3
ECC Ori: E090CC85D8D2E280
ECC Generated: B36A94B537E14BA0
Wrong !
Completed
None of the four "sub-page" generates the correct ECC. So, the "sub-page"
is probably padded with a certain number of zeros before getting BCH
encoded. Let's try the BCH encoding again with the "sub-page" padded by
1 to 32 bytes of zeros.
import bchlib
import binascii
BCH_POLYNOMIAL = 32771
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
page = input_file.read(2112)
ECC = page[2048+32:2048+32+32]
found_flag = 0
Let's go and run the check. Hola, the output is interesting, as shown
below.
Sub-page: 0
ECC Ori: F689F779E560C9E0
Wrong !
Sub-page: 1
ECC Ori: D6E3EDCB9CB0F9F0
ECC Generated: D6E3EDCB9CB0F9F0
Match ! Zeros padded number: 24
Sub-page: 2
ECC Ori: 1FDAD4A49CD41BE0
ECC Generated: 1FDAD4A49CD41BE0
Match ! Zeros padded number: 24
Sub-page: 3
ECC Ori: E090CC85D8D2E280
ECC Generated: E090CC85D8D2E280
Match ! Zeros padded number: 24
Completed
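The padding search itself is a plain loop, and it can be sketched
independently of the codec. In the sketch below the encoder is passed in
as a function; with the bchlib setup used in this paper it would
presumably be bch.encode, but that is an assumption. The demo encoder is
a CRC32 stand-in from zlib, NOT BCH, and is there only so the loop can be
exercised without the real dump.

```python
import zlib


def find_zero_padding(subpage, ecc_ori, encode, max_pad=32):
    """Return how many zero bytes make encode() reproduce ecc_ori, or None."""
    for pad_len in range(1, max_pad + 1):
        if encode(subpage + b"\x00" * pad_len) == ecc_ori:
            return pad_len
    return None


def demo_encode(data):
    # Stand-in for the real encoder (CRC32, not BCH), for demonstration only.
    return zlib.crc32(data).to_bytes(4, "big")


# Synthetic 512-byte "sub-page" whose check bytes were made with 24 zeros.
subpage = bytes(range(256)) * 2
ecc_ori = demo_encode(subpage + b"\x00" * 24)
print(find_zero_padding(subpage, ecc_ori, demo_encode))  # 24
```

A return value of None corresponds to the "Wrong !" case, where no amount
of zero padding in the tested range reproduces the original ECC.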
So, among the four "sub-page" in a "page", the second, third, and fourth
"sub-page" each need to be padded with 24 bytes of zeros before being BCH
encoded in order to generate the correct ECC. However, the first
"sub-page" is still cryptic, and needs a bit of tweaking. Since the rest
of the "sub-page" are padded with 24 bytes of zeros, it is very likely
that the first "sub-page" is padded with 24 bytes of non-zero data
instead. It should be something related to some kind of "metadata" which
describes the "page" itself. Remember the first 32 bytes of OOB ?
Let's check it again.
The two bytes of zeros at 0x802 and 0x803 are a little bit strange. So,
is it possible that the first few bytes of the 24 bytes of zeros padding
are replaced by some bytes from here ? Let's try to replace the 24 bytes
of zeros padding byte by byte, until the entire 24 bytes of padding become
ffff0000ffffffffffffffffffffffffffffffffffffffff.
import bchlib
import binascii
BCH_POLYNOMIAL = 32771
BCH_BITS = 4
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
page = input_file.read(2112)
subpage = page[0:512]
ECC = page[2048+32:2048+32+8]
paddingx = \
b'\xFF\xFF\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF' + \
b'\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF'
padding0 = \
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' + \
b'\x00\x00\x00\x00\x00\x00\x00\x00'
Let's run it. Bingo, the padding pattern is found, as shown below.
Completed
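The byte-by-byte replacement can be pictured as a list of 25 candidate
paddings, where candidate n takes its first n bytes from the start of the
OOB and keeps zeros for the rest. The generator below is only a sketch of
that idea with a hypothetical name; each candidate would then be appended
to the first "sub-page" and BCH encoded for comparison.

```python
def padding_candidates(oob_prefix, pad_len=24):
    """Yield (n, padding): first n bytes from oob_prefix, the rest zeros."""
    for n in range(pad_len + 1):
        yield n, oob_prefix[:n] + b"\x00" * (pad_len - n)


# First 24 bytes of the OOB as seen in the dump: ff ff 00 00 ff ff ...
oob_prefix = b"\xff\xff\x00\x00" + b"\xff" * 20
for n, padding in padding_candidates(oob_prefix):
    if n <= 5:
        print(n, padding.hex())
```

Because bytes 2 and 3 of the OOB are already zero, candidates 2, 3, and 4
are identical, so a match reported at n=2 cannot tell whether those two
zero bytes are taken from the OOB or are plain zero padding.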
Perfect. Now, the secret association between ECC and data is fully
unveiled. In conclusion, for the four "sub-page" in a "page", the
first "sub-page" has to be padded with 24 bytes of padding, which comprise
2 bytes of 0xFF followed by 22 bytes of zeros, before getting BCH encoded
to generate the correct ECC. For the second, third, and fourth
"sub-page", a padding of 24 bytes of all zeros is needed to generate the
correct ECC. So, by doing the BCH decoding in the same manner for every
"page" of the entire NAND dump, all the bit errors get fixed. After that,
the 64 bytes of OOB in each "page" should be removed to generate a new
NAND dump with contiguous data, "page" by "page", without any bit errors.
I name it cawan_output.bin, as shown below.
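import bchlib
BCH_POLYNOMIAL = 32771
BCH_BITS = 4
# Same codec setup as in the earlier scripts (m=15, t=4).
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
# Padding for the first "sub-page": 0xFF 0xFF followed by 22 zeros.
pad_sub0 = \
b'\xFF\xFF\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' + \
b'\x00\x00\x00\x00\x00\x00\x00\x00'
# Padding for the second, third, and fourth "sub-page": 24 zeros.
pad_subx = \
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' + \
b'\x00\x00\x00\x00\x00\x00\x00\x00'
# The input file name is hypothetical; the output name is from the text.
input_file = open("nand_dump.bin", "rb")
output_file = open("cawan_output.bin", "wb")
count = 0
error_cnt = 0
while 1:
    page = input_file.read(2112)
    if len(page) != 2112:
        break
    for i in range(0, 4):
        # 512 bytes of data and its 8 bytes of ECC in the second half of OOB.
        data, ecc = page[512*i:512*i+512], page[2048+32+i*8:2048+32+i*8+8]
        if i == 0:
            data_padded = data + pad_sub0
        else:
            data_padded = data + pad_subx
        data_padded = bytearray(data_padded)
        bitflips = bch.decode_inplace(data_padded, ecc)
        if bitflips > 0:
            error_cnt += 1
        # Always write 512 bytes so the output stays aligned, even when
        # the "sub-page" is reported as uncorrectable (bitflips == -1).
        output_file.write(data_padded[:512])
    count += 1
input_file.close()
output_file.close()
print("Sub-page with error count: %d\n" % error_cnt)
print("Completed.")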
Well, there are 20 "sub-page" with bit errors that have been fixed with
the ECC, as shown below.
Completed.
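The page layout that the decoding loop relies on can be written down as a
small helper. This is a sketch of the layout deduced in this paper for
this particular dump (2048 bytes of data, 64 bytes of OOB, four 8-byte
ECC fields starting at OOB offset 32); other devices will differ, and the
function name is made up for illustration.

```python
PAGE_DATA = 2048           # data bytes per "page"
OOB_SIZE = 64              # OOB bytes per "page"
SUBPAGE = 512              # data bytes per "sub-page"
ECC_OFFSET = PAGE_DATA + 32
ECC_LEN = 8


def split_page(page):
    """Split one 2112-byte raw page into four (data, ecc) pairs."""
    if len(page) != PAGE_DATA + OOB_SIZE:
        raise ValueError("expected a %d-byte page" % (PAGE_DATA + OOB_SIZE))
    pairs = []
    for i in range(PAGE_DATA // SUBPAGE):
        data = page[SUBPAGE * i:SUBPAGE * (i + 1)]
        ecc = page[ECC_OFFSET + ECC_LEN * i:ECC_OFFSET + ECC_LEN * (i + 1)]
        pairs.append((data, ecc))
    return pairs


# Synthetic page: sub-page i is filled with byte i, OOB tail is 0..31.
page = b"".join(bytes([i]) * SUBPAGE for i in range(4))
page += bytes(32) + bytes(range(32))
for data, ecc in split_page(page):
    print(data[:2].hex(), ecc.hex())
```

Dropping the OOB is then just a matter of keeping the four 512-byte data
parts, which is what the script above does when it writes the output.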
Armed with knowledge, any suitable common tool can be weaponized for
hacking purposes. Don't be silly and stubbornly believe that a
proprietary, special, commercial, or even automated tool can work as
expected without requiring any knowledge in the field. So, the
firmware is ready right now; let's proceed to the firmware analysis.
As a common approach, let's begin with binwalk and hope to strike gold,
or for money to grow on trees, or both. Let's see the binwalk output as
shown below.
cawan% binwalk cawan_output.bin
The header really makes sense, with the UBI magic at 0x600000, version 1,
and an erase count of 1, which means it is a new NAND flash, or at least
it has just been reformatted. After that, the volume ID header is 0x800,
or 2048 in decimal, away from 0x600000, which is a common layout for
NAND flash. One important thing to emphasize here: the newly generated
NAND dump is defined as a logical NAND dump, in which the OOB is removed
and the size of each "page" is 2048 bytes. So, it is really common to
locate the volume ID header one "page" away from the UBI header. Then,
the actual data is 0x1000, or 4096 in decimal, away from 0x600000;
in other words, it is another "page" away from the volume ID header.
This is also a common layout for NAND flash. So, is there such a thing
as a free lunch ? Let's try to extract it with binwalk by passing in the
well known parameters, -Me. The lengthy output seems convincing. Let's get
into the directory hosting the extracted files, as shown below.
cawan% cd _cawan_output.bin.extracted
cawan% ls
204974 _204974.extracted 600000.ubi ED074.lzo ubifs-root
cawan% cd ubifs-root
cawan% ls
1941946494 3823591600
Two more directories are found. Let's check each directory by using the
tree command.
15 directories, 1 file
1 directory, 0 files
cawan% cd 1941946494
cawan% cd ubifs
cawan% ls
bin dev etc home lib linuxrc mnt proc root sbin sys tmp usr \
var work
cawan% cd etc
cawan% ls
fstab HOSTNAME inittab pointercal profile~ ts.conf
group inetd.conf networks ppp services vsftpd.conf
It really takes a while to generate ubi.bin. Now, let's verify the UBI
header, volume ID header, and the start of data in hex view.
*
00001800
Let's interpret the UBI header with its data structure as shown below.
struct ubi_ec_hdr {
    __be32 magic;
    __u8 version;
    __u8 padding1[3];
    __be64 ec;
    __be32 vid_hdr_offset;
    __be32 data_offset;
    __be32 image_seq;
    __u8 padding2[32];
    __be32 hdr_crc;
};
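The structure above is 64 bytes of big-endian fields, so it maps directly
onto a Python struct format string. The sketch below parses a header
built in memory; the sample values (erase count 1, offsets 2048 and 4096,
image sequence 1941946494) are taken from this paper, while the zero CRC
is only a placeholder and is not verified here.

```python
import struct
from collections import namedtuple

# Big-endian: magic, version, padding1, ec, vid_hdr_offset, data_offset,
# image_seq, padding2, hdr_crc (64 bytes in total).
UBI_EC_HDR = struct.Struct(">4sB3sQIII32sI")
UbiEcHdr = namedtuple(
    "UbiEcHdr",
    "magic version padding1 ec vid_hdr_offset data_offset "
    "image_seq padding2 hdr_crc")


def parse_ec_hdr(buf):
    """Parse the first 64 bytes of an erase block as a UBI EC header."""
    hdr = UbiEcHdr(*UBI_EC_HDR.unpack(buf[:UBI_EC_HDR.size]))
    if hdr.magic != b"UBI#":
        raise ValueError("no UBI magic at the start of the block")
    return hdr


# Sample header for demonstration; hdr_crc=0 is a placeholder value.
sample = UBI_EC_HDR.pack(b"UBI#", 1, bytes(3), 1, 2048, 4096,
                         1941946494, bytes(32), 0)
hdr = parse_ec_hdr(sample)
print(UBI_EC_HDR.size)                       # 64
print(hdr.version, hdr.ec)                   # 1 1
print(hdr.vid_hdr_offset, hdr.data_offset)   # 2048 4096
print(hdr.image_seq)                         # 1941946494
```

With this layout the image sequence sits at byte offsets 24 to 27 of the
header.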
Now, let's check how many UBIFS exist in the UBI image.
count = 0
img_seq = b''
tmp_seq = b''
while 1:
    block = input_file.read(2048*64)
    if len(block) != 2048*64:
        break
    if block[0:4] == b'\x55\x42\x49\x23':
        img_seq = block[24:28]
        if img_seq != tmp_seq:
            print("0x", end='')
            print(img_seq.hex().upper(), end=' -> ')
            print("%d" % int(img_seq.hex(),16))
            tmp_seq = img_seq
    count += 1
print("\nCompleted.")
Completed.
data_inuse = 0
UBI_hdr = b'\x55\x42\x49\x23'
VID_hdr = b'\x55\x42\x49\x21'
while 1:
    block = input_file.read(2048*64)
    if len(block) != 2048*64:
        break
    if block[0:4] == UBI_hdr and block[2048:2048+4] == VID_hdr:
        data_inuse += 2048*64
print("Data size in use: %d" % data_inuse)
print("\nCompleted.")
Completed.
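The two scans above can also be combined into a single pass that collects
each distinct image sequence once, instead of printing on every change
between neighbouring blocks. This is a sketch with hypothetical names
(scan_ubi, make_peb), tested here on synthetic erase blocks rather than
on the real image.

```python
import io
import struct

PEB_SIZE = 2048 * 64   # 64 pages of 2048 bytes per erase block


def scan_ubi(stream, peb_size=PEB_SIZE, vid_offset=2048):
    """Return (distinct image sequences, bytes in blocks with a VID header)."""
    seqs, in_use = [], 0
    while True:
        block = stream.read(peb_size)
        if len(block) != peb_size:
            break
        if block[0:4] != b"UBI#":
            continue
        seq = struct.unpack(">I", block[24:28])[0]
        if seq not in seqs:
            seqs.append(seq)
        if block[vid_offset:vid_offset + 4] == b"UBI!":
            in_use += peb_size
    return seqs, in_use


def make_peb(seq, mapped):
    """Build a synthetic erase block for the demo (not real UBI data)."""
    block = bytearray(b"\xff" * PEB_SIZE)
    block[0:4] = b"UBI#"
    block[24:28] = struct.pack(">I", seq)
    if mapped:
        block[2048:2052] = b"UBI!"
    return bytes(block)


image = (make_peb(1, True) + make_peb(2, False) + make_peb(1, True)
         + b"\xff" * PEB_SIZE)
print(scan_ubi(io.BytesIO(image)))   # ([1, 2], 262144)
```

Blocks without the UBI magic are skipped, and blocks with an EC header
but no VID header are counted as known but not in use.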
cawan% ls
ubi.bin
cawan% cd ubi.bin
cawan% ls
img-1240815858_vol-data.ubifs img-2673978231_vol-backup.ubifs
img-1941946494_vol-ubifs.ubifs img-3823591600_vol-app.ubifs
cawan% ls -la
total 145212
drwxrwxr-x 2 user user 4096 May 29 16:46 .
drwxrwxr-x 3 user user 4096 May 29 16:46 ..
-rw-rw-r-- 1 user user 100438016 May 29 16:46 img-1240815858_vol-data.ubifs
-rw-rw-r-- 1 user user 11935744 May 29 16:46 img-1941946494_vol-ubifs.ubifs
-rw-rw-r-- 1 user user 27299840 May 29 16:46 img-2673978231_vol-backup.ubifs
-rw-rw-r-- 1 user user 9015296 May 29 16:46 img-3823591600_vol-app.ubifs
Let's try to use the UBI Reader toolkit again to extract files from UBIFS.
Let's start from img-1941946494_vol-ubifs.ubifs as shown below.
cawan% cd etc
cawan% cat fstab
cawan% ls -la fstab
-rw-rw-r-- 1 user user 186 Mar 30 2015 fstab
cawan% cat fstab | xxd
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000b0: 0000 0000 0000 0000 0000 ..........
Sorry, fatal error this time, nothing generated. Since this is the NAND
dump from a real device which is fully functional, and all the bit errors
have been fixed, the UBIFS should work accordingly. Let's proceed by
another route, emulating the NAND chip and its association with MTD
by using nandsim. In most of the hacking literature, when talking about
nandsim, the standard conventional approach is to dd the entire UBI image
into the MTD device emulated by nandsim, modprobe the ubi driver with some
parameters, and leave the ubi driver on its own to deal with the UBI image
blob. Let's put in a few words of comment about this. As mentioned
earlier, the UBI erase block exists for the wear-leveling implementation
in the UBI layer. Since the UBI erase blocks are in logical form, they are
normally not in sequence physically, which is the case for this NAND dump.
So, instead of relying on the UBI driver to do extra work for the block
remapping operation, which has a high chance of causing errors in all
regards under emulation mode, it is better to pre-process the UBI image
offline by using ubireader_extract_images first. The output of
ubireader_extract_images is already in UBIFS form, which is an actual
file system like squashfs, jffs2, yaffs2, or cramfs. In other words,
by dealing with UBIFS directly, the chance of getting errors is
minimized. Anyway, there is no harm in going with the standard
conventional approach first. Let's get started and grab the low-hanging
fruit. In order to emulate a NAND chip, one should know the ID codes of
the chip. By referring to the datasheet of MT29F2G08ABAEAWP, the first
4 bytes are 0x2c, 0xda, 0x90, and 0x95. With such info, it is ready for
nandsim.
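As a rough illustration of how the four ID bytes are handed to nandsim,
the snippet below only builds the command line as a string and does not
run it. The parameter names first_id_byte to fourth_id_byte are the
nandsim module parameters as I recall them, so treat them as an
assumption and confirm with modinfo nandsim on the target kernel.

```python
def nandsim_command(id_bytes):
    """Build a modprobe command line for nandsim from four chip ID bytes."""
    names = ["first_id_byte", "second_id_byte",
             "third_id_byte", "fourth_id_byte"]
    if len(id_bytes) != len(names):
        raise ValueError("exactly four ID bytes are expected")
    args = ["%s=0x%02x" % (name, value)
            for name, value in zip(names, id_bytes)]
    return "modprobe nandsim " + " ".join(args)


# ID bytes of MT29F2G08ABAEAWP as quoted from the datasheet above.
print(nandsim_command([0x2C, 0xDA, 0x90, 0x95]))
```

Once the module is loaded, the emulated geometry can be compared with the
real chip, for example the 2048-byte page and the 64-byte OOB.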
mtd0
Name: NAND simulator partition 0
Type: nand
Eraseblock size: 131072 bytes, 128.0 KiB
Amount of eraseblocks: 2048 (268435456 bytes, 256.0 MiB)
Minimum input/output unit size: 2048 bytes
Sub-page size: 512 bytes
OOB size: 64 bytes
Character device major/minor: 90:0
Bad blocks are allowed: true
Device is writable: true
cawan% ls -la
total 145216
5 - Firmware Extraction
Everything works perfectly without a single error so far. Let's see
whether the low-hanging fruit, which is not so low, is available now or not.
cawan% cd etc
cawan% cat fstab
proc /proc proc defaults 0 0
none /var/shm shm defaults 0 0
sysfs /sys sysfs defaults 0 0
none /tmp tmpfs defaults 0 0
cawan% ls /tmp/nand
14x8.hzk driver_gwzd.ko libacmet.so manuf.xin startup.sh
check.ini filecheck libplat.so metproto.so tmt_info.log
chs.bin gsmMuxd lyzd ppp updateinfo.xin
dat.ini icons.bmp lyzd.xzip seting.ini
6 - Conclusion
So, in conclusion, the entire file system hosted in three different UBIFS
has been fully extracted successfully.
References:-
Source: http://cawanblog.blogspot.com/2023/06/nand-dump-analysis-bit-errors-fixing.html