Exploiting hash collisions
Ange Albertini
BlackAlps 2017
Switzerland
identical prefix
This is not a crypto talk.
It’s about exploiting hash collisions,
(the weakest ones, w/ identical prefix)
via manipulating file formats.
You may want to watch Marc Stevens’ talk at CRYPTO17.
All opinions expressed during this presentation
are mine and not endorsed
by any of my employers, present or past.
DISCLAIMERS
Nothing
groundbreaking.
No new vulnerability.
Just a look behind the scenes of
Shattered-like research
(format-wise)
OTOH there are very few talks on the topic AFAIK.
TL;DR
2014: Malicious SHA1 - modified SHA1
2015-2017: Shattered - SHA1
2017: PoC||GTFO 0x14 - MD5
This talk is about...
MalSha1
● Identical prefix
○ 2 files starting with same data
● Chosen prefix
○ 2 files starting with different (chosen) data
● Second preimage attack
○ Find data to match another data's hash
● Preimage attack
○ Find data to match hash
Types of collision
From here on,
hash collision = IPC = Identical Prefix Collision
first, weakest, overlooked
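Why an identical prefix collision survives any shared suffix can be shown on a toy hash. This is a minimal sketch, not a real attack: `toy_hash` is a made-up Merkle-Damgård construction with a 16-bit state (so a collision can be brute-forced instantly), and `find_collision_blocks` is a hypothetical helper. Length padding is left out because both messages have the same length.

```python
import hashlib
from itertools import count

BLOCK_SIZE = 4  # toy block size in bytes (MD5 and SHA-1 use 64)


def toy_hash(data: bytes) -> bytes:
    """Toy Merkle-Damgard hash with a 16-bit state. NOT a real hash."""
    assert len(data) % BLOCK_SIZE == 0
    state = b"\x00\x00"
    for i in range(0, len(data), BLOCK_SIZE):
        # Compression step: mix the previous state with the next block.
        state = hashlib.md5(state + data[i:i + BLOCK_SIZE]).digest()[:2]
    return state


def find_collision_blocks(prefix: bytes):
    """Brute-force two different blocks that collide after the same prefix."""
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK_SIZE, "big")
        digest = toy_hash(prefix + block)
        if digest in seen:
            return seen[digest], block
        seen[digest] = block


prefix = b"HEAD"
block_a, block_b = find_collision_blocks(prefix)
suffix = b"TAIL" * 3  # any suffix works, as long as both files share it

hash_a = toy_hash(prefix + block_a + suffix)
hash_b = toy_hash(prefix + block_b + suffix)
print(block_a.hex(), block_b.hex(), hash_a.hex(), hash_b.hex())
```

Once the internal state is equal after the colliding blocks, every following block is processed identically, which is why the suffix is free.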
Sh*t's broken, yo!
Unicorns
Dragons
MD5: 1992-2004 SHA1: 1995-2005 SHA2: 2001-? SHA3: 2015-?
Formal way to present IPCs
Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD.
X Wang, D Feng, X Lai, H Yu
2004
Not very “visual”!
Determine file structure
Computation
Craft valid and meaningful files
Collision blocks
(exact shape unknown in advance)
I play no role in this
Impact
Better than random-looking blocks?
Will it convince anyone to deprecate anything?
FTR Shattered took 6500 CPU-Yr
and 110 GPU-Yr.
(that's a lot of computing power)
Re-usability:
Moar impact
infinite
These are MalSHA1 examples.
2004: Dan Kaminsky: MD5 To Be Considered Harmful Someday
https://eprint.iacr.org/2004/357.pdf
https://dankaminsky.com/2004/12/06/46/
2004: Ondrej Mikle: Practical Attacks on Digital Signatures Using MD5 Message Digest
https://eprint.iacr.org/2004/356.pdf
IPC exploits papers
● 2005
Max Gebhardt, Georg Illies, Werner Schindler
A Note on the Practical Value of Single Hash Collisions for Special File Formats
● 2014 MalSHA1
Malicious Hashing: Eve’s Variant of SHA-1
Ange Albertini, Jean-Philippe Aumasson, Maria Eichlseder, Florian Mendel, Martin Schläffer
● 2017 Shattered
The first collision for full SHA-1
Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov
● 2017 PoC||GTFO 0x14
Greg, spq, Mako, Philippe, Evan², Ange, Melissa Elliott
Slides a6cb4934945457d16bc90ef9ab3c391474fb78cf844c59f34d4505b95fbad5ea
Paper ac7a05b4bf456b4358e8a754f5f70612ce593bca1cdb718c2b38e3e280fc1240
Jean-Philippe’s Slides aba7833ed35eb5b44b44377f7054c7318637a8cb5db002c1ac787a5d2314f658
Paper 5c763e295b95ee8c69fd9430eae62fa59d7c9716ada645a93dcc19387e3d6821
Paper a3396362dcc528ed29918c07701e3b5082365a1dc19a9aac8d104c9c3d07c6b2
Marc’s Crypto17 video
Elie’s BlackHat Slides 1a17c315a946409e8ef37c56c962987d41377374c15ac0d855e92297b4f03596
file format collaborator instigator
Constraints of
hash and formats
have nothing in common
File constraints
● Collision blocks are very complex
⇒ considered random
● Collision blocks only differ by a mask.
○ The mask may be fixed in advance.
● Collision blocks may contain arbitrary values
○ Or bruteforce them.
⇒ craft your files with random blocks
and apply mask
=
<>
=
Prefix?
Block A
Suffix
Prefix?
Block B
Suffix
Where the magic happens: random stuff + mask
7F 46 DC 93-A6 B6 7E 01-3B 02 9A AA-1D B2 56 0B FÜ“¦¶~ ; šª ²V
45 CA 67 D6-88 C7 F8 4B-8C 4C 79 1F-E0 2B 3D F6 EÊgÖˆÇøKŒLyà+=ö
14 F8 6D B1-69 09 01 C5-6B 45 C1 53-0A FE DF B7 øm±i ÅkEÁS þß·
60 38 E9 72-72 2F E7 AD-72 8F 0E 49-04 E0 46 C2 `8érr/ç r I àFÂ
30 57 0F E9-D4 13 98 AB-E1 2E F5 BC-94 2B E3 35 0W éÔ ˜«á.õ¼”+ã5
42 A4 80 2D-98 B5 D7 0F-2A 33 2E C3-7F AC 35 14 B¤€-˜µ× *3.ì5
E7 4D DC 0F-2C C1 A8 74-CD 0C 78 30-5A 21 56 64 çMÜ ,Á¨tÍ x0Z!Vd
61 30 97 89-60 6B D0 BF-3F 98 CD A8-04 46 29 A1 a0—‰`kп?˜Í¨F)¡
73 46 DC 91-66 B6 7E 11-8F 02 9A B6-21 B2 56 0F sFÜ‘f¶~ š¶!²V
F9 CA 67 CC-A8 C7 F8 5B-A8 4C 79 03-0C 2B 3D E2 ùÊgÌ¨Çø[¨Ly +=â
18 F8 6D B3-A9 09 01 D5-DF 45 C1 4F-26 FE DF B3 øm³© ÕßEÁO&þß³
DC 38 E9 6A-C2 2F E7 BD-72 8F 0E 45-BC E0 46 D2 Ü8éjÂ/ç½r E¼àF
3C 57 0F EB-14 13 98 BB-55 2E F5 A0-A8 2B E3 31 <W ë ˜»U.õ ¨+ã1
FE A4 80 37-B8 B5 D7 1F-0E 33 2E DF-93 AC 35 00 þ¤€7¸µ× 3.ß“¬5
EB 4D DC 0D-EC C1 A8 64-79 0C 78 2C-76 21 56 60 ëMÜ ìÁ¨dy x,v!V`
DD 30 97 91-D0 6B D0 AF-3F 98 CD A4-BC 46 29 B1 Ý0—‘ÐkЯ?˜Í¤¼F)±
0c 00 00 02 c0 00 00 10 b4 00 00 1c 3c 00 00 04
bc 00 00 1a 20 00 00 10 24 00 00 1c ec 00 00 14
0c 00 00 02 c0 00 00 10 b4 00 00 1c 2c 00 00 04
bc 00 00 18 b0 00 00 10 00 00 00 0c b8 00 00 10
⇒ generate one file from the other.
Collision blocks: File A, File B, XOR mask
These are Shattered examples.
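The relation between the dumps can be checked mechanically. This is a minimal sketch, not the authors' tooling: it copies the first 64 bytes of each file's collision block and the XOR mask from the dumps above, and `apply_mask` is a helper name made up for illustration.

```python
# First 64-byte collision block of each file, copied from the dumps above.
BLOCK_A = bytes.fromhex(
    "7F 46 DC 93 A6 B6 7E 01 3B 02 9A AA 1D B2 56 0B"
    "45 CA 67 D6 88 C7 F8 4B 8C 4C 79 1F E0 2B 3D F6"
    "14 F8 6D B1 69 09 01 C5 6B 45 C1 53 0A FE DF B7"
    "60 38 E9 72 72 2F E7 AD 72 8F 0E 49 04 E0 46 C2"
)
BLOCK_B = bytes.fromhex(
    "73 46 DC 91 66 B6 7E 11 8F 02 9A B6 21 B2 56 0F"
    "F9 CA 67 CC A8 C7 F8 5B A8 4C 79 03 0C 2B 3D E2"
    "18 F8 6D B3 A9 09 01 D5 DF 45 C1 4F 26 FE DF B3"
    "DC 38 E9 6A C2 2F E7 BD 72 8F 0E 45 BC E0 46 D2"
)
# XOR mask, copied from the four mask lines above.
XOR_MASK = bytes.fromhex(
    "0c 00 00 02 c0 00 00 10 b4 00 00 1c 3c 00 00 04"
    "bc 00 00 1a 20 00 00 10 24 00 00 1c ec 00 00 14"
    "0c 00 00 02 c0 00 00 10 b4 00 00 1c 2c 00 00 04"
    "bc 00 00 18 b0 00 00 10 00 00 00 0c b8 00 00 10"
)


def apply_mask(block: bytes, mask: bytes) -> bytes:
    """XOR a block with a mask of the same length."""
    assert len(block) == len(mask)
    return bytes(b ^ m for b, m in zip(block, mask))


# Applying the mask to one block yields the other one, in both directions.
print(apply_mask(BLOCK_A, XOR_MASK) == BLOCK_B)
print(apply_mask(BLOCK_B, XOR_MASK) == BLOCK_A)
```

In the dumps shown, the same 64-byte mask also maps the second block of File A to the second block of File B.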
That’s a big pile of…randomness :)
.X .. .. .X X. .. .. X. XX .. .. XX XX .. .. .X
XX .. .. XX X. .. .. X. XX .. .. XX XX .. .. XX
.X .. .. .X X. .. .. X. XX .. .. XX XX .. .. .X
XX .. .. XX X. .. .. X. .. .. .. .X XX .. .. X.
Stevens13: SHA1, 6610 Yr
Jump
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
.. .. .. X. .. .. .. .. .. .. .. .. .. .. .. ..
.. .. .. .. .. .. .. .. .. .. .. .. .. X. .. ..
.. .. .. .. .. .. .. .. .. .. .. X. .. .. .. ..
FastColl: MD5, ~1s
Very expensive, but trivial to exploit
Prefix and masks determine how easily it's exploitable.
Instant, but very restrictive → bruteforce
2D 20 42 EC 61 63 6B 41 6C 70 73 27 31 37 20 2D - B.ackAlps'17 -
01 4D 80 6F 5B CB C0 AE 3D 33 52 3D EA 0B 01 93 .M.o[...=3R=....
5A 58 58 DB 51 B3 32 B4 F6 17 99 75 62 B8 D3 BD ZXX.Q.2....ub...
58 A3 EE A3 7C 22 0D 10 56 7F 4A D6 EF 58 C9 1F X...|"..V.J..X..
24 60 25 1F 4A E9 FC F5 55 67 B7 A9 E3 54 C5 72 $`%.J...Ug...T.r
0A A8 05 D6 6C 79 21 85 0A 75 38 D9 C6 D9 01 51 ....ly!..u8....Q
BD C3 19 F5 32 F5 EC 99 15 AC 91 9F CF BE BD CE ....2...........
E1 2B 75 20 CB D9 76 F5 F6 96 5B 89 3E 8B 10 E0 .+u ..v...[.>...
2D 20 42 6C 61 63 6B 41 6C 70 73 27 31 37 20 2D - BlackAlps'17 -
CA 99 ED 4A 7A 59 10 F6 6C 10 5B 71 B0 80 65 5D ...JzY..l.[q..e]
87 07 94 73 71 1F 07 B2 B5 84 12 96 BD 1D 03 2C ...sq..........,
E7 09 25 96 6E 0B 02 FD 96 9A 54 32 EB 15 FC F1 ..%.n.....T2....
D7 DF 52 10 C4 35 29 0A 5B 9A 93 40 34 5C 35 4C ..R..5).[..@45L
D7 AA 9E 83 16 F3 8C 61 E0 44 5C F0 4C DE F7 1C .......a.D.L...
16 D1 F7 49 B4 D4 EE 9E 65 D5 B6 7F B6 31 27 1E ...I....e....1'.
8B 0A F7 3D E7 42 B5 64 BC 1E 2A 97 64 EA F7 F2 ...=.B.d..*.d...
2 MD5 collisions from HashClash (2 min) with different masks.
2D 20 42 6C 61 63 6B 41 6C 71 73 27 31 37 20 2D - BlackAlqs'17 -
CA 99 ED 4A 7A 59 10 F6 6C 10 5B 71 B0 80 65 5D ...JzY..l.[q..e]
87 07 94 73 71 1F 07 B2 B5 84 12 96 BD 1D 03 2C ...sq..........,
E7 09 25 96 6E 0B 02 FD 96 9A 54 32 EB 15 FC F1 ..%.n.....T2....
D7 DF 52 10 C4 35 29 0A 5B 99 93 40 34 5C 35 4C ..R..5).[..@45L
D7 AA 9E 83 16 F3 8C 61 E0 44 5C F0 4C DE F7 1C .......a.D.L...
16 D1 F7 49 B4 D4 EE 9E 65 D5 B6 7F B6 31 27 1E ...I....e....1'.
8B 0A F7 3D E7 42 B5 64 BC 1E 2A 97 64 EA F7 F2 ...=.B.d..*.d...
2D 20 42 6C 61 63 6B 41 6C 70 73 27 31 37 20 2D - BlackAlps'17 -
01 4D 80 6F 5B CB C0 AE 3D 33 52 BD EA 0B 01 93 .M.o[...=3R.....
5A 58 58 DB 51 B3 32 B4 F6 17 99 75 62 B8 D3 BD ZXX.Q.2....ub...
58 A3 EE A3 7C 22 0D 08 56 7F 4A D6 EF 58 C9 1F X...|"..V.J..X..
24 60 25 9F 4A E9 FC F5 55 67 B7 A9 E3 54 C5 72 $`%.J...Ug...T.r
0A A8 05 D6 6C 79 21 85 0A 75 38 59 C6 D9 01 51 ....ly!..u8Y...Q
BD C3 19 F5 32 F5 EC 99 15 AC 91 9F CF BE BD CE ....2...........
E1 2B 75 20 CB D9 76 FD F6 96 5B 89 3E 8B 10 E0 .+u ..v...[.>...
Same hash,
different masks.
IPC exploits
strategies
● Get collision block ignored (commented out)
● File suffix/separate executable contains code
○ Checks the block values
or uses block as decryption key.
⇒ Collision block == passive data
Collision blocks
(commented out)
Code
(checking block values)
If-then-else (data)
Works with many script languages
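The if-then-else strategy boils down to one comparison on a byte of the collision block. A minimal sketch, assuming a hypothetical layout where the differing byte sits at a known offset; `choose_payload`, the offset and the threshold are illustrative names and values, not taken from any real PoC.

```python
def choose_payload(file_bytes: bytes, offset: int, threshold: int = 0x80) -> str:
    """Pick a payload depending on one byte of the (otherwise ignored) block."""
    if file_bytes[offset] >= threshold:
        return "payload 1"
    return "payload 2"


# Two files sharing prefix and suffix, differing only in one block byte.
PREFIX = b"prefix"
SUFFIX = b" shared suffix holding both payloads"
file_a = PREFIX + bytes([0x3D]) + SUFFIX
file_b = PREFIX + bytes([0xBD]) + SUFFIX

print(choose_payload(file_a, len(PREFIX)))
print(choose_payload(file_b, len(PREFIX)))
```

The block itself stays passive data: only the code in the suffix gives the difference a meaning.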
Code
● Prefix or bruteforcing sets up some opcodes
● 2 target addresses in the collision blocks
● 2 code snippets in suffix
Blocks
Payload 1
Payload 2
Jump 1
Good
Bad
Jump 2
Good
Bad
Only needs a few bytes (x86 jump = EB xx),
but no real-life consequences :(
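How a single differing byte selects one of two code snippets can be sketched with the x86 short jump encoding (EB followed by a signed 8-bit displacement, relative to the end of the 2-byte instruction). `short_jump_target` is a helper name made up for illustration, and the displacement values are arbitrary.

```python
def short_jump_target(instruction_offset: int, displacement_byte: int) -> int:
    """Offset reached by `EB xx` located at instruction_offset."""
    # The displacement is a signed 8-bit value.
    if displacement_byte >= 0x80:
        displacement_byte -= 0x100
    # It is relative to the address right after the 2-byte instruction.
    return instruction_offset + 2 + displacement_byte


# Same jump opcode in both files, displacement byte altered by the collision.
print(hex(short_jump_target(0x100, 0x10)))  # lands on the first snippet
print(hex(short_jump_target(0x100, 0x30)))  # lands on the second snippet
```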
Suffix
● Prefix or bruteforcing sets up a header
● Collision blocks alter a value,
To make parsers ignore the rest of the blocks
and land at different offsets.
See MD5 rogue certificates w/ chosen-prefix.
Prefix
(declares a header)
Collision blocks
(changes header value)
Data
(contains 2 data sets)
Format (structure)
Concatenation
With a top-down file format that can start at any offset (Rar, 7z…)
1. Collision blocks end with signature's start.
○ w/ a difference on that byte.
2. Append a file minus its first byte.
3. Append another file of the same type.
Coll. Blocks
RAR File 1
RAR File 2
.. .. .. R
ar!<file>
Rar!<file>
.. .. .. ?
ar!<file>
Rar!<file>
One letter is enough
(ZIP is bottom-up)
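The concatenation trick can be sketched by scanning for the first signature, as a top-down parser that accepts any start offset would. This is a simplified model: the 4-byte `Rar!` string stands in for the full RAR signature, the block bytes are placeholders, and `first_archive_offset` is a hypothetical helper.

```python
SIGNATURE = b"Rar!"  # simplified: the real RAR signature is longer


def first_archive_offset(data: bytes) -> int:
    """Offset of the first signature found when scanning from the top."""
    return data.find(SIGNATURE)


# The collision blocks end on the first signature byte, which differs.
blocks_a = b"...." + b"R"
blocks_b = b"...." + b"?"
# Shared suffix: file 1 minus its first byte, then a complete file 2.
suffix = b"ar!<file 1>" + b"Rar!<file 2>"

file_a = blocks_a + suffix
file_b = blocks_b + suffix

print(first_archive_offset(file_a))  # signature completed by the block: file 1
print(first_archive_offset(file_b))  # signature broken: parser finds file 2
```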
Find a way to get 2 files
despite the randomness.
Prefix.
Randomness.
Collision block masks.
QA
Write your prefix
Insert totally random data
Apply mask
General goal
Test files, on all tools.
(meaningful)
Format target
● Something universally used.
○ Preferably multi-platform ⇒ executables
○ By end-users, not just developers.
○ Preferably, something with crypto!
(certificates are pretty restrictive)
● With as few parsers in the wild as possible.
Visual documents: JPEG, PNG, GIF, PDF...
Validity.
Compatibility.
Correct rendering.
Re-usability.
Test, test, test!
Ever dance with the specs
by the pale moonlight?
Explore all code paths,
all header values
Corner cases FTW
Challenges
2005: Gebhardt et al.
● If-then-else exploits
○ PostScript
○ PDF
○ TIFF
○ Word 97
Word97 macro
Sub collision()
Dim b(512) As Byte
FName$ = ActiveDocument.Name
Open FName$ For Binary Access Read As #1 Len = 512
Get #1, , b 'the price 1000$ is contained in 2nd line of
Close #1 'the .doc file; that line is selected by
'the Selection .. Count:=2 command
If b(147) >= 128 Then
Selection.Collapse Direction:=wdCollapseStart
Selection.GoTo What:=wdGoToLine, Which:=wdGoToAbsolute, Count:=2
Selection.MoveRight Unit:=wdCharacter, Count:=1
Selection.Find.ClearFormatting
With Selection.Find
.Text = "$"
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute
Selection.MoveLeft Unit:=wdCharacter, Count:=3
Selection.MoveRight Unit:=wdCharacter, Extend:=wdCharacter
Selection.Font.ColorIndex = wdWhite
Selection.GoTo What:=wdGoToLine, Which:=wdGoToAbsolute, Count:=1
Selection.Collapse Direction:=wdCollapseEnd
End If 'by the Selection .. Count:=1 command
'the cursor returns to the first character
'in the text (disguise of attack)
End Sub
No widespread scripting language in PDF:
● JavaScript/FormCalc reliably only in Adobe Reader
Only binary-based conditional function:
● PostScript Calculator (Type 4) functions
PDF features and landscape
<<
/FunctionType 4
/Domain [0.0 1.0]
/Range [0.0 1.0]
/Length 28
>>
stream
{255 mul 121 sub 1 exch sub}
endstream
depends on the collision block
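What the Type 4 function above computes can be replayed outside a PDF reader. A minimal sketch, assuming the function input is a collision block byte scaled to [0.0, 1.0] (that mapping is an assumption for illustration); `type4_output` is a made-up name. The PostScript body `255 mul 121 sub 1 exch sub` evaluates to 1 - (255*x - 121), and the result is clipped to /Range.

```python
def type4_output(byte_value: int) -> float:
    """Evaluate {255 mul 121 sub 1 exch sub} and clip to /Range [0.0 1.0]."""
    x = byte_value / 255.0           # assumed mapping of a byte to /Domain
    value = 1 - (255 * x - 121)      # 255 mul, 121 sub, 1 exch sub
    return min(1.0, max(0.0, value))


# Bytes well below the threshold give 1.0, bytes well above give 0.0.
print(type4_output(0x73))
print(type4_output(0x7F))
```

The function therefore acts as a binary switch around byte value 121, which is what lets a single differing byte select one of two objects.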
● Poorly supported across readers.
● Limited to 2 non-overlapping objects
⇒ reliable but limited for payload and compatibility
Not good enough REJECTED
Only OK in Adobe
No full control
2014: MalSHA1
● Very restrictive: no prefix !!! ⇒ very simple collisions
● 30-50h on 80 cores:
Many retries are possible, but unclear collision mask.
● If then else: Shell script
● Concatenation: RAR, 7z
● Code: Master Boot Record
● Format: JPEG
● Polyglot: all in the same file!
#‘4@ ØM¦ÓTá+¸…[Gx&½ý7+îæP,uKW8¿Ø¥à²D”Q*Í6¢þ⊟2U™ª´zí‚
if [ `od -t x1 -j3 -N1 -An "${0}"` -eq "91" ]; then
echo " (__)\n (oo)\n /-------\/\n / | ||\n* ||----||\n ^^ ^^";
else
echo "Hello World.";
fi
● Can’t control 4 bytes in a row.
⇒ many file formats aren’t usable
● Windows Executable? (magic = “MZ”)
Would end up with huge e_lfanew (a header offset, not a memory pointer)
Max value in practice: 0x9000000 (≈150 MB)
MalSHA1 failures
A primer on JPEG
signature: FF D8
Segments structure:
all start with FF xx (xx = marker type, never 00)
(FF in data always followed by 00)
Garbage? Skip until next FF!
Big endian lengths, on 2 bytes. Never too big, never too small.
very short
Very "tolerant"
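The segment structure described above fits in a few lines. A minimal sketch of a tolerant segment walker, not a full JPEG parser: it ignores entropy-coded data, fill bytes and standalone markers, and `walk_segments` plus the demo bytes are made up for illustration.

```python
def walk_segments(data: bytes):
    """Yield (offset, marker, length) for each segment after the signature."""
    assert data[:2] == b"\xff\xd8", "missing JPEG signature"
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            pos += 1                 # garbage: skip until the next FF
            continue
        marker = data[pos + 1]
        if marker == 0xD9 or pos + 4 > len(data):
            return                   # end of image, or truncated segment
        # Big-endian length on 2 bytes; it counts its own 2 bytes.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        yield pos, marker, length
        if marker == 0xDA:
            return                   # start of scan: compressed data follows
        pos += 2 + length


demo = (
    b"\xff\xd8"                      # signature
    b"\xff\xfe\x00\x05abc"           # comment segment, length 5
    b"\xff\xe0\x00\x02"              # empty application segment
    b"\xff\xd9"                      # end of image
)
for offset, marker, length in walk_segments(demo):
    print(offset, hex(marker), length)
```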
2 images, 1 "comment"
A comment (an ignored segment),
of variable length.
Use another comment to
Jump over the first image.
make sure not to jump in the blocks:
⇒ 01 xx is optimal.
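The effect of a comment length of 01 xx can be worked out with plain arithmetic. A sketch under stated assumptions: the fixed high byte 01 is the last byte of the prefix and the low byte xx is the first byte of the collision block; the values 0x7F and 0x73 are the first bytes of the two Shattered blocks dumped earlier, and `landing_offset` is a hypothetical helper.

```python
def landing_offset(comment_offset: int, length_high: int, length_low: int) -> int:
    """Offset of the segment that follows a comment (FF FE) segment."""
    length = (length_high << 8) | length_low   # big-endian, includes itself
    return comment_offset + 2 + length         # skip the 2-byte marker too


# Comment marker at offset 0 for simplicity; only the low length byte differs.
print(landing_offset(0, 0x01, 0x73))
print(landing_offset(0, 0x01, 0x7F))
```

Any 01 xx length is at least 256 bytes, more than the 128 bytes of two 64-byte blocks, so both parsers land after the blocks but at offsets that differ, where each finds a different image.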
JPEG
collision
structure
Abusing
JPEG
tolerance
Garbage bytes with
no FF in them.
Can't combine JPEG and MBR:
FF D8 is an invalid opcode.
Polyglots:
a single pair with
several use cases.
From MalSHA1...
...to the real thing!
2015: Implementing Stevens13
1. Research file trick
2. Implement attack
3. Craft files
Stevens13 compared with MalSHA1
- Complex computation
- Expensive computation
+ Prefix
- Totally random blocks
+ Fixed mask
+ Blocks start with a difference
Never tried before: (can't be interrupted/tweaked)
One. single. try.
Constraints--
constraints++
reliability++
reliability++
1. Research file trick
● MalSHA1's JPEG trick would work.
● We'd like a new trick. PDF?
○ Nothing existing versatile so far.
○ Experiments with PDF (XREF, object numbers)
■ Never works reliably across all readers.
● No SHA1 collision at this stage - hard to get traction.
At this stage it's still only
a set of weird file constraints.
If you're not familiar
with PDF...
...with my vision of PDF!
a correct PDF
%PDF
1 0 obj
<< /Pages
<< /Kids [
<< /Contents 2 0 R >>
] >>
>>
2 0 obj
<<>>
stream
95 Tf
20 400 Td
(Chrome) Tj
endstream
trailer <<
/Root 1 0 R
>>
1 0 obj
<<
/Resources << /Font << /F1 <<
/BaseFont /Arial /Subtype
/Type1 >> >>
>>
/Contents << >>
stream
/F1 170 Tf
10 400 Td
(FireFox) Tj
endstream
>>
endobj
xref
%trailer << /Root << /Pages <<
/Kids [1 0 R] /Count 1>> >>
>>
working
PDFs
1 0 obj
<<
/Resources << /Font << /F1 <<
/BaseFont /Arial /Subtype
/Type1 >> >>
>>
/Contents << >>
stream
/F1 170 Tf
10 400 Td
(FireFox) Tj
endstream
>>
endobj
xref
%trailer << /Root << /Pages <<
/Kids [1 0 R] /Count 1>> >>
>>
no signature
no /Type
inline /Contents no /Length
empty XREF
Direct /Root
no /Size
no /Type
no startxref
no %%EOF
%PDF
1 0 obj
<< /Pages
<< /Kids [
<< /Contents 2 0 R >>
] >>
>>
2 0 obj
<<>>
stream
95 Tf
20 400 Td
(Chrome) Tj
endstream
trailer <<
/Root 1 0 R
>>
truncated signature
direct /Kids
no /Type
no /Font
no /Count
no /Type
no /Resources
no endobj
no /Length
no XREF
Direct /Root
no /Size
no /Type no startxref
no %%EOF
no BT/ET no font reference
INVALID?
INVALID?
no /Parent
comment
no /Parent
no BT/ET
no endobj
ACCEPTED!
ACCEPTED!
PDF.js PDFium
I made extreme PDFs for each reader by hand.
These extreme PDFs fail on any other reader.
The devil is in the detail
● All PDF parsers have their weirdness
○ Does it work? Does it display, behave normally?
○ A trick on a PDF reader is easy, but
a reliable trick for all of them is hard.
Examples:
● Preview is more strict for JPEG structures.
But created some funky ghost JPEGs :)
● OTOH it's less compatible with gradients.
● An unusual JPEG in a PDF can easily reboot a Kindle.
● A complex JPEG can take minutes to load.
● A crazy JPEG in a PDF displays glitches in Adobe.
Glitches in Adobe
Different resizing in Preview
Ghosts in Preview
2015: PDF is tricky...
● A PDF trick with total compatibility...?
○ With doc-level control? (not just a glitch)
● Eventually… JPEG in a PDF:
○ PDF embeds entire JPEG files
○ Image parameters can be referenced
○ Reliable
■ No possible error
■ "Sane" PoCs - very little overhead
○ Reusable
After the collision blocks,
so no restrictions on dimensions!
Pushing the limits
of our JPEG trick
PDFs are usually documents.
We wanted fake documents!
The first image has to be jumped over.
Only 393x438 px
in 90% quality ⇒ 55Kb
Yet already near limit!
Current limit:
Size(Image) < 64Kb
Good for a photo,
not for a doc!
2 comments per segment
The scan length only concerns the start!
The ECS grows with the file,
and is not limited to 64Kb!
1024x740 Q.100% ⇒ 228 Kb
a single scan of 227 Kb!
image
0:Y
luma (brightness)
2:Cr
redness
1:Cb
blueness
Components
A JPEG image
is decomposed
Each scan increases definition
⇒ progressive file, smaller scans
JPEG
school of wizardry
Welcome to
libJPEG's JPEGTran & wizard.doc
$ jpegtran --help
usage: jpegtran [switches] [inputfile]
Switches (names may be abbreviated):
-copy none Copy no extra markers from source file
-copy comments Copy only comment markers (default)
-copy all Copy all extra markers
-optimize Optimize Huffman table (smaller file, but slow compression)
-progressive Create progressive JPEG file
Switches for modifying the image:
-grayscale Reduce to grayscale (omit color data)
-flip [horizontal|vertical] Mirror image (left-right or top-bottom)
-rotate [90|180|270] Rotate image (degrees clockwise)
-transpose Transpose image
-transverse Transverse transpose image
-trim Drop non-transformable edge blocks
-cut WxH+X+Y Cut out a subset of the image
Switches for advanced users:
-restart N Set restart interval in rows, or in blocks with B
-maxmemory N Maximum memory to use (in kbytes)
-outfile name Specify name for output file
-verbose or -debug Emit debug output
Switches for wizards:
-scans file Create multi-scan JPEG per script file
http://libjpeg.cvs.sourceforge.net/viewvc/libjpeg/libjpeg/wizard.doc?content-type=text%2Fplain
Advanced usage instructions for the Independent JPEG Group's JPEG software
==========================================================================
This file describes cjpeg's "switches for wizards".
The "wizard" switches are intended for experimentation with JPEG by persons
who are reasonably knowledgeable about the JPEG standard. If you don't know
what you are doing, DON'T USE THESE SWITCHES. You'll likely produce files
with worse image quality and/or poorer compression than you'd get from the
default settings. Furthermore, these switches must be used with caution
when making files intended for general use, because not all JPEG decoders
will support unusual JPEG parameter settings.
Quantization Table Adjustment
-----------------------------
Ordinarily, cjpeg starts with a default set of tables (the same ones given
as examples in the JPEG standard) and scales them up or down according to
the -quality setting. The details of the scaling algorithm can be found in
jcparam.c. At very low quality settings, some quantization table entries
can get scaled up to values exceeding 255. Although 2-byte quantization
values are supported by the IJG software, this feature is not in baseline
JPEG and is not supported by all implementations. If you need to ensure
wide compatibility of low-quality files, you can constrain the scaled
quantization values to no more than 255 by giving the -baseline switch.
Note that use of -baseline will result in poorer quality for the same file
size, since more bits than necessary are expended on higher AC coefficients.
You can substitute a different set of quantization values by using the
-qtables switch:
-qtables file Use the quantization tables given in the
named file.
Custom scans
Use JPEGTran to tweak scans
and make them smaller than 64Kb.
Wizardry is hard:
● JPEGTran is inconsistent
● The documentation's examples are broken.
0: 0-0, 0, 0;
0: 1-1, 0, 0;
0: 2-6, 0, 0;
0: 7-10, 0, 0;
0: 11-13, 0, 0;
0: 14-20, 0, 0;
0: 21-26, 0, 0;
0: 27-32, 0, 0;
0: 33-40, 0, 0;
0: 41-48, 0, 0;
0: 49-54, 0, 0;
0: 55-63, 0, 0;
1: 0-0, 0, 0;
1: 1-16, 0, 0;
1: 17-32, 0, 0;
1: 33-63, 0, 0;
2: 0-0, 0, 0;
2: 1-16, 0, 0;
2: 17-32, 0, 0;
2: 33-63, 0, 0;
1944x2508 100%, 860 Kb ⇒ 20 scans
Syntax:
component: byte min-max, bit min, bit max;
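Writing such a scan script by hand is error-prone, so it can be generated and sanity-checked. A minimal sketch, assuming each component's ranges must cover indexes 0 to 63 without gaps and that the two bit fields stay at 0 as in the script above; `scan_script` is a made-up helper, and the ranges are the ones listed on this slide.

```python
def scan_script(components: dict) -> str:
    """Build scan script lines from {component: [(first, last), ...]}."""
    lines = []
    for component, ranges in components.items():
        expected = 0
        for first, last in ranges:
            if first != expected or last < first:
                raise ValueError(f"gap or overlap in component {component}")
            lines.append(f"{component}: {first}-{last}, 0, 0;")
            expected = last + 1
        if expected != 64:
            raise ValueError(f"component {component} does not reach 63")
    return "\n".join(lines)


LUMA = [(0, 0), (1, 1), (2, 6), (7, 10), (11, 13), (14, 20),
        (21, 26), (27, 32), (33, 40), (41, 48), (49, 54), (55, 63)]
CHROMA = [(0, 0), (1, 16), (17, 32), (33, 63)]

script = scan_script({0: LUMA, 1: CHROMA, 2: CHROMA})
print(script)
```

The output can be saved to a file and passed to the `-scans file` switch listed in the jpegtran help above.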
Making a big
image fit
w/ custom scans
definitions.
Few colors
Limitations?
LibJPEG has a limit of 100 scans.
On writing. Not on reading ;)
⇒ we could release a multi-page doc,
but it's giving mobiles a hard time.
Shattered: It's a JPEG in a PDF
● We still want a PDF file!
● PDF header, declare image
● Reference all /Image parameters after the file data.
○ After the collision blocks
● Put 2 images contents
○ With the same parameters, unlike MalSHA1
● Put image parameters values
● Finalize PDF file.
colors, dimensions...
PDF trick structure
8 brain-years,
110 GPU-years
and 6500 CPU-years later...
Woohoo! We have a collision!
"Here is the file…"
More details here
13 Oct 15 -> Jan 17
Here comes the randomness!
Then this happened...
I also lost compatibility with Adobe and Safari at some point...
I completely lost my... ;)
Lessons learned
● Keeping notes and PoCs helps.
● a diary and a log of command lines
might seem overkill…
...but it really helps!
(Especially as readers have been updated in the meantime!)
Shattered is real
With 0 bug reported!
nominated for Péter Szőr award
best
crypto attack
best
CRYPTO17 paper
official PoCs, side by side
Details
● CVE-2005-4900 updated :)
● It broke SVN in practice!
○ SHA1 for deduplication
○ MD5 for integrity
● BitErrant
○ BitTorrent uses SHA1
for file chunks
Impact
…
Checksum mismatch: shattered-2.pdf
expected: 5bd9d8cabc46041579a311230539b8d1
got: ee4aa52b139d925f8d8884402b0a750c
…
"SHA-1 is not collision resistant..."
● PoCs generators
○ simple within 5 hours (!)
○ advanced
● HTML collision
● Used in Boston Key Party CTF, 50 pts
● Bitcoin bounty claimed ;) [2.8K€]
Internet does its thing... first public PoCs
FLAG{AfterThursdayWeHadToReduceThePointValue}
Enthusiast feedback
● Bruce Schneier
Yes, this brute-force example has its own website.
● Linus Torvalds
...in a project like git, the hash isn't used for "trust".
● John Gilmore
Linus [...] wired assumptions about SHA1 deeply into git.
● Robert J. Hansen [OpenPGP, 2013]
Scaremongering about crypto is one of the quickest ways to make me angry.
We can do more
It's not just about full-page pictures.
It's not just full-page pictures
● It's a standard PDF document, with a 'bipolar' JPEG.
● Any PDF element can be part of the JPEG.
○ A multi-page doc w/ an image with appended pages.
○ A totally standard doc, with only a few elements
replaced.
DEMO
Notice anything?
It's the complete Shattered paper...
d3f968d604bf1c31a4b3aaecd0f6b2fad4c33402
Exploiting hash collisions
What's JPEG?
● An image format
● A lossy data storage format (specialized for photos?)
○ PDF takes it too literally:
3 out of 6 readers accept JPEG-stored data
for non-image objects, such as page content
(rejected by browsers)
1 0 0 RG // color = red
150 w // width
53 53 m // start point
558 558 l // end point
B // draw path
53 558 m
558 53 l
B
=
Lossless JPEG?
● Quality 100%
● Grayscale JPEG ⇒ no component mixing
Still lossy!
● JPEG is 8x8 block based
⇒ Repeat content lines 8 times.
○ Pad a little to prevent truncation
⇒ Reliably works !
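The padding and repetition step can be sketched as follows. This only prepares the pixel rows of a grayscale image from page content lines; the actual JPEG encoding at quality 100% is left to an image library. The helper name `rows_for_grayscale_jpeg`, the width of 16 and the choice of spaces as padding are assumptions for illustration.

```python
def rows_for_grayscale_jpeg(content: bytes, width: int, repeat: int = 8) -> list:
    """Turn each content line into `repeat` identical rows of `width` pixels."""
    rows = []
    for line in content.splitlines():
        if len(line) > width:
            raise ValueError("content line wider than the image")
        padded = line.ljust(width, b" ")   # pad with whitespace
        rows.extend([padded] * repeat)     # one full 8x8 block row per line
    return rows


page_content = b"1 0 0 RG\n150 w\n53 53 m\n558 558 l\nB"
rows = rows_for_grayscale_jpeg(page_content, width=16)
print(len(rows), len(rows[0]))
```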
DEMO
d13215922636de3074ecdf63bf1eee491030f502
2 sha1-colliding PDFs with vector content stored as lossless JPEG data.
Colors via a grayscale image :)
Why not both?
JPEG as image,
JPEG as data...
We've seen so far….
Lossless data and lossy image
● Pad data to match image width
● Store 8 times to make lossless
● Append image
A page content can reference itself
No page content terminator :(
⇒ lossy data could fail rendering - YMMV
q
612 0 0 792 0 0 cm
/Im1 Do
Q
1 0 0 rg
BT
/F1 90 Tf
10 400 Td
(GOLDEN AXE) Tj
ET
Q
Standard Page code + padding
showing (itself as) an image
Displaying text
2 sha1-colliding PDFs with mixed JPEG (on different readers)
de9b4237c940ec4af249f2c80bcd841537f6624c
Trivial to detect at file level,
tricky to detect at rendering level.
Shattered:
one block pair,
many kinds of PoCs!
MD5?!
It's already broken!
Nothing to see here, right?
Multi-collision files
Why create only a pair of colliding files
when you can create 2^609?
2^609
=
212455197126706839475835282620987450931837247090812769279777655280161423944340897095665
000906091714267555731794498600406138631735061082895763807991506634940777532508334157287
6126912512
(184 digits)
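The size of that number is easy to check: each collision is an independent choice between two blocks, so 609 chained collisions give 2^609 files with the same hash. A short sketch using Python's arbitrary-precision integers; the variable names are illustrative.

```python
COLLISIONS = 609

# Each collision doubles the number of files sharing the same hash.
variants = 2 ** COLLISIONS
digits = str(variants)

print(len(digits))   # number of decimal digits
print(digits[:12])   # leading digits
```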
What's a collision?
Variable content, same hash
Hashquine
Display your own file's hash
It's a mental trick:
"how do you know the hash in advance?"
Make your file's content updatable
Without changing the final hash.
Fake hashquine
Actually a script that computes
and displays its own hash
Often comes with obfuscation ;)
Format hashquine
1 passive collision ⇒ take this file or skip to the next.
X collisions ⇒ X+1 versions of the same element.
1. Store multiple versions of visual elements
in a chain of collisions.
2. Display the file hash in the file.
Data Hashquine
1 collision == 2 alternate contents ⇒ 1 bit of data.
Put some code that parses the bits and
displays the stored value.
More collision efficient than format hashquines,
but requires code to be executed.
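The data hashquine idea, one collision per bit, can be sketched as a mapping between a digest and a list of block choices. This only models the bookkeeping, not the collision computation: `digest_to_choices` and `choices_to_digest` are hypothetical helpers, where choice 0 means "keep block A" and 1 means "keep block B". The digest is the one printed by `md5sum` in the PDF example of this talk.

```python
def digest_to_choices(hex_digest: str) -> list:
    """One bit per collision, most significant bit first."""
    nbits = 4 * len(hex_digest)
    value = int(hex_digest, 16)
    return [(value >> (nbits - 1 - i)) & 1 for i in range(nbits)]


def choices_to_digest(choices: list) -> str:
    """What the embedded code does: read the bits back and display them."""
    value = 0
    for bit in choices:
        value = (value << 1) | bit
    return f"{value:0{len(choices) // 4}x}"


digest = "66da5e07c0fd4c921679a65931ff8393"
choices = digest_to_choices(digest)
print(len(choices), choices[:8])
print(choices_to_digest(choices))
```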
cheating?
PostScript by Greg
GIFs by spq
animated
The first ever!
As images
PDFs by Mako
$ pdftotext -q md5text.pdf -
66DA5E07C0FD4C921679A65931FF8393
$ md5sum md5text.pdf
66da5e07c0fd4c921679a65931ff8393 md5text.pdf
As text
GIF & TIFF,
by Rogdham
Very nice writeup for GIF bit-hashquine
TIFF with writeup, but 4 Gb!
PoC||GTFO 0x14
Articles about hashquines.
But also a hashquine itself,
and polyglot!
by Evan²
and Philippe
A LaTeX-generated
PDF...
...showing its MD5...
(15x32=480 collisions)
...showing the same MD5!
(4x32=128 collisions)
608?
Mmm, seafood!
...also a NES rom...
1 extra collision ⇒ hidden cover, same MD5.
609!
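The collision counts above follow from the two hashquine strategies. A small worked example, assuming a format hashquine needs 15 collisions to choose among the 16 possible glyphs of each hex digit (X collisions give X+1 versions) and a data hashquine needs 4 collisions per hex digit (1 bit each); the constant names are illustrative.

```python
HEX_DIGITS = 32          # an MD5 digest printed in hexadecimal
GLYPHS_PER_DIGIT = 16    # 0-9 and a-f

# Format hashquine: X collisions give X+1 versions of one displayed digit.
format_collisions = (GLYPHS_PER_DIGIT - 1) * HEX_DIGITS
# Data hashquine: 1 collision stores 1 bit, a hex digit is 4 bits.
data_collisions = 4 * HEX_DIGITS
# One more collision for the hidden cover.
total = format_collisions + data_collisions + 1

print(format_collisions, data_collisions, total)
```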
You know
a cryptographic hash
is really broken
when it feels like
a fancy fidget spinner.
When you generate 609 of its collisions for fun.
In total, 9824 collisions were computed for the making of this issue.
Thanks Marc!
https://www.chrisbathgate.com/
Other formats?
Certificates, PNG...
https://www.cem.me/pki/index.html
Very restrictive!
PNG
Strengths:
● 8 byte signature
● Chunk types after lengths
● 4 byte lengths
● Chunk CRCs
Weaknesses:
● Easy to make ignored chunks
● CRC usually ignored
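The PNG layout behind these strengths and weaknesses (4-byte length, then chunk type, then data, then CRC) can be sketched with the standard library. This is a minimal illustration, not a valid image: `make_chunk`, `parse_chunks` and the private ancillary chunk type `coLl` are made up for the example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"   # 8-byte signature


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Length (4 bytes, big-endian), type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def parse_chunks(png: bytes):
    """Yield (type, length, crc_ok) for each complete chunk."""
    assert png[:8] == PNG_SIGNATURE
    pos = 8
    while pos + 12 <= len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        if pos + 12 + length > len(png):
            break                       # truncated chunk
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        yield chunk_type, length, crc == zlib.crc32(chunk_type + data) & 0xFFFFFFFF
        pos += 12 + length


# A lowercase first letter marks an ancillary chunk that decoders may ignore.
png = PNG_SIGNATURE + make_chunk(b"coLl", b"\x00" * 8) + make_chunk(b"IEND", b"")
print(list(parse_chunks(png)))
```

Because the length comes before the type, a collision block would have to control 4 length bytes in a row before reaching anything a parser can skip.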
Attack ⇔ format pairing
Hash collision attack ⇒ constraints (prefix, mask)
File format ⇒ other constraints (structure, compatibility)
The same attack can be used with various file formats.
A file format trick can be used with different hashes.
@arw's HTML colliding pair made with Shattered prefix.
PDF ⇒ HTML (also works as polyglot)
Mako's PDF Hashquine with MD5
MalSHA1's JPEG trick + Shattered JPEG in PDF trick for SHA1
SHA'1 ⇒ SHA1 ⇒ MD5
Why?
"It's just a bag of tricks anyway…"
"Crypto doesn't care about PoCs..."
Attacks rely on PoCs.
Attacks convince people to deprecate.
You don't get pwned by academic papers, but by their PoCs.
A new format trick could benefit MD5, SHA1…
or a future attack!
In practice,
- Shattered generates an infinity of colliding documents, of different kinds.
- Shattered broke SVN.
Didn't that help?
...the end?
...we still have a few tricks up our sleeves ;)
Conclusion
● Hash collision exploitation is a niche domain:
weird constraints, unusual challenges & rewards.
● Researching a file format manipulation now
could benefit a future cryptographic attack.
FWIW (full personal disclosure)
● When I was asked about MalSHA1, I saw no solution.
○ I gave up for a while - I didn't think particularly about JPEG.
● In the meantime, I was challenged to encrypt with AES a JPEG to a JPEG.
⇒ AngeCryption
● With that knowledge, I succeeded for MalSHA1.
● That knowledge was the starting point for Shattered.
○ I gave up at some time on the JPEG optimization aspect.
○ But I kept that fidget spinning playfully.
○ Found my 2 breakthroughs… in very unexpected places ;)
Don't give up! Keep that fidget spinning!
One more thing
"How do you do all this?"
● I thought I lacked discipline. That led me nowhere.
● Just do what makes you giggle like a 3-year old.
(that's what playing with file formats does to me).
● Have fun! Eventually you'll get feedback, recognition…
● By then, you'll have no reasons to stop anymore.
● And you'll be happily disciplined by then.
Have fun!
Thanks for your attention!
Questions?
Special thanks to Marc & Maria
Philippe, Evan, spq, Mako, Greg, Melissa,
Elie, Jean-Philippe, and CommitStrip.
Ad

More Related Content

What's hot (7)

Каталог освітлення Azzardo Technoline 2019-2020
Каталог освітлення Azzardo Technoline 2019-2020Каталог освітлення Azzardo Technoline 2019-2020
Каталог освітлення Azzardo Technoline 2019-2020
МамаДекор
 
TimeMatlab2
TimeMatlab2TimeMatlab2
TimeMatlab2
Siriporn Anny
 
Claas v750 v540 vario lexion (type 705) cutterbar service repair manual
Claas v750 v540 vario lexion (type 705) cutterbar service repair manualClaas v750 v540 vario lexion (type 705) cutterbar service repair manual
Claas v750 v540 vario lexion (type 705) cutterbar service repair manual
jfdjskmdmme
 
OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...
OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...
OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...
Chanwoo Choi
 
Cheat codes
Cheat codesCheat codes
Cheat codes
SangamHutait
 
Аварийный дамп – чёрный ящик упавшей JVM. Андрей Паньгин
Аварийный дамп – чёрный ящик упавшей JVM. Андрей ПаньгинАварийный дамп – чёрный ящик упавшей JVM. Андрей Паньгин
Аварийный дамп – чёрный ящик упавшей JVM. Андрей Паньгин
odnoklassniki.ru
 
Big Data mit Microsoft?
Big Data mit Microsoft?Big Data mit Microsoft?
Big Data mit Microsoft?
Olivia Klose
 
Каталог освітлення Azzardo Technoline 2019-2020
Каталог освітлення Azzardo Technoline 2019-2020Каталог освітлення Azzardo Technoline 2019-2020
Каталог освітлення Azzardo Technoline 2019-2020
МамаДекор
 
Claas v750 v540 vario lexion (type 705) cutterbar service repair manual
Claas v750 v540 vario lexion (type 705) cutterbar service repair manualClaas v750 v540 vario lexion (type 705) cutterbar service repair manual
Claas v750 v540 vario lexion (type 705) cutterbar service repair manual
jfdjskmdmme
 
OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...
OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...
OpenIot & ELC Europe 2016 Berlin - How to develop the ARM 64bit board, Samsun...
Chanwoo Choi
 
Аварийный дамп – чёрный ящик упавшей JVM. Андрей Паньгин
Аварийный дамп – чёрный ящик упавшей JVM. Андрей ПаньгинАварийный дамп – чёрный ящик упавшей JVM. Андрей Паньгин
Аварийный дамп – чёрный ящик упавшей JVM. Андрей Паньгин
odnoklassniki.ru
 
Big Data mit Microsoft?
Big Data mit Microsoft?Big Data mit Microsoft?
Big Data mit Microsoft?
Olivia Klose
 

Exploiting hash collisions

  • 1. Exploiting hash collisions Ange Albertini BlackAlps 2017 Switzerland identical prefix
  • 2. This is not a crypto talk. It’s about exploiting hash collisions, (the weakest ones, w/ identical prefix) via manipulating file formats. You may want to watch Marc Stevens’ talk at CRYPTO17. All opinions expressed during this presentation are mine and not endorsed by any of my employers, present or past. DISCLAIMERS
  • 3. Nothing groundbreaking. No new vulnerability. Just a look behind the scenes of Shattered-like research (format-wise). OTOH there are very few talks on the topic AFAIK. TL;DR
  • 4. 2014: Malicious SHA1 - modified SHA1 2015-2017: Shattered - SHA1 2017: PoC||GTFO 0x14 - MD5 This talk is about... MalSha1
  • 5. ● Identical prefix ○ 2 files starting with same data ● Chosen prefix ○ 2 files starting with different (chosen) data ● Second preimage attack ○ Find data to match another data's hash ● Preimage attack ○ Find data to match hash Types of collision From here on, hash collision = IPC = Identical Prefix Collision first, weakest, overlooked Sh*t's broken, yo! Unicorns Dragons MD5: 1992-2004 SHA1: 1995-2005 SHA2: 2001-? SHA3: 2015-?
  • 6. Formal way to present IPCs Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD. X Wang, D Feng, X Lai, H Yu 2004 Not very “visual”!
  • 8. Impact Better than random-looking blocks? Will it convince anyone to deprecate anything? FTR Shattered took 6500 CPU-Yr and 110 GPU-Yr. (that's a lot of computing power)
  • 10. 2004: Dan Kaminsky: MD5 To Be Considered Harmful Someday https://eprint.iacr.org/2004/357.pdf https://dankaminsky.com/2004/12/06/46/ 2004: Ondrej Mikle: Practical Attacks on Digital Signatures Using MD5 Message Digest https://eprint.iacr.org/2004/356.pdf IPC exploits papers ● 2005 Max Gebhardt, Georg Illies, Werner Schindler A Note on the Practical Value of Single Hash Collisions for Special File Formats ● 2014 MalSHA1 Malicious Hashing: Eve’s Variant of SHA-1 Ange Albertini, Jean-Philippe Aumasson, Maria Eichlseder, Florian Mendel, Martin Schläffer ● 2017 Shattered The first collision for full SHA-1 Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov ● 2017 PoC||GTFO 0x14 Greg, spq, Mako, Philippe, Evan2, Ange, Melissa Elliott Slides a6cb4934945457d16bc90ef9ab3c391474fb78cf844c59f34d4505b95fbad5ea Paper ac7a05b4bf456b4358e8a754f5f70612ce593bca1cdb718c2b38e3e280fc1240 Jean-Philippe’s Slides aba7833ed35eb5b44b44377f7054c7318637a8cb5db002c1ac787a5d2314f658 Paper 5c763e295b95ee8c69fd9430eae62fa59d7c9716ada645a93dcc19387e3d6821 Paper a3396362dcc528ed29918c07701e3b5082365a1dc19a9aac8d104c9c3d07c6b2 Marc’s Crypto17 video Elie’s BlackHat Slides 1a17c315a946409e8ef37c56c962987d41377374c15ac0d855e92297b4f03596 file format collaborator instigator
  • 11. Constraints of hash and formats have nothing in common
  • 12. File constraints ● Collision blocks are very complex ⇒ considered random ● Collision blocks only differ by a mask. ○ The mask may be fixed in advance. ● Collision blocks may contain arbitrary values ○ Or bruteforce them. ⇒ craft your files with random blocks and apply mask = <> = Prefix? Block A Suffix Prefix? Block B Suffix
  • 13. Where the magic happens: random stuff + mask
    Collision blocks, File A:
      7F 46 DC 93 A6 B6 7E 01 3B 02 9A AA 1D B2 56 0B
      45 CA 67 D6 88 C7 F8 4B 8C 4C 79 1F E0 2B 3D F6
      14 F8 6D B1 69 09 01 C5 6B 45 C1 53 0A FE DF B7
      60 38 E9 72 72 2F E7 AD 72 8F 0E 49 04 E0 46 C2
      30 57 0F E9 D4 13 98 AB E1 2E F5 BC 94 2B E3 35
      42 A4 80 2D 98 B5 D7 0F 2A 33 2E C3 7F AC 35 14
      E7 4D DC 0F 2C C1 A8 74 CD 0C 78 30 5A 21 56 64
      61 30 97 89 60 6B D0 BF 3F 98 CD A8 04 46 29 A1
    Collision blocks, File B:
      73 46 DC 91 66 B6 7E 11 8F 02 9A B6 21 B2 56 0F
      F9 CA 67 CC A8 C7 F8 5B A8 4C 79 03 0C 2B 3D E2
      18 F8 6D B3 A9 09 01 D5 DF 45 C1 4F 26 FE DF B3
      DC 38 E9 6A C2 2F E7 BD 72 8F 0E 45 BC E0 46 D2
      3C 57 0F EB 14 13 98 BB 55 2E F5 A0 A8 2B E3 31
      FE A4 80 37 B8 B5 D7 1F 0E 33 2E DF 93 AC 35 00
      EB 4D DC 0D EC C1 A8 64 79 0C 78 2C 76 21 56 60
      DD 30 97 91 D0 6B D0 AF 3F 98 CD A4 BC 46 29 B1
    XOR mask:
      0C 00 00 02 C0 00 00 10 B4 00 00 1C 3C 00 00 04
      BC 00 00 1A 20 00 00 10 24 00 00 1C EC 00 00 14
      0C 00 00 02 C0 00 00 10 B4 00 00 1C 2C 00 00 04
      BC 00 00 18 B0 00 00 10 00 00 00 0C B8 00 00 10
    ⇒ generate one file from the other. These are Shattered examples. That’s a big pile of… randomness :)
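A minimal sketch of "generate one file from the other": XOR the block from File A with the mask and you get the block from File B. The byte values are the first 16 bytes of each dump on the slide; the function name `apply_mask` is made up for illustration.

```python
# First 16 bytes of the Shattered collision blocks and XOR mask (from the slide).
BLOCK_A = bytes.fromhex("7F 46 DC 93 A6 B6 7E 01 3B 02 9A AA 1D B2 56 0B")
BLOCK_B = bytes.fromhex("73 46 DC 91 66 B6 7E 11 8F 02 9A B6 21 B2 56 0F")
XOR_MASK = bytes.fromhex("0C 00 00 02 C0 00 00 10 B4 00 00 1C 3C 00 00 04")


def apply_mask(block: bytes, mask: bytes) -> bytes:
    """XOR a block with a mask of the same length (hypothetical helper)."""
    if len(block) != len(mask):
        raise ValueError("block and mask must have the same length")
    return bytes(b ^ m for b, m in zip(block, mask))


# XOR is its own inverse, so the same mask maps A to B and B back to A.
derived_b = apply_mask(BLOCK_A, XOR_MASK)
derived_a = apply_mask(BLOCK_B, XOR_MASK)
print(derived_b == BLOCK_B, derived_a == BLOCK_A)
```

Only the blocks differ between the two files; the prefix and suffix are byte-for-byte identical.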
  • 14. Prefix and masks determine how easily it's exploitable.
    Stevens13: SHA1, 6610 Yr Jump. Very expensive, but trivial to exploit.
      .X .. .. .X X. .. .. X. XX .. .. XX XX .. .. .X
      XX .. .. XX X. .. .. X. XX .. .. XX XX .. .. XX
      .X .. .. .X X. .. .. X. XX .. .. XX XX .. .. .X
      XX .. .. XX X. .. .. X. .. .. .. .X XX .. .. X.
    FastColl: MD5, ~1s. Instant, but very restrictive → bruteforce.
      .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
      .. .. .. X. .. .. .. .. .. .. .. .. .. .. .. ..
      .. .. .. .. .. .. .. .. .. .. .. .. .. X. .. ..
      .. .. .. .. .. .. .. .. .. .. .. X. .. .. .. ..
  • 15. 2 MD5 collisions from HashClash (2 min) with different masks. Same hash, different masks.
    Collision 1, first file (first row reads "- B.ackAlps'17 -"):
      2D 20 42 EC 61 63 6B 41 6C 70 73 27 31 37 20 2D
      01 4D 80 6F 5B CB C0 AE 3D 33 52 3D EA 0B 01 93
      5A 58 58 DB 51 B3 32 B4 F6 17 99 75 62 B8 D3 BD
      58 A3 EE A3 7C 22 0D 10 56 7F 4A D6 EF 58 C9 1F
      24 60 25 1F 4A E9 FC F5 55 67 B7 A9 E3 54 C5 72
      0A A8 05 D6 6C 79 21 85 0A 75 38 D9 C6 D9 01 51
      BD C3 19 F5 32 F5 EC 99 15 AC 91 9F CF BE BD CE
      E1 2B 75 20 CB D9 76 F5 F6 96 5B 89 3E 8B 10 E0
    Collision 2, first file (first row reads "- BlackAlps'17 -"):
      2D 20 42 6C 61 63 6B 41 6C 70 73 27 31 37 20 2D
      CA 99 ED 4A 7A 59 10 F6 6C 10 5B 71 B0 80 65 5D
      87 07 94 73 71 1F 07 B2 B5 84 12 96 BD 1D 03 2C
      E7 09 25 96 6E 0B 02 FD 96 9A 54 32 EB 15 FC F1
      D7 DF 52 10 C4 35 29 0A 5B 9A 93 40 34 5C 35 4C
      D7 AA 9E 83 16 F3 8C 61 E0 44 5C F0 4C DE F7 1C
      16 D1 F7 49 B4 D4 EE 9E 65 D5 B6 7F B6 31 27 1E
      8B 0A F7 3D E7 42 B5 64 BC 1E 2A 97 64 EA F7 F2
    Collision 2, second file (first row reads "- BlackAlqs'17 -"):
      2D 20 42 6C 61 63 6B 41 6C 71 73 27 31 37 20 2D
      CA 99 ED 4A 7A 59 10 F6 6C 10 5B 71 B0 80 65 5D
      87 07 94 73 71 1F 07 B2 B5 84 12 96 BD 1D 03 2C
      E7 09 25 96 6E 0B 02 FD 96 9A 54 32 EB 15 FC F1
      D7 DF 52 10 C4 35 29 0A 5B 99 93 40 34 5C 35 4C
      D7 AA 9E 83 16 F3 8C 61 E0 44 5C F0 4C DE F7 1C
      16 D1 F7 49 B4 D4 EE 9E 65 D5 B6 7F B6 31 27 1E
      8B 0A F7 3D E7 42 B5 64 BC 1E 2A 97 64 EA F7 F2
    Collision 1, second file (first row reads "- BlackAlps'17 -"):
      2D 20 42 6C 61 63 6B 41 6C 70 73 27 31 37 20 2D
      01 4D 80 6F 5B CB C0 AE 3D 33 52 BD EA 0B 01 93
      5A 58 58 DB 51 B3 32 B4 F6 17 99 75 62 B8 D3 BD
      58 A3 EE A3 7C 22 0D 08 56 7F 4A D6 EF 58 C9 1F
      24 60 25 9F 4A E9 FC F5 55 67 B7 A9 E3 54 C5 72
      0A A8 05 D6 6C 79 21 85 0A 75 38 59 C6 D9 01 51
      BD C3 19 F5 32 F5 EC 99 15 AC 91 9F CF BE BD CE
      E1 2B 75 20 CB D9 76 FD F6 96 5B 89 3E 8B 10 E0
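To see which mask a given pair uses, diff the two files byte by byte. A small sketch, using only the first 16-byte row of the "BlackAlps" / "BlackAlqs" pair from the slide; `xor_diff` is a hypothetical helper name.

```python
# First row of the "BlackAlps" / "BlackAlqs" pair, copied from the slide.
ROW_A = bytes.fromhex("2D 20 42 6C 61 63 6B 41 6C 70 73 27 31 37 20 2D")
ROW_B = bytes.fromhex("2D 20 42 6C 61 63 6B 41 6C 71 73 27 31 37 20 2D")


def xor_diff(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (offset, xor_value) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(ROW_A.decode("ascii"))   # the readable prefix text of the first file
print(ROW_B.decode("ascii"))   # same text with one letter changed by the mask
print(xor_diff(ROW_A, ROW_B))  # offsets and XOR values of the differing bytes
```

Running the same diff over the full 128 bytes of each pair gives the complete mask. It does not check that the MD5 digests match; that still needs a real hash computation.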
  • 17. ● Get collision block ignored (commented out) ● File suffix/separate executable contains code ○ Checks the block values or uses block as decryption key. ⇒ Collision block == passive data Collision blocks (commented out) Code (checking block values) If-then-else (data) Works with many script languages
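The if-then-else idea, sketched in Python: the code in the shared suffix looks at one byte of the (otherwise ignored) collision block and branches on it. The offset, the threshold and the payload names are all made up for illustration.

```python
def pick_branch(file_bytes: bytes, offset: int, threshold: int = 0x80) -> str:
    """Choose a behaviour from one byte of the collision block.

    `offset` and `threshold` are hypothetical: in a real pair they are picked
    so that the byte falls on opposite sides of the threshold in each file.
    """
    return "payload_a" if file_bytes[offset] >= threshold else "payload_b"


# Two toy "files": identical except for one byte inside the block.
file_a = b"#" + bytes([0x10, 0x20, 0x91, 0x40]) + b"\n<shared code>"
file_b = b"#" + bytes([0x10, 0x20, 0x11, 0x40]) + b"\n<shared code>"

print(pick_branch(file_a, offset=3))
print(pick_branch(file_b, offset=3))
```

The code itself is identical in both files; only the data it reads differs, which is why the collision block can stay passive.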
  • 18. Code ● Prefix or bruteforcing sets up some opcodes ● 2 target addresses in the collision blocks ● 2 code snippets in suffix Blocks Payload 1 Payload 2 Jump 1 Good Bad Jump 2 Good Bad Only needs a few bytes. X86 jump = EB xx. But no real-life consequences :( Suffix
  • 19. ● Prefix or bruteforcing sets up a header ● Collision blocks alter a value, to make parsers ignore the rest of the blocks and land at different offsets. See MD5 rogue certificates w/ chosen-prefix. Prefix (declares a header) Collision blocks (changes header value) Data (contains 2 data sets) Format (structure)
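A short x86 jump is `EB xx`, where `xx` is a signed 8-bit displacement counted from the end of the 2-byte instruction. A sketch of where two different displacement bytes would land; the offsets and displacement values below are invented for the example.

```python
def short_jump_target(jump_offset: int, rel8: int) -> int:
    """Target of an x86 short jump (EB xx) located at jump_offset.

    rel8 is the raw displacement byte; it is sign-extended and added to the
    address of the next instruction (jump_offset + 2).
    """
    displacement = rel8 - 0x100 if rel8 >= 0x80 else rel8
    return jump_offset + 2 + displacement


# Hypothetical pair: the collision blocks differ in the displacement byte only.
JUMP_AT = 0x40
target_1 = short_jump_target(JUMP_AT, 0x20)  # would hold payload 1
target_2 = short_jump_target(JUMP_AT, 0x60)  # would hold payload 2
print(hex(target_1), hex(target_2))
```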
  • 20. Concatenation With a top-down file format that can start at any offset (Rar, 7z…) 1. Collision blocks end with signature's start. ○ w/ a difference on that byte. 2. Append a file minus its first byte. 3. Append another file of the same type. Coll. Blocks RAR File 1 RAR File 2 .. .. .. R ar!<file> Rar!<file> .. .. .. ? ar!<file> Rar!<file> One letter is enough (ZIP is bottom-up)
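The concatenation trick, sketched with toy data: a scanner that looks for the archive signature anywhere in the file finds archive 1 in one file and archive 2 in the other, because one byte of the first signature is broken. The signature prefix `Rar!\x1a\x07` is the real start of RAR's magic; the block contents and archive bodies are placeholders, and real extractors do more than a plain `find`.

```python
RAR_MAGIC = b"Rar!\x1a\x07"  # common start of the RAR signature


def first_archive_offset(data: bytes) -> int:
    """Offset of the first RAR signature, or -1 (simplified scanner)."""
    return data.find(RAR_MAGIC)


# Placeholder collision blocks: same length, differing only in their last byte.
blocks_a = b"\x00" * 15 + b"R"
blocks_b = b"\x00" * 15 + b"S"

# Archive 1 without its first byte, then a complete archive 2 (fake bodies).
archive_1_tail = RAR_MAGIC[1:] + b"<file 1>"
archive_2 = RAR_MAGIC + b"<file 2>"

file_a = blocks_a + archive_1_tail + archive_2
file_b = blocks_b + archive_1_tail + archive_2

print(first_archive_offset(file_a))  # lands on archive 1
print(first_archive_offset(file_b))  # skips to archive 2
```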
  • 21. Find a way to get 2 files despite the randomness. Prefix. Randomness. Collision block masks. QA Write your prefix. Insert totally random data. Apply mask. General goal Test files, on all tools. (meaningful)
  • 22. Format target ● Something universally used. ○ Preferably multi-platform ⇒ executables ○ By end-users, not just developers. ○ Preferably, something with crypto! (certificates are pretty restrictive) ● With as few parsers in the wild as possible. Visual documents: JPEG, PNG, GIF, PDF...
  • 23. Validity. Compatibility. Correct rendering. Reusability. Test, test, test! Ever dance with the specs by the pale moonlight? Explore all code paths, all header values. Corner cases FTW Challenges
  • 24. 2005: Gebhardt et al. ● If-then-else exploits ○ PostScript ○ PDF ○ TIFF ○ Word 97
    Word97 macro:
      Sub collision()
          Dim b(512) As Byte
          FName$ = ActiveDocument.Name
          Open FName$ For Binary Access Read As #1 Len = 512
          Get #1, , b
          Close #1
          'the price 1000$ is contained in 2nd line of
          'the .doc file; that line is selected by
          'the Selection .. Count:=2 command
          If b(147) >= 128 Then
              Selection.Collapse Direction:=wdCollapseStart
              Selection.GoTo What:=wdGoToLine, Which:=wdGoToAbsolute, Count:=2
              Selection.MoveRight Unit:=wdCharacter, Count:=1
              Selection.Find.ClearFormatting
              With Selection.Find
                  .Text = "$"
                  .Forward = True
                  .Wrap = wdFindContinue
                  .Format = False
                  .MatchWholeWord = False
                  .MatchWildcards = False
                  .MatchSoundsLike = False
                  .MatchAllWordForms = False
              End With
              Selection.Find.Execute
              Selection.MoveLeft Unit:=wdCharacter, Count:=3
              Selection.MoveRight Unit:=wdCharacter, Extend:=wdCharacter
              Selection.Font.ColorIndex = wdWhite
              Selection.GoTo What:=wdGoToLine, Which:=wdGoToAbsolute, Count:=1
              Selection.Collapse Direction:=wdCollapseEnd
          End If
          'by the Selection .. Count:=1 command
          'the cursor returns to the first character
          'in the text (disguise of attack)
      End Sub
  • 25. No widespread scripting language in PDF: ● JavaScript/FormCalc reliably only in Adobe Reader Only binary-based conditional function: ● PostScript Calculator (Type 4) functions PDF features and landscape << /FunctionType 4 /Domain [0.0 1.0] /Range [0.0 1.0] /Length 28 >> stream {255 mul 121 sub 1 exch sub} endstream depends on the collision block
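To see why this Type 4 function acts as a conditional, here is a tiny stack evaluator for the three operators it uses. It is a sketch: it assumes the input is a byte value scaled to [0, 1] (byte / 255), which the slide does not spell out, and it only clamps the output to /Range.

```python
PROGRAM = "255 mul 121 sub 1 exch sub"  # body of the function on the slide


def run_calculator(program: str, x: float) -> float:
    """Evaluate a PostScript calculator body limited to mul, sub and exch."""
    stack = [x]
    for token in program.split():
        if token == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif token == "sub":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)  # num1 num2 sub -> num1 - num2
        elif token == "exch":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            stack.append(float(token))
    return stack.pop()


def type4_function(x: float) -> float:
    """Apply the program, then clamp the result to /Range [0.0 1.0]."""
    return max(0.0, min(1.0, run_calculator(PROGRAM, x)))


for byte in (0, 121, 122, 255):
    print(byte, type4_function(byte / 255))
```

The body computes 1 - (255x - 121), so after clamping the output is (almost exactly) 1 for bytes up to 121 and 0 from 122 upward: a threshold test on one byte.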
  • 26. ● Poorly supported across readers. ● Limited to 2 non-overlapping objects ⇒ reliable but limited for payload and compatibility Not good enough REJECTED Only OK in Adobe No full control
  • 27. 2014: MalSHA1 ● Very restrictive: no prefix !!! ⇒ very simple collisions ● 30-50h on 80 cores: Many retries are possible, but unclear collision mask. ● If then else: Shell script ● Concatenation: RAR, 7z ● Code: Master Boot Record ● Format: JPEG ● Polyglot: all in the same file! #‘4@ ØM¦ÓTá+¸…[Gx&½ý7+îæP,uKW8¿Ø¥à²D”Q*Í6¢þ⊟2U™ª´zí‚ if [ `od -t x1 -j3 -N1 -An "${0}"` -eq "91" ]; then echo " (__)\n (oo)\n /-------/\n / | ||\n* ||----||\n ^^ ^^"; else echo "Hello World."; fi
  • 28. ● Can’t control 4 bytes in a row. ⇒ many file formats aren’t usable ● Windows Executable? (magic = “MZ”) Would end up with a huge e_lfanew (a header offset, not a memory pointer). Max value in practice: 0x9000000 (about 150 MB) MalSHA1 failures
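A sketch of the e_lfanew problem: the field is a 32-bit little-endian offset at 0x3C in the MZ header, so if those 4 bytes are effectively random, the PE header is "located" far beyond any practical file size. The 0x9000000 limit is the one quoted on the slide; the random-looking bytes below are arbitrary.

```python
import struct

E_LFANEW_OFFSET = 0x3C              # position of e_lfanew in the DOS header
MAX_PRACTICAL_E_LFANEW = 0x9000000  # practical maximum quoted on the slide


def read_e_lfanew(header: bytes) -> int:
    """Read e_lfanew (offset of the PE header) from an MZ header."""
    if header[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    return struct.unpack_from("<I", header, E_LFANEW_OFFSET)[0]


padding = b"\x00" * (E_LFANEW_OFFSET - 2)
sane_header = b"MZ" + padding + struct.pack("<I", 0x80)
random_header = b"MZ" + padding + bytes([0x12, 0x34, 0x56, 0x78])

for header in (sane_header, random_header):
    value = read_e_lfanew(header)
    print(hex(value), "ok" if value <= MAX_PRACTICAL_E_LFANEW else "too big")
```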
  • 29. A primer on JPEG. Signature: FF D8 (very short). Segment structure: all segments start with FF followed by a marker byte (an FF inside data is always followed by 00). Garbage? Skip until next FF! Big-endian lengths, on 2 bytes. Never too big, never too small. Very "tolerant"
  • 30. 2 images, 1 "comment". A comment (an ignored segment), of variable length. Use another comment to jump over the first image. Make sure not to jump in the blocks: ⇒ 01 xx is optimal.
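The primer above fits in a few lines of Python. This is a simplified walker for illustration, not a full JPEG parser: it stops at the first scan (SOS) or at EOI and ignores details such as fill bytes and restart markers.

```python
import struct


def walk_segments(data: bytes) -> list[tuple[int, bytes]]:
    """List (marker, payload) for the segments that precede the first scan."""
    if data[:2] != b"\xFF\xD8":
        raise ValueError("missing FF D8 signature")
    pos = 2
    segments = []
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            pos += 1  # garbage: skip until the next FF
            continue
        marker = data[pos + 1]
        if marker in (0xD9, 0xDA):  # EOI or start of scan: stop here
            break
        # 2-byte big-endian length; it counts itself but not the FF xx marker.
        (length,) = struct.unpack_from(">H", data, pos + 2)
        segments.append((marker, data[pos + 4:pos + 2 + length]))
        pos += 2 + length
    return segments


# Signature, two garbage bytes, one comment segment (FF FE), end of image.
sample = b"\xFF\xD8" + b"\x00\x00" + b"\xFF\xFE\x00\x04hi" + b"\xFF\xD9"
print(walk_segments(sample))
```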
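A sketch of the arithmetic behind "01 xx": the comment length is 2 big-endian bytes, and the low byte sits in the collision block, so the two files skip to different offsets. The values 7F and 73 are the first differing bytes of the Shattered blocks shown on slide 13; treating them as the low length byte with 01 as the high byte is an assumption made for this example, as is the marker offset of 0.

```python
COLLISION_BLOCKS_SIZE = 128  # two 64-byte blocks


def comment_end(marker_offset: int, length_hi: int, length_lo: int) -> int:
    """Offset just past a comment segment whose FF FE marker is at marker_offset."""
    length = (length_hi << 8) | length_lo  # big-endian, includes the 2 length bytes
    return marker_offset + 2 + length


end_a = comment_end(0, 0x01, 0x7F)
end_b = comment_end(0, 0x01, 0x73)
print(hex(end_a), hex(end_b), end_a - end_b)

# With 01 as the high byte the comment is at least 0x100 bytes long,
# enough to clear the collision blocks whatever the low byte turns out to be.
print((0x01 << 8) > COLLISION_BLOCKS_SIZE)
```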
  • 33. Can't combine JPEG and MBR: FF D8 is an invalid opcode. Polyglots: a single pair with several use cases.
  • 35. 2015: Implementing Stevens13 1. Research file trick 2. Implement attack 3. Craft files
  • 36. Stevens13 compared with MalSHA1 - Complex computation - Expensive computation + Prefix - Totally random blocks + Fixed mask + Blocks start with a difference Never tried before: (can't be interrupted/tweaked) One. single. try. Constraints-- constraints++ reliability++ reliability++
  • 37. 1. Research file trick ● MalSHA1's JPEG trick would work. ● We'd like a new trick. PDF? ○ Nothing existing versatile so far. ○ Experiments with PDF (XREF, object numbers) ■ Never works reliably across all readers. ● No SHA1 collision at this stage - hard to get traction. At this stage it's still only a set of weird file constraints.
  • 38. If you're not familiar with PDF... ...with my vision of PDF!
  • 40. working PDFs
    First PDF (draws the text "Chrome"):
      %PDF
      1 0 obj
      << /Pages << /Kids [ << /Contents 2 0 R >> ] >> >>
      2 0 obj
      <<>>
      stream
      95 Tf 20 400 Td (Chrome) Tj
      endstream
      trailer << /Root 1 0 R >>
    Second PDF (draws the text "FireFox"):
      1 0 obj
      << /Resources << /Font << /F1 << /BaseFont /Arial /Subtype /Type1 >> >> >>
      /Contents << >>
      stream
      /F1 170 Tf 10 400 Td (FireFox) Tj
      endstream
      >>
      endobj
      xref
      %trailer << /Root << /Pages << /Kids [1 0 R] /Count 1>> >> >>
  • 41. 1 0 obj << /Resources << /Font << /F1 << /BaseFont /Arial /Subtype /Type1 >> >> >> /Contents << >> stream /F1 170 Tf 10 400 Td (FireFox) Tj endstream >> endobj xref %trailer << /Root << /Pages << /Kids [1 0 R] /Count 1>> >> >> no signature no /Type inline /Contents no /Length empty XREF Direct /Root no /Size no /Type no startxref no %%EOF %PDF 1 0 obj << /Pages << /Kids [ << /Contents 2 0 R >> ] >> >> 2 0 obj <<>> stream 95 Tf 20 400 Td (Chrome) Tj endstream trailer << /Root 1 0 R >> truncated signature direct /Kids no /Type no /Font no /Count no /Type no /Resources no endobj no /Length no XREF Direct /Root no /Size no /Type no startxref no %%EOF no BT/ET no font reference INVALID? INVALID? no /Parent comment no /Parent no BT/ET no endobj
  • 43. I made extreme PDFs for each reader by hand.
  • 44. These extreme PDFs fail on any other reader.
  • 45. The devil is in the detail ● All PDF parsers have their weirdness ○ Does it work? Does it display, behave normally? ○ A trick on one PDF reader is easy, but a reliable trick for all of them is hard. Examples: ● Preview is stricter about JPEG structures. But it created some funky ghost JPEGs :) ● OTOH it's less compatible with gradients. ● An unusual JPEG in a PDF can easily reboot a Kindle. ● A complex JPEG can take minutes to load. ● A crazy JPEG in a PDF displays glitches in Adobe. Glitches in Adobe
  • 47. 2015: PDF is tricky... ● A PDF trick with total compatibility...? ○ With doc-level control? (not just a glitch) ● Eventually… JPEG in a PDF: ○ PDF embeds entire JPEG files ○ Image parameters can be referenced ○ Reliable ■ No possible error ■ "Sane" PoCs - very little overhead ○ Reusable After the collision blocks, so no restrictions on dimensions!
  • 48. Pushing the limits of our JPEG trick PDF are usually documents. We wanted fake documents! The first image has to be jumped over.
  • 49. Only 393x438 px in 90% quality ⇒ 55Kb Yet already near limit! Current limit: Size(Image) < 64Kb Good for a photo, not for a doc!
  • 50. 2 comments per segment
  • 51. The scan length only concerns the start! The ECS grows with the file, and is not limited to 64Kb!
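The 64Kb limit comes from the 16-bit length field of each JPEG segment, while the entropy-coded segment (ECS) that follows a Start of Scan has no length field at all. A minimal Python sketch of a segment walker makes the difference visible. The function name `jpeg_segments` is made up for illustration, and the parser assumes a well-formed file with no fill bytes between markers.

```python
import struct

def jpeg_segments(data: bytes):
    """Yield (marker, declared_length, ecs_length) for each JPEG segment.

    declared_length is the 16-bit big-endian length field (it counts itself).
    ecs_length is the size of the entropy-coded data after an SOS segment,
    which no field declares, so it is found by scanning for the next marker.
    """
    assert data[:2] == b"\xff\xd8", "missing SOI marker"
    pos = 2
    while pos + 4 <= len(data):
        assert data[pos] == 0xFF, "expected a marker"
        marker = data[pos + 1]
        if marker == 0xD9:  # EOI
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
        ecs = 0
        if marker == 0xDA:  # SOS: scan data runs until the next real marker
            start = pos
            while pos + 1 < len(data):
                nxt = data[pos + 1]
                # FF 00 is a stuffed byte and FF D0..D7 are restart markers:
                # both belong to the scan data.
                if data[pos] == 0xFF and nxt != 0x00 and not 0xD0 <= nxt <= 0xD7:
                    break
                pos += 1
            ecs = pos - start
        yield marker, length, ecs
```

On a file like the 1024x740 example, one would expect every declared length to stay below 65536 while the ECS after the single SOS is far larger.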
  • 52. 1024x740 Q.100% ⇒ 228 Kb a single scan of 227 Kb!
  • 54. Each scan increases definition ⇒ progressive file, smaller scans
  • 56. libJPEG's JPEGTran & wizard.doc $ jpegtran --help usage: jpegtran [switches] [inputfile] Switches (names may be abbreviated): -copy none Copy no extra markers from source file -copy comments Copy only comment markers (default) -copy all Copy all extra markers -optimize Optimize Huffman table (smaller file, but slow compression) -progressive Create progressive JPEG file Switches for modifying the image: -grayscale Reduce to grayscale (omit color data) -flip [horizontal|vertical] Mirror image (left-right or top-bottom) -rotate [90|180|270] Rotate image (degrees clockwise) -transpose Transpose image -transverse Transverse transpose image -trim Drop non-transformable edge blocks -cut WxH+X+Y Cut out a subset of the image Switches for advanced users: -restart N Set restart interval in rows, or in blocks with B -maxmemory N Maximum memory to use (in kbytes) -outfile name Specify name for output file -verbose or -debug Emit debug output Switches for wizards: -scans file Create multi-scan JPEG per script file http://libjpeg.cvs.sourceforge.net/viewvc/libjpeg/libjpeg/wizard.doc?content-type=text%2Fplain Advanced usage instructions for the Independent JPEG Group's JPEG software ========================================================================== This file describes cjpeg's "switches for wizards". The "wizard" switches are intended for experimentation with JPEG by persons who are reasonably knowledgeable about the JPEG standard. If you don't know what you are doing, DON'T USE THESE SWITCHES. You'll likely produce files with worse image quality and/or poorer compression than you'd get from the default settings. Furthermore, these switches must be used with caution when making files intended for general use, because not all JPEG decoders will support unusual JPEG parameter settings. 
Quantization Table Adjustment ----------------------------- Ordinarily, cjpeg starts with a default set of tables (the same ones given as examples in the JPEG standard) and scales them up or down according to the -quality setting. The details of the scaling algorithm can be found in jcparam.c. At very low quality settings, some quantization table entries can get scaled up to values exceeding 255. Although 2-byte quantization values are supported by the IJG software, this feature is not in baseline JPEG and is not supported by all implementations. If you need to ensure wide compatibility of low-quality files, you can constrain the scaled quantization values to no more than 255 by giving the -baseline switch. Note that use of -baseline will result in poorer quality for the same file size, since more bits than necessary are expended on higher AC coefficients. You can substitute a different set of quantization values by using the -qtables switch: -qtables file Use the quantization tables given in the named file.
  • 57. Custom scans Use JPEGTran to tweak scans and make them smaller than 64Kb. Wizardry is hard: ● JPEGTran is inconsistent ● The documentation's examples are broken.
  • 58. 0: 0-0, 0, 0; 0: 1-1, 0, 0; 0: 2-6, 0, 0; 0: 7-10, 0, 0; 0: 11-13, 0, 0; 0: 14-20, 0, 0; 0: 21-26, 0, 0; 0: 27-32, 0, 0; 0: 33-40, 0, 0; 0: 41-48, 0, 0; 0: 49-54, 0, 0; 0: 55-63, 0, 0; 1: 0-0, 0, 0; 1: 1-16, 0, 0; 1: 17-32, 0, 0; 1: 33-63, 0, 0; 2: 0-0, 0, 0; 2: 1-16, 0, 0; 2: 17-32, 0, 0; 2: 33-63, 0, 0; 1944x2508 100%, 860 Kb ⇒ 20 scans. Syntax: component: byte min-max, bit min, bit max; Making a big image fit w/ custom scan definitions. Few colors
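The scan definitions above follow a regular pattern, so they can be generated rather than typed. This is a hedged Python sketch: `scan_script`, `covers_all`, `LUMA` and `CHROMA` are hypothetical names, the ranges are copied from the slide, and the two bit fields are left at 0 as on the slide.

```python
def scan_script(bands):
    """Render {component: [(first, last), ...]} as scan script text."""
    lines = []
    for component, ranges in sorted(bands.items()):
        for first, last in ranges:
            lines.append(f"{component}: {first}-{last}, 0, 0;")
    return "\n".join(lines) + "\n"

def covers_all(ranges):
    """True if the ranges are contiguous and cover indexes 0..63 exactly."""
    expected = 0
    for first, last in ranges:
        if first != expected or last < first:
            return False
        expected = last + 1
    return expected == 64

# Ranges copied from the slide: 12 scans for component 0, 4 each for 1 and 2.
LUMA = [(0, 0), (1, 1), (2, 6), (7, 10), (11, 13), (14, 20),
        (21, 26), (27, 32), (33, 40), (41, 48), (49, 54), (55, 63)]
CHROMA = [(0, 0), (1, 16), (17, 32), (33, 63)]

SCRIPT = scan_script({0: LUMA, 1: CHROMA, 2: CHROMA})
```

Saved to a file, the text would be passed to JPEGTran with the `-scans file` switch listed in its help output. Splitting a component into more ranges is the knob that keeps each scan small.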
  • 59. Limitations? LibJPEG has a limit of 100 scans. On writing. Not on reading ;) ⇒ we could release a multi-page doc, but it's giving mobiles a hard time.
  • 60. Shattered: It's a JPEG in a PDF ● We still want a PDF file! ● PDF header, declare image ● Reference all /Image parameters after the file data. ○ After the collision blocks ● Put 2 images contents ○ With the same parameters, unlike MalSHA1 ● Put image parameters values ● Finalize PDF file. colors, dimensions...
  • 62. 8 brain-year, 100 GPU-year and 6500 CPU-year later... Woohoo! We have a collision! "Here is the file…" More details here
  • 64. Then this happened... I also lost compatibility with Adobe and Safari at some point... I completely lost my... ;)
  • 65. Lessons learned ● Keeping notes and PoCs helps. ● a diary and a log of command lines might seem overkill… ...but it really helps! (Especially as readers have been updated in the meantime!)
  • 66. Shattered is real With 0 bug reported! nominated for Péter Szőr award best crypto attack best CRYPTO17 paper
  • 69. ● CVE-2005-4900 updated :) ● It broke SVN in practice! ○ SHA1 for deduplication ○ MD5 for integrity ● BitErrant ○ BitTorrent uses SHA1 for file chunks Impact … Checksum mismatch: shattered-2.pdf expected: 5bd9d8cabc46041579a311230539b8d1 got: ee4aa52b139d925f8d8884402b0a750c … "SHA-1 is not collision resistant..."
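The SVN failure comes from two hashes disagreeing about the same pair of files: identical SHA1 for deduplication, different MD5 for integrity. A small Python sketch of that check follows. `compare_digests` is an illustrative helper, not anything SVN ships.

```python
import hashlib

def compare_digests(a: bytes, b: bytes) -> dict:
    """Report, per hash function, whether two byte strings share a digest."""
    return {
        name: hashlib.new(name, a).digest() == hashlib.new(name, b).digest()
        for name in ("sha1", "md5")
    }

def looks_like_sha1_collision(a: bytes, b: bytes) -> bool:
    """Different contents, same SHA1: the case that confuses deduplication."""
    return a != b and compare_digests(a, b)["sha1"]
```

Fed the contents of the two Shattered PDFs, `compare_digests` should report a SHA1 match and an MD5 mismatch, which is the situation behind the checksum error quoted above.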
  • 70. ● PoCs generators ○ simple within 5 hours (!) ○ advanced ● HTML collision ● Used in Boston Key Party CTF, 50 pts ● Bitcoin bounty claimed ;) [2.8K€] Internet does its thing... first public PoCs FLAG{AfterThursdayWeHadToReduceThePointValue}
  • 71. Enthusiast feedback ● Bruce Schneier Yes, this brute-force example has its own website. ● Linus Torvalds ...in a project like git, the hash isn't used for "trust". ● John Gilmore Linus [...] wired assumptions about SHA1 deeply into git. ● Robert J. Hansen [OpenPGP, 2013] Scaremongering about crypto is one of the quickest ways to make me angry.
  • 72. We can do more It's not just about full-page pictures.
  • 73. It's not just full-page pictures ● It's a standard PDF document, with a 'bipolar' JPEG. ● Any PDF element can be part of the JPEG. ○ A multi-page doc w/ an image with appended pages. ○ A totally standard doc, with only a few elements replaced.
  • 74. DEMO Notice anything? It's the complete Shattered paper... d3f968d604bf1c31a4b3aaecd0f6b2fad4c33402
  • 76. What's JPEG? ● An image format ● A lossy data storage format (specialized for photos?) ○ PDF takes it too literally: 3 out of 6 readers accept JPEG-stored data for non-image objects, such as page content (rejected by browsers) 1 0 0 RG // color = red 150 w // width 53 53 m // start point 558 558 l // end point B // draw path 53 558 m 558 53 l B =
  • 77. Lossless JPEG? ● Quality 100% ● Grayscale JPEG ⇒ no component mixing Still lossy! ● JPEG is 8x8 block based ⇒ Repeat content lines 8 times. ○ Pad a little to prevent truncation ⇒ Works reliably!
  • 79. 2 sha1-colliding PDFs with vector content stored as lossless JPEG data. Colors via a grayscale image :)
  • 80. Why not both? JPEG as image, JPEG as data... We've seen so far….
  • 81. Lossless data and lossy image ● Pad data to match image width ● Store 8 times to make lossless ● Append image A page content can reference itself No page content terminator :( ⇒ lossy data could fail rendering - YMMV
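The layout step (pad to the image width, store each line 8 times) can be sketched in Python as below. It only prepares the grayscale pixel rows. The JPEG encoding itself is not shown, and whether the bytes survive it is encoder-dependent. `lossless_rows` and its space padding are assumptions made for illustration.

```python
def lossless_rows(data: bytes, width: int, pad: bytes = b" ") -> bytes:
    """Lay data out as grayscale pixel rows, each row repeated 8 times.

    One byte is one pixel. Repeating a row 8 times makes every 8x8 block
    constant vertically, which is the property the slide relies on.
    """
    if width <= 0 or len(pad) != 1:
        raise ValueError("width must be positive and pad a single byte")
    padded_len = -(-len(data) // width) * width  # round up to a full row
    data = data.ljust(padded_len, pad)
    rows = [data[i:i + width] for i in range(0, len(data), width)]
    return b"".join(row * 8 for row in rows)
```

The result describes an image of `width` by `8 * rows` pixels. Padding with spaces is a guess that suits page content, where extra whitespace is harmless.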
  • 82. q 612 0 0 792 0 0 cm /Im1 Do Q 1 0 0 rg BT /F1 90 Tf 10 400 Td (GOLDEN AXE) Tj ET Q Standard Page code + padding showing (itself as) an image Displaying text
  • 83. 2 sha1-colliding PDFs with mixed JPEG (on different readers) de9b4237c940ec4af249f2c80bcd841537f6624c
  • 84. Trivial to detect at file level, tricky to detect at rendering level. Shattered: one pair of blocks, many kinds of PoCs!
  • 85. MD5?! It's already broken! Nothing to see here, right?
  • 86. Multi-collision files Why create only a pair of colliding files when you can create 2^609? 2^609 = 2124551971267068394758352826209874509318372470908127692797776552801614239443408970956650009060917142675557317944986004061386317350610828957638079915066349407775325083341572876126912512 (184 digits)
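The count follows from chaining: each collision in the file doubles the number of variants that share the final hash. A short Python check of the figure, with `colliding_file_count` as an illustrative name:

```python
def colliding_file_count(collisions: int) -> int:
    """Each chained collision offers 2 interchangeable blocks, so k give 2**k files."""
    return 2 ** collisions

count = colliding_file_count(609)
digits = str(count)
print(len(digits), digits[:6], digits[-4:])
```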
  • 87. What's a collision? Variable content, same hash
  • 88. Hashquine Display your own file's hash It's a mental trick: "how do you know the hash in advance?" Make your file's content updatable Without changing the final hash.
  • 89. Fake hashquine Actually a script that computes and displays its own hash Often comes with obfuscation ;)
  • 90. Format hashquine 1 passive collision ⇒ take this file or skip to the next. X collisions ⇒ X+1 versions of the same element. 1. Store multiple versions of visual elements in a chain of collisions. 2. Display the file hash in the file.
  • 91. Data Hashquine 1 collision == 2 alternate contents ⇒ 1 bit of data. Put some code that parses the bits and displays the stored value. More collision efficient than format hashquines, but requires code to be executed. cheating?
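Since one collision stores one bit, a 32-digit MD5 needs 128 of them. The Python sketch below shows only the bookkeeping: which of the two blocks to pick at each collision, and how displayed code would read the value back. The function names and the most-significant-bit-first order are assumptions.

```python
def digest_to_choices(hexdigest: str) -> list:
    """Map a hex digest to one 0/1 block choice per collision, MSB first."""
    nbits = 4 * len(hexdigest)
    value = int(hexdigest, 16)
    return [(value >> (nbits - 1 - i)) & 1 for i in range(nbits)]

def choices_to_digest(choices: list) -> str:
    """Inverse mapping: what the code inside the file would compute and display."""
    value = 0
    for bit in choices:
        value = (value << 1) | bit
    return f"{value:0{len(choices) // 4}x}"
```

In a real hashquine the choices are not stored as a list: the reading code infers each bit from which block of the pair is present in the file.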
  • 92. PostScript by Greg GIFs by spq animated The first ever!
  • 93. As images PDFs by Mako $ pdftotext -q md5text.pdf - 66DA5E07C0FD4C921679A65931FF8393 $ md5sum md5text.pdf 66da5e07c0fd4c921679a65931ff8393 md5text.pdf As text
  • 94. GIF & TIFF, by Rogdham Very nice writeup for GIF bit-hashquine TIFF with writeup, but 4 Gb !
  • 95. PoC||GTFO 0x14 Articles about hashquines. But also a hashquine itself, and a polyglot! by Evan2 and Philippe
  • 96. A LaTeX-generated PDF... ...showing its MD5... (15x32=480 collisions) ...showing the same MD5! (4x32=128 collisions) 608? Mmm, seafood! ...also a NES rom...
  • 97. 1 extra collision ⇒ hidden cover, same MD5. 609!
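The totals can be rebuilt from the two hashquine rules. One reading, stated here as an assumption: the first display is a format hashquine (16 possible hex glyphs per digit, hence 15 collisions each) and the second is a data hashquine (4 bits per digit). In Python:

```python
HEX_DIGITS = 32  # an MD5 is displayed as 32 hex digits

# Format hashquine: X collisions give X+1 versions, so 16 glyphs need 15.
glyph_collisions = (16 - 1) * HEX_DIGITS

# Data hashquine: 1 collision per bit, 4 bits per hex digit.
bit_collisions = 4 * HEX_DIGITS

total = glyph_collisions + bit_collisions
with_hidden_cover = total + 1  # the extra collision for the hidden cover
```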
  • 98. You know a cryptographic hash is really broken when it feels like a fancy fidget spinner. When you generate 609 of its collisions for fun. In total, 9824 collisions were computed for the making of this issue. Thanks Marc! https://www.chrisbathgate.com /
  • 101. PNG Strengths: ● 8 byte signature ● Chunk types after lengths ● 4 byte lengths ● Chunk CRCs Weaknesses: ● Easy to make ignored chunks ● CRC usually ignored
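The strengths and weaknesses listed above map directly onto the chunk layout: a 4-byte length, then a 4-byte type, then the data, then a CRC over type and data. A minimal Python walker, with `png_chunks` as an illustrative name and no handling of truncated files:

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8-byte signature

def png_chunks(data: bytes):
    """Yield (type, length, crc_ok, ancillary) for each chunk of a PNG."""
    assert data[:8] == PNG_SIGNATURE, "not a PNG"
    pos = 8
    while pos + 12 <= len(data):
        # The length comes first, so the type cannot be read without it.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        crc_ok = zlib.crc32(ctype + body) == stored
        # A lowercase first letter marks an ancillary chunk that decoders
        # may skip when they do not recognize it.
        ancillary = bool(ctype[0] & 0x20)
        yield ctype, length, crc_ok, ancillary
        pos += 12 + length
```

A chunk of unknown ancillary type with a wrong CRC is exactly the kind of container that many decoders skip without complaint, which is the weakness the slide points at.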
  • 102. Attack ⇔ format pairing Hash collision attack ⇒ constraints (prefix, mask) File format ⇒ other constraints (structure, compatibility) The same attack can be used with various file formats. A file format trick can be used with different hashes.
  • 103. @arw's HTML colliding pair made with Shattered prefix. PDF ⇒ HTML (also works as polyglot) Mako's PDF Hashquine with MD5 MalSHA1's JPEG trick + Shattered JPEG in PDF trick for SHA1 SHA'1 ⇒ SHA1 ⇒ MD5
  • 104. Why? "It's just a bag of tricks anyway…" "Crypto doesn't care about PoCs..."
  • 105. Attacks rely on PoCs. Attacks convince people to deprecate. You don't get pwned by academic papers, but by their PoCs. A new format trick could benefit MD5, SHA1… or a future attack! In practice, - Shattered generates an infinity of colliding documents, of different kinds. - Shattered broke SVN. Didn't that help?
  • 106. ...the end? ...we still have a few tricks up our sleeves ;)
  • 107. Conclusion ● Hash collision exploitation is a niche domain: weird constraints, unusual challenges & rewards. ● Researching a file format manipulation now could benefit a future cryptographic attack.
  • 108. FWIW (full personal disclosure) ● When I was asked about MalSHA1, I saw no solution. ○ I gave up for a while - I didn't think particularly about JPEG. ● In the meantime, I was challenged to encrypt with AES a JPEG to a JPEG. ⇒ AngeCryption ● With that knowledge, I succeeded for MalSHA1. ● That knowledge was the starting point for Shattered. ○ I gave up at some time on the JPEG optimization aspect. ○ But I kept that fidget spinning playfully. ○ Found my 2 breakthroughs… in very unexpected places ;) Don't give up! Keep that fidget spinning! One more thing
  • 109. " How do you do all this?" ● I thought I lacked discipline. That led me nowhere. ● Just do what makes you giggle like a 3-year old. (that's what playing with file formats does to me). ● Have fun! Eventually you'll get feedback, recognition… ● By then, you'll have no reasons to stop anymore. ● And you'll be happily disciplined by then. Have fun!
  • 110. Thanks for your attention! Questions? Special thanks to Marc & Maria Philippe, Evan, spq, Mako, Greg, Melissa, Elie, Jean-Philippe, and CommitStrip.