Code Obfuscation: Theory and
Practices
Fang Hui
18/3/2014
Outline
• Theoretic Background
• Obfuscation in Practice
• Case Study
– Virtual Machine Based Binary code Obfuscation
– C#/.NET code obfuscation
– Java Byte-code obfuscation on Blackberry
• Obfuscation Tools
• OS Platforms
What is Code Obfuscation
• Obfuscation makes a program “unintelligible” while preserving its
functionality
• Is a form of security through obscurity to make reverse engineering
difficult
• Transform the binary code or compiled .NET / Java code into a form
difficult to understand
• Obfuscated code has the same behavior
• Obfuscators work in the same way as compilers/optimizers
[Figure: P → Obf → Obf(P)]
Goal: Change the program so that it keeps the same I/O
behavior but is impossible to understand
Why Obfuscate?
For Software Protection
Software vendors want to prevent users from
reverse-engineering or tampering with executable
code
For Cryptography
Many applications: fully homomorphic
encryption, private to public key crypto, etc.
Obfuscation vs. Homomorphic
Encryption
• Obfuscation: hide function’s implementation while preserving functionality
• HE: compute on encrypted data; addresses the access control problem
[Figure: Obfuscation: F is obfuscated; evaluating it on x gives F(x) in the clear. Encryption: F is encrypted; evaluating it on x gives F(x) encrypted.]
Theoretical Obfuscator
• An obfuscator is a processor of programs (i.e., Turing
machines or Boolean circuits) which outputs a program with
the same functionality, but unintelligible code
• In other words, an adversary holding the obfuscated
program should learn nothing beyond its input/output behavior, as with
black-box access
– Functional equivalence
– Lightweight (polynomial slowdown)
– Virtual black box
Virtual Black Box Security
• A probabilistic algorithm O is a TM/circuit obfuscator if the
following three conditions hold:
– (Functionality Equivalence) For every TM/circuit P and for every input
x: P(x) = O(P) (x)
– (Polynomial Slowdown) There exists a polynomial q(.) such that for
every TM/circuit P, |O(P)|≤ q(|P|). TMs are additionally required that
for every input x, if P halts after t steps on x then O(P) halts within q(t)
steps on x
– (Virtual Black Box) For any PPT A, there is a PPT oracle machine S such
that for all TM/circuit P:
| Pr[A(O(P)) = 1] – Pr[S^P(1^{|P|}) = 1] | < negl(|P|)
• Does a general-purpose VBB obfuscator exist?
Point Function Obfuscation
• VBB obfuscation is known to be achievable only for very few functions
• Point function
I_x(w) = { 1 if w = x, 0 otherwise }
• Obfuscate I_x with a one-way function f
• Let y = f(x), and publish the program
obf I_x(w) = { 1 if f(w) = y, 0 otherwise }
• Intuitively, obf I_x gives no more advantage than black-box
access to I_x, which relies on the hardness of inverting the
one-way function f [Canetti97]:
adv(obf I_x) = adv(f)
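The point-function construction can be sketched in a few lines. This is only an illustration: SHA-256 stands in for the one-way function f (an assumption, the slides do not name a concrete f), and the names `obfuscate_point_function` and `obf_I_x` are made up for the example.

```python
import hashlib


def obfuscate_point_function(x: bytes):
    """Return a program that tests w == x while storing only y = f(x)."""
    y = hashlib.sha256(x).digest()  # y = f(x); SHA-256 is an assumed stand-in for f

    def obf_I_x(w: bytes) -> int:
        # 1 if f(w) = y, 0 otherwise; x itself never appears in this program
        return 1 if hashlib.sha256(w).digest() == y else 0

    return obf_I_x


check = obfuscate_point_function(b"secret-serial")
print(check(b"secret-serial"))  # input equal to the hidden point
print(check(b"guess"))          # any other input
```

Recovering x from the stored y requires inverting f, which is the intuition behind adv(obf I_x) = adv(f).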
A General-Purpose VBB Obfuscator Does Not Exist
• Proof sketch [Barak01]
– First, construct two functions C and D that
cannot be obfuscated together
– Second, combine C and D into a single
function, giving an ensemble that is
unobfuscatable for Turing machines
– Finally, extend the construction to circuits, for
which the same idea cannot be applied directly
“Proof”
Proof for Turing Machines:
C_{α,β}(x) = β if x = α, 0 otherwise
D_{α,β}(C) = 1 if C(α) = β, 0 otherwise
Intuition:
Given the code of C_{α,β} and D_{α’,β’}, one can compute the output D_{α’,β’}(C_{α,β})
Given only black-box access to C_{α,β} and D_{α’,β’}, one
cannot tell what D_{α’,β’}(C_{α,β}) outputs
F_{α,β}(b,y) = C_{α,β}(y) if b = 0
D_{α,β}(y) if b = 1
Z_{α,β}(b,y) = 0 if b = 0
D_{α,β}(y) if b = 1
With black-box access, F_{α,β} and Z_{α,β} look the same
With non-black-box access:
O(F_{α,β})(1, O(F_{α,β})(0,·)) = 1
O(Z_{α,β})(1, O(Z_{α,β})(0,·)) = 0
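The distinguishing attack can be played out with closures. This is a toy sketch, not the formal proof: a Python callable stands in for "having the code of the program", and `make_pair` and `distinguish` are hypothetical names.

```python
import secrets


def make_pair(alpha: int, beta: int):
    def C(x):            # C(x) = beta if x = alpha, 0 otherwise
        return beta if x == alpha else 0

    def D(prog):         # D(prog) = 1 if prog(alpha) = beta, 0 otherwise
        return 1 if prog(alpha) == beta else 0

    def F(b, y):         # F(0, y) = C(y), F(1, y) = D(y)
        return C(y) if b == 0 else D(y)

    def Z(b, y):         # Z(0, y) = 0,    Z(1, y) = D(y)
        return 0 if b == 0 else D(y)

    return F, Z


def distinguish(prog):
    """Attacker holding the program: feed its own b = 0 half to its b = 1 half."""
    return prog(1, lambda y: prog(0, y))


alpha = secrets.randbits(64)
beta = secrets.randbits(64) | 1  # force beta != 0
F, Z = make_pair(alpha, beta)
print(distinguish(F), distinguish(Z))
```

An attacker restricted to input/output queries would have to guess α to see any difference between F and Z, while `distinguish` needs no knowledge of α or β.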
Indistinguishability Obfuscation
• VBB security is too strong a requirement, because some functions are
inherently unobfuscatable. How about an O() that does “as well as
possible” on every function?
• Indistinguishability obfuscation [Gentry13]
– An obfuscator of this kind conceals as much information
about the source text as any other feasible method would
– an adversary who is given obfuscated versions of two
distinct but equivalent programs—they both compute the
same function—can’t tell which is which
– It is a weaker notion of obfuscation, but it is the best that’s
attainable
VBB vs. IO
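• Virtual Black-Box: For any poly-time learner L, there exists a poly-time
simulator S such that for every (poly-time) program P:
Pr[L(O(P)) = 1] ≈ Pr[S^P(1^{|P|}) = 1]
[Figure: the learner receives O(P) and outputs 0/1; the simulator queries P on inputs x, receives P(x), and outputs 0/1; the two outputs are ≈]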
VBB vs. IO
• Indistinguishability Obfuscation: Obfuscations of equivalent circuits
of the same size should be computationally indistinguishable.
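• For every poly-time learner L, there exists a poly-time simulator S such that for every circuit
C1 and every equivalent C2 (|C1| = |C2|), the distributions L(O(C1)) and
S(C2) are indistinguishable.
[Figure: the learner receives O(C1) and outputs 0/1; the simulator receives C2, evaluates C2(x) on inputs x, and outputs 0/1; the two outputs are ≈]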
Best-Possible Obfuscation
• Inefficient indistinguishability obfuscation is
always possible
• O(C) = lexicographically first circuit computing
the same function as C (canonical form)
• Canonicalization is inefficient (unless P=NP)
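A small sketch of why a canonical form gives indistinguishability at an exponential price. Note the substitution: instead of the lexicographically first circuit, the example uses the full truth table as the canonical form, which is easier to show and equally inefficient. `canonical_form`, `c1` and `c2` are illustrative names.

```python
from itertools import product


def canonical_form(circuit, n_inputs: int) -> tuple:
    """Canonical representation of a Boolean function: its full truth table.

    Needs 2**n_inputs evaluations, so the cost is exponential in the input size.
    """
    return tuple(
        int(bool(circuit(*bits))) for bits in product((0, 1), repeat=n_inputs)
    )


# Two syntactically different but equivalent circuits (De Morgan's law).
c1 = lambda a, b: not (a and b)
c2 = lambda a, b: (not a) or (not b)

print(canonical_form(c1, 2) == canonical_form(c2, 2))
```

Because equivalent circuits map to the identical output, nothing about which circuit was the input can leak.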
Obfuscation in Practice
Obfuscation in Real World
• Copy protection/Licensing
• Software watermarking
• Prevent reverse engineering
– By competitors
– By hackers (e.g., for games)
if (test fails) then exit
else …
Off the Shelf Obfuscators
Code Obfuscation Tools
Language: Obfuscators
• C++
– VMProtect (trial)
– Stunnix C++ Obfuscator (commercial product)
• .NET
– Eazfuscator.NET (free)
– SmartAssembly (commercial license, very powerful; assembly obfuscation + compression)
• Java
– ProGuard (free, open source)
– yGuard (free, open source)
– JShrink
• JavaScript (mostly by applying lexical transformations)
– JavaScript minifier/obfuscator (free)
– Stunnix JavaScript Obfuscator (commercial)
– Shane Ng's obfuscator
Executables
• Compiled executables
– Often built with a higher-level programming language, such as C++,
which is then translated to lower-level assembly/machine
language
– Harder to reverse back to original source because of different
compilers, architectures, and information lost during
compilation
• Interpreted applications
– Such as Java and .NET, require an application virtual machine
(the JVM for Java; the CLR, Common Language Runtime, for .NET)
– Easier to reverse, because the intermediate code that is interpreted or
JIT-compiled is more structured and retains more information about
the program (this is where obfuscation is needed!)
Code Decompilation/Disassembly
• Reconstructs the source code (to some extent)
from the compiled code
– .NET assembly to C#/VB.NET source code
– Java class file/ JAR archive to Java source code
– .exe file to C/C++/Assembler code
• Note that the reconstructed code
– is not always 100% compilable
– loses private identifier names and comments
Code Decompilation Tools
Language: Tools
• .EXE file
– Boomerang Decompiler (outputs C source code)
– IDA Pro (powerful disassembler / debugger)
• C#/.NET
– .NET Reflector (powerful tool, great usability; paid)
– ILSpy (free)
– Two tools that allow modifying the executables: Graywolf and Reflexil
• Java
– DJ Java Decompiler
– JD (JD-Core / JD-GUI / JD-Eclipse)
– JAD
• JavaScript
– JSBeautifier (reconstructing the program and tracing logic)
Code Obfuscation Techniques
• Data obfuscation
• Layout obfuscation
• Control obfuscation
• Preventive transformation
Control Obfuscation Examples
Hide information flow
– Dead code
– Parallelize code
Reverse Engineering: How much can it
do?
• Normal Engineering:
write source code -> compile -> binary code
• Reverse Engineering:
get the binary -> use powerful tools (e.g., IDA Pro,
OllyDbg) to gain knowledge about the program -> learn
the code structure, control flow, and valuable
assets, keys, algorithms
IDA Pro: registers
IDA Pro: control graph
IDA Pro: data structs
Security Level for Obfuscation
• The level of security that an Obfuscator adds
depends on:
– The transformations used
– The power of available deobfuscators
– The amount of resources available to deobfuscators
• Cost tradeoffs
– Obfuscated code is slower due to changes/additions in
its control logic
– An obfuscator is the opposite of a high-quality
compiler/optimizer
Limits of Obfuscation
• Factors that limit the use of obfuscation
– Cost of obfuscation
– Execution time of code
– High program complexity
• Performance evaluation
– Potency: the degree to which a human reader would be
confused by the new code
– Resilience: how well the obfuscated code resists attacks by
deobfuscation tools
– Cost: how much load is added to the application
Virtual Machine Based Binary code
Obfuscation
Virtual Machine Based Obfuscation
• A new approach: Multi-Stage Binary Code
Obfuscation Using Improved Virtual Machine
[Fang11]
– VM based obfuscator
– Code transformation
– Block obfuscation
– Obfuscation key generation
Virtualized Program
• General code structure
– A VM section is appended to the program
– Protected binary code is transformed to byte-code, which is
interpreted by the VM
– Remaining code is kept intact
– The entry point to the protected code is redirected into the VM code
How A VM Obfuscator Works
• Instruction transformation
• The VM saves all registers and flags in its own VM
context
• The VM restores them upon exiting byte-code
interpretation
Instruction transformation
• Intel IA32 instruction format:
Prefix | Opcode | ModR/M | SIB | Disp | Imm
• VM instruction format:
length (8 bits, excluding this byte) | 0xFFFF (16 bits) | opcode (8 bits, so at most 256 opcodes) | related data
[Figure: the original instruction (Prefix, Opcode, …) is mapped to the VM format]
An Example of VM Instruction Set
• 39 types of VM instructions
• For example:
– ret xxxx
– shl eax, 1
I_COND_JMP_SHORT 0x00
I_COND_JMP_LONG 0x01
I_JECX 0x02
I_CALL_REL 0x03
I_VM_END 0x04
I_RET 0x05
I_LOOPxx 0x06
I_VM_MOV_IMM 0x07
I_VM_MOV_REG 0x08
I_VM_ADD_IMM 0x09
I_VM_ADD_REG 0x0A
I_VM_SHL_IMM 0x0B
I_VM_REAL 0x0C
I_JMP_LONG 0x0D
I_JMP_SHORT 0x0E
I_VM_ROL 0x0F
I_VM_ROR 0x10
I_VM_RCL 0x11
I_VM_RCR 0x12
I_VM_SAL 0x13
I_VM_SHL 0x14
I_VM_SHR 0x15
I_VM_SAR 0x16
I_VM_ADD 0x17
I_VM_OR 0x18
I_VM_ADC 0x19
I_VM_SBB 0x1A
I_VM_AND 0x1B
I_VM_SUB 0x1C
I_VM_XOR 0x1D
I_VM_CMP 0x1E
I_VM_MOV 0x1F
I_VM_CALL 0x20
I_VM_JMP 0x21
I_VM_PUSH 0x22
I_VM_POP 0x23
I_VM_RELOC 0x24
I_VM_FAKE_CALL 0x25
I_VM_NOP 0x26
* VM instruction byte-codes are usually permuted
Example encodings (length, 0xFFFF marker, opcode, related data):
5 0xFFFF I_RET xxxx
7 0xFFFF MAKE_SHL_REG
5 0xFFFF MAKE_MOV_IMM
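The encoding and the opcode permutation can be sketched as follows. The field layout (one length byte that does not count itself, the 0xFFFF marker, one opcode byte, then related data) is a reading of the examples above; little-endian byte order, the seeded shuffle and the names `permute_opcodes` and `encode` are assumptions made for the illustration.

```python
import random
import struct

# Subset of the opcode table above, used only to have some mnemonics to permute.
BASE_OPCODES = {"I_VM_END": 0x04, "I_RET": 0x05, "I_VM_NOP": 0x26}


def permute_opcodes(names, seed):
    """Assign each mnemonic a distinct random byte value (hypothetical scheme)."""
    rng = random.Random(seed)
    values = rng.sample(range(256), len(names))
    return dict(zip(names, values))


def encode(opcode: int, data: bytes = b"") -> bytes:
    """length (1 byte, not counting itself) | 0xFFFF | opcode | related data."""
    body = struct.pack("<HB", 0xFFFF, opcode) + data
    return bytes([len(body)]) + body


table = permute_opcodes(sorted(BASE_OPCODES), seed=2014)
ret_insn = encode(table["I_RET"], struct.pack("<H", 0x0008))  # "ret 8"
print(ret_insn.hex())
```

With a 16-bit operand the length byte comes out as 5, matching the `5 0xFFFF I_RET xxxx` line; a different seed gives a different byte for the same mnemonic, which is the effect of permuting the byte-codes.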
An Example of VM Instruction
Transformation
Original instruction: add eax, 1
Transformed code:
call ADD_EAX_1
jmp ADD_EAX_1_END
;; trash code here
ADD_EAX_1:
push ebp
mov ebp, esp
sub esp, 0x08
push ebx
mov dword ptr [ebp-0x04], 5
mov ebx, 4
sub dword ptr [ebp-0x04], ebx ; 5 - 4 = 1
add eax, dword ptr [ebp-0x04] ; eax += 1
pop ebx ; restore ebx
mov esp, ebp
pop ebp
ret
ADD_EAX_1_END:
Classic VM Byte-code Execution
• Stack-based style: save the registers of the native code, and create a separate VM
stack
• The result of the previous bytecode execution is saved in VM
registers (var_RegEip and var_RegDl below) for the next bytecode
execution
00401060 VM_procedure proc near
…
004010BC VM_Entry: ; return here upon completion of each bytecode
004010BC inc [ebp+var_RegEip]
004010BF mov eax, [ebp+var_RegEip]
004010C2 mov al, [eax] ; fetch one byte from pseudo-code
004010C4 mov [ebp+var_RegDl], al
004010C7 mov eax, offset lpJumpAddrTable
004010CC movzx ebx, [ebp+var_RegDl]
004010D0 shl ebx, 2 ; x4
004010D3 add eax, ebx ; look up jmp table
004010D5 jmp dword ptr [eax] ; going to interpretation
004010D5 VM_procedure endp
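The dispatcher above is a classic fetch-decode-dispatch loop. A minimal sketch of the same control structure follows; the three bytecodes (0x00, 0x01, 0x02) and their handlers are hypothetical and are not the instruction set from the slides.

```python
def run(bytecode: bytes, regs: dict) -> dict:
    """Bump the VM instruction pointer, fetch one byte, look up its handler
    in a jump table, and transfer control to the handler."""

    def op_inc(pc):       # hypothetical bytecode 0x01: eax += 1
        regs["eax"] += 1
        return pc

    def op_add_imm(pc):   # hypothetical bytecode 0x02: eax += next byte
        regs["eax"] += bytecode[pc + 1]
        return pc + 1     # skip the operand byte

    jump_table = {0x01: op_inc, 0x02: op_add_imm}
    pc = -1
    while True:
        pc += 1                       # inc [ebp+var_RegEip]
        opcode = bytecode[pc]         # fetch one byte of pseudo-code
        if opcode == 0x00:            # hypothetical "VM end" bytecode
            return regs
        pc = jump_table[opcode](pc)   # jmp dword ptr [eax]


result = run(bytes([0x01, 0x02, 0x05, 0x01, 0x00]), {"eax": 0})
print(result["eax"])
```

Every handler returns to the top of the loop, which is the per-bytecode jump back to VM_Entry seen in the listing.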
Security of Virtualized Program
• The VM no longer restores byte-codes to the original
native code
• Cracking the original software requires two steps:
– Understanding the VM code
– Decoding the mapping between the two instruction sets
• Therefore, after virtualization the security of the original
program largely rests on the VM code
Improving Execution Efficiency of VM
Obfuscator
• Reduce unnecessary jumps during VM interpretation,
by executing a block of bytecodes together instead of
one-by-one
– Previously, return to VM dispatcher every time after
finishing one bytecode interpretation
– Now, choose a “basic block” to execute before jumping
back to VM entry. Specifically, choose code block between
two nearest jmp/jcc/call instructions, and replace by
bytecodes. At the end of last instruction, jump back
How to Improve Execution Efficiency
• Add a new bytecode PI for basic block <instr_1,…,instr_n>
• Write binary code for PI as a whole, with jumping back
• Allow VM dispatcher to handle PI as a bytecode
; original basic block
call XXX
; non-jump instructions
instr_1
instr_2
…
instr_n
jmp YYY
; VM bytecodes (0ABh is the new bytecode PI)
db 14h, 0ABh, 78h
VM_proc proc near
VM_entry:
…
VM_proc endp
…
; implementation of 0ABh
PI_entry:
…
jmp VM_entry
Can We Start “Multi-Stage” by Simply
Repeating VM?
• We would like to design a multi-stage code obfuscator, which may
generate more secure code than relying on a single VM core
• Not as easy as it first seems
– One binary instruction maps to many bytecodes, together with
implementation
– Randomly select bytecodes for each instruction from protected code
– Second round of VM cannot recognize previous bytecode due to
polymorphism
• Existing VM-over-VM can only obfuscate VM engine part not byte-
codes
– One needs to remember the implementation of bytecode, and in second
stage transform implementation into bytecodes again
Multi-stage Obfuscation
• P: original program
• n: # of obfuscation stages
• K = <k_0, k_1, …, k_n>: the keys used for each stage
• Algorithm: Randomly choose n, and iteratively obfuscate program P n
times using key set K. The encryption key k_i is generated from the program P_i of the previous
obfuscation stage, and k_i is then applied to P_i to get P_{i+1}.
– P_0 = P, k_0 = f(P_0) or predefined
– P_1 = Enc(P_0, k_0), k_1 = f(P_1)
– …
– P_n = Enc(P_{n-1}, k_{n-1}) = Enc(P_{n-1}, f(P_{n-1})), k_n = f(P_n) = f(Enc(P_{n-1}, k_{n-1}))
• Require: all P_i are executable
• Output: P_n only. The adversary needs to crack all intermediate obfuscated
programs in order to recover the original code/flow.
Multiple Copies of Program P
• k_i = f(P_i)
• P_{i+1} = Enc(P_i, k_i)
[Figure: chain P_0 = P → Enc → P_1 → … → Enc → P_n; at each stage f maps P_i to the key K_i that drives the next Enc]
Function f: Program → Key
• k_i = f(P_i), for i = 0,…,n-1
• Function f maps any program to a key (a binary
string), such that
– f is one-way (hard to invert), and
– the key characterizes the program
• Examples of f(P):
– MD5 of P, where P is viewed as data
– #(nodes of CFG(P)), where P is viewed as a graph
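Both example key functions are easy to write down. In this sketch the program-as-data is a byte string and the program-as-graph is a dictionary from block name to successor list; that representation and the function names are assumptions made for illustration.

```python
import hashlib


def f_md5(program_bytes: bytes) -> str:
    """Key from the program viewed as data: the MD5 digest of its bytes."""
    return hashlib.md5(program_bytes).hexdigest()


def f_cfg_nodes(cfg: dict) -> int:
    """Key from the program viewed as a graph: the number of CFG nodes."""
    return len(cfg)


cfg = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"], "B4": []}
print(f_md5(b"\x6a\x00\xe8\x96\xff\xff\xff\x59\xc3"), f_cfg_nodes(cfg))
```

The digest changes whenever any byte of the program changes, whereas the node count only reflects the shape of the control flow graph.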
Obfuscating Program by Key
• P_{i+1} = Enc(P_i, k_i)
• The encryption algorithm shall
– obfuscate program P’s data/control flow, while
– preserving P’s functionality
• Detail: extract all JMP/JCC/CALL points of P, and collect this information
into a jump table S. S is then obfuscated by K, and the original program P is
modified according to S in order to preserve correct control flow
• In other words, a separate hidden jump table takes control of
the program’s execution
Example of Encryption
• CFG of program P is normalized
to have at most 2 out edges for
each block
• Predefine a set of dummy blocks
C= {C1,…}
• Initial K_0 = 1101… (randomly generated), where each bit
represents an action on a
specific block
– Bit(i) = 1: add one more branch to
block i
– Bit(i) = 0: do nothing to block i
[Figure: example CFG with blocks B1, B2, B3, B4]
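A sketch of one Enc stage on a CFG, under stated assumptions: the CFG is a dictionary from block name to an ordered successor list ([true target, false target] for a branching block), predicates are left implicit, dummy blocks are named C1, C2, … in creation order, and a block without successors is left untouched. The function name `enc` follows the slides; everything else is illustrative.

```python
def enc(cfg: dict, key_bits: str) -> dict:
    """Add a dummy branch to block i when key bit i is 1; bit 0 leaves it alone."""
    out = {block: list(succ) for block, succ in cfg.items()}
    n_dummy = sum(1 for block in cfg if block.startswith("C"))

    def fresh() -> str:
        nonlocal n_dummy
        n_dummy += 1
        return f"C{n_dummy}"

    for block, bit in zip(list(cfg), key_bits):
        succ = cfg[block]
        if bit != "1" or not succ:
            continue
        if len(succ) == 1:
            # Sequential block: B -> [next, dummy], and the dummy rejoins next.
            dummy = fresh()
            out[block] = [succ[0], dummy]
            out[dummy] = [succ[0]]
        else:
            # Branching block: the true edge now passes through a second test,
            # whose false edge leads to a dummy that rejoins the true target.
            dummy, test = fresh(), fresh()
            out[block] = [test, succ[1]]
            out[test] = [succ[0], dummy]
            out[dummy] = [succ[0]]
    return out


p0 = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"], "B4": []}
p1 = enc(p0, "1101")
for block in sorted(p1):
    print(block, "->", p1[block])
```

Every path through p1, once the C blocks are dropped, is a path of p0, which is how this toy version keeps the original control flow intact.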
Adding More Branches to Block
• For sequential block B1, add a dummy branch
C1 together with predicate x
[Figure: before, B1 → B2; after, B1 tests predicate x and goes to B2 if true, to dummy C1 if false]
Adding More Branches to Block -II
• For branching block B2, add a dummy block
C2 together with predicate y…
[Figure: before, B2 tests predicate x and goes to B3 if true, to B4 if false; after, B2 tests x & y and gains an extra edge to dummy C2]
Adding More Branches to Block -III
• …, and normalize the subgraph so that
each block has at most 2 out-edges
[Figure: B2 tests x (false: B4; true: test y); the y test goes to B3 if true, to dummy C2 if false]
P1 = Enc(P0,K0) is Ready
• Next, compute K_1
– K_1 = K_0 xor #(blocks of P_1)
– Keys are not stored
• Compute
P_2 = Enc(P_1, K_1)
[Figure: CFG of P_1: B1 tests predicate x (true: B2, false: dummy C1); B2 tests x (false: B4; true: C3, which tests y); C3 goes to B3 if true, to dummy C2 if false]
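The key update is a one-liner once the key is treated as an integer. A sketch, assuming the key is a bit string read as a binary number and the program is a block-to-successors dictionary; `next_key` is a hypothetical name, and how the key is widened to cover newly added blocks is not specified in the slides.

```python
def next_key(key_bits: str, program: dict) -> str:
    """K_{i+1} = K_i xor #(blocks of P_{i+1}), returned as a bit string."""
    value = int(key_bits, 2) ^ len(program)
    width = max(len(key_bits), value.bit_length())
    return format(value, f"0{width}b")


# P_1 from the running example: 4 original blocks plus 3 dummy blocks.
p1 = {
    "B1": ["B2", "C1"], "B2": ["C3", "B4"], "B3": ["B4"], "B4": [],
    "C1": ["B2"], "C2": ["B3"], "C3": ["B3", "C2"],
}
k1 = next_key("1101", p1)
print(k1)
```

Since K_1 is recomputed from K_0 and the block count of P_1, it does not have to be stored alongside the program.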
Construction of Jump Table
• Take actions on CFG according to key
• Generating new jump table
• Generating new key
; jump table in data section
; insert a switch code, i.e.,
; at the end of each block
; add a microcode reading transition
; location from jmp_table
cmp … ; construct a predicate here
mov eax, jmp_table[block_ind]
mov ebx, jmp_table[block_ind +4]
jcond eax
jmp ebx
Previous jmp_table:
B1 → B2
B2 → B3, B4
B3 → B4
B4 → nil
Obfuscated jmp_table:
B1 → B2, C1
B2 → C3, B4
B3 → B4
B4 → nil
C1 → B2
C2 → B3
C3 → B3, C2
Details of Constructing Jump Table
• Practically, control instructions include
– Cmp ( and machine status word)
– Jmp
– Jcond: jne/je, jz, jg/jge, jl/jle
– Call, ret
• Target address is the location that current instruction will transfer to
– jmp addr; For direct jump, target address is specified in the original instruction
– jcc addr; For conditional jump, there are two target addresses
– L: call addr; For call instruction, one target address for called function, and another
target address for return address
– ret; For return instruction, target address is stored on the stack
• Jump table is further obfuscated by hash function h, such that:
– table_index = h( instruction_address )
– target_address = jump_table [ table_index ]
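The two lookup equations can be sketched with a keyed table. The hash h below (first 8 bytes of SHA-256 over the 32-bit address) is an assumption, as are the sample addresses, which are borrowed from other listings in this deck purely for illustration.

```python
import hashlib


def h(instruction_address: int) -> int:
    """Hypothetical hash h: first 8 bytes of SHA-256 over the 32-bit address."""
    digest = hashlib.sha256(instruction_address.to_bytes(4, "little")).digest()
    return int.from_bytes(digest[:8], "little")


def build_jump_table(transfers: dict) -> dict:
    """Store each target under table_index = h(instruction_address)."""
    return {h(address): target for address, target in transfers.items()}


# control instruction address -> target address (illustrative values)
transfers = {0x00403E90: 0x00403E2B, 0x004010D5: 0x004010BC}
jump_table = build_jump_table(transfers)


def target_of(instruction_address: int) -> int:
    table_index = h(instruction_address)
    return jump_table[table_index]   # target_address = jump_table[table_index]


print(hex(target_of(0x00403E90)))
```

The table no longer contains the source addresses themselves, so reading it statically does not directly show which instruction transfers to which target.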
Byte-code Polymorphism
• Randomly select instruction implementation
for byte-codes
– Provide several instruction templates from which
final language is derived
– Indices of opcode table are randomly generated,
which is used to connect binary instruction to
byte-code
Block Obfuscation
• Before obfuscation
– PE header
– Sections
– Code
– function block: instructions…
• After obfuscation
– Revised PE header
– Sections
– Code
– function block: jmp fb_dispatcher, followed by nops
– VM section: VM core, LoaderAlloc, encrypted fb instructions, fb_dispatcher
Before Obfuscation
• Choose a block of
instructions to obfuscate
• As an intermediate step, each
instruction is transformed
into a bytecode together
with an implementation
; original code
…
00403E8E 6A 00 PUSH 0
00403E90 E8 96FFFFFF CALL 00403E2B
00403E95 59 POP ECX
00403E96 C3 RETN
00403E99 ADD EAX,1
…
; in VM stack
.VM
db 0C2h, 0C9h, 0BDh, 14h, 0D2h
00401169 VM_Add_EAX_1 proc near
00401169
00401169 var_RegEdi = dword ptr -1Ch
00401169 var_RegEcx = dword ptr -18h
00401169 var_RegEsp = dword ptr -0Fh
00401169 var_RegEax = dword ptr -0Bh
00401169 var_SFlag = byte ptr -3
00401169 var_ZFlag = byte ptr -2
00401169
00401169 mov eax, [ebp+var_RegEax]
0040116C sub [ebp+var_RegEsp], 4
00401170 mov ebx, [ebp+var_RegEsp]
00401173 mov [ebp+var_RegEdi], ebx
00401176 mov edx, [ebx]
00401178 mov [ebp+var_RegEcx], edx
0040117B add [ebp+var_RegEcx], eax
0040117E setz [ebp+var_ZFlag] ; set Z flag
00401182 sets [ebp+var_SFlag] ; set S flag
00401186 push [ebp+var_RegEcx]
00401189 pop [ebp+var_RegEax]
0040118C mov eax, [ebp+var_RegEcx]
0040118F mov [ebx], eax
00401191 jmp VM_Entry
00401191 VM_Add_EAX_1 endp
After Obfuscation
• The VM obfuscator creates a stack to save the registers of the native
code
• The result of the previous bytecode execution is saved on the
VM stack for the current execution
00401060 VM_procedure proc near
…
004010BC VM_Entry: ;
004010BC inc [ebp+var_RegEip]
004010BF mov eax, [ebp+var_RegEip]
004010C2 mov al, [eax] ; fetch one byte from stack
004010C4 mov [ebp+var_RegDl], al
004010C7 mov eax, offset lpJumpAddrTable
004010CC movzx ebx, [ebp+var_RegDl]
004010D0 shl ebx, 2 ; x4
004010D3 add eax, ebx ; look up jump table
004010D5 jmp dword ptr [eax] ; go to the bytecode handler
004010D5 VM_procedure endp
; original code
…
00403E8E jmp VM_dispatcher
00403E90 nop
00403E95 nop
00403E96 nop
00403E99 nop
…
VM_func1 proc near
…
…
jmp VM_func2
VM_func1 endp
VM_funcN proc near
…
…
jmp VM_Exit
VM_funcN endp
…
Security Analysis of Multi-Stage
• Each (binary) instruction x maps to a set T(x) of templates/bytecodes,
where T(x) = { t_1(x), t_2(x), …, t_n(x) }, and bytecode t_i(x) = < t_{i,1}(x), …,
t_{i,j_i}(x) > represents a sequence of instructions
• When an instruction has been obfuscated more than twice, the boundary
between the native instructions and the instruction itself becomes
unrecognizable
[Figure: random selection tree: x1 expands to bytecodes among y1…y7, which expand in turn to bytecodes among z1…z9]
x1 maps to 3 bytecodes;
y4 maps to 2 bytecodes;
y5 maps to 2 bytecodes;
After two runs, x1 turns to <z3,z4,z5,z6,z7>
Multi-stage Polymorphism Makes Guessing
Even Harder
• Given the instruction sequence <z3,z4,z5,z6,z7>,
guessing
– the boundaries between bytecode instructions
– the connection between polymorphic bytecodes
will be exponentially hard
• The number of stages is randomly chosen
Summary of Difference from Existing Code
Virtualizers
• Efficiency: reduce the number of unnecessary jumps during VM
interpretation, by executing a block of bytecodes together
instead of one-by-one
– Previously, return to VM dispatcher every time after finishing one
bytecode interpretation
– Currently: choose a “basic block” to execute before jumping back to
VM entry. Specifically, choose code block between two nearest
jmp/jcc/call instructions, and replace by bytecodes. At the end of last
instruction, jump back
• Multi-staged: introduce additional randomness in
– #(stages)
– Keys, represented by bytecode selections
Pros & Cons
• Pros:
– Without the key and the stage number, the adversary cannot know which stage of
the code to de-obfuscate or which key is in use. The key is further
protected by a one-way function
– The adversary has to decode all n variants of the program to
get the original program, but n itself is unknown
– Uses jump_table to redirect control, rather than using a virtual
machine module to hide instructions in bytecodes
– Improved execution efficiency because of block obfuscation
• Cons:
– Stage obfuscation modifies control flow only, without touching
the internals of a block
– Program execution may slow down by a factor of n
– Dummy code must not affect functionality
C#/.NET code obfuscation
Why .NET Obfuscation?
• .NET programs are compiled to MSIL, which is at a higher
level than binary machine code
• .NET programs are easy to reverse engineer using
decompilation
• The .NET framework ships with a tool (ILDASM) that can
disassemble MSIL
• Anyone can peruse the details of the software
Code Refactoring and Reflection
• Refactoring
– “Improving” the design of the existing code
without changing its behavior
• Typical refactoring patterns
– Rename variable/class/method/member
– Extract method
– Extract constant
– Extract interface
– Encapsulate field
Refactoring in Visual Studio
.NET Reflector
Dotfuscator – Obfuscator Tool
• Dotfuscator is a post-development recompilation
system for .NET applications, to enhance code security
– Obfuscation is applied to MSIL and not source code
– Obfuscated code is functionally equivalent to traditional
MSIL
– It executes on CLR with same results
Dotfuscator features
• Renaming
• Control Flow Obfuscation
• String Encryption
• Pruning
• Linking
• Watermarking
Renaming in Dotfuscator
• Renaming :
– Uses an overload-induction renaming system that
renames as many methods as possible to the same name
– Saves space, since short names are used for renaming
• Several Options exist for class renaming. For example,
– Specify classes to be renamed while keeping their
namespace membership (keepnamespace).
– Rename namespace names while preserving namespace
hierarchy (keephierarchy)
– Rename completely, removing the namespace(default)
Overload Induction Method Renaming
• The underlying idea is to rename as many methods as possible to exactly the same
name
• Original source code before obfuscation
private void CalcPayroll(SpecialList employeeGroup)
{
while (employeeGroup.HasMore())
{
employee = employeeGroup.GetNext(true);
employee.UpdateSalary();
DistributeCheck(employee);
}
}
• Reverse-Engineered Source Code
private void a(a b)
{
while (b.a())
{
a = b.a(true);
a.a();
a(a);
}
}
• Since overload-induction tends to use the same letter more often, it reaches into
longer length names more slowly (e.g. aa, aaa, etc.). This also saves space
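The idea behind overload induction can be sketched as a greedy naming pass. This is a simplification and not Dotfuscator's actual algorithm: each method gets the shortest name not already used in the same class with the same parameter types. The class names `Payroll` and `Employee`, the extra `Reset` method, and the function names are hypothetical.

```python
from itertools import count, product
from string import ascii_lowercase


def short_names():
    """Yield a, b, …, z, aa, ab, … in order of increasing length."""
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def overload_induction(methods):
    """Map each (class, name, parameter types) to the shortest new name not
    already taken by a method of the same class with the same parameter types."""
    taken = {}    # (class, parameter types) -> new names already in use
    renamed = {}
    for cls, name, params in methods:
        used = taken.setdefault((cls, params), set())
        new = next(n for n in short_names() if n not in used)
        used.add(new)
        renamed[(cls, name, params)] = new
    return renamed


methods = [
    ("Payroll", "CalcPayroll", ("SpecialList",)),
    ("Payroll", "DistributeCheck", ("Employee",)),
    ("SpecialList", "HasMore", ()),
    ("SpecialList", "GetNext", ("bool",)),
    ("Employee", "UpdateSalary", ()),
    ("SpecialList", "Reset", ()),   # same class and parameters as HasMore
]
renamed = overload_induction(methods)
for key, new in renamed.items():
    print(key[1], "->", new)
```

Only `Reset` needs a second letter, because it collides with `HasMore` on both class and parameter list; every other method can legally be called `a`, as in the reverse-engineered listing.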
Renaming Options (keepnamespace)
• Hide the names of program classes while
maintaining namespace hierarchy
Renaming Options (keephierarchy)
• Preserve the namespace hierarchy while
renaming the namespace and class names.
Renaming Options (default)
• Renames the class and namespace name to a
new, smaller name
String Encryption in Dotfuscator
• Crackers will frequently search for specific strings in an
application to locate strategic logic. For example,
someone looking to bypass a registration and
verification process can search for the string displayed
when the program asks the user for a serial number.
When the attacker finds the string, he can look for
instructions near it and alter the logic
• String Encryption makes this much more difficult to do,
because the attacker's search will come up empty. The
original string is nowhere to be found in the code. Only
its encrypted version is present
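A toy illustration of why the search comes up empty. The repeating-key XOR below is only a stand-in (it is not real encryption and is not Dotfuscator's scheme); the key, the prompt string and the function names are made up for the example.

```python
def encrypt_literal(text: str, key: bytes) -> bytes:
    """Done at build time: only the returned blob is stored in the binary."""
    data = text.encode("utf-8")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def decrypt_literal(blob: bytes, key: bytes) -> str:
    """Done at run time, just before the string is used."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(blob)).decode("utf-8")


KEY = b"\x5a\xc3\x17"   # illustrative key
blob = encrypt_literal("Enter serial number:", KEY)

print(b"serial" in blob)            # what a string search over the binary sees
print(decrypt_literal(blob, KEY))   # what the running program displays
```

The protection is only as good as the hiding of the key and the decryption routine, both of which still ship inside the program.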
Control Flow Obfuscation
• Introduce false conditions and other misleading constructs in order to confuse and break decompilers
• It destroys the code patterns. The result is semantically equivalent to the original
• Original Source Code Before Obfuscation
public int CompareTo(Object o)
{
int n = occurrences - ((WordOccurrence)o).occurrences;
if (n == 0)
{
n = String.Compare(word,
((WordOccurrence)o).word);
}
return(n);
}
• After Control Flow Obfuscation
public virtual int _a(Object A_0)
{
int local0; int local1;
local0 = this.a - (c) A_0.a;
if (local0 != 0) goto i0;
goto i1;
while (true) {
return local1;
i0: local1 = local0;}
i1: local0 =
System.String.Compare(this.b, (c)
A_0.b); goto i0;
}
Pruning
• Determines unused types, methods and fields, and
extracts exactly the pieces needed for a given
application. This helps reduce the size of the assembly
• The static analysis works by traversing your code,
starting at a set of methods called “triggers”
(application entry points). As it traverses each trigger
method’s code, it notes which fields, methods and types
are being used
• In a standalone application, the Main method would be
defined as a trigger
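The traversal from trigger methods amounts to graph reachability. A minimal sketch, assuming the static analysis has already produced a call graph as a dictionary; `LegacyExport` and `FormatCsv` are hypothetical unused members, and `prune` is an illustrative name rather than a Dotfuscator API.

```python
def prune(call_graph: dict, triggers: list) -> set:
    """Return the members that are never reached from any trigger method."""
    reached, stack = set(), list(triggers)
    while stack:
        member = stack.pop()
        if member in reached:
            continue
        reached.add(member)
        stack.extend(call_graph.get(member, ()))
    return set(call_graph) - reached


call_graph = {
    "Main": ["CalcPayroll"],
    "CalcPayroll": ["UpdateSalary", "DistributeCheck"],
    "UpdateSalary": [],
    "DistributeCheck": [],
    "LegacyExport": ["FormatCsv"],   # hypothetical members nothing calls
    "FormatCsv": [],
}
unused = prune(call_graph, ["Main"])
print(sorted(unused))
```

Members invoked only through reflection would not appear in such a graph, which is one reason pruning, like renaming, can break reflection-dependent code.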
Pruning Report
• Dotfuscator generates a removal report in XML format
that lists all input assemblies and how each was pruned
– types
– methods,
– fields,
– properties, and
– managed resources
• If a type was pruned, then obviously all its members
are pruned
• Constructors are named .ctor, while static constructors
are named .cctor
Assembly Linking
• Also called merging: links multiple assemblies into one or
more output assemblies
• Prime Assemblies
– When you set up linking, you must specify one of the input
assemblies as the prime assembly
• Name Mangling
– When the linker is merging assemblies, the linker sometimes
encounters situations where a name needs to be changed in
order to prevent a naming collision
– For example, if two input assemblies contain private classes
with identical names then the linker must change one of the
names in order to merge the assemblies
Watermarking
• Embed data (copyright info/unique nos.) into applications,
making them unique. This is one method that can be used
to track unauthorized copies of your software back to the
source
• To watermark an application
– Select the assemblies to watermark
– Select whether the watermark string is to be encrypted and
provide a passphrase if so
– Provide a string and an encoding that will be the watermark
– Select how Dotfuscator will behave if the watermark string is
too large to fit in a selected assembly
.NET Obfuscation Drawbacks
• Maintaining and troubleshooting becomes
difficult
• Can break code that depends on reflection,
serialization or remoting
• Hampers the debugging process, as
obfuscation alters MSIL
Java Byte-code obfuscation on
Blackberry: A Case Study
Background
• Java increased the threat of reverse engineering
– High-level bytecode
– Platform independent
• Portable
• Anyone can have access to the bytecode
• Reverse engineering
– Analyse system to create higher level representation
– Recreate Java source code
Problem and Objective
• Problem: Java bytecode/.class files can be easily
decompiled by adversary tools such as
– JAD, fairly old and not very effective against
obfuscated code
– SourceAgain, a more modern decompiler, commercial
– Dava, a research project using control flow analysis
and a typing system
• Objective: Obfuscate bytecode/.class files to prevent them from
being decompiled into readable and valid Java
code
Techniques Commonly Used by Java
Obfuscators
• Obfuscation techniques:
– Name obfuscation
– Incremental obfuscation
– String obfuscation
– Flow obfuscation
– Debug info obfuscation
• Some concerns
– Method and field renaming can cause reflection calls to stop working
– Changing actual class and package names can break several other Java
APIs (JNDI, URL providers etc.)
– If the association between class byte-code offsets and source line
numbers is altered, recovering the original exception stack traces
could become difficult
Name Obfuscation
• Change package, class, field and method names to meaningless strings
• Incremental name obfuscation
– When obfuscating a.jar, remember the mapping between names (e.g., MyUtil → M)
– Later, when obfuscating all other classes such as b.jar, apply that name mapping (all MyUtil → M)
• Package name obfuscation
– A well-structured Java program usually contains at most 10 classes under one package. By viewing the
package name, an adversary can easily figure out the program's intention
• com.mycompany.license.a, com.mycompany.license.b, com.mycompany.license.c
• Only class names were obfuscated
• com.mycompany.a.a, com.mycompany.a.b, com.mycompany.a.c
• Package name was obfuscated too
– Package name obfuscation makes the decompiled code much harder to understand than
class name obfuscation alone
– Package > class > fields & methods
String Obfuscation
• The string literals embedded in an application include:
– Text of labels or other GUI components on dialogs
– Text of error messages
– Text of exception messages
• Adversary: decompiles all classes and search strings
• Obfuscator: encrypts string literals and stores in the
Constant Pools of class files. It then modifies the class
or classloader so that the strings are decrypted at
runtime
Flow Obfuscation
• Obscure the control flow to no longer have a
direct Java source code equivalent
– Selection (if…else…)
– Looping (while…, for…)
• Decompilers will have to produce a series of
labels and goto statements into the source
code
Encrypted classes plus a customized
classloader?
• Encrypt all classes after compilation and decrypt them on the fly inside JVM by a
customized classloader
• Difficulty: each subclass of classloader has to call final method
ClassLoader.defineClass(), which is interceptable. Debugging Java classloading can
get a load trace for a customized classloader
– All ClassLoaders have to pass their class definitions to JVM via one well-defined API point:
java.lang.ClassLoader.defineClass() method. The ClassLoader API has several overloads of this
method, but all of them call into the defineClass(String, byte[], int, int, ProtectionDomain)
method. It is a final method that calls into JVM native code after doing a few checks.
– No classloader can avoid calling this method if it wants to create a new Class
– The defineClass() method is the only place where the magic of creating a Class object out of a
flat byte array can take place. And the byte array must contain the unencrypted class
definition in a well-documented format (class file format specification)
– Intercepting all calls to this method and decompiling all interesting classes becomes easy: get
the source for java.lang.ClassLoader for J2SDK and modify defineClass to have some additional
class logging
• Summary: high-level bytecode encryption is infeasible unless the JVM
native code itself is secured
Protecting a PDF viewer/wrapper
written in Java
• The ultimate goal is to protect PDF files from unauthorized
viewing
• Server
– Obfuscating Java wrapper program
– Encrypting PDF files
• Blackberry phone
– Running PDFReaderWrapper program
PDF wrapper does:
1. Stores a local copy of the sPDF file
2. Decrypts the sPDF file internally
3. Renders the PDF display
Wrapper Protection Model for
Blackberry
• Wrapper: running on Blackberry to receive and decrypt sPDF file,
then display PDF content. Written in Java and unique to device
• Java obfuscator: inject a key in wrapper, then obfuscate the
wrapper, and convert it to COD file
• PDF encryptor: encrypt PDF file into a secure PDF file
89
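[Figure: server-side flow. The software vendor supplies the plain wrapper W, which the Java obfuscator turns into the obf-ed wrapper Wa. The content provider supplies the plain PDF file F, which the PDF encryptor turns into the secured PDF Fa.]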
Security on Blackberry COD files
• COD is Blackberry's proprietary file format, produced by the rapc compiler
• Blackberry RIM security
– The BB password encryption system uses a standard key-derivation function, PBKDF2, to derive a
256-bit AES key, but with only one iteration
– Elcomsoft claimed in 2010 to have cracked it
– System COD files are available under C:\Program
Files\eclipse\plugins\net.rim.ejde.componentpack4.7.0_4.7.0.57\components\simulator\Java.
For example:
• Personal Information Management: net_rim_bbapi_pim.cod, net_rim_bbapi_pim_res.cod,
net_rim_bbapi_pim_res__en.cod, net_rim_bbapi_pim_todo.cod
• Cryptography: net_rim_bb_crypto_api.cod, net_rim_bb_crypto_resource.cod,
net_rim_bb_crypto_resource__en.cod etc.
• Blackberry application security
– Blackberry COD runs on a private virtual machine and is translated from Java bytecode.
Technically it is no more difficult to break than the output of any other JVM obfuscator (tools: Blackberry
simulator/JDE, WinHex)
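To see why a single PBKDF2 iteration matters, the sketch below derives a 256-bit key with one iteration and with 10,000 iterations and times a batch of password guesses for each. It uses Python's standard `hashlib.pbkdf2_hmac`; the HMAC-SHA1 hash, the salt, the passwords and the 10,000 figure are assumptions for illustration, not details of the Blackberry implementation.

```python
import hashlib
import time


def derive_key(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive a 256-bit key with PBKDF2 (HMAC-SHA1 is an assumed hash choice)."""
    return hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=32)


def time_guesses(iterations: int, guesses: int) -> float:
    """Seconds an attacker spends deriving keys for `guesses` candidate passwords."""
    salt = b"example-salt"  # hypothetical salt
    start = time.perf_counter()
    for i in range(guesses):
        derive_key(f"guess{i}".encode(), salt, iterations)
    return time.perf_counter() - start


weak_key = derive_key(b"hunter2", b"example-salt", 1)
strong_key = derive_key(b"hunter2", b"example-salt", 10_000)

fast = time_guesses(1, guesses=200)
slow = time_guesses(10_000, guesses=20)  # fewer guesses so the demo stays quick
print(f"1 iteration:       {fast / 200:.6f} s per guess")
print(f"10000 iterations:  {slow / 20:.6f} s per guess")
```

The iteration count is the only knob that makes each password guess more expensive, so with one iteration a dictionary attack costs the attacker roughly one HMAC pass per candidate.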
90
Building Obf-ed wrapper program
• Java Obfuscator:
– Customize wrapper with device PIN and secret soft key (code injection)
– Obfuscate wrapper into a JAR
– Convert wrapper JAR to COD format
• Wrapper download is device-specific
– Device PIN is used to encrypt PDF file
– Call Blackberry Device Info API to get IMSI
91
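[Figure: build pipeline. HelloWorld.java and SecureCheckTemplate.java are customized into HelloWorldModified.java and SecureCheck.java and packaged as Wrapper.jar; obf produces ObfWrapper.jar; rapc converts it to ObfWrapper.cod; the result is preverified and signed.]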
Install a file filter driver for Blackberry
Java Obfuscator
• A device driver implements a set of handler routines to
process I/O request packet (IRP) calls from OS
• A file system filter driver intercepts requests targeted
at a file system or another file system filter driver. By
intercepting the request before it reaches its intended
target, the filter driver can extend/replace functionality
provided by the original target of the request
• Further protection: encrypt the intermediate files
created by Java obfuscator
92
Hook File Read/Write Dispatch
Functions
• Put our file monitor driver on top of the device stack
• Driver logic for file filtering
– Hook only related files opened by the specified process
– Other processes can only see the encrypted files
93
irp = IoGetCurrentIrpStackLocation();
filename = fileGetFullPath(irp->FileObject);
process = GetProcessNameOffset();
If the process name is selected and the file name is hooked:
    Decrypt and set file stream buffer;
    Return true;
Otherwise:
    Pass through to lower device stack;
    Return false;
End
Transparent File
Encryption/Decryption
• Driver: receives the file list from the selected process. When a file is
hooked:
– Encrypt the file before writing it to disk (the encrypted file is stored on
disk)
– Decrypt it after reading it from disk (the file is decrypted in memory
only)
• Method details:
– Implement pre/post operation callbacks and register them with the filter manager
– Operate (R/W/Dir control) on intermediate file buffers instead
of on the system-supplied buffer directly
– After the operation completes, copy the contents of the new buffer back
into the original buffer
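The behavior described above can be modelled in user space. The following is a minimal sketch, not driver code: `FileFilter`, the XOR cipher, the key and the process name `notepad.exe` are hypothetical, and a dictionary plays the role of the disk. It shows that the selected process sees plaintext while the disk and every other process see only ciphertext.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher (XOR) standing in for the driver's real encryption."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


class FileFilter:
    """User-space model of the filter: sits between processes and the 'disk'."""

    def __init__(self, key: bytes, selected_process: str, hooked_files: set):
        self.key = key
        self.selected_process = selected_process
        self.hooked_files = hooked_files
        self.disk = {}  # file name -> bytes as stored on disk

    def _hooked(self, process: str, name: str) -> bool:
        # Filter only the listed files, and only for the selected process.
        return process == self.selected_process and name in self.hooked_files

    def write(self, process: str, name: str, data: bytes) -> None:
        # Encrypt before the data reaches the disk; otherwise pass through.
        hooked = self._hooked(process, name)
        self.disk[name] = xor_bytes(data, self.key) if hooked else data

    def read(self, process: str, name: str) -> bytes:
        # Decrypt only for the caller's memory; the disk copy stays encrypted.
        raw = self.disk[name]
        return xor_bytes(raw, self.key) if self._hooked(process, name) else raw


flt = FileFilter(b"k3y", "jobfuscator.exe", {"Wrapper.jar"})
flt.write("jobfuscator.exe", "Wrapper.jar", b"intermediate jar bytes")
seen_by_obfuscator = flt.read("jobfuscator.exe", "Wrapper.jar")
seen_by_other = flt.read("notepad.exe", "Wrapper.jar")
```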
94
File Pre/Post Read Operation
• PreRead
– Allocate a new buffer
– Set up an MDL (memory descriptor list) for the newly allocated buffer
– Update the context with the new buffer pointer and MDL
address
– Pass context to PostRead callback
• PostRead
– Decrypt the read data
– Copy the read data back into system buffer
95
File Pre/Post Write Operation
• PreWrite
– Allocate the new buffer that will be written
– Set up an MDL for the newly allocated buffer
– Encrypt the old buffer's contents into the new buffer
– Update the context with the new buffer pointer and MDL
address
– Pass context to PostWrite callback
• PostWrite
– Free the allocated buffers
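The pre/post callback pairs on these two slides follow a swap-buffer pattern, which the sketch below imitates with plain Python functions. It is a simulation under stated assumptions: the function names, the dictionary used as the context, the XOR cipher and the key are hypothetical, and MDL handling is omitted because it has no user-space equivalent.

```python
from itertools import cycle

KEY = b"k3y"  # hypothetical key


def xor_bytes(data, key: bytes = KEY) -> bytes:
    """Toy cipher (XOR) standing in for the driver's real encryption."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def pre_write(system_buffer: bytearray) -> dict:
    """Allocate a new buffer holding the encrypted data; return it as context."""
    return {"new_buffer": bytearray(xor_bytes(system_buffer))}


def post_write(context: dict) -> None:
    """Release the swapped buffer once the lower layers have finished."""
    context["new_buffer"] = None


def pre_read(length: int) -> dict:
    """Allocate a new buffer for the lower layers to read into."""
    return {"new_buffer": bytearray(length)}


def post_read(context: dict, system_buffer: bytearray) -> None:
    """Decrypt what was read and copy it back into the caller's buffer."""
    system_buffer[:] = xor_bytes(context["new_buffer"])


# Round trip through a simulated disk that only ever sees the swapped buffers.
caller_data = bytearray(b"plain bytes")
ctx = pre_write(caller_data)
disk = bytes(ctx["new_buffer"])  # the lower device stack writes the new buffer
post_write(ctx)

ctx = pre_read(len(disk))
ctx["new_buffer"][:] = disk      # the lower device stack fills the new buffer
out = bytearray(len(disk))
post_read(ctx, out)
```

Working on a separate buffer keeps the caller's own buffer unchanged during a write, so the writing process never observes its data being encrypted in place.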
96
Communication between Filter Driver
and Application
• Application (jobfuscator.exe) sends a file-list to
filter driver
97
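[Figure: message sequence between filter driver and application]
0. Driver: FltCreateCommunicationPort
1. Application: FilterConnectCommunicationPort
2. Driver: ConnectNotifyCallback*
3. Driver: FltSendMessage
4. Application: FilterGetMessage
5. Application: FilterReplyMessage
6. Driver: MessageNotifyCallback*
7. Application: FltCloseCommunicationPort
8. Driver: DisconnectNotifyCallback*
Messages exchanged: start filtering, filtering check OK, file list, close filtering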
References for Research
* Chosen Ciphertext Security via Point Obfuscation, T. Matsuda and G. Hanaoka, TCC 2014
* Two-round secure MPC from Indistinguishability Obfuscation, Sanjam Garg et al., TCC 2014
* Virtual Black-Box Obfuscation for All Circuits via Generic Graded Encoding, Z. Brakerski and G. Rothblum, TCC 2014
* Extractable Obfuscation and Applications, Elette Boyle and Kai-Min Chung and Rafael Pass, TCC 2014
* Candidate Indistinguishability Obfuscation and Functional Encryption for all circuits. Amit Sahai, 2013
* How to Use Indistinguishability Obfuscation: Deniable Encryption, and More. Amit Sahai, 2013
* Barak et al., On the (im)possibility of obfuscating programs, Crypto’01
* Goldwasser, On the impossibility of obfuscation with auxiliary input, FOCS’05
* Canetti, Towards realizing random oracles: hash functions that hide all partial information, Crypto’97
* Rolf Rolles, Unpacking virtualization obfuscators, 2009
* H. Wee, on obfuscating point functions, STOC’05
* Collberg, code transformation techniques for software protection, 2009
* Ogiso et al., Software obfuscation on a theoretical basis and its implementation, 2003
* Sharif et al., Automatic reverse engineering of malware emulators, 30th IEEE Symp. on Security & Privacy, 2009
* D. Boccardo, Context sensitive analysis of x86 obfuscated executables, thesis, 2009
* Beaucamps and Filiol, On the possibility of practically obfuscating programs towards a unified perspective of code
protection, 2008
* Kanzaki et al., Exploiting self-modification mechanism for program protection, 27th COMPSAC 2003
* Monden et al., Security improvements for encrypted interpretation
* Ehrig et al., graphical representation and graph transformation, 1999
* James Smith and Ravi Nair, Virtual machines, Morgan Kaufmann Publisher
* Fang et al., Multi-stage binary code obfuscation using an improved virtual machine, ISC 2011
(…more please refer to slide notes below)
98
References for Practice
• Hacker disassembler engine: transforms ASM into a sequence of pseudo-instructions.
Open source, http://patkov-site.narod.ru/
• IDA Pro, https://www.hex-rays.com/products/ida/ disassembler and debugger: provides an
advanced disassembler SDK. Commercial
• VMProtect software, www.vmprotect.com trial
• Reflector, www.reflector.net
• ILSpy, http://wiki.sharpdevelop.net/ILSpy.ashx
• Stunnix Obfuscators, www.stunnix.com
• GrayWolf, http://digitalbodyguard.com/GrayWolf.html
• Reflexil, http://reflexil.net/
• Shane Ng's obfuscator, http://daven.se/usefulstuff/javascript-obfuscator.html GPL-licensed
• JavaScript Obfuscator, http://www.javascriptobfuscator.com/ Free
• JSBeautifier, http://jsbeautifier.org/ Online tool to unpack or deobfuscate JavaScript
• Protecting Java code via code obfuscation, ACM crossroads, Springer 1998
• Protect your Java code – through obfuscators and beyond, Dmitry Leskov, 2009
• A qualitative analysis of Java obfuscation, Ravi Ramachandra, Rowan University 2008
• Dava and JBCO Java obfuscation projects, Sable lab of McGill University, Canada
• Blackberry Java development tutorial
99
Code obfuscation theory and practices

  • 1. Code Obfuscation: Theory and Practices Fang Hui 18/3/2014
  • 2. Outline • Theoretic Background • Obfuscation in Practice • Case Study – Virtual Machine Based Binary code Obfuscation – C#/.NET code obfuscation – Java Byte-code obfuscation on Blackberry • Obfuscation Tools • OS Platforms 2
  • 3. Obf(P) What is Code Obfuscation • Obfuscation makes a program “unintelligible” while preserving its functionality • Is a form of security through obscurity to make reverse engineering difficult • Transform the binary code or compiled .NET / Java code into a form difficult to understand • Obfuscated code has the same behavior • Obfuscators work in the same way as compiler/optimizers Obf Goal: Change program so still has same I/O behavior but is impossible to understand P 3
  • 4. Why Obfuscate? For Software Protection Software vendors want to prevent users from reverse-engineering or tampering executable code For Cryptography Many applications: fully homomorphic encryption, private to public key crypto, etc. 4
  • 5. Obfuscation vs. Homomorphic Encryption • Obfuscation: hide a function’s implementation while preserving functionality • HE: compute on encrypted data and solve the access control problem • [Figure: F after obfuscation, applied to x, gives F(x) with the result in the clear; F after encryption, applied to x, gives F(x) with the result encrypted]
  • 6. Theoretical Obfuscator • An obfuscator is a processor of programs (i.e., Turing machines or Boolean circuits) which outputs a program with the same functionality, but unintelligible code • In other words, an adversary possessing the obfuscated program should be able to learn only input/output, like a black-box access – Function equivalence – Light weighted – Virtual black box 6
  • 7. Virtual Black Box Security • A probabilistic algorithm O is a TM/circuit obfuscator if the following three conditions hold: – (Functionality Equivalence) For every TM/circuit P and for every input x: P(x) = O(P)(x) – (Polynomial Slowdown) There exists a polynomial q(·) such that for every TM/circuit P, |O(P)| ≤ q(|P|). For TMs it is additionally required that for every input x, if P halts after t steps on x then O(P) halts within q(t) steps on x – (Virtual Black Box) For any PPT A, there is a PPT oracle machine S such that for all TMs/circuits P: | Pr[A(O(P)) = 1] − Pr[S^P(1^{|P|}) = 1] | < negl(|P|) • Does a general-purpose VBB obfuscator exist?
  • 8. Point Function Obfuscation • For very few functions, we know how to achieve VBB • Point function: I_x(w) = 1 if w = x, 0 otherwise • Obfuscate I_x with a one-way function f • Let y = f(x); the obfuscated program is obf I_x(w) = 1 if f(w) = y, 0 otherwise • Intuitively, obf I_x gives no more advantage than black-box access to I_x, which relies on the hardness of inverting the one-way function f [Canetti97]: adv(obf I_x) = adv(f)
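The point-function construction above can be sketched in a few lines of Python. This is only an illustration: SHA-256 stands in for the one-way function f, and the names `one_way` and `obfuscate_point_function` are hypothetical, not taken from the slides or from [Canetti97].

```python
import hashlib
import hmac


def one_way(data: bytes) -> bytes:
    # Assumption: SHA-256 plays the role of the one-way function f.
    return hashlib.sha256(data).digest()


def obfuscate_point_function(x: bytes):
    """Return obf I_x: a program that stores y = f(x) but never x itself."""
    y = one_way(x)

    def obf_I(w: bytes) -> int:
        # Outputs 1 exactly when f(w) == y, i.e. (up to collisions) when w == x.
        return 1 if hmac.compare_digest(one_way(w), y) else 0

    return obf_I


check = obfuscate_point_function(b"secret-serial-1234")
print(check(b"secret-serial-1234"))  # 1
print(check(b"wrong-guess"))         # 0
```

Anyone holding `check` can test candidate inputs, which is exactly what black-box access to I_x allows, but recovering x from the stored digest requires inverting f.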
  • 9. A General VBB Obfuscator Does Not Exist • Proof sketch [Barak01] – First, construct two functions C and D that cannot be obfuscated together – Second, combine C and D into a single function, giving an ensemble that is unobfuscatable for Turing machines – Finally, show how to carry the construction over to circuits, for which the same ideas cannot be applied directly
  • 10. “Proof” • Proof for Turing machines: C_{α,β}(x) = β if x = α, 0 otherwise; D_{α,β}(C) = 1 if C(α) = β, 0 otherwise • Intuition: given the code of C_{α,β} and D_{α’,β’}, one “knows” the output D_{α’,β’}(C_{α,β}); given only black-box access to C_{α,β} and D_{α’,β’}, one “doesn’t know” what D_{α’,β’}(C_{α,β}) outputs • Define F_{α,β}(b,y) = C_{α,β}(y) if b = 0, D_{α,β}(y) if b = 1, and Z_{α,β}(b,y) = 0 if b = 0, D_{α,β}(y) if b = 1 • From black-box access, F_{α,β} and Z_{α,β} look the same • From non-black-box access: O(F_{α,β})(1, O(F_{α,β})(0,·)) = 1, whereas O(Z_{α,β})(1, O(Z_{α,β})(0,·)) = 0
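The distinguishing step at the end of the proof sketch can be made concrete with a small Python model. It is a toy under stated assumptions: Python closures stand in for program code, α and β are small nonzero integers rather than random strings, and the helper names (`make_C`, `make_D`, `make_F`, `make_Z`, `distinguisher`) are hypothetical.

```python
def make_C(alpha, beta):
    # C_{alpha,beta}(x) = beta if x == alpha, else 0
    return lambda x: beta if x == alpha else 0


def make_D(alpha, beta):
    # D_{alpha,beta}(C) = 1 if C(alpha) == beta, else 0; it runs the program it is given
    return lambda C: 1 if C(alpha) == beta else 0


def make_F(alpha, beta):
    C, D = make_C(alpha, beta), make_D(alpha, beta)
    return lambda b, y: C(y) if b == 0 else D(y)


def make_Z(alpha, beta):
    D = make_D(alpha, beta)
    return lambda b, y: 0 if b == 0 else D(y)


def distinguisher(prog):
    # Non-black-box attack: feed the program's own b=0 half into its b=1 half.
    return prog(1, lambda y: prog(0, y))


ALPHA, BETA = 1234567, 7654321  # secret values, assumed nonzero
F = make_F(ALPHA, BETA)
Z = make_Z(ALPHA, BETA)
print(distinguisher(F))  # 1
print(distinguisher(Z))  # 0
```

The distinguisher never learns α or β; it only needs some runnable copy of the program, which is what any obfuscation O(F) or O(Z) must provide. A black-box user would have to guess α to see any difference between F and Z.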
  • 11. Indistinguishability Obfuscation • VBB security is too strong a requirement because some functions are inherently unobfuscatable. Can we instead get an O() that does “as well as possible” on every function? • Indistinguishability obfuscation [Gentry13] – An obfuscator of this kind conceals as much information about the source text as any other feasible method would – An adversary who is given obfuscated versions of two distinct but equivalent programs (they both compute the same function) can’t tell which is which – It is a weaker notion of obfuscation, but it is the best that’s attainable
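  • 12. VBB vs. IO • Virtual Black-Box: for any poly-time learner L, there exists a poly-time simulator S such that for every (poly-time) program P: Pr[L(O(P)) = 1] ≈ Pr[S^P(1^{|P|}) = 1] • [Figure: the learner receives O(P) and outputs 0/1; the simulator only queries P (input x, answer P(x)) and outputs 0/1; the two outputs are ≈]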
  • 13. VBB vs. IO • Indistinguishability Obfuscation: obfuscations of equivalent circuits of the same size should be computationally indistinguishable • For every poly-time learner L, there exists a poly-time simulator S such that for every circuit C1 and every equivalent circuit C2 (|C1| = |C2|), the distributions L(O(C1)) and S(C2) are indistinguishable • [Figure: the learner receives O(C1) and outputs 0/1; the simulator works with C2 (input x, answer C2(x)) and outputs 0/1; the two outputs are ≈]
  • 14. Best-Possible Obfuscation • Inefficient Indistinguishability Obfuscation is always possible • O(C) = lexicographically 1st circuit computing the same function as C (canonical form) • Canonicalization is inefficient (unless P=NP) 14
  • 16. Obfuscation in Real World • Copy protection/Licensing • Software watermarking • Prevent reverse engineering – By competitors – By hackers (e.g., for games) if (test fails) then exit else … 16
  • 17. Off the Shelf Obfuscators
  • 18. Code Obfuscation Tools (language: obfuscators) • C++: VMProtect (trial); Stunnix C++ Obfuscator (commercial product) • .NET: Eazfuscator.NET (free); SmartAssembly (commercial license, very powerful; assembly obfuscation + compression) • Java: ProGuard (free, open source); yGuard (free, open source); JShrink • JavaScript (mostly by applying lexical transformations): JavaScript minifier/obfuscator (free); Stunnix JavaScript Obfuscator (commercial); Shane Ng's obfuscator
  • 19. Executables • Compiled executables – Often written in a higher-level programming language, such as C++, and then translated to low-level machine code (readable as assembly language) – Harder to reverse back to the original source because of different compilers and architectures, and because information is lost during compilation • Managed/interpreted applications – Such as Java and .NET, require an application virtual machine (the JVM for Java; the CLR, Common Language Runtime, for .NET) – Easier to reverse, because the intermediate code that is interpreted or JIT-compiled is more structured and retains more information about the program (this is where obfuscation is needed!)
  • 20. Code Decompilation/Disassembly • Reconstructs the source code (to some extent) from the compiled code – .NET assembly to C#/VB.NET source code – Java class file / JAR archive to Java source code – .exe file to C/C++/assembler code • Note that the reconstructed code – is not always 100% compilable – loses private identifier names and comments
  • 21. Code Decompilation Tools (language: tools) • .EXE file: Boomerang Decompiler (outputs C source code); IDA Pro (powerful disassembler/debugger) • C#/.NET: .NET Reflector (powerful tool, great usability, paid); ILSpy (free); two tools that also allow modifying the executables: Graywolf and Reflexil • Java: DJ Java Decompiler; JD (JD-Core / JD-GUI / JD-Eclipse); JAD • JavaScript: JSBeautifier (reconstructing the program and tracing logic)
  • 22. Code Obfuscation Techniques • Data obfuscation • Layout obfuscation • Control obfuscation • Preventive transformation 22
  • 23. Control Obfuscation Examples Hide information flow – Dead code – Parallelize code 23
  • 24. Reverse Engineering: How much can it do? • Normal engineering: write source code -> compile -> binary code • Reverse engineering: get the binary -> use powerful tools (e.g., IDA Pro, OllyDbg) to gain knowledge about the program -> learn its code structure, control flow, and valuable assets such as keys and algorithms
  • 26. IDA Pro: control graph 26
  • 27. IDA Pro: data structs 27
  • 28. Security Level for Obfuscation • The level of security that an Obfuscator adds depends on: – The transformations used – The power of available deobfuscators – The amount of resources available to deobfuscators • Cost tradeoffs – Obfuscated code is slower due to changes/additions in its control logic – Obfuscator is the opposite of high-quality compiler/optimizer 28
  • 29. Limits of Obfuscation • Factors that prevent the use of obfuscation – Cost of obfuscation – Execution time of code – High program complexity • Performance evaluation (feature: description) – Potency: the degree to which a human reader would be confused by the new code – Resilience: how well the obfuscated code resists attacks by deobfuscation tools – Cost: how much load is added to the application
  • 30. Virtual Machine Based Binary code Obfuscation 30
  • 31. Virtual Machine Based Obfuscation • A new approach: Multi-Stage Binary Code Obfuscation Using Improved Virtual Machine [Fang11] – VM based obfuscator – Code transformation – Block obfuscation – Obfuscation key generation 31
  • 32. Virtual Machine-ed (Virtualized) Program • General code structure – A VM section was appended to the program – Protected binary code was transformed to byte-code, which is interpreted by VM – Remaining code was kept intact – Entry Point to protected code was redirected into VM code 32
  • 33. How a VM Obfuscator Works • Instruction transformation • The VM saves all registers and flags in its own VM context • The VM restores them upon exiting byte-code interpretation
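  • 34. Instruction transformation • Intel IA32 instruction format: Prefix | Opcode | ModR/M | SIB | Disp | Imm • VM instruction format: length (8 bits, excluding this byte) | 0xFFFF (16 bits) | opcode (8 bits, so the max no. is 256) | related data • The slide also shows a layout with an 8-bit length field followed by the original instruction (Prefix, Opcode, …)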
  • 35. An Example of VM Instruction Set • 39 types of VM instructions (VM instruction byte-codes are usually permuted) • For example: – ret xxxx – shl eax, 1 • Encodings shown on the slide for these examples: 5 0xFFFF I_RET xxxx; 7 0xFFFF MAKE_SHL_REG; 5 0xFFFF MAKE_MOV_IMM • Instruction table:
      I_COND_JMP_SHORT 0x00 | I_COND_JMP_LONG 0x01 | I_JECX 0x02
      I_CALL_REL 0x03 | I_VM_END 0x04 | I_RET 0x05
      I_LOOPxx 0x06 | I_VM_MOV_IMM 0x07 | I_VM_MOV_REG 0x08
      I_VM_ADD_IMM 0x09 | I_VM_ADD_REG 0x0A | I_VM_SHL_IMM 0x0B
      I_VM_REAL 0x0C | I_JMP_LONG 0x0D | I_JMP_SHORT 0x0E
      I_VM_ROL 0x0F | I_VM_ROR 0x10 | I_VM_RCL 0x11
      I_VM_RCR 0x12 | I_VM_SAL 0x13 | I_VM_SHL 0x14
      I_VM_SHR 0x15 | I_VM_SAR 0x16 | I_VM_ADD 0x17
      I_VM_OR 0x18 | I_VM_ADC 0x19 | I_VM_SBB 0x1A
      I_VM_AND 0x1B | I_VM_SUB 0x1C | I_VM_XOR 0x1D
      I_VM_CMP 0x1E | I_VM_MOV 0x1F | I_VM_CALL 0x20
      I_VM_JMP 0x21 | I_VM_PUSH 0x22 | I_VM_POP 0x23
      I_VM_RELOC 0x24 | I_VM_FAKE_CALL 0x25 | I_VM_NOP 0x26
  • 36. An Example of VM Instruction Transformation • The single instruction add eax, 1 is replaced by:
        call ADD_EAX_1
        jmp ADD_EAX_1_END
        ;; trash code here
      ADD_EAX_1:
        push ebp
        mov ebp, esp
        sub esp, 0x08
        push ebx
        mov dword ptr [ebp-0x04], 5
        mov ebx, 4
        sub dword ptr [ebp-0x04], ebx   ; 5 - 4 = 1
        add eax, dword ptr [ebp-0x04]   ; eax = eax + 1
        pop ebx                         ; restore ebx before tearing down the frame
        mov esp, ebp
        pop ebp
        ret
      ADD_EAX_1_END:
  • 37. Classic VM Byte-code Execution • Stack-based style: save the registers of the native code, and create the VM's own stack • The return value of the last execution of each bytecode is saved in VM registers (var_RegEip and var_RegDl below) for the next bytecode execution
      00401060 VM_procedure proc near
      …
      004010BC VM_Entry:                          ; return here upon completion of each bytecode
      004010BC   inc [ebp+var_RegEip]
      004010BF   mov eax, [ebp+var_RegEip]
      004010C2   mov al, [eax]                    ; fetch one byte from pseudo-code
      004010C4   mov [ebp+var_RegDl], al
      004010C7   mov eax, offset lpJumpAddrTable
      004010CC   movzx ebx, [ebp+var_RegDl]
      004010D0   shl ebx, 2                       ; x4
      004010D3   add eax, ebx                     ; look up jmp table
      004010D5   jmp dword ptr [eax]              ; going to interpretation
      004010D5 VM_procedure endp
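The fetch / table lookup / jump cycle of the dispatcher above can be mirrored in a short Python interpreter. This is a sketch, not the presenter's VM: the four opcode values are borrowed from the instruction table on slide 35, while the one-byte immediate operands, the single virtual register `eax`, and the function name `run` are assumptions made for the example.

```python
I_VM_END, I_VM_MOV_IMM, I_VM_ADD_IMM, I_VM_SHL_IMM = 0x04, 0x07, 0x09, 0x0B


def run(bytecode: bytes) -> int:
    """Interpret bytecode with a dispatcher loop and return the virtual eax."""
    ctx = {"eax": 0, "ip": 0, "running": True}  # VM context (registers)

    def fetch() -> int:
        byte = bytecode[ctx["ip"]]
        ctx["ip"] += 1
        return byte

    def mov_imm():
        ctx["eax"] = fetch()

    def add_imm():
        ctx["eax"] = (ctx["eax"] + fetch()) & 0xFFFFFFFF

    def shl_imm():
        ctx["eax"] = (ctx["eax"] << fetch()) & 0xFFFFFFFF

    def vm_end():
        ctx["running"] = False

    # Counterpart of lpJumpAddrTable: opcode -> handler.
    jump_table = {
        I_VM_MOV_IMM: mov_imm,
        I_VM_ADD_IMM: add_imm,
        I_VM_SHL_IMM: shl_imm,
        I_VM_END: vm_end,
    }

    while ctx["running"]:        # VM_Entry: every handler returns here
        opcode = fetch()         # fetch one byte of pseudo-code
        jump_table[opcode]()     # look up the handler and jump to it
    return ctx["eax"]


# mov eax, 5 ; add eax, 1 ; shl eax, 1 ; end
program = bytes([0x07, 5, 0x09, 1, 0x0B, 1, 0x04])
print(run(program))  # 12
```

Permuting the opcode numbers per build, as the slides suggest, only changes the keys of `jump_table` and the bytes of `program`; the dispatcher loop itself stays the same.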
  • 38. Security of Virtualized Program • VM does not restore byte-codes to original codes any more • Cracking original software requires two steps: – Understanding VM code – Decoding mapping between two instruction sets • Therefore, security was largely transferred from original program to VM code after virtualization 38
  • 39. Improving Execution Efficiency of VM Obfuscator • Reduce unnecessary jumps during VM interpretation, by executing a block of bytecodes together instead of one-by-one – Previously, return to VM dispatcher every time after finishing one bytecode interpretation – Now, choose a “basic block” to execute before jumping back to VM entry. Specifically, choose code block between two nearest jmp/jcc/call instructions, and replace by bytecodes. At the end of last instruction, jump back 39
  • 40. How to Improve Execution Efficiency • Add a new bytecode PI for basic block <instr_1,…,instr_n> • Write the binary code for PI as a whole, ending with a jump back • Allow the VM dispatcher to handle PI as a bytecode
      ; original code
      call XXX
      ; non-jump instructions
      instr_1
      instr_2
      …
      instr_n
      jmp YYY

      ; VM bytecodes
      db 14h, 0ABh, 78h
      VM_proc proc near
      VM_entry:
      …
      VM_proc endp
      …
      ; implementation of bytecode 0ABh
      PI_entry:
      …
      jmp VM_entry
  • 41. Can We Start “Multi-Stage” by Simply Repeating VM? • We would like to design a multi-stage code obfuscator, which may generate more secure code than relying on a single VM core • It is not as easy as it first seems – One binary instruction maps to many bytecodes, each with its own implementation – Bytecodes are selected randomly for each instruction of the protected code – A second round of VM cannot recognize the previous bytecode due to polymorphism • Existing VM-over-VM schemes can only obfuscate the VM engine part, not the byte-codes – One needs to remember the implementation of each bytecode, and in the second stage transform that implementation into bytecodes again
  • 42. Multi-stage Obfuscation • P: original program • n: number of obfuscation stages • K = <k_0, k_1, …, k_n>: the keys used for each stage • Algorithm: randomly choose n, and iteratively obfuscate program P n times using key set K. The encryption key k_i is generated from the program P_i of the last obfuscation stage, and k_i is then applied to P_i to get P_{i+1}. – P_0 = P, k_0 = f(P_0) or predefined – P_1 = Enc(P_0, k_0), k_1 = f(P_1) – … – P_n = Enc(P_{n-1}, k_{n-1}) = Enc(P_{n-1}, f(P_{n-1})), k_n = f(P_n) = f(Enc(P_{n-1}, k_{n-1})) • Require: all P_i are executable • Output: P_n only. An adversary needs to crack all intermediate obfuscated programs in order to recover the original code/flow.
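  • 43. Multiple Copies of Program P • k_i = f(P_i) • P_{i+1} = Enc(P_i, k_i) • [Figure: chain P_0 = P, P_1, …, P_n, in which each key K_i is computed from P_i by f and each Enc step takes P_i and K_i to produce P_{i+1}]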
  • 44. Function f: Program → Key • k_i = f(P_i), for i = 0,…,n-1 • Function f maps any program to a key (a binary string) such that – f has one-way hardness, and – the key characterizes the program • Examples of f(P): – MD5 of P, where P is viewed as data – #(nodes of CFG(P)), where P is viewed as a graph
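Both example choices of f above are easy to write down. The sketch below assumes a program is available either as raw bytes or as a control-flow graph stored as a dict from block name to successor list; the function names `key_from_bytes` and `key_from_cfg` are hypothetical.

```python
import hashlib


def key_from_bytes(program: bytes) -> bytes:
    # f(P) = MD5 of P, with P viewed as data (16-byte key).
    return hashlib.md5(program).digest()


def key_from_cfg(cfg: dict) -> int:
    # f(P) = number of nodes in CFG(P), with P viewed as a graph.
    return len(cfg)


p0 = bytes.fromhex("6a00e896ffffff59c3")  # the PUSH/CALL/POP/RETN bytes from slide 55
cfg0 = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"], "B4": []}

print(key_from_bytes(p0).hex())
print(key_from_cfg(cfg0))  # 4
```

Of the two, only the hash-based choice has the one-way property the slide asks for; the node count characterizes the program but is trivially guessable, so it would presumably be combined with other material, as in the later K1 = K0 xor #(blocks of P1) step.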
  • 45. Obfuscating Program by Key • P_{i+1} = Enc(P_i, k_i) • The encryption algorithm shall – obfuscate program P’s data/control flow, while – preserving P’s functionality • Detail: extract all JMP/JCC/CALL points of P, and transform that information into a jump table S. Then S is obfuscated by K. The original program P is modified according to S in order to preserve correct control flow • In other words, a separate hidden jump table takes control over the program’s execution
  • 46. Example of Encryption • The CFG of program P is normalized so that each block has at most 2 out edges • Predefine a set of dummy blocks C = {C1,…} • Initial K0 = 1101… (randomly generated), where each bit represents a certain action on a specific block – Bit(i) = 1: add one more branch for block i – Bit(i) = 0: do nothing on block i • [Figure: CFG with blocks B1, B2, B3, B4]
  • 47. Adding More Branches to Block • For a sequential block B1, add a dummy branch C1 together with a predicate x • [Figure: before, B1 leads to B2; after, B1 tests predicate x and leads either to B2 or to dummy C1]
  • 48. Adding More Branches to Block - II • For a branching block B2 (with predicate x), add a dummy block C2 together with a predicate y… • [Figure: before, B2 branches on x to B3 (True) or B4 (False); after, B2 branches on x & y, with out edges to B3, B4 and dummy C2]
  • 49. Adding More Branches to Block - III • …and normalize the subgraph so that each block has at most 2 out edges • [Figure: B2 tests x; False leads to B4, True leads to a block testing y; there, True leads to B3 and False leads to dummy C2]
  • 50. P1 = Enc(P0,K0) is Ready • Further compute K1 – K1 = K0 xor #(blocks of P1) – Keys are not stored • Compute P2 = Enc(P1,K1) • [Figure: resulting CFG: B1 (predicate x) leads to B2 or dummy C1; B2 (predicate x) leads to C3 or B4; C3 (predicate y) leads to B3 or dummy C2]
  • 51. Construction of Jump Table • Take actions on the CFG according to the key • Generate the new jump table • Generate the new key
      ; jump table in data section
      ; insert a switch code, i.e., at the end of each block
      ; add a microcode reading the transition location from jmp_table
      cmp …                               ; construct a predicate here
      mov eax, jmp_table[block_ind]
      mov ebx, jmp_table[block_ind + 4]
      jcond eax
      jmp ebx
    Previous jmp_table: B1 → B2; B2 → B3, B4; B3 → B4; B4 → nil
    Obfuscated jmp_table: B1 → B2, C1; B2 → C3, B4; B3 → B4; B4 → nil; C1 → B2; C2 → B3; C3 → B3, C2
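The CFG example on slides 46 to 51 can be traced with a short Python sketch. Several things here are assumptions rather than the author's implementation: the function name `enc_stage`, the dict-of-successor-lists representation of the jump table, the rule for branching blocks (allocate the dummy block first, then the extra predicate block), and the truncation of K0 to its four visible bits. They were chosen so that the output reproduces the obfuscated jmp_table listed above.

```python
def enc_stage(cfg: dict, key_bits: str) -> dict:
    """One obfuscation stage: bit i == '1' adds a dummy branch to block i."""
    out = {name: list(succ) for name, succ in cfg.items()}
    counter = 0

    def new_block() -> str:
        nonlocal counter
        counter += 1
        return f"C{counter}"

    for bit, name in zip(key_bits, list(cfg)):
        succ = cfg[name]
        if bit != "1" or not succ:
            continue                          # bit 0, or exit block: do nothing
        dummy = new_block()
        if len(succ) == 1:                    # sequential block (slide 47)
            out[name] = [succ[0], dummy]
            out[dummy] = [succ[0]]            # dummy falls through to the real target
        else:                                 # branching block (slides 48-49)
            true_target, false_target = succ
            pred = new_block()                # extra block holding predicate y
            out[name] = [pred, false_target]
            out[pred] = [true_target, dummy]
            out[dummy] = [true_target]
    return out


p0 = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"], "B4": []}
k0 = "1101"
p1 = enc_stage(p0, k0)
k1 = int(k0, 2) ^ len(p1)                     # K1 = K0 xor #(blocks of P1)

for block, targets in p1.items():
    print(block, "->", ", ".join(targets) or "nil")
print("K1 =", format(k1, "04b"))
```

Because every dummy block only ever rejoins the original successor, any path through the new table visits the same real blocks in the same order, which is the functionality-preservation requirement placed on Enc.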
  • 52. Details of Constructing Jump Table • Practically, control instructions include – Cmp ( and machine status word) – Jmp – Jcond: jne/je, jz, jg/jge, jl/jle – Call, ret • Target address is the location that current instruction will transfer to – jmp addr; For direct jump, target address is specified in the original instruction – jcc addr; For conditional jump, there are two target addresses – L: call addr; For call instruction, one target address for called function, and another target address for return address – ret; For return instruction, target address is stored on the stack • Jump table is further obfuscated by hash function h, such that: – table_index = h( instruction_address ) – target_address = jump_table [ table_index ] 52
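The last two formulas above, table_index = h(instruction_address) and target_address = jump_table[table_index], can be illustrated as follows. This is a sketch under assumptions: the slides do not specify h, so a truncated SHA-256 is used here, the table is a Python dict keyed by the 32-bit index, and the addresses are only sample values in the style of the slide-55 listing.

```python
import hashlib


def h(instruction_address: int) -> int:
    # Assumed hash: first 4 bytes of SHA-256 over the little-endian address.
    digest = hashlib.sha256(instruction_address.to_bytes(4, "little")).digest()
    return int.from_bytes(digest[:4], "little")


def build_jump_table(transfers: dict) -> dict:
    """transfers maps the address of a control instruction to its target address."""
    table = {}
    for source, target in transfers.items():
        index = h(source)
        if index in table:
            raise ValueError("hash collision; choose another h")
        table[index] = target
    return table


def resolve(table: dict, instruction_address: int) -> int:
    return table[h(instruction_address)]


transfers = {
    0x00403E90: 0x00403E2B,  # call site -> callee
    0x00403EA0: 0x00403E8E,  # hypothetical jmp -> its target
}
jump_table = build_jump_table(transfers)
print(hex(resolve(jump_table, 0x00403E90)))  # 0x403e2b
```

With the direct targets removed from the code, a static disassembler sees only an indexed load followed by an indirect jump, and has to recover h before it can rebuild the control-flow graph.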
  • 53. Byte-code Polymorphism • Randomly select instruction implementation for byte-codes – Provide several instruction templates from which final language is derived – Indices of opcode table are randomly generated, which is used to connect binary instruction to byte-code 53
  • 54. Block Obfuscation • Before obfuscation: PE header | Sections | Code, containing the function block (instructions…) • After obfuscation: Revised PE header | Sections | Code, in which the function block is replaced by jmp fb_dispatcher followed by nops | VM section, containing the VM core, LoaderAlloc, the encrypted fb instructions, and fb_dispatcher
  • 55. Before Obfuscation • Choose a block of instructions to obfuscate • As an intermediate step, each instruction is transformed into a bytecode together with an implementation
      ; original code
      …
      00403E8E 6A 00        PUSH 0
      00403E90 E8 96FFFFFF  CALL 00403E2B
      00403E95 59           POP ECX
      00403E96 C3           RETN
      00403E99              ADD EAX,1
      …
      ; in VM stack
      .VM db 0C2h, 0C9h, 0BDh, 14h, 0D2h
      00401169 VM_Add_EAX_1 proc near
      00401169
      00401169 var_RegEdi = dword ptr -1Ch
      00401169 var_RegEcx = dword ptr -18h
      00401169 var_RegEsp = dword ptr -0Fh
      00401169 var_RegEax = dword ptr -0Bh
      00401169 var_SFlag  = byte ptr -3
      00401169 var_ZFlag  = byte ptr -2
      00401169
      00401169   mov eax, [ebp+var_RegEax]
      0040116C   sub [ebp+var_RegEsp], 4
      00401170   mov ebx, [ebp+var_RegEsp]
      00401173   mov [ebp+var_RegEdi], ebx
      00401176   mov edx, [ebx]
      00401178   mov [ebp+var_RegEcx], edx
      0040117B   add [ebp+var_RegEcx], eax
      0040117E   setz [ebp+var_ZFlag]       ; set Z flag
      00401182   sets [ebp+var_SFlag]       ; set S flag
      00401186   push [ebp+var_RegEcx]
      00401189   pop [ebp+var_RegEax]
      0040118C   mov eax, [ebp+var_RegEcx]
      0040118F   mov [ebx], eax
      00401191   jmp VM_Entry
      00401191 VM_Add_EAX_1 endp
  • 56. After Obfuscation • The VM obfuscator creates a stack to save the registers of the native code • The return value of the last bytecode execution is saved in the VM stack for the current execution
      00401060 VM_procedure proc near
      …
      004010BC VM_Entry:
      004010BC   inc [ebp+var_RegEip]
      004010BF   mov eax, [ebp+var_RegEip]
      004010C2   mov al, [eax]                    ; fetch one byte from stack
      004010C4   mov [ebp+var_RegDl], al
      004010C7   mov eax, offset lpJumpAddrTable
      004010CC   movzx ebx, [ebp+var_RegDl]
      004010D0   shl ebx, 2                       ; x4
      004010D3   add eax, ebx                     ; look up jump table
      004010D5   jmp dword ptr [eax]              ; go to the interpretation routine
      004010D5 VM_procedure endp

      ; original code
      …
      00403E8E   jmp VM_dispatcher
      00403E90   nop
      00403E95   nop
      00403E96   nop
      00403E99   nop
      …
      VM_func1 proc near
      …
        jmp VM_func2
      VM_func1 endp
      …
      VM_funcN proc near
      …
        jmp VM_Exit
      VM_funcN endp
      …
  • 57. Security Analysis of Multi-Stage • Each (binary) instruction x maps to a set T(x) of templates/bytecodes, where T(x) = { t_1(x), t_2(x), …, t_n(x) }, and bytecode t_i(x) = < t_{i,1}(x), …, t_{i,j_i}(x) > represents a sequence of instructions • When an instruction has been obfuscated more than twice, the separation between the native instructions and the instruction itself becomes unrecognizable • [Figure: random selection over two runs, from x1 to y1…y7 to z1…z9; x1 maps to 3 bytecodes; y4 maps to 2 bytecodes; y5 maps to 2 bytecodes; after two runs, x1 turns into <z3,z4,z5,z6,z7>]
  • 58. Multi-stage Polymorphism Makes Guessing Even Harder • Given the instruction sequence <z3,z4,z5,z6,z7>, guessing – the separation of bytecode instructions and – the connection between polymorphic bytecodes will be exponentially hard • The number of stages is randomly chosen
  • 59. Summary of Differences from Existing Code Virtualizers • Efficiency: reduce the number of unnecessary jumps during VM interpretation by executing a block of bytecodes together instead of one by one – Previously: return to the VM dispatcher every time one bytecode has been interpreted – Currently: choose a “basic block” to execute before jumping back to the VM entry. Specifically, choose the code block between the two nearest jmp/jcc/call instructions and replace it by bytecodes. After the last instruction, jump back • Multi-staged: introduce additional randomness in – #(stages) – Keys, represented by bytecode selections
  • 60. Pros & Cons • Pros: – Without the key and the stage count, an adversary cannot know which stage of the code to de-obfuscate or which key is in use. The key is further protected by a one-way function – Literally, an adversary has to decode all n variants of the program to get the original program, but n itself is unknown – Uses jump_table to revert controls, rather than using a virtual machine module to hide instructions in bytecodes – Improved execution efficiency because of block obfuscation • Cons: – Stage obfuscation modifies control flow only, without touching the internals of a block – The program may slow down by a factor of n – Dummy code must not affect functionality
  • 62. Why .NET Obfuscation? • .NET programs are compiled to MSIL, which is at a higher level than binary machine code • .NET programs are easy to reverse engineer using decompilation • The .NET framework ships with a tool (ILDASM) that can disassemble MSIL • Anyone can peruse the details of the software
  • 63. Code Refactoring and Reflection • Refactoring – “Improving” the design of the existing code without changing its behavior • Typical refactoring patterns – Rename variable/class/method/member – Extract method – Extract constant – Extract interface – Encapsulate field 63
  • 66. Dotfuscator – Obfuscator Tool • Dotfuscator is a post-development recompilation system for .NET applications, to enhance code security – Obfuscation is applied to MSIL and not source code – Obfuscated code is functionally equivalent to traditional MSIL – It executes on CLR with same results 66
  • 67. Dotfuscator features • Renaming • Control Flow Obfuscation • String Encryption • Pruning • Linking • Watermarking 67
  • 68. Renaming in Dotfuscator • Renaming : – Uses an Overload-Induction renaming system that Renames as many methods as possible to a same name. – Saves space as short names used for renaming • Several Options exist for class renaming. For example, – Specify classes to be renamed while keeping their namespace membership (keepnamespace). – Rename namespace names while preserving namespace hierarchy (keephierarchy) – Rename completely, removing the namespace(default) 68
  • 69. Overload Induction Method Renaming • The underlying idea is to rename as many methods as possible to exactly the same name • Original source code before obfuscation:
      private void CalcPayroll(SpecialList employeeGroup) {
          while (employeeGroup.HasMore()) {
              employee = employeeGroup.GetNext(true);
              employee.UpdateSalary();
              DistributeCheck(employee);
          }
      }
    • Reverse-engineered source code:
      private void a(a b) {
          while (b.a()) {
              a = b.a(true);
              a.a();
              a(a);
          }
      }
    • Since overload induction tends to use the same letter more often, it reaches longer names (e.g. aa, aaa, etc.) more slowly. This also saves space
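The idea behind overload induction can be approximated in Python: within one class, give every method the shortest name whose (name, parameter types) pair is still free, so that methods with different signatures collapse onto the same name. This is a simplified sketch, not Dotfuscator's actual algorithm; the function names and the sample class contents are hypothetical.

```python
from itertools import count, product
from string import ascii_lowercase


def short_names():
    # Yields a, b, ..., z, aa, ab, ...
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def overload_induction(methods):
    """methods: list of (name, parameter_types) within a single class."""
    used = set()      # (new_name, parameter_types) pairs already taken
    mapping = {}
    for name, params in methods:
        for candidate in short_names():
            if (candidate, params) not in used:
                used.add((candidate, params))
                mapping[(name, params)] = candidate
                break
    return mapping


# Hypothetical class with four methods.
methods = [
    ("HasMore", ()),
    ("GetNext", ("bool",)),
    ("UpdateSalary", ()),
    ("DistributeCheck", ("Employee",)),
]
renamed = overload_induction(methods)
for (name, params), new_name in renamed.items():
    print(f"{name}({', '.join(params)}) -> {new_name}")
```

Three of the four methods end up named `a` because their parameter lists differ; only `UpdateSalary()` needs `b`, since it would otherwise clash with `HasMore()`. In the slide's example these methods live in different classes, so all of them can be `a`.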
  • 70. Renaming Options (keepnamespace) • Hide the names of program classes while maintaining namespace hierarchy • Example: 70
  • 71. Renaming Options (keephierarchy) • Preserve the namespace hierarchy while renaming the namespace and class names. 71
  • 72. Renaming Options (default) • Renames the class and namespace name to a new, smaller name 72
  • 73. String Encryption in Dotfuscator • Crackers will frequently search for specific strings in an application to locate strategic logic. For example, someone looking to bypass a registration and verification process can search for the string displayed when the program asks the user for a serial number. When the attacker finds the string, he can look for instructions near it and alter the logic • String Encryption makes this much more difficult to do, because the attacker's search will come up empty. The original string is nowhere to be found in the code. Only its encrypted version is present 73
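A minimal Python sketch of the string-encryption idea follows. It is not Dotfuscator's scheme, which the slides do not describe: a repeating-key XOR plus Base64 is used purely as a stand-in, it is not cryptographically strong, and `KEY`, `encrypt_literal` and `decrypt_literal` are hypothetical names.

```python
import base64
from itertools import cycle

KEY = b"k3y"  # hypothetical key that the obfuscator would embed


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encrypt_literal(text: str) -> str:
    # Done once by the obfuscator at build time.
    return base64.b64encode(_xor(text.encode("utf-8"), KEY)).decode("ascii")


def decrypt_literal(blob: str) -> str:
    # Injected into the program and called at run time.
    return _xor(base64.b64decode(blob), KEY).decode("utf-8")


PROMPT = encrypt_literal("Please enter your serial number")  # what ships in the binary
print(PROMPT)
print(decrypt_literal(PROMPT))
```

A search of the shipped program for "serial number" now comes up empty, which is the effect described above; an attacker who finds `decrypt_literal` can of course still recover the strings, so this raises effort rather than providing secrecy.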
  • 74. Control Flow Obfuscation • Introduce false conditions and other misleading constructs in order to confuse and break decompilers • It destroys the code patterns. The result is semantically equivalent to the original • Original source code before obfuscation:
      public int CompareTo(Object o) {
          int n = occurrences - ((WordOccurrence)o).occurrences;
          if (n == 0) {
              n = String.Compare(word, ((WordOccurrence)o).word);
          }
          return(n);
      }
    • After control flow obfuscation:
      public virtual int _a(Object A_0) {
          int local0;
          int local1;
          local0 = this.a - (c) A_0.a;
          if (local0 != 0) goto i0;
          goto i1;
          while (true) {
              return local1;
              i0: local1 = local0;
          }
          i1: local0 = System.String.Compare(this.b, (c) A_0.b);
          goto i0;
      }
  • 75. Pruning • Determine unused types, methods and fields. Pruning extracts exactly the pieces you need for any given application, which helps reduce the size of the assembly • The static analysis works by traversing your code, starting at a set of methods called “triggers” (application entry points). As it traverses each trigger method’s code, it notes which fields, methods, and types are being used • In a standalone application, the Main method would be defined as a trigger
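The trigger-based analysis described above amounts to a reachability search over the call graph. The Python sketch below assumes the call graph has already been extracted as a dict from method name to the methods it references; the method names and the functions `reachable` and `prune` are hypothetical.

```python
from collections import deque


def reachable(call_graph: dict, triggers) -> set:
    """Breadth-first traversal from the trigger methods."""
    seen = set(triggers)
    queue = deque(triggers)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def prune(call_graph: dict, triggers) -> set:
    """Return the methods that can be removed from the assembly."""
    return set(call_graph) - reachable(call_graph, triggers)


call_graph = {
    "Main": ["CalcPayroll"],
    "CalcPayroll": ["UpdateSalary", "DistributeCheck"],
    "UpdateSalary": [],
    "DistributeCheck": [],
    "LegacyExport": ["FormatCsv"],   # never called from Main
    "FormatCsv": [],
}
print(sorted(prune(call_graph, ["Main"])))  # ['FormatCsv', 'LegacyExport']
```

Methods that are only reached through reflection would not appear as edges in such a graph, which is one reason reflection-heavy code can break after pruning and has to be listed as an extra trigger.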
  • 76. Pruning Report • Dotfuscator generates a removal report in XML format that lists all input assemblies and how each was pruned – types – methods, – fields, – properties, and – managed resources • If a type was pruned, then obviously all its members are pruned • Constructors are named .ctor, while static constructors are named .cctor 76
  • 77. Assembly Linking • Also called merging, Links multiple assemblies into one or more output assemblies • Prime Assemblies – When you set up linking, you must specify one of the input assemblies as the prime assembly • Name Mangling – When the linker is merging assemblies, the linker sometimes encounters situations where a name needs to be changed in order to prevent a naming collision – For example, if two input assemblies contain private classes with identical names then the linker must change one of the names in order to merge the assemblies 77
  • 78. Watermarking • Embed data (copyright info/unique nos.) into applications, making them unique. This is one method that can be used to track unauthorized copies of your software back to the source • To watermark an application – Select the assemblies to watermark – Select whether the watermark string is to be encrypted and provide a passphrase if so – Provide a string and an encoding that will be the watermark – Select how Dotfuscator will behave if the watermark string is too large to fit in a selected assembly 78
  • 79. .NET Obfuscation Drawbacks • Maintaining and troubleshooting becomes difficult • Can break code that depends on reflection, serialization or remoting • Hampers the debugging process, as obfuscation alters MSIL 79
  • 80. Java Byte-code obfuscation on Blackberry: A Case Study 80
  • 81. Background • Java increased the threat of reverse engineering – High-level bytecode – Platform independent • Portable • Anyone can have access to the bytecode • Reverse engineering – Analyse system to create higher level representation – Recreate Java source code 81
  • 82. Problem and Objective • Problem: Java bytecode/.class files can be easily decompiled by adversary tools such as – JAD, fairly old and not very effective against obfuscated code – SourceAgain, a more modern, commercial decompiler – Dava, a research project using control flow analysis and a typing system • Objective: obfuscate bytecode/.class files so that they cannot be decompiled into readable and valid Java code
  • 83. Common Techniques Used by Java Obfuscators • Obfuscation techniques: – Name obfuscation – Incremental obfuscation – String obfuscation – Flow obfuscation – Debug info obfuscation • Some concerns – Method and field renaming can cause reflection calls to stop working – Changing actual class and package names can break several other Java APIs (JNDI, URL providers etc.) – If the association between class byte-code offsets and source line numbers is altered, recovering the original exception stack traces could become difficult
  • 84. Name Obfuscation • Change package, class, field and method names to meaningless strings • Incremental name obfuscation – When obfuscating a.jar, remember the mapping between names (e.g., MyUtil → M) – Later, when obfuscating other classes such as those in b.jar, apply that same name mapping (all MyUtil → M) • Package name obfuscation – A well-structured Java program usually contains at most 10 classes under one package, so by viewing the package name an adversary can easily figure out the program's intention • com.mycompany.license.a, com.mycompany.license.b, com.mycompany.license.c: only class names were obfuscated • com.mycompany.a.a, com.mycompany.a.b, com.mycompany.a.c: the package name was obfuscated too – Package name obfuscation increases the difficulty of decompilation exponentially faster than class name obfuscation – Package > class > fields & methods
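Incremental name obfuscation boils down to carrying the rename map from one run to the next. The Python sketch below is an illustration only: real obfuscators persist the map in their own file formats, and the names `short_name`, `obfuscate_names` and the sample class lists are assumptions.

```python
def short_name(index: int) -> str:
    # 0 -> a, 25 -> z, 26 -> aa, 27 -> ab, ...
    name = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        name = chr(ord("a") + remainder) + name
    return name


def obfuscate_names(class_names, previous_mapping=None) -> dict:
    """Extend an existing rename map instead of starting from scratch."""
    mapping = dict(previous_mapping or {})
    for name in class_names:
        if name not in mapping:
            mapping[name] = short_name(len(mapping))
    return mapping


# First run: a.jar defines MyUtil and LicenseCheck.
map_a = obfuscate_names(["MyUtil", "LicenseCheck"])
# Later run: b.jar defines Main and refers to MyUtil from a.jar.
map_b = obfuscate_names(["Main", "MyUtil"], previous_mapping=map_a)

print(map_a)  # {'MyUtil': 'a', 'LicenseCheck': 'b'}
print(map_b)  # {'MyUtil': 'a', 'LicenseCheck': 'b', 'Main': 'c'}
```

Because `MyUtil` keeps the name it received in the first run, the obfuscated b.jar still links against the already shipped, obfuscated a.jar.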
  • 85. String Obfuscation • The string literals embedded in an application include: – Text of labels or other GUI components on dialogs – Text of error messages – Text of exception messages • Adversary: decompiles all classes and searches for strings • Obfuscator: encrypts string literals and stores them in the constant pools of class files. It then modifies the class or classloader so that the strings are decrypted at runtime
  • 86. Flow Obfuscation • Obscure the control flow to no longer have a direct Java source code equivalent – Selection (if…else…) – Looping (while…, for…) • Decompilers will have to produce a series of labels and goto statements into the source code 86
  • 87. Encrypted classes plus a customized classloader? • Encrypt all classes after compilation and decrypt them on the fly inside JVM by a customized classloader • Difficulty: each subclass of classloader has to call final method ClassLoader.defineClass(), which is interceptable. Debugging Java classloading can get a load trace for a customized classloader – All ClassLoaders have to pass their class definitions to JVM via one well-defined API point: java.lang.ClassLoader.defineClass() method. The ClassLoader API has several overloads of this method, but all of them call into the defineClass(String, byte[], int, int, ProtectionDomain) method. It is a final method that calls into JVM native code after doing a few checks. – No classloader can avoid calling this method if it wants to create a new Class – The defineClass() method is the only place where the magic of creating a Class object out of a flat byte array can take place. And the byte array must contain the unencrypted class definition in a well-documented format (class file format specification) – Intercepting all calls to this method and decompiling all interesting classes becomes easy: get the source for java.lang.ClassLoader for J2SDK and modify defineClass to have some additional class logging • Summary: high-level bytecode encryption is infeasible, unless given a secure JVM native code 87
  • 88. Protecting a PDF viewer/wrapper written in Java • Yes, the ultimate goal is to protect PDF files from illegal view • Server – Obfuscating Java wrapper program – Encrypting PDF files • Blackberry phone – Running PDFReaderWrapper program 88 PDF wrapper does: 1. Stores a local copy of sPDF file 2. Decrypts sPDF file internally 3. Renders PDF display
  • 89. Wrapper Protection Model for Blackberry • Wrapper: running on Blackberry to receive and decrypt sPDF file, then display PDF content. Written in Java and unique to device • Java obfuscator: inject a key in wrapper, then obfuscate the wrapper, and convert it to COD file • PDF encryptor: encrypt PDF file into a secure PDF file 89 Content provider Java Obfuscator PDF encryptor Software vendor Plain wrapper W Plain PDF file F Obf-ed wrapper Wa Secured PDF Fa Server
  • 90. Security on Blackberry COD files • COD is the Blackberry proprietary file format, produced by the compiler rapc • Blackberry RIM security – The BB password encryption system uses a standard key-derivation function, PBKDF2, with 256-bit AES, but with only one iteration – Elcomsoft claimed in 2010 to have cracked it – System COD files are available under C:\Program Files\eclipse\plugins\net.rim.ejde.componentpack4.7.0_4.7.0.57\components\simulator\Java. For example: • Personal Information Management: net_rim_bbapi_pim.cod, net_rim_bbapi_pim_res.cod, net_rim_bbapi_pim_res__en.cod, net_rim_bbapi_pim_todo.cod • Cryptography: net_rim_bb_crypto_api.cod, net_rim_bb_crypto_resource.cod, net_rim_bb_crypto_resource__en.cod etc. • Blackberry application security – Blackberry COD is based on a private virtual machine which translates Java bytecode. Technically it is no more difficult to break than any other JVM obfuscator (Blackberry simulator/JDE, WinHEX)
  • 91. Building Obf-ed wrapper program • Java Obfuscator: – Customize the wrapper with the device PIN and a secret soft key (code injection) – Obfuscate the wrapper into a JAR – Convert the wrapper JAR to COD format • Wrapper download is device-specific – Device PIN is used to encrypt the PDF file – Call the Blackberry Device Info API to get the IMSI • [Figure: build pipeline. Stages: customize, rapc, obf, preverify & sign. Artifacts: HelloWorld.java, SecureCheckTemplate.java, HelloWorldModified.java, SecureCheck.java, Wrapper.jar, ObfWrapper.jar, ObfWrapper.cod]
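The customize step (injecting the device PIN and soft key into the wrapper source before obfuscation) might look roughly like the sketch below. It is an assumption-laden illustration: the template text, the placeholder names, the `customize` helper and the sample values are hypothetical, and the slides do not show the real SecureCheckTemplate.java. The input check assumes an 8-hex-digit PIN and a hex-encoded key so that nothing can break out of the Java string literals.

```python
import re
from string import Template

# Hypothetical stand-in for SecureCheckTemplate.java with two placeholders.
SECURE_CHECK_TEMPLATE = Template('''\
public final class SecureCheck {
    private static final String DEVICE_PIN = "${device_pin}";
    private static final String SOFT_KEY = "${soft_key}";
}
''')


def customize(device_pin: str, soft_key: str) -> str:
    """Return per-device Java source with the PIN and soft key injected."""
    if not re.fullmatch(r"[0-9A-Fa-f]{8}", device_pin):
        raise ValueError("device PIN must be 8 hex digits (assumed format)")
    if not re.fullmatch(r"[0-9A-Fa-f]+", soft_key):
        raise ValueError("soft key must be hex-encoded (assumed format)")
    return SECURE_CHECK_TEMPLATE.substitute(device_pin=device_pin, soft_key=soft_key)


# Hypothetical sample values.
source = customize("2100000A", "0f1e2d3c4b5a6978")
```

The resulting `source` would then go through the obfuscation and COD conversion stages listed on the slide, which is why each wrapper download is specific to one device.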
  • 92. Install a file filter driver for Blackberry Java Obfuscator • A device driver implements a set of handler routines that process I/O request packets (IRPs) from the OS • A file system filter driver intercepts requests targeted at a file system or at another file system filter driver. By intercepting a request before it reaches its intended target, the filter driver can extend or replace the functionality provided by the original target • Further protection: encrypt the intermediate files created by the Java obfuscator
  • 93. Hook File Read/Write Dispatch Functions • Put our file monitor driver on top of the device stack • Driver logic for file filtering – Hook only the related files opened by the specified process – Other processes see only the encrypted files • [Flowchart: irp = IoGetCurrentIrpStackLocation(); filename = fileGetFullPath(irp->FileObject); process = GetProcessNameOffset(). If the process name is selected and the file name is hooked: decrypt and set the file stream buffer, return true. Otherwise: pass through to the lower device stack, return false]
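The two checks in the flowchart amount to a small decision function. The sketch below models only that decision in Python; it is not driver code, and the `FilterPolicy` class, its field names and the sample path are hypothetical. Names are compared case-insensitively on the assumption that Windows file and process names are not case-sensitive.

```python
from dataclasses import dataclass, field


@dataclass
class FilterPolicy:
    """State the driver would hold: which processes and which files are hooked."""
    selected_processes: set = field(default_factory=set)
    hooked_files: set = field(default_factory=set)

    def on_read(self, process_name: str, filename: str) -> bool:
        """Return True if the read is decrypted, False if it is passed through."""
        if process_name.lower() not in self.selected_processes:
            return False  # other processes only ever see ciphertext
        if filename.lower() not in self.hooked_files:
            return False  # unrelated file: pass through to the lower device stack
        return True       # selected process and hooked file: decrypt the buffer


# Hypothetical configuration: one selected process, one hooked intermediate file.
policy = FilterPolicy(
    selected_processes={"jobfuscator.exe"},
    hooked_files={r"c:\build\wrapper.jar"},
)
```

With this policy, `policy.on_read("jobfuscator.exe", r"C:\build\Wrapper.jar")` takes the decrypt branch, while the same file read by any other process falls through unchanged.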
  • 94. Transparent File Encryption/Decryption • Driver: receives the file list from the selected process. When a file is hooked: – Encrypt the file before writing it to disk (the encrypted file is stored on disk) – Decrypt it after reading it from disk (the file is decrypted in memory only) • Method details: – Implement the pre/post-operation callbacks and register them with the filter manager – Operate (R/W/Dir control) on intermediate file buffers instead of directly on the system-supplied buffer – After the operation completes, copy the contents of the new buffer back into the original buffer
  • 95. File Pre/Post Read Operation • PreRead – Allocate a new buffer – Set up an MDL for the newly allocated buffer – Update the context with the new buffer pointer and MDL address – Pass the context to the PostRead callback • PostRead – Decrypt the data that was read – Copy the decrypted data back into the system buffer
  • 96. File Pre/Post Write Operation • PreWrite – Allocate the new buffer that will be written – Set up an MDL for the newly allocated buffer – Encrypt the contents of the old buffer into the new buffer – Update the context with the new buffer pointer and MDL address – Pass the context to the PostWrite callback • PostWrite – Free the allocated buffers
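The buffer-swapping pattern behind the pre/post read and write callbacks can be simulated in user space. The sketch below is a Python model under stated assumptions: a dict stands in for the disk, a single-byte XOR stands in for the cipher, MDL handling is omitted, and every function name is hypothetical. What it preserves is the order of steps: the lower layer only ever touches the swapped buffer, and plaintext exists only in the caller's buffer.

```python
KEY = 0x5A  # toy single-byte XOR key; a real driver would use a proper cipher


def transform(data) -> bytes:
    """XOR every byte with KEY; applying it twice restores the input."""
    return bytes(b ^ KEY for b in data)


def pre_write(caller_buffer: bytes) -> dict:
    """Allocate a new buffer, encrypt the caller's data into it, return the context."""
    return {"swapped": bytearray(transform(caller_buffer))}


def post_write(context: dict) -> None:
    """Free the buffer allocated in pre_write."""
    context["swapped"] = None


def pre_read(length: int) -> dict:
    """Allocate a new buffer for the lower layer to fill, return the context."""
    return {"swapped": bytearray(length)}


def post_read(context: dict, caller_buffer: bytearray) -> None:
    """Decrypt what was read and copy it back into the caller's buffer."""
    caller_buffer[:] = transform(context["swapped"])
    context["swapped"] = None


def write_file(disk: dict, name: str, caller_buffer: bytes) -> None:
    context = pre_write(caller_buffer)
    disk[name] = bytes(context["swapped"])  # lower layer stores ciphertext only
    post_write(context)


def read_file(disk: dict, name: str, caller_buffer: bytearray) -> None:
    context = pre_read(len(caller_buffer))
    context["swapped"][:] = disk[name]      # lower layer fills the swapped buffer
    post_read(context, caller_buffer)


disk = {}                                   # hypothetical stand-in for the file system
write_file(disk, "Wrapper.class", b"plain bytecode")
buffer = bytearray(len(b"plain bytecode"))
read_file(disk, "Wrapper.class", buffer)
```

After the round trip, `disk["Wrapper.class"]` holds only the transformed bytes and `buffer` holds the original data, matching "encrypted on disk, decrypted in memory only".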
  • 97. Communications between Filter Driver and Application • The application (jobfuscator.exe) sends a file list to the filter driver • [Figure: message sequence, numbered as on the slide. 0 FltCreateCommunicationPort, 1 FilterConnectCommunicationPort, 2 ConnectNotifyCallback*, 3 FltSendMessage, 4 FilterGetMessage, 5 FilterReplyMessage, 6 MessageNotifyCallback*, 7 FltCloseCommunicationPort, 8 DisconnectNotifyCallback*. Message labels: Start filtering, Filtering check OK, File list, Close filtering]
  • 98. References for Research * T. Matsuda and G. Hanaoka, Chosen Ciphertext Security via Point Obfuscation, TCC 2014 * S. Garg et al., Two-round Secure MPC from Indistinguishability Obfuscation, TCC 2014 * Z. Brakerski and G. Rothblum, Virtual Black-Box Obfuscation for All Circuits via Generic Graded Encoding, TCC 2014 * E. Boyle, K.-M. Chung and R. Pass, Extractable Obfuscation and Applications, TCC 2014 * S. Garg et al., Candidate Indistinguishability Obfuscation and Functional Encryption for All Circuits, 2013 * A. Sahai and B. Waters, How to Use Indistinguishability Obfuscation: Deniable Encryption, and More, 2013 * Barak et al., On the (Im)possibility of Obfuscating Programs, Crypto’01 * Goldwasser and Kalai, On the Impossibility of Obfuscation with Auxiliary Input, FOCS’05 * Canetti, Towards Realizing Random Oracles: Hash Functions That Hide All Partial Information, Crypto’97 * Rolf Rolles, Unpacking Virtualization Obfuscators, 2009 * H. Wee, On Obfuscating Point Functions, STOC’05 * Collberg, Code Transformation Techniques for Software Protection, 2009 * Ogiso et al., Software Obfuscation on a Theoretical Basis and Its Implementation, 2003 * Sharif et al., Automatic Reverse Engineering of Malware Emulators, 30th IEEE Symposium on Security & Privacy, 2009 * D. Boccardo, Context Sensitive Analysis of x86 Obfuscated Executables, thesis, 2009 * Beaucamps and Filiol, On the Possibility of Practically Obfuscating Programs: Towards a Unified Perspective of Code Protection, 2008 * Kanzaki et al., Exploiting Self-Modification Mechanism for Program Protection, 27th COMPSAC, 2003 * Monden et al., Security Improvements for Encrypted Interpretation * Ehrig et al., Graphical Representation and Graph Transformation, 1999 * James Smith and Ravi Nair, Virtual Machines, Morgan Kaufmann * Fang et al., Multi-stage Binary Code Obfuscation Using an Improved Virtual Machine, ISC 2011 (for more, please refer to the slide notes)
  • 99. References for Practice • Hacker Disassembler Engine: transforms ASM into a sequence of pseudo-instructions. Open source, http://patkov-site.narod.ru/ • IDA Pro, https://www.hex-rays.com/products/ida/ disassembler and debugger: provides an advanced disassembler SDK. Commercial • VMProtect Software, www.vmprotect.com Trial • Reflector, www.reflector.net • ILSpy, http://wiki.sharpdevelop.net/ILSpy.ashx • Stunnix Obfuscators, www.stunnix.com • GrayWolf, http://digitalbodyguard.com/GrayWolf.html • Reflexil, http://reflexil.net/ • Shane Ng's obfuscator, http://daven.se/usefulstuff/javascript-obfuscator.html GPL-licensed • JavaScript Obfuscator, http://www.javascriptobfuscator.com/ Free • JSBeautifier, http://jsbeautifier.org/ Online tool to unpack or deobfuscate JavaScript • Protecting Java code via code obfuscation, ACM Crossroads, Springer 1998 • Protect your Java code – through obfuscators and beyond, Dmitry Leskov, 2009 • A qualitative analysis of Java obfuscation, Ravi Ramachandra, Rowan University, 2008 • Dava and JBCO Java obfuscation projects, Sable lab, McGill University, Canada • Blackberry Java development tutorial