Symbolic MEV Extraction
[email protected]
@gakonst
Disclaimer: early brainstorm, I’m not a searcher,
might be impractical, open-ended discussion
Question: Can we apply techniques used for software security
in MEV extraction?
State of MEV Searching (for the most part)
● Main categories:
○ Atomic / stat / top of block arbitrage
○ Backrunning / sandwiching
○ Liquidations
○ “Long tail”
● Software Stack:
● Geodistributed nodes
○ Fast access to state (e.g. Uniswap reserve values)
○ Strong mempool access (e.g. peer directly with AVAX validators)
● Handwritten bot for each integration
○ “White box” simulations (e.g. native Uniswap in Rust, vs going to the EVM)
● Reliable submission & closing (e.g. PGA / Flashbots auction)
● Observability
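A minimal sketch of what a “white box” simulation can look like: the Uniswap V2 constant-product swap formula (with its 0.3% fee) reimplemented natively, so a bot can price a trade from cached reserve values without calling the EVM. The slide mentions Rust; Python is used here only for brevity. The helper `two_pool_arb_profit` and the `(reserve_x, reserve_y)` tuple layout are assumptions made for illustration, not part of any real bot.

```python
def get_amount_out(amount_in: int, reserve_in: int, reserve_out: int) -> int:
    """Uniswap V2 swap output with the 0.3% fee, using integer math like the contract."""
    if amount_in <= 0 or reserve_in <= 0 or reserve_out <= 0:
        raise ValueError("amount and reserves must be positive")
    amount_in_with_fee = amount_in * 997
    return (amount_in_with_fee * reserve_out) // (reserve_in * 1000 + amount_in_with_fee)


def two_pool_arb_profit(amount_in: int, pool_a: tuple, pool_b: tuple) -> int:
    """Profit in token X from swapping X->Y on pool_a, then Y->X on pool_b.

    Each pool is a hypothetical (reserve_x, reserve_y) tuple read from node state.
    Gas costs are ignored in this sketch.
    """
    y_out = get_amount_out(amount_in, pool_a[0], pool_a[1])
    if y_out == 0:
        return -amount_in  # the first leg rounded down to nothing
    x_back = get_amount_out(y_out, pool_b[1], pool_b[0])
    return x_back - amount_in


# Pool A prices Y cheaply relative to pool B, so the round trip is profitable.
cheap_y = (1_000_000, 2_000_000)
fair = (1_000_000, 1_000_000)
profit = two_pool_arb_profit(1_000, cheap_y, fair)
```

This is fast because it is pure arithmetic, but it is also the maintenance burden the deck describes: every integration needs its own handwritten model, and the model must track any change in the app.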
Black vs Grey vs White Box MEV Extraction
● Black Box: No knowledge of the insides of a system
● Grey Box: Some knowledge
● White box: Full knowledge
In MEV extraction:
● White box: custom code per app, needs maintenance as apps change, reliable
● Black box: property-based, unclear how effective because it has no knowledge of the app
● Grey box: property-based, extracts info about the app as it runs
But Ethereum is a Dark Forest
● Not all bots are application specific
● “Generalized” frontrunners
○ Simulate others’ transactions
○ Look at state diff + balance changes
○ If any subcall is profitable → copy + frontrun
● Pros:
○ Can cover a wider range of opportunities / long tail mev discovered by others
● Cons:
○ Unclear how effective it really is vs specialized
○ Relies on frontrunning / copying, does not discover new opportunities
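The simulate, diff, copy loop above can be sketched against a toy state model. Everything here is hypothetical: the `state` dictionary, the transaction format, and the `claim_bounty` contract stand in for a real EVM fork and mempool feed. As a simplification, the sketch replays only the top-level call, whereas the slide describes checking every subcall.

```python
import copy


def simulate(state, tx, sender):
    """Run tx as `sender` on a copy of state; return (resulting_state, succeeded)."""
    trial = copy.deepcopy(state)
    try:
        tx["call"](trial, sender, *tx["args"])
    except RuntimeError:  # the toy contract signals a revert this way
        return state, False
    return trial, True


def balance_diff(before, after, account):
    return after["balances"].get(account, 0) - before["balances"].get(account, 0)


def try_frontrun(state, pending_tx, me, gas_cost=0):
    """Replay someone else's pending tx with ourselves as sender; copy it if it pays."""
    after, ok = simulate(state, pending_tx, me)
    if not ok:
        return None
    profit = balance_diff(state, after, me) - gas_cost
    if profit <= 0:
        return None
    return {"call": pending_tx["call"], "args": pending_tx["args"],
            "sender": me, "profit": profit}


def claim_bounty(state, sender, secret):
    """Toy contract: whoever submits the right secret first takes the bounty."""
    if secret != 42 or state["bounty"] == 0:
        raise RuntimeError("revert")
    state["balances"][sender] = state["balances"].get(sender, 0) + state["bounty"]
    state["bounty"] = 0


state = {"balances": {"alice": 0, "bot": 0}, "bounty": 100}
pending = {"call": claim_bounty, "args": (42,), "sender": "alice"}
copied = try_frontrun(state, pending, "bot", gas_cost=5)
```

Note that the bot never learns why `42` works. It only observes that replaying the call moves its own balance up, which is exactly why this approach cannot discover an opportunity nobody has submitted yet.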
Can we build a bot that reliably extracts MEV on arbitrary contracts
without knowledge of application logic?
Property-based Testing
Testing kinds:
● Unit testing: test pre-defined scenarios
● Property-based testing: check that certain conditions hold over multiple inputs
Property-based MEV Extraction
1. Define properties for contract solvency: e.g. ETH/token balance does not go
down
2. Monitor for new contract bytecodes
3. For each contract
a. Get all its functions from the jumptable
b. Call them with various arguments many times
c. Find arguments which break the defined properties
4. Submit series of transactions required to break the property
5. ???
6. $$$
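A minimal sketch of steps 1 to 3, assuming a toy in-memory contract instead of real bytecode. The `BuggyVault` and `SafeVault` classes, the `FUNCTIONS` list (standing in for selectors recovered from the jumptable), and the fuzzing parameters are all hypothetical. The property is the one from step 1: the contract's ETH balance must not go down.

```python
import random


class BuggyVault:
    """Toy contract holding ETH on behalf of depositors."""

    def __init__(self, eth=100):
        self.eth = eth
        self.deposits = {}

    def deposit(self, caller, amount):
        if amount <= 0:
            raise ValueError("revert")
        self.eth += amount
        self.deposits[caller] = self.deposits.get(caller, 0) + amount

    def withdraw(self, caller, amount):
        # BUG: checks the vault's total balance instead of the caller's deposit.
        if amount <= 0 or amount > self.eth:
            raise ValueError("revert")
        self.eth -= amount
        self.deposits[caller] = self.deposits.get(caller, 0) - amount


class SafeVault(BuggyVault):
    def withdraw(self, caller, amount):
        if amount <= 0 or amount > self.deposits.get(caller, 0):
            raise ValueError("revert")
        self.eth -= amount
        self.deposits[caller] -= amount


def solvent(contract, start_eth):
    """Step 1: the property. The contract's ETH balance does not go down."""
    return contract.eth >= start_eth


def fuzz(make_contract, functions, runs=1000, max_calls=4, seed=0):
    """Step 3: call functions with random arguments until the property breaks."""
    rng = random.Random(seed)
    for _ in range(runs):
        contract = make_contract()
        start_eth = contract.eth
        trace = []
        for _ in range(rng.randint(1, max_calls)):
            name, amount = rng.choice(functions), rng.randint(0, 200)
            try:
                getattr(contract, name)("attacker", amount)
            except ValueError:
                continue  # reverted calls leave no trace
            trace.append((name, amount))
            if not solvent(contract, start_eth):
                return trace  # step 4 would submit this sequence
    return None


FUNCTIONS = ["deposit", "withdraw"]  # stand-in for selectors from the jumptable
exploit = fuzz(BuggyVault, FUNCTIONS)
```

Random inputs work here only because a large fraction of the input space triggers the bug. A guard that requires one specific value would make this search hopeless.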
What inputs? Random!?
Symbolic Execution
Black box, but much better than random because the constraints identify
the structure of the problem, which z3 can solve effectively.
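To illustrate why constraints beat random inputs, here is a toy symbolic executor. It is a sketch under heavy assumptions: the input is a single unbounded integer, expressions stay linear (`a*x + b`), branches are equality checks, and a few lines of arithmetic replace z3. A real tool would model EVM bytecode with bitvectors and hand the path constraints to an SMT solver. All names (`Sym`, `Path`, `claim`, `TARGET`) are hypothetical.

```python
import random

TARGET = 3_000_000_007


def claim(x):
    """Toy contract function: pays out only for one magic input."""
    if x * 3 + 7 == TARGET:
        return 100
    return 0


class Path:
    """One execution path: forced branch outcomes plus the constraints seen."""

    def __init__(self, decisions):
        self.decisions = list(decisions)
        self.constraints = []  # (a, b, k, taken): a*x + b == k held iff taken

    def decide(self, a, b, k):
        i = len(self.constraints)
        taken = self.decisions[i] if i < len(self.decisions) else False
        self.constraints.append((a, b, k, taken))
        return taken


class Sym:
    """Symbolic value a*x + b over the unknown input x."""

    def __init__(self, a, b, path):
        self.a, self.b, self.path = a, b, path

    def __add__(self, k):
        return Sym(self.a, self.b + k, self.path)

    def __mul__(self, k):
        return Sym(self.a * k, self.b * k, self.path)

    def __eq__(self, k):  # a branch point: record it, follow the chosen side
        return self.path.decide(self.a, self.b, k)


def solve(constraints):
    """Find an integer x satisfying every recorded (dis)equality, else None."""
    candidates = None
    for a, b, k, taken in constraints:
        if taken and a != 0:
            if (k - b) % a != 0:
                return None
            candidates = [(k - b) // a]
            break
    if candidates is None:  # only disequalities: n of them exclude at most n values
        candidates = range(len(constraints) + 1)
    for x in candidates:
        if all((a * x + b == k) == taken for a, b, k, taken in constraints):
            return x
    return None


def find_profitable_input(fn):
    """Explore each path of fn; return (input, payout) for a path that pays."""
    worklist = [[]]
    while worklist:
        decisions = worklist.pop()
        path = Path(decisions)
        payout = fn(Sym(1, 0, path))
        for i in range(len(decisions), len(path.constraints)):
            prefix = [c[3] for c in path.constraints[:i]]
            worklist.append(prefix + [True])  # revisit with this branch flipped
        if payout > 0:
            x = solve(path.constraints)
            if x is not None:
                return x, payout
    return None


def random_search(fn, tries=10_000, seed=0):
    rng = random.Random(seed)
    return any(fn(rng.randrange(2**64)) > 0 for _ in range(tries))


found = find_profitable_input(claim)
```

The same `claim` function runs on plain integers and on `Sym` values. The symbolic run records the guard as a constraint and solves it in two executions, while `random_search` samples a 64-bit space in which a single value pays out.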
..people can get creative
The Concrete vs Symbolic Searcher
● “The Concrete”
○ Looks for alpha
○ Knows every contract inside out
○ Rewrites Solidity contracts in Rust for speed
○ Only does arbs, liqs & top of block
○ Maintenance takes all his time
○ Writes his own bots
● “The Symbolic”
○ Hasn’t checked the trends in weeks
○ The alpha finds him
○ Doesn’t know Solidity
○ Writes his properties once
○ Code is 5 years old
○ Lets the solver do his job
Future: Fast Symbolic MEV Extraction
● Rust Symbolic EVMs
○ https://github.com/WilfredTA/symbolic-stack-machines (goal to integrate in Foundry)
○ https://github.com/williamberman/evm-symbolic-execution/blob/master/EVM%20Symbolic%20Execution.ipynb
● EVM-specific SMT solvers
○ https://github.com/EVM-SMT/solver
● Solvers are largely heuristic-based → teach & generalize the heuristics via reinforcement learning? (e.g. train an RNN vs z3)
● Hardware acceleration: MEV ASICs?! (really SMT ASICs)
○ Probably a bad idea: z3 & others are heuristic-based, so if they take a long time they will never find the answer, and faster h/w will not help
Thank you for your attention!
Q&A?
[email protected]
@gakonst