Fine-Grained Fault Tolerance
using Device Checkpoints
Asim Kadav
with Matthew Renzelmann and Michael M. Swift
University of Wisconsin-Madison
Thursday, March 7, 13
The (old) elephant in the room
Device drivers (3rd party developers) + OS kernel = Recipe for disaster

Past work mostly looks at detection and isolation

Improvement            System                            Validation
                                                         Drivers  Bus  Classes
New functionality      Shadow driver migration [OSR09]   1        1    1
                       RevNIC [Eurosys 10]               1        1    1
Reliability            Nooks [SOSP 03]                   6        1    2
                       XFI [OSDI 06]                     2        1    1
                       CuriOS [OSDI 08]                  2        1    2
Type Safety            SafeDrive [OSDI 06]               6        2    3
                       Singularity [Eurosys 06]          1        1    1
Specification          Nexus [OSDI 08]                   2        1    2
                       Termite [SOSP 09]                 2        1    2
Static analysis tools  Windows SDV [Eurosys 06]          All      All  All
                       Coverity [CACM 10]                All      All  All
                       Coccinelle [Eurosys 08]           All      All  All

Large kernel subsystems and validation on only a few device types result in limited adoption of research solutions
Limited kernel changes + Applicable to lots of drivers => Real Impact
Goal: Improve recovery with complete solutions that can be applied to many drivers

State of the art in recovery: Shadow drivers
• Carburizer calls generic recovery service if check fails
• Low cost transparent recovery
  ★ Based on shadow drivers
  ★ Records state of driver at all times
  ★ Transparently restarts and replays recorded state on failure
[Figure: shadow driver, taps, driver-kernel interface, device driver, device. Source: Swift [OSDI ’04]]

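The record-and-replay idea behind shadow drivers can be sketched in a few lines of C. This is a minimal user-space illustration, not the shadow driver implementation: `struct request`, `struct driver_state`, `shadow_tap` and `shadow_recover` are hypothetical names, and resetting two fields stands in for the real (slow) driver restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical configuration requests seen at the driver-kernel interface. */
enum req_kind { REQ_SET_MTU, REQ_SET_PROMISC };

struct request { enum req_kind kind; int value; };

/* Hypothetical driver state that is lost when the driver restarts. */
struct driver_state { int mtu; int promisc; };

#define LOG_CAP 16

/* The shadow keeps a log of every state-changing request it has tapped. */
struct shadow { struct request log[LOG_CAP]; size_t len; };

static void driver_apply(struct driver_state *d, struct request r)
{
    if (r.kind == REQ_SET_MTU)
        d->mtu = r.value;
    else
        d->promisc = r.value;
}

/* Tap: record the request, then pass it through to the driver. */
static int shadow_tap(struct shadow *s, struct driver_state *d, struct request r)
{
    assert(s != NULL && d != NULL);
    if (s->len == LOG_CAP)
        return -1;              /* log full: refuse rather than lose state */
    s->log[s->len++] = r;
    driver_apply(d, r);
    return 0;
}

/* Recovery: reset the driver to defaults, then replay the log in order. */
static void shadow_recover(const struct shadow *s, struct driver_state *d)
{
    assert(s != NULL && d != NULL);
    d->mtu = 1500;              /* stand-in for restarting the driver */
    d->promisc = 0;
    for (size_t i = 0; i < s->len; i++)
        driver_apply(d, s->log[i]);
}
```

Only requests that pass through the tapped interface end up in the log, so any state set through another path is not replayed.
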
Recovery Performance: Device initialization is slow
★ Multi-second device probe
  ★ Identify device
  ★ Cold boot device
  ★ Setup device/driver structures
  ★ Configuration/Self-test
★ What does it hurt?
  ★ Fault tolerance: Driver recovery
  ★ Virtualization: Live migration, cloning, consolidation
  ★ OS functions: Boot, upgrade, NVM checkpoints
[Figure: device probe flowchart. Steps shown: allocate device structures; module registration; map BAR and I/O ports; register device operations; detect chipset capabilities; cold boot the hardware and flash device memory; verify EEPROM checksum; set chipset specific ops; optional self test on boot; allocate driver structures; configure device to working state; device ready for requests]

Driver Code Characteristics
[Figure: percentage of LOC (0 to 50) per driver class, grouped into char, block and net drivers, split by function: init, cleanup, ioctl, config, power, error, proc, core, intr]
★ Initialization/cleanup – 36%
★ Core I/O & interrupts – 23%
★ Device configuration – 15%
★ Power management – 7.4%
★ Device ioctl – 6.2%
Initialization code dominates driver LOC and adds to complexity
★ “Understanding Modern Device Drivers” ASPLOS 2012

Recovery works by interposing class-defined entry points
★ Class definition includes:
  ★ Callbacks registered with the bus, device and kernel subsystem
How many drivers follow class behavior?
[Figure: network driver with entry points probe, xmit and config, registered with the bus and the net device subsystem in the kernel, controlling the network card]

Restart/replay doesn’t work with all drivers
★ Non-class behavior stems from:
  - Load time parameters, procfs and sysfs interactions, unique ioctls

    ... qlcnic_sysfs_write_esw_config (...) {
        ...
        switch (esw_cfg[i].op_mode) {
        case QLCNIC_PORT_DEFAULTS:
            qlcnic_set_eswitch_...(..., &esw_cfg[i]);
            ...
        case QLCNIC_ADD_VLAN:
            qlcnic_set_vlan_config(..., &esw_cfg[i]);
            ...
        case QLCNIC_DEL_VLAN:
            esw_cfg[i].vlan_id = 0;
            qlcnic_set_vlan_config(..., &esw_cfg[i]);
            ...
    drivers/net/qlcnic/qlcnic_main.c: QLogic driver (network class)

★ Results as measured by our analyses:
  ★ 36% of drivers use load time parameters
  ★ 16% of drivers use procfs/sysfs support
  ★ Overall, 44% of drivers do not conform to class behavior and recovery will not work correctly for these drivers
★ “Understanding Modern Device Drivers” ASPLOS 2012

Limitations of restart/replay recovery
★ Device save/restore limited to restart/replay
★ Slow: Device initialization is complex (multiple seconds)
★ Incomplete: Unique device semantics not captured
★ Hard: Need to be written for every class of drivers
★ Large changes: Introduces new large kernel subsystems
Checkpoint/restore of device and driver state removes the need to reboot device and replay state

Fine-Grained Fault Tolerance (FGFT)
Goal: Fault isolation and recovery system based on “pay as you go” failure model
Fine-Grained Isolation
★ Ability to run select entry points as transactions
Checkpoint based recovery
★ Provides fast and correct recovery semantics
★ Requires incremental changes to drivers and has low overhead

Outline
Introduction
Fine-grained isolation
Checkpoint based recovery
Conclusion

FGFT overview
Static modifications
  Driver with checkpoint support
  User supplied annotations
  Source transformation (adds driver transactions)
  Main driver module
  SFI driver module (SFI = software fault isolated)
Run-time support
  Communication and recovery support
  Object tracking
  Marshaling/Demarshaling
  Kernel undo log
  1200 LOC

Fault model in FGFT
★ Provide fault tolerance to specific driver entry points
★ Can be applied to untested code, statically and dynamically detected suspicious entry points
★ Detect and recover from:
  ★ Memory errors like NULL pointer accesses
  ★ Structural errors like malformed structures
  ★ Processor exceptions like divide by zero, stack corruption
[Figure: network driver with entry points probe, xmit and config, controlling the network card]

Transactional support through code generation
★ Generate code to run driver invocations on a separate stack with a copy of parameters
★ Reduce copy overhead by copying only referenced fields in driver and kernel structures to a range table
★ Instrument all memory references in SFI module to compare accesses against copied fields in range table
[Figure: the get ringparam entry point of the network driver runs in the SFI network driver; the fields netdev->priv->tx_ring and netdev->priv->rx_ring appear in the range table]
Range Table
Address      Access rights
0xffffa000   Read
0xffffa008   Write
0xffffa00a   Read

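A range-table lookup of the kind the instrumentation would call before each memory reference might look like the sketch below. It is an illustration under assumptions, not FGFT's generated code: the entry layout, the `range_check` name, and the entry lengths are hypothetical (the slide lists only addresses and access rights).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum access { ACC_READ = 1, ACC_WRITE = 2 };

/* One range-table entry: a copied field and the rights granted on it. */
struct range { uintptr_t start; size_t len; unsigned rights; };

struct range_table { const struct range *entries; size_t count; };

/*
 * Returns 1 if [addr, addr + len) lies entirely inside one entry that
 * grants every right in `want`, and 0 otherwise. The containment test is
 * written with subtractions so that it cannot overflow.
 */
static int range_check(const struct range_table *t, uintptr_t addr,
                       size_t len, unsigned want)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        const struct range *r = &t->entries[i];
        int contained = addr >= r->start && len <= r->len &&
                        addr - r->start <= r->len - len;
        if (contained && (r->rights & want) == want)
            return 1;
    }
    return 0;   /* no entry covers the access: treat it as a fault */
}
```

An instrumented store would call `range_check(..., ACC_WRITE)` first and abort the transaction on a zero return.
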
Resource access during isolated execution
★ Device registers and I/O memory
  ★ Grant drivers full access to devices
  ★ Restore device checkpoint in case of failure
★ Locks: Spinlocks and semaphores
  ★ Grant read access to locks
  ★ Maintain kernel log of locks acquired
  ★ Release locks at the end of entry point/failures
★ Kernel resources like memory
  ★ All allocations generate range table entry
  ★ Maintain kernel log of all acquired resources
  ★ Free resources on failures
[Figure: malloc() adds an entry to the range table]
Range Table
Address      Access rights
0xffffa000   Read
0xffffa008   Write
0xffffa00a   Read

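The kernel log of locks and allocations described above can be sketched as a small undo log. This is a user-space sketch under assumptions: `struct toy_lock` stands in for a kernel spinlock, `malloc`/`free` stand in for kernel allocators, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum res_kind { RES_LOCK, RES_MEM };

struct res_entry { enum res_kind kind; void *obj; };

#define RES_LOG_CAP 32

/* Log of everything the isolated entry point has acquired so far. */
struct res_log { struct res_entry e[RES_LOG_CAP]; size_t len; };

/* Stand-in for a spinlock; a real kernel would use its own lock type. */
struct toy_lock { int held; };

static int log_push(struct res_log *l, enum res_kind k, void *obj)
{
    assert(l != NULL);
    if (l->len == RES_LOG_CAP)
        return -1;
    l->e[l->len].kind = k;
    l->e[l->len].obj = obj;
    l->len++;
    return 0;
}

/* Acquire a lock and remember it so it can be released later. */
static int tracked_lock(struct res_log *l, struct toy_lock *lk)
{
    if (log_push(l, RES_LOCK, lk) != 0)
        return -1;
    lk->held = 1;
    return 0;
}

/* Allocate memory and remember it so it can be freed on failure. */
static void *tracked_alloc(struct res_log *l, size_t n)
{
    void *p = malloc(n);
    if (p != NULL && log_push(l, RES_MEM, p) != 0) {
        free(p);
        p = NULL;
    }
    return p;
}

/*
 * End of the entry point. Locks are always released; allocations are
 * freed only when the entry point failed. Entries are undone in reverse
 * order of acquisition.
 */
static void log_finish(struct res_log *l, int failed)
{
    while (l->len > 0) {
        struct res_entry *e = &l->e[--l->len];
        if (e->kind == RES_LOCK)
            ((struct toy_lock *)e->obj)->held = 0;
        else if (failed)
            free(e->obj);
    }
}
```

Undoing in reverse order keeps lock release consistent with the usual nesting discipline.
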
Checkpointing drivers is hard
★ Existing mechanisms limited to capturing memory state
★ Device state is not captured
  ★ Device configuration space
  ★ Internal device registers and counters
  ★ Memory buffer addresses used for DMA
  ★ Unique for every class, bus and vendor
[Figure: a checkpoint covers the network driver but not the network card]

Device checkpoint/restore from PM code
Suspend: Save config state → Save register state → Disable device → Copy-out s/w state → Suspend device
Resume: Restore config state → Restore register state → Restore s/w state & reset → Re-attach/Enable device → Device Ready
Checkpoint (suspend without the disable and suspend steps): Save config state → Save register state → Copy-out s/w state
Restore (resume without the re-attach/enable step): Restore config state → Restore register state → Restore s/w state & reset
Suspend/resume code provides device checkpoint functionality

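The refactoring from suspend/resume into checkpoint/restore can be sketched as follows. This is a toy model, not a real Linux driver: `struct toy_device`, its register and config-space sizes, and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_WORDS 16
#define NUM_REGS   8

/* Toy device: config space, registers and a power flag (all hypothetical). */
struct toy_device {
    uint32_t cfg[CFG_WORDS];
    uint32_t regs[NUM_REGS];
    int powered;
};

/* Driver software state, the "s/w state" on the slide. */
struct toy_sw_state { uint32_t tx_head; uint32_t rx_head; };

struct toy_driver { struct toy_device *dev; struct toy_sw_state sw; };

struct checkpoint {
    uint32_t cfg[CFG_WORDS];
    uint32_t regs[NUM_REGS];
    struct toy_sw_state sw;
};

/* Checkpoint: save config state, register state and software state. */
static void toy_checkpoint(const struct toy_driver *d, struct checkpoint *c)
{
    assert(d != NULL && d->dev != NULL && c != NULL);
    memcpy(c->cfg, d->dev->cfg, sizeof c->cfg);
    memcpy(c->regs, d->dev->regs, sizeof c->regs);
    c->sw = d->sw;
}

/* Restore: write the saved state back without re-probing the device. */
static void toy_restore(struct toy_driver *d, const struct checkpoint *c)
{
    assert(d != NULL && d->dev != NULL && c != NULL);
    memcpy(d->dev->cfg, c->cfg, sizeof c->cfg);
    memcpy(d->dev->regs, c->regs, sizeof c->regs);
    d->sw = c->sw;
}

/* Suspend is the checkpoint plus powering the device down. */
static void toy_suspend(struct toy_driver *d, struct checkpoint *c)
{
    toy_checkpoint(d, c);
    d->dev->powered = 0;
}

/* Resume powers the device up and then restores the checkpoint. */
static void toy_resume(struct toy_driver *d, const struct checkpoint *c)
{
    d->dev->powered = 1;
    toy_restore(d, c);
}
```

Keeping suspend and resume as thin wrappers means the state-saving code exists once and serves both power management and recovery.
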
Intuition with power management
★ Intuition: Power management code captures device specific state for every driver
★ Our study: Present in 76% of all common classes
★ Refactor power management code for device checkpoints
  ★ Correct: Developer captures unique device semantics
  ★ Fast: Avoids probe and latency critical for applications
[Figure: suspend() saves state from the Linux driver and device to RAM; resume() restores it]

Synergy of isolation and recovery
★ Goal: Improve driver recovery with minor changes to drivers
★ Solution: Run drivers as transactions using device checkpoints
Device state (C, R)
★ Developers export checkpoint/restore in drivers
Driver state (SFI network driver alongside the network driver)
★ Run driver invocations as memory transactions
★ Use source transformation to copy parameters and run on separate stack
Execution model
★ Checkpoint device
★ Execute driver code as memory transactions
★ On failure, rollback and restore device
★ Re-use existing device locks in the driver

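The execution model above (checkpoint, execute, then commit or roll back) can be sketched with plain structure copies. This is a simplified illustration, not FGFT's generated code: the structures, `run_transaction` and `set_ring` are hypothetical, and an error return stands in for a fault detected by the isolation checks.

```c
#include <assert.h>
#include <stddef.h>

struct dev_regs { int ctrl; int ring_size; };    /* hypothetical device state */
struct drv_state { int tx_ring; int rx_ring; };  /* hypothetical driver state */

typedef int (*entry_fn)(struct drv_state *drv, struct dev_regs *dev, int arg);

/* Runs one entry point as a transaction: 0 on commit, -1 on rollback. */
static int run_transaction(entry_fn fn, struct drv_state *drv,
                           struct dev_regs *dev, int arg)
{
    assert(fn != NULL && drv != NULL && dev != NULL);

    struct dev_regs ckpt = *dev;    /* 1. checkpoint the device */
    struct drv_state copy = *drv;   /* 2. run on a copy of driver state */

    if (fn(&copy, dev, arg) != 0) { /* 3. the entry point touches the device */
        *dev = ckpt;                /* 4a. failure: restore device, drop copy */
        return -1;
    }
    *drv = copy;                    /* 4b. success: copy results back */
    return 0;
}

/* Example entry point that fails after it has already modified state. */
static int set_ring(struct drv_state *drv, struct dev_regs *dev, int size)
{
    dev->ring_size = size;
    drv->tx_ring = size;
    if (size <= 0)
        return -1;                  /* stand-in for a detected fault */
    drv->rx_ring = size;
    return 0;
}
```

After a failed call, the caller sees the device and driver state exactly as they were before the invocation.
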
Example transactional execution
[Figure: get ringparam (one of the entry points probe, xmit, get config, get ringparam) is invoked on the network driver with netdev. A device checkpoint (C) is taken, the fields netdev->priv->tx_ring and netdev->priv->rx_ring are copied to the SFI network driver and listed in the range table, an alloc is recorded in the kernel log, and the fields and the result are returned.]
Range Table
Address      Access rights
0xffffa000   Read
0xffffa008   Write
0xffffa00a   Read

Example failed transaction
[Figure: get ringparam is invoked on the network driver with netdev. A device checkpoint (C) is taken, the fields netdev->priv->tx_ring and netdev->priv->rx_ring are copied to the SFI network driver and listed in the range table, and an alloc is recorded in the kernel log. The invocation fails: the kernel log entry is removed, an error (err) is returned, and the device is restored (R) from the checkpoint.]
Range Table
Address      Access rights
0xffffa000   Read
0xffffa008   Write
0xffffa00a   Read
FGFT provides transactional execution of driver entry points

Outline
Introduction
Fine-grained isolation
Checkpoint based recovery
Evaluation & Conclusion

Recovery speedup
Driver    Class   Bus     Restart recovery   FGFT recovery   Speedup
8139too   net     PCI     0.31s              70μs            4400
e1000     net     PCI     1.80s              295ms           6
r8169     net     PCI     0.12s              40μs            3000
pegasus   net     USB     0.15s              5ms             30
ens1371   sound   PCI     1.03s              115ms           9
psmouse   input   serio   0.68s              410ms           1.65
FGFT provides significant speedup in driver recovery

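The Speedup column is the ratio of restart recovery time to FGFT recovery time once both are expressed in the same unit. A small helper makes the unit conversion explicit; the function name is an assumption for this example. For 8139too, 0.31 s divided by 70 μs is roughly 4429, which the table rounds to 4400.

```c
#include <assert.h>

/* Speedup = restart recovery time / FGFT recovery time, both in seconds. */
static double speedup(double restart_s, double fgft_s)
{
    assert(fgft_s > 0.0);
    return restart_s / fgft_s;
}
```

For example, `speedup(0.15, 5e-3)` corresponds to the pegasus row (0.15 s against 5 ms).
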
Static and dynamic fault injection
Driver    Injected faults   Benign faults   Native crashes   FGFT crashes
8139too   43                0               43               NONE
e1000     47                0               47               NONE
r8169     36                0               36               NONE
pegasus   34                1               33               NONE
ens1371   22                1               21               NONE
psmouse   46                0               46               NONE
TOTAL     258               2               256              NONE
FGFT survives multiple static and dynamic faults

Programming effort
                    Isolation annotations   Recovery additions
Driver    LOC       Driver     Kernel       LOC moved   LOC added
8139too   1,904     15         20           26          4
e1000     13,973    32         -            32          10
r8169     2,993     10         -            17          5
pegasus   1,541     26         12           22          5
ens1371   2,110     23         66           16          6
psmouse   2,448     11         19           19          6
FGFT requires limited programmer effort and needs only 38 lines of new kernel code

Throughput with isolation and recovery
e1000 network card, netperf on Intel quad-core machines; throughput as a percentage of the 844 Mbps baseline
Configuration   Throughput (%)   CPU
Native          100              2.4%
FGFT-off-I/O    100              2.4%
FGFT-I/O-1/2    96               2.9%
FGFT-I/O-all    93               3.4%
FGFT can isolate and recover high bandwidth devices at low overhead without adding kernel subsystems

Summary
★ Fine-Grained Fault Tolerance based on a pay-as-you-go model
★ Provides fault tolerance at incremental performance costs and programmer efforts
★ Introduces fast checkpointing for drivers
★ Device checkpoints average ~20 μs
★ Reduces recovery time significantly
★ Should be explored in other domains apart from fault tolerance, like fast reboot, upgrade etc.

Questions
Asim Kadav
★ http://cs.wisc.edu/~kadav
★ kadav@cs.wisc.edu

Workshop su Android Kernel Hacking
Develer S.r.l.
 
Java mission control and java flight recorder
Java mission control and java flight recorderJava mission control and java flight recorder
Java mission control and java flight recorder
Wolfgang Weigend
 
Advanced NDISTest options
Advanced NDISTest optionsAdvanced NDISTest options
Advanced NDISTest options
Yan Vugenfirer
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
Linaro
 
Persistent Bios Infection
Persistent Bios InfectionPersistent Bios Infection
Persistent Bios Infection
guest042636
 
Persistent BIOS Infection
Persistent BIOS InfectionPersistent BIOS Infection
Persistent BIOS Infection
guest042636
 
Testing CAN network with help of CANToolz
Testing CAN network with help of CANToolzTesting CAN network with help of CANToolz
Testing CAN network with help of CANToolz
Alexey Sintsov
 
ACPI and FreeBSD (Part 2)
ACPI and FreeBSD (Part 2)ACPI and FreeBSD (Part 2)
ACPI and FreeBSD (Part 2)
Nate Lawson
 
Mandriva 2011 x86_64 rpm.lst
Mandriva 2011 x86_64 rpm.lstMandriva 2011 x86_64 rpm.lst
Mandriva 2011 x86_64 rpm.lst
St Louis MUG
 
ACPI and FreeBSD (Part 1)
ACPI and FreeBSD (Part 1)ACPI and FreeBSD (Part 1)
ACPI and FreeBSD (Part 1)
Nate Lawson
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜
Ryousei Takano
 
Insecure Obsolete and Trivial - The Real IOT
Insecure Obsolete and Trivial - The Real IOTInsecure Obsolete and Trivial - The Real IOT
Insecure Obsolete and Trivial - The Real IOT
Price McDonald
 
LCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted Firmware
Linaro
 
Note - (EDK2) Acpi Tables Compile and Install
Note - (EDK2) Acpi Tables Compile and InstallNote - (EDK2) Acpi Tables Compile and Install
Note - (EDK2) Acpi Tables Compile and Install
boyw165
 


Fine-grained fault tolerance using device checkpoints

  • 1. Fine-Grained Fault Tolerance using Device Checkpoints Asim Kadav with Matthew Renzelmann and Michael M. Swift University of Wisconsin-Madison Thursday, March 7, 13
  • 2. The (old) elephant in the room: device drivers (3rd party developers + OS kernel)
  • 4. The (old) elephant in the room: device drivers (3rd party developers + OS kernel) are a recipe for disaster
  • 5. Past work mostly looks at detection and isolation

      Improvement             System                            Validation
                                                                Drivers  Bus  Classes
      New functionality       Shadow driver migration [OSR09]   1        1    1
                              RevNIC [Eurosys 10]               1        1    1
      Reliability             Nooks [SOSP 03]                   6        1    2
                              XFI [OSDI 06]                     2        1    1
                              CuriOS [OSDI 08]                  2        1    2
      Type Safety             SafeDrive [OSDI 06]               6        2    3
                              Singularity [Eurosys 06]          1        1    1
      Specification           Nexus [OSDI 08]                   2        1    2
                              Termite [SOSP 09]                 2        1    2
      Static analysis tools   Windows SDV [Eurosys 06]          All      All  All
                              Coverity [CACM 10]                All      All  All
                              Coccinelle [Eurosys 08]           All      All  All

  • 6. Large kernel subsystems and validation on few device types result in limited adoption of research solutions
  • 7. Limited kernel changes + Applicable to lots of drivers => Real Impact
  • 8. Goal: Improve recovery with complete solutions that can be applied to many drivers
  • 9. State of the art in recovery: Shadow drivers
      • Carburizer calls generic recovery service if check fails
      • Low cost transparent recovery
        ★ Based on shadow drivers
        ★ Records state of driver at all times
        ★ Transparently restarts and replays recorded state on failure
      [Figure: shadow driver, device driver and device, with taps on the driver-kernel interface] Swift [OSDI ’04]
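
The record-and-replay idea on slide 9 can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the shadow driver implementation: `toy_driver`, `shadow_log`, `shadow_call` and `shadow_recover` are hypothetical names, and the "driver" is a plain struct rather than real hardware state.

```c
#include <stddef.h>

/* Hypothetical driver state; every name in this sketch is illustrative. */
struct toy_driver {
    int initialized;
    int mtu;
    int promisc;
};

enum op_kind { OP_SET_MTU, OP_SET_PROMISC };

struct logged_op {
    enum op_kind kind;
    int arg;
};

#define LOG_CAPACITY 16

struct shadow_log {
    struct logged_op ops[LOG_CAPACITY];
    size_t count;
};

/* Stands in for the (slow) device probe: returns the driver to defaults. */
void driver_init(struct toy_driver *d)
{
    d->initialized = 1;
    d->mtu = 1500;
    d->promisc = 0;
}

/* Applies one configuration request to the driver. */
void driver_apply(struct toy_driver *d, struct logged_op op)
{
    switch (op.kind) {
    case OP_SET_MTU:     d->mtu = op.arg;     break;
    case OP_SET_PROMISC: d->promisc = op.arg; break;
    }
}

/* Tap: forward the request to the driver and record it for later replay. */
int shadow_call(struct shadow_log *slog, struct toy_driver *d,
                enum op_kind kind, int arg)
{
    struct logged_op op = { kind, arg };

    if (slog->count == LOG_CAPACITY)
        return -1;              /* log full: refuse rather than lose state */
    driver_apply(d, op);
    slog->ops[slog->count++] = op;
    return 0;
}

/* Recovery: restart the driver, then replay every recorded request. */
void shadow_recover(const struct shadow_log *slog, struct toy_driver *d)
{
    size_t i;

    driver_init(d);
    for (i = 0; i < slog->count; i++)
        driver_apply(d, slog->ops[i]);
}
```

In this sketch the cost of recovery is dominated by `driver_init`, which mirrors the observation on the following slides that device initialization is the slow part of restart/replay.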
  • 10. Recovery Performance: Device initialization is slow
      [Figure: device probe flowchart with the steps: allocate device structures, module registration, map BAR and I/O ports, register device operations, detect chipset capabilities, optional self test on boot, cold boot the device (flash device memory), verify EEPROM checksum, set chipset specific ops, allocate driver structures, configure device to working state, device ready for requests]
      ★ Multi-second device probe
        ★ Identify device
        ★ Cold boot device
        ★ Setup device/driver structures
        ★ Configuration/Self-test
  • 11. Recovery Performance: Device initialization is slow
      ★ What does it hurt?
        ★ Fault tolerance: Driver recovery
        ★ Virtualization: Live migration, cloning, consolidation
        ★ OS functions: Boot, upgrade, NVM checkpoints
  • 12. Driver Code Characteristics
      ★ “Understanding Modern Device Drivers” ASPLOS 2012
  • 13. Driver Code Characteristics
      [Figure: percentage of driver LOC (0 to 50) spent on init, cleanup, ioctl, config, power, error, proc, core and intr, per driver class, grouped into char drivers, block drivers and net drivers]
  • 14. Driver Code Characteristics
      ★ Initialization/cleanup – 36%
      ★ Core I/O & interrupts – 23%
      ★ Device configuration – 15%
      ★ Power management – 7.4%
      ★ Device ioctl – 6.2%
  • 15. Driver Code Characteristics
      Initialization code dominates driver LOC and adds to complexity
  • 17. Recovery works by interposing class-defined entry points
      ★ Class definition includes:
        ★ Callbacks registered with the bus, device and kernel subsystem
      [Figure labels: kernel (bus, net device subsystem); network driver with probe, xmit and config entry points; network card]
      How many drivers follow class behavior?
  • 18. Restart/replay doesn’t work with all drivers
      ★ Non-class behavior stems from:
        - Load time parameters, procfs and sysfs interactions, unique ioctls
      drivers/net/qlcnic/qlcnic_main.c: QLogic driver (network class)
          qlcnic_sysfs_write_esw_config(...)
          {
              ...
              switch (esw_cfg[i].op_mode) {
              case QLCNIC_PORT_DEFAULTS:
                  qlcnic_set_eswitch_...(..., &esw_cfg[i]);
                  ...
              case QLCNIC_ADD_VLAN:
                  qlcnic_set_vlan_config(..., &esw_cfg[i]);
                  ...
              case QLCNIC_DEL_VLAN:
                  esw_cfg[i].vlan_id = 0;
                  qlcnic_set_vlan_config(..., &esw_cfg[i]);
                  ...
      ★ “Understanding Modern Device Drivers” ASPLOS 2012
  • 19. Restart/replay doesn’t work with all drivers
      ★ Non-class behavior stems from:
        - Load time parameters, procfs and sysfs interactions, unique ioctls
      ★ Results as measured by our analyses:
        ★ 36% of drivers use load time parameters
        ★ 16% of drivers use procfs/sysfs support
        ★ Overall, 44% of drivers do not conform to class behavior, and recovery will not work correctly for these drivers
      ★ “Understanding Modern Device Drivers” ASPLOS 2012
  • 21. Limitations of restart/replay recovery
      [Figure labels: Shadow Driver; Taps; Driver-Kernel Interface; Device Driver; Device]
      ★ Device save/restore limited to restart/replay
      ★ Slow: Device initialization is complex (multiple seconds)
      ★ Incomplete: Unique device semantics not captured
      ★ Hard: Need to be written for every class of drivers
      ★ Large changes: Introduces new large kernel subsystems
      Checkpoint/restore of device and driver state removes the need to reboot device and replay state
  • 22. Fine-Grained Fault Tolerance (FGFT)
      Goal: Fault isolation and recovery system based on “pay as you go” failure model
      Fine-Grained Isolation
        ★ Ability to run select entry points as transactions
      Checkpoint based recovery
        ★ Provides fast and correct recovery semantics
      ★ Requires incremental changes to drivers and has low overhead
  • 29. FGFT overview
      Static modifications
        ★ Driver with checkpoint support
        ★ User supplied annotations
        ★ Source transformation (adds driver transactions)
        ★ Main driver module
        ★ SFI driver module (SFI = software fault isolated)
      Run-time support
        ★ Communication and recovery support
        ★ Object tracking
        ★ Marshaling/Demarshaling
        ★ Kernel undo log
        ★ 1200 LOC
  • 31. Fault model in FGFT
      ★ Provide fault tolerance to specific driver entry points
      [Figure labels: network driver with probe, xmit and config entry points; network card]
      ★ Can be applied to untested code, statically and dynamically detected suspicious entry points
      ★ Detect and recover from:
        ★ Memory errors like NULL pointer accesses
        ★ Structural errors like malformed structures
        ★ Processor exceptions like divide by zero, stack corruption
  • 32. Transactional support through code generation
      ★ Generate code to run driver invocations on a separate stack with a copy of parameters
      ★ Reduce copy overhead by copying only referenced fields in driver and kernel structures to a range table
      ★ Instrument all memory references in SFI module to compare accesses against copied fields in range table
      [Figure labels: network driver; SFI network driver; get ringparam; netdev->priv->tx_ring; netdev->priv->rx_ring]
      Range Table
        Address      Access rights
        0xffffa000   Read
        0xffffa008   Write
        0xffffa00a   Read
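The range-table check described on this slide can be sketched in plain C. This is a minimal illustration, not FGFT's generated code: `rt_entry`, `rt_check` and `example_table` are hypothetical names, the per-entry lengths are assumptions (the slide lists only addresses), and treating Write as including Read is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Access rights; the slide shows only "Read" and "Write". */
enum rt_access { RT_READ = 1, RT_WRITE = 2 };

/* One copied field. The length is an assumption: the slide lists
 * only start addresses. */
struct rt_entry {
    uintptr_t addr;    /* start address of the copied field */
    size_t    len;     /* size of the field in bytes (assumed) */
    unsigned  rights;  /* bitmask of rt_access values */
};

/* Addresses taken from the slide; lengths assumed, and a Write entry
 * is assumed to allow reads as well. */
static const struct rt_entry example_table[] = {
    { 0xffffa000u, 8, RT_READ },
    { 0xffffa008u, 2, RT_READ | RT_WRITE },
    { 0xffffa00au, 2, RT_READ },
};

/* Returns 1 if the access [addr, addr + len) falls entirely inside one
 * table entry that grants every right in `want`, and 0 otherwise. */
static int rt_check(const struct rt_entry *tbl, size_t n,
                    uintptr_t addr, size_t len, unsigned want)
{
    for (size_t i = 0; i < n; i++) {
        const struct rt_entry *e = &tbl[i];
        if (addr >= e->addr && len <= e->len &&
            addr - e->addr <= e->len - len &&
            (e->rights & want) == want)
            return 1;
    }
    return 0;
}
```

In this sketch an instrumented store would call `rt_check(..., RT_WRITE)` before executing, and a zero result would be treated as a fault in the isolated entry point.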
  • 33. Resource access during isolated execution
      ★ Device registers and I/O memory
        ★ Grant drivers full access to devices
        ★ Restore device checkpoint in case of failure
      ★ Locks: Spinlocks and semaphores
        ★ Grant read access to locks
        ★ Maintain kernel log of locks acquired
        ★ Release locks at the end of entry point/failures
      ★ Kernel resources like memory
        ★ All allocations generate range table entry
        ★ Maintain kernel log of all acquired resources
        ★ Free resources on failures
      [Figure: malloc() adds an entry to the Range Table (0xffffa000 Read, 0xffffa008 Write, 0xffffa00a Read)]
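The lock and allocation rules above can be sketched as a small undo log. This is a user-space illustration under stated assumptions: `undo_log`, `toy_lock`, `tx_alloc`, `tx_lock` and `tx_end` are hypothetical names, `malloc`/`free` stand in for kernel allocators, and a flag stands in for a real spinlock.

```c
#include <stddef.h>
#include <stdlib.h>

enum res_kind { RES_LOCK, RES_ALLOC };

struct undo_entry { enum res_kind kind; void *res; };

#define UNDO_MAX 16
struct undo_log { struct undo_entry e[UNDO_MAX]; int n; };

/* Toy lock: a flag stands in for a spinlock or semaphore. */
struct toy_lock { int held; };

/* Record a resource in the log; returns -1 when the log is full. */
static int log_add(struct undo_log *l, enum res_kind kind, void *res)
{
    if (l->n >= UNDO_MAX)
        return -1;
    l->e[l->n].kind = kind;
    l->e[l->n].res = res;
    l->n++;
    return 0;
}

/* Allocate memory and log it so that it can be freed on failure. */
static void *tx_alloc(struct undo_log *l, size_t size)
{
    void *p = malloc(size);
    if (p != NULL && log_add(l, RES_ALLOC, p) != 0) {
        free(p);
        p = NULL;
    }
    return p;
}

/* Acquire a lock and log it so that it is always released. */
static int tx_lock(struct undo_log *l, struct toy_lock *lk)
{
    if (log_add(l, RES_LOCK, lk) != 0)
        return -1;
    lk->held = 1;
    return 0;
}

/* End of the entry point. Locks are released in both cases; allocations
 * are freed only when the entry point failed. Entries are undone in
 * reverse order. Returns the number of allocations freed. */
static int tx_end(struct undo_log *l, int failed)
{
    int freed = 0;
    while (l->n > 0) {
        struct undo_entry *u = &l->e[--l->n];
        if (u->kind == RES_LOCK) {
            ((struct toy_lock *)u->res)->held = 0;
        } else if (failed) {
            free(u->res);
            freed++;
        }
    }
    return freed;
}
```

Walking the log in reverse mirrors the usual undo-log convention of releasing resources in the opposite order from acquisition; the slide itself does not specify an order.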
  • 40. Checkpointing drivers is hard
      ★ Existing mechanisms limited to capturing memory state
      [Figure labels: network driver; network card; checkpoint]
      ★ Device state is not captured
        ★ Device configuration space
        ★ Internal device registers and counters
        ★ Memory buffer addresses used for DMA
        ★ Unique for every class, bus and vendor
  • 41. Device checkpoint/restore from PM code
      Suspend: Save config state, Save register state, Disable device, Copy-out s/w state, Suspend device
      Resume: Restore config state, Restore register state, Restore s/w state & reset, Re-attach/Enable device, Device Ready
  • 49. Device checkpoint/restore from PM code
      Checkpoint: Save config state, Save register state, Copy-out s/w state
      Restore: Restore config state, Restore register state, Restore s/w state & reset
      Suspend/resume code provides device checkpoint functionality
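One way to picture the refactoring is to factor the state-saving half out of suspend and the state-restoring half out of resume. The sketch below uses a toy device model, not a real Linux driver: `toy_dev`, `toy_ckpt` and the four functions are hypothetical, and the power and enable flags stand in for the Disable/Suspend and Re-attach/Enable steps that the checkpoint path skips.

```c
#include <string.h>

#define NREGS 4

/* Toy device: a config word, a few registers, and power/enable flags. */
struct toy_dev {
    unsigned config;
    unsigned regs[NREGS];
    int enabled;
    int powered;
};

/* Saved state: the part of the device that suspend already captures. */
struct toy_ckpt {
    unsigned config;
    unsigned regs[NREGS];
};

/* Checkpoint: save config and register state; the device keeps running. */
static void dev_checkpoint(const struct toy_dev *d, struct toy_ckpt *c)
{
    c->config = d->config;
    memcpy(c->regs, d->regs, sizeof c->regs);
}

/* Restore: write the saved state back without re-probing the device. */
static void dev_restore(struct toy_dev *d, const struct toy_ckpt *c)
{
    d->config = c->config;
    memcpy(d->regs, c->regs, sizeof d->regs);
}

/* Suspend = checkpoint, then disable and power down the device. */
static void dev_suspend(struct toy_dev *d, struct toy_ckpt *c)
{
    dev_checkpoint(d, c);
    d->enabled = 0;
    d->powered = 0;
}

/* Resume = power up, restore, then re-enable the device. */
static void dev_resume(struct toy_dev *d, const struct toy_ckpt *c)
{
    d->powered = 1;
    dev_restore(d, c);
    d->enabled = 1;
}
```

With this split, recovery can call `dev_restore` on a running device, while the power-management path keeps using `dev_suspend` and `dev_resume` unchanged.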
  • 56. Intuition with power management
      ★ Intuition: Power management code captures device specific state for every driver
      ★ Our study: Present in 76% of all common classes
      [Figure labels: Linux driver; Device; RAM; suspend(); resume()]
      ★ Refactor power management code for device checkpoints
        ★ Correct: Developer captures unique device semantics
        ★ Fast: Avoids probe, and latency is critical for applications
  • 60. Synergy of isolation and recovery
      ★ Goal: Improve driver recovery with minor changes to drivers
      ★ Solution: Run drivers as transactions using device checkpoints
      Device state
        ★ Developers export checkpoint/restore (C/R) in drivers
      Driver state
        ★ Run driver invocations as memory transactions
        ★ Use source transformation to copy parameters and run on separate stack
        [Figure labels: network driver; SFI network driver]
      Execution model
        ★ Checkpoint device
        ★ Execute driver code as memory transactions
        ★ On failure, rollback and restore device
        ★ Re-use existing device locks in the driver
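The execution model can be sketched as a wrapper around a driver entry point. This is a simplified illustration with hypothetical names (`run_as_transaction`, `dev_state`, `drv_state`): a non-zero return code stands in for fault detection, which in FGFT comes from the instrumented SFI module, and plain struct copies stand in for the generated marshaling code and the driver's own checkpoint/restore routines.

```c
/* Toy device and driver state. */
struct dev_state { unsigned regs[2]; };
struct drv_state { int tx_ring; int rx_ring; };

/* A driver entry point works on a copy of driver state and on the device.
 * Returning non-zero models a detected fault (an assumption of this sketch). */
typedef int (*entry_fn)(struct drv_state *copy, struct dev_state *dev);

static int run_as_transaction(entry_fn fn, struct drv_state *drv,
                              struct dev_state *dev)
{
    struct dev_state ckpt = *dev;    /* 1. checkpoint the device */
    struct drv_state shadow = *drv;  /* 2. copy the parameters */
    int rc = fn(&shadow, dev);       /* 3. run the entry point on the copy */

    if (rc != 0) {
        *dev = ckpt;                 /* 4a. failure: restore the device and */
        return rc;                   /*     drop the copy (rollback) */
    }
    *drv = shadow;                   /* 4b. success: commit the copy */
    return 0;
}

/* Example entry points: one that succeeds and one that fails midway. */
static int good_entry(struct drv_state *s, struct dev_state *d)
{
    s->tx_ring = 256;
    d->regs[0] = 1;
    return 0;
}

static int bad_entry(struct drv_state *s, struct dev_state *d)
{
    s->rx_ring = -1;     /* partial update to driver state */
    d->regs[1] = 0xbad;  /* partial update to device state */
    return -1;           /* fault detected */
}
```

After `bad_entry` fails, both the driver state and the device registers hold the values they had before the call, which is the property the slide calls running drivers as transactions.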
  • 68. Example transactional execution
      [Figure labels: network driver with probe, xmit, get config and get ringparam entry points; SFI network driver; C (checkpoint); netdev; netdev->priv->tx_ring; netdev->priv->rx_ring; Kernel Log (alloc); result]
      Range Table
        Address      Access rights
        0xffffa000   Read
        0xffffa008   Write
        0xffffa00a   Read
  • 82. Example failed transaction
      [Figure labels: network driver with probe, xmit, get config and get ringparam entry points; SFI network driver; C (checkpoint); R (restore); netdev; netdev->priv->tx_ring; netdev->priv->rx_ring; err. Earlier builds of this slide also show a Kernel Log entry (alloc), which is gone by the time err and R appear]
      Range Table
        Address      Access rights
        0xffffa000   Read
        0xffffa008   Write
        0xffffa00a   Read
      FGFT provides transactional execution of driver entry points
  • 83. Outline: Introduction; Fine-grained isolation; Checkpoint based recovery; Evaluation & Conclusions
  • 85. Recovery speedup
      Driver    Class  Bus    Restart recovery  FGFT recovery  Speedup
      8139too   net    PCI    0.31s             70μs           4400
      e1000     net    PCI    1.80s             295ms          6
      r8169     net    PCI    0.12s             40μs           3000
      pegasus   net    USB    0.15s             5ms            30
      ens1371   sound  PCI    1.03s             115ms          9
      psmouse   input  serio  0.68s             410ms          1.65
      FGFT provides significant speedup in driver recovery
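The Speedup column appears to be the ratio of restart recovery time to FGFT recovery time; that reading is an inference from the numbers, not something the slide states. A short C sketch of the arithmetic, with the times converted to seconds (`recovery_speedup` and `recovery_rows` are hypothetical names):

```c
/* Speedup = restart recovery time / FGFT recovery time, both in seconds. */
static double recovery_speedup(double restart_s, double fgft_s)
{
    return restart_s / fgft_s;
}

struct recovery_row {
    const char *driver;
    double restart_s;  /* restart recovery time in seconds */
    double fgft_s;     /* FGFT recovery time in seconds */
};

/* Times from the slide, converted to seconds. */
static const struct recovery_row recovery_rows[] = {
    { "8139too", 0.31, 70e-6 },
    { "e1000",   1.80, 295e-3 },
    { "r8169",   0.12, 40e-6 },
    { "pegasus", 0.15, 5e-3 },
    { "ens1371", 1.03, 115e-3 },
    { "psmouse", 0.68, 410e-3 },
};
```

For 8139too this gives 0.31 / 0.00007, roughly 4429, which the slide reports as 4400; the other rows agree with the slide to within similar rounding.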
  • 86. Static and dynamic fault injection
      Driver    Injected Faults  Benign Faults  Native Crashes  FGFT Crashes
      8139too   43               0              43              NONE
      e1000     47               0              47              NONE
      r8169     36               0              36              NONE
      pegasus   34               1              33              NONE
      ens1371   22               1              21              NONE
      psmouse   46               0              46              NONE
      TOTAL     258              2              256             NONE
      FGFT survives multiple static and dynamic faults
  • 87. Programming effort
                        Isolation annotations                    Recovery additions
      Driver    LOC     Driver annotations  Kernel annotations  LOC Moved  LOC Added
      8139too   1,904   15                  20                  26         4
      e1000     13,973  32                  (blank)             32         10
      r8169     2,993   10                  (blank)             17         5
      pegasus   1,541   26                  12                  22         5
      ens1371   2,110   23                  66                  16         6
      psmouse   2,448   11                  19                  19         6
      FGFT requires limited programmer effort and needs only 38 lines of new kernel code
  • 94. Throughput with isolation and recovery
      [Figure: e1000 network card, netperf on Intel quad-core machines; throughput as a percentage of the 844 Mbps baseline]
      Configuration   Throughput  CPU
      Native          100%        2.4%
      FGFT-off-I/O    100%        2.4%
      FGFT-I/O-1/2    96%         2.9%
      FGFT-I/O-all    93%         3.4%
      FGFT can isolate and recover high bandwidth devices at low overhead without adding kernel subsystems
  • 96. Summary
      ★ Fine-Grained Fault Tolerance based on a pay-as-you-go model
      ★ Provides fault tolerance at incremental performance costs and programmer efforts
      ★ Introduces fast checkpointing for drivers
      ★ Device checkpoints average ~20 μs
      ★ Reduces recovery time significantly
      ★ Should be explored in other domains apart from fault tolerance like fast reboot, upgrade etc.