Speculative Execution in A Distributed File System: E. B. Nightingale P. M. Chen J. Flint

Speculative execution lets distributed file system clients predict the outcomes of remote operations and continue executing speculatively instead of blocking, which reduces latency. The system tracks causal dependencies to ensure correctness. The evaluation shows SpecNFS is 2 to 14 times faster than NFS when building Apache, and its performance degrades little as speculation failures increase. Speculation enables distributed file systems that are safe, consistent, and fast.

Uploaded by

sushmsn
Copyright
© Attribution Non-Commercial (BY-NC)

SPECULATIVE EXECUTION IN A DISTRIBUTED FILE SYSTEM

E. B. Nightingale
P. M. Chen
J. Flint
University of Michigan
Motivation
• Distributed file systems are often much slower than
local file systems
– Due to synchronous operations required for
cache coherence and data safety
– Even true for file systems that weaken
consistency and safety guarantees
• Close-to-open consistency for AFS and most
versions of NFS
A better solution
• Most of these synchronous operations have
predictable outcomes
– We can bet on the outcome and let the client
process go forward (speculation)
• Make operation asynchronous
– Must first take a checkpoint of the process
• Can restart operation if speculation failed
Why it works
1. Clients can correctly predict the outcome of many
operations
• Few concurrent accesses to files
2. Time to take a lightweight checkpoint is often less
than network round-trip time
• 52 µs for a small process thanks to
copy-on-write
3. Most clients have free cycles
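The second point can be made concrete with a rough cost model. The sketch below is a back-of-the-envelope comparison, not a measurement: it assumes every prediction is correct, that replies overlap with continued execution, and that the process waits for one final round trip before it may externalize output. The function names and the sample numbers are illustrative assumptions.

```python
def blocking_time_ms(n_ops, rtt_ms):
    """Each synchronous operation waits one full network round trip."""
    return n_ops * rtt_ms


def speculative_time_ms(n_ops, rtt_ms, checkpoint_ms):
    """Each operation pays only for a checkpoint; one round trip remains
    at the end, when the last reply must arrive before output is released.
    Assumes no speculation fails."""
    return n_ops * checkpoint_ms + rtt_ms


if __name__ == "__main__":
    # Hypothetical workload: 100 operations, 30 ms round trip,
    # 0.05 ms per checkpoint (an assumed value).
    print(blocking_time_ms(100, 30.0))
    print(speculative_time_ms(100, 30.0, 0.05))
```

Under these assumptions the speculative cost grows with the checkpoint time rather than with the round-trip time, which is why cheap checkpoints matter.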
Speculator
• File system controls when speculations start,
succeed and fail
• Speculator provides a mechanism to ensure
correct execution of speculative code
• No application changes are required
• Speculative state is never visible from the
outside
Correctness rules (I)
• A process that executes in speculative mode
cannot externalize output
– Speculator blocks the process
• Speculator tracks causal dependencies between
kernel objects
– Kernel objects modified by a speculative
process will be put in a speculative state
Correctness rules (II)
• Speculator tracks causal dependencies between
processes
– Processes receiving a message or a signal
from a speculative process will be
checkpointed and become speculative
• In case of doubt, Speculator will block the
execution of the speculative process
An example: conventional NFS
• Linux 2.4.21 NFSv3 implements close-to-open
consistency
– At close time, the client sends the server:
1. Asynchronous write calls with the
modified data
2. A synchronous commit call once it
has received replies for all write calls
An example: SpecNFS
• All calls are non-blocking but force the calling
process to become speculative
• If a call returns an unexpected result, the calling
process is rolled back to its checkpoint and the
call is executed again
– A new speculation starts
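The rollback behavior described above can be sketched as a user-level toy model. This is not the kernel mechanism: here a checkpoint is just a deep copy of a dictionary, the remote reply is obtained by calling a function, and the class and method names are hypothetical.

```python
import copy


class SpeculativeClient:
    """Toy model: process state is a dict; a checkpoint is a deep copy."""

    def __init__(self):
        self.state = {}
        self.rollbacks = 0

    def speculative_call(self, predicted, remote_call, continuation):
        # Take the checkpoint before speculating.
        checkpoint = copy.deepcopy(self.state)
        # Run ahead as if the call had returned the predicted result.
        continuation(self.state, predicted)
        # The real reply arrives later (simulated here by a direct call).
        actual = remote_call()
        if actual != predicted:
            # Speculation failed: roll back and execute the call again.
            self.state = checkpoint
            self.rollbacks += 1
            actual = remote_call()
            continuation(self.state, actual)
        return actual


if __name__ == "__main__":
    client = SpeculativeClient()

    def record(state, result):
        state.setdefault("log", []).append(result)

    client.speculative_call("v1", lambda: "v1", record)  # prediction holds
    client.speculative_call("v1", lambda: "v2", record)  # prediction fails
    print(client.state, client.rollbacks)
```

In the toy model the re-executed call runs synchronously; in SpecNFS it would start a new speculation.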
Speculation interface
• Three new system calls:
– create_speculation():
• Returns a unique spec_id and a list of
previous speculations on which the
speculation depends
– commit_speculation(spec_id)
– fail_speculation(spec_id)
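A minimal user-level model of this interface might look like the following. Only the three call names and the spec_id return value come from the slide; the SpeculationTable class, the rule that a new speculation depends on every unresolved one, and the cascading failure are simplifying assumptions made for illustration.

```python
import itertools


class SpeculationTable:
    """Toy stand-in for the kernel's bookkeeping of active speculations."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.active = {}  # spec_id -> list of spec_ids it depends on

    def create_speculation(self):
        # Assumption: a new speculation depends on all unresolved ones.
        depends_on = list(self.active)
        spec_id = next(self._ids)
        self.active[spec_id] = depends_on
        return spec_id, depends_on

    def commit_speculation(self, spec_id):
        # The prediction was right: forget the speculation and any
        # dependencies that other speculations had on it.
        del self.active[spec_id]
        for deps in self.active.values():
            if spec_id in deps:
                deps.remove(spec_id)

    def fail_speculation(self, spec_id):
        # The prediction was wrong: this speculation and everything that
        # depends on it must be discarded. Returns the discarded ids.
        doomed = [s for s, deps in self.active.items()
                  if s == spec_id or spec_id in deps]
        for s in doomed:
            del self.active[s]
        return doomed


if __name__ == "__main__":
    table = SpeculationTable()
    first, _ = table.create_speculation()
    second, deps = table.create_speculation()
    print(second, deps)
    table.commit_speculation(first)
    print(table.fail_speculation(second))
```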
Implementing checkpoints
• Checkpoints are implemented through
copy-on-write fork
– Speculator also saves the state of any open
file descriptor and copies all pending signals
• Forked child is not placed on the ready queue
– It just waits
• If speculation fails, forked child assumes the
identity of the failed parent
New kernel structures
• Speculation structure:
– Created during create_speculation()
– Tracks the set of kernel objects that depend
on the speculation
• Undo log:
– Associated with each kernel object that has a
speculative state
– Ordered list of speculative modifications
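The undo log can be illustrated with a small model in which a kernel object is reduced to a single value. The SpeculativeObject class and its method names are hypothetical, and the sketch ignores interleavings that a real implementation would have to handle.

```python
class SpeculativeObject:
    """Toy kernel object with an ordered undo log of speculative writes."""

    def __init__(self, value):
        self.value = value
        self.undo_log = []  # ordered list of (spec_id, previous_value)

    @property
    def is_speculative(self):
        return bool(self.undo_log)

    def speculative_write(self, spec_id, new_value):
        # Remember the old value so the write can be undone later.
        self.undo_log.append((spec_id, self.value))
        self.value = new_value

    def commit(self, spec_id):
        # The speculation succeeded: its undo entries are no longer needed.
        self.undo_log = [(s, old) for s, old in self.undo_log if s != spec_id]

    def fail(self, spec_id):
        # Restore the value that preceded the failed speculation's first
        # write and drop that entry along with every later one.
        for i, (s, old) in enumerate(self.undo_log):
            if s == spec_id:
                self.value = old
                del self.undo_log[i:]
                return


if __name__ == "__main__":
    obj = SpeculativeObject("a")
    obj.speculative_write(1, "b")
    obj.speculative_write(2, "c")
    obj.fail(2)
    print(obj.value, obj.is_speculative)
```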
Sharing checkpoints
• Letting successive speculations share the same
checkpoint reduces the speculation overhead
• Two limitations
– Speculator limits the amount of rollback work
by not letting a speculation share a checkpoint
that is more than 500 ms old
– A speculation cannot share a checkpoint
with a previous speculation that changes the
state of the file system
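The two limitations amount to a simple predicate. The sketch below encodes them directly; the function name and parameters are hypothetical, and only the 500 ms bound comes from the slide.

```python
MAX_CHECKPOINT_AGE_MS = 500  # bound stated on the slide


def can_share_checkpoint(checkpoint_age_ms, previous_changed_fs_state):
    """Return True if a new speculation may reuse an existing checkpoint."""
    if checkpoint_age_ms > MAX_CHECKPOINT_AGE_MS:
        return False  # too much work would be redone on rollback
    if previous_changed_fs_state:
        return False  # the earlier speculation modified file system state
    return True


if __name__ == "__main__":
    print(can_share_checkpoint(120, False))
    print(can_share_checkpoint(800, False))
    print(can_share_checkpoint(120, True))
```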
Correctness invariants
1. Speculative state should never be visible to the
user or to any external device
– Speculator prevents all speculative
processes from externalizing output to any
interface
2. A process should never view speculative state
unless it is already speculatively dependent
upon that state.
Invariant implementations (I)
• First Implementation:
Block speculative processes whenever they try
to perform a system call
– Always correct
– Limits the amount of work that can be done by
a process in a speculative state
Invariant implementations (II)
• Second Implementation:
Allow speculative processes to perform system
calls that
– Do not modify state
• “Read-only” calls such as getpid()
– Only modify state that is private to the calling
process
• It will be rolled back if speculation fails
Invariant implementations (III)
• Third Implementation:
Allow speculative processes to perform
operations on files in speculative file systems
– With VFS, can have multiple file systems on
the same machine
• Typically NFS plus FFS or ext3
• Must check type of file system
– Have a special bit in superblock
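Taken together, the three implementations describe a policy for deciding whether a speculative process may proceed with a system call. A rough sketch of that policy follows. Only getpid() is named on the slides; the other call names, the sets, and the fs_is_speculative flag (standing in for the superblock bit) are illustrative assumptions.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"


READ_ONLY_CALLS = {"getpid"}        # named on the slide
PRIVATE_STATE_CALLS = {"dup2"}      # assumed example: changes only the caller's own state
FILE_CALLS = {"open", "read", "write", "close", "stat"}  # assumed examples


def decide(syscall, is_speculative, fs_is_speculative=False):
    """Decide whether a process may proceed with a system call."""
    if not is_speculative:
        return Action.ALLOW
    # Second implementation: read-only calls and private-state calls.
    if syscall in READ_ONLY_CALLS or syscall in PRIVATE_STATE_CALLS:
        return Action.ALLOW
    # Third implementation: file operations on a speculative file system.
    if syscall in FILE_CALLS and fs_is_speculative:
        return Action.ALLOW
    # First implementation is the fallback: block when in doubt.
    return Action.BLOCK


if __name__ == "__main__":
    print(decide("getpid", True))
    print(decide("write", True, fs_is_speculative=True))
    print(decide("write", True, fs_is_speculative=False))
```

Blocking is the default, so any call not known to be safe falls back to the always-correct first implementation.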
Multiprocess speculation (I)
• Whenever a speculative process P participates
in interprocess communication with a process Q
• Process Q must become speculatively
dependent on the speculative state of
process P and get checkpointed
Multiprocess speculation (II)
• Whenever a speculative process P modifies an
object X
• Object X must become speculatively
dependent on the speculative state of
process P and get an undo list
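Both rules follow the same pattern: whatever a speculative process touches inherits its dependencies. The sketch below models that propagation with sets of speculation ids. The Entity class and the propagate function are hypothetical names, and the checkpoint or undo list that the kernel would create is reduced to a boolean return value.

```python
class Entity:
    """A process (such as Q) or a kernel object (such as X)."""

    def __init__(self, name):
        self.name = name
        self.depends_on = set()  # ids of unresolved speculations


def propagate(source, target):
    """Make target depend on every speculation that source depends on.

    Returns True if target just became speculative, which is the moment a
    process would be checkpointed or an object would get an undo list.
    """
    newly_speculative = bool(source.depends_on) and not target.depends_on
    target.depends_on |= source.depends_on
    return newly_speculative


if __name__ == "__main__":
    p, q = Entity("P"), Entity("Q")
    p.depends_on.add(1)          # P is speculative on speculation 1
    print(propagate(p, q))       # Q receives a message from P
    print(q.depends_on)
```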

You are not responsible for the implementation details
Performance: PostMark benchmark
• SpecNFS is
– 2.5 times faster than NFS with no latency
between client and server
– 41 times faster than NFS with a 30ms round-trip
time delay between client and server
• A version of BlueFS providing single-copy
semantics is 49 times faster than NFS with same
30ms round-trip time delay
Performance: Apache benchmark
• Building Apache server from a tarred file
• SpecNFS is
– 2 times faster than NFS with no latency
between client and server
– 14 times faster than NFS with a 30ms round-trip
time delay between client and server
– Always better than BlueFS and Coda
Performance: impact of rollbacks
• Repeated Apache benchmark marking a
varying fraction of the files out-of-date
– Will result in speculation failures
– Percentage of out-of-date files has little
impact on SpecNFS performance
Performance: other
• Impact of group commits and sharing state
– Mostly affects BlueFS
• When speculative processes cannot
propagate their state, BlueFS performs
worse than NFS with no latency between
client and server
• Impact magnified at 30ms latency
Conclusion
• Speculation enables the development of
distributed file systems that are
– Safe
– Consistent
– Fast
• Generic kernel support for speculative execution
and causal dependency tracking could have
many other applications
