Coda

Coda is a highly available distributed file system that was developed at Carnegie Mellon University beginning in 1987. It is based on the Andrew File System and allows for extended periods of client disconnection through data replication across multiple servers and client caching. Coda uses techniques like hoarding and emulation to provide availability even when clients are disconnected from the server network, and reintegration to propagate changes when clients reconnect.

Uploaded by

creeti
Copyright
© Attribution Non-Commercial (BY-NC)

RAJASTHAN COLLEGE OF ENGINEERING FOR WOMEN

CODA: A Highly Available File System for a Distributed Workstation Environment


KRITI JAIN (I.T.), SUMAN KUMARI (C.S.), NEETI JAIN (I.T.), NEHA JAIN (I.T.)

What is Coda?

Coda is a highly-available distributed file system that supports extended periods of client disconnection.

Utilizes replication across multiple servers


Client cache data is used for availability as well as performance.

Proactively hoards data to avoid cache misses while disconnected

Coda's History

Developed at Carnegie Mellon University, beginning around 1987.

Based on the Andrew File System (AFS), specifically AFS2

Developed by Mahadev Satyanarayanan, a principal architect of AFS

First implemented on Mach 2.6, one of the earliest microkernel operating systems

Ported to Linux, NetBSD, FreeBSD, and others

Development is ongoing

Why Coda? The Rationale Behind Coda

In general, distributed file systems benefit collaborative development and the propagation of data. However, they are vulnerable to remote failures. Coda provides all the benefits of a distributed file system while also attempting to minimize or even eliminate the effects of remote failures. Because Coda handles remote failure gracefully, voluntary disconnection, for example by a portable computer, can be handled in the same manner.

CODA : a descendant of AFS

The internal organization of a Virtue workstation

- Designed to allow access to files even if the server is unavailable.
- Uses VFS.

COMMUNICATION IN CODA

Coda uses RPC2: a sophisticated reliable RPC system

Starts a new thread for each request; the server periodically informs the client that it is still working on the request

RPC2 supports side effects: application-specific protocols

Useful for video streaming [where RPCs are less useful]

FEATURES
1. High performance through client-side persistent caching
2. Server replication
3. Works during disconnected operation
4. Security model for authentication, encryption and access control
5. Continued operation during partial network failures in the server network
6. Good scalability

Implementation
[Diagram: a few servers (S) connected to many clients (C)]

A few trusted (secure) servers

A much larger number of untrusted clients

Each client has a local disk for caching.


Coda presents itself as a single, transparent shared file system to the user.

Disconnected Operation

Fundamentally handles a voluntary disconnection the same way as a network failure.

Occurs when the AVSG (the set of servers the client can currently reach) becomes empty


Accomplished through caching of data

Cache misses appear as failures


Hoarding can prepare for this in advance

Naming

Clients in Coda have access to a single shared name space

Files are grouped into volumes [partial subtree in the directory structure]

Volume is the basic unit of mounting

Namespace: /afs/filesrv.cs.umass.edu [same namespace on all clients; different from NFS]

Name lookup can cross mount points: support for detecting crossings and automounts
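
A minimal sketch of how a path in the shared name space might be mapped to a volume by longest-prefix match against a mount table. The mount table contents, the volume names, and the `resolve` helper are hypothetical and only for illustration; a real client detects mount points while walking the directory tree rather than consulting a flat table.

```python
from typing import Dict, Tuple

# Hypothetical mount table: mount point -> volume name (illustrative only).
MOUNTS: Dict[str, str] = {
    "/afs/filesrv.cs.umass.edu": "root.vol",
    "/afs/filesrv.cs.umass.edu/usr": "usr.vol",
    "/afs/filesrv.cs.umass.edu/usr/alice": "alice.vol",
}


def resolve(path: str) -> Tuple[str, str]:
    """Return (volume, path relative to that volume's mount point)."""
    best = ""
    for mount in MOUNTS:
        # A mount point matches if it equals the path or is a directory prefix of it.
        if path == mount or path.startswith(mount + "/"):
            if len(mount) > len(best):
                best = mount  # the deepest mount point wins
    if not best:
        raise FileNotFoundError(path)
    return MOUNTS[best], path[len(best):].lstrip("/")
```

A lookup that passes through `/afs/filesrv.cs.umass.edu/usr/alice` crosses two mount points, and the deepest one decides which volume serves the file.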

Client Caching

Cache consistency maintained using callbacks


Server tracks all clients that have a copy of the file [provides a callback promise]

Upon modification: sends an invalidate message to those clients
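
The callback scheme above can be sketched as a toy in-memory server. The class and method names (`CallbackServer`, `fetch`, `store`) are assumptions made for this illustration and are not Coda's actual interfaces; invalidation messages are recorded in a list instead of being sent over the network.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class CallbackServer:
    """Toy server that remembers which clients cache which file."""

    def __init__(self) -> None:
        # file name -> clients currently holding a callback promise
        self.promises: Dict[str, Set[str]] = defaultdict(set)
        # (client, file) invalidation messages that would be sent
        self.invalidations: List[Tuple[str, str]] = []

    def fetch(self, client: str, name: str) -> None:
        # Handing out a copy implies a callback promise to this client.
        self.promises[name].add(client)

    def store(self, writer: str, name: str) -> None:
        # Break the promise for every other client that holds a copy.
        for client in sorted(self.promises[name] - {writer}):
            self.invalidations.append((client, name))
        # Only the writer still holds a valid copy.
        self.promises[name] = {writer}
```

A client whose promise has been broken must fetch the file again before using its cached copy.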

Server Replication

Use replicated writes: read-one, write-all


Writes are sent to all members of the AVSG (all accessible replicas)
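
A minimal sketch of the read-one, write-all idea, assuming each replica is a plain dictionary keyed by server name. The helpers `write_all` and `read_one` are hypothetical names for this illustration; real replicas would be contacted over RPC.

```python
from typing import Dict, Iterable


def write_all(avsg: Iterable[str], stores: Dict[str, Dict[str, bytes]],
              name: str, data: bytes) -> int:
    """Send the write to every accessible replica; return how many were updated."""
    count = 0
    for server in avsg:
        stores[server][name] = data
        count += 1
    return count


def read_one(avsg: Iterable[str], stores: Dict[str, Dict[str, bytes]],
             name: str) -> bytes:
    """Read from the first accessible replica that holds the file."""
    for server in avsg:
        if name in stores[server]:
            return stores[server][name]
    raise FileNotFoundError(name)
```

A replica that is unreachable at write time misses the update, which is why partitions have to be handled separately.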

How to handle network partitions?


Use an optimistic strategy for replication

Detect conflicts using a Coda version vector

Example: version vectors [2,2,1] and [1,1,2] conflict, since neither dominates the other => manual reconciliation
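
The conflict check can be sketched as an element-wise comparison of two version vectors. The function name `compare` and its return strings are assumptions chosen for this illustration.

```python
from typing import Sequence


def compare(a: Sequence[int], b: Sequence[int]) -> str:
    """Classify two version vectors that cover the same set of servers."""
    if len(a) != len(b):
        raise ValueError("vectors must cover the same set of servers")
    a_ahead = any(x > y for x, y in zip(a, b))
    b_ahead = any(y > x for x, y in zip(a, b))
    if a_ahead and b_ahead:
        return "conflict"  # neither vector dominates: concurrent updates
    if a_ahead:
        return "a dominates"
    if b_ahead:
        return "b dominates"
    return "equal"
```

For the vectors in the example, `compare([2, 2, 1], [1, 1, 2])` returns `"conflict"`: the first is ahead in two positions and the second is ahead in the third.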

VSG

Volume Storage Group (VSG): the set of servers that hold a replica of a given volume

AVSG

Accessible Volume Storage Group (AVSG): the subset of a volume's VSG that a given client can currently reach; that client's universe of servers
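
A small sketch of how the two groups relate, treating each as a set of server names. The function names are hypothetical, and how a client learns which servers are reachable is outside this sketch.

```python
from typing import Set


def accessible_vsg(vsg: Set[str], reachable: Set[str]) -> Set[str]:
    """The AVSG is the part of the VSG that this client can reach right now."""
    return vsg & reachable


def is_disconnected(vsg: Set[str], reachable: Set[str]) -> bool:
    """A client is disconnected for a volume when its AVSG is empty."""
    return not accessible_vsg(vsg, reachable)
```

Two clients can therefore have different AVSGs for the same volume at the same moment.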

Disconnected Operation

HOARDING

Occurs during normal, connected operation

Replica-like cache behavior

Gathers data from servers in anticipation of future disconnected operation

Relies on its Hoard Database (HDB):

- Most recently used files
- User-specified files

Probes its preferred server in the AVSG every t seconds to check the AVSG size.

Hoarded files are sticky in the cache
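
A minimal sketch of a cache in which hoarded files are sticky: eviction picks the entry with the lowest hoard priority and, among equals, the least recently used one. The `HoardCache` class, the integer priorities, and the logical clock are all assumptions made for this illustration; the actual priority function used by Coda is not described in these slides.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    hoard_priority: int  # 0 if the file is not listed in the HDB
    last_used: int       # logical clock tick of the most recent access


class HoardCache:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[str, Entry] = {}
        self.clock = 0

    def access(self, name: str, hoard_priority: int = 0) -> Optional[str]:
        """Cache `name`; return the name of the evicted file, if any."""
        self.clock += 1
        evicted = None
        if name not in self.entries and len(self.entries) >= self.capacity:
            # Lowest hoard priority goes first; ties fall back to least recent use.
            evicted = min(
                self.entries,
                key=lambda n: (self.entries[n].hoard_priority,
                               self.entries[n].last_used),
            )
            del self.entries[evicted]
        old = self.entries.get(name)
        priority = max(hoard_priority, old.hoard_priority if old else 0)
        self.entries[name] = Entry(priority, self.clock)
        return evicted
```

With a capacity of two, a file hoarded at priority 100 survives the arrival of later, unhoarded files even though it was used least recently.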

EMULATION

Occurs during disconnected operation

Equivalently, the state in which the AVSG is empty (size 0)


Transparently appears as if the user is still connected to the file system, thanks to hoarding

Can be sustained for hours.


Venus takes on server-like characteristics

Responsible for access and semantics checks


Generates file identifiers for new objects

It is not a true server; revalidation must occur during reintegration
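
The three client states under this heading (hoarding, emulation, and reintegration) can be sketched as a small state machine driven by the size of the AVSG. The `State` enum, the `next_state` function, and the `replay_done` flag are hypothetical names for this illustration, and the transition out of reintegration when the servers are lost again is an assumption.

```python
from enum import Enum


class State(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"


def next_state(state: State, avsg_size: int, replay_done: bool = False) -> State:
    """Return the client's next state given the current AVSG size."""
    if state is State.HOARDING:
        # Disconnection: the AVSG has become empty.
        return State.EMULATION if avsg_size == 0 else State.HOARDING
    if state is State.EMULATION:
        # Reconnection: at least one server is reachable again.
        return State.REINTEGRATION if avsg_size > 0 else State.EMULATION
    # REINTEGRATION: fall back to emulation if the servers are lost again (assumed).
    if avsg_size == 0:
        return State.EMULATION
    return State.HOARDING if replay_done else State.REINTEGRATION
```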

REINTEGRATION
A transitory state from disconnection back to connected operation

Propagates changes made during emulation

Updates cache based on changes in AVSG

Updates are made using a replay algorithm that reenacts all changes made while disconnected to each object

Testing indicates reintegration typically lasts less than 1% of the duration of emulation
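
A minimal sketch of the replay idea, assuming the client's log holds at most one store record per object and that each record carries the version the client had cached when it disconnected. The record layout, the integer versions, and the `reintegrate` function are assumptions for this illustration, not Coda's actual log format.

```python
from typing import Dict, List, Tuple

# Each log record: (file name, version the client had cached, new contents)
LogRecord = Tuple[str, int, bytes]


def reintegrate(server: Dict[str, Tuple[int, bytes]],
                log: List[LogRecord]) -> List[str]:
    """Replay `log` against `server` (name -> (version, data)).

    Returns the names whose replay was rejected as a conflict.
    """
    conflicts: List[str] = []
    for name, base_version, data in log:
        current_version = server.get(name, (0, b""))[0]
        if current_version != base_version:
            # The object changed on the server while the client was disconnected.
            conflicts.append(name)
            continue
        server[name] = (current_version + 1, data)
    return conflicts
```

Records that replay cleanly update the server; the rejected ones are left for conflict resolution.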

LIMITATIONS

Limited scope: some applications simply resist hoarding.

Not yet supported on Windows Vista or Windows 7.

Fault tolerant rather than failure-free: fully transparent disconnected operation cannot be guaranteed.

CONCLUSIONS
Coda is still under development, though the focus has shifted from research to creating a robust product for commercial use. Updates to a networked computer would be made automatically as they become available on servers, and for the most part the computer would operate without network traffic, even after restarts. The rough edges, which inevitably come with research systems, are slowly being smoothed out.


THANK YOU
