Persisting OS Final
INTRODUCTION
The Computational Model of Conventional Systems
The Computational Model of Persistent Systems
The Grasshopper Model
Most existing operating systems provide resource management, permanent data maintenance, and an execution environment, but they share two common inadequacies: a discontinuity between permanent and temporary data, and a lack of resilience to failure. The file system is the model for permanent data, and virtual memory is the model for data in the execution environment.
First, permanent data has to be accessed via the file system, which makes it difficult to maintain complex data structures such as graphs. Complex data structures can be created and manipulated in virtual memory, but they persist only for the lifetime of the program that creates them.
Secondly, if changes to a complex data structure are to be kept in permanent storage, the programmer must write code to flatten the structure as well as code to reload it. Finally, the programmer must work with two different protection models.
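The flatten-and-reload burden can be illustrated with a minimal sketch. It assumes a conventional system with only a file system for permanent data; the names `Node`, `flatten`, `rebuild`, `save`, and `load` are hypothetical and chosen for illustration only. Object references have no meaning on disk, so the graph is converted to names before writing and re-linked after reading.

```python
import json
import os
import tempfile


class Node:
    """A graph node that may point at other nodes, including itself."""

    def __init__(self, name):
        self.name = name
        self.edges = []  # direct object references; meaningful only in memory


def flatten(nodes):
    """Replace object references with names so the graph can be written out."""
    return {node.name: [target.name for target in node.edges] for node in nodes}


def rebuild(flat):
    """Recreate every node first, then re-link the edges from the stored names."""
    nodes = {name: Node(name) for name in flat}
    for name, targets in flat.items():
        nodes[name].edges = [nodes[t] for t in targets]
    return nodes


def save(nodes, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(flatten(nodes), f)


def load(path):
    with open(path, encoding="utf-8") as f:
        return rebuild(json.load(f))


# Build a small cyclic graph: a -> b -> c -> a
a, b, c = Node("a"), Node("b"), Node("c")
a.edges.append(b)
b.edges.append(c)
c.edges.append(a)

path = os.path.join(tempfile.mkdtemp(), "graph.json")
save([a, b, c], path)
restored = load(path)  # new objects with the same shape as the original graph
```

In a persistent system none of `flatten`, `rebuild`, `save`, or `load` would be needed, since the graph would simply continue to exist.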
In persistent operating systems, data is treated uniformly for as long as it persists in the system. In a persistent system, the programmer sees no difference between data in RAM and data in secondary storage. This extends the idea behind virtual memory, where the programmer does not know which parts of a program are resident in RAM and which are paged out.
For over 10 years, researchers have been attempting to construct an operating system that supports orthogonal persistence. This raises two basic questions: What is orthogonal persistence? Why does it require support from the operating system? There are two basic principles behind orthogonal persistence:
Objects may persist for as long as they are needed in the system.
Objects may be manipulated uniformly, irrespective of how long they have persisted in the system.
Supporting orthogonal persistence requires uniform treatment of data structures, data resilience, location independence, and protection of data.
Uniform treatment of data means that all data is treated the same way, irrespective of how long it has persisted in the system. The programmer need not know how long the data has persisted, nor deal with the various mappings between storage formats, since the data is already in the system.
As for data resilience, the OS must provide a certain level of resilience to failure. In conventional operating systems such as UNIX, a utility such as fsck may be used to check and repair the file system on stable storage after a failure.
Location independence works much like virtual memory. Programmers do not have to know the location of data in order to manipulate it. All data is accessed in a uniform manner, irrespective of whether it is in RAM or on disk.
Persistent systems provide a large store of data, much like a repository, in which all data resides and against which all processes execute. This calls for a protection mechanism that guards the data against accidental or malicious misuse. Persistent systems achieve this through encapsulation and the use of capabilities.
ADVANTAGES
[Figure: diagram with the labels "Persistent System", "Real World", and "Data Base"]
Providing Persistence
An example of a language supporting the persistence paradigm is Napier. Current native operating systems that support persistence include Monads, Clouds, Opal, and Mungi.
In persistent systems it is important that data be preserved across failures. Failures include:
Hardware failures: disk crashes, machine crashes, power outages.
Software failures: operating system bugs, incorrect programs.
These failures can be classed into three categories: disaster, corrupting, and non-critical.
A disaster failure is one that is impossible to recover from without external aid. The destruction of physical hardware such as the CPU or disks would be classified as a disaster failure. The strategy for recovering from disaster failures is to replace the destroyed hardware and restore the system to a previous consistent state by using backups. Standby hardware can allow automated recovery from some disaster failures. For example, RAID techniques can provide on-line backups for single or limited multiple disk failures.
A corrupting failure is one in which data is corrupted or lost. For example, an unexpected power failure to a machine would result in the loss of the information in transient storage. To recover from this type of failure, the machine is shut down, if it is not already down, and restarted. The start-up code reconstructs a previous consistent state, and computation proceeds from that state.
A non-critical failure is one that can be recovered from without loss of information. An example of a non-critical failure is a broken network cable. The computation would halt until the cable is repaired; no data would be lost, and no recovery step is necessary by the operating system.
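The three categories can be summarised as a small decision rule. The sketch below is only an illustration of the definitions given here; the function `classify` and its two boolean parameters are hypothetical and reduce each definition to a single question.

```python
from enum import Enum


class FailureClass(Enum):
    DISASTER = "disaster"
    CORRUPTING = "corrupting"
    NON_CRITICAL = "non-critical"


# Recovery strategy for each class, paraphrased from the descriptions above.
RECOVERY = {
    FailureClass.DISASTER: "replace hardware, then restore a consistent state from backups",
    FailureClass.CORRUPTING: "restart and reconstruct a previous consistent state",
    FailureClass.NON_CRITICAL: "wait for the fault to clear; no recovery step needed",
}


def classify(needs_external_aid, data_lost):
    """Hypothetical helper: map two yes/no observations to a failure class."""
    if needs_external_aid:
        return FailureClass.DISASTER
    if data_lost:
        return FailureClass.CORRUPTING
    return FailureClass.NON_CRITICAL


examples = {
    "disk destroyed": classify(needs_external_aid=True, data_lost=True),
    "unexpected power failure": classify(needs_external_aid=False, data_lost=True),
    "broken network cable": classify(needs_external_aid=False, data_lost=False),
}
```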
Grasshopper is an operating system explicitly designed to support orthogonal persistence, and is an attempt to explore some of the persistent system design space. The fundamental goal of the Grasshopper project was to design and develop an operating system to support orthogonal persistence across a large network.
Containers
Containers are the abstraction over data storage and access in Grasshopper. In Grasshopper, containers are mutated by loci, an orthogonal concept. A locus may move between containers, and a container may have many loci executing in it simultaneously.
Loci
A locus is simply the contents of the registers of the machine on which it is executing. A locus is associated with a container, its host container. The locus perceives the host container's contents, plus any containers mapped by locus-private mappings, within its own address space. Any number of loci may execute within a given container.
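The relationship between containers and loci can be sketched as two small classes. This is a toy model under stated assumptions, not Grasshopper's interface: a container is reduced to a dictionary of addresses, a mapping exposes a whole container rather than a region of one, and the lookup order (host first, then private mappings) is an assumption made for the example. All class and method names are hypothetical.

```python
class Container:
    """Abstraction over data storage; here just addresses mapped to values."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.loci = set()  # loci currently executing in this container


class Locus:
    """A thread of execution: register contents plus a host container."""

    def __init__(self, name, host):
        self.name = name
        self.registers = {"pc": 0}
        self.private_mappings = []  # containers visible to this locus only
        self.host = None
        self.move_to(host)

    def move_to(self, container):
        """A locus may move between containers."""
        if self.host is not None:
            self.host.loci.discard(self)
        self.host = container
        container.loci.add(self)

    def read(self, address):
        # Assumed lookup order: host container, then locus-private mappings.
        for container in [self.host, *self.private_mappings]:
            if address in container.data:
                return container.data[address]
        raise KeyError(address)

    def write(self, address, value):
        self.host.data[address] = value


shared = Container("shared")
other = Container("other")
scratch = Container("scratch")
scratch.data[0x20] = "private"

first = Locus("first", shared)
second = Locus("second", shared)  # many loci may share one container
first.write(0x10, "hello")
seen_by_second = second.read(0x10)

second.private_mappings.append(scratch)  # visible to `second` only
private_value = second.read(0x20)

first.move_to(other)  # `first` now executes in a different container
```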
Capabilities
In Grasshopper a capability refers to a single entity and contains a set of permissions which describe the access which is permitted using that capability. A capability list may be associated with each locus and container. A capability is specified by a tuple identifying a capability list, associated with either the current locus or the host container, and the index of the capability in the list.
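The addressing scheme described above, a tuple of capability list and index, can be sketched as follows. The permission names, the `Context` class, and the string used to identify an entity are all assumptions made for illustration; the text does not specify Grasshopper's actual permission set or data layout.

```python
from dataclasses import dataclass
from enum import Enum, Flag, auto


class Permission(Flag):
    # Hypothetical permission names; the real set is not given in the text.
    READ = auto()
    WRITE = auto()
    INVOKE = auto()


class ListOwner(Enum):
    LOCUS = "current locus"
    HOST_CONTAINER = "host container"


@dataclass(frozen=True)
class Capability:
    entity: str              # the single entity this capability refers to
    permissions: Permission  # the access permitted through this capability


@dataclass
class Context:
    """The two capability lists reachable from an executing locus."""

    locus_caps: list
    container_caps: list

    def lookup(self, ref):
        """Resolve a (list owner, index) tuple to a capability."""
        owner, index = ref
        caps = self.locus_caps if owner is ListOwner.LOCUS else self.container_caps
        return caps[index]

    def permits(self, ref, needed):
        """True if every permission in `needed` is held by the capability."""
        return needed in self.lookup(ref).permissions


ctx = Context(
    locus_caps=[Capability("container:42", Permission.READ)],
    container_caps=[Capability("container:7", Permission.READ | Permission.WRITE)],
)

can_read = ctx.permits((ListOwner.LOCUS, 0), Permission.READ)
can_write = ctx.permits((ListOwner.LOCUS, 0), Permission.WRITE)
```

Here `can_read` is true and `can_write` is false, because the locus-held capability for `container:42` carries only the read permission.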
Managers
In Grasshopper the backing storage for the data held in containers is not controlled by the kernel. Instead, user level programs called managers control the movement of data between backing store and main memory. Each container has a manager associated with it which, amongst other things, handles the page fault requests for the containers it manages.
When a page fault occurs, the kernel traverses the mapping graph of the faulting locus to determine which container holds the page. The kernel then invokes the relevant manager, requesting the page. The manager returns the page to the kernel, which sets up the hardware translation unit to allow the data to be accessed.
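The page fault path can be sketched as a short simulation. It is a simplification under several assumptions: mappings are tracked per page and only between containers (locus-private mappings are omitted), the hardware translation unit is modelled as a dictionary, and a page missing from backing store is returned zero-filled. The names `Manager`, `Container`, `Kernel`, `fetch_page`, `map_page`, and `PAGE_SIZE` are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class Manager:
    """User-level program that moves pages between backing store and memory."""

    def __init__(self, backing_store=None):
        self.backing_store = backing_store or {}  # (container, page) -> bytes
        self.requests = []                         # log of kernel requests

    def fetch_page(self, container_name, page_no):
        self.requests.append((container_name, page_no))
        # Assumption: pages never written before are zero-filled.
        return self.backing_store.get((container_name, page_no), bytes(PAGE_SIZE))


class Container:
    def __init__(self, name, manager):
        self.name = name
        self.manager = manager
        self.mappings = {}  # local page -> (source container, source page)

    def map_page(self, page_no, source, source_page):
        self.mappings[page_no] = (source, source_page)


class Kernel:
    def __init__(self):
        self.translation = {}  # stands in for the hardware translation unit

    def resolve(self, container, page_no):
        """Follow the mapping graph until a container holds the page itself."""
        visited = set()
        while page_no in container.mappings:
            if (container.name, page_no) in visited:
                raise RuntimeError("cycle in mapping graph")
            visited.add((container.name, page_no))
            container, page_no = container.mappings[page_no]
        return container, page_no

    def handle_page_fault(self, locus_id, host, page_no):
        owner, owner_page = self.resolve(host, page_no)
        page = owner.manager.fetch_page(owner.name, owner_page)  # ask the manager
        self.translation[(locus_id, page_no)] = page             # install the page
        return page


lib_manager = Manager({("lib", 0): b"L" * PAGE_SIZE})
host_manager = Manager()
lib = Container("lib", lib_manager)
host = Container("host", host_manager)
host.map_page(3, lib, 0)  # page 3 of `host` is really page 0 of `lib`

kernel = Kernel()
mapped_page = kernel.handle_page_fault("locus-1", host, 3)
local_page = kernel.handle_page_fault("locus-1", host, 1)
```

A fault on page 3 is routed to the manager of `lib`, while a fault on the unmapped page 1 goes to the manager of `host` itself.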
Managers are also responsible for the resilience of the data in a container. They are required to implement a stability algorithm on the data, and, in cooperation with the kernel and other managers, ensure that consistent data is stored on stable storage.
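The text does not say which stability algorithm a manager uses. As an illustration only, the sketch below uses a common technique, writing a complete new snapshot to a temporary file and then atomically renaming it over the old one, so that stable storage always holds either the previous consistent state or the new one. The functions `checkpoint` and `recover` and the JSON snapshot format are assumptions, not part of Grasshopper.

```python
import json
import os
import tempfile


def checkpoint(state, path):
    """Write `state` so that `path` always holds a complete snapshot."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())    # force the new snapshot onto the disk
        os.replace(tmp_path, path)  # atomic switch from old to new snapshot
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)     # the old snapshot is left untouched
        raise


def recover(path, default=None):
    """Return the last snapshot that was completely written, if any."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default


workdir = tempfile.mkdtemp()
snapshot_path = os.path.join(workdir, "container.json")

checkpoint({"pages": [1, 2]}, snapshot_path)
checkpoint({"pages": [1, 2, 3]}, snapshot_path)
recovered = recover(snapshot_path)
```

A failure before `os.replace` leaves the previous snapshot in place, which corresponds to restarting from a previous consistent state after a corrupting failure.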