
Advanced Operating

System
Professor Mangal Sain
Lecture 7

Storage Management
Lecture 7 – Part 1

Mass-Storage Systems
OBJECTIVES

 To describe the physical structure of secondary
storage devices and their effects on the uses of
the devices
 To explain the performance characteristics of
mass-storage devices
 To evaluate disk scheduling algorithms
OVERVIEW OF MASS STORAGE STRUCTURE
 Magnetic disks provide bulk of secondary storage of
modern computers
 Drives rotate 60 to 250 times per second
 Transfer rate is rate at which data flow between drive and
computer
 Positioning time (random-access time) is time to move disk arm
to desired cylinder (seek time) and time for desired sector to rotate
under the disk head (rotational latency)
 Head crash results from disk head making contact with the disk
surface -- That’s bad
 Disks can be removable
 Drive attached to computer via I/O bus
 Busses vary, including EIDE, ATA, SATA, USB, Fibre Channel,
SCSI, SAS, Firewire
 Host controller in computer uses bus to talk to disk controller
built into drive or storage array
MOVING-HEAD DISK MECHANISM
HARD DISKS
 Platters range from .85” to 14”
(historically)
 Commonly 3.5”, 2.5”, and 1.8”
 Range from 30GB to 3TB per drive
 Performance
 Transfer Rate – theoretical – 6 Gb/sec
 Effective Transfer Rate – real – 1Gb/sec
 Seek time from 3ms to 12ms – 9ms common (From Wikipedia)
for desktop drives
 Average seek time measured or calculated
based on 1/3 of tracks
 Latency based on spindle speed
 1 / (RPM / 60) = 60 / RPM
 Average latency = ½ of one full rotation
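The latency formulas above can be sketched in a few lines; `avg_rotational_latency_ms` is a hypothetical helper name, not a standard API:

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one full rotation."""
    full_rotation_ms = 60_000 / rpm  # 60/RPM seconds, in milliseconds
    return full_rotation_ms / 2

# A 7200 RPM drive completes one rotation in ~8.33 ms, so the
# average latency (half a rotation) is ~4.17 ms.
print(round(avg_rotational_latency_ms(7200), 2))  # 4.17
```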
THE FIRST COMMERCIAL DISK DRIVE

1956
IBM RAMAC computer included the
IBM Model 350 disk storage system

5M (7 bit) characters
50 x 24” platters
Access time < 1 second
SOLID-STATE DISKS
 Nonvolatile memory used like a hard drive
 Many technology variations
 Can be more reliable than HDDs
 More expensive per MB
 May have a shorter life span
 Less capacity
 But much faster
 Busses can be too slow -> connect directly to
PCI for example
 No moving parts, so no seek time or rotational
latency
MAGNETIC TAPE
 Was early secondary-storage medium
 Evolved from open spools to cartridges
 Relatively permanent and holds large quantities of data
 Access time slow
 Random access ~1000 times slower than disk
 Mainly used for backup, storage of infrequently-used
data, transfer medium between systems
 Kept in spool and wound or rewound past read-write
head
 Once data under head, transfer rates comparable to disk
 140MB/sec and greater
 200GB to 1.5TB typical storage
 Common technologies are LTO-{3,4,5} and T10000
DISK STRUCTURE
 Disk drives are addressed as large 1-dimensional
arrays of logical blocks, where the logical block is the
smallest unit of transfer
 Low-level formatting creates logical blocks on physical
media
 The 1-dimensional array of logical blocks is mapped
into the sectors of the disk sequentially
 Sector 0 is the first sector of the first track on the outermost
cylinder
 Mapping proceeds in order through that track, then the rest
of the tracks in that cylinder, and then through the rest of
the cylinders from outermost to innermost
 Logical-to-physical address mapping should be easy
 Except for bad sectors
 Non-constant # of sectors per track via constant angular velocity
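The sequential mapping described above can be sketched as arithmetic; this assumes a fixed number of sectors per track and tracks per cylinder (real drives use zone bit recording, so sectors per track actually vary), and `lba_to_chs` is a hypothetical helper name:

```python
def lba_to_chs(lba, sectors_per_track, tracks_per_cylinder):
    """Map a logical block address to (cylinder, track, sector),
    assuming a uniform geometry for illustration."""
    blocks_per_cylinder = sectors_per_track * tracks_per_cylinder
    cylinder = lba // blocks_per_cylinder
    remainder = lba % blocks_per_cylinder
    track = remainder // sectors_per_track
    sector = remainder % sectors_per_track
    return cylinder, track, sector

# Sector 0 is the first sector of the first track of cylinder 0
print(lba_to_chs(0, 63, 16))     # (0, 0, 0)
print(lba_to_chs(1000, 63, 16))  # (0, 15, 55)
```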
DISK ATTACHMENT
 Host-attached storage accessed through I/O ports
talking to I/O busses
 SCSI itself is a bus, up to 16 devices on one cable,
SCSI initiator requests operation and SCSI targets
perform tasks
 Each target can have up to 8 logical units (disks attached
to device controller)
 FC is high-speed serial architecture
 Can be switched fabric with 24-bit address space – the
basis of storage area networks (SANs) in which many
hosts attach to many storage units
 I/O directed to bus ID, device ID, logical unit (LUN)
STORAGE ARRAY

 Can just attach disks, or arrays of disks


 Storage Array has controller(s), provides features to
attached host(s)
 Ports to connect hosts to array
 Memory, controlling software (sometimes NVRAM, etc)
 A few to thousands of disks
 RAID, hot spares, hot swap (discussed later)
 Shared storage -> more efficiency
 Features found in some file systems
 Snapshots, clones, thin provisioning, replication, deduplication, etc
STORAGE AREA NETWORK

 Common in large storage environments


 Multiple hosts attached to multiple storage arrays -
flexible
STORAGE AREA NETWORK (CONT.)

 SAN is one or more storage arrays


 Connected to one or more Fibre Channel switches
 Hosts also attach to the switches
 Storage made available via LUN Masking
from specific arrays to specific servers
 Easy to add or remove storage, add new host
and allocate it storage
 Over low-latency Fibre Channel fabric
 Why have separate storage networks and
communications networks?
 Consider iSCSI, FCOE
NETWORK-ATTACHED STORAGE
 Network-attached storage (NAS) is storage
made available over a network rather than
over a local connection (such as a bus)
 Remotely attaching to file systems
 NFS and CIFS are common protocols
 Implemented via remote procedure calls
(RPCs) between host and storage over
typically TCP or UDP on IP network
 iSCSI protocol uses IP network to carry the
SCSI protocol
 Remotely attaching to devices (blocks)
DISK SCHEDULING
 The operating system is responsible for using
hardware efficiently — for the disk drives, this
means having a fast access time and disk
bandwidth
 Minimize seek time

 Seek time  seek distance

 Disk bandwidth is the total number of bytes


transferred, divided by the total time between
the first request for service and the completion of
the last transfer
DISK SCHEDULING (CONT.)

 There are many sources of disk I/O request


 OS
 System processes
 User processes

 I/O request includes input or output mode, disk


address, memory address, number of sectors to
transfer
 OS maintains queue of requests, per disk or device

 Idle disk can immediately work on I/O request, busy


disk means work must queue
 Optimization algorithms only make sense when a queue
exists
DISK SCHEDULING (CONT.)

 Note that drive controllers have small buffers and


can manage a queue of I/O requests (of varying
“depth”)
 Several algorithms exist to schedule the servicing
of disk I/O requests
 The analysis is true for one or many platters

 We illustrate scheduling algorithms with a request


queue (0-199)

98, 183, 37, 122, 14, 124, 65, 67


Head pointer 53
FCFS
Illustration shows total head movement of 640 cylinders
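The FCFS total for the request queue above can be checked directly: the head just visits requests in arrival order, summing the distances. This is a minimal sketch; `fcfs_head_movement` is a hypothetical name:

```python
def fcfs_head_movement(start, requests):
    """Sum the cylinder distances when servicing requests in FCFS order."""
    total, pos = 0, start
    for r in requests:
        total += abs(r - pos)  # move directly to the next request
        pos = r
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_head_movement(53, queue))  # 640
```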
SCAN

 The disk arm starts at one end of the disk, and


moves toward the other end, servicing requests
until it gets to the other end of the disk, where
the head movement is reversed and servicing
continues.
 The SCAN algorithm is sometimes called the
elevator algorithm
 Illustration shows total head movement of 236
cylinders
 But note that if requests are uniformly dense,
largest density at other end of disk and those
wait the longest
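The 236-cylinder figure can be reproduced with a simplified sketch that assumes the head starts by moving toward cylinder 0, as in the textbook illustration; `scan_head_movement` is a hypothetical name:

```python
def scan_head_movement(start, requests, min_cyl=0):
    """SCAN total head movement, head initially moving toward cylinder 0.
    Requests below the start are serviced on the way down; the head then
    reverses and sweeps up to the highest pending request."""
    above = [r for r in requests if r > start]
    total = start - min_cyl          # sweep down to the inner edge
    if above:
        total += max(above) - min_cyl  # reverse, sweep up to last request
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(scan_head_movement(53, queue))  # 236
```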
SCAN (CONT.)
C-SCAN
 Provides a more uniform wait time than SCAN
 The head moves from one end of the disk to the
other, servicing requests as it goes
 When it reaches the other end, however, it
immediately returns to the beginning of the disk,
without servicing any requests on the return trip
 Treats the cylinders as a circular list that wraps
around from the last cylinder to the first one
 Total number of cylinders?
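The question above can be worked through with a sketch; conventions differ on whether the idle return sweep counts as head movement, and this version counts it in full. `cscan_head_movement` is a hypothetical name:

```python
def cscan_head_movement(start, requests, min_cyl=0, max_cyl=199):
    """C-SCAN total head movement: sweep up to the last cylinder,
    return to the first without servicing, then continue upward."""
    below = [r for r in requests if r < start]
    total = max_cyl - start            # sweep up to the outer edge
    if below:
        total += max_cyl - min_cyl     # full return sweep (no servicing)
        total += max(below) - min_cyl  # continue up to last pending request
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(cscan_head_movement(53, queue))  # 382
```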
C-SCAN (CONT.)
SELECTING A DISK-SCHEDULING ALGORITHM
 SSTF is common and has a natural appeal
 SCAN and C-SCAN perform better for systems that place a
heavy load on the disk
 Less starvation
 Performance depends on the number and types of requests

 Requests for disk service can be influenced by the file-allocation


method
 And metadata layout
 The disk-scheduling algorithm should be written as a separate
module of the operating system, allowing it to be replaced with
a different algorithm if necessary
DISK MANAGEMENT
 Low-level formatting, or physical formatting — Dividing a
disk into sectors that the disk controller can read and write
 Each sector can hold header information, plus data, plus error
correction code (ECC)
 Usually 512 bytes of data but can be selectable
 To use a disk to hold files, the operating system still needs to
record its own data structures on the disk
 Partition the disk into one or more groups of cylinders, each
treated as a logical disk
 Logical formatting or “making a file system”
 To increase efficiency most file systems group blocks into
clusters
 Disk I/O done in blocks

 File I/O done in clusters


DISK MANAGEMENT (CONT.)

 Raw disk access for apps that want to do their


own block management, keep OS out of the
way (databases for example)
 Boot block initializes system
 The bootstrap is stored in ROM
 Bootstrap loader program stored in boot blocks
of boot partition
 Methods such as sector sparing used to
handle bad blocks
BOOTING FROM A DISK IN WINDOWS
SWAP-SPACE MANAGEMENT
 Swap-space — Virtual memory uses disk space as an
extension of main memory
 Less common now due to memory capacity increases
 Swap-space can be carved out of the normal file system, or,
more commonly, it can be in a separate disk partition (raw)
 Swap-space management

 4.3BSD allocates swap space when process starts; holds


text segment (the program) and data segment
 Kernel uses swap maps to track swap-space use
DATA STRUCTURES FOR SWAPPING ON LINUX SYSTEMS
STABLE-STORAGE IMPLEMENTATION
 Write-ahead log scheme requires stable storage
 Stable storage means data is never lost (due to failure, etc)
 To implement stable storage:
 Replicate information on more than one nonvolatile storage
media with independent failure modes
 Update information in a controlled manner to ensure that we can
recover the stable data after any failure during data transfer or
recovery
 Disk write has 1 of 3 outcomes
1. Successful completion - The data were written correctly on
disk
2. Partial failure - A failure occurred in the midst of transfer, so
only some of the sectors were written with the new data, and the
sector being written during the failure may have been corrupted
3. Total failure - The failure occurred before the disk write
started, so the previous data values on the disk remain intact
STABLE-STORAGE IMPLEMENTATION (CONT.)
 If failure occurs during block write, recovery
procedure restores block to consistent state
 System maintains 2 physical blocks per logical
block and does the following:
1. Write to 1st physical block
2. When successful, write to 2nd physical block
3. Declare complete only after the second write
completes successfully
Systems frequently use NVRAM as one physical to
accelerate
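The two-block protocol and its recovery rule can be sketched with an in-memory simulation (not a real device driver); `StableBlock` and its methods are hypothetical names, and checksums stand in for the disk's ECC:

```python
import hashlib

def checksum(data):
    return hashlib.sha256(data).hexdigest()

class StableBlock:
    """Two physical copies backing one logical block (a sketch)."""
    def __init__(self):
        self.copy1 = self.copy2 = (b"", checksum(b""))

    def write(self, data):
        self.copy1 = (data, checksum(data))  # step 1: first physical write
        # a crash here leaves copy2 intact for recovery
        self.copy2 = (data, checksum(data))  # step 2: second physical write
        # step 3: only now is the logical write declared complete

    def recover(self):
        d1, c1 = self.copy1
        d2, c2 = self.copy2
        if checksum(d1) != c1:     # copy 1 corrupted mid-write: restore it
            self.copy1 = self.copy2
        elif d1 != d2:             # crash between the writes: finish step 2
            self.copy2 = self.copy1
        return self.copy1[0]

blk = StableBlock()
blk.write(b"log record 42")
print(blk.recover())  # b'log record 42'
```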
Lecture 7 – Part 2

File System Interface


OBJECTIVES

 To explain the function of file systems


 To describe the interfaces to file systems

 To discuss file-system design tradeoffs,


including access methods, file sharing, file
locking, and directory structures
 To explore file-system protection
FILE CONCEPT

 Contiguous logical address space


 Types:
 Data
 numeric
 character

 binary

 Program
 Contents defined by file’s creator
 Many types
 Consider text file, source file, executable file
FILE ATTRIBUTES
 Name – only information kept in human-readable form
 Identifier – unique tag (number) identifies file within file
system
 Type – needed for systems that support different types
 Location – pointer to file location on device
 Size – current file size
 Protection – controls who can do reading, writing, executing
 Time, date, and user identification – data for protection,
security, and usage monitoring
 Information about files is kept in the directory structure,
which is maintained on the disk
 Many variations, including extended file attributes such as file
checksum
 Information kept in the directory structure
FILE INFO WINDOW ON MAC OS X
FILE OPERATIONS
 File is an abstract data type
 Create
 Write – at write pointer location
 Read – at read pointer location
 Reposition within file - seek
 Delete
 Truncate
 Open(Fi) – search the directory structure on
disk for entry Fi, and move the content of entry
to memory
 Close (Fi) – move the content of entry Fi in
memory to directory structure on disk
OPEN FILES
 Several pieces of data are needed to manage open files:
 Open-file table: tracks open files
 File pointer: pointer to last read/write location, per
process that has the file open
 File-open count: counter of the number of times a file
is open – allows removal of its entry from the open-file
table when the last process closes it
 Disk location of the file: cache of data access
information
 Access rights: per-process access mode information
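The bookkeeping above can be sketched as a tiny system-wide table keyed by file name; `OpenFileTable` and its fields are hypothetical names for illustration:

```python
class OpenFileEntry:
    """Per-file entry in a system-wide open-file table (a sketch)."""
    def __init__(self, disk_location):
        self.disk_location = disk_location  # cached data-access info
        self.open_count = 0                 # how many opens are outstanding

class OpenFileTable:
    def __init__(self):
        self.table = {}

    def open(self, name, disk_location):
        entry = self.table.setdefault(name, OpenFileEntry(disk_location))
        entry.open_count += 1
        return entry

    def close(self, name):
        entry = self.table[name]
        entry.open_count -= 1
        if entry.open_count == 0:  # last process closed it: drop the entry
            del self.table[name]

oft = OpenFileTable()
oft.open("a.txt", 1234)
oft.open("a.txt", 1234)   # second process opens the same file
oft.close("a.txt")
print("a.txt" in oft.table)  # True (one open remains)
oft.close("a.txt")
print("a.txt" in oft.table)  # False (entry removed)
```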
OPEN FILE LOCKING

 Provided by some operating systems and file


systems
 Similar to reader-writer locks
 Shared lock similar to reader lock – several
processes can acquire concurrently
 Exclusive lock similar to writer lock
 Mediates access to a file
 Mandatory or advisory:
 Mandatory – access is denied depending on locks
held and requested
 Advisory – processes can find status of locks and
decide what to do
FILE TYPES – NAME, EXTENSION
FILE STRUCTURE

 None - sequence of words, bytes


 Simple record structure
 Lines
 Fixed length
 Variable length
 Complex Structures
 Formatted document
 Relocatable load file
 Can simulate last two with first method by
inserting appropriate control characters
 Who decides:
 Operating system
 Program
SEQUENTIAL-ACCESS FILE
ACCESS METHODS
 Sequential Access
read next
write next
reset
no read after last write
(rewrite)
 Direct Access – file is fixed length logical records
read n
write n
position to n
read next
write next
rewrite n
n = relative block number

 Relative block numbers allow OS to decide where file should be placed


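Direct access on fixed-length records reduces to seek arithmetic: block n starts at byte offset n × block size. A minimal sketch using an ordinary file as the "disk" (`read_block` is a hypothetical helper, and 512-byte blocks are an assumption):

```python
import tempfile

BLOCK_SIZE = 512

def read_block(f, n):
    """Direct access: position to relative block n, read one block."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)

with tempfile.TemporaryFile() as f:
    f.write(bytes(4 * BLOCK_SIZE))       # four zero-filled blocks
    f.seek(2 * BLOCK_SIZE)
    f.write(b"hello")                    # write into block 2
    print(read_block(f, 2)[:5])          # b'hello'
```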
DIRECTORY STRUCTURE

 A collection of nodes containing information about all files

Directory

Files
F1 F2 F4
F3
Fn

Both the directory structure and the files reside on disk


DISK STRUCTURE
 Disk can be subdivided into partitions
 Disks or partitions can be RAID protected against
failure
 Disk or partition can be used raw – without a file
system, or formatted with a file system
 Partitions also known as minidisks, slices
 Entity containing file system known as a volume
 Each volume containing file system also tracks that
file system’s info in device directory or volume
table of contents
 As well as general-purpose file systems there are
many special-purpose file systems, frequently all
within the same operating system or computer
A TYPICAL FILE-SYSTEM ORGANIZATION
TYPES OF FILE SYSTEMS

 We mostly talk of general-purpose file systems


 But systems frequently have many file systems, some
general- and some special- purpose
 Consider Solaris has
 tmpfs – memory-based volatile FS for fast, temporary I/O
 objfs – interface into kernel memory to get kernel symbols
for debugging
 ctfs – contract file system for managing daemons
 lofs – loopback file system allows one FS to be accessed in
place of another
 procfs – kernel interface to process structures
 ufs, zfs – general purpose file systems
OPERATIONS PERFORMED ON DIRECTORY

 Search for a file

 Create a file

 Delete a file

 List a directory

 Rename a file

 Traverse the file system


DIRECTORY ORGANIZATION

The directory is organized logically to obtain

 Efficiency – locating a file quickly


 Naming – convenient to users
 Two users can have same name for different
files
 The same file can have several different
names
 Grouping – logical grouping of files by
properties, (e.g., all Java programs, all
games, …)
Lecture 7 – Part 3

File System Interface


SINGLE-LEVEL DIRECTORY
 A single directory for all users

 Naming problem
 Grouping problem
TWO-LEVEL DIRECTORY

 Separate directory for each user

Path name
Can have the same file name for different users
Efficient searching
No grouping capability
TREE-STRUCTURED DIRECTORIES
ACYCLIC-GRAPH DIRECTORIES

 Have shared subdirectories and files


ACYCLIC-GRAPH DIRECTORIES (CONT.)

 Two different names (aliasing)


 If dict deletes list  dangling pointer

Solutions:
 Backpointers, so we can delete all pointers
Variable size records a problem
 Backpointers using a daisy chain organization
 Entry-hold-count solution

 New directory entry type


 Link – another name (pointer) to an existing file
 Resolve the link – follow pointer to locate the file
GENERAL GRAPH DIRECTORY
FILE SYSTEM MOUNTING

 A file system must be mounted before it can be


accessed
 An unmounted file system is mounted at a mount
point
MOUNT POINT
FILE SHARING

 Sharing of files on multi-user systems is desirable


 Sharing may be done through a protection
scheme
 On distributed systems, files may be shared across
a network
 Network File System (NFS) is a common
distributed file-sharing method
 If multi-user system
 User IDs identify users, allowing permissions and
protections to be per-user
Group IDs allow users to be in groups, permitting
group access rights
 Owner of a file / directory
 Group of a file / directory
FILE SHARING – REMOTE FILE SYSTEMS
 Uses networking to allow file system access between systems
 Manually via programs like FTP
 Automatically, seamlessly using distributed file systems
 Semi-automatically via the world wide web

 Client-server model allows clients to mount remote file systems


from servers
 Server can serve multiple clients
 Client and user-on-client identification is insecure or complicated
 NFS is standard UNIX client-server file sharing protocol
 CIFS is standard Windows protocol
 Standard operating system file calls are translated into remote calls
 Distributed Information Systems (distributed naming
services) such as LDAP, DNS, NIS, Active Directory implement
unified access to information needed for remote computing
FILE SHARING – FAILURE MODES
 All file systems have failure modes
 For example corruption of directory structures or other
non-user data, called metadata
 Remote file systems add new failure modes, due to
network failure, server failure
 Recovery from failure can involve state
information about status of each remote request
 Stateless protocols such as NFS v3 include all
information in each request, allowing easy
recovery but less security
FILE SHARING – CONSISTENCY SEMANTICS

 Specify how multiple users are to access a shared file


simultaneously
 Andrew File System (AFS) implemented complex remote
file sharing semantics
 Unix file system (UFS) implements:
 Writes to an open file visible immediately to other users of the
same open file
 Sharing file pointer to allow multiple users to read and write
concurrently
 AFS has session semantics
 Writes only visible to sessions starting after the file is closed
PROTECTION

 File owner/creator should be able to control:


 what can be done
 by whom

 Types of access
 Read
 Write
 Execute
 Append
 Delete
 List
ACCESS LISTS AND GROUPS
 Mode of access: read, write, execute
 Three classes of users on Unix / Linux
a) owner access: 7 ⇒ RWX = 111
b) group access: 6 ⇒ RWX = 110
c) public access: 1 ⇒ RWX = 001
 Ask manager to create a group (unique name),
say G, and add some users to the group.
 For a particular file (say game) or subdirectory,
define an appropriate access.

Attach a group to a file


chgrp G game
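The octal-to-RWX expansion above is just a bit test per permission; a small sketch (`mode_to_rwx` and `mode_string` are hypothetical helper names):

```python
def mode_to_rwx(octal_digit):
    """Expand one octal permission digit (0-7) into its rwx bits."""
    return "".join(ch if octal_digit & bit else "-"
                   for ch, bit in (("r", 4), ("w", 2), ("x", 1)))

def mode_string(owner, group, public):
    """Build the familiar 9-character permission string, e.g. for mode 761."""
    return mode_to_rwx(owner) + mode_to_rwx(group) + mode_to_rwx(public)

# owner 7 (rwx), group 6 (rw-), public 1 (--x), as in the example above
print(mode_string(7, 6, 1))  # rwxrw---x
```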
WINDOWS 7 ACCESS-CONTROL LIST MANAGEMENT
A SAMPLE UNIX DIRECTORY LISTING
