UNIT 2 OS

Operating System

Uploaded by ANKIT BHARDWAJ
© All Rights Reserved

UNIT 2

What is File System?

A file is a collection of related information recorded on secondary, non-volatile storage such as magnetic disks, optical disks, and tapes. It serves as the medium through which a program receives input and delivers output.

In general, a file is a sequence of bits, bytes, or records whose meaning is defined by the file's creator and user. Every file has a logical location used for its storage and retrieval.

Commonly used terms in file systems

Field:

A field stores a single value, which can be of fixed or variable length.

Database:

A database is a collection of related data in which the relationships among data elements are explicit.

File:

A file is a collection of similar records treated as a single entity.

Record:

A record type is a composite data type that lets the programmer define a new type with the desired column structure. It groups one or more columns, each with its own name and data type, into a new type.

Objectives of a File Management System

Here are the main objectives of a file management system:


● It provides I/O support for a variety of storage device types.
● It minimizes the chances of lost or destroyed data.
● It helps the OS provide standardized I/O interface routines for user processes.
● It provides I/O support for multiple users in a multi-user environment.

Properties of a File System

Here are the important properties of a file system:

● Files are stored on disk or other storage and do not disappear when a user logs off.
● Files have names and are associated with access permissions that permit controlled sharing.
● Files can be arranged into more complex structures that reflect the relationships among them.

File Attributes

A file has a name and data. It also stores metadata such as the creation date and time, current size, and last-modified date. All this information constitutes the attributes of the file.

Here are some important file attributes used in an OS:


● Name: The only information stored in human-readable form.
● Identifier: A unique tag number that identifies the file within the file system.
● Location: A pointer to the file's location on the device.
● Type: Required by systems that support multiple file types.
● Size: The current size of the file.
● Protection: Assigns and controls the access rights for reading, writing, and executing the file.
● Time, date, and user identification: Used for protection, security, and usage monitoring.
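As a rough illustration (not part of the original notes), Python's standard `os.stat` call exposes several of these attributes for a real file; the temporary file below exists only for the demonstration.

```python
import os
import tempfile
import time

# Sketch: inspect the attributes (size, identifier, protection bits,
# timestamps) that the OS keeps for a file, via os.stat.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

st = os.stat(path)
print("size:", st.st_size)                   # current file size in bytes
print("identifier:", st.st_ino)              # unique id (inode) within the file system
print("protection:", oct(st.st_mode))        # access-rights bits
print("modified:", time.ctime(st.st_mtime))  # last-modified timestamp
os.unlink(path)
```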

Functions of a File

● Create: find space on disk and make an entry in the directory.
● Write: requires positioning within the file.
● Read: involves positioning within the file.
● Reposition: move the read/write position.
● Delete: remove the directory entry and reclaim the disk space.

FILE DIRECTORIES:


A collection of files is a file directory. The directory contains information about the files, including their attributes, location, and ownership. Much of this information, especially that concerned with storage, is managed by the operating system. The directory is itself a file, accessible by various file-management routines.

The information contained in a device directory includes:


● Name
● Type
● Address
● Current length
● Maximum length
● Date last accessed
● Date last updated
● Owner id
● Protection information
Operations performed on a directory are:
● Search for a file
● Create a file
● Delete a file
● List a directory
● Rename a file
● Traverse the file system
Advantages of maintaining directories are:
● Efficiency: A file can be located more quickly.
● Naming: It becomes convenient for users, as two users can use the same name for different files or different names for the same file.
● Grouping: Files can be grouped logically by property, e.g. all Java programs, all games, etc.

SINGLE-LEVEL DIRECTORY
In this scheme, a single directory is maintained for all users.

● Naming problem: two files cannot share the same name.
● Grouping problem: users cannot group files according to their needs.
TWO-LEVEL DIRECTORY
In this scheme, a separate directory is maintained for each user.
● Path name: with two levels, every file has a path name used to locate it.
● Different users can now use the same file name.
● Searching is more efficient in this method.

TREE-STRUCTURED DIRECTORY:
The directory is maintained in the form of a tree. Searching is efficient, and grouping capability is available. A file can be referred to by an absolute or a relative path name.

File Allocation Methods


The allocation methods define how the files are stored in the disk blocks. There are three main disk space or file
allocation methods.

● Contiguous Allocation
● Linked Allocation
● Indexed Allocation
The main idea behind these methods is to provide:

● Efficient disk space utilization.


● Fast access to the file blocks.
All three methods have their own advantages and disadvantages, as discussed below:

1. Contiguous Allocation
In this scheme, each file occupies a contiguous set of blocks on the disk. For example, if a file requires n blocks and is given block b as the starting location, then the blocks assigned to the file will be: b, b+1, b+2, ..., b+n-1. This means that given the starting block address and the length of the file (in blocks), we can determine the blocks occupied by the file.
The directory entry for a file with contiguous allocation contains
● Address of starting block
● Length of the allocated portion.
The file ‘mail’ in the following figure starts at block 19 with length = 6 blocks. Therefore, it occupies blocks 19, 20, 21, 22, 23, and 24.
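The mapping can be sketched in a few lines of Python; the function names here are illustrative, not from any real file system API.

```python
# Sketch of contiguous allocation: a directory entry holds (start, length),
# and the k-th block of the file is simply start + k.
def contiguous_blocks(start, length):
    """All disk blocks occupied by the file."""
    return list(range(start, start + length))

def kth_block(start, length, k):
    """Direct access: address of the k-th block (0-indexed)."""
    if not 0 <= k < length:
        raise IndexError("block index outside the file")
    return start + k

# The 'mail' file from the figure: starts at block 19, length 6.
print(contiguous_blocks(19, 6))  # [19, 20, 21, 22, 23, 24]
print(kth_block(19, 6, 3))       # 22
```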

Advantages:
● Both sequential and direct access are supported. For direct access, the address of the kth block of a file that starts at block b can easily be obtained as (b+k).
● It is extremely fast, since the number of seeks is minimal due to the contiguous allocation of file blocks.
Disadvantages:
● This method suffers from both internal and external fragmentation, which makes it inefficient in terms of disk utilization.
● Increasing the file size is difficult, because it depends on the availability of contiguous disk space at a particular instant.
2. Linked List Allocation
In this scheme, each file is a linked list of disk blocks which need not be contiguous. The disk blocks can be
scattered anywhere on the disk.
The directory entry contains a pointer to the starting and the ending file block. Each block contains a pointer to
the next block occupied by the file.
The file ‘jeep’ in the following image shows how the blocks can be randomly distributed. The last block (25) contains -1, indicating a null pointer: it does not point to any other block.

Advantages:
● This is very flexible in terms of file size. File size can be increased easily since the system does
not have to look for a contiguous chunk of memory.
● This method does not suffer from external fragmentation. This makes it relatively better in terms
of memory utilization.
Disadvantages:
● Because the file blocks are distributed randomly on the disk, a large number of seeks is needed to access every block individually. This makes linked allocation slower.
● It does not support random or direct access. We cannot directly access the blocks of a file: block k of a file can be reached only by traversing k blocks sequentially (sequential access) from the starting block via the block pointers.
● Pointers required in the linked allocation incur some extra overhead.
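A toy sketch of this traversal is below. Since the figure is not reproduced here, the block chain 9 → 16 → 1 → 10 → 25 is an assumption borrowed from the classic textbook 'jeep' example; the `next_ptr` table stands in for the on-disk pointers.

```python
# Sketch of linked allocation: each block stores a pointer to the next,
# with -1 as the null pointer.
next_ptr = {9: 16, 16: 1, 1: 10, 10: 25, 25: -1}  # assumed chain for 'jeep'

def linked_blocks(start):
    """Collect all blocks of the file by following the pointer chain."""
    blocks, b = [], start
    while b != -1:
        blocks.append(b)
        b = next_ptr[b]
    return blocks

def kth_block(start, k):
    """Reaching block k costs k pointer traversals: no direct access."""
    b = start
    for _ in range(k):
        b = next_ptr[b]
    return b

print(linked_blocks(9))  # [9, 16, 1, 10, 25]
print(kth_block(9, 3))   # 10
```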

3. Indexed Allocation
In this scheme, a special block known as the Index block contains the pointers to all the blocks occupied by a
file. Each file has its own index block. The ith entry in the index block contains the disk address of the ith file
block. The directory entry contains the address of the index block as shown in the image:
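The lookup can be sketched as follows; the pointer values in `index_block` are hypothetical, chosen only to show that one table lookup replaces the pointer chain of linked allocation.

```python
# Sketch of indexed allocation: the directory stores only the index block;
# the i-th entry of the index block holds the address of the i-th file block.
index_block = [9, 16, 1, 10, 25]   # hypothetical pointers for one file

def kth_block(index_block, k):
    return index_block[k]           # a single lookup: direct access

print(kth_block(index_block, 0))  # 9
print(kth_block(index_block, 3))  # 10
```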

Advantages:
● This supports direct access to the blocks occupied by the file and therefore provides fast access to
the file blocks.
● It overcomes the problem of external fragmentation.
Disadvantages:
● The pointer overhead for indexed allocation is greater than linked allocation.
● For very small files, say files that span only 2-3 blocks, indexed allocation keeps one entire block (the index block) just for pointers, which is inefficient in terms of disk utilization. In linked allocation, by contrast, we lose the space of only one pointer per block.
For very large files, a single index block may not be able to hold all the pointers. The following mechanisms can be used to resolve this:

1. Linked scheme: This scheme links two or more index blocks together to hold the pointers. Every index block then contains a pointer (the address) to the next index block.
2. Multilevel index: In this policy, a first-level index block points to second-level index blocks, which in turn point to the disk blocks occupied by the file. This can be extended to three or more levels depending on the maximum file size.

FILE ACCESS METHODS


When a file is used, its information is read into computer memory, and there are several ways to access this information. Some systems provide only one access method for files. Other systems, such as those of IBM, support many access methods, and choosing the right one for a particular application is a major design problem.

The main access methods are: sequential access, direct access, and the indexed sequential method.

1. Sequential Access –
It is the simplest access method. Information in the file is processed in order, one record after the other. This mode of access is by far the most common; editors and compilers, for example, usually access files in this fashion.
Reads and writes make up the bulk of the operations on a file. A read operation (read next) reads the next portion of the file and automatically advances the file pointer, which tracks the I/O location. Similarly, a write operation (write next) appends to the end of the file and advances the pointer past the newly written material.
Key points:

● Data is accessed one record after another, in order.

● A read command advances the pointer by one record.

● A write command allocates space and moves the pointer to the end of the file.

● Such a method is reasonable for tape.

2. Direct Access –
Another method is direct access, also known as relative access. A file is made up of fixed-length logical records that allow programs to read and write records rapidly in no particular order. Direct access is based on the disk model of a file, since a disk allows random access to any file block.
For direct access, the file is viewed as a numbered sequence of blocks or records. Thus, we may read block 14, then block 59, and then write block 17. There is no restriction on the order of reading and writing for a direct-access file.
A block number provided by the user to the operating system is normally a relative block number: the first relative block of the file is 0, then 1, and so on.

3. Indexed Sequential Method –

This method is built on top of sequential access. It constructs an index for the file. The index, like an index at the back of a book, contains pointers to the various blocks. To find a record in the file, we first search the index and then use the pointer to access the file directly.
Key points:
● It is built on top of sequential access.

Magnetic Disks
Traditional magnetic disks have the following basic structure:

One or more platters in the form of disks covered with magnetic media. Hard disk platters are made of rigid metal, while "floppy" disks are
made of more flexible plastic.

Each platter has two working surfaces. Older hard disk drives would sometimes not use the very top or bottom surface of a stack of
platters, as these surfaces were more susceptible to potential damage.

Each working surface is divided into a number of concentric rings called tracks. The collection of all tracks that are the same distance from
the edge of the platter, ( i.e. all tracks immediately above one another in the following diagram ) is called a cylinder.
Each track is further divided into sectors, traditionally containing 512 bytes of data each, although some modern disks occasionally use
larger sector sizes. ( Sectors also include a header and a trailer, including checksum information among other things. Larger sector sizes
reduce the fraction of the disk consumed by headers and trailers, but increase internal fragmentation and the amount of disk that must be
marked bad in the case of errors. )

The data on a hard drive is read by read-write heads. The standard configuration ( shown below ) uses one head per surface, each on a
separate arm, and controlled by a common arm assembly which moves all heads simultaneously from one cylinder to another. ( Other
configurations, including independent read-write heads, may speed up disk access, but involve serious technical difficulties. )

The storage capacity of a traditional disk drive is equal to the number of heads ( i.e. the number of working surfaces ), times the number of
tracks per surface, times the number of sectors per track, times the number of bytes per sector. A particular physical block of data is
specified by providing the head-sector-cylinder number at which it is located.
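The capacity formula above is a straightforward product; the geometry figures below are made up for illustration and do not describe any particular drive.

```python
# Capacity = heads (working surfaces) x tracks/surface x sectors/track
#            x bytes/sector.
def disk_capacity(heads, tracks_per_surface, sectors_per_track,
                  bytes_per_sector=512):
    return heads * tracks_per_surface * sectors_per_track * bytes_per_sector

# Hypothetical geometry: 16 heads, 1024 tracks per surface, 63 sectors/track.
cap = disk_capacity(heads=16, tracks_per_surface=1024, sectors_per_track=63)
print(cap, "bytes")  # 528482304 bytes (exactly 504 MiB)
```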


Disk Scheduling Algorithms
Disk scheduling is done by operating systems to schedule I/O requests arriving for the disk. Disk scheduling is
also known as I/O scheduling.
Disk scheduling is important because:

● Multiple I/O requests may arrive from different processes, and the disk controller can serve only one I/O request at a time. The other I/O requests must therefore wait in a queue and be scheduled.
● Two or more requests may be far apart on the disk, resulting in greater disk-arm movement.
● Hard drives are among the slowest parts of the computer system and thus need to be accessed in an efficient manner.
There are many Disk Scheduling Algorithms but before discussing them let’s have a quick look at some of the
important terms:

● Seek Time: The time taken to move the disk arm to the track where the data is to be read or written. A disk scheduling algorithm that gives a smaller average seek time is better.
● Rotational Latency: The time taken by the desired sector of the disk to rotate into a position where the read/write heads can access it. An algorithm that gives lower rotational latency is better.
● Transfer Time: The time to transfer the data. It depends on the rotational speed of the disk and the number of bytes to be transferred.
● Disk Access Time:

Disk Access Time = Seek Time + Rotational Latency + Transfer Time

● Disk Response Time: The average time a request spends waiting to perform its I/O operation. Average response time is the mean response time over all requests, and variance of response time measures how individual requests are serviced relative to the average. A disk scheduling algorithm that gives a low variance of response time is better.
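The access-time formula can be checked with back-of-the-envelope numbers; the millisecond figures below are illustrative assumptions, not measurements of a real drive.

```python
# Disk Access Time = Seek Time + Rotational Latency + Transfer Time.
# All values in milliseconds (assumed for illustration).
seek_time = 8.5
rotational_latency = 4.16   # roughly half a revolution at 7200 RPM
transfer_time = 0.5

access_time = seek_time + rotational_latency + transfer_time
print(round(access_time, 2), "ms")  # 13.16 ms
```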

Disk Scheduling Algorithms


1. FCFS: FCFS is the simplest of all disk scheduling algorithms. In FCFS, the requests are addressed in the order they arrive in the disk queue. Let us understand this with the help of an example.

Example:
Suppose the order of request is- (82,170,43,140,24,16,190)
And current position of Read/Write head is : 50

So, total seek time:


=(82-50)+(170-82)+(170-43)+(140-43)+(140-24)+(24-16)+(190-16)
=642
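The arithmetic above can be reproduced with a short helper (the function name is ours, not a standard API):

```python
# FCFS: service requests in arrival order; total seek time is the sum of
# absolute head movements between consecutive positions.
def fcfs_seek_time(requests, head):
    total = 0
    for r in requests:
        total += abs(r - head)
        head = r
    return total

print(fcfs_seek_time([82, 170, 43, 140, 24, 16, 190], 50))  # 642
```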

Advantages:

● Every request gets a fair chance


● No indefinite postponement
Disadvantages:

● Does not try to optimize seek time


● May not provide the best possible service

2. SSTF: In SSTF (Shortest Seek Time First), the request with the shortest seek time is executed first. The seek time of every request in the queue is calculated in advance, and requests are scheduled according to it. As a result, the request nearest the disk arm is executed first. SSTF is certainly an improvement over FCFS, as it decreases the average response time and increases the throughput of the system. Let us understand this with the help of an example.

Example:
Suppose the order of request is- (82,170,43,140,24,16,190)
And current position of Read/Write head is : 50

So, total seek time:

=(50-43)+(43-24)+(24-16)+(82-16)+(140-82)+(170-140)+(190-170)
=208
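The greedy selection can be sketched as follows (again, an illustrative helper rather than a standard API):

```python
# SSTF: repeatedly pick the pending request closest to the current head.
def sstf_seek_time(requests, head):
    pending, total = list(requests), 0
    while pending:
        nearest = min(pending, key=lambda r: abs(r - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

print(sstf_seek_time([82, 170, 43, 140, 24, 16, 190], 50))  # 208
```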

Advantages:

● Average Response Time decreases


● Throughput increases
Disadvantages:

● Overhead to calculate seek time in advance


● Can cause Starvation for a request if it has higher seek time as compared to incoming requests
● High variance of response time as SSTF favours only some requests
3. SCAN: In the SCAN algorithm, the disk arm moves in a particular direction and services the requests along its path; after reaching the end of the disk, it reverses direction and services the requests along the return path. The algorithm works like an elevator and is hence also known as the elevator algorithm. As a result, requests in the mid-range are serviced more often, while those arriving behind the disk arm have to wait.

Example:
Suppose the requests to be addressed are-82,170,43,140,24,16,190. And the Read/Write arm is at
50, and it is also given that the disk arm should move “towards the larger value”.

Therefore, the seek time is calculated as:

=(199-50)+(199-16)
=332
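A sketch of the sweep, assuming a 200-cylinder disk (0-199) as the arithmetic above implies:

```python
# SCAN (elevator): move toward larger values to the end of the disk,
# then reverse and service the remaining requests.
def scan_seek_time(requests, head, disk_end=199):
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    total, pos = 0, head
    for r in up:
        total += r - pos
        pos = r
    if down:                       # travel to the physical end, then reverse
        total += disk_end - pos
        pos = disk_end
        for r in down:
            total += pos - r
            pos = r
    return total

print(scan_seek_time([82, 170, 43, 140, 24, 16, 190], 50))  # 332
```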

Advantages:

● High throughput
● Low variance of response time
● Low average response time
Disadvantages:

● Long waiting time for requests for locations just visited by disk arm

4. CSCAN: In the SCAN algorithm, the disk arm rescans the path it has already covered after reversing its direction. It may therefore happen that many requests are waiting at the other end while few or none are pending in the area just scanned.
These situations are avoided in the C-SCAN algorithm, in which the disk arm, instead of reversing direction, goes to the other end of the disk and starts servicing requests from there. The disk arm thus moves in a circular fashion, and the algorithm is hence known as C-SCAN (Circular SCAN).
Example:
Suppose the requests to be addressed are-82,170,43,140,24,16,190. And the Read/Write arm is at 50, and it is
also given that the disk arm should move “towards the larger value”.

Seek time is calculated as:

=(199-50)+(199-0)+(43-0)
=391
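The circular sweep, under the same 0-199 cylinder assumption (note that, as in the worked figures above, the return jump to cylinder 0 is counted in the seek time):

```python
# C-SCAN: service upward to the end of the disk, jump back to cylinder 0,
# then service the remaining requests upward again.
def cscan_seek_time(requests, head, disk_end=199):
    up = sorted(r for r in requests if r >= head)
    wrapped = sorted(r for r in requests if r < head)
    total, pos = 0, head
    for r in up:
        total += r - pos
        pos = r
    if wrapped:
        total += (disk_end - pos) + disk_end   # to the end, then back to 0
        pos = 0
        for r in wrapped:
            total += r - pos
            pos = r
    return total

print(cscan_seek_time([82, 170, 43, 140, 24, 16, 190], 50))  # 391
```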

Advantages:

● Provides more uniform wait time compared to SCAN


5. LOOK: It is similar to the SCAN disk scheduling algorithm except that the disk arm, instead of going to the end of the disk, goes only as far as the last request to be serviced in front of the head and then reverses direction from there. It thus avoids the extra delay caused by unnecessary traversal to the end of the disk.

Example:
Suppose the requests to be addressed are-82,170,43,140,24,16,190. And the Read/Write arm is at
50, and it is also given that the disk arm should move “towards the larger value”.

So, the seek time is calculated as:

=(190-50)+(190-16)
=314
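The only change from SCAN is that the arm turns around at the last pending request rather than at the disk edge:

```python
# LOOK: like SCAN, but reverse at the farthest pending request instead of
# travelling all the way to the end of the disk.
def look_seek_time(requests, head):
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    total, pos = 0, head
    for r in up + down:
        total += abs(r - pos)
        pos = r
    return total

print(look_seek_time([82, 170, 43, 140, 24, 16, 190], 50))  # 314
```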

6. CLOOK: Just as LOOK relates to SCAN, CLOOK relates to C-SCAN. In CLOOK, the disk arm, instead of going to the end of the disk, goes only as far as the last request to be serviced in front of the head and then jumps to the last request at the other end. It, too, avoids the extra delay caused by unnecessary traversal to the end of the disk.

Example:
Suppose the requests to be addressed are-82,170,43,140,24,16,190. And the Read/Write arm is at
50, and it is also given that the disk arm should move “towards the larger value”

So, the seek time is calculated as:

=(190-50)+(190-16)+(43-16)
=341
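The same change applied to C-SCAN gives the sketch below; the jump from the highest request back to the lowest one is counted in the seek time, matching the worked figure above.

```python
# C-LOOK: service upward to the farthest request, then jump to the
# smallest pending request and continue upward.
def clook_seek_time(requests, head):
    up = sorted(r for r in requests if r >= head)
    wrapped = sorted(r for r in requests if r < head)
    total, pos = 0, head
    for r in up:
        total += r - pos
        pos = r
    if wrapped:
        total += pos - wrapped[0]        # jump back to the lowest request
        pos = wrapped[0]
        for r in wrapped[1:]:
            total += r - pos
            pos = r
    return total

print(clook_seek_time([82, 170, 43, 140, 24, 16, 190], 50))  # 341
```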

Protection in File System


In computer systems, a lot of user information is stored, and the objective of the operating system is to keep the user's data safe from improper access. Protection can be provided in a number of ways. For a single-user laptop, we might provide protection by locking the computer in a desk drawer or file cabinet. For multi-user systems, different mechanisms are needed.

Types of Access :
Files that any user can access directly need protection; files that are not accessible to other users do not. A protection mechanism provides controlled access by limiting the types of access that can be made to a file. Whether access is granted depends on several factors, one of which is the type of access required. Several different types of operations can be controlled:
● Read –
Reading from a file.
● Write –
Writing or rewriting the file.
● Execute –
Loading the file into memory and executing it.
● Append –
Writing new information at the end of an existing file; writing is restricted to the end of the file.
● Delete –
Deleting the file and reclaiming its space for other data.
● List –
Listing the name and attributes of the file.
Other operations, such as renaming, editing an existing file, and copying, can also be controlled. Many protection mechanisms exist; each has its own advantages and disadvantages and must be appropriate for its intended application.

Access Control :
Different users may need to access a file in different ways. The most general way to provide protection is to make access identity-dependent: associate with every file and directory an access-control list (ACL) specifying user names and the types of access allowed for each user. The main problem with access lists is their length. If we want to allow everyone to read a file, we must list all users with read access. This technique has two undesirable consequences:

Constructing such a list may be a tedious and unrewarding task, especially if we do not know in advance the list of users in the system.

Directory entries, previously of fixed size, now become variable-sized, which complicates space management. These problems can be resolved by using a condensed version of the access list. To condense its length, many systems recognize three classifications of users in connection with each file:

● Owner –
Owner is the user who has created the file.
● Group –
A group is a set of users who have similar needs and share the file.
● Universe –
All other users in the system fall into the category called universe.
The most common recent approach is to combine access-control lists with the normal general owner, group,
and universe access control scheme. For example: Solaris uses the three categories of access by default but
allows access-control lists to be added to specific files and directories when more fine-grained access control is
desired.
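The owner/group/universe scheme can be sketched as three permission sets per file, as in the classic rwxr-x--- layout. All names, files, and the `may_access` helper below are made up for illustration.

```python
# Sketch of the owner/group/universe classification for file protection.
FILE_PERMS = {
    "report.txt": {"owner": "alice", "group": "staff",
                   "bits": {"owner": "rw", "group": "r", "universe": ""}},
}
GROUPS = {"staff": {"alice", "bob"}}

def may_access(user, filename, op):
    """Classify the user, then check the permission set for that class."""
    entry = FILE_PERMS[filename]
    if user == entry["owner"]:
        cls = "owner"
    elif user in GROUPS.get(entry["group"], set()):
        cls = "group"
    else:
        cls = "universe"
    return op in entry["bits"][cls]

print(may_access("alice", "report.txt", "w"))  # True  (owner may write)
print(may_access("bob", "report.txt", "r"))    # True  (group may read)
print(may_access("eve", "report.txt", "r"))    # False (universe: no access)
```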

Other Protection Approaches:


Access to a system can also be controlled by passwords. If passwords are random and changed often, they can effectively limit access to a file.
The use of passwords has a few disadvantages:
● The number of passwords can become very large, making them difficult to remember.
● If one password is used for all files, then once it is discovered, all files become accessible; protection is on an all-or-none basis.

File system

A file system is a structure used by an operating system to organise and manage files on a storage device such as
a hard drive, solid state drive (SSD), or USB flash drive. It defines how data is stored, accessed, and organised
on the storage device. Different file systems have varying characteristics and are often specific to certain
operating systems or devices.

Common types of file systems include:


 FAT (File Allocation Table): An older file system used by older versions of Windows and other operating
systems.
 NTFS (New Technology File System): A modern file system used by Windows. It supports features such
as file and folder permissions, compression, and encryption.
 ext (Extended File System): A file system commonly used on Linux and Unix-based operating systems.

FAT (File Allocation Table), FAT16, FAT32

FAT is one of the oldest and simplest file systems. It was initially developed for MS-DOS and is still used in
many removable storage devices. The two major versions of this system are FAT16 and FAT32. FAT uses a file
allocation table to keep track of file locations on the disk. However, it lacks some advanced features like file
permissions and journaling, making it less suitable for modern operating systems. FAT16 was introduced in
1987 with DOS 3.31, while FAT32 was introduced with Windows 95 OSR2 (MS-DOS 7.1) in 1996.

Advantages:

 Simplicity: FAT's simple design makes it easy to implement and use, suiting devices with limited
resources or strict compatibility requirements.
 Data recovery: Due to its simple structure, FAT file systems are relatively easy to recover in case of data
corruption or accidental deletion.
 Compatibility: It can natively be read from and written to by Windows, MacOS and Linux operating
systems without the need for third-party software.

Disadvantages:

 Fragmentation: Fragmentation occurs when file data is scattered across different parts of the disk, resulting in
reduced performance. Regular defragmentation is required to optimise disk performance.
 Lack of advanced features: The newest version, FAT32, lacks several advanced features found in other file
systems. It does not support file-level security permissions, journaling, encryption, or compression.
 Volume name limitations: The volume names for FAT16 and FAT32 cannot exceed 11 characters and cannot
include most non-alphanumeric characters.
 File name limitations: File names on a FAT16 file system follow the 8.3 format: up to 8 characters plus a
3-character file extension.

exFAT (Extended File Allocation Table)

exFAT is a file system introduced by Microsoft as an improved version of FAT32. It addresses some of the
limitations of FAT32, allowing for larger file sizes and better performance. exFAT is commonly used for
removable storage devices, such as external SSDs, hard drives and SD cards as it provides compatibility across
multiple operating systems. It was first introduced in 2006 as part of Windows CE 6.0.

Advantages:

 Large file and partition size support: exFAT supports much larger file sizes and partition sizes
compared to FAT file systems. It can handle files bigger than 4 GB, making it suitable for storing large
media files or disk images.
 Efficient disk space utilisation: exFAT improves disk space utilisation compared to older FAT file
systems. It uses smaller cluster sizes, which reduces the amount of wasted disk space for smaller files.
 Compatibility: It can natively be read from and written to by Windows and MacOS operating systems
without the need for third-party software.

Disadvantages:

 Limited metadata support: exFAT lacks some advanced features found in other modern file systems. It doesn’t
support file-level security permissions, journaling, or file system-level encryption.
 Fragmentation: Like FAT file systems, exFAT is still susceptible to fragmentation. As files are created, modified,
and deleted, fragmentation can occur leading to decreased performance over time.

NTFS (New Technology File System)

NTFS is the default file system used by Windows NT-based operating systems, starting in 1993 with Windows
NT 3.1, all the way up to and including Windows 11. It offers advanced features like file permissions,
encryption, compression, and journaling. NTFS supports large file and partition sizes, making it suitable for
modern storage devices. However, it has limited compatibility with non-Windows operating systems.

Advantages:

 Security and permissions: NTFS provides a solid security model with file-level permissions. It allows
you to set permissions for individual files and folders, controlling access rights for users and groups.
 Trim support on solid-state drives (SSDs): TRIM informs the drive about unused data, allowing the
SSD to erase and prepare the space for future writes. TRIM is enabled by default when the NTFS file
system is chosen, helping maintain performance.

Disadvantages:
 Disk errors and repairs: Although NTFS is designed to be reliable, disk errors can still occur. When encountering
disk errors, NTFS repairs can be time-consuming and may require special tools.
 Fragmentation: Over time, NTFS file systems can become fragmented, especially as files are created, modified
and deleted. Fragmentation can lead to decreased performance as the system needs to access scattered file
fragments.

RAID (Redundant Arrays of Independent Disks)


RAID, or “Redundant Arrays of Independent Disks” is a technique which makes use of a combination of multiple
disks instead of using a single disk for increased performance, data redundancy or both. The term was coined
by David Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987.

Why data redundancy?


Data redundancy, although it takes up extra space, adds to disk reliability. This means that, in case of a disk failure, if the same data is also backed up onto another disk, we can retrieve the data and continue operating. On the other hand, if the data is merely spread across multiple disks without the RAID technique, the loss of a single disk can affect all of the data.

Key evaluation points for a RAID System


● Reliability: How many disk faults can the system tolerate?
● Availability: What fraction of the total session time is a system in uptime mode, i.e. how
available is the system for actual use?
● Performance: How good is the response time? How high is the throughput (rate of processing
work)? Note that performance contains a lot of parameters and not just the two.
● Capacity: Given a set of N disks each with B blocks, how much useful capacity is available to the
user?
RAID is very transparent to the underlying system. This means, to the host system, it appears as a single big
disk presenting itself as a linear array of blocks. This allows older technologies to be replaced by RAID without
making too many changes in the existing code.

RAID 0

o RAID level 0 provides data striping, i.e., data is spread across multiple disks. Because it relies purely on striping, if one disk fails, all data in the array is lost.
o This level doesn't provide fault tolerance, but it increases system performance.

Example:

Disk 0 Disk 1 Disk 2 Disk 3

20 21 22 23

24 25 26 27

28 29 30 31

32 33 34 35

In this figure, blocks 20, 21, 22, and 23 form a stripe.

In this level, instead of placing just one block on a disk at a time, we can place two or more consecutive blocks on a disk before moving on to the next one.

Disk 0 Disk 1 Disk 2 Disk 3

20 22 24 26

21 23 25 27

28 30 32 34
29 31 33 35

In the figure above, there is no duplication of data; hence, a block once lost cannot be recovered.
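The striping layout of the first figure can be sketched as a simple modular mapping; the function name is ours, and a chunk size of one block is assumed.

```python
# RAID 0 striping sketch: with N disks and one block per chunk, logical
# block b lives on disk (b mod N) at stripe row (b div N).
def raid0_location(block, num_disks=4):
    return block % num_disks, block // num_disks   # (disk, row)

print(raid0_location(20))  # (0, 5): block 20 sits on disk 0
print(raid0_location(23))  # (3, 5): blocks 20-23 form one stripe
```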

Pros of RAID 0:

o Throughput is increased, because multiple data requests are probably not on the same disk.
o This level fully utilizes the disk space and provides high performance.
o It requires a minimum of 2 drives.

Cons of RAID 0:

o It doesn't contain any error detection mechanism.
o RAID 0 is not a true RAID because it is not fault-tolerant.
o In this level, the failure of any disk results in complete data loss for the array.

RAID 1

This level is called mirroring, as it copies the data from one drive to a second drive. It provides 100% redundancy in case of a drive failure.

Example:

Disk 0   Disk 1   Disk 2   Disk 3
  A        A        B        B
  C        C        D        D
  E        E        F        F
  G        G        H        H

Only half of the total drive capacity is used to store data; the other half is just a mirror of the data already stored.
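Mirroring can be sketched with two in-memory "drives" (a toy model with hypothetical helper names; real RAID 1 works at the block-device level): every write goes to both drives, so a read can be served by either copy.

```python
# Toy model of RAID 1 mirroring: in-memory dicts stand in for drives.

disk0, disk1 = {}, {}

def mirrored_write(block, data):
    disk0[block] = data
    disk1[block] = data   # identical copy: 100% redundancy, 50% usable capacity

def mirrored_read(block, failed_disk=None):
    # If one drive fails, the mirror transparently takes over.
    return (disk1 if failed_disk == 0 else disk0)[block]

mirrored_write(7, b"A")
assert mirrored_read(7) == b"A"
assert mirrored_read(7, failed_disk=0) == b"A"   # survives loss of disk 0
```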

Pros of RAID 1:

o The main advantage of RAID 1 is fault tolerance: if one disk fails, the other
automatically takes over.
o In this level, the array will continue to function even if one of the drives fails.

Cons of RAID 1:

o In this level, one extra drive is required per drive for mirroring, so the expense is higher.

RAID 2

o RAID 2 consists of bit-level striping using Hamming-code parity. In this level, each data bit of a word is
recorded on a separate disk, and the ECC codes of the data words are stored on a different set of disks.
o Due to its high cost and complex structure, this level is not used commercially. The same performance
can be achieved by RAID 3 at a lower cost.

Pros of RAID 2:
o This level uses one designated drive to store parity.
o It uses the Hamming code for error detection.

Cons of RAID 2:

o It requires an additional drive for error detection.

RAID 3

o RAID 3 consists of byte-level striping with dedicated parity. In this level, parity information is
computed for each stripe and written to a dedicated parity drive.
o In case of a drive failure, the parity drive is accessed and the data is reconstructed from the remaining
devices. Once the failed drive is replaced, the missing data can be restored to the new drive.
o In this level, data can be transferred in bulk, so high-speed data transmission is possible.

Disk 0   Disk 1   Disk 2   Disk 3
  A        B        C      P(A, B, C)
  D        E        F      P(D, E, F)
  G        H        I      P(G, H, I)
  J        K        L      P(J, K, L)

Pros of RAID 3:

o In this level, data is regenerated using the parity drive.
o It offers high data transfer rates.
o In this level, data is accessed in parallel.

Cons of RAID 3:

o It requires an additional drive for parity.
o It gives slow performance when operating on small files.

RAID 4

o RAID 4 consists of block-level striping with a dedicated parity disk. Instead of duplicating data, RAID 4
adopts a parity-based approach.
o This level allows recovery from at most one disk failure, due to the way parity works. If more
than one disk fails, there is no way to recover the data.
o Levels 3 and 4 both require at least three disks to implement RAID.

Disk 0   Disk 1   Disk 2   Disk 3
  A        B        C        P0
  D        E        F        P1
  G        H        I        P2
  J        K        L        P3

In this figure, we can observe one disk dedicated to parity.

In this level, parity is calculated using the XOR function. If the data bits are 0, 1, 0, 0, the parity bit is
XOR(0, 1, 0, 0) = 1. If the data bits are 0, 0, 1, 1, the parity bit is XOR(0, 0, 1, 1) = 0. That is, an even number
of ones results in parity 0, and an odd number of ones results in parity 1.

C1   C2   C3   C4   Parity
 0    1    0    0     1
 0    0    1    1     0

Suppose that, in the table above, C2 is lost due to a disk failure. Then, using the values of all the other
columns and the parity bit, we can recompute the data bit stored in C2. This is how the level allows us to
recover lost data.
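The reconstruction described above is just XOR again: XOR-ing the surviving columns with the parity yields the lost column. Here is a minimal sketch (the helper name `parity` is hypothetical, and small integers stand in for disk blocks):

```python
from functools import reduce

def parity(blocks):
    """XOR all blocks together; this is how RAID 4 computes parity."""
    return reduce(lambda a, b: a ^ b, blocks)

data = [0b0100, 0b1010, 0b0011]   # three data "disks"
p = parity(data)                  # stored on the dedicated parity disk

# Disk 1 fails: XOR the survivors with the parity to rebuild its block.
rebuilt = parity([data[0], data[2], p])
assert rebuilt == data[1]
```

The same property explains why only one failure is recoverable: with two blocks missing, the single parity equation has two unknowns.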

RAID 5

o RAID 5 is a slight modification of RAID 4. The only difference is that in RAID 5, the parity
rotates among the drives.
o It consists of block-level striping with distributed parity.
o As in RAID 4, this level allows recovery from at most one disk failure. If more than one disk fails,
there is no way to recover the data.

Disk 0   Disk 1   Disk 2   Disk 3   Disk 4
  0        1        2        3        P0
  5        6        7        P1       4
  10       11       P2       8        9
  15       P3       12       13       14
  P4       16       17       18       19

This figure shows how the parity block rotates among the drives.

This level was introduced to make the random write performance better.
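The rotation in the table above can be captured by a small formula. This sketch assumes the layout shown in that table, where parity starts on the last disk and moves one disk to the left each stripe (other RAID 5 layouts rotate differently, and the function name is hypothetical):

```python
# Which disk holds the parity block for each stripe in a 5-disk RAID 5
# array, assuming the layout in the table above: parity starts on the
# last disk and shifts one disk to the left per stripe.

NUM_DISKS = 5

def parity_disk(stripe, num_disks=NUM_DISKS):
    return (num_disks - 1 - stripe) % num_disks

for s in range(NUM_DISKS):
    print(f"stripe {s}: parity on disk {parity_disk(s)}")
# stripe 0 -> disk 4 (P0), stripe 1 -> disk 3 (P1), ..., stripe 4 -> disk 0 (P4)
```

Because every disk stores some parity, writes (each of which must update a parity block) are spread across all drives instead of bottlenecking on one dedicated parity disk, which is what improves random write performance over RAID 4.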

Pros of RAID 5:

o This level is cost-effective and provides high performance.
o In this level, parity is distributed across the disks in the array.
o It improves random write performance.

Cons of RAID 5:

o In this level, disk failure recovery takes a longer time, as parity has to be recalculated from all the
surviving drives.
o This level cannot survive two concurrent drive failures.

RAID 6

o This level is an extension of RAID 5. It consists of block-level striping with two parity blocks per stripe.
o RAID 6 can survive two concurrent disk failures. With RAID 5, once one disk fails you must replace it
quickly, because if another disk fails before the rebuild completes, none of the data can be recovered.
RAID 6 removes this pressure: it can tolerate two concurrent disk failures before you run out of
options.

Disk 1   Disk 2   Disk 3   Disk 4
  A0       B0       Q0       P0
  A1       Q1       P1       D1
  Q2       P2       C2       D2
  P3       B3       C3       Q3

Pros of RAID 6:

o This level can survive two concurrent disk failures, giving it higher fault tolerance than RAID 5.
o Read performance is comparable to RAID 5, as data is still striped across the drives.

Cons of RAID 6:

o Write performance is lower than RAID 5, because two parity blocks must be computed and written for
every write.
o It requires a minimum of 4 drives, and the capacity of two drives is consumed by parity.
