0% found this document useful (0 votes)
42 views

File System

In computing, a file system is a method for storing and organizing computer files. File systems may use a data storage device such as a hard disk or CD-ROM. They may also be virtual and exist only as an access method for virtual data.
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
42 views

File System

In computing, a file system is a method for storing and organizing computer files. File systems may use a data storage device such as a hard disk or CD-ROM. They may also be virtual and exist only as an access method for virtual data.
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 27

File system

In computing, a file system (often


alsFo written as filesystem) is a
method for storing and organizing
computer files and the data they
contain to make it easy to find and
access them. File systems may use a
data storage device such as a hard
disk or CD-ROM and involve
maintaining the physical location of
the files, they might provide access to
data on a file server by acting as
clients for a network protocol (e.g.,
NFS, SMB, or 9P clients), or they may
be virtual and exist only as an access
method for virtual data (e.g., procfs).
More formally, a file system is a set of
abstract data types that are
implemented for the storage,
hierarchical organization,
manipulation, navigation, access, and
retrieval of data. File systems share
much in common with database
technology, but it is debatable
whether a file system can be classified
as a special-purpose database (DBMS).

Aspects of file systems


The most familiar file systems make
use of an underlying data storage
device that offers access to an array of
fixed-size blocks, sometimes called
sector, generally 512 bytes each. The
file system software is responsible for
organizing these sectors into files and
directories, and keeping track of which
sectors belong to which file and which
are not being used. Most file systems
address data in fixed-sized units called
"clusters" or "blocks" which contain a
certain number of disk sectors (usually
1-64). This is the smallest logical
amount of disk space that can be
allocated to hold a file.
However, file systems need not make
use of a storage device at all. A file
system can be used to organize and
represent access to any data, whether
it be stored or dynamically generated
(eg, from a network connection).
Whether the file system has an
underlying storage device or not, file
systems typically have directories
which associate file names with files,
usually by connecting the file name to
an index into a file allocation table of
some sort, such as the FAT in an MS-
DOS file system, or an inode in a Unix-
like file system. Directory structures
may be flat, or allow hierarchies where
directories may contain subdirectories.
In some file systems, file names are
structured, with special syntax for
filename extensions and version
numbers. In others, file names are
simple strings, and per-file metadata
is stored elsewhere.
Other bookkeeping information is
typically associated with each file
within a file system. The length of the
data contained in a file may be stored
as the number of blocks allocated for
the file or as an exact byte count. The
time that the file was last modified
may be stored as the file's timestamp.
Some file systems also store the file
creation time, the time it was last
accessed, and the time that the file's
meta-data was changed. (Note that
many early PC operating systems did
not keep track of file times.) Other
information can include the file's
device type (e.g., block, character,
socket, subdirectory, etc.), its owner
user-ID and group-ID, and its access
permission settings (e.g., whether the
file is read-only, executable, etc.).
The hierarchical file system was an
early research interest of Dennis
Ritchie of Unix fame; previous
implementations were restricted to
only a few levels, notably the IBM
implementations, even of their early
databases like IMS. After the success
of Unix, Ritchie extended the file
system concept to every object in his
later operating system developments,
such as Plan 9 and Inferno.
Traditional file systems offer facilities
to create, move and delete both files
and directories. They lack facilities to
create additional links to a directory
(hard links in Unix), rename parent
links (".." in Unix-like OS), and create
bidirectional links to files.
Traditional file systems also offer
facilities to truncate, append to,
create, move, delete and in-place
modify files. They do not offer facilities
to prepend to or truncate from the
beginning of a file, let alone arbitrary
insertion into or deletion from a file.
The operations provided are highly
asymmetric and lack the generality to
be useful in unexpected contexts. For
example, interprocess pipes in Unix
have to be implemented outside of the
file system because the pipes concept
does not offer truncation from the
beginning of files.
Secure access to basic file system
operations can be based on a scheme
of access control lists or capabilities.
Research has shown access control
lists to be difficult to secure properly,
which is why research operating
systems tend to use capabilities.
Commercial file systems still use
access control lists. see: secure
computing
Arbitrary attributes can be associated
on advanced file systems, such as
XFS, ext2/ext3, some versions of UFS,
and HFS+, using extended file
attributes. This feature is implemented
in the kernels of Linux, FreeBSD and
Mac OS X operating systems, and
allows metadata to be associated with
the file at the file system level. This,
for example, could be the author of a
document, the character encoding of a
plain-text document, or a checksum.

Types of file systems


File system types can be classified into
disk file systems, network file systems
and special purpose file systems.

Disk file systems


files on a data
A disk file system is a file system designed for the storage of

storage device, most commonly a disk drive, which might


be directly or indirectly connected to the computer. Examples of disk file systems include
FAT, FAT32, NTFS, HFS and HFS+, ext2, ext3,
ISO 9660, ODS-5, and UDF. Some disk file systems are
journaling file systems or versioning file
systems.

Flash file systems


A flash file system is a file system
designed for storing files on flash
memory devices. These are becoming
more prevalent as the number of
mobile devices is increasing, and the
capacity of flash memories catches up
with hard drives.
While a block device layer can
emulate a disk drive so that a disk file
system can be used on a flash device,
this is suboptimal for several reasons:
• Erasing blocks: Flash memory
blocks have to be explicitly erased
before they can be written to. The
time taken to erase blocks can be
significant, thus it is beneficial to
erase unused blocks while the
device is idle.
• Random access: Disk file systems
are optimized to avoid disk seeks
whenever possible, due to the high
cost of seeking. Flash memory
devices impose no seek latency.
• Wear levelling: Flash memory
devices tend to wear out when a
single block is repeatedly
overwritten; flash file systems are
designed to spread out writes
evenly.
Log-structured file systems have all
the desirable properties for a flash file
system. Such file systems include
JFFS2 and YAFFS.
Database file systems
A new concept for file management is
the concept of a database-based file
system. Instead of, or in addition to,
hierarchical structured management,
files are identified by their
characteristics, like type of file, topic,
author, or similar metadata. Example:
dbfs.
Transactional file systems
Each disk operation may involve
changes to a number of different files
and disk structures. In many cases,
these changes are related, meaning
that it is important that they all be
executed at the same time. Take for
example a bank sending another bank
some money electronically. The bank's
computer will "send" the transfer
instruction to the other bank and also
update its own records to indicate the
transfer has occurred. If for some
reason the computer crashes before it
has had a chance to update its own
records, then on reset, there will be no
record of the transfer but the bank will
be missing some money.
Transaction processing introduces the
guarantee that at any point while it is
running, a transaction can either be
finished completely or reverted
completely (though not necessarily
both at any given point). This means
that if there is a crash or power failure,
after recovery, the stored state will be
consistent. (Either the money will be
transferred or it will not be
transferred, but it won't ever go
missing "in transit".)
This type of file system is designed to
be fault tolerant, but may incur
additional overhead to do so.
Journaling file systems are one
technique used to introduce
transaction-level consistency to
filesystem structures.

Network file systems


A network file system is a file system
that acts as a client for a remote file
access protocol, providing access to
files on a server. Examples of network
file systems include clients for the
NFS, SMB protocols, and file-system-
like clients for FTP and WebDAV.

Special purpose file systems


A special purpose file system is
basically any file system that is not a
disk file system or network file system.
This includes systems where the files
are arranged dynamically by software,
intended for such purposes as
communication between computer
processes or temporary file space.
Special purpose file systems are most
commonly used by file-centric
operating systems such as Unix.
Examples include the procfs (/proc)
file system used by some Unix
variants, which grants access to
information about processes and other
operating system features.
Deep space science exploration craft,
like Voyager I & II used digital tape
based special file systems. Most
modern space exploration craft like
Cassini-Huygens used Real-time
operating system file systems or RTOS
influenced file systems. The Mars
Rovers are one such example of an
RTOS file system, important in this
case because they are implemented in
flash memory.

File systems and operating


systems
Most operating systems provide a file
system, as a file system is an integral
part of any modern operating system.
Early microcomputer operating
systems' only real task was file
management — a fact reflected in
their names (see DOS). Some early
operating systems had a separate
component for handling file systems
which was called a disk operating
system. On some microcomputers, the
disk operating system was loaded
separately from the rest of the
operating system. On early operating
systems, there was usually support for
only one, native, unnamed file system;
for example, CP/M supports only its
own file system, which might be called
"CP/M file system" if needed, but
which didn't bear any official name at
all.
Because of this, there needs to be an
interface provided by the operating
system software between the user and
the file system. This interface can be
textual (such as provided by a
command line interface, such as the
Unix shell, or OpenVMS DCL) or
graphical (such as provided by a
graphical user interface, such as file
browsers). If graphical, the metaphor
of the folder, containing documents,
other files, and nested folders is often
used (see also: directory and folder).
Flat file systems
In a flat file system, there are no
subdirectories—everything is stored at
the same (root) level on the media, be
it a hard disk, floppy disk, etc. While
simple, this system rapidly becomes
inefficient as the number of files
grows, and makes it difficult for users
to organise data into related groups.
File systems under Unix-like
operating systems
Unix-like systems assign a device
name to each device, but this is not
how the files on that device are
accessed. Instead, to gain access to
files on another device, you must first
inform the operating system where in
the directory tree you would like those
files to appear. This process is called
mounting a file system. For example,
to access the files on a CD-ROM, one
must tell the operating system "Take
the file system from this CD-ROM and
make it appear under such-and-such
directory". The directory given to the
operating system is called the mount
point - it might, for example, be
/media. The /media directory exists on
many Unix systems (as specified in
the Filesystem Hierarchy Standard)
and is intended specifically for use as
a mount point for removable media
such as CDs, DVDs and like floppy
disks. It may be empty, or it may
contain subdirectories for mounting
individual devices. Generally, only the
administrator (i.e. root user) may
authorize the mounting of file
systems.
Unix-like operating systems often
include software and tools that assist
in the mounting process and provide it
new functionality. Some of these
strategies have been coined "auto-
mounting" as a reflection of their
purpose.
1. In many situations, file systems
other than the root need to be
available as soon as the operating
system has booted. All Unix-like
systems therefore provide a facility
for mounting file systems at boot
time. System administrators define
these file systems in the
configuration file fstab, which also
indicates options and mount points.
2. In some situations, there is no need
to mount certain file systems at
boot time, although their use may
be desired thereafter. There are
some utilities for Unix-like systems
that allow the mounting of
predefined file systems upon
demand.
3. Removable media have become
very common with microcomputer
platforms. They allow programs
and data to be transferred between
machines without a physical
connection. Common examples
include USB flash drives, CD-ROMs
and DVDs. Utilities have therefore
been developed to detect the
presence and availability of a
medium and then mount that
medium without any user
intervention.
4. Progressive Unix-like systems have
also introduced a concept called
supermounting; see, for example,
the Linux supermount-ng project.
For example, a floppy disk that has
been supermounted can be
physically removed from the
system. Under normal
circumstances, the disk should
have been synchronised and then
unmounted before its removal.
Provided synchronisation has
occurred, a different disk can be
inserted into the drive. The system
automatically notices that the disk
has changed and updates the
mount point contents to reflect the
new medium. Similar functionality
is found on standard Windows
machines.
5. A similar innovation preferred by
some users is the use of autofs, a
system that, like supermounting,
eliminates the need for manual
mounting commands. The
difference from supermount, other
than compatibility in an apparent
greater range of applications such
as access to file systems on
network servers, is that devices are
mounted transparently when
requests to their file systems are
made, as would be appropriate for
file systems on network servers,
rather than relying on events such
as the insertion of media, as would
be appropriate for removable
media.
File systems under Mac OS X
Mac OS X uses a file system that it
inherited from classic Mac OS called
HFS Plus. HFS Plus is a metadata-rich
and case preserving file system. Due
to the Unix roots of Mac OS X, Unix
permissions were added to HFS Plus.
Later versions of HFS Plus added
journaling to prevent corruption of the
file system structure and introduced a
number of optimizations to the
allocation algorithms in an attempt to
defragment files automatically without
requiring an external defragmenter.
Filenames can be up to 255
characters. HFS Plus uses Unicode to
store filenames. On Mac OS X, the
filetype can come from the type code,
stored in file's metadata, or the
filename.
HFS Plus has three kinds of links: Unix-
style hard links, Unix-style symbolic
links and aliases. Aliases are designed
to maintain a link to their original file
even if they are moved or renamed;
they are not interpreted by the file
system itself, but by the File Manager
code in userland.
Mac OS X also supports the UFS file
system, derived from the BSD Unix
Fast File System via NeXTSTEP.
File systems under Plan 9 from
Bell Labs
Plan 9 from Bell Labs was originally
designed to extend some of Unix's
good points, and to introduce some
new ideas of its own while fixing the
shortcomings of Unix.
With respect to file systems, the Unix
system of treating things as files was
continued, but in Plan 9, everything is
treated as a file, and accessed as a file
would be (i.e., no ioctl or mmap).
Perhaps surprisingly, while the file
interface is made universal it is also
simplified considerably, for example
symlinks, hard links and suid are made
obsolete, and an atomic create/open
operation is introduced. More
importantly the set of file operations
becomes well defined and subversions
of this like ioctl are eliminated.
Secondly, the underlying 9P protocol
was used to remove the difference
between local and remote files (except
for a possible difference in latency).
This has the advantage that a device
or devices, represented by files, on a
remote computer could be used as
though it were the local computer's
own device(s). This means that under
Plan 9, multiple file servers provide
access to devices, classing them as
file systems. Servers for "synthetic"
file systems can also run in user space
bringing many of the advantages of
micro kernel systems while
maintaining the simplicity of the
system.
Everything on a Plan 9 system has an
abstraction as a file; networking,
graphics, debugging, authentication,
capabilities, encryption, and other
services are accessed via I-O
operations on file descriptors. For
example, this allows the use of the IP
stack of a gateway machine without
need of NAT, or provides a network-
transparent window system without
the need of any extra code.
Another example: a Plan-9 application
receives FTP service by opening an
FTP site. The ftpfs server handles the
open by essentially mounting the
remote FTP site as part of the local file
system. With ftpfs as an intermediary,
the application can now use the usual
file-system operations to access the
FTP site as if it were part of the local
file system. A further example is the
mail system which uses file servers
that synthesize virtual files and
directories to represent a user mailbox
as /mail/fs/mbox. The wikifs provides a
file system interface to a wiki.
These file systems are organized with
the help of private, per-process
namespaces, allowing each process to
have a different view of the many file
systems that provide resources in a
distributed system.
File systems under Microsoft
Windows
Windows makes use of the FAT and
NTFS (New Technology File System)
file systems.
The FAT (File Allocation Table) filing
system, supported by all versions of
Microsoft Windows, was an evolution
of that used in Microsoft's earlier
operating system (MS-DOS which in
turn was based on 86-DOS). FAT
ultimately traces its roots back to the
shortlived M-DOS project and
Standalone disk BASIC before it. Over
the years various features have been
added to it, inspired by similar
features found on file systems used by
operating systems such as Unix.
Older versions of the FAT file system
(FAT12 and FAT16) had file name
length limits, a limit on the number of
entries in the root directory of the file
system and had restrictions on the
maximum size of FAT-formatted disks
or partitions. Specifically, FAT12 and
FAT16 had a limit of 8 characters for
the file name, and 3 characters for the
extension. This is commonly referred
to as the 8.3 filename limit. VFAT,
which was an extension to FAT12 and
FAT16 introduced in Windows NT 3.5
and subsequently included in Windows
95, allowed long file names (LFN).
FAT32 also addressed many of the
limits in FAT12 and FAT16, but
remains limited compared to NTFS.
NTFS, introduced with the Windows NT
operating system, allowed ACL-based
permission control. Hard links,
multiple file streams, attribute
indexing, quota tracking, compression
and mount-points for other file
systems (called "junctions") are also
supported, though not all these
features are well-documented.
Unlike many other operating systems,
Windows uses a drive letter
abstraction at the user level to
distinguish one disk or partition from
another. For example, the path
C:\WINDOWS represents a directory
WINDOWS on the partition
represented by the letter C. The C
drive is most commonly used for the
primary hard disk partition, on which
Windows is installed and from which it
boots. This "tradition" has become so
firmly ingrained that bugs came about
in older versions of Windows which
made assumptions that the drive that
the operating system was installed on
was C. The tradition of using "C" for
the drive letter can be traced to MS-
DOS, where the letters A and B were
reserved for up to two floppy disk
drives; in a common configuration, A
would be the 3½-inch floppy drive,
and B the 5¼-inch one. Network drives
may also be mapped to drive letters.

You might also like