NTFS and FAT
NTFS
Developer Microsoft
Partition identifier EBD0A0A2-B9E5-4433-87C0-68B6B72699C7 (GPT)
Structures
Limits
Max file size 2^64 bytes (16 EiB) minus 1 KiB [2]
Allowed characters in filenames In the POSIX namespace, any UTF-16 code unit (case sensitive)
except U+0000 (NUL) and / (slash). In the Win32 namespace, any UTF-16 code unit (case insensitive)
except U+0000 (NUL), / (slash), \ (backslash), : (colon), * (asterisk), ? (question mark), " (quote),
< (less than), > (greater than) and | (pipe) [3]
Features
Date range 1 January 1601 – 28 May 60056 (file times are 64-bit numbers counting
100-nanosecond intervals (ten million per second) since 1601, which spans more than 58,000 years)
Supported operating systems Windows NT family (Windows NT 3.1 to Windows NT 4.0, Windows
2000, Windows XP, Windows Server 2003, Windows Vista, Windows Server 2008)
NTFS is the standard file system of Windows NT, including its later versions
Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, and
Windows Vista.[4]
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s
“Windows”-branded operating systems. NTFS has several improvements over
FAT and HPFS (High Performance File System) such as improved support for
metadata and the use of advanced data structures to improve performance,
reliability, and disk space utilization, plus additional extensions such as security
access control lists (ACL) and file system journaling. The file system specification
is a trade secret,[5][6][7] although it can be licensed commercially from
Microsoft through their Intellectual Property licensing program.[8][9]
Contents
* 1 History
* 2 Versions
* 3 Features
o 3.2 Quotas
* 4 Interoperability
o 4.1 Linux
o 4.2 Windows
o 4.3 Others
* 5 Internals
o 5.1 Metafiles
o 5.2 Resident vs. non-resident files
* 6 Limitations
* 7 Developers
* 8 See also
* 9 References
* 10 External links
History
In the mid-1980s Microsoft and IBM formed a joint project to create the next-generation graphical
operating system. The result of the project was OS/2, but eventually Microsoft and IBM disagreed on
many important issues and separated. OS/2 remained an IBM project. Microsoft started to work on
Windows NT. The OS/2 filesystem HPFS contained several important new features. When Microsoft
created their new operating system, they borrowed many of these concepts for NTFS.[10] Probably as a
result of this common ancestry, HPFS and NTFS share the same disk partition identification type code
(07). Sharing an ID is unusual since there were dozens of available codes, and other major filesystems
have their own code. FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms
which identify the filesystem in a partition type 07 must perform additional checks.
Versions
* v1.2 written by NT 3.51 (mid-1995) and NT 4 (mid-1996) (occasionally referred to as "NTFS 4.0",
because OS version is 4.0)
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT
3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support.
[11] V1.2 supports compressed files, named streams, ACL-based security, etc.[12] V3.0 added disk
quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend
folder and its files, and reorganized security descriptors so that multiple files which use the same
security setting can share the same descriptor.[13] V3.1 expanded the Master File Table (MFT) entries
with a redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, and self-healing functionality.[14]
These features owe more to additional functionality of the operating system than to the filesystem
itself, and the NTFS version number was not changed.
Features
NTFS v3.0, the third version of NTFS to be introduced, includes several new features over its
predecessors: disk usage quotas, sparse file support, reparse points, distributed link tracking, and file-
level encryption, also known as the Encrypting File System (EFS).
Alternate data streams (ADS) allow files to be associated with more than one data stream. For example, a
file such as text.txt can have an ADS with the name of text.txt:secret (of form filename:streamname)
that can only be accessed by knowing the ADS name or by specialized directory browsing programs.
Alternate streams are not reflected in the original file's size, but they are lost when the original file (i.e.
text.txt) is deleted with a DeleteFile or DeleteFileTransacted call (or a call that uses those calls), or
when the file is copied or moved to a partition that doesn't support ADS (e.g. a FAT partition, a floppy
disk, or a network share). While ADS is a useful feature, streams that are forgotten or go undetected
can also easily eat up hard disk space.
Quotas
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a
version of Windows that supports NTFS to set a threshold of disk space that users may utilize. They also
allow administrators to keep track of how much disk space each user is using. An administrator may
specify a certain level of disk space that a user may use before they receive a warning, and then deny
access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's
transparent file compression, should this be enabled. Applications that query the amount of free space
will see the amount of free space left to the user who has a quota applied to them.
Sparse files are files which contain sparse data sets, that is, data mostly filled with zeroes. Many scientific
applications can generate very large sparse data sets. Because of this, Microsoft has implemented
support for efficient storage of sparse files by allowing an application to specify regions of empty (zero)
data. An application that reads a sparse file reads it in the normal manner, with the file system
calculating what data should be returned based upon the file offset. As with compressed files, the actual
size of sparse files is not taken into account when determining quota limits.[15][16]
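The read behaviour of a sparse file can be sketched with a small in-memory model. This is only an illustration, not the NTFS on-disk format: the class name SparseFile, the 4 KiB block size and the dictionary of allocated blocks are assumptions made for the example. Only blocks that were actually written are stored, and reads from unallocated regions return zeroes derived from the file offset.

```python
class SparseFile:
    """Toy model of a sparse file: only written blocks consume storage."""

    def __init__(self, size, block_size=4096):
        self.size = size              # logical file size in bytes
        self.block_size = block_size  # hypothetical allocation unit
        self.blocks = {}              # block index -> bytearray (allocated blocks only)

    def write(self, offset, data):
        for i, byte in enumerate(data):
            index, within = divmod(offset + i, self.block_size)
            block = self.blocks.setdefault(index, bytearray(self.block_size))
            block[within] = byte
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        end = min(offset + length, self.size)
        out = bytearray()
        for pos in range(offset, end):
            index, within = divmod(pos, self.block_size)
            block = self.blocks.get(index)
            # Unallocated regions read back as zero without being stored.
            out.append(block[within] if block is not None else 0)
        return bytes(out)

    def allocated_bytes(self):
        return len(self.blocks) * self.block_size


f = SparseFile(size=1 << 30)   # 1 GiB logical size
f.write(1_000_000, b"data")    # touches a single block
```

In this model the logical size is 1 GiB while allocated_bytes() reports a single block, which is the distinction between logical and actual size that the quota remark above refers to.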
Reparse points were introduced in NTFS v3. They are used by associating a reparse tag in the user space
attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file
system name lookup and encounters a reparse attribute, it knows to reparse the name lookup, passing
the user-controlled reparse data to every file system filter driver that is loaded into Windows 2000. Each
filter driver examines the reparse data to see if it is associated with that reparse point, and if that filter
driver determines a match then it intercepts the file system call and executes its special functionality.
Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage
Management, Native Structured Storage and Single Instance Storage.
Directory junctions are similar to volume mount points, but they reference other directories in the file
system instead of other volumes. For instance, the directory C:\exampledir with a directory junction
attribute that contains a link to D:\linkeddir will automatically refer to the directory D:\linkeddir when it
is accessed by a user-mode application. [17] This function is conceptually similar to symbolic links to
directories in Unix except that the target in NTFS must always be another directory. (Typical Unix file
systems allow the target of a symbolic link to be any type of file.)
Originally included to support the POSIX subsystem in Windows NT[18], hard links are similar to
directory junctions, but used for files instead of directories. Hard links can only be applied to files on the
same volume since an additional filename record is added to the file's MFT record. Short (8.3) filenames
are also implemented as additional filename records that don't have separate directory entries.
Hierarchical Storage Management is a means of transferring files that are not used for some period of
time to less expensive storage media. When the file is next accessed the reparse point on that file
determines that it is needed and retrieves it from storage.
Native Structured Storage (NSS) was an ActiveX document storage technology that has since been discontinued by Microsoft. It
allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally.
An NSS file system filter was loaded and used to process the multiple streams transparently to the
application, and when the file was transferred to a non-NTFS formatted disk volume it would also
transfer the multiple streams into a single stream.[19]
The Volume Shadow Copy (VSC) service keeps historical versions of files and folders on NTFS volumes
by copying old, newly overwritten data to a shadow copy (copy-on-write). The old file data is overlaid on
the new when the user requests a revert to an earlier version. This also allows data backup programs to
archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends
setting up a shadow copy volume on a separate disk to reduce the I/O load on the main volume.
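The copy-on-write idea can be sketched as follows. This is a simplified model under stated assumptions, not the actual Volume Shadow Copy implementation: the class ShadowedVolume, its block list and its method names are hypothetical. Before a block is overwritten for the first time after a snapshot, its old contents are saved; a revert overlays the saved data on the current blocks.

```python
class ShadowedVolume:
    """Toy copy-on-write snapshot over a list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.shadow = None  # block index -> contents at snapshot time

    def snapshot(self):
        self.shadow = {}

    def write(self, index, data):
        # Save the old contents only on the first overwrite after the snapshot.
        if self.shadow is not None and index not in self.shadow:
            self.shadow[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        if self.shadow is None:
            raise RuntimeError("no snapshot has been taken")
        # Unchanged blocks are read straight from the live volume.
        return self.shadow.get(index, self.blocks[index])

    def revert(self):
        if self.shadow is None:
            raise RuntimeError("no snapshot has been taken")
        for index, old in self.shadow.items():
            self.blocks[index] = old
        self.shadow = {}


volume = ShadowedVolume([b"a", b"b", b"c"])
volume.snapshot()
volume.write(1, b"X")
volume.write(1, b"Y")  # second overwrite does not touch the shadow copy
```

Only blocks that change after the snapshot cost extra space, which is why a backup program can read a consistent earlier state while the live volume keeps changing.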
NTFS can compress files using a variant of the LZ77 algorithm (also used in the popular ZIP file format).
[20] Although read-write access to compressed files is transparent, Microsoft recommends avoiding
compression on server systems and/or network shares holding roaming profiles because it puts a
considerable load on the processor.[21]
Single-user systems with limited hard disk space will probably use NTFS compression successfully.
[citation needed] The slowest link in a computer is usually not the CPU but the hard drive, so NTFS
compression allows the limited, slow storage space to be better used, in terms of both space and (often)
speed.[22] NTFS compression can also serve as a replacement for sparse files when a program (e.g. a
download manager) is not able to create files without content as sparse files.
When there are several directories that have different, but similar, files, some of these files may have
identical content. Single instance storage (SIS) allows identical files to be merged into one file, with
references created to that merged file. SIS consists of a file system filter that manages copies, modification and
merges to files; and a user space service (or groveler) that searches for files that are identical and need
merging. SIS was mainly designed for remote installation servers as these may have multiple installation
images that contain many identical files; SIS allows these to be consolidated but, unlike for example hard
links, each file remains distinct; changes to one copy of a file will leave others unaltered. This is similar to
copy-on-write, which is a technique by which memory copying is not really done until one copy is
modified.[23]
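The search for merge candidates can be illustrated by grouping files by a hash of their content. This is a sketch only: the text does not say how the groveler detects identical files, so the use of SHA-256 and the function name find_identical are assumptions, and the file names are made up.

```python
import hashlib
from collections import defaultdict


def find_identical(files):
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a file name to its content as bytes. Hash-based grouping
    is an assumed technique here, not a description of the SIS groveler.
    """
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Keep only groups with more than one member: these could be merged.
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)


images = {
    "image1/kernel": b"same bytes",
    "image2/kernel": b"same bytes",
    "image1/readme": b"first",
    "image2/readme": b"second",
}
groups = find_identical(images)
```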
EFS provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS
works in conjunction with the EFS service, Microsoft's CryptoAPI and the EFS File System Run-Time
Library (FSRTL).
EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or
FEK), which is used because it takes less time to encrypt and decrypt large amounts of data than an
asymmetric key cipher would. The symmetric key that is used to encrypt the
file is then encrypted with a public key that is associated with the user who encrypted the file, and this
encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file
system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It
then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is
transparent to the user.[24] Also, in case of a user losing access to their key, support for additional
decryption keys has been built in to the EFS system, so that a recovery agent can still access the files if
needed.
Symbolic links were introduced in Windows Vista. Symbolic links (or soft links) are resolved on the
client side, so when a symbolic link is shared, the target is subject to the access restrictions on the client,
and not the server.
As of Windows Vista, applications can use Transactional NTFS to group changes to files together into a
transaction. The transaction will guarantee that all changes happen, or none of them do, and it will
guarantee that applications outside the transaction will not see the changes until the precise instant
they're committed.[25]
USN Journal
Interoperability
Details on the implementation's internals are closed, which makes it difficult for third-party vendors to
provide tools to handle NTFS.
Linux
Full and safe read/write of NTFS is provided by the NTFS-3G driver. It is included in most Linux
distributions. Other outdated and mostly read-only solutions exist as well:
* Linux kernel 2.2: NTFS partitions can be read by the kernel since version 2.2.0.
* Linux kernel 2.6: contains a driver written by Anton Altaparmakov (University of Cambridge) and
Richard Russon. It supports file read, overwrite and resize, in some cases.
* NTFSMount: A userspace driver with limited file and directory read/write support is available using
ntfsmount[26]
* NTFS for Linux: A commercial driver with full read/write support available from Paragon.
* Captive NTFS: A 'wrapping' driver which uses Windows' own driver, ntfs.sys.
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the
Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code
to save and retrieve data. Almost all drivers listed above (except Paragon NTFS for Linux) are open
source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver
and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
Windows
While the different NTFS versions have a great degree of both forward and backward compatibility,
there are technical considerations for mounting newer NTFS volumes in older versions of Windows. This
affects dual-booting, and external portable hard drives.
For example, "Previous Versions" (a.k.a. Volume Shadow Copy) are lost because the older OS doesn't
understand how to keep the new features' data updated.[27]
Others
eComStation, KolibriOS, and Mac OS X versions 10.3 and later offer read-only NTFS support (there is a
beta NTFS driver that allows write/delete for eComStation, but is generally considered unsafe). A free
third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. NTFS-3G also
works on Mac OS X, FreeBSD, NetBSD, Solaris and Haiku besides Linux. A commercial read/write driver
for DOS called "NTFS4DOS" also exists.[28] A commercial solution for Mac OS X with read/write access is
"Paragon NTFS for Mac OS X".[29]
Microsoft currently provides a tool (convert.exe) to convert HPFS (only on Windows NT 3), FAT16 and,
on Windows 2000 and higher, FAT32 to NTFS, but not the other way around.[30] Various third-party
tools are all capable of safely resizing NTFS partitions. Microsoft added the ability to shrink or expand a
partition with Windows Vista, but this capability is limited because it will not relocate the master file
table, thus limiting the ability to shrink a partition to roughly half of its original size. [31]
For historical reasons, the versions of Windows that do not support NTFS all keep time internally as local
zone time, and therefore so do all file systems other than NTFS that are supported by current versions of
Windows. However, Windows NT and its descendants keep internal timestamps as UTC and make the
appropriate conversions for display purposes. Therefore, NTFS timestamps are in UTC. This means that
when files are copied or moved between NTFS and non-NTFS partitions, the OS needs to convert
timestamps on the fly. But if some files are moved when daylight saving time (DST) is in effect, and other
files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a
result, especially shortly after one of the days on which local zone time changes, users may observe that
some files have timestamps that are incorrect by one hour. Due to the differences in implementation of
DST between the northern and southern hemispheres, this can result in a potential timestamp error of
up to 4 hours in any given 12 months.[32]
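The one-hour discrepancy can be reproduced with a small calculation. The sketch below assumes that a local-time stamp is converted to UTC using whichever offset is active when the copy happens, rather than the offset that applied when the file was written. The zone offsets (UTC-5 standard, UTC-4 daylight) and the sample date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zone: UTC-5 in standard time, UTC-4 during daylight saving time.
STANDARD = timezone(timedelta(hours=-5))
DAYLIGHT = timezone(timedelta(hours=-4))


def local_stamp_to_utc(naive_local, active_offset):
    """Interpret a local wall-clock stamp using the offset active at copy time."""
    return naive_local.replace(tzinfo=active_offset).astimezone(timezone.utc)


# A local-time stamp as stored on a non-NTFS volume (written in winter).
stamp = datetime(2008, 1, 15, 12, 0)

copied_in_winter = local_stamp_to_utc(stamp, STANDARD)  # 17:00 UTC
copied_in_summer = local_stamp_to_utc(stamp, DAYLIGHT)  # 16:00 UTC
difference = copied_in_winter - copied_in_summer
```

The same stored stamp yields two different UTC values depending on when the copy is made, which is the ambiguity users see as a one-hour error.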
Internals
In NTFS, all file data—file name, creation date, access permissions, and contents—are stored as
metadata in the Master File Table. This abstract approach allowed easy addition of file system features
during Windows NT's development — an interesting example is the addition of fields for indexing used
by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names,
etc.). This means UTF-16 codepoints are supported, but the file system does not check whether a
sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode
standard).
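The difference between "any sequence of 16-bit values" and valid UTF-16 can be shown with an unpaired surrogate. The helper name is_valid_utf16 is hypothetical; the check relies only on Python's strict utf-16-le decoder, which rejects sequences that NTFS itself would store without complaint.

```python
import struct


def is_valid_utf16(units):
    """Return True if the 16-bit values form well-formed UTF-16."""
    raw = struct.pack(f"<{len(units)}H", *units)
    try:
        raw.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


plain_name = [0x0041, 0x0042]      # "AB"
surrogate_pair = [0xD83D, 0xDE00]  # a valid high/low surrogate pair
lone_surrogate = [0xD800, 0x0041]  # high surrogate not followed by a low one
```

A tool that reads NTFS names therefore cannot assume that every name decodes cleanly, and should decide up front how to handle names like the third one.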
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows
faster file look up times in most cases. A file system journal is used to guarantee the integrity of the file
system metadata but not individual files' content. Systems using NTFS are known to have improved
reliability compared to FAT file systems.[33]
The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS
volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which
minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record
number representing the file in the Master File Table. The file ID also contains a reuse count to detect
stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically
differ.
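A file ID that combines a record number with a reuse count can be sketched as a packed integer. The 48-bit record number and 16-bit reuse count used below are the commonly documented layout of an NTFS file reference, but the text above does not state these widths, so treat them and the function names as assumptions.

```python
RECORD_BITS = 48    # assumed width of the MFT record number
SEQUENCE_BITS = 16  # assumed width of the reuse count


def make_file_id(record_number, reuse_count):
    if not 0 <= record_number < (1 << RECORD_BITS):
        raise ValueError("record number out of range")
    if not 0 <= reuse_count < (1 << SEQUENCE_BITS):
        raise ValueError("reuse count out of range")
    return (reuse_count << RECORD_BITS) | record_number


def split_file_id(file_id):
    """Return (record_number, reuse_count)."""
    return file_id & ((1 << RECORD_BITS) - 1), file_id >> RECORD_BITS


def is_stale(file_id, current_reuse_count):
    # A directory entry is stale if the record has been reused since it was written.
    return split_file_id(file_id)[1] != current_reuse_count


file_id = make_file_id(record_number=5, reuse_count=3)
```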
Metafiles
NTFS contains several files which define and organize the file system. In all respects, most of these files
are structured like any other user file ($Volume being the most peculiar), but are not of direct interest to
file system clients. These metafiles define files, back up critical file system data, buffer file system
changes, manage free space allocation, satisfy BIOS expectations, track bad allocation units, and store
security and disk space usage information.
0 $MFT Describes all files on the volume, including file names, timestamps, stream names and
lists of cluster numbers where data streams reside, indexes, security identifiers, and file attributes like
"read only", "compressed", "encrypted", etc.
1 $MFTMirr A duplicate of the first vital entries of $MFT, usually 4 entries (4 KiB).
2 $LogFile Contains the transaction log of file system changes for metadata consistency.
3 $Volume Contains information about the volume, namely the volume object identifier,
volume label, file system version, and volume flags (mounted, chkdsk requested, requested $LogFile
resize, mounted on NT 4, volume serial number updating, structure upgrade request). The volume serial
number is in the $Boot file.
4 $AttrDef A table of the NTFS attributes in use, with names, numbers and descriptions.
5 . Root directory.
6 $Bitmap A table of bit entries representing whether a particular cluster on the volume is used or
free.
7 $Boot Volume boot record. This file, located at the first cluster on the volume, includes bootstrap
code (used to find and launch NTLDR/BOOTMGR) and a BIOS parameter block including the volume serial
number and the cluster numbers of $MFT and $MFTMirr.
8 $BadClus A file which contains all the clusters marked as having bad sectors. This file
simplifies cluster management by the chkdsk utility, both as a place to put newly discovered bad sectors,
and for identifying unreferenced clusters.
9 $Secure Access control list database which reduces the overhead of having many identical ACLs
stored with each file, by uniquely storing these ACLs in this database only (contains two indices, $SII:
perhaps[citation needed] Security ID Index and $SDH: Security Descriptor Hash, which index the stream
named $SDS containing the actual ACL table).[34]
12 ... 23 Reserved.
These metafiles are treated specially by NTFS and are difficult to directly view: special purpose-built
tools are needed.
To optimize storage for the common case of small data files, NTFS prefers to place file data within the
master file table if it fits instead of using MFT space to list clusters containing the data. The former is
called "resident data" by computer forensics workers. The amount of data which fits is highly dependent
on the file's characteristics, but 700 to 800 bytes is common in single-stream files with non-lengthy
filenames and no ACLs. Encrypted-by-NTFS, sparse, or compressed files cannot be resident.
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume
to contain more files than there are clusters. For example, an 80 GB (74.5 GiB) partition
NTFS formats with 19,543,064 clusters of 4 KiB. Subtracting system files (64 MiB log file, a 2,442,888-
byte $Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and
indices. Since there are four MFT records per cluster, this volume theoretically could hold almost 4 ×
19,526,158 = 78,104,632 resident files.
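The arithmetic in this example can be checked directly. The cluster counts are taken from the text as given; the 1 KiB MFT record size follows from "four MFT records per cluster" at 4 KiB per cluster. The rounding of the $Bitmap size up to a multiple of 8 bytes is an assumption that happens to reproduce the quoted figure.

```python
CLUSTER_BYTES = 4 * 1024
MFT_RECORD_BYTES = 1024       # implied by four records per 4 KiB cluster
TOTAL_CLUSTERS = 19_543_064   # from the text
FREE_CLUSTERS = 19_526_158    # from the text, after subtracting system files


def bitmap_bytes(total_clusters, alignment=8):
    """One bit per cluster, rounded up to `alignment` bytes (assumed rounding)."""
    raw = -(-total_clusters // 8)             # ceiling division: bits to bytes
    return -(-raw // alignment) * alignment   # round up to the alignment


records_per_cluster = CLUSTER_BYTES // MFT_RECORD_BYTES
max_resident_files = records_per_cluster * FREE_CLUSTERS
```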
Limitations
Though the file system supports paths up to about 32767 Unicode characters[35] with each path
component (directory or filename) up to 255 characters[35] long, certain names are unusable, since
NTFS stores its metadata in regular (albeit hidden and for the most part inaccessible) files; accordingly,
user files cannot use these names. These files are all in the root directory of a volume (and are reserved
only for that directory). The names are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap,
$Boot, $BadClus, $Secure, $Upcase, and $Extend;[2] . (dot) and $Extend are both directories; the others
are files.
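The naming rules stated in this article can be collected into a simple check. The sketch covers only those rules: the 255-character component limit, the characters excluded from the Win32 namespace in the infobox, and the metafile names reserved in the root directory. The function name is_usable_name is hypothetical, comparing reserved names case-insensitively is an assumption, and real Windows systems apply further restrictions that are not modelled here.

```python
WIN32_FORBIDDEN = set('\x00/\\:*?"<>|')
ROOT_METAFILES = frozenset(
    name.lower()
    for name in (
        "$MFT", "$MFTMirr", "$LogFile", "$Volume", "$AttrDef", ".",
        "$Bitmap", "$Boot", "$BadClus", "$Secure", "$Upcase", "$Extend",
    )
)
MAX_COMPONENT_UNITS = 255


def is_usable_name(name, in_root=False):
    """Check one path component against the rules listed in the article."""
    # Length is counted in UTF-16 code units, not Python code points.
    units = len(name.encode("utf-16-le", "surrogatepass")) // 2
    if units == 0 or units > MAX_COMPONENT_UNITS:
        return False
    if any(ch in WIN32_FORBIDDEN for ch in name):
        return False
    # Metafile names are reserved only in the root directory of a volume.
    if in_root and name.lower() in ROOT_METAFILES:
        return False
    return True
```

For example, is_usable_name("$MFT") is accepted in a subdirectory but rejected when in_root=True.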
In theory, the maximum NTFS volume size is 2^64 - 1 clusters. However, the maximum NTFS volume size
as implemented in Windows XP Professional is 2^32 - 1 clusters. For example, using 64 KiB clusters, the
maximum NTFS volume size is 256 TiB minus 64 KiB. Using the default cluster size of 4 KiB, the maximum
NTFS volume size is 16 TiB minus 4 KiB. Because partition tables on master boot record (MBR) disks only
support partition sizes up to 2 TiB, dynamic or GPT volumes must be used to create bootable NTFS
volumes over 2 TiB.
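The volume size figures follow from multiplying the cluster limit by the cluster size. A quick check, with the helper name max_volume_bytes chosen for illustration:

```python
KIB = 1024
TIB = 1024 ** 4


def max_volume_bytes(cluster_bytes, max_clusters=2**32 - 1):
    """Largest volume for a given cluster size under the implemented cluster limit."""
    return max_clusters * cluster_bytes


with_64k_clusters = max_volume_bytes(64 * KIB)
with_4k_clusters = max_volume_bytes(4 * KIB)
```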
Maximum file size: theoretical, 16 EiB minus 1 KiB (2^64 - 2^10 bytes); implementation, 16 TiB minus 64 KiB
(2^44 - 2^16 bytes).
Windows system calls may—or may not—handle alternate data streams.[2] Depending on the
operating system, utility and remote file system, a file transfer might silently strip data streams.[2] A
safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow
programs to enumerate streams, to verify whether each stream should be written to the destination
volume and to knowingly skip offending streams.[2]
An absolute path may be up to 32767 characters[35] long; a relative path is limited to 255 characters.
Date range
NTFS uses the same time reckoning as Windows NT: 64-bit timestamps with a range from January 1,
1601 to May 28, 60056 at a resolution of ten million ticks per second.
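The end of the date range can be derived from the tick count. Python's date type stops at year 9999, so the sketch below uses the fact that the Gregorian calendar repeats every 400 years (146,097 days): it strips whole cycles, resolves the remainder with the standard library, and adds the years back. The function name filetime_to_ymd is hypothetical.

```python
from datetime import date, timedelta

TICKS_PER_SECOND = 10_000_000  # 100-nanosecond intervals
SECONDS_PER_DAY = 86_400
DAYS_PER_400_YEARS = 146_097   # length of one Gregorian cycle


def filetime_to_ymd(ticks):
    """Convert a tick count since 1 January 1601 to (year, month, day)."""
    days = ticks // (TICKS_PER_SECOND * SECONDS_PER_DAY)
    cycles, remaining_days = divmod(days, DAYS_PER_400_YEARS)
    # The remainder is under 400 years, so it fits in datetime.date.
    d = date(1601, 1, 1) + timedelta(days=remaining_days)
    return d.year + 400 * cycles, d.month, d.day


first_day = filetime_to_ymd(0)
unix_epoch = filetime_to_ymd(116_444_736_000_000_000)
last_day = filetime_to_ymd(2**64 - 1)
```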
FAT
Developer Microsoft
Introduced 1980 (FAT12, Seattle QDOS); November 1987 (FAT16, Compaq DOS 3.31); August 1996
(FAT32, Windows 95 OSR2); March 2008 (exFAT, Vista SP1)
Partition identifier 0x01 (MBR, FAT12); 0x04, 0x06, 0x0E (MBR, FAT16); 0x0B, 0x0C (MBR, FAT32)
EBD0A0A2-B9E5-4433
Structures
Limits
Max file size 4 GB - 1 byte (or volume size if smaller) for FAT12, FAT16 and FAT32; 2^64 bytes (16 exabytes) for exFAT
Max cluster count 4,077 (2^12 - 19) for FAT12; 65,517 (2^16 - 19) for FAT16; 268,435,437 (2^28 - 19) for FAT32; TBU for exFAT
Max filename size 8.3 filename, or 255 UTF-16 characters when using LFN; TBU for exFAT
(Creation time and access date are only available when LFN support is enabled)
Permissions No
File Allocation Table or FAT is a computer file system architecture originally developed by Bill
Gates and Marc McDonald in 1976/1977.[1][2] It is the primary file system for various
operating systems including DR-DOS, OpenDOS, FreeDOS, MS-DOS, OS/2 (v1.1), and Microsoft
Windows (up to Windows Me). For floppy disks (FAT12 and FAT16 without long filename
support) it has been standardized as ECMA-107[3] and ISO/IEC 9293.[4][5] The use of long
filenames with FAT is patented in part.
The FAT file system is relatively straightforward and is supported by virtually all existing
operating systems for personal computers. This makes it an ideal format for solid-state
memory cards and a convenient way to share data between operating systems.
Common implementations have a serious drawback in that when files are deleted and new
files written to the media, directory fragments tend to become scattered over the entire disk,
making reading and writing slower on storage devices that have high seek times for
random data access (such as mechanical hard drives). Manually invoked periodic
defragmentation is one solution to this problem, but is often a lengthy process, and unwise in
some instances. Because they have a limited number of lifetime writes and quick access times,
solid-state memory cards should usually not be defragmented.
Contents
* 1 History
o 1.1 FAT12
o 1.2 Directories
o 1.8 Fragmentation
o 1.11 Future
o 1.12 exFAT
* 2 Design
+ 2.1.1 Exceptions
* 3 FAT licensing
o 3.1 Appeal
* 4 See also
* 6 External links
History
The FAT file system was created for managing disks in Microsoft Standalone Disk BASIC. In
August 1980 Tim Paterson incorporated FAT into his 86-DOS operating system for the S-100
8086 CPU boards;[6] the file system was the main difference between 86-DOS and its
predecessor, CP/M.
The name originates from the usage of a table which centralizes the information about which
areas belong to files, are free or possibly unusable, and where each file is stored on the disk.
To limit the size of the table, disk space is allocated to files in contiguous groups of hardware
sectors called clusters. As disk drives have evolved, the maximum number of clusters has
dramatically increased, and so the number of bits to identify a cluster has grown. The
successive major versions of the FAT format are named after the number of table element
bits: 12, 16, and 32. The FAT standard has also been expanded in other ways while preserving
backward compatibility with existing software.
FAT12
This initial version of FAT is now referred to as FAT12. Designed as a file system for floppy
diskettes, it limited cluster addresses to 12-bit values, which not only limited the cluster
count to 4078,[7] but made FAT manipulation tricky with the PC's 8-bit and 16-bit registers.
(Under Linux, FAT12 is limited to 4084 clusters.[8]) The disk's size is stored as a 16-bit count of
sectors, which limited the size to 32 MB[9]. FAT12 was used by several manufacturers with
different physical formats, but a typical floppy diskette at the time was 5.25-inch, single-
sided, 40 tracks, with 8 sectors per track, resulting in a capacity of 160 KB for both the system
areas and files. The FAT12 limitations exceeded this capacity by one or more orders of
magnitude. The limits were successively lifted in the following years which increased storage
capacity dramatically but eventually rendered FAT12 obsolete.
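The awkwardness of 12-bit entries comes from two entries sharing three bytes, so every access has to mask and shift. The sketch below shows the usual little-endian packing; the function names fat12_get and fat12_set are hypothetical, and the buffer is a bare table without the reserved entries or media descriptor of a real volume.

```python
def fat12_get(fat, n):
    """Read the 12-bit entry number n from a packed FAT12 table."""
    offset = n + n // 2  # n * 1.5, rounded down
    pair = fat[offset] | (fat[offset + 1] << 8)
    # Odd entries occupy the high 12 bits of the pair, even entries the low 12.
    return pair >> 4 if n & 1 else pair & 0x0FFF


def fat12_set(fat, n, value):
    """Write the 12-bit entry number n without disturbing its neighbour."""
    offset = n + n // 2
    if n & 1:
        fat[offset] = (fat[offset] & 0x0F) | ((value << 4) & 0xF0)
        fat[offset + 1] = (value >> 4) & 0xFF
    else:
        fat[offset] = value & 0xFF
        fat[offset + 1] = (fat[offset + 1] & 0xF0) | ((value >> 8) & 0x0F)


table = bytearray(6)  # room for four 12-bit entries
for index, entry in enumerate([0xABC, 0x123, 0xFFF, 0x456]):
    fat12_set(table, index, entry)
```

Entries 0xABC and 0x123 end up stored as the bytes BC 3A 12, which shows how the middle byte is shared between two neighbouring entries.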
By convention, all the control structures were organized to fit inside the first track, thus
avoiding head movement during read and write operations, although this varied depending
on the manufacturer and physical format of the disk. At the time FAT12 was introduced, DOS
did not support hierarchical directories, and the maximum number of files was typically
limited to a few dozen.
A limitation which was not addressed until much later was that any bad sector in the control
structures area, track 0, could prevent the diskette from being usable. The DOS formatting
tool rejected such diskettes completely. Bad sectors were allowed only in the file area, where
they made the entire holding cluster unusable as well.
Directories
In 1984 IBM released the PC AT, which featured a 20 MB hard disk. Microsoft introduced MS-
DOS 3.0 in parallel. Cluster addresses were increased to 16-bit, allowing for up to 65,517
clusters per volume, and consequently much greater file system sizes. However, the
maximum possible number of sectors and the maximum (partition, rather than disk) size of
32 MB did not change. Therefore, although technically already "FAT16", this format was not
what today is commonly understood as FAT16. A 20 MB hard disk formatted under MS-DOS
3.0 was not accessible by the older MS-DOS 2.0. Of course, MS-DOS 3.0 could still access MS-
DOS 2.0 style 8 KB cluster partitions.
MS-DOS 3.0 also introduced support for high-density 1.2 MB 5.25" diskettes, which notably
had 15 sectors per track, hence more space for FAT. This probably prompted a dubious
optimization of the cluster size, which went down from 2 sectors to just 1. The net effect was
that high density diskettes were significantly slower than older double density ones.
Apart from improving the structure of the FAT file system itself, a parallel development
allowing an increase in the maximum possible FAT size was the introduction of multiple FAT
partitions. Originally partitions were supposed to be used only for sharing the disk between
operating systems, typically DOS and Xenix at the time, so DOS was only prepared to handle
one FAT partition. It was not possible to create multiple DOS partitions using DOS tools, and
third party tools would warn that such a scheme would not be compatible with DOS. Simply
allowing several identical-looking DOS partitions could lead to naming problems: should C: be
the first FAT partition on disk, for simplicity, or rather the partition marked as active in the
partition table, so that several DOS versions can co-exist? And which partition should be C: if
the system was booted from a diskette?
To allow the use of more FAT partitions in a compatible way, a new partition type was
introduced (in MS-DOS 3.2, January 1986), the extended partition; which is a container for
additional partitions called logical drives. Originally only one logical drive was possible,
permitting hard disks up to 64 MB. In MS-DOS 3.3 (August 1987) this limit was increased to 24
drives; it probably came from the compulsory letter-based disk naming (A and B being
reserved for the two floppy drives, with which many, if not most, systems of the era were
equipped). Logical drives are described by on-disk structures which closely resemble the
Master Boot Record (MBR) of the disk (which describes the primary partitions), likely to
simplify the implementation. Though some believe these partitions were nested in a way
analogous to Russian matryoshka dolls, that isn't the case. They are stored as a row of
separate blocks within a single box; these blocks are often referred to as being chained
together, by the links in their extended boot record (EBR) sectors. Only one extended
partition is allowed. Logical drives are not bootable, and the extended partition can only be
created after the primary FAT partition (except with third party formatting tools), which
remove all ambiguity, but also the possibility of booting several DOS versions from the same
hard disk.
A useful side-effect of the extended partition scheme was to significantly increase the
maximum number of partitions possible on a PC hard disk beyond the four which could be
described by the MBR alone.
Prior to the introduction of extended partitions, some hard disk controllers (which at that
time were separate option boards, since the IDE standard did not yet exist) could make large
hard disks appear as two separate disks.
Finally in November 1987, Compaq DOS 3.31 introduced what is today called the FAT16
format, with the expansion of the 16-bit disk sector count to 32 bits. The result was initially
called the DOS 3.31 Large File System. Although the on-disk changes were minor, the entire
DOS disk driver had to be converted to use 32-bit sector numbers, a task complicated by the
fact that it was written in 16-bit assembly language.
In 1988 the improvement became more generally available through MS-DOS 4.0 and OS/2
1.1. The limit on partition size was dictated by the 8-bit signed count of sectors-per-cluster,
which had a maximum power-of-two value of 64. With the standard hard disk sector size of
512 bytes, this gives a maximum of 32 KB clusters, thereby fixing the "definitive" limit for the
FAT16 partition size at 2 gigabytes. On magneto-optical media, which can have 1 or 2 KB
sectors, the limit is proportionally greater.
Much later, Windows NT increased the maximum cluster size to 64 KB by considering the
sectors-per-cluster count as unsigned. However, the resulting format was not compatible
with any other FAT implementation of the time, and it generated greater internal
fragmentation. Windows 98 also supported reading and writing this variant, but its disk
utilities did not work with it.
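The arithmetic behind these limits can be checked with a short sketch. It assumes 2^16
addressable clusters as an upper bound (a real FAT16 volume reserves a few cluster values, so
the usable count is slightly lower), and the function name is purely illustrative.

```python
def fat16_size_limit(sectors_per_cluster: int, bytes_per_sector: int = 512) -> int:
    """Approximate FAT16 partition size limit in bytes.

    Assumes 2**16 addressable clusters; a few values are reserved in practice.
    """
    cluster_bytes = sectors_per_cluster * bytes_per_sector
    return cluster_bytes * 2**16


# The largest power of two that fits in a signed 8-bit count is 64 sectors per cluster.
signed_limit = fat16_size_limit(64)
# Read as an unsigned byte (the Windows NT interpretation), the field can hold 128.
unsigned_limit = fat16_size_limit(128)
# Magneto-optical media with 2 KB sectors raise the limit proportionally.
mo_limit = fat16_size_limit(64, bytes_per_sector=2048)

for label, limit in [("signed", signed_limit), ("unsigned", unsigned_limit), ("2 KB sectors", mo_limit)]:
    print(label, limit // 2**30, "GiB")
```

With 512-byte sectors the signed reading gives 32 KB clusters and the unsigned reading 64 KB
clusters, which is where the 2 GB figure and the larger Windows NT variant come from.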
The number of root directory entries available is determined when the volume is formatted,
and is stored in a 16-bit signed field setting an absolute limit of 32767 entries (32736, a
multiple of 32, in practice). For historical reasons, FAT12 and FAT16 media generally use 512
root directory entries on non-floppy media. Other sizes may be incompatible with some
software or devices (entries being file and/or folder names in the original 8.3 format).[10]
Some third party tools like mkdosfs allow the user to set this parameter.[11]
One of the user experience goals for the designers of Windows 95 was the ability to use long
filenames (LFNs, up to 255 UTF-16 code units long), in addition to classic 8.3 filenames.
LFNs were implemented using a work-around in the way directory entries are laid out (see
below). The version of the file system with this extension is usually known as VFAT after the
Windows 95 VxD device driver, also known as "Virtual FAT" in Microsoft's documentation.
Interestingly, the VFAT driver actually appeared before Windows 95, in Windows for
Workgroups 3.11, but was only used for implementing 32-bit File Access, a higher
performance protected mode file access method, bypassing DOS and directly using either the
BIOS, or, better, the Windows-native protected mode disk drivers.
In Windows NT, support for long filenames on FAT started from version 3.5. OS/2 added long
filename support to FAT using extended attributes (EA) before the introduction of VFAT; thus,
VFAT long filenames are invisible to OS/2, and EA long filenames are invisible to Windows.
In order to overcome the volume size limit of FAT16, while still allowing DOS real mode code
to handle the format without unnecessarily reducing the available conventional memory,
Microsoft implemented a newer generation of FAT, known as FAT32, with cluster values held
in a 32-bit field, of which 28 bits are used to hold the cluster number, for a maximum of
approximately 268 million (2^28) clusters. This allows for drive sizes of up to 8 terabytes with
32 KB clusters, but the boot sector uses a 32-bit field for the sector count, limiting volume size
to 2 TB on a hard disk with 512-byte sectors.
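The two FAT32 ceilings described here can be compared in a few lines. This is only a sketch of
the arithmetic; the constant and function names are made up for illustration.

```python
CLUSTER_NUMBER_BITS = 28   # bits of each 32-bit FAT entry that hold a cluster number
SECTOR_COUNT_BITS = 32     # width of the boot sector's total-sectors field


def limit_from_clusters(cluster_bytes: int) -> int:
    """Size limit imposed by the number of addressable clusters."""
    return 2**CLUSTER_NUMBER_BITS * cluster_bytes


def limit_from_sector_count(bytes_per_sector: int) -> int:
    """Size limit imposed by the 32-bit sector count in the boot sector."""
    return 2**SECTOR_COUNT_BITS * bytes_per_sector


def fat32_volume_limit(cluster_bytes: int, bytes_per_sector: int) -> int:
    """The effective limit is whichever ceiling is reached first."""
    return min(limit_from_clusters(cluster_bytes), limit_from_sector_count(bytes_per_sector))


TIB = 2**40
print(limit_from_clusters(32 * 1024) // TIB, "TiB from 2^28 clusters of 32 KB")
print(limit_from_sector_count(512) // TIB, "TiB from 2^32 sectors of 512 bytes")
print(fat32_volume_limit(32 * 1024, 512) // TIB, "TiB effective")
```

The sector count is the binding constraint with 512-byte sectors, so larger sectors would raise
the effective limit until the cluster ceiling takes over.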
On Windows 95/98, because the version of Microsoft's SCANDISK utility included with these
operating systems is a 16-bit application, the FAT structure is not allowed to grow beyond
around 4.2 million (< 2^22) clusters, placing the volume limit at 127.53 gigabytes.[12] A
limitation in original versions of the Windows 98/98SE Fdisk utility causes it to incorrectly
report disk sizes over 64 GB.[13] A corrected version is available from Microsoft. These
limitations do not apply to Windows 2000/XP except during Setup, in which there is a 32 GB
limit.[14] Windows Me supports the FAT32 file system without any limits.[15] However,
as with Windows 95/98/98SE, there is no native support for 48-bit LBA in Windows Me,
meaning that the maximum disk size is 127.6 GB.
FAT32 was introduced with Windows 95 OSR2, although reformatting was needed to use it,
and DriveSpace 3 (the version that came with Windows 95 OSR2 and Windows 98) never
supported it. Windows 98 introduced a utility to convert existing hard disks from FAT16 to
FAT32 without loss of data. In the NT line, native support for FAT32 arrived in Windows 2000.
A free FAT32 driver for Windows NT 4.0 was available from Winternals, a company later
acquired by Microsoft. Since the acquisition the driver is no longer officially available.
Windows 2000 and Windows XP can read and write to FAT32 file systems of any size, but the
format program included in Windows 2000 and higher can only create FAT32 file systems of
32 GB or less. This limitation is by design and according to Microsoft was imposed because
many tasks on a very large FAT32 file system become slow and inefficient.[12][16] This
limitation can be bypassed by using third-party formatting utilities or by using the built-in
FORMAT.EXE command-line utility.[17][18]
The maximum possible size for a file on a FAT32 volume is 4 GB minus 1 byte (2^32 - 1
bytes). Video applications, large databases, and some other software easily exceed this limit.
Larger files require another formatting type such as HFS+ or NTFS. Until mid-2006, those who
run dual boot systems or who move external data drives between computers with different
operating systems had little choice but to stick with FAT32. Since then, full support for NTFS
has become available in Linux and many other operating systems, by installing the FUSE
library (on Linux) together with the NTFS-3G driver. Data exchange is also possible between
Windows and Linux by using the Linux-native ext2 or ext3 file systems through the use of
external drivers for Windows, such as ext2 IFS; however, Windows cannot boot from ext2 or
ext3 partitions.
Fragmentation
The FAT file system does not contain mechanisms which prevent newly written files from becoming
scattered across the partition.[6] Other file systems, like HPFS, use free space bitmaps that indicate used
and available clusters, which could then be quickly looked up in order to find free contiguous areas
(improved in exFAT). Another solution is the linkage of all free clusters into one or more lists (as is done
in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead
to performance penalties with large hard disks.
In fact, computing free disk space on FAT is one of the most resource intensive operations, as it requires
reading the entire FAT linearly. A possible justification suggested by Microsoft's Raymond Chen for
limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a
"DIR" operation, which always displays the free disk space as the last line.[16] Displaying this line took
longer and longer as the number of clusters increased.
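The cost of the linear scan can be seen in a minimal sketch that treats the FAT as a plain
sequence of integers. It assumes that a zero entry marks a free cluster and that entries 0 and 1
are reserved; the function names are illustrative and not taken from any real driver.

```python
from typing import Sequence


def count_free_clusters(fat: Sequence[int]) -> int:
    """Count free clusters by visiting every FAT entry (assumed: 0 means free)."""
    # Entries 0 and 1 are reserved and do not describe data clusters.
    return sum(1 for entry in fat[2:] if entry == 0)


def free_bytes(fat: Sequence[int], cluster_bytes: int) -> int:
    """Free space is the free cluster count times the cluster size."""
    return count_free_clusters(fat) * cluster_bytes


# Tiny hypothetical FAT16 table: clusters 4 and 5 are free.
example_fat = [0xFFF8, 0xFFFF, 3, 0xFFFF, 0, 0, 7, 0xFFFF]
print(count_free_clusters(example_fat), "free clusters")
print(free_bytes(example_fat, 2048), "free bytes")
```

The work grows with the number of entries in the table, not with the number of files, which is
why the operation slows down as the cluster count increases.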
The High Performance File System (HPFS) divides disk space into bands, which have their own free space
bitmap, where multiple files opened for simultaneous write could be expanded separately.[6]
Some of the perceived problems with fragmentation resulted from operating system and hardware
limitations.
The single-tasking DOS and the traditionally single-tasking PC hard disk architecture (only 1 outstanding
input/output request at a time, no DMA transfers) did not contain mechanisms which could alleviate
fragmentation by asynchronously prefetching next data while the application was processing the
previous chunks.
Similarly, write-behind caching was often not enabled by default with Microsoft software (if present)
given the problem of data loss in case of a crash, made easier by the lack of hardware protection
between applications and the system.
MS-DOS also did not offer a system call which would allow applications to make sure a particular file has
been completely written to disk in the presence of deferred writes (cf. fsync in Unix or DosBufReset in
OS/2). Disk caches on MS-DOS were operating on disk block level and were not aware of higher-level
structures of the file system. In this situation, cheating with regard to the real progress of a disk
operation was most dangerous.
Modern operating systems have introduced these optimizations to FAT partitions, but optimizations can
still produce unwanted artifacts in case of a system crash. A Windows NT system will allocate space to
files on FAT in advance, selecting large contiguous areas, but in case of a crash, files which were being
appended will appear larger than they were ever written into, with dozens of random kilobytes at the
end.
With the large cluster sizes, 16 or 32K, forced by larger FAT32 partitions, the external fragmentation
becomes somewhat less significant, and internal fragmentation, i.e. disk space waste (since files are
rarely exact multiples of cluster size), starts to be a problem as well, especially when there are a great
many small files.
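Internal fragmentation is easy to quantify. The sketch below assumes that a file always occupies
a whole number of clusters and that an empty file occupies none; the helper names are
illustrative.

```python
def allocated_bytes(file_size: int, cluster_bytes: int) -> int:
    """Disk space consumed by a file that must occupy whole clusters."""
    clusters = -(-file_size // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes


def slack_bytes(file_size: int, cluster_bytes: int) -> int:
    """Space allocated to the file but not used by its contents."""
    return allocated_bytes(file_size, cluster_bytes) - file_size


KB = 1024
# A thousand hypothetical 1 KB files on volumes with different cluster sizes.
for cluster_kb in (4, 16, 32):
    wasted = 1000 * slack_bytes(1 * KB, cluster_kb * KB)
    print(f"{cluster_kb} KB clusters: {wasted // KB} KB wasted")
```

The waste per file is bounded by one cluster, so it only becomes significant when files are
small relative to the cluster size or very numerous.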
Other IBM PC operating systems—such as Linux, FreeBSD, BeOS and JNode—have all supported FAT,
and most added support for VFAT, FAT32, JFAT shortly after the corresponding Windows versions were
released. Early Linux distributions also supported a format known as UMSDOS, which was FAT with Unix
file attributes (such as long file name and access permissions) stored in a separate file called
“--linux-.---”. UMSDOS fell into disuse after VFAT was released and is not enabled by default in Linux
kernels from version 2.5.7 onwards.[19] The Mac OS X operating system also supports the FAT file
systems on volumes other than the boot disk. The Amiga supports FAT through the CrossDOS file
system.
The FAT file system itself is not designed for supporting Alternate Data Streams (ADS), but some
operating systems that heavily depend on them have devised various methods for handling them in FAT
drives. Such methods either store the additional information in extra files and directories (Mac OS), or
give new semantics to previously unused fields of the FAT on-disk data structures (OS/2 and Windows
NT). The second design, while presumably more efficient, prevents any copying or backing-up of those
volumes using non-aware tools; manipulating such volumes using non-aware disk utilities (e.g.
defragmenters or CHKDSK) will probably lose the information.
Mac OS using PC Exchange stores its various dates, file attributes and long filenames in a hidden file
called FINDER.DAT, and resource forks (a common Mac OS ADS) in a subdirectory called RESOURCE.FRK,
in every directory where they are used. From PC Exchange 2.1 onwards, they store the Mac OS long
filenames as standard FAT long filenames and convert FAT filenames longer than 31 characters to
unique 31-character filenames, which can then be made visible to Macintosh applications.
Mac OS X stores resource forks and metadata (file attributes, other ADS) in a hidden file with a name
constructed from the owner filename prefixed with "._", and Finder stores some folder and file
metadata in a hidden file called ".DS_Store".
OS/2 heavily depends on extended attributes (EAs) and stores them in a hidden file called "EA DATA. SF"
in the root directory of the FAT12 or FAT16 volume. This file is indexed by 2 previously reserved bytes in
the file's (or directory's) directory entry. In the FAT32 format, these bytes hold the upper 16 bits of the
starting cluster number of the file or directory, hence making it difficult to store EAs on FAT32. Extended
attributes are accessible via the Workplace Shell desktop, through REXX scripts, and many system GUI
and command-line utilities (such as 4OS2).[20]
To accommodate its OS/2 subsystem, Windows NT supports the handling of extended attributes in
HPFS, NTFS, and FAT. It stores EAs on FAT and HPFS using exactly the same scheme as OS/2, but does
not support any other kind of ADS as held on NTFS volumes. Trying to copy a file with any ADS other
than EAs from an NTFS volume to a FAT or HPFS volume gives a warning message with the names of the
ADSs that will be lost.
Windows 2000 onward acts exactly as Windows NT, except that it ignores EAs when copying to FAT32
without any warning (but shows the warning for other ADSs, like "Macintosh Finder Info" and
"Macintosh Resource Fork").
Future
Microsoft has recently secured patents for VFAT and FAT32 (but not the original FAT). Despite two
earlier rulings against them, Microsoft prevailed and was awarded the patents.
Since Microsoft has announced the discontinuation of its MS-DOS-based consumer operating systems
with Windows Me, it remains unlikely that any new versions of FAT will appear. For most purposes, the
NTFS file system that was developed for the Windows NT line is superior to FAT from the points of view
of efficiency, performance, and reliability; its main drawbacks are the size overhead for small volumes
and the very limited support by anything other than the NT-based versions of Windows, since the exact
specification is a trade secret of Microsoft. The availability of NTFS-3G since mid 2006 has led to much
improved NTFS support in Unix-like operating systems, considerably alleviating this concern. It is still not
possible to use NTFS in DOS-like operating systems, which in turn makes it difficult to use a DOS floppy
for recovery purposes. Microsoft provided a recovery console to work around this issue, but for security
reasons it severely limited what could be done through the Recovery Console by default. The movement
of recovery utilities to boot CDs based on BartPE or Linux (with NTFS-3G) is finally eroding this drawback.
FAT is still the normal file system for removable media (with the exception of CDs and DVDs), with FAT12
used on floppies, and FAT16 on most other removable media (such as flash memory cards for digital
cameras and USB flash drives). Most removable media are not yet large enough to benefit from FAT32,
although some larger flash drives, like SDHC, do make use of it. FAT16 is used on these drives for reasons
of compatibility and size overhead.
The FAT32 formatting support in Windows 2000 and XP is limited to volumes of 32 GB, which effectively
forces users of modern hard drives either to use NTFS, to partition the drive into smaller volumes (below
32 GB), or to format the drive using third party tools.
exFAT
exFAT is an incompatible replacement for FAT file systems that was introduced with Windows
Embedded CE 6.0. It is intended to be used on flash drives, where FAT is used today. Windows XP file
system drivers will be offered by Microsoft shortly after the release of Windows CE 6.0[citation needed],
while Windows Vista Service Pack 1 added exFAT support to Windows Vista.[21] exFAT introduces a free
space bitmap allowing faster space allocation and faster deletes, support for files up to 2^64 bytes, larger
cluster sizes (up to 32 MB in the first implementation), an extensible directory structure and name
hashes for filenames for faster comparisons. It does not have short 8.3 filenames anymore. It does not
appear to have security access control lists or file system journaling like NTFS, though device
manufacturers can choose to implement simplified support for transactions (backup file allocation table
used for the write operations, primary FAT for storing last known good allocation table).
Design
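Order of the sections on a FAT volume:
Boot sector | Further reserved sectors (optional) | File Allocation Table #1 | File Allocation Table #2 | Root Directory (FAT12/FAT16 only) | Data Region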
1. The Reserved sectors, located at the very beginning. The first reserved sector is the Boot Sector (aka
Partition Boot Record). It includes an area called the BIOS Parameter Block (with some basic file system
information, in particular its type, and pointers to the location of the other sections) and usually
contains the operating system's boot loader code. The total count of reserved sectors is indicated by a
field inside the Boot Sector. Important information from the Boot Sector is accessible through an
operating system structure called the Drive Parameter Block in DOS and OS/2. For FAT32 file systems,
the reserved sectors include a Backup Boot Sector at Sector 6.
2. The FAT Region. This typically contains two copies (may vary) of the File Allocation Table for the sake
of redundancy checking, although the extra copy is rarely used, even by disk repair utilities. These are
maps of the Data Region, indicating which clusters are used by files and directories.
3. The Root Directory Region. This is a Directory Table that stores information about the files and
directories located in the root directory. It is only used with FAT12 and FAT16 and means that the root
directory has a fixed maximum size which is pre-allocated at creation of this volume. FAT32 stores the
root directory in the Data Region along with files and other directories instead, allowing it to grow
without such a restraint.
4. The Data Region. This is where the actual file and directory data is stored and takes up most of the
partition. The size of files and subdirectories can be increased arbitrarily (as long as there are free
clusters) by simply adding more links to the file's chain in the FAT. Note, however, that space is
allocated in whole clusters, so if a 1 KB file resides in a 32 KB cluster, 31 KB are wasted.
FAT uses little endian format for entries in the header and the FAT(s).
Byte offset | Length (bytes) | Description
0x00 | 3 | Jump instruction. This instruction will be executed and will skip past the rest of the (non-executable) header if the partition is booted from. See Volume Boot Record. If the jump is a two-byte short jump, it is followed by a NOP instruction.
0x03 | 8 | OEM Name (padded with spaces). MS-DOS checks this field to determine which other parts of the boot record can be relied on.[22][23] Common values are IBM  3.3 (with two spaces between the "IBM" and the "3.3"), MSDOS5.0 and MSWIN4.1.
0x0b | 2 | Bytes per sector. A common value is 512, especially for file systems on IDE (or compatible) disks. The BIOS Parameter Block starts here.
0x0d | 1 | Sectors per cluster. Allowed values are powers of two from 1 to 128. However, the value must not be such that the number of bytes per cluster becomes greater than 32 KB.
0x0e | 2 | Reserved sector count. The number of sectors before the first FAT in the file system image. Should be 1 for FAT12/FAT16. Usually 32 for FAT32.
0x11 | 2 | Maximum number of root directory entries. Only used on FAT12 and FAT16, where the root directory is handled specially. Should be 0 for FAT32. This value should always be such that the root directory ends on a sector boundary (i.e. such that its size becomes a multiple of the sector size). 224 is typical for floppy disks.
0x13 | 2 | Total sectors (if zero, use 4 byte value at offset 0x20)
Media descriptor values (boot sector offset 0x15, 1 byte):
0xF0 | 3.5" Double sided, 80 tracks per side, 18 or 36 sectors per track (1.44MB or 2.88MB). 5.25" Double sided, 15 sectors per track (1.2MB). Used also for other media types.
0xF8 | Hard disk. Single sided, 80 tracks per side, 9 sectors per track[citation needed]
0xF9 | 3.5" Double sided, 80 tracks per side, 9 sectors per track (720K). 5.25" Double sided, 40 tracks per side, 15 sectors per track (1.2MB)
0xFA | 5.25" Single sided, 80 tracks per side, 8 sectors per track (320K)
0xFB | 3.5" Double sided, 80 tracks per side, 8 sectors per track (640K)
0xFC | 5.25" Single sided, 40 tracks per side, 9 sectors per track (180K)
0xFD | 5.25" Double sided, 40 tracks per side, 9 sectors per track (360K). Also used for 8".
0xFE | 5.25" Single sided, 40 tracks per side, 8 sectors per track (160K). Also used for 8".
0xFF | 5.25" Double sided, 40 tracks per side, 8 sectors per track (320K)
The same media descriptor value should be repeated as the first byte of each copy of the FAT. Certain
operating systems (MSX-DOS version 1.0) ignore the boot sector parameters altogether and use the media
descriptor value from the first byte of the FAT to determine file system parameters.
0x20 | 4 | Total sectors (if greater than 65535; otherwise, see offset 0x13)
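As a rough illustration of how these fields might be read, the sketch below decodes a handful of
them with Python's struct module. The function name and the returned dictionary keys are
invented for this example, the sample sector is synthetic, and only the offsets listed in this
section are used.

```python
import struct


def parse_bpb(boot_sector: bytes) -> dict:
    """Decode a few BIOS Parameter Block fields (all stored little endian)."""
    oem_name = boot_sector[0x03:0x0B].decode("ascii", errors="replace").rstrip()
    bytes_per_sector, sectors_per_cluster, reserved_sectors = struct.unpack_from(
        "<HBH", boot_sector, 0x0B
    )
    root_entries, total_sectors_16 = struct.unpack_from("<HH", boot_sector, 0x11)
    (total_sectors_32,) = struct.unpack_from("<I", boot_sector, 0x20)
    if bytes_per_sector == 0:
        raise ValueError("not a plausible boot sector: bytes per sector is 0")
    # The 16-bit count at 0x13 is zero when the real count is stored at 0x20.
    total_sectors = total_sectors_16 or total_sectors_32
    # Each directory entry is 32 bytes; round the root directory up to whole sectors.
    root_dir_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    return {
        "oem_name": oem_name,
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "reserved_sectors": reserved_sectors,
        "root_entries": root_entries,
        "root_dir_sectors": root_dir_sectors,
        "total_sectors": total_sectors,
    }


# Synthetic 1.44 MB floppy boot sector, built only for this demonstration.
sector = bytearray(512)
sector[0x03:0x0B] = b"MSDOS5.0"
struct.pack_into("<HBH", sector, 0x0B, 512, 1, 1)
struct.pack_into("<HH", sector, 0x11, 224, 2880)
info = parse_bpb(bytes(sector))
print(info)
```

The "<" prefix in the format strings selects little-endian byte order without padding, so the
fields are read at exactly the offsets given.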
Further structure used by FAT12 and FAT16, also known as Extended BIOS Parameter Block:
0x36 | 8 | FAT file system type, padded with blanks (0x20), e.g.: "FAT12   ", "FAT16   ". This field is not
meant to be used to determine the drive type; however, some utilities use it in this way.
The boot sector is portrayed here as found on e.g. an OS/2 1.3 boot diskette. Earlier versions used a
shorter BIOS Parameter Block and their boot code would start earlier (for example at offset 0x2b in OS/2
1.1).
Further structure used by FAT32:
0x2a | 2 | Version
0x34 | 12 | Reserved
Exceptions
The implementation of FAT used in MS-DOS for the Apricot PC had a different boot sector layout, to
accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were
omitted, and the MS-DOS file system parameters (offsets 0x0B - 0x17 in the standard sector) were
located at offset 0x50. Later versions of Apricot MS-DOS gained the ability to read and write disks with
the standard boot sector in addition to those with the Apricot one.
DOS Plus on the BBC Master 512 did not use conventional boot sectors at all. Data disks omitted the
boot sector and began with a single copy of the FAT (the first byte of the FAT was used to determine disk
capacity) while boot disks began with a miniature ADFS file system containing the boot loader, followed
by a single FAT. It could also access standard PC disks formatted to 180 KB or 360 KB, again using the
first byte of the FAT to determine capacity.
A partition is divided up into identically sized clusters, small blocks of contiguous space. Cluster sizes
vary depending on the type of FAT file system being used and the size of the partition; typically, cluster
sizes lie somewhere between 2 KB and 32 KB. Each file may occupy one or more of these clusters
depending on its size; thus, a file is represented by a chain of these clusters (referred to as a singly linked
list). However, these clusters are not necessarily stored adjacent to one another on the disk's surface but
are often instead fragmented throughout the Data Region.
The File Allocation Table (FAT) is a list of entries that map to each cluster on the partition. Each entry
records one of five things:
* a special end of clusterchain (EOC) entry that indicates the end of a chain
Each version of the FAT file system uses a different size for FAT entries. The size is indicated by the
name, for example the FAT16 file system uses 16 bits for each entry while the FAT32 file system uses 32
bits. Only 28 of these are actually used, however. This difference means that the File Allocation Table of
a FAT32 system can map a greater number of clusters than FAT16, allowing for larger partition sizes with
FAT32. This also allows for more efficient use of space than FAT16, because on the same hard drive a
FAT32 table can address smaller clusters which means less wasted space.
FAT12 | FAT16 | FAT32 | Description
0x002 - 0xFEF | 0x0002 - 0xFFEF | 0x?0000002 - 0x?FFFFFEF | Used cluster; value points to next cluster
Note that FAT32 uses only 28 bits of the 32 possible bits. The upper 4 bits are usually zero but are
reserved and should be left untouched. In the table above these are denoted by a question mark.
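Following a file's chain through the FAT amounts to walking a singly linked list. The sketch
below models the FAT as a list of integers and masks FAT32 entries to their low 28 bits. It
assumes the usual convention that any value whose upper bits are all set and whose low three
bits are arbitrary (0xFF8, 0xFFF8 or 0x0FFFFFF8 and above) marks the end of a chain; the
function name is illustrative.

```python
from typing import List, Sequence


def cluster_chain(fat: Sequence[int], first_cluster: int, fat_bits: int = 16) -> List[int]:
    """Return the list of clusters that make up one file, in order."""
    # FAT32 entries are 32 bits wide but only the low 28 bits are significant.
    mask = 0x0FFFFFFF if fat_bits == 32 else (1 << fat_bits) - 1
    eoc_min = mask & ~0x7  # assumed end-of-chain threshold, e.g. 0xFFF8 for FAT16
    chain: List[int] = []
    seen = set()
    cluster = first_cluster
    while True:
        if cluster in seen:
            raise ValueError("corrupt FAT: chain loops back on itself")
        if cluster < 2 or cluster >= len(fat):
            raise ValueError(f"corrupt FAT: invalid cluster number {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        next_cluster = fat[cluster] & mask
        if next_cluster >= eoc_min:
            return chain
        cluster = next_cluster


# Hypothetical fragmented FAT16 file: cluster 2, then 5, then 3.
fat16 = [0xFFF8, 0xFFFF, 5, 0xFFFF, 0, 3]
print(cluster_chain(fat16, 2))

# FAT32 entry with non-zero reserved upper bits; they are ignored when reading.
fat32 = [0x0FFFFFF8, 0x0FFFFFFF, 0xF0000003, 0x0FFFFFFF]
print(cluster_chain(fat32, 2, fat_bits=32))
```

The loop and range checks are defensive additions: a damaged table can otherwise send a naive
reader into an endless cycle or past the end of the FAT.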
The first cluster of the Data Region is cluster #2. That leaves the first two entries of the FAT unused. In
the first byte of the first entry a copy of the media descriptor is stored. The remaining 8 bits (if FAT16),
or 20 bits (if FAT32) of this entry are 1. In the second entry the end-of-cluster-chain marker is stored. The
high order two bits of the second entry are sometimes, in the case of FAT16 and FAT32, used for dirty
volume management: high order bit 1: last shutdown was clean; next highest bit 1: during the previous
mount no disk I/O errors were detected.[25]
A directory table is a special type of file that represents a directory (also known as a folder). Each file or
directory stored within it is represented by a 32-byte entry in the table. Each entry records the name,
extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of
creation, the address of the first cluster of the file/directory's data and finally the size of the
file/directory. Aside from the Root Directory Table in FAT12 and FAT16 file systems, which occupies the
special Root Directory Region location, all Directory Tables are stored in the Data Region. The actual
number of entries in a directory stored in the Data Region can grow by adding another cluster to the
chain in the FAT.
* Numbers 0–9
* Space (though trailing spaces in either the base name or the extension are considered to be padding
and not a part of the file name, also filenames with space in them could not be used on the DOS
command line because of the lack of a suitable escaping system)
* ! # $ % & ' ( ) - @ ^ _ ` { } ~
* (FAT-32 only) + , . ; = [ ]
* Values 128–255
Directory entries, both in the Root Directory Region and in subdirectories, are of the following format:
0xE5 Entry has been previously erased and is available. File undelete utilities must replace this
character with a regular character as part of the undeletion process.
Bit | Mask | Description
2 | 0x04 | System
4 | 0x10 | Subdirectory
5 | 0x20 | Archive
7 | 0x80 | Unused
0x0c 1 Reserved; two bits are used by NT and later versions to encode case information (see
below); otherwise 0[26]
0x0d 1 Create time, fine resolution: 10ms units, values from 0 to 199.
0x0e 2 Create time. The hour, minute and second are encoded according to the following
bitmap:
Bits | Description
15-11 | Hours (0-23)
10-5 | Minutes (0-59)
4-0 | Seconds/2 (0-29)
Note that seconds are recorded only to a 2-second resolution. Finer resolution for file creation is
found at offset 0x0d.
0x10 2 Create date. The year, month and day are encoded according to the following bitmap:
Bits | Description
15-9 | Year (0 = 1980, 127 = 2107)
8-5 | Month (1 = January, 12 = December)
4-0 | Day (1-31)
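A decoder for these packed fields might look like the sketch below. It assumes the standard
layout (hours in bits 15-11, minutes in bits 10-5, seconds divided by two in bits 4-0 of the
time word; years since 1980 in bits 15-9, month in bits 8-5, day in bits 4-0 of the date word)
and treats the fine-resolution byte at offset 0x0d as a count of 10 ms units. The function name
is illustrative.

```python
import datetime


def decode_fat_timestamp(date_word: int, time_word: int, fine_10ms: int = 0) -> datetime.datetime:
    """Convert packed FAT date and time words into a datetime."""
    year = 1980 + (date_word >> 9)
    month = (date_word >> 5) & 0x0F
    day = date_word & 0x1F
    hour = time_word >> 11
    minute = (time_word >> 5) & 0x3F
    second = (time_word & 0x1F) * 2  # stored with 2-second resolution
    base = datetime.datetime(year, month, day, hour, minute, second)
    # The fine-resolution byte (0 to 199) adds up to 1.99 seconds.
    return base + datetime.timedelta(milliseconds=fine_10ms * 10)


# Pack 2008-07-15 13:45:30 by hand, then decode it again.
date_word = ((2008 - 1980) << 9) | (7 << 5) | 15
time_word = (13 << 11) | (45 << 5) | (30 // 2)
print(decode_fat_timestamp(date_word, time_word))
print(decode_fat_timestamp(date_word, time_word, fine_10ms=150))
```

An odd number of seconds can only be represented through the fine-resolution byte, since the
time word itself stores seconds in steps of two.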
0x1a 2 First cluster in FAT12 and FAT16. Low 2 bytes of first cluster in FAT32. Entries with the
Volume Label flag, subdirectory ".." pointing to root, and empty files with size 0 should have first cluster
0.
0x1c 4 File size. Entries with the Volume Label or Subdirectory flag set should have a size of 0.
Clusters are numbered from a cluster offset as defined above. That is, a zero in 0x1a would mean the
first data segment is at:
Long File Names (LFN) are stored on a FAT file system using a trick—adding (possibly multiple) additional
entries into the directory before the normal file entry. The additional entries are marked with the
Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is
not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party
utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be
deleted; such a situation appears if files created with long names are deleted from plain DOS.
Older versions of PC-DOS mistake LFN names in the root directory for the volume label, and are likely to
display an incorrect label.
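The attribute combination that marks a long-filename entry can be tested directly. The sketch
below assumes the attribute byte sits at offset 0x0b of the 32-byte directory entry and uses the
conventional mask values for the low attribute bits (read-only 0x01, hidden 0x02, volume label
0x08 are not shown in the surviving table rows above); all names are illustrative.

```python
from typing import List

ATTR_READ_ONLY = 0x01
ATTR_HIDDEN = 0x02
ATTR_SYSTEM = 0x04
ATTR_VOLUME_LABEL = 0x08
ATTR_SUBDIRECTORY = 0x10
ATTR_ARCHIVE = 0x20

# Volume Label + System + Hidden + Read Only: a combination DOS never produces itself.
ATTR_LFN = ATTR_VOLUME_LABEL | ATTR_SYSTEM | ATTR_HIDDEN | ATTR_READ_ONLY

ATTRIBUTE_OFFSET = 0x0B  # assumed position of the attribute byte in a directory entry


def is_lfn_entry(entry: bytes) -> bool:
    """True if a 32-byte directory entry carries part of a long filename."""
    return entry[ATTRIBUTE_OFFSET] == ATTR_LFN


def attribute_names(attr: int) -> List[str]:
    """List the names of the attribute bits that are set."""
    names = [
        (ATTR_READ_ONLY, "read-only"),
        (ATTR_HIDDEN, "hidden"),
        (ATTR_SYSTEM, "system"),
        (ATTR_VOLUME_LABEL, "volume label"),
        (ATTR_SUBDIRECTORY, "subdirectory"),
        (ATTR_ARCHIVE, "archive"),
    ]
    return [name for mask, name in names if attr & mask]


lfn_entry = bytearray(32)
lfn_entry[ATTRIBUTE_OFFSET] = ATTR_LFN
print(hex(ATTR_LFN), is_lfn_entry(bytes(lfn_entry)))
print(attribute_names(ATTR_LFN))
```

Because the volume label bit is part of the combination, software that checks only that bit
sees every long-filename entry as a volume label, which matches the behaviour of older PC-DOS
versions described above.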
Each phony entry can contain up to 13 UTF-16 characters (26 bytes) by using fields in the record which
contain file size or time stamps. The starting cluster field is not reused: for compatibility with disk
utilities, it is set to a value of 0 (see 8.3 filename for additional explanations). Up to 20 of these
13-character entries may be chained, supporting a maximum length of 255 UTF-16 characters.[26]
After the last UTF-16 character, a 0x00 0x00 is added. Any remaining unused character slots are filled
with 0xFF 0xFF.
If multiple LFN entries are required to represent a file name, the entry holding the last part of the
filename comes first. Its sequence number also has bit 6 (0x40) set, marking it as the last LFN entry
(although it is the first entry encountered when reading the directory file). The last LFN entry has the
highest sequence number, which decreases in the following entries. The first LFN entry has sequence
number 1. Bit 7 (0x80) of the sequence number is used to indicate that the entry is deleted.
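For example, the filename "File with very long filename.ext" would be stored like this:
Sequence number | Name characters in entry
0x43 | "me.ext"
0x02 | "y long filena"
0x01 | "File with ver"
followed by the normal 8.3 entry.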
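The splitting rule can be sketched as follows. The code assumes names made only of characters
that occupy a single UTF-16 code unit, ignores the other fields of the directory entry, and
assumes that a chunk filling all 13 slots gets no terminator; the function names are
illustrative.

```python
from typing import List, Tuple

CHARS_PER_ENTRY = 13
LAST_ENTRY_FLAG = 0x40


def lfn_entries(name: str) -> List[Tuple[int, str]]:
    """Return (sequence byte, name chunk) pairs in on-disk order."""
    if not 0 < len(name) <= 255:
        raise ValueError("long filenames hold between 1 and 255 characters")
    chunks = [name[i:i + CHARS_PER_ENTRY] for i in range(0, len(name), CHARS_PER_ENTRY)]
    entries = []
    for number, chunk in enumerate(chunks, start=1):
        is_last = number == len(chunks)
        entries.append((number | (LAST_ENTRY_FLAG if is_last else 0), chunk))
    entries.reverse()  # the entry with the final chunk is stored first
    return entries


def padded_utf16(chunk: str) -> bytes:
    """Encode one chunk as 26 bytes: characters, 0x0000 terminator, 0xFFFF filler."""
    data = chunk.encode("utf-16-le")
    if len(chunk) < CHARS_PER_ENTRY:
        data += b"\x00\x00"
        data += b"\xff\xff" * (CHARS_PER_ENTRY - len(chunk) - 1)
    return data


for sequence, chunk in lfn_entries("File with very long filename.ext"):
    print(hex(sequence), repr(chunk), len(padded_utf16(chunk)))
```

Storing the entries in reverse keeps the flagged entry at the front, so a reader learns the
total number of entries before it sees any of the name.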
int i;
return sum;
7 0x80 F1
6 0x40 F2
5 0x20 F3
4 0x10 F4
0x0D 1 DR-DOS For a deleted file, the original first character of the filename.
0x10 4 DR-DOS 7 For a deleted file, its original file time and date; deleted files have their
normal time and date fields set to the time of deletion
0x14 2 DR-DOS and FlexOS File permissions bitmap (execute permissions are only used by
FlexOS):
Microsoft applied for, and was granted, a series of patents for key parts of the FAT file system in the
mid-1990s. Being almost universally compatible and well-understood, FAT is frequently chosen as an
interchange format for flash media used in digital cameras and PDAs.
On 2003-12-03 Microsoft announced it would be offering licenses for use of its FAT specification and
"associated intellectual property", at the cost of a US$0.25 royalty per unit sold, with a $250,000
maximum royalty per license agreement.[28]
To this end, Microsoft cited four patents on the FAT file system as the basis of its intellectual property
claims. All four pertain to long-filename extensions to FAT first seen in Windows 95:
* U.S. Patent 5,745,902 - Method and system for accessing a file using file names having different file
name formats. Filed July 6, 1992. This covered a means of generating and associating a short, 8.3
filename with a long one (for example, "Microsoft.txt" with "MICROS~1.TXT") and a means of
enumerating conflicting short filenames (for example, "MICROS~2.TXT" and "MICROS~3.TXT"). It is
unclear whether this patent would cover an implementation of FAT without explicit long filename
capabilities. Hard links in Unix file systems do not appear to be prior art: deleting a FAT file via its long
name will also remove its short name. Renaming a file to a "short" name also updates the long file name
for coherency; similarly, renaming a file to a "long" name will allocate a new "short" name. In NTFS, hard
links and dual names are separate concepts and each hard link has two names. Finally, at the API level,
both names are always provided together when a directory lookup is requested from the system; they
do not appear as two separate files and do not have to be "matched" to determine unique files.
* U.S. Patent 5,579,517 - Common name space for long and short filenames. Filed on 1995-04-24.
This covers the method of chaining together multiple consecutive 8.3 named directory entries to hold
long filenames, with some of the entries specially marked to prevent their confusing older, long
filename-unaware FAT implementations.
o The Public Patent Foundation successfully challenged this patent; the claims were rejected[29]
on 2004-09-14, due to prior disclosure[30] of the claimed techniques in patents U.S. Patent 5,307,494
and U.S. Patent 5,367,671. This decision was later overturned by the Patent Office on 2006-01-10.
* U.S. Patent 5,758,352 - Common name space for long and short filenames. Filed on 1996-09-05.
This is very similar to 5,579,517.
o The Public Patent Foundation successfully challenged this patent; the USPTO rejected
this patent on 2005-10-05, on the grounds that "the six assignees names were incorrect".[31][32] This
decision was also later overturned by the Patent Office on 2006-01-10.
* U.S. Patent 6,286,013 - Method and system for providing a common name space for long and short
file names in an operating system. Filed on 1997-01-28. This makes claims on the methods used when
Windows 95, Windows 98 and Windows Me expose long filenames to their MS-DOS compatibility layer.
It does not appear to affect any non-Microsoft FAT implementations.
Many technical commentators have concluded that these patents only cover FAT implementations that
include support for long filenames, and that removable solid state media and consumer devices only
using short names would be unaffected.
Additionally, in the document "Microsoft Extensible Firmware Initiative FAT 32 File System Specification,
FAT: General Overview of On-Disk Format" published by Microsoft (version 1.03, 2000-12-06), Microsoft
specifically grants a number of rights, which many readers have interpreted as permitting operating
system vendors to implement FAT.
Microsoft is not the only company to have applied for patents for parts of the FAT file system. Other
patents affecting FAT include:
* U.S. Patent 5,367,671 - System for accessing extended object attribute (EA) data through file name
or EA handle linkages in path tables. Filed on 1990-09-25 by Barry A. Feigenbaum and Felix Miro of IBM,
this makes claims on the methods used by OS/2, Windows NT, and Linux for storing extended attribute
data in the "EA DATA. SF" file.
Appeal
As there was widespread call for these patents to be re-examined, the Public Patent Foundation
(PUBPAT) submitted evidence to the US Patent and Trademark Office (USPTO) disputing the validity of these
patents, including prior art references from Xerox and IBM. The USPTO acknowledged that the evidence
raised "substantial new question[s] of patentability," and opened an investigation into the validity of
Microsoft's FAT patents.[33]
On 2004-09-30 the USPTO rejected all claims of U.S. Patent 5,579,517, based primarily on evidence
provided by PUBPAT. Dan Ravicher, the foundation's executive director, said, "The Patent Office has
simply confirmed what we already knew for some time now, Microsoft's FAT patent is bogus."
According to the PUBPAT press release, "Microsoft still has the opportunity to respond to the Patent
Office's rejection. Typically, third party requests for reexamination, like the one filed by PUBPAT, are
successful in having the subject patent either narrowed or completely revoked roughly 70% of the time."
On 2005-10-05 the Patent Office announced that, following the re-examination process, it had again
rejected all claims of patent 5,579,517, and it additionally found U.S. Patent 5,758,352 invalid on the
grounds that the patent had incorrect assignees.
Finally, on 2006-01-10 the Patent Office ruled that features of Microsoft's implementation of the FAT
system were "novel and non-obvious", reversing both earlier non-final decisions.[34]