Chapter 8 - File Management
Sixth Edition
The File Manager
• The File Manager controls every file in the system.
• The File Manager (File Management System) is the
software responsible for creating, deleting, modifying,
and controlling access to files, as well as for
managing the resources used by the files.
• These functions are performed in collaboration with
the Device Manager.
(figures) File manager interfaces: This PC in Windows 10 and Nautilus in Ubuntu.
Responsibilities of the File Manager
• The File Manager performs four tasks:
– File storage tracking
• Keep track of where each file is stored.
– Policy implementation
• Determine where and how the files will be stored.
• Use the available storage space efficiently.
• Provide efficient access to the files.
– File allocation, if user access is cleared
• Allocate each file when a user has been cleared for access to
it, then record its use.
– File deallocation
• Deallocate the file when it is to be returned to storage.
• Communicate the file’s availability to others who may be
waiting for it.
Responsibilities of the File Manager (cont’d)
• The File Manager keeps track of its files with directories
that contain:
– the filename;
– its physical location in secondary storage;
– Other important information about each file.
Responsibilities of the File Manager (cont’d)
• File allocation
– Activate secondary storage device, load file into
memory, and update records
• File deallocation
– Update file tables, rewrite file (if revised), and notify
any processes waiting to access the file of its availability
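The allocation and deallocation steps can be sketched as a small bookkeeping class. This is only an illustration: the class name `ToyFileManager`, its methods, and the one-user-at-a-time rule are assumptions made for the example, and no real device or memory is touched.

```python
from collections import deque


class ToyFileManager:
    """Hypothetical bookkeeping model of file allocation and deallocation."""

    def __init__(self):
        self.in_use = {}   # filename -> process currently using the file
        self.waiting = {}  # filename -> queue of processes waiting for it
        self.log = []      # record of each use, as the slides require

    def allocate(self, filename, process, cleared=True):
        """Allocate the file only if the user is cleared and the file is free."""
        if not cleared:
            return False
        if filename in self.in_use:
            # Assumption: one process at a time; others wait their turn.
            self.waiting.setdefault(filename, deque()).append(process)
            return False
        self.in_use[filename] = process
        self.log.append(("allocate", filename, process))
        return True

    def deallocate(self, filename):
        """Release the file and return the processes notified of its availability."""
        self.in_use.pop(filename, None)
        self.log.append(("deallocate", filename))
        notified = list(self.waiting.pop(filename, []))
        for process in notified:
            self.log.append(("notify", filename, process))
        return notified
```

A process that asks for a busy file is queued, and `deallocate` returns the list of queued processes that were told the file is free again.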
Important Definitions
Let’s take a minute to define some basic file elements, illustrated in Figure
8.1, that relate to our discussion of the File Manager.
Important Definitions (cont’d)
• Field
– A group of related bytes that can be identified by the
user with a name, type, and size.
• Record
– A group of related fields.
• File
– A group of related records that contains information to
be used by specific application programs to generate
reports.
– Flat file
• No connections to other files, no dimensionality
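As a rough illustration of these definitions, the sketch below models fields, a record, and a flat file in Python. The `Employee` record, its field names, and the sample data are all invented for the example.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    """One record: a group of related fields (names are illustrative)."""
    name: str        # field holding text
    department: str  # field holding text
    salary: int      # field holding a whole number


# A flat file: a group of related records with no links to other files.
employee_file = [
    Employee("Adams", "Sales", 52000),
    Employee("Baker", "Sales", 48000),
    Employee("Chen", "Support", 45000),
]


def salary_report(records):
    """Total salary per department, the kind of report an application might generate."""
    totals = {}
    for record in records:
        totals[record.department] = totals.get(record.department, 0) + record.salary
    return totals
```

Calling `salary_report(employee_file)` walks every record in the file and reads two of its fields.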
Important Definitions (cont’d)
• Database
– Groups of related files that are interconnected at
various levels to give users flexibility of access to the
stored data.
– If the user’s database requires a specific structure,
the File Manager must be able to support it.
• Program files and data files
– Program files contain instructions.
– Data files contain data.
– As far as storage is concerned, the File Manager
treats them the same way.
Important Definitions (cont’d)
• Directories
– Special files with listings of filenames and their
attributes.
– Data gathered to monitor system performance and
provide for system accounting is collected into files.
– Every program and data file accessed by the
computer system, and every piece of computer
software, is treated as a file.
– The File Manager treats all files in exactly the same
way as far as storage is concerned.
Interacting with the File Manager
• Interactive Commands
– CREATE and DELETE - deal with the system’s knowledge
of the file.
– SAVE - the first time it is used, the file is actually created.
– OPEN NEW - within a program, indicates that the file must
be created.
– OPEN…FOR OUTPUT - creates the file by making an entry
for it in the directory and finding space for it in secondary
storage.
– RENAME - allows users to change the name of an existing file.
– COPY - allows users to make duplicate copies of existing
files.
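For comparison, here is how a program might issue roughly equivalent requests through Python's standard library. The mapping of each call to a command above is a loose analogy, and the file names and the `demo_commands` helper are made up for the example.

```python
import os
import shutil
import tempfile


def demo_commands():
    """Create, rename, copy, and delete a file inside a scratch directory."""
    workdir = tempfile.mkdtemp()

    # Like OPEN...FOR OUTPUT: opening for writing creates the file.
    original = os.path.join(workdir, "draft.txt")
    with open(original, "w") as handle:
        handle.write("hello")

    # Like RENAME: change the name of an existing file.
    renamed = os.path.join(workdir, "final.txt")
    os.rename(original, renamed)

    # Like COPY: make a duplicate of an existing file.
    duplicate = os.path.join(workdir, "final_copy.txt")
    shutil.copy(renamed, duplicate)

    # Like DELETE: remove the directory entry for the file.
    os.remove(renamed)

    remaining = sorted(os.listdir(workdir))
    shutil.rmtree(workdir)  # clean up the scratch directory
    return remaining
```

After the sequence runs, only the duplicate is left in the directory listing.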
Typical Volume Configuration
• Each volume in the system is given a name.
• The File Manager writes this name and other
descriptive information in an easy-to-access place
on each unit:
– The innermost part of the CD or DVD
– The beginning of the tape
– The first sector of the outermost track of the disk pack
Typical Volume Configuration (cont'd.)
Master file directory (MFD) - Database
• The MFD is stored immediately after the volume descriptor.
• It lists the names and characteristics of every file
contained in that volume.
– The filenames in the MFD can refer to:
• Program files
• Data files
• System files
– Subdirectories
• Listed only if the File Manager supports subdirectories.
– The remainder of the volume
• Used for file storage.
Typical Volume Configuration (cont'd.)
Master file directory (MFD) - Database
• Early operating systems supported only a single
directory per volume
– Created by the File Manager
– Contains the names of files
– Simple to implement and maintain
• Disadvantages
– Long search time for an individual file
– Directory space could fill up before the disk storage space did
– Users cannot create subdirectories
– Users cannot safeguard their files
– Each program needs a unique name
Introducing Subdirectories
• File Managers create an MFD for each volume that can
contain entries for both files and subdirectories.
• A subdirectory is created when a user opens an account
to access the computer system.
• Although the user directory is treated as a file, its entry in
the MFD is flagged to indicate to the File Manager that
this file is really a subdirectory and has unique
properties:
– Its records are filenames pointing to files.
• Users still can’t group their files in a logical order to improve
the accessibility and efficiency of the system.
Introducing Subdirectories (cont'd.)
• Today’s File Managers encourage users to create
subdirectories, so related files can be grouped together.
– Folders
• Tree structures allow the system to efficiently search
individual directories because there are fewer entries in
each directory.
– However, the path to the requested file may lead through
several directories.
• When the user wants to access a specific file:
– The filename is sent to the File Manager.
– The File Manager first searches the MFD for the user’s
directory.
– It then searches the user’s directory and any subdirectories
for the requested file and its location.
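The search through the MFD and the user's subdirectories can be sketched with nested dictionaries. The structure, the user and file names, and the block numbers below are invented; real directories are stored on the volume, not in a Python dict.

```python
# Hypothetical MFD: each directory maps a name to either a subdirectory
# (another dict) or a file's location (here, just a block number).
mfd = {
    "alice": {
        "reports": {"q1.txt": 4021},
        "notes.txt": 1187,
    },
    "bob": {"todo.txt": 977},
}


def locate(path):
    """Follow a path such as 'alice/reports/q1.txt' one directory at a time.

    Returns the stored location of a file, the dict for a directory,
    or None if any component of the path is missing.
    """
    node = mfd
    for name in path.split("/"):
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node
```

Each directory holds only a few entries, so each step is a short search, but a deep path needs several steps.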
Introducing Subdirectories (cont'd.)
File Descriptor (view Details)
• Filename – Within a single directory, filenames must be
unique; in some OS, the filenames are case sensitive.
• File type – The organization and usage that are
dependent on the system (files and directories).
• File size – The size is kept here for convenience.
• File location – Identification of the first physical block (or
all blocks) where the file is stored.
• Date and time of creation
• Owner
• Protection information - Access restrictions.
• Record size – Its fixed size or its maximum size.
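Several of these descriptor fields can be inspected from a program. The sketch below uses Python's `os.stat`; the `describe` helper and its dictionary keys are names chosen for the example, and it reports the modification time because a creation time is not available portably through this call.

```python
import os
import time


def describe(path):
    """Collect a few descriptor-style attributes for one file or directory."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "is_directory": os.path.isdir(path),            # file type
        "size_bytes": info.st_size,                     # file size
        "modified": time.ctime(info.st_mtime),          # last modification time
        "owner_id": info.st_uid,                        # numeric owner (0 on Windows)
        "permissions": oct(info.st_mode & 0o777),       # protection bits
    }
```

The physical block locations and the record size are not exposed here; they stay inside the file system.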
File-Naming Conventions
• Two components common to many filenames are:
– A relative filename
– An extension
Relative Filename
• Can vary in length from one to many characters.
• Can include letters of the alphabet and digits.
• Every OS has specific rules that affect the length of the
relative name and the types of characters allowed.
– E.g., MS-DOS allows names of 1 to 8 alphanumeric
characters without spaces.
– More modern operating systems allow names with dozens of
characters, including spaces.
File-Naming Conventions (cont'd.)
Extensions
• Some OS require an extension that’s appended to the
relative filename.
• An extension is usually two to four characters long; its purpose
is to identify the type of file or its contents.
– basia_tune.MP3
– take out menu.DOCX
• If an extension is incorrect or unknown, user intervention
is required (Figure 8.5).
• Some extensions (.EXE, .BAT, .COB, .FOR) are
restricted by certain operating systems because they serve
as a signal to the system to use a specific compiler or
program to run these files.
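Splitting a filename into its relative name and extension is a one-line operation in most languages. The sketch below uses Python's `os.path.splitext`; the `split_filename` wrapper is a name made up for the example.

```python
import os


def split_filename(filename):
    """Return (relative name, extension) for a filename without a path."""
    relative_name, extension = os.path.splitext(filename)
    return relative_name, extension


examples = ["basia_tune.MP3", "take out menu.DOCX", "README"]
parts = [split_filename(name) for name in examples]
```

A name with no extension, such as `README`, comes back with an empty extension string.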
File-Naming Conventions (cont'd.)
(table 8.1)
File name parameters for several operating systems.
File Organization: Record Format
Fixed-length records - easiest to access directly
• Most common type and ideal for data files.
• Record size is critical: if it is too small, data is truncated;
if it is too large, storage space is wasted.
(figure 8.6)
Data stored in fixed-length fields (top) that extends beyond the field limit is
truncated. Data stored in variable-length fields (bottom) is not truncated.
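The truncation and the easy direct access can both be seen in a few lines. This is a minimal sketch that treats a file as one long string; the 10-character width and the sample values are assumptions chosen for the example.

```python
RECORD_WIDTH = 10  # assumed fixed record width, in characters


def store_fixed(value, width=RECORD_WIDTH):
    """Truncate or pad a value so it occupies exactly `width` characters."""
    return value[:width].ljust(width)


def read_fixed(data, record_number, width=RECORD_WIDTH):
    """Read record n (counting from 0) by computing where it starts."""
    start = record_number * width
    return data[start:start + width]


values = ["Dan", "Whitestone Avenue"]
fixed_file = "".join(store_fixed(value) for value in values)
```

The short value wastes seven characters of padding, the long one loses everything past the tenth character, and any record can be found by multiplying its number by the record width.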
Physical File Organization
• Describes the way records are arranged and the
characteristics of the medium used to store them.
• On magnetic disks (hard drives), files can be
organized as sequential, direct, or indexed
sequential.
• Considerations when selecting a file organization scheme:
– Volatility of the data - the frequency with which additions and
deletions are made.
– Activity of the file - the percentage of records processed during
a given run.
– Size of the file.
– Response time - the amount of time the user is willing to wait
before the requested operation is completed.
Physical File Organization (cont'd.)
Sequential record organization
• Records stored and retrieved serially
– One after the other
• Easiest to implement
• File search: starts at the beginning and continues until the record is found
• Optimization features may be built into the system
– Select a key field from the record and sort the records by it before storage
– This complicates maintenance algorithms, because the original
order must be preserved when records are added or deleted
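A sequential search can be sketched as a simple loop. The record layout (a dict with an `id` key field) and the `sorted_by_key` option are assumptions for the example; the option shows how sorting by key lets the search stop early.

```python
def sequential_search(records, key, sorted_by_key=False):
    """Scan records from the beginning; return (position, record) or None."""
    for position, record in enumerate(records):
        if record["id"] == key:
            return position, record
        if sorted_by_key and record["id"] > key:
            break  # in a sorted file, the key cannot appear later
    return None


records = [
    {"id": 105, "name": "Adams"},
    {"id": 110, "name": "Baker"},
    {"id": 120, "name": "Chen"},
]
```

Without sorting, a search for a missing key has to read every record in the file.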
Physical File Organization (cont'd.)
Direct record organization
• Uses direct access files, which can be implemented
only on direct access storage devices.
• Gives users the flexibility of accessing any record in
any order without having to begin the search from the
beginning of the file.
• Records are identified by their relative addresses
– Known as logical addresses
– Computed when records are stored and retrieved
• Hashing algorithms
– Transform each key into a number
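One common way to transform a key into a relative address is to divide it by the number of available record slots and keep the remainder. The slides do not name a particular algorithm, so the sketch below, including the slot count of 101, is only an assumed example.

```python
SLOTS = 101  # assumed number of record slots in the file (a prime)


def relative_address(key, slots=SLOTS):
    """Hash a numeric key to a logical address in the range 0..slots-1."""
    return key % slots
```

The same calculation is used to store a record and later to find it, so no search through the file is needed. For instance, `relative_address(12345)` is 23.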
Physical File Organization (cont'd.)
Direct record organization (cont'd.)
• Advantages
– Fast record access
– Sequential access is possible by starting at the first relative
address and incrementing it to reach the next record
– Updated more quickly than sequential files
– No need to preserve the order of the records
– Adding and deleting records is quick
• Disadvantages
– Hashing algorithm collision: records with unique keys
may generate the same logical address
Physical File Organization (cont'd.)
Direct Record Organization:
• Several records with unique keys may generate the
same logical address (collision).
• When that happens, the program generates another logical
address before presenting it to the File Manager for storage.
• Colliding records are stored in an overflow area and reached via links.
• The File Manager handles the physical allocation of space.
• The maximum file size is established when the file is created;
eventually the file becomes full or too many records are stored
in the overflow area.
• The programmer must then reorganize and rewrite the file.
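The overflow area and its links can be sketched in memory as follows. The `DirectFile` class, the remainder-based hash, and the record layout are assumptions for the example, and it does not handle storing the same key twice.

```python
class DirectFile:
    """Hypothetical direct file: a main area plus a linked overflow area."""

    def __init__(self, slots):
        self.slots = slots
        self.main = [None] * slots  # main area, one record per logical address
        self.overflow = []          # overflow area for colliding records

    def _address(self, key):
        return key % self.slots     # assumed hashing algorithm

    def store(self, key, data):
        new_record = {"key": key, "data": data, "next": None}
        address = self._address(key)
        if self.main[address] is None:
            self.main[address] = new_record
            return
        # Collision: follow the links to the end of the chain, then append.
        record = self.main[address]
        while record["next"] is not None:
            record = self.overflow[record["next"]]
        self.overflow.append(new_record)
        record["next"] = len(self.overflow) - 1

    def fetch(self, key):
        record = self.main[self._address(key)]
        while record is not None:
            if record["key"] == key:
                return record["data"]
            link = record["next"]
            record = self.overflow[link] if link is not None else None
        return None
```

Every extra record in a chain adds one more read to a lookup, which is why a file with a crowded overflow area eventually has to be reorganized.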
Physical File Organization (cont'd.)
Indexed Sequential Record Organization
• Combines the best of sequential and direct access.
• Created and maintained through an Indexed Sequential
Access Method (ISAM) application.
• Doesn’t create collisions because it doesn’t use the
results of the hashing algorithm to generate a record’s
address.
– Instead, it uses this information to generate an index file
through which the records are retrieved.
• Divides ordered sequential file into blocks of equal size.
– Size determined by the File Manager to take advantage of the physical storage devices
Physical File Organization (cont'd.)
Indexed Sequential Record Organization (cont’d)
• Each entry in the index file contains the highest record key and
the physical location of the data block where this record, and
the records with smaller keys, are stored.
• To access any record in the file, the system begins by
searching the index file and then goes to the physical location
indicated at that entry.
• Overflow areas are spread throughout the file
– Existing records can expand, and new records can be located in
close physical and logical sequence.
– A last-resort overflow area is located apart from the main data
area but is used only when the other overflow areas are
completely filled.
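The index lookup can be sketched with a sorted list of highest keys, one per block. The blocks, keys, and names below are invented, list positions stand in for physical locations, and overflow areas are left out to keep the example short.

```python
from bisect import bisect_left

# Hypothetical data blocks of equal size, with records in key order.
blocks = [
    [(105, "Adams"), (110, "Baker"), (120, "Chen")],
    [(130, "Diaz"), (145, "Evans"), (150, "Ford")],
    [(160, "Gupta"), (175, "Hall"), (190, "Ito")],
]

# Index file: the highest key in each block, in the same order as the blocks.
index_keys = [block[-1][0] for block in blocks]


def find(key):
    """Search the index first, then scan only the one block it points to."""
    block_number = bisect_left(index_keys, key)
    if block_number == len(index_keys):
        return None  # key is larger than every key in the file
    for record_key, data in blocks[block_number]:
        if record_key == key:
            return data
    return None
```

Only one block is read per lookup, and reading the blocks in order still gives sequential access to the whole file.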
Physical File Organization (cont'd.)
Indexed Sequential Record Organization (cont’d)
• When retrieval time becomes too slow, the file has to be
reorganized.
– Usually performed by maintenance software.
• For most dynamic files, indexed sequential is the
organization of choice because it allows both direct
access to a few requested records and sequential access
to many records.
• A variation of indexed sequential files is the B-tree.
Physical Storage Allocation
• The File Manager must work with files not just as
whole units but also as logical units.
• Records within a file must have the same format but
they can vary in length.
• Records are subdivided into fields.
• Their structure is managed by application programs
and not the OS.
– An exception is made for those systems that are
heavily oriented to database applications, where the
File Manager handles field structure.
• When we talk about file storage, we are actually
referring to record storage.
• The File Manager and the Device Manager have to cooperate
to ensure successful record storage and retrieval.
Physical Storage Allocation (cont'd.)
\[
\text{block number} = \text{integer}\left[\frac{\text{byte address}}{\text{physical block size}}\right] + \text{address of the first physical record}
\]
\[
\text{offset} = \text{remainder}\left[\frac{\text{byte address}}{\text{physical block size}}\right]
\]
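The two formulas amount to one integer division. The sketch below is a direct translation; the function name `locate_byte` and the sample numbers are chosen for the example.

```python
def locate_byte(byte_address, block_size, first_block=0):
    """Return (block number, offset within that block) for a byte address.

    first_block is the address of the file's first physical record.
    """
    whole_blocks, offset = divmod(byte_address, block_size)
    return first_block + whole_blocks, offset
```

With 512-byte blocks and a file whose first physical record is at block 100, byte 1500 of the file lies in block 102 at offset 476, since 1500 = 2 × 512 + 476.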
ADAMSbbbbbbbbbb
ADAMSb10
300000000
3#8