DBMS Chapter 4 Record Organization and File Management

The document discusses primary and secondary storage, including RAM, cache memory, hard drives, and optical discs. It describes memory hierarchies and storage devices. Key terms related to hard drive hardware are defined. RAID levels and how they provide redundancy and performance are covered. The structure and types of records, files, and different file organizations like heap, ordered, and hash files are explained.

Uploaded by

Sultan Jenbo


Chapter 4

Record Storage & Primary File Organizations
Storage
• There are two general types of storage media used with computers:
• Primary Storage: includes all storage media that can be operated
on directly by the CPU (RAM, L1 and L2 cache memory).
– The first-level (L1) cache is the fastest memory in the computer
and the closest to the processor.
– The second-level (L2) cache is also built from SRAM but is larger,
and therefore slower, than the L1 cache. The processor looks for
data in the L1 cache first and goes to the L2 cache only on a miss.
• Secondary Storage: includes hard drives, CDs and tape.
Memory Hierarchies & Storage Devices

• The memory hierarchy is ordered by speed of access.

• Speed comes with a price tag attached: the cost per byte varies
inversely with the access time, so faster memory is more expensive.
Primary Storage Level of Memory

• The primary storage level of memory is generally made up of 3 levels:
– L1 cache, which is located on the CPU
– L2 cache, which is located near the CPU
– Main memory, the RAM that is often quoted in computer
advertisements
Secondary Storage Level of Memory

• The secondary storage level of memory may be made up of 4 levels:

– Flash memory or EEPROM
– Hard drives
– CD-ROMs
– Tape
Figure 1.1
Terms Used in the HW Description of Hard Drives

• Capacity - The number of bytes the drive can store.

• Single-sided vs. Double-sided - States whether the disk/platter is
written on one or both sides.
• Disk Pack - A collection of disks/platters that are assembled
together into a pack.
• Track - A circle of small width on a disk surface.
– A disk surface has many tracks.
Terms Used in the HW Description of Hard Drives

• Sector - A segment or arc of a track.

• Block - A division of a track into equal-sized portions, set by the
operating system.
• Interblock Gaps - Fixed-size segments that separate the blocks.
• Read/Write Head - The component that actually reads information from
and writes it to the disk.
• Cylinder - The set of tracks with the same diameter on all the disk
surfaces of a disk pack.
Parallelizing Disk Access Using RAID

• RAID - Stands for Redundant Arrays of Inexpensive Disks or
Redundant Arrays of Independent Disks.
• It is a way of logically combining multiple disks into a single
array in which the disks work together.
• RAID is used to provide increased reliability, increased
performance, or both.
RAID Levels
• Level 0 - has no redundancy and therefore the best write performance,
but its read performance is not as good as that of level 1.
• Level 1 - uses mirrored disks, which provide redundancy and
improved read performance.
• Level 2 - provides redundancy using Hamming codes.
• A Hamming code uses extra parity bits to allow an error to be
identified. To build one, number the bit positions starting from 1
and write them in binary form (1, 10, 11, 100, etc.); the positions
that are powers of two hold the parity bits.
RAID Levels

• Level 3 - uses a single parity disk.

• Levels 4 and 5 - use block-level data striping, with level 5
distributing the data and the parity information across all the disks.
• Level 6 - uses the P + Q redundancy scheme, which makes use of
Reed-Solomon codes to protect against the failure of up to 2 disks.
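The single-parity idea behind levels 3 to 5 can be illustrated with a small sketch. It assumes parity is the byte-wise XOR of equal-length data blocks, which is the usual scheme but is not spelled out in these slides; the function names and the sample block contents are hypothetical.

```python
from functools import reduce

def parity_block(blocks):
    """XOR equal-length blocks byte by byte to get the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return parity_block(surviving_blocks + [parity])

# Hypothetical contents of three data disks (one block each).
data = [b"DISK", b"FILE", b"HASH"]
parity = parity_block(data)

# Pretend the second disk failed and rebuild it from the other two plus parity.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered)
```

Because XOR is its own inverse, one lost block can always be recomputed this way. Surviving two simultaneous failures needs a second, independent code, which is the role of the P + Q scheme in level 6.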
Records

• A record is a collection of related values or items.
• Each value or item is stored in a field of a specific data type.
• Records may be of either fixed or variable length.
Variable Length Records in Files
• Records of the same record type may still be of variable length,
for several reasons:
– Variable-length fields
– Repeating fields
• For efficiency reasons, different record types may be
clustered in a file.
Spanned Vs. Unspanned Records
• Unspanned Records: Each record must fit entirely within one block.
This suits records that are small compared with the block size.
• Spanned Records: Portions of a single record may lie in
different blocks. This is required when a record is larger than a block.
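The space trade-off between the two layouts can be estimated with a short calculation. This is a rough sketch that assumes fixed-length records and ignores the pointer a spanned record needs at the end of a block; the function names and the sample sizes are illustrative only.

```python
import math

def blocks_unspanned(num_records, record_size, block_size):
    """Blocks needed when no record may cross a block boundary."""
    per_block = block_size // record_size      # whole records per block
    if per_block == 0:
        raise ValueError("record is larger than a block; it must be spanned")
    return math.ceil(num_records / per_block)

def blocks_spanned(num_records, record_size, block_size):
    """Blocks needed when records may continue into the next block."""
    return math.ceil(num_records * record_size / block_size)

# Hypothetical file: 1000 records of 100 bytes on 512-byte blocks.
print(blocks_unspanned(1000, 100, 512))
print(blocks_spanned(1000, 100, 512))
```

With these numbers each unspanned block holds 5 records and leaves 12 bytes unused, which is why the unspanned layout needs a few more blocks.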
File Operations
• A file may be stored either in contiguous blocks or in blocks that
are linked together.
• There are advantages and disadvantages to both methods.
• Operations on files can be grouped into two types:
retrieval and update.
• Retrieval involves only reads, while an update involves reading,
modifying and writing.
File Structure

• Heap (Pile) Files

• Ordered (Sorted) Files
• Hash (Direct) Files
• B - Trees
Heap (Pile) Files
• A heap file is an unordered set of records, stored on a set of pages.
• It provides basic support for inserting, selecting, updating,
and deleting records.
• Insertion - Very efficient (new records are simply added at the end of the file)
• Search - Very inefficient (linear search)
• Deletion - Very inefficient (the record must first be found by a linear search)
• Temporary heap files are used for external sorting and in other
relational operators.
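The contrast between cheap insertion and expensive search can be sketched as follows. This is a toy in-memory model, not a real storage engine: Python lists stand in for disk pages, and the class name, page capacity and record layout are all assumptions made for illustration.

```python
class HeapFile:
    """Unordered records kept on fixed-capacity pages (lists stand in for disk pages)."""

    def __init__(self, records_per_page=4):
        self.records_per_page = records_per_page
        self.pages = [[]]

    def insert(self, record):
        # New records always go at the end of the file, so no search is needed.
        if len(self.pages[-1]) == self.records_per_page:
            self.pages.append([])
        self.pages[-1].append(record)

    def search(self, field, value):
        # Linear search: pages are read in order until a match is found.
        for pages_read, page in enumerate(self.pages, start=1):
            for record in page:
                if record[field] == value:
                    return record, pages_read
        return None, len(self.pages)

heap = HeapFile()
for i in range(10):
    heap.insert({"id": i, "name": f"rec{i}"})

found, pages_read = heap.search("id", 9)
print(found, pages_read)
```

The record inserted last is only found after every page has been read, which is the worst case of the linear search.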
Ordered (Sorted Files) Records

• A sorted file is one in which records are stored in order of the
values of one field (e.g., ID number) or of the concatenation of
several fields (e.g., first and last names). The sort field is
sometimes called a key of the file.
• The field that determines the order of the records is called the
ordering field.
• If the ordering field is also a key field, then it is better
described as an ordering key.
Advantages of Ordered Files

• Reading the records in order of the ordering field is extremely
efficient.
• Finding the next record is fast.
• Finding records based on a query on the ordering field is efficient
(binary search).
• Binary search may be done on the blocks as well.
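A binary search over blocks, rather than over individual records, might look like the sketch below. It assumes each block is a non-empty sorted list of ordering-key values and counts every examined block as one block read; the function name and sample data are hypothetical.

```python
def find_block(blocks, key):
    """Binary search on blocks of an ordered file.

    Returns (block_number or None, number_of_blocks_read).
    """
    lo, hi = 0, len(blocks) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]          # one block access
        reads += 1
        if key < block[0]:
            hi = mid - 1             # key can only be in an earlier block
        elif key > block[-1]:
            lo = mid + 1             # key can only be in a later block
        else:
            return (mid if key in block else None), reads
    return None, reads

# Hypothetical ordered file: 16 blocks holding keys 0..63, four per block.
blocks = [list(range(start, start + 4)) for start in range(0, 64, 4)]
print(find_block(blocks, 37))
```

For a file of b blocks this reads about log2(b) blocks, compared with about b/2 on average for a linear search of a heap file.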
Disadvantages of Ordered Files

• Searches on non-ordering fields are inefficient.

• Insertion and deletion of records are very expensive.
Hashing Techniques
• Hashing is a technique for directly computing the location of the desired data on the disk.

• It is used to index and retrieve items in a database because it is faster to find an
item using the shorter hashed key than using its original value.

• A record's placement is determined by the value in its hash field.

• A hash (or randomizing) function is applied to this value and yields the
address of the disk block where the record is stored.

• For most records, a single block access is enough to retrieve the record.
Internal Hashing

• Internal hashing is implemented as a hash table through the use of
an array of records (in memory).
• The array has an index range of 0 to M-1. A function that transforms
the hash field value into an integer between 0 and M-1 is used.
Internal Hashing (con’t)

• A collision occurs when the hash field value of a record being inserted
hashes to an address that already contains a different record.

• The process of finding another position for this record is called
collision resolution.
Collision Resolution

• Open Addressing - Places the record to be inserted in the first
available position after the hash address.
• Chaining - A pointer field is added to each record location. When
an overflow occurs, this pointer is set to point to an overflow
location, so that the records sharing a hash address form a linked list.
Collision Resolution (con’t)

• Multiple hashing - If an overflow occurs, a second hash
function is used to find a new location.
– If that location is also filled, either another hash function is applied
or open addressing is used.
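Chaining can be sketched in a few lines. This is a simplified model, assuming integer keys and the mod function as the hash; a Python list per address stands in for the pointer-linked overflow locations, and the class and method names are hypothetical.

```python
class ChainedHashTable:
    """Internal hash table where each address keeps a chain of (key, record) pairs."""

    def __init__(self, m=7):
        self.m = m
        self.slots = [[] for _ in range(m)]   # one chain per hash address

    def insert(self, key, record):
        chain = self.slots[key % self.m]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, record)      # same key: overwrite the record
                return
        chain.append((key, record))           # collision or empty: extend the chain

    def get(self, key):
        for existing_key, record in self.slots[key % self.m]:
            if existing_key == key:
                return record
        return None

table = ChainedHashTable(m=7)
for key, record in [(10, "a"), (17, "b"), (24, "c")]:   # all three hash to address 3
    table.insert(key, record)
print(table.get(17), len(table.slots[3]))
```

Unlike open addressing, a collision here never displaces records at other addresses; the cost is the extra pointer space and a walk along the chain on lookup.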
Goals of the Hash Function

• The goals of a good hash function are to distribute the records
uniformly over the address space, so as to minimize collisions, while
not leaving many locations unused.
• Research has shown:
– A fill ratio of 70% to 90% works best.
– When a mod function is used, M should be a prime number.
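Both guidelines can be combined when sizing a table. The sketch below picks the smallest prime M that keeps the fill ratio at or below a chosen target; the helper names and the default target of 0.8 are assumptions chosen from within the 70% to 90% range, not values given in the slides.

```python
import math

def is_prime(n):
    """Trial-division primality test, adequate for table sizes."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def choose_table_size(num_records, fill_ratio=0.8):
    """Smallest prime M such that num_records / M does not exceed fill_ratio."""
    m = math.ceil(num_records / fill_ratio)
    while not is_prime(m):
        m += 1
    return m

m = choose_table_size(100)
print(m, round(100 / m, 2))
```

For 100 records this gives a table that is a little under 80% full, leaving enough free addresses to keep collision chains or probe sequences short.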
External Hashing for Disk Files

• External hashing makes use of buckets, each of which can hold
multiple records.
• A bucket is either one disk block or a cluster of contiguous blocks.
• The hash function maps a key into a relative bucket number,
rather than an absolute block address for the bucket.
Types of External Hashing
• Using a fixed address space is called static hashing.
• Dynamically changing address space:
– Extendible hashing (with a directory)
– Linear hashing (without a directory)
Overflow (Bucket Splitting)

• When an overflow occurs in a bucket, that bucket is split.

• This is done by dynamically allocating a new bucket and
redistributing the contents of the old bucket between the old and
new buckets based on the increased local depth d’+1 of both these
buckets.
• Here d’ is the local depth of the bucket before the split, and d is
the global depth of the directory.
Overflow (Bucket Splitting)

• Now the new bucket’s address must be added to the directory.

• If the overflow occurred in a bucket whose new local depth (after the
split) is less than or equal to the global depth d, only the directory
entries need to be adjusted. (No change in the directory size is made.)
Overflow (Bucket Splitting)

• If the overflow occurred in a bucket whose new local depth (after
the split) is greater than the global depth d, the global depth must
be increased accordingly.
• Each time d is increased by 1, the directory doubles in size and
its entries need appropriate adjustment.
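The splitting rules above can be sketched as a small in-memory model. Several choices here are assumptions rather than anything stated in the slides: keys are distinct non-negative integers used directly as hash values, the directory is indexed by the low-order d bits of the key, and the class names and bucket capacity are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []

class ExtendibleHash:
    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]          # 2**global_depth entries

    def _index(self, key):
        # Assumption: the low-order global_depth bits select the directory entry.
        return key & ((1 << self.global_depth) - 1)

    def contains(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        # Overflow. If the new local depth would exceed the global depth,
        # double the directory first (entry i and entry i + old_size share a bucket).
        if bucket.local_depth == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        new_bucket = Bucket(bucket.local_depth)
        split_bit = 1 << (bucket.local_depth - 1)
        # Entries for the old bucket whose newly significant bit is 1 now
        # point to the new bucket; the directory size does not change here.
        for i, entry in enumerate(self.directory):
            if entry is bucket and i & split_bit:
                self.directory[i] = new_bucket
        # Redistribute the old contents plus the new key (may split again).
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys + [key]:
            self.insert(k)

table = ExtendibleHash(bucket_capacity=2)
for k in range(1, 7):
    table.insert(k)
print(table.global_depth, len(table.directory))
```

In this run the inserts of 3 and 5 double the directory, while the insert of 6 overflows a bucket whose local depth is below the global depth, so only directory entries are redirected.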
Linear Hashing
• Linear hashing allows the hash file to expand and shrink its
number of buckets dynamically without needing a directory.
• It starts with M buckets numbered 0 to M-1 and uses the mod hash
function

h(K) = K mod M as the initial hash function, called h_i.

Hashing Example with Open Addressing

Hash function: h(K) = K mod M, where K is the field value and
M is the size of the address space.
This makes the range of values of the hash function (0 to M-1) match
the address space.
M = 9
h(K) = K mod M

K     h(K)
30    3
45    0
24    6
25    7
36    0
54    0

Keys 45, 36 and 54 all hash to address 0, so 36 and 54 collide and
must be placed elsewhere by open addressing.
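The example can be carried through to the final table with a short sketch of open addressing. It assumes linear probing, one position at a time with wrap-around at the end of the array, which is one common reading of "the first available position" after the hash address; the function name is hypothetical.

```python
def insert_open_addressing(table, key):
    """Place key at its hash address, or at the next free position after it."""
    m = len(table)
    home = key % m
    for step in range(m):
        slot = (home + step) % m      # wrap around at the end of the array
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

M = 9
table = [None] * M
placement = {k: insert_open_addressing(table, k) for k in [30, 45, 24, 25, 36, 54]}
print(placement)
print(table)
```

Key 36 finds address 0 taken by 45 and moves to address 1; key 54 then finds both 0 and 1 taken and ends up at address 2.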
Index Structure for Files
• Types of single-level ordered index:
– A dense index has an index entry for every search key value in the data file.
– A sparse (or nondense) index, on the other hand, has index entries for only
some of the search values. A sparse index has fewer entries than the number
of records in the file.
• Two basic kinds of indices:
– Ordered indices: search keys are stored in sorted order.
– Hash indices: search keys are distributed uniformly across “buckets” using a
“hash function”.
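The difference between a dense and a sparse index can be seen on a tiny ordered file. This sketch assumes the sparse index keeps one entry per block, holding the first key of that block; the block contents and names are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical ordered data file: three blocks of search key values.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry per search key value.
dense_index = {key: block_no for block_no, block in enumerate(blocks) for key in block}

# Sparse index: one entry per block (first key in the block, block number).
sparse_index = [(block[0], block_no) for block_no, block in enumerate(blocks)]

def lookup_sparse(key):
    """Find the last index entry with anchor <= key, then search that block."""
    anchors = [anchor for anchor, _ in sparse_index]
    pos = bisect_right(anchors, key) - 1
    if pos < 0:
        return None                       # key is smaller than every anchor
    block_no = sparse_index[pos][1]
    return block_no if key in blocks[block_no] else None

print(len(dense_index), len(sparse_index))
print(dense_index[50], lookup_sparse(50))
```

The sparse index is smaller, but it only works because the data file is ordered on the search key, and every lookup still has to search inside the chosen block.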
Reading Assignment

Dynamic Multilevel Indexes using B-Trees and B+-Trees

Multiple Indexes
