DS_TM_Study_Material_Presentations_Unit-4_1TM

The document discusses file structures in data management, detailing what a file is, types of file organizations, and their respective advantages and disadvantages. It covers sequential files, hashing, and indexing methods for efficient record retrieval, including primary, clustering, and secondary indexes. The document emphasizes the importance of choosing the right file organization based on the operations required for data access and manipulation.

Data Structures (DS)

Unit-4
Hashing & File Structure
(File Structure)
What is a File?
• A file is a collection of records, where a record consists of one or more fields. Each record contains the same sequence of fields.
• Each field is normally of fixed length.
• A sample file with four records is shown below:

Name      Roll No.  Year  Marks
AMIT      1000      1     82
KALPESH   1005      2     54
JITENDRA  1009      1     75
RAVI      1010      1     79

• There are four records.
• There are four fields (Name, Roll No., Year, Marks).
• Records can be uniquely identified by the field 'Roll No.'. Therefore, Roll No. is the key field.
• A database is a collection of files.
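The fixed-length record layout described above can be sketched in Python with the standard `struct` module. The field widths (a 10-byte name and 4-byte integers for the other fields) and the helper names are assumptions chosen for illustration; the slides do not specify them.

```python
import struct

# Assumed layout: 10-byte name, then three 4-byte unsigned ints
# (roll no., year, marks). "<" disables padding, so every record is 22 bytes.
RECORD = struct.Struct("<10sIII")

def pack_record(name, roll_no, year, marks):
    """Encode one record as a fixed-length byte string."""
    return RECORD.pack(name.encode("ascii"), roll_no, year, marks)

def unpack_record(raw):
    """Decode a fixed-length byte string back into its four fields."""
    name, roll_no, year, marks = RECORD.unpack(raw)
    return name.rstrip(b"\x00").decode("ascii"), roll_no, year, marks

rows = [("AMIT", 1000, 1, 82), ("KALPESH", 1005, 2, 54),
        ("JITENDRA", 1009, 1, 75), ("RAVI", 1010, 1, 79)]
file_bytes = b"".join(pack_record(*r) for r in rows)

def read_nth(data, n):
    """Fetch record n by offset arithmetic, which works because all records have the same length."""
    start = n * RECORD.size
    return unpack_record(data[start:start + RECORD.size])

print(read_nth(file_bytes, 2))
```

Because every record occupies the same number of bytes, the n-th record starts at offset n times the record size, with no scanning needed.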

File Organizations
• File Organizations
1. Sequential files
2. Relative files
3. Direct files
4. Indexed Sequential files
5. Index files
• Primitive Operations on a File
1. Creation
2. Reading
3. Insertion
4. Deletion
5. Updation
6. Searching

Sequential Files
• It is the most common type of file.
• A fixed format is used for records.
• All records are of the same length.
• The position of each field in a record and the length of each field are fixed.
• Records are physically ordered on the value of one of the fields, called the ordering field.

Block 1
Name      Roll No.  Year  Marks
AMIT      1000      1     82
KALPESH   1005      1     54
JITENDRA  1009      1     75
RAVI      1010      1     79

Block 2
Name      Roll No.  Year  Marks
RAMESH    1015      1     75
ROHIT     1025      1     65
JANAK     1026      1     75
AMAR      1029      1     79

Advantages of Sequential Files
• Reading records in order of the ordering key is extremely efficient.
• Finding the next record in order of the ordering key usually does not require an additional block access. The next record may be found in the same block.
• Searching on the ordering key is much faster. Binary search can be used; it requires about $\log_2 b$ block accesses, where b is the total number of blocks in the file.
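The block-level binary search mentioned above can be sketched as follows. This is a minimal illustration, assuming each block is a Python list of `(key, payload)` tuples sorted on the key; the function name `block_binary_search` and the in-memory representation are hypothetical stand-ins for real disk blocks.

```python
def block_binary_search(blocks, key):
    """Binary search over blocks of (key, payload) records sorted on key.

    Returns (record or None, number of blocks read).
    """
    lo, hi, reads = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]      # stands in for one block access
        reads += 1
        if key < block[0][0]:
            hi = mid - 1         # key lies before this block
        elif key > block[-1][0]:
            lo = mid + 1         # key lies after this block
        else:
            # The key, if present, must be inside this block.
            for rec in block:
                if rec[0] == key:
                    return rec, reads
            return None, reads
    return None, reads

blocks = [
    [(1000, "AMIT"), (1005, "KALPESH"), (1009, "JITENDRA"), (1010, "RAVI")],
    [(1015, "RAMESH"), (1025, "ROHIT"), (1026, "JANAK"), (1029, "AMAR")],
]
print(block_binary_search(blocks, 1026))
```

Each loop iteration reads one block and halves the remaining range, so the number of reads grows with the logarithm of the number of blocks rather than with the number of records.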

Disadvantages of Sequential Files
• A sequential file does not give any advantage when the search is carried out on a non-ordering field.
• Inserting a record is an expensive operation. It requires finding the place of insertion, and then all records after that position must be moved to create space for the new record. This can be very expensive for large files.
• Deleting a record is an expensive operation. Deletion too requires movement of records.
• Modifying the value of the ordering field can be time consuming. Changing the ordering field means the record can change its position, which requires deleting the old record and then inserting the modified record.
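The cost of insertion described above can be illustrated with a small sketch that counts how many records have to move. It assumes the file is modelled as an in-memory Python list sorted on its first field; `insert_sorted` is a hypothetical helper, not part of any library.

```python
import bisect

def insert_sorted(records, new_record):
    """Insert into a list kept sorted on the first field.

    Returns the number of existing records that had to shift by one slot.
    """
    keys = [r[0] for r in records]
    pos = bisect.bisect_left(keys, new_record[0])   # place of insertion
    shifted = len(records) - pos                    # everything from pos onward moves
    records.insert(pos, new_record)
    return shifted

records = [(1000, "AMIT"), (1005, "KALPESH"), (1009, "JITENDRA"), (1010, "RAVI")]
print(insert_sorted(records, (1003, "NEW")))
```

Inserting near the front of the file shifts almost every record, which is why the operation becomes expensive as the file grows.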

Hashing (Direct file organization)
[Figure: Hashing with buckets of chained blocks. A bucket directory with entries 0 to B-1 points to the first block of each bucket (Bucket 0, Bucket 1, Bucket 2, ...).]

• It is a common technique for fast access to records on secondary storage.
• Records of a file are divided among buckets.
• A bucket is either one disk block or a cluster of contiguous blocks.
• A hash function f maps each key value to a bucket number, one of the integers 0 through b - 1, where b is the number of buckets.
• If x is a key, f(x) is the number of the bucket that contains the record with key x.
• The blocks making up each bucket can either be contiguous or be chained together in a linked list.

• Translation of a bucket number to a disk block address is done with the help of the bucket directory. It gives the address of the first block of the bucket's chain of blocks.
• Hashing is quite efficient at retrieving a record on the hashed key. The average number of block accesses for retrieving a record is

$$1\ \text{(bucket directory)} + \frac{\text{No. of records}}{\text{No. of buckets} \times \text{No. of records per block}}$$

• Thus the operation is b times faster (b = number of buckets) than on an unordered file.
• To insert a record with key value x, the new record can be added to the last block in the chain for bucket f(x). If the record does not fit into the existing block, it is stored in a new block, and this new block is added at the end of the chain for bucket f(x).
• A well-designed hashed structure requires two block accesses for most operations.
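The bucket directory, chained blocks, and insertion rule described above can be sketched as a toy in-memory structure. The class name `HashedFile`, the hash function `key % b`, and the block capacity are assumptions made for illustration; the slides do not fix a particular hash function.

```python
class HashedFile:
    """Toy hashed file: a bucket directory whose entries are chains of fixed-capacity blocks."""

    def __init__(self, num_buckets, records_per_block):
        self.num_buckets = num_buckets
        self.records_per_block = records_per_block
        # directory[i] is the chain (list of blocks) for bucket i
        self.directory = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # Assumed hash function f(x) = x mod b.
        return key % self.num_buckets

    def insert(self, key, value):
        chain = self.directory[self._bucket(key)]
        # Add to the last block; open a new block at the end of the chain if it is full.
        if not chain or len(chain[-1]) >= self.records_per_block:
            chain.append([])
        chain[-1].append((key, value))

    def search(self, key):
        """Return (value or None, block accesses), counting 1 for the bucket directory."""
        accesses = 1
        for block in self.directory[self._bucket(key)]:
            accesses += 1
            for k, v in block:
                if k == key:
                    return v, accesses
        return None, accesses

f = HashedFile(num_buckets=10, records_per_block=2)
for key in (230, 480, 460, 790, 321, 531, 232, 242):
    f.insert(key, f"record-{key}")
print(f.search(321))
print(f.search(790))
```

A record in the first block of its chain costs two accesses (directory plus one block), which matches the "two block accesses" figure for a well-designed structure; longer chains cost one more access per block followed.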

Indexing
• Indexing is used to speed up retrieval of records.
• It is done with the help of a separate sequential file.
• Each record in the index file consists of two fields: a key field and a pointer into the main file.
• To find a specific record, the index is searched for the given key value.
• Binary search can be used to search the index file. After getting the address of the record from the index file, the record in the main file can easily be retrieved.

Index File

Key
1000
1009
1010
1012
1016
1089
1100
1200

Main File

Name      Roll No.  Year  Marks
AMIT      1010      1     82
KALPESH   1016      1     54
JITENDRA  1000      1     75
RAVI      1012      1     79
NILESH    1089      1     85
NITIN     1100      1     98
JAYESH    1200      1     99
UMESH     1009      1     74

The index file is ordered on the key Roll No. Each record of the index file points to the corresponding record in the main file. The main file is not sorted.
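The lookup through the index file shown above can be sketched with Python's `bisect` module. List positions stand in for disk addresses, and the names `index_file` and `lookup` are illustrative assumptions rather than anything defined in the slides.

```python
import bisect

# Main file in its stored (unsorted) order; the list position plays the role of a disk address.
main_file = [
    ("AMIT", 1010, 1, 82), ("KALPESH", 1016, 1, 54),
    ("JITENDRA", 1000, 1, 75), ("RAVI", 1012, 1, 79),
    ("NILESH", 1089, 1, 85), ("NITIN", 1100, 1, 98),
    ("JAYESH", 1200, 1, 99), ("UMESH", 1009, 1, 74),
]

# Index file: (key, pointer) pairs sorted on the key Roll No.
index_file = sorted((rec[1], pos) for pos, rec in enumerate(main_file))
index_keys = [key for key, _ in index_file]

def lookup(roll_no):
    """Binary-search the index, then follow the pointer into the main file."""
    i = bisect.bisect_left(index_keys, roll_no)
    if i < len(index_keys) and index_keys[i] == roll_no:
        return main_file[index_file[i][1]]
    return None

print(lookup(1009))
```

Only the small index needs to be kept in key order; the main file can stay in whatever order the records arrived.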
Advantages of Indexing
• A sequential file can be searched effectively on the ordering key. When it is necessary to search for a record on the basis of an attribute other than the ordering key field, the sequential file representation is inadequate.
• Multiple indexes can be maintained, one for each field used for searching. Thus, indexing provides much better flexibility.
• An index file usually requires less storage space than the main file.
• A binary search on the sequential file requires accessing more blocks than a binary search on the index file.
• This can be explained with the help of the following example.
• Consider a sequential file with r = 1024 records of fixed length, with record size R = 128 bytes, stored on a disk with block size B = 2048 bytes.

• Size of the sequential file
  • Number of blocks required to store the file = (1024 x 128) / 2048 = 64
  • Number of block accesses for searching a record = $\log_2 64 = 6$

• Size of the index file
  • Suppose we want to construct an index on a key field that is V = 4 bytes long, and the block pointer is P = 4 bytes long.
  • A record of the index file needs V + P = 8 bytes per entry.
  • Total number of index entries = 1024
  • Number of blocks required to store the index file = (1024 x 8) / 2048 = 4
  • Number of block accesses for searching a record = $\log_2 4 = 2$
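The arithmetic above can be checked with a short script. The variable names follow the slide (r, R, B, V, P); the two helper functions are hypothetical, and the binary-search cost is taken as the ceiling of log2 of the block count, which is an approximation.

```python
import math

r, R, B = 1024, 128, 2048   # records, record size in bytes, block size in bytes
V, P = 4, 4                 # key field size and block pointer size in bytes

def blocks_needed(total_bytes, block_size):
    """Whole blocks needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // block_size)

def binary_search_accesses(num_blocks):
    """Approximate block accesses for a binary search: ceil(log2(num_blocks))."""
    return math.ceil(math.log2(num_blocks))

data_blocks = blocks_needed(r * R, B)
index_blocks = blocks_needed(r * (V + P), B)

print(data_blocks, binary_search_accesses(data_blocks))     # 64 6
print(index_blocks, binary_search_accesses(index_blocks))   # 4 2
```

Note that after the index search, following the pointer to the data block costs one further access, so a full lookup through the index takes 3 accesses here compared with 6 on the sequential file.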

Types of Indexes
• With indexing, new records can be added at the end of the main file. This does not require movement of records, as it does in a sequential file.
• Updating the index file requires fewer block accesses than updating a sequential file.
• Types of Indexes:
1. Primary indexes
2. Clustering indexes
3. Secondary indexes

Primary Indexes (Indexed Sequential File)
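[Figure: Primary index on the ordering key field Roll Number. The index file holds one entry per data block (101, 201, 351, ..., 805, ...), each pointing to a block of the sequential data file (for example the blocks holding roll numbers 101 to 200, 201 to 350, 351 to 400, and 805 to 904).]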

• An indexed sequential file is characterized by
  • Sequential organization (ordered on the primary key)
  • An index on the primary key
• An indexed sequential file is both ordered and indexed.
• Records are organized in sequence based on a key field, known as the primary key.
• An index to the file is added to support random access. Each record in the index file consists of two fields: a key field, which is the same as the key field in the main file, and a pointer to a block of the main file.
• The number of records in the index file is equal to the number of blocks in the main file (data file), not to the number of records in the main file.
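A primary index with one entry per block can be sketched as follows. The sample records reuse the two blocks from the sequential-file example; the choice of the first key in each block as the index key, and the names `primary_index` and `find`, are assumptions for illustration.

```python
import bisect

# Data file: blocks of (roll_no, name) records, ordered on roll_no.
data_blocks = [
    [(1000, "AMIT"), (1005, "KALPESH"), (1009, "JITENDRA"), (1010, "RAVI")],
    [(1015, "RAMESH"), (1025, "ROHIT"), (1026, "JANAK"), (1029, "AMAR")],
]

# One index entry per block: (first key in the block, block number).
primary_index = [(block[0][0], n) for n, block in enumerate(data_blocks)]
anchor_keys = [key for key, _ in primary_index]

def find(key):
    """Locate the only block that can hold the key, then scan inside it."""
    i = bisect.bisect_right(anchor_keys, key) - 1   # last entry with anchor key <= key
    if i < 0:
        return None                                 # key is smaller than every record
    for rec in data_blocks[primary_index[i][1]]:
        if rec[0] == key:
            return rec
    return None

print(len(primary_index), find(1026))
```

The index has two entries for eight records, which illustrates why a primary index is much smaller than the data file.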

Clustering Indexes
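[Figure: Clustering index. The index file has one entry for each distinct value of the clustering field (100, 105, 106, 108, ...), pointing into the data file, where records with the same value (for example 100 Math, 100 Science, 105 Physics) are stored together.]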

• If the records of a file are ordered on a non-key field, we can create a different type of index known as a clustering index.
• A non-key field does not have a distinct value for each record.
• A clustering index is also an ordered file with two fields.
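A clustering index can be sketched as below. The slides do not name the two fields, so this sketch assumes the usual arrangement: a clustering-field value and a pointer to the first block containing that value. The block capacity, the sample payloads, and the use of a Python dict in place of an ordered index file are all assumptions for illustration.

```python
RECORDS_PER_BLOCK = 3   # assumed block capacity

# Records ordered on a non-key field (first element); values repeat.
records = [(100, "a"), (100, "b"), (105, "c"), (105, "d"), (105, "e"),
           (106, "f"), (106, "g"), (108, "h"), (108, "i"), (109, "j")]
blocks = [records[i:i + RECORDS_PER_BLOCK]
          for i in range(0, len(records), RECORDS_PER_BLOCK)]

# One entry per distinct value: value -> number of the first block containing it.
clustering_index = {}
for n, block in enumerate(blocks):
    for value, _ in block:
        clustering_index.setdefault(value, n)

def find_all(value):
    """Collect every record with the given value, scanning forward from its first block."""
    if value not in clustering_index:
        return []
    found = []
    for block in blocks[clustering_index[value]:]:
        for rec in block:
            if rec[0] == value:
                found.append(rec)
            elif rec[0] > value:
                return found      # records are ordered, so no later match is possible
    return found

print(find_all(105))
```

Because records with the same value sit next to each other, one index entry per distinct value is enough to reach all of them.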

Secondary Indexes (Simple Index File)
[Figure: A secondary index on a non-ordering key field. The index file lists the field values in sorted order, each with a pointer to the matching record in the data file, whose records are not ordered on this field.]

• Hashed, sequential, and indexed sequential files are suitable for operations based on the ordering key or the hashed key, but they are not suitable for operations involving a search on a field other than the ordering or hashed key.
• If searching is required on various keys, secondary indexes on these fields must be maintained.
• A secondary index is an ordered file with two fields:
  • Some non-ordering field of the data file
  • A block pointer
• There could be several secondary indexes for the same file.
• Binary search can be used on the index file, as its entries are ordered on the secondary key field.
• Records of the data file are not ordered on the secondary key field.

• A secondary index requires more storage space and longer search time than a primary index.
• A secondary index file has an entry for every record, whereas a primary index file has an entry for every block in the data file.
• There is a single primary index file, but there can be several secondary indexes.
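A secondary index with one entry per record can be sketched as follows. The example indexes the Marks field of the sample file, which is ordered on Roll No.; list positions stand in for record pointers, and the names `marks_index` and `find_by_marks` are hypothetical.

```python
import bisect

# Data file ordered on Roll No. (second field); Marks (last field) is a non-ordering field.
data_file = [("AMIT", 1000, 1, 82), ("KALPESH", 1005, 2, 54),
             ("JITENDRA", 1009, 1, 75), ("RAVI", 1010, 1, 79)]

# Secondary index: one (marks, position) entry per record, sorted on marks.
marks_index = sorted((rec[3], pos) for pos, rec in enumerate(data_file))
marks_keys = [marks for marks, _ in marks_index]

def find_by_marks(marks):
    """Binary-search the index for all entries with the given marks, then fetch the records."""
    lo = bisect.bisect_left(marks_keys, marks)
    hi = bisect.bisect_right(marks_keys, marks)
    return [data_file[pos] for _, pos in marks_index[lo:hi]]

print(find_by_marks(75))
```

The index has as many entries as the data file has records, which is the reason a secondary index takes more space than a primary index on the same file.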
