Module-5 Dbms Cs208 Notes

Uploaded by

aadilak2004


CS208: Principles of Database Design

MODULE 5
Physical Data Organization
Single level ordered indexes
Indexes are used to speed up the retrieval of records in response to certain search conditions. Index
structures are additional files on disk that provide secondary access paths: alternative ways to access
the records without affecting the physical placement of records in the primary data file on disk. They
enable efficient access to records based on the indexing fields that are used to construct the index.

For a file with a given record structure consisting of several fields (or attributes), an index access
structure is usually defined on a single field of a file, called an indexing field (or indexing attribute). The
index typically stores each value of the index field along with a list of pointers to all disk blocks that
contain records with that field value. The values in the index are ordered so that we can do a binary
search on the index.

Index structures, primary, secondary and clustering indices

There are several types of ordered indexes. A primary index is specified on the ordering key field of an
ordered file of records. An ordering key field is used to physically order the file records on disk, and every
record has a unique value for that field. If the ordering field is not a key field (that is, if numerous records
in the file can have the same value for the ordering field), another type of index, called a clustering
index, can be used. The data file is called a clustered file in this latter case. A third type of index, called a
secondary index, can be specified on any nonordering field of a file. A data file can have several
secondary indexes in addition to its primary access method.
Primary indexes
A primary index is an ordered file whose records are of fixed length with two fields, and it acts like an
access structure to efficiently search for and access the data records in a data file. The first field is of the
same data type as the ordering key field (called the primary key) of the data file, and the second field is
a pointer to a disk block (a block address). There is one index entry (or index record) in the index file for
each block in the data file. Each index entry has the value of the primary key field for the first record in a
block and a pointer to that block as its two field values. We will refer to the two field values of index
entry i as <K(i), P(i)>.

The total number of entries in the index is the same as the number of disk blocks in the ordered data file.
The first record in each block of the data file is called the anchor record of the block, or simply the block
anchor.
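
As an illustration of how the <K(i), P(i)> entries are used, here is a minimal Python sketch of a lookup through a primary index. The data file, its block size of three records, and the names data_blocks, primary_index and primary_lookup are assumptions made up for this example; disk blocks are simulated with in-memory lists.

```python
from bisect import bisect_right

# Hypothetical ordered data file: each inner list stands for one disk block,
# and the (key, payload) records are sorted by key across the blocks.
data_blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Sparse primary index: one <K(i), P(i)> entry per block, where K(i) is the
# key of the block anchor and P(i) is the block number.
primary_index = [(block[0][0], block_no) for block_no, block in enumerate(data_blocks)]
anchor_keys = [k for k, _ in primary_index]


def primary_lookup(key):
    """Return the record with the given key, or None if it is absent."""
    # Binary search for the last index entry with K(i) <= key.
    i = bisect_right(anchor_keys, key) - 1
    if i < 0:
        return None  # key is smaller than every block anchor
    block_no = primary_index[i][1]
    # One data block access, followed by a scan inside that block.
    for record in data_blocks[block_no]:
        if record[0] == key:
            return record
    return None


print(primary_index)       # [(10, 0), (40, 1), (70, 2)]
print(primary_lookup(50))  # (50, 'e')
print(primary_lookup(55))  # None
```

The index holds three entries for nine records, which is the sense in which a primary index has one entry per block rather than one per record.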

Indexes can also be characterized as dense or sparse. A dense index has an index entry for every search
key value (and hence every record) in the data file. A sparse (or nondense) index, on the other hand, has
index entries for only some of the search values. A sparse index has fewer entries than the number of
records in the file. Thus, a primary index is a nondense (sparse) index, since it includes one entry for each
disk block of the data file, holding the key of its anchor record, rather than an entry for every search
value (or every record).

For more study materials>www.ktustudents.in

The index file for a primary index occupies a much smaller space than does the data file, for two reasons.
First, there are fewer index entries than there are records in the data file. Second, each index entry is
typically smaller in size than a data record because it has only two fields; consequently, more index
entries than data records can fit in one block. Therefore, a binary search on the index file requires fewer
block accesses than a binary search on the data file. A binary search on an ordered data file of b blocks
requires about log2 b block accesses. If the index occupies bi blocks, locating a record requires about
log2 bi + 1 block accesses: a binary search on the index plus one access to the data block.
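
The saving can be checked with a short calculation. The figures below (30,000 records of 100 bytes, 1024-byte blocks, a 9-byte key and a 6-byte block pointer) are assumed values chosen for illustration and do not come from these notes.

```python
from math import ceil, log2

# Assumed example figures (hypothetical).
r = 30_000   # number of records in the ordered data file
B = 1024     # block size in bytes
R = 100      # record size in bytes
V = 9        # size of the ordering key field in bytes
P = 6        # size of a block pointer in bytes

bfr = B // R                  # records per data block
b = ceil(r / bfr)             # number of data blocks
data_search = ceil(log2(b))   # binary search directly on the data file

Ri = V + P                    # size of one index entry
bfri = B // Ri                # index entries per block
bi = ceil(b / bfri)           # index blocks (one entry per data block)
index_search = ceil(log2(bi)) + 1   # binary search on index + one data block

print(b, data_search)    # 3000 12
print(bi, index_search)  # 45 7
```

Under these assumptions the index reduces a lookup from 12 block accesses to 7.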

A major problem with a primary index—as with any ordered file—is insertion and deletion of records.
With a primary index, the problem is compounded because if we attempt to insert a record in its correct
position in the data file, we must not only move records to make space for the new record but also
change some index entries, since moving records will change the anchor records of some blocks.

Clustering Indexes
If file records are physically ordered on a nonkey field (one that does not have a distinct value for each
record), that field is called the clustering field and the data file is called a clustered file.
We can create a different type of index, called a clustering index, to speed up retrieval of all the records
that have the same value for the clustering field. This differs from a primary index, which requires that
the ordering field of the data file have a distinct value for each record.

A clustering index is also an ordered file with two fields; the first field is of the same type as the clustering
field of the data file, and the second field is a disk block pointer. There is one entry in the clustering index
for each distinct value of the clustering field, and it contains the value and a pointer to the first block in
the data file that has a record with that value for its clustering field.

Record insertion and deletion still cause problems because the data records are physically ordered. To
alleviate the problem of insertion, it is common to reserve a whole block (or a cluster of contiguous
blocks) for each value of the clustering field; all records with that value are placed in the block (or block
cluster). This makes insertion and deletion relatively straightforward.
A clustering index is another example of a nondense index because it has an entry for every distinct value
of the indexing field, which is a nonkey by definition and hence has duplicate values rather than a unique
value for every record in the file.
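
The construction of a clustering index can be sketched as follows. This is a minimal in-memory illustration, not a disk-based implementation; the sample file and the names data_blocks, clustering_index and clustering_lookup are hypothetical.

```python
from bisect import bisect_left

# Hypothetical clustered file: records are (clustering_value, payload) and
# the file is physically ordered on the nonkey clustering field.
data_blocks = [
    [(1, "x"), (1, "y"), (2, "z")],
    [(2, "p"), (2, "q"), (3, "r")],
    [(3, "s"), (5, "t"), (5, "u")],
]

# One entry per distinct value, pointing to the FIRST block holding that value.
clustering_index = []
for block_no, block in enumerate(data_blocks):
    for value, _ in block:
        if not clustering_index or clustering_index[-1][0] != value:
            clustering_index.append((value, block_no))

index_values = [v for v, _ in clustering_index]


def clustering_lookup(value):
    """Return all records whose clustering field equals value."""
    i = bisect_left(index_values, value)
    if i == len(index_values) or index_values[i] != value:
        return []
    matches = []
    # Start at the first block for this value and read forward; because the
    # file is ordered, we can stop at the first larger value.
    for block in data_blocks[clustering_index[i][1]:]:
        for record in block:
            if record[0] == value:
                matches.append(record)
            elif record[0] > value:
                return matches
    return matches


print(clustering_index)      # [(1, 0), (2, 0), (3, 1), (5, 2)]
print(clustering_lookup(2))  # [(2, 'z'), (2, 'p'), (2, 'q')]
```

The index has four entries for nine records, one per distinct value, which is why it counts as nondense.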

Secondary Indexes
A secondary index provides a secondary means of accessing a data file for which some primary access
already exists. The data file records could be ordered, unordered, or hashed. The secondary index may be
created on a field that is a candidate key and has a unique value in every record, or on a nonkey field with
duplicate values. The index is again an ordered file with two fields. The first field is of the same data type
as some nonordering field of the data file that is an indexing field. The second field is either a block
pointer or a record pointer.

First we consider a secondary index access structure on a key (unique) field that has a distinct value for
every record. Such a field is sometimes called a secondary key; in the relational model, this would
correspond to any UNIQUE key attribute or to the primary key attribute of a table. In this case there is one
index entry for each record in the data file, which contains the value of the field for the record and a
pointer either to the block in which the record is stored or to the record itself. Hence, such an index is
dense.

Because the records of the data file are not physically ordered by values of the secondary key field, we
cannot use block anchors. That is why an index entry is created for each record in the data file, rather
than for each block as in the case of a primary index. When the pointers P(i) in the index entries are block
pointers rather than record pointers, the appropriate disk block is first transferred to a main memory
buffer, and a search for the desired record within the block is then carried out.

A secondary index usually needs more storage space and longer search time than does a primary index,
because of its larger number of entries.
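
A dense secondary index on a key field can be sketched in the same style. The unordered sample file and the names data_blocks, secondary_index and secondary_lookup are assumptions for illustration; the pointers here are block pointers, so the record is located by scanning the fetched block.

```python
from bisect import bisect_left

# Hypothetical data file that is NOT ordered on the indexed key field.
# Records are (key, payload); each inner list stands for one disk block.
data_blocks = [
    [(17, "q"), (3, "w")],
    [(42, "e"), (8, "r")],
    [(25, "t"), (11, "y")],
]

# Dense secondary index: one (key, block pointer) entry per record,
# kept sorted on the key so that it can be binary searched.
secondary_index = sorted(
    (record[0], block_no)
    for block_no, block in enumerate(data_blocks)
    for record in block
)
index_keys = [k for k, _ in secondary_index]


def secondary_lookup(key):
    """Return the record with the given key, or None if it is absent."""
    i = bisect_left(index_keys, key)
    if i == len(index_keys) or index_keys[i] != key:
        return None
    block_no = secondary_index[i][1]
    # Fetch the block, then search for the record inside it.
    for record in data_blocks[block_no]:
        if record[0] == key:
            return record
    return None


print(index_keys)           # [3, 8, 11, 17, 25, 42]
print(secondary_lookup(8))  # (8, 'r')
```

The index has six entries for six records, compared with three entries if the same file were ordered and indexed by a primary index with one entry per block.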

We can also create a secondary index on a nonkey, nonordering field of a file. In this case, numerous
records in the data file can have the same value for the indexing field. There are several options for
implementing such an index:
■ Option 1 is to include duplicate index entries with the same K(i) value—one for each record. This would
be a dense index.
■ Option 2 is to have variable-length records for the index entries, with a repeating field for the pointer.
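■ Option 3, which is more commonly used, is to keep the index entries themselves at a fixed length and
have a single entry for each index field value, but to create an extra level of indirection to handle the
multiple pointers. In this scheme, the pointer P(i) in index entry <K(i), P(i)> points to a disk block that
contains a set of record pointers; each record pointer in that block points to one of the data file records
with value K(i) for the indexing field.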

Retrieval via the index requires one or more additional block accesses because of the extra level, but the
algorithms for searching the index and (more importantly) for inserting new records in the data file are
straightforward.
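
Option 3 can be sketched as follows. The sample records and the names values, buckets, index and lookup are hypothetical; each bucket plays the role of the block of record pointers in the extra level of indirection, and a record pointer is modeled as a (block number, slot) pair.

```python
from bisect import bisect_left

# Hypothetical data file, not ordered on dept. Records are (emp_id, dept).
data_blocks = [
    [(1, "hr"), (2, "it")],
    [(3, "it"), (4, "ops")],
    [(5, "hr"), (6, "it")],
]

# Distinct values of the nonkey indexing field, in sorted order.
values = sorted({dept for block in data_blocks for _, dept in block})

# Extra level of indirection: one bucket of record pointers per distinct value.
buckets = [[] for _ in values]
for block_no, block in enumerate(data_blocks):
    for slot, (_, dept) in enumerate(block):
        buckets[bisect_left(values, dept)].append((block_no, slot))

# Fixed-length index entries: a single <K(i), P(i)> per distinct value,
# where P(i) identifies the bucket of record pointers.
index = [(value, bucket_no) for bucket_no, value in enumerate(values)]


def lookup(dept):
    """Return all records whose indexing field equals dept."""
    i = bisect_left(values, dept)
    if i == len(values) or values[i] != dept:
        return []
    bucket_no = index[i][1]
    return [data_blocks[b][s] for b, s in buckets[bucket_no]]


print(index)         # [('hr', 0), ('it', 1), ('ops', 2)]
print(lookup("it"))  # [(2, 'it'), (3, 'it'), (6, 'it')]
```

Inserting a record with an existing value only appends a pointer to the matching bucket; the index entries themselves do not change.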

A secondary index provides a logical ordering on the records by the indexing field. If we access the
records in order of the entries in the secondary index, we get them in order of the indexing field. The
primary and clustering indexes assume that the field used for physical ordering of records in the file is
the same as the indexing field.


Multi-level indexes
If an index is small enough to be kept entirely in main memory, the search time to find an entry
is low. However, if the index is so large that not all of it can be kept in memory, index blocks
must be fetched from disk when required. (Even if an index is smaller than the main memory of
a computer, main memory is also required for a number of other tasks, so it may not be possible
to keep the entire index in memory.) The search for an entry in the index then requires several
disk-block reads.
In such a case, we can create yet another level of index. Indeed, we can repeat this process as many
times as necessary. Indices with two or more levels are called multilevel indices. Searching for
records with a multilevel index requires significantly fewer I/O operations than does searching
for records by binary search. Multilevel indices are closely related to tree structures, such as the
binary trees used for in-memory indexing.
The multilevel scheme can be used on any type of index—whether it is primary, clustering, or
secondary—as long as the first-level index has distinct values for K(i) and fixed-length entries.
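
The effect of adding levels can be estimated with a small calculation. The function index_levels and the figures used (30,000 first-level entries and a fan-out of 68 entries per index block) are assumptions for illustration, not values from these notes.

```python
from math import ceil, log2


def index_levels(first_level_entries, fan_out):
    """Blocks needed at each index level, from the first level up to the top."""
    blocks_per_level = []
    entries = first_level_entries
    while True:
        blocks = ceil(entries / fan_out)
        blocks_per_level.append(blocks)
        if blocks <= 1:
            break
        entries = blocks  # the next level holds one entry per block of this level
    return blocks_per_level


levels = index_levels(30_000, 68)
multilevel_accesses = len(levels) + 1           # one block per level + data block
binary_accesses = ceil(log2(levels[0])) + 1     # binary search on level 1 + data block

print(levels)               # [442, 7, 1]
print(multilevel_accesses)  # 4
print(binary_accesses)      # 10
```

Under these assumptions, three index levels bring a lookup down from 10 block accesses to 4, because each level divides the search space by the fan-out rather than by 2.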

A multilevel index reduces the number of blocks accessed when searching for a record, given its indexing
field value. We are still faced with the problems of dealing with index insertions and deletions, because all
index levels are physically ordered files. To retain the benefits of using multilevel indexing while reducing
index insertion and deletion problems, designers adopted a multilevel index called a dynamic multilevel
index that leaves some space in each of its blocks for inserting new entries and uses appropriate
insertion/deletion algorithms for creating and deleting new index blocks when the data file grows and
shrinks. It is often implemented by using data structures called B-trees and B+-trees.

B+-Trees
A tree is formed of nodes. Each node in the tree, except for a special node called the root, has one
parent node and zero or more child nodes. The root node has no parent. A node that does not have any
child nodes is called a leaf node; a nonleaf node is called an internal node. The level of a node is always
one more than the level of its parent, with the level of the root node being zero. A subtree of a node
consists of that node and all its descendant nodes: its child nodes, the child nodes of its child nodes, and
so on.

In a B+-tree, data pointers are stored only at the leaf nodes of the tree; hence, the structure of leaf nodes
differs from the structure of internal nodes. The leaf nodes have an entry for every value of the search
field, along with a data pointer to the record (or to the block that contains this record) if the search field
is a key field. For a nonkey search field, the pointer points to a block containing pointers to the data file
records, creating an extra level of indirection.

The leaf nodes of the B+-tree are usually linked to provide ordered access on the search field to the
records. These leaf nodes are similar to the first (base) level of an index. Internal nodes of the B+-tree
correspond to the other levels of a multilevel index. Some search field values from the leaf nodes are
repeated in the internal nodes of the B+-tree to guide the search.
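
The roles of internal nodes, leaf nodes and the leaf links can be illustrated with a small hand-built tree. This sketch covers search and ordered range access only, not insertion or deletion; the class names Leaf and Internal, the sample keys, and the convention that keys less than or equal to a separator go to the left subtree are assumptions of this example.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list
    pointers: list                  # data pointers, one per key
    next: Optional["Leaf"] = None   # link to the next leaf in key order


@dataclass
class Internal:
    keys: list                      # search values repeated from the leaves
    children: list                  # tree pointers, len(keys) + 1 of them


# Hand-built two-level tree over the keys 1, 3, 5, 6, 7, 8, 9, 12.
leaf1 = Leaf([1, 3, 5], ["r1", "r3", "r5"])
leaf2 = Leaf([6, 7, 8], ["r6", "r7", "r8"])
leaf3 = Leaf([9, 12], ["r9", "r12"])
leaf1.next, leaf2.next = leaf2, leaf3
root = Internal([5, 8], [leaf1, leaf2, leaf3])


def find_leaf(node, key):
    """Follow tree pointers down to the leaf that may contain key."""
    while isinstance(node, Internal):
        node = node.children[bisect_left(node.keys, key)]
    return node


def search(root, key):
    """Return the data pointer for key, or None if key is not present."""
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.pointers[i]
    return None


def range_scan(root, low, high):
    """Return data pointers for low <= key <= high using the leaf links."""
    leaf = find_leaf(root, low)
    result = []
    while leaf is not None:
        for k, p in zip(leaf.keys, leaf.pointers):
            if k > high:
                return result
            if k >= low:
                result.append(p)
        leaf = leaf.next
    return result


print(search(root, 8))         # r8
print(search(root, 4))         # None
print(range_scan(root, 4, 9))  # ['r5', 'r6', 'r7', 'r8', 'r9']
```

The range scan descends the tree once and then follows the leaf links, which is the ordered access that linking the leaves provides.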

The pointers in internal nodes are tree pointers to blocks that are tree nodes, whereas the pointers in leaf
nodes are data pointers to the data file records or blocks. Because entries in the internal nodes of a
B+-tree include search values and tree pointers without any data pointers, more entries can be packed into
an internal node of a B+-tree than for a similar B-tree. Thus, for the same block (node) size, the order p
will be larger for the B+-tree than for the B-tree. This can lead to fewer B+-tree levels, improving search
time. Because the structures for internal and for leaf nodes of a B+-tree are different, the order p can be
different for the two kinds of node.
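
The difference in order can be worked out from the node layouts. The sizes below (512-byte blocks, a 9-byte search key, a 7-byte data pointer and a 6-byte tree pointer) are assumed values for illustration; each p is simply the largest value that lets the node fit in one block.

```python
# Assumed sizes in bytes (hypothetical).
B = 512   # block (node) size
V = 9     # search key field
Pr = 7    # data (record) pointer
P = 6     # tree (block) pointer

# B-tree node: p tree pointers and (p - 1) entries of key + data pointer.
#   p*P + (p - 1)*(Pr + V) <= B
p_btree = (B + Pr + V) // (P + Pr + V)

# B+-tree internal node: p tree pointers and (p - 1) keys, no data pointers.
#   p*P + (p - 1)*V <= B
p_internal = (B + V) // (P + V)

# B+-tree leaf node: p_leaf entries of key + data pointer, plus one next-leaf pointer.
#   p_leaf*(Pr + V) + P <= B
p_leaf = (B - P) // (Pr + V)

print(p_btree)     # 24
print(p_internal)  # 34
print(p_leaf)      # 31
```

With these sizes an internal B+-tree node holds 34 tree pointers against 24 for a B-tree node of the same size, and the leaf order (31) differs from the internal order (34).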

Query processing
A query expressed in a high-level query language such as SQL must first be scanned, parsed, and
validated. The scanner identifies the query tokens (such as SQL keywords, attribute names, and relation
names) that appear in the text of the query, whereas the parser checks the query syntax to determine
whether it is formulated according to the syntax rules (rules of grammar) of the query language. An
internal representation of the query is then created, usually as a tree data structure called a query tree
or as a graph data structure called a query graph. The DBMS must then devise an execution strategy or
query plan for retrieving the results of the query from the database files. A query typically has many
possible execution strategies, and the process of choosing a suitable one for processing a query is known
as query optimization.


Heuristics-based query optimization

These are optimization techniques that apply heuristic rules to modify the internal representation of a
query (usually a query tree or a query graph data structure) to improve its expected performance.

The scanner and parser of an SQL query first generate a data structure that corresponds to an initial
query representation, which is then optimized according to heuristic rules. This leads to an
optimized query representation, which corresponds to the query execution strategy. Following that, a
query execution plan is generated to execute groups of operations based on the access paths available on
the files involved in the query.

One of the main heuristic rules is to apply SELECT and PROJECT operations before applying the JOIN or other
binary operations, because the size of the file resulting from a binary operation—such as JOIN—is usually
a multiplicative function of the sizes of the input files. The SELECT and PROJECT operations reduce the size
of a file and hence should be applied before a join or other binary operation.
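
The effect of this rule on intermediate result sizes can be seen in a small experiment. The relations employees and departments and the helper cross are made up for this sketch, and a Cartesian product followed by a selection stands in for the join, so the sizes shown are the multiplicative worst case.

```python
# Hypothetical relations: 100 employees spread over 5 departments.
employees = [(i, i % 5) for i in range(1, 101)]    # (emp_id, dept_id)
departments = [(d, f"dept{d}") for d in range(5)]  # (dept_id, dept_name)


def cross(r, s):
    """Cartesian product of two relations represented as lists of tuples."""
    return [a + b for a in r for b in s]


# Plan A: combine first, then apply both the join condition and the selection.
product_a = cross(employees, departments)
plan_a = [t for t in product_a if t[1] == t[2] and t[3] == "dept3"]

# Plan B: apply the selection on departments first, then combine.
selected = [d for d in departments if d[1] == "dept3"]
product_b = cross(employees, selected)
plan_b = [t for t in product_b if t[1] == t[2]]

print(len(product_a), len(product_b))  # 500 100
print(len(plan_a), plan_a == plan_b)   # 20 True
```

Both plans return the same 20 rows, but applying the selection first shrinks the intermediate result from 500 rows to 100.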

A query tree is a tree data structure that corresponds to a relational algebra expression. It represents the
input relations of the query as leaf nodes of the tree, and represents the relational algebra operations as
internal nodes. An execution of the query tree consists of executing an internal node operation whenever
its operands are available and then replacing that internal node by the relation that results from
executing the operation.

The order of execution of operations starts at the leaf nodes, which represent the input database
relations for the query, and ends at the root node, which represents the final operation of the query. The
execution terminates when the root node operation is executed and produces the result relation for the
query.
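
The bottom-up execution of a query tree can be sketched with a small recursive evaluator. The Node class, the operation names, and the sample EMPLOYEE and DEPARTMENT relations are assumptions for this example; relations are lists of dictionaries, and PROJECT here does not eliminate duplicates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str                         # "relation", "select", "project" or "join"
    arg: object = None              # relation name, predicate, or attribute list
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def execute(node, database, trace):
    """Evaluate a query tree bottom-up, recording the order of operations."""
    if node.op == "relation":
        return database[node.arg]   # leaf node: an input relation
    left = execute(node.left, database, trace)
    if node.op == "select":
        result = [row for row in left if node.arg(row)]
    elif node.op == "project":
        result = [{a: row[a] for a in node.arg} for row in left]
    elif node.op == "join":
        right = execute(node.right, database, trace)
        result = [{**l, **r} for l in left for r in right if node.arg(l, r)]
    else:
        raise ValueError(f"unknown operation: {node.op}")
    trace.append(node.op)           # the internal node is now replaced by its result
    return result


database = {
    "EMPLOYEE": [
        {"name": "Asha", "dno": 5},
        {"name": "Ravi", "dno": 4},
        {"name": "Mina", "dno": 5},
    ],
    "DEPARTMENT": [
        {"dnumber": 5, "dname": "Research"},
        {"dnumber": 4, "dname": "Admin"},
    ],
}

# PROJECT name (EMPLOYEE JOIN dno=dnumber (SELECT dname='Research' (DEPARTMENT)))
tree = Node(
    "project",
    arg=["name"],
    left=Node(
        "join",
        arg=lambda e, d: e["dno"] == d["dnumber"],
        left=Node("relation", arg="EMPLOYEE"),
        right=Node(
            "select",
            arg=lambda d: d["dname"] == "Research",
            left=Node("relation", arg="DEPARTMENT"),
        ),
    ),
)

trace = []
result = execute(tree, database, trace)
print(trace)   # ['select', 'join', 'project']
print(result)  # [{'name': 'Asha'}, {'name': 'Mina'}]
```

The trace shows each internal node running only after its operands are available, with the root operation (PROJECT) executed last.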
