Unit_6

Uploaded by

Divyesh Ahir

Prof. Amit Vyas
Department of Computer Engineering
V.V.P. Engineering College

UNIT:6
Storage Strategies
Indexing in DBMS
Indexing is used to optimize the performance
of a database.
An index is a structure that the database search engine can use to speed up
data retrieval.
Indexes are used for faster access to user data
from a set of data.
Index
Indexing is a way to optimize the performance
of a database by minimizing the number of
disk accesses required when a query is
processed.

It is a data structure technique which is used
to quickly locate and access the data in a
database.
Index structure:
Indexes can be created using some database columns.
search-key | Pointer

The first column of the index is the search key, which
contains a copy of the primary key or candidate key of the table. The
search-key values are stored in sorted order so that the
corresponding data can be accessed easily.
The second column of the index is the data reference. It
contains a set of pointers holding the address of the disk
block where the value of the particular key can be found.
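
The two-column structure can be illustrated with a minimal Python sketch. The key values and block names here are hypothetical, and a real DBMS stores its index on disk in a more elaborate form. The sketch only shows why keeping the search keys sorted makes lookup cheap: a binary search finds the pointer without scanning every entry.

```python
from bisect import bisect_left

# Hypothetical index: (search key, disk block address) pairs, sorted by key.
index = [(101, "block-0"), (105, "block-0"), (110, "block-1"), (120, "block-2")]
keys = [key for key, _ in index]

def lookup(search_key):
    """Return the block address stored for search_key, or None if absent."""
    pos = bisect_left(keys, search_key)  # binary search over the sorted keys
    if pos < len(keys) and keys[pos] == search_key:
        return index[pos][1]
    return None
```

For example, lookup(110) follows the pointer to "block-1", while a key that is not in the index gives None.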
Indexing Methods

Primary index
If the index is created on the primary key of
the table, then it is known as a primary index.
Primary keys are unique to each record.
As primary keys are stored in sorted order, the
searching operation is quite efficient.

Primary index
Student(RollNo, Name, Address, City, Mo.No)
CREATE INDEX idx_StudentRno
ON Student (RollNo);

Dense index
The dense index contains an index record
for every search key value in the data file. It
makes searching faster.
In a dense index, the number of records in the index table
is the same as the number of records in the main
table.

Dense index
It needs more space to store the index records
themselves. Each index record holds the search key and
a pointer to the actual record on the disk.

Sparse index
In a sparse index, index records appear only for a few
items in the data file. Each item points to a block.
Instead of pointing to each record in the main
table, the index points to records in the main
table at intervals, leaving gaps between indexed records.
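
The difference between a dense and a sparse index can be sketched in Python as follows. The block contents are hypothetical, and the sparse index is assumed to store the first key of each block, which is one common choice rather than the only one.

```python
from bisect import bisect_right

# Hypothetical data file: three disk blocks, records sorted by key.
blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Dense index: one entry for every search key (key -> block number).
dense = {key: b for b, block in enumerate(blocks) for key in block}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0] for block in blocks]

def dense_lookup(key):
    """Direct hit in the index; None if the key does not exist."""
    return dense.get(key)

def sparse_lookup(key):
    """Find the last index entry <= key, then scan that block."""
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    return b if key in blocks[b] else None
```

The dense index holds nine entries and the sparse index only three, but the sparse lookup has to scan inside the block it lands on.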

Clustering Index
A clustered index can be defined as an ordered data
file. Sometimes the index is created on non-primary
key columns, which may not be unique for each
record.

In this case, to identify the records faster, we
group two or more columns to get a unique
value and create an index out of them. This method is
called a clustering index.
Clustering Index
Records which have similar characteristics
are grouped, and indexes are created for these
groups.
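
A minimal Python sketch of the grouping idea, assuming the data file is kept sorted on the clustering field. The branch numbers, roll numbers and names are made up for illustration.

```python
# Hypothetical student records: (branch_no, roll_no, name).
records = [(1, 11, "A"), (2, 21, "B"), (1, 12, "C"), (3, 31, "D"), (2, 22, "E")]

# The data file is ordered on the clustering field (branch_no).
data_file = sorted(records)

# Clustering index: one entry per distinct branch_no, pointing at the
# position of the first record of that group.
cluster_index = {}
for position, record in enumerate(data_file):
    cluster_index.setdefault(record[0], position)

def records_for_branch(branch_no):
    """Jump to the start of the group, then read until the group ends."""
    start = cluster_index.get(branch_no)
    if start is None:
        return []
    group = []
    for record in data_file[start:]:
        if record[0] != branch_no:
            break
        group.append(record)
    return group
```

Because records with the same branch number sit next to each other, one index entry per branch is enough to reach the whole group.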

[Figure: clustering index. Each clustering field value (Branch No 1 to 4) is paired with a block pointer into the main table, whose columns are Branch No, Roll No, Name and DOB.]

Secondary Index (Non-clustering Index)
(Multilevel Index)
In secondary indexing, to reduce the size of the
mapping, another level of indexing is introduced.
The first level covers large ranges of key values so that
its mapping stays small; each range is then further
divided into smaller ranges.
The mapping of the first level is stored in
primary memory, so that the address fetch is faster. The
mapping of the second level and the actual data are
stored in secondary memory (hard disk).
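
A two-level lookup can be sketched in Python as below. The key ranges and the addresses "d0" to "d8" are hypothetical. The first level is a small list that would sit in main memory, and each second-level block would be read from disk only when needed.

```python
from bisect import bisect_right

# Hypothetical second-level index blocks (on disk): sorted (key, data address)
# pairs, where each key is the first key of a range in the data file.
second_level = [
    [(1, "d0"), (11, "d1"), (21, "d2")],
    [(31, "d3"), (41, "d4"), (51, "d5")],
    [(61, "d6"), (71, "d7"), (81, "d8")],
]

# First level (in main memory): the smallest key covered by each block above.
first_level = [block[0][0] for block in second_level]

def find_address(key):
    """Return the address of the data range that may contain key."""
    i = bisect_right(first_level, key) - 1      # pick the coarse range
    if i < 0:
        return None
    block = second_level[i]                     # one read of the second level
    j = bisect_right([k for k, _ in block], key) - 1  # pick the finer range
    return block[j][1]
```

Searching for key 45 first selects the range starting at 31, then the finer range starting at 41, and returns "d4".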
B-tree
A B-tree is a data structure that stores data in
its nodes in sorted order.
[Figure: sample B-tree]

A B-tree stores data in such a way that each
node contains keys in ascending order.
B(B+) tree
The keys in the left child node are less than the
current key, and the keys in the right child node
are greater than the current key.

B+ tree
A B+ tree is a balanced tree in which every path from the root
to a leaf is of the same length. Each non-leaf node in the tree
must have between ⌈n/2⌉ and n children, where n is fixed
for a particular tree.
Suppose we have to find 55. In the
intermediary node, we will find a branch between the 50 and 75
nodes, and that branch leads to the leaf node that can contain 55.
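
The search for 55 can be sketched in Python with a small two-level tree. Only the values 50, 55, 65, 70 and 75 come from the slides; the remaining keys and the separator 25 are assumed for illustration, and the tree is reduced to one intermediary node plus leaves.

```python
from bisect import bisect_right

# Assumed intermediary node: separator keys that direct the search.
separators = [25, 50, 75]

# Assumed leaf nodes, left to right. A key k belongs to the leaf whose
# position equals the number of separators <= k.
leaves = [[5, 10, 15], [25, 30, 35], [50, 55, 65, 70], [75, 80, 85]]

def search(key):
    """Return (leaf position, found?) for key."""
    child = bisect_right(separators, key)  # branch between two separators
    return child, key in leaves[child]
```

Here search(55) takes the branch between 50 and 75 and ends in the third leaf (position 2), where 55 is found.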

Suppose a leaf node of this tree is already full and we want to insert
a record 60.
With 60 added, the 3rd leaf node would hold the values (50, 55, 60, 65, 70),
and its current root node is 50.

We will split the leaf node of the tree in the
middle so that its balance is not altered. So we
can group (50, 55) and (60, 65, 70) into 2 leaf
nodes.
If these two have to be leaf nodes, the
intermediate node cannot branch from 50 alone. It
should have 60 added to it, and then we can
have a pointer to the new leaf node.
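
The leaf split can be sketched in Python as follows. The leaf capacity of 4 is an assumption chosen so that the example overflows, and the function handles only a single leaf, not the update of the parent node.

```python
from bisect import insort

def insert_and_split(leaf, key, capacity):
    """Insert key into a sorted leaf; split in the middle on overflow.

    Returns (left, right, separator). right and separator are None
    when the leaf did not overflow.
    """
    keys = list(leaf)
    insort(keys, key)                 # keep the leaf sorted
    if len(keys) <= capacity:
        return keys, None, None
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    # The first key of the new right leaf is copied up to the parent.
    return left, right, right[0]

left, right, separator = insert_and_split([50, 55, 65, 70], 60, capacity=4)
```

With these inputs the leaf becomes (50, 55) and (60, 65, 70), and 60 is the key to add to the intermediate node.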
Hashing
The hashing technique is used to calculate the direct
location of a data record on the disk without using
an index structure.
In this technique, data is stored in the data blocks
whose addresses are generated by the hashing
function. The memory locations where these records
are stored are known as data buckets or data blocks.
Hashing
Data bucket: Data buckets are the memory
locations where the records are stored.
Hash function: A hash function is a mapping
function that maps the set of search keys to
actual record addresses. Generally, the hash function
uses the primary key to generate the hash index, that is, the
address of the data block.
Hashing
The diagram shows data block addresses that are the same as the
primary key values.
The hash function can also be a simple mathematical
function such as exponential, mod, cos or sin.
Suppose we have a mod(5) hash function to determine the
address of the data block.
In this case, it applies the mod(5) hash function to the primary
keys and generates 3, 3, 1, 4 and 2 respectively, and the records
are stored at those data block addresses.
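
A short Python sketch of the mod(5) hash function. The slides do not list the primary keys, so the keys 98, 103, 101, 104 and 102 are assumed here because they produce the addresses 3, 3, 1, 4 and 2.

```python
def hash_address(primary_key, num_buckets=5):
    """mod hash function: the remainder is the data block address."""
    return primary_key % num_buckets

# Assumed primary keys (not given in the slides).
keys = [98, 103, 101, 104, 102]
addresses = [hash_address(key) for key in keys]
```

Two of the assumed keys (98 and 103) map to the same address 3, which illustrates that different keys can land in the same bucket.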
Types of Hashing

Static hashing
In static hashing, the resultant data
bucket address will always remain the same.
Therefore, if you generate an address for, say,
Student_ID = 10 using the hashing function
mod(3), the resultant bucket address will always
be 1. So, you will not see any change in the
bucket address.
Static hashing
Bucket address   Student_ID   Name     City
1                10           Dip      Rajkot
2                8            Jay      Baroda
0                9            Bhavik   Surat

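
Static hashing with mod(3) can be sketched in Python as below, using the three student records from the example. The function names are hypothetical, and the number of buckets is fixed when the structure is created.

```python
NUM_BUCKETS = 3  # fixed for the lifetime of the file in static hashing

buckets = {address: [] for address in range(NUM_BUCKETS)}

def bucket_address(student_id):
    """mod(3) hash: the same key always yields the same bucket address."""
    return student_id % NUM_BUCKETS

def insert(student_id, name, city):
    buckets[bucket_address(student_id)].append((student_id, name, city))

def find(student_id):
    """Look only in the bucket that the hash function selects."""
    for record in buckets[bucket_address(student_id)]:
        if record[0] == student_id:
            return record
    return None

insert(10, "Dip", "Rajkot")
insert(8, "Jay", "Baroda")
insert(9, "Bhavik", "Surat")
```

Student_ID 10 always hashes to bucket 1, however many records are added later, because the number of buckets never changes.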
Dynamic hashing
In dynamic hashing, data buckets grow or
shrink (are added or removed dynamically) as the
number of records increases or decreases.
Dynamic hashing is also known as extendible (extended)
hashing.

Dynamic hashing
First, calculate the hash address of the key.
Check how many bits are used in the directory; this
number of bits is called i.
Take the least significant i bits of the hash address. This
gives an index into the directory. Using this index, go to
the directory and find the bucket address where the record
might be.
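
These lookup steps can be sketched in Python. The directory below assumes i = 2 and four buckets named B0 to B3, and the hash address is passed in directly because the slides do not define the hash function itself.

```python
def directory_index(hash_address, i):
    """Take the least significant i bits of the hash address."""
    return hash_address & ((1 << i) - 1)

# Assumed directory for i = 2: one entry per 2-bit pattern.
directory = {0b00: "B0", 0b01: "B1", 0b10: "B2", 0b11: "B3"}

def find_bucket(hash_address, i=2):
    """Return the bucket where a record with this hash address might be."""
    return directory[directory_index(hash_address, i)]
```

A hash address of 10001 has the last two bits 01, so find_bucket(0b10001) points to B1.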

Dynamic hashing
The last two bits of the hash addresses of keys 2 and 4 are
00, so they go into bucket B0.
The last two bits of the hash addresses of keys 5 and 6 are
01, so they go into bucket B1.
The last two bits of the hash addresses of keys 1 and 3 are
10, so they go into bucket B2.
The last two bits of the hash address of key 7 are 11, so it
goes into bucket B3.
[Figure: buckets B0, B1, B2 and B3]

Insert key 9 with hash address 10001 into the
above structure:
Since key 9 has hash address 10001, it must go into
bucket B1. But bucket B1 is full, so it is split.
The split separates 5 and 9 from 6: the last
three bits of the hash addresses of 5 and 9 are 001, so they stay in bucket B1,
and the last three bits of the hash address of 6 are 101, so it goes into
bucket B5.
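
The split can be sketched in Python as follows. Only the hash address of key 9 (10001) is given in the slides; the addresses used for keys 5 and 6 are assumed values that match the stated last three bits (001 and 101). The sketch regroups one bucket and rebuilds the directory for three bits. It is not a complete extendible hashing implementation.

```python
def low_bits(value, bits):
    """Least significant `bits` bits of value."""
    return value & ((1 << bits) - 1)

# Hash addresses: 9 -> 10001 is from the example; 5 and 6 are assumed.
hash_of = {5: 0b01001, 6: 0b10101, 9: 0b10001}

def split_bucket(keys, new_depth):
    """Regroup the keys of an overflowing bucket using one more bit."""
    groups = {}
    for key in keys:
        groups.setdefault(low_bits(hash_of[key], new_depth), []).append(key)
    return groups

groups = split_bucket([5, 6, 9], new_depth=3)

# Directory after doubling to 3 bits: buckets that were not split are
# still selected by the last two bits of the entry.
directory = {entry: f"B{low_bits(entry, 2)}" for entry in range(8)}
directory[0b001] = "B1"  # keys 5 and 9
directory[0b101] = "B5"  # key 6
```

After the split, pattern 001 holds keys 5 and 9 and pattern 101 holds key 6, while entries such as 000 and 100 both still point to B0.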

Keys 2 and 4 are still in B0. B0 is pointed to
by the 000 and 100 entries, because the last two bits of both
entries are 00.
Keys 1 and 3 are still in B2. B2 is pointed to
by the 010 and 110 entries, because the last two bits of both
entries are 10.
Key 7 is still in B3. B3 is pointed to by the
111 and 011 entries, because the last two bits of both
entries are 11.
Thank You
