Types of Indexes
Dense Index
In a dense index, an index record is created for every search-key value in
the database. This makes searching faster but needs more space to store the
index records. In this indexing method, each index record contains a
search-key value and a pointer to the actual record on the disk.
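As a minimal sketch of the idea, the example below models a dense index in
Python. The data file, the record layout and the function names are all
assumptions made for illustration; a list position stands in for a disk
address.

```python
# Hypothetical data file: a list of (search_key, payload) records.
# The list position stands in for the record's address on disk.
data_file = [
    (101, "Alice"),
    (102, "Bob"),
    (105, "Carol"),
    (109, "Dave"),
]


def build_dense_index(records):
    """Create one index entry per search-key value: key -> record position."""
    return {key: position for position, (key, _) in enumerate(records)}


def dense_lookup(index, records, key):
    """Follow the index pointer straight to the record, or return None."""
    position = index.get(key)
    return None if position is None else records[position]


dense_index = build_dense_index(data_file)
```

The index has exactly as many entries as the file has records, which is the
space cost described above.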
Sparse Index
A sparse index contains index records for only some of the search-key values
in the file, which addresses the space cost of dense indexing. In this
method, a range of search-key values shares the same data block address;
when data needs to be retrieved, that block is fetched and then searched for
the record. This works only when the data file is sorted on the search key.
Because a sparse index stores index records for only some search-key values,
it needs less space and less maintenance overhead for insertions and
deletions, but it is generally slower than a dense index for locating
records.
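The lookup described above can be sketched as follows. This is an
illustrative sketch, not a definitive implementation: the block contents,
the block size of three and the names `blocks`, `sparse_keys` and
`sparse_lookup` are assumptions, and the file is assumed to be sorted on the
search key.

```python
from bisect import bisect_right

# Hypothetical sorted data file, split into blocks of up to three records.
blocks = [
    [(101, "Alice"), (102, "Bob"), (105, "Carol")],
    [(109, "Dave"), (112, "Erin"), (120, "Frank")],
    [(131, "Grace"), (140, "Heidi")],
]

# Sparse index: only the first search key of each block is stored.
sparse_keys = [block[0][0] for block in blocks]


def sparse_lookup(key):
    """Find the one block that could hold `key`, then scan inside it."""
    block_no = bisect_right(sparse_keys, key) - 1
    if block_no < 0:
        return None  # key is smaller than every indexed key
    for record in blocks[block_no]:
        if record[0] == key:
            return record
    return None
```

The index holds three entries for eight records; the price is the extra scan
inside the fetched block.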
Secondary Index
A secondary index is built on a field other than the one that determines the
physical order of the file. The field may be a candidate key, with a unique
value for each record, or a non-key field with repeated values. It is also
known as a non-clustering index.
You can have a secondary index for every search key. When values repeat, an
index record points to a bucket that contains pointers to all the records
with that search-key value.
Clustering Index
In a clustered index, the records themselves are stored in the index rather
than pointers to them. Sometimes the index is created on non-primary-key
columns, which might not be unique for each record. In such a situation, you
can group two or more columns to get unique values and create an index on
them, which is called a clustered index. This also helps you to identify the
record faster.
Multilevel Indexing
As the database grows, its indexes grow too. A single-level index may become
too large to keep in main memory, and searching it then takes multiple disk
accesses. Multilevel indexing splits the index into smaller blocks, each of
which can be stored in a single disk block. The outer index blocks point to
inner index blocks, which in turn point to the data blocks. The outer index
is small enough to be kept in main memory with little overhead.
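A two-level version of this scheme might look like the sketch below. The
entries, the block size and the names `inner_blocks`, `outer_keys` and
`find_data_block` are hypothetical; the point is only that the small outer
index narrows the search to one inner block.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # hypothetical number of index entries per index block

# Inner (first-level) index: sorted (search_key, data_block_no) entries.
inner_entries = [(10, 0), (20, 1), (30, 2), (40, 3), (50, 4), (60, 5), (70, 6)]

# Split the inner index into fixed-size blocks.
inner_blocks = [
    inner_entries[i:i + BLOCK_SIZE]
    for i in range(0, len(inner_entries), BLOCK_SIZE)
]

# Outer (second-level) index: the first key of each inner block.
outer_keys = [block[0][0] for block in inner_blocks]


def find_data_block(key):
    """Return the number of the data block that may hold `key`, or None."""
    outer_pos = bisect_right(outer_keys, key) - 1
    if outer_pos < 0:
        return None  # key is smaller than every indexed key
    inner_block = inner_blocks[outer_pos]
    inner_keys = [k for k, _ in inner_block]
    inner_pos = bisect_right(inner_keys, key) - 1
    return inner_block[inner_pos][1]
```

Only `outer_keys` has to stay in memory; each lookup then reads one inner
index block and one data block.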
Index and Tree Structure
Here we use tree data structures to build indexes.
A BST (Binary Search Tree) is a data structure with the property that all
the keys in the left subtree of a node are smaller than the node's key and
all the keys in the right subtree are larger.
However, a B-Tree is better suited to indexing than a BST, because each of
its nodes can hold many keys, which keeps the tree shallow.
B Tree
A B-Tree is a specialized m-way search tree that is widely used for disk
access. A node in a B-Tree of order m can have at most m-1 keys and m
children. One of the main reasons for using a B-Tree is its ability to store
a large number of keys, and large key values, in a single node, which keeps
the height of the tree relatively small.
The nodes need not all have the same number of children, but every node
except the root and the leaves must have at least ⌈m/2⌉ children.
Operations: Searching
To search for item 49 in a B-Tree whose root node holds 78:
1. Compare item 49 with the root node 78. Since 49 < 78, move to its left
sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. Since 49 > 45, move to the right and compare with 49.
4. Match found; return.
Searching in a B-Tree depends on the height of the tree. The search
algorithm takes O(log n) time to find any element in a B-Tree.
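The search steps can be sketched in Python as below. The tree the steps
refer to is not reproduced in the text, so the tree built here is an assumed
one that is merely consistent with them (root 78, a left child holding 40
and 56, and a leaf holding 45 and 49). The class and function names are also
hypothetical.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys stored in this node
        self.children = children or []  # empty list for a leaf


def btree_search(node, key):
    """Descend one node per level; return True if `key` is in the tree."""
    while True:
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False  # reached a leaf without finding the key
        node = node.children[i]


# Assumed tree, consistent with the walkthrough for item 49.
left = BTreeNode([40, 56], [
    BTreeNode([20, 30]),
    BTreeNode([45, 49]),
    BTreeNode([60, 70]),
])
right = BTreeNode([100], [BTreeNode([85]), BTreeNode([120])])
root = BTreeNode([78], [left, right])
```

The loop visits one node per level, so the number of nodes read is bounded
by the height of the tree.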
The ability to search on many keys is enabled by building multiple index files
(multikey file organisation) “on top of” the data file. The physical database
then consists of one or more data files and many index files, and each data
file contains either one or several record types. Each index file supports
access by a particular field or group of fields.
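For example, consider a bank whose data is used by bank tellers, account
holders, loan officers and the branch manager. All of these users access the
bank data, but each in a different way.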
Let us assume a sample data format for the Account relation in a bank, where
the applications refer to the same data but use different key values. All of
these applications therefore require the database file to be accessed in a
different format and order. There are two approaches to this:
By replicating data
By providing indexes
Replicating Data
One approach that may support efficient access for different applications is
to give each application its own replicated file of the data. Each file may
be organized in a different way to serve the requirements of a different
application.
For example, for the problem above, we may provide an indexed sequential
account file, keyed on account number, to the bank teller and the account
holders; a sequential file in order of permissible loan limit to the loan
officers; and a sorted sequential file in order of balance to the branch
manager.
These files thus differ in organization, and each application requires its
own replica. However, data replication brings problems of inconsistency when
the data is updated. Therefore, a better approach to accessing data by
multiple keys has been proposed.
Multiple indexes can be used to access a data file through multiple access
paths. In such a scheme only one copy of the data is kept; only the number
of access paths is increased, with the help of indexes.
A multi-list file maintains an index for each secondary key. Instead of a
list of all the primary keys related to a secondary-key value, the index
holds only one primary key value for it. That record is linked to the other
records in the data file that contain the same secondary-key value.
This idea generalizes easily to secondary key retrieval: we set up an index
for each key and allow records to be on more than one list. This leads to
the multi-list structure for file representation.
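A multi-list file can be sketched as follows. The Account records, the
`branch` secondary key and the `next_same_branch` link field are assumptions
chosen for illustration; the index stores only the head of each chain and
the rest of the list lives in the records themselves.

```python
# Hypothetical Account records, keyed by account number (the primary key).
# Each record carries a link to the next record with the same branch,
# or None at the end of the chain.
accounts = {
    101: {"branch": "North", "balance": 500, "next_same_branch": 104},
    102: {"branch": "South", "balance": 900, "next_same_branch": None},
    104: {"branch": "North", "balance": 250, "next_same_branch": 107},
    107: {"branch": "North", "balance": 700, "next_same_branch": None},
}

# Multi-list index: one entry per secondary-key value, holding only the
# primary key of the first record in that value's chain.
branch_index = {"North": 101, "South": 102}


def accounts_in_branch(branch):
    """Follow the chain through the data file and collect primary keys."""
    result = []
    current = branch_index.get(branch)
    while current is not None:
        result.append(current)
        current = accounts[current]["next_same_branch"]
    return result
```

Supporting a second secondary key would mean adding another link field to
each record and another small index, so a record can sit on several lists at
once.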
Inverted file organization is a file organization in which the index
structure is most important; the basic structure of the file records does
not matter much.
It is similar to multi-list file organization, with the key difference that
in a multi-list file the index points to a list, whereas in an inverted file
the index itself contains the list. The index entries are of variable length
because the number of records with the same key value changes, so
maintaining the index is more complex than in multi-list file organization.
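The sketch below illustrates an inverted index and the maintenance it needs
on update. The records, the `branch` field and the function names are
hypothetical. Note that the full list of primary keys is held in the index,
so changing a record's key value means editing two variable-length entries.

```python
# Hypothetical Account records, keyed by account number (the primary key).
accounts = {
    101: {"branch": "North"},
    102: {"branch": "South"},
    104: {"branch": "North"},
    107: {"branch": "North"},
}


def build_inverted_index(records, field):
    """Map each value of `field` to the list of primary keys that have it."""
    index = {}
    for primary_key, record in records.items():
        index.setdefault(record[field], []).append(primary_key)
    return index


def change_field(records, index, primary_key, field, new_value):
    """Update a record and move its primary key between index entries."""
    old_value = records[primary_key][field]
    index[old_value].remove(primary_key)
    if not index[old_value]:
        del index[old_value]  # drop entries that became empty
    records[primary_key][field] = new_value
    index.setdefault(new_value, []).append(primary_key)


branch_index = build_inverted_index(accounts, "branch")
```

A query for one branch is answered from the index alone, without following
links through the data file.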