Indexing Improves
Sparse Index
Index records appear for only some of the items in the data file. Each
index record points to a block, as shown.
To locate a record, we find the index record with the largest search
key value less than or equal to the search key value we are looking
for.
We start at the record pointed to by that index record and follow the
pointers in the file (that is, sequentially) until we find the desired
record.
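The lookup steps above can be sketched roughly as follows. This is a minimal illustration, not a real storage engine: the names DATA_BLOCKS, SPARSE_INDEX, and sparse_lookup are hypothetical, the data file is modelled as in-memory lists, and the index is assumed to hold the first key of each block.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, value) records, sorted by key.
DATA_BLOCKS = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Sparse index (assumption): one entry per block, holding that block's first key.
SPARSE_INDEX = [(block[0][0], block_no) for block_no, block in enumerate(DATA_BLOCKS)]


def sparse_lookup(search_key):
    """Return the value stored under search_key, or None if it is absent."""
    index_keys = [key for key, _ in SPARSE_INDEX]
    # Find the index record with the largest key <= search_key.
    pos = bisect_right(index_keys, search_key) - 1
    if pos < 0:
        return None  # search_key is smaller than every indexed key
    block_no = SPARSE_INDEX[pos][1]
    # Scan sequentially, starting at the block the index record points to.
    for block in DATA_BLOCKS[block_no:]:
        for key, value in block:
            if key == search_key:
                return value
            if key > search_key:
                return None  # passed the position where the key would be
    return None
```

Calling sparse_lookup(50) picks the index entry for key 40, jumps to the second block, and scans forward until it reaches key 50.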
Number of accesses required = log₂(n) + 1, where n is the number of
blocks occupied by the index file.
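As a quick check of the formula above, the helper below evaluates it for a given number of index blocks. The function name sparse_index_accesses is hypothetical, and rounding log₂(n) up to a whole number of accesses is an assumption made here so that the result is an integer when n is not a power of two.

```python
import math


def sparse_index_accesses(index_blocks):
    """Estimate block accesses: binary search over the index blocks, plus one data-block read."""
    if index_blocks < 1:
        raise ValueError("index_blocks must be at least 1")
    # Assumption: round log2(n) up, since a partial access is not possible.
    return math.ceil(math.log2(index_blocks)) + 1
```

For an index file occupying 8 blocks, this gives log₂(8) + 1 = 3 + 1 = 4 accesses.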
Clustered Indexing
When multiple related records are stored together in the same file, this type
of storage is known as cluster indexing. Cluster indexing reduces the cost of
searching because multiple records related to the same thing are stored in one
place. It also helps when two or more tables (records) are joined frequently.
A clustering index is defined on an ordered data file, where the file is
ordered on a non-key field. In some cases the index is created on
non-primary-key columns, which may not be unique for each record. To identify
records faster in such cases, we group two or more columns together to obtain
unique values and create an index on them. This method is known as the
clustering index. Essentially, records with similar properties are grouped
together, and index entries are created for these groups.
For example, students are grouped by the semester they are studying in:
first-semester students, second-semester students, third-semester students,
and so on.
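The semester example can be sketched as below. This is only an illustration under stated assumptions: the STUDENTS records and the function names are hypothetical, the data file is an in-memory list ordered on the non-key field semester, and the index keeps one entry per distinct semester pointing at the first record of that group.

```python
# Hypothetical student records: (student_id, name, semester).
STUDENTS = [
    (101, "Asha", 1),
    (102, "Ben", 2),
    (103, "Chen", 1),
    (104, "Dana", 3),
    (105, "Eli", 2),
]


def build_clustering_index(records):
    """Order the file on semester and index the first position of each group."""
    # The data file is ordered on the non-key field (semester).
    ordered = sorted(records, key=lambda rec: rec[2])
    index = {}
    for pos, rec in enumerate(ordered):
        # Keep only the first position seen for each semester.
        index.setdefault(rec[2], pos)
    return ordered, index


def students_in_semester(ordered, index, semester):
    """Return all records in one semester by scanning its contiguous group."""
    start = index.get(semester)
    if start is None:
        return []
    group = []
    for rec in ordered[start:]:
        if rec[2] != semester:
            break  # the group ends where the semester value changes
        group.append(rec)
    return group
```

Because records of the same semester sit next to each other, one index entry per semester is enough to reach the whole group.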
Non-clustered or Secondary Indexing
A non-clustered index only tells us where the data lies, i.e. it gives us a
list of virtual pointers or references to the location where the data is
actually stored. Data is not physically stored in the order of the index.
Instead, the leaf nodes of the index hold pointers to the data. For example,
consider the contents page of a book. Each entry gives us the page number or
location of the information stored. The actual data here (the information on
each page of the book) is not organized, but we have an ordered reference (the
contents page) to where the data actually lies. A non-clustered index can only
be dense; sparse ordering is not possible because the data is not physically
organized in index order.
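A rough sketch of a dense secondary index follows. The HEAP_FILE records and the function names are hypothetical, and list positions stand in for record pointers. The file is left in arrival order, and the index holds one entry for every record.

```python
from bisect import bisect_left

# Hypothetical data file: records kept in arrival order, not sorted by name.
HEAP_FILE = [
    (103, "Chen"),
    (101, "Asha"),
    (104, "Dana"),
    (102, "Ben"),
]


def build_secondary_index(records, field):
    """Dense index: one (value, position) entry per record, sorted by value."""
    return sorted((rec[field], pos) for pos, rec in enumerate(records))


def secondary_lookup(index, records, value):
    """Find matching index entries, then follow each pointer into the data file."""
    keys = [key for key, _ in index]
    i = bisect_left(keys, value)
    matches = []
    while i < len(index) and index[i][0] == value:
        # Extra step: follow the pointer to fetch the actual record.
        matches.append(records[index[i][1]])
        i += 1
    return matches
```

Building the index with build_secondary_index(HEAP_FILE, 1) orders the names alphabetically while the records themselves stay where they are, much like a contents page pointing into a book.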
A non-clustered index requires more time than a clustered index because extra
work is needed to fetch the data by following the pointer. In a clustered
index, by contrast, the data is directly available at the index entry.