Indexing in DBMS - Ordered Indices - Primary Index - Dense Index - Sparse Index - Secondary Index - Multilevel Indices - Clustering Index in Database
Introduction of Indexing
The main goal of designing a database is faster access to any data in it and quicker insert, delete and update of any data. This is because no one likes waiting. When a database is very large, even the smallest transaction takes time to perform. Indexes are used to reduce the time spent in transactions. Indexes are similar to book catalogues in a library, or to the index in a book. What do they do? They make our search simpler and quicker. The same concept is applied in a DBMS to access the files from memory.
When records are stored in primary memory such as RAM, accessing them is very easy and quick. But the number of records is usually far too large to fit in RAM, so we have to store them in secondary memory such as a hard disk. As we have seen already, records are not stored in memory the way we see them in tables. They are stored as files in different data blocks. Each block can store one or more records, depending on its size.
When we have to retrieve data or perform a transaction on it, we have to pull the records from memory, perform the transaction and save them back to memory. To do all this, we need a link between the records and the data blocks so that we know where the records are stored. This link between the records and the data blocks is called an index. It acts like a bridge between the records and the data blocks.
How do we create these indexes? How do these indexes help to access the data? Is linking records to data block addresses enough to give better performance? The answers to all these questions are covered in this article.
How do we index a book? We list the main topics first and group the different subtopics under them, right? We do the same thing in a database too. Each table has a unique column or primary key column which uniquely identifies each record in the table. Most of the time, we use this primary key to create the index. Sometimes we have to fetch records based on other columns which are not the primary key. In such cases we create an index on those columns. But what is this index? An index in a database is a pointer to the block address in memory. These pointers are stored in (column, block_address) format.
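The (column, block_address) format can be pictured with a minimal sketch. The block addresses, student names and function names below are made up for illustration and are not taken from any particular DBMS.

```python
# Hypothetical layout: each data block holds a few (student_id, name) records.
blocks = {
    200: [(1, "Asha"), (2, "Ben")],
    201: [(3, "Chen"), (4, "Dina")],
    202: [(5, "Eli"), (6, "Fay")],
}


def build_index(blocks):
    """Return a list of (column_value, block_address) pairs, sorted by value."""
    entries = []
    for address, records in blocks.items():
        for student_id, _name in records:
            entries.append((student_id, address))
    return sorted(entries)


def fetch(index, blocks, student_id):
    """Use the index to find the block, then read the record from that block."""
    for key, address in index:
        if key == student_id:
            for record in blocks[address]:
                if record[0] == student_id:
                    return record
    return None


index = build_index(blocks)
print(index[:3])                 # first few (column, block_address) pairs
print(fetch(index, blocks, 4))   # record found through the index
```

Only the small index is scanned to find the right block; the data blocks themselves are touched once.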
Ordered Indices
Imagine we have a student table with thousands of records, each of which is 10 bytes long. Imagine their IDs start from 1, 2, 3 and so on, and we have to search for the student with ID 678. In a normal database with no index, the DBMS searches the disk block from the beginning till it reaches 678. So it reaches this record after reading 677*10 = 6770 bytes. But if we have an index on the ID column, then the address of each record is stored as (1, 200), (2, 201), ..., (678, 879) and so on. One can imagine it as a smaller table with an index column and an address column. Now if we want to search for the record with ID 678, the DBMS searches using the index. Assuming each index entry is 2 bytes long, it traverses only 677*2 = 1354 bytes, which is much less than before. Hence retrieving the record from the disk becomes faster. In most cases these indexes are kept sorted to make searching faster. If the indexes are sorted, they are called ordered indices.
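The arithmetic above, and the benefit of keeping the index sorted, can be checked with a short sketch. The 2-byte entry size is an assumption implied by the 677*2 figure, and the addresses here are simply byte offsets chosen for illustration.

```python
from bisect import bisect_left

RECORD_SIZE = 10  # bytes per record, from the example above
ENTRY_SIZE = 2    # bytes per index entry (assumed; implied by 677 * 2)


def bytes_scanned(target_id, unit_size):
    """Bytes read by a linear scan before reaching target_id (IDs start at 1)."""
    return (target_id - 1) * unit_size


# Ordered index: (id, address) pairs sorted by id.
# Assumption: the address is the byte offset of the record in the file.
ordered_index = [(i, (i - 1) * RECORD_SIZE) for i in range(1, 1001)]
keys = [key for key, _ in ordered_index]


def lookup(target_id):
    """Binary search on the sorted keys; returns the address or None."""
    pos = bisect_left(keys, target_id)
    if pos < len(keys) and keys[pos] == target_id:
        return ordered_index[pos][1]
    return None


print(bytes_scanned(678, RECORD_SIZE))  # scan of the table
print(bytes_scanned(678, ENTRY_SIZE))   # scan of the index
print(lookup(678))                      # address found by binary search
```

Because the keys are sorted, the lookup does not even need to scan the index linearly; a binary search inspects only about ten entries out of a thousand.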
Primary Index
https://www.tutorialcup.com/dbms/indexing.htm 1/7
6/5/2017 IndexinginDBMSOrderedIndicesPrimaryIndexDenseIndexSparseIndexSecondaryIndexMultilevelIndicesClusteringIndexinDatabase
If the index is created on the primary key of the table, it is called primary indexing. Since primary keys are unique to each record and have a 1:1 relation with the records, it is much easier to fetch a record using them. Also, the primary keys are kept in sorted order, which helps the performance of transactions. Primary indexing is of two types: dense index and sparse index.
1. Dense Index
In a dense index, an index entry is stored for every search key value in the table, so every record can be reached directly from the index. The indexed column need not be the primary key alone. A user can fire a query based not only on the primary key column but on any column in the table, according to his requirement, and an index only on the primary key does not help in that case. Hence index entries are stored for all the search key columns on which we perform transactions.
For example, a student can be searched by his ID, which is the primary key. In addition, we search for students by first name, last name, age group, place of residence, course opted for and so on. That means most of the columns in the table can be used to search for a student based on different criteria. But if we have an index only on the ID, the other searches will not be efficient. Hence indexes on the other search columns are also stored to make the fetch faster.
Though this gives a quick search on any search key, the space used for the index and addresses becomes an overhead in memory. Here the (index, address) mapping becomes almost as large as the (table records, address) data itself. Hence more space is consumed to store the indexes as the number of records increases.
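To see where the space overhead comes from, here is a minimal sketch that builds one index per search column, with one entry for every record. The records, addresses and helper names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical student records stored at made-up addresses.
records = {
    500: {"id": 100, "first_name": "Ann", "course": "Math"},
    510: {"id": 101, "first_name": "Raj", "course": "Physics"},
    520: {"id": 102, "first_name": "Ann", "course": "Physics"},
}


def build_dense_index(records, column):
    """Map every value of `column` to the addresses of all records holding it."""
    index = defaultdict(list)
    for address, record in records.items():
        index[record[column]].append(address)
    return dict(index)


def total_entries(index):
    """Count the addresses stored in the index (one per record)."""
    return sum(len(addresses) for addresses in index.values())


id_index = build_dense_index(records, "id")
name_index = build_dense_index(records, "first_name")

print(id_index[102])              # direct hit on the primary key
print(name_index["Ann"])          # non-unique column: several addresses
print(total_entries(name_index))  # as many entries as there are records
```

Each indexed column costs one address per record, so indexing many columns makes the index storage approach the size of the table itself.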
2. Sparse Index
Sparse indexing was introduced to address the space overhead of dense indexing. In this method, a range of index column values shares the same data block address. When data is to be retrieved, the block at that address is scanned linearly till we get the requested data.
Let us see how the above example of a dense index is converted into a sparse index.
In the above diagram we can see that indexes are not stored for all the records; instead, indexes are stored for only 3 records. Now if we have to search for the student with ID 102, the index is searched for the largest ID less than or equal to 102, which returns the address of ID 100. The data at this address is then read linearly till we get the record for 102. Hence it makes searching faster and also reduces the storage space for indexes.
The range of column values covered by each index address can be increased or decreased depending on the number of records in the table. The main goal of this method is more efficient search with less memory space.
But if we have a very large table, a very wide range between indexed values will not work, because each linear scan becomes too long. We will have to make the column ranges considerably shorter. In this situation, the (index, address) mapping file grows as we have seen in dense indexing.
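The sparse lookup described above (find the largest indexed ID less than or equal to the target, then scan that block) can be sketched as follows. The block contents and names are hypothetical; only the IDs 100 and 102 come from the example in the text.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by ID, three per block.
data_blocks = [
    [(100, "Ann"), (101, "Raj"), (102, "Lee")],
    [(103, "Mia"), (104, "Tom"), (105, "Zoe")],
    [(106, "Ian"), (107, "Eve"), (108, "Sam")],
]

# Sparse index: only the first key of each block, with the block number.
sparse_index = [(block[0][0], number) for number, block in enumerate(data_blocks)]
sparse_keys = [key for key, _ in sparse_index]


def sparse_lookup(student_id):
    """Find the last index key <= student_id, then scan that block linearly."""
    pos = bisect_right(sparse_keys, student_id) - 1
    if pos < 0:
        return None  # smaller than every indexed key
    _, block_number = sparse_index[pos]
    for record in data_blocks[block_number]:
        if record[0] == student_id:
            return record
    return None


print(sparse_index)        # three entries for nine records
print(sparse_lookup(102))  # reached via the entry for ID 100
```

Widening the range per entry shrinks the index but lengthens the linear scan inside each block, which is the trade-off discussed above.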
Secondary Index
In sparse indexing, as the table size grows, the (index, address) mapping file also grows. These mappings are usually kept in primary memory so that the address fetch is fast. The actual data is later read from secondary memory based on the address obtained from the mapping. If the mapping grows too large, fetching the address itself becomes slower, and the sparse index is no longer efficient. To overcome this problem, the next version of sparse indexing is introduced, i.e. secondary indexing. (Elsewhere the term secondary index often means an index on a column that is not the primary key; in this article it means a second level of index.)
In this method, another level of indexing is introduced to reduce the (index, address) mapping size. Initially, large ranges of column values are selected so that the first level of mapping is small. Then each range is further divided into smaller ranges. The first level of mapping is stored in primary memory so that the address fetch is faster. The second level of mapping and the actual data are stored in secondary memory (hard disk).
In the above diagram, we can see that the column values are first divided into groups of 100s. These groups are stored in primary memory. In secondary memory, these groups are further divided into subgroups. The actual data records are then stored in the data blocks in secondary memory. Notice that each address in the first level index points to the first address of a range in the second level, and each second level index address points to the first address in a data block. If we have to search for any data between these values, the DBMS finds the corresponding address in the first and then the second level. Then it goes to that address in the data blocks and performs a linear search to get the data.
For example, to search for 111 in the above diagram, the DBMS looks in the first level index for the largest value less than or equal to 111, and gets 100. Then in the second level index it again looks for the largest value less than or equal to 111, and gets 110. Now it goes to the data block at address 110 and reads each record till it gets 111. This is how a search is done in this method. Inserts, deletes and updates locate the record in the same manner.
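The search for 111 can be traced with a small sketch of a two-level index. The layout is an assumption for illustration: first-level groups of 100, second-level subgroups of 10, and each address equal to the first ID it covers.

```python
from bisect import bisect_right


def floor_entry(entries, key):
    """Return the value paired with the largest entry key <= key, or None."""
    keys = [k for k, _ in entries]
    pos = bisect_right(keys, key) - 1
    return entries[pos][1] if pos >= 0 else None


# Data blocks of 10 IDs each, addressed by the first ID they hold (assumed).
data_blocks = {start: list(range(start, start + 10)) for start in range(100, 300, 10)}

# Second level (on disk): one index block per group of 100.
second_level = {
    group: [(start, start) for start in range(group, group + 100, 10)]
    for group in (100, 200)
}

# First level (in primary memory): one entry per group of 100.
first_level = [(100, 100), (200, 200)]


def search(student_id):
    """Return (first-level hit, second-level hit, position in block) or None."""
    group = floor_entry(first_level, student_id)
    if group is None:
        return None
    block_address = floor_entry(second_level[group], student_id)
    if block_address is None:
        return None
    for position, stored_id in enumerate(data_blocks[block_address]):
        if stored_id == student_id:
            return (group, block_address, position)
    return None


print(search(111))  # first level gives 100, second level gives 110
```

Only the two-entry first level needs to stay in primary memory; the second level is read from disk one small index block at a time.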
Multilevel Indexing
With the two-level method above, the growth of the index mapping is reduced considerably. But it can run into the same problem as the table size increases. To overcome this, we can introduce multiple levels between primary memory and secondary memory. This method is known as multilevel indexing. In this method the number of secondary level indexes is two or more.
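One way to picture multiple levels is to keep adding a smaller index on top until the topmost one fits in a single block. The sketch below assumes a made-up fan-out of 4 entries per index block and builds each level by sampling the one below it; real systems use much larger blocks and typically a B+ tree.

```python
from bisect import bisect_right

FANOUT = 4  # assumed number of entries that fit in one index block


def build_levels(sorted_keys, fanout=FANOUT):
    """Build index levels bottom-up until the top level fits in one block.

    Each level keeps every `fanout`-th key of the level below it.
    levels[0] is the top (smallest) level; levels[-1] is the full key list.
    """
    levels = [list(sorted_keys)]
    while len(levels[0]) > fanout:
        levels.insert(0, levels[0][::fanout])
    return levels


def search(levels, key, fanout=FANOUT):
    """Walk from the top level down; return the key's position or None."""
    index = 0  # position chosen in the level above (0 before the top level)
    for level in levels:
        start = index * fanout
        window = level[start:start + fanout]  # the one block read at this level
        local = bisect_right(window, key) - 1
        if local < 0:
            return None
        index = start + local
    return index if levels[-1][index] == key else None


keys = list(range(100, 200))
levels = build_levels(keys)
print([len(level) for level in levels])  # sizes from top level to bottom
print(search(levels, 137))               # position of key 137 in the bottom level
```

Each search reads one block per level, so adding a level costs one extra block read while letting the top level stay small.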
Clustering Index
In some cases, the index is created on non-primary key columns, which may not be unique for each record. In such cases, in order to identify the records faster, we group two or more columns together to get unique values and create an index out of them. This method is known as clustering index. Basically, records with similar characteristics are grouped together and indexes are created for these groups.
For example, students studying in each semester are grouped together, i.e. 1st semester students, 2nd semester students, 3rd semester students and so on.
In the above diagram we can see that an index entry is created for each semester in the index file. In the data blocks, the students of each semester are grouped together to form a cluster. The address in the index file points to the beginning of each cluster. Within the cluster's data blocks, the requested student ID is then searched sequentially.
New records are inserted into the clusters based on their group. In the above case, if a new student joins the 3rd semester, his record is inserted into the semester 3 cluster in secondary memory. The same is done for update and delete.
If a cluster runs short of space, new data blocks are added to that cluster.
This method of file organization is better than the other methods because it provides a clean distribution of records, which makes search easier and faster. But each cluster will have some unused space left, so it takes more memory than the other methods.
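The semester example can be sketched as follows: records are stored grouped by semester, and the index keeps one entry per semester pointing at the start of its cluster. The student data and function names are hypothetical, and list positions stand in for block addresses.

```python
# Hypothetical student records as (semester, student_id, name).
students = [
    (2, 104, "Tom"), (1, 100, "Ann"), (3, 107, "Eve"),
    (1, 101, "Raj"), (2, 105, "Zoe"), (3, 108, "Sam"),
]


def build_clusters(students):
    """Store records grouped by semester and index the start of each cluster."""
    ordered = sorted(students)  # sorts by semester first, then by ID
    cluster_index = {}
    for position, (semester, _student_id, _name) in enumerate(ordered):
        cluster_index.setdefault(semester, position)  # first position only
    return ordered, cluster_index


def find(ordered, cluster_index, semester, student_id):
    """Jump to the cluster start, then search sequentially within the cluster."""
    start = cluster_index.get(semester)
    if start is None:
        return None
    for record in ordered[start:]:
        if record[0] != semester:
            break  # reached the next cluster
        if record[1] == student_id:
            return record
    return None


ordered, cluster_index = build_clusters(students)
print(cluster_index)                           # one entry per semester
print(find(ordered, cluster_index, 2, 105))    # found inside the semester 2 cluster
```

The index has only as many entries as there are semesters, no matter how many students each cluster holds.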
Summary