Lecture 05 and 06 Conventional Indexes
Conventional Indexes
Long Cheng
Assistant Professor
Table R
E# Salary ...
1 1500 ...
3 2100 ...
4 1800 ...
2 1200 ...
6 2300 ...
9 1200 ...
8 1900 ...
... ... ...

Data (on disk), one block per line
1 1500 … ; 2 1200 …   (a block)
3 2100 … ; 4 1800 …
6 2300 … ; 8 1900 …
9 1200 … ; … … …
…

1. Each block holds 2 records
2. Blocks are physically contiguous on the disk
3. The records are stored in ascending order of E# (within blocks and across blocks)
SELECT Salary FROM R
WHERE E# = 9

Data (on disk), one block per line
…
3 2100 … ; 4 1800 …
6 2300 … ; 8 1900 …
9 1200 … ; … … …
…

4 I/Os (no index: the blocks are read one by one until E# = 9 is found)
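The 4 I/Os can be reproduced with a minimal Python sketch. It assumes the block layout of Table R (2 records per block, sorted by E#) and counts one I/O per block read. The helper name `scan_for_salary` is hypothetical, and records hidden behind "…" in the slides are not modeled.

```python
# Table R as (E#, Salary) pairs, 2 records per block, sorted by E#.
# Records hidden behind "…" in the slides are not modeled.
DATA_BLOCKS = [
    [(1, 1500), (2, 1200)],
    [(3, 2100), (4, 1800)],
    [(6, 2300), (8, 1900)],
    [(9, 1200)],
]


def scan_for_salary(data_blocks, target):
    """Scan blocks in order; return (salary or None, number of block reads)."""
    ios = 0
    for block in data_blocks:
        ios += 1  # reading one block costs one I/O
        for e_no, salary in block:
            if e_no == target:
                return salary, ios
    return None, ios  # worst case: every block was read


print(scan_for_salary(DATA_BLOCKS, 9))
```

A key that is not in the file forces a read of every block, which is the worst case.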
SELECT Salary FROM R
WHERE E# = N

Data (on disk), one block per line
…
3 2100 … ; 4 1800 …
6 2300 … ; 8 1900 …
9 1200 … ; … … …
…

Many I/Os (all blocks will be accessed in the worst case)
SELECT Salary FROM R
WHERE E# = 9

We build an index (over the E#)

Index | Data (on disk)
… | …
3 | 3 2100 …
4 | 4 1800 …
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | … … …

…
3. Entries are sorted by the key
DATABASE SYSTEM PRINCIPLES: Lecture 05: Conventional Indexes 11
Introduction – Index at the Record Level
1. Each entry corresponds to one record
2. Many more entries than records fit in one block
• 4 entries / block (index)
• 2 records / block (data)
3. The index occupies far fewer blocks than the data

Index | Data (on disk)
1 | 1 1500 …
2 | 2 1200 …
3 | 3 2100 …
4 | 4 1800 …
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | … … …
SELECT Salary FROM R
WHERE E# = 9

Index | Data (on disk)
… | …
3 | 3 2100 …
4 | 4 1800 …
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | … … …

3 I/Os (fewer than when no index is used)
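A minimal Python sketch of a lookup through an index at the record level. It assumes 4 entries per index block and 2 records per data block, as in the slides, and it assumes the index blocks are read one after another until the key is found. The function names are hypothetical.

```python
ENTRIES_PER_INDEX_BLOCK = 4  # from the slides: 4 entries / block (index)

# Table R as (E#, Salary) pairs, 2 records per block, sorted by E#.
DATA_BLOCKS = [
    [(1, 1500), (2, 1200)],
    [(3, 2100), (4, 1800)],
    [(6, 2300), (8, 1900)],
    [(9, 1200)],
]


def build_dense_index(data_blocks, entries_per_block=ENTRIES_PER_INDEX_BLOCK):
    """One (key, data block number) entry per record, sorted, packed into index blocks."""
    entries = sorted(
        (e_no, block_no)
        for block_no, block in enumerate(data_blocks)
        for e_no, _salary in block
    )
    return [entries[i:i + entries_per_block]
            for i in range(0, len(entries), entries_per_block)]


def dense_lookup(index_blocks, data_blocks, target):
    """Return (salary or None, number of block reads)."""
    ios = 0
    for index_block in index_blocks:
        ios += 1  # read one index block
        for key, block_no in index_block:
            if key == target:
                ios += 1  # follow the pointer: read one data block
                return dict(data_blocks[block_no])[target], ios
            if key > target:
                return None, ios  # entries are sorted, so the key is absent
    return None, ios


INDEX_BLOCKS = build_dense_index(DATA_BLOCKS)
print(dense_lookup(INDEX_BLOCKS, DATA_BLOCKS, 9))
```

For E# = 9 the sketch reads two index blocks and one data block, which matches the 3 I/Os on the slide under these assumptions.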
Index | Data (on disk)
… | …
9 | 9 1200 …
… | … … …

… level
… based on (e.g., E#)
3. Entries are sorted by the key
Introduction – Index at the Block Level
1. Each entry corresponds to one (data) block
2. Index occupies fewer blocks (than an index at the record level)

Index | Data (on disk), one block per line
1 | 1 1500 … ; 2 1200 …
3 | 3 2100 … ; 4 1800 …
6 | 6 2300 … ; 8 1900 …
9 | 9 1200 …
… | …
SELECT Salary FROM R
WHERE E# = 9

Index | Data (on disk), one block per line
… | 3 2100 … ; 4 1800 …
6 | 6 2300 … ; 8 1900 …
9 | 9 1200 …
… | …

2 I/Os (fewer than when no index is used)
SELECT Salary FROM R
WHERE E# = 4

Index | Data (on disk), one block per line
… | 3 2100 … ; 4 1800 …
6 | 6 2300 … ; 8 1900 …
9 | 9 1200 …
… | …

2 I/Os
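A minimal Python sketch of a lookup through an index at the block level. It assumes the whole index fits in one index block (4 entries, as in the slides), so every lookup costs one index read plus one data-block read. The function names are hypothetical.

```python
from bisect import bisect_right

# Table R as (E#, Salary) pairs, 2 records per block, sorted by E#.
DATA_BLOCKS = [
    [(1, 1500), (2, 1200)],
    [(3, 2100), (4, 1800)],
    [(6, 2300), (8, 1900)],
    [(9, 1200)],
]


def build_block_index(data_blocks):
    """One (first key in block, block number) entry per data block."""
    return [(block[0][0], block_no) for block_no, block in enumerate(data_blocks)]


def block_lookup(index, data_blocks, target):
    """Return (salary or None, number of block reads)."""
    ios = 1  # assumption: the whole index sits in a single index block
    keys = [key for key, _ in index]
    pos = bisect_right(keys, target) - 1  # last entry whose key is <= target
    if pos < 0:
        return None, ios  # target is smaller than every indexed key
    ios += 1  # read the one data block that may hold the target
    for e_no, salary in data_blocks[index[pos][1]]:
        if e_no == target:
            return salary, ios
    return None, ios


INDEX = build_block_index(DATA_BLOCKS)
print(block_lookup(INDEX, DATA_BLOCKS, 9))
print(block_lookup(INDEX, DATA_BLOCKS, 4))
```

E# = 4 has no index entry of its own; the entry with key 3 leads to the block that contains it.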
• Sequential file: the records are stored in sorted order with respect to a key value
• Non-sequential file: the other case

Data (on disk), Sequential (wrt E#) | Data (on disk), Non-sequential
3 2100 … | 4 1800 …
4 1800 … | 2 1200 …
6 2300 … | 6 2300 …
8 1900 … | 9 1200 …
9 1200 … | 8 1900 …
… | …
Index at the Block Level
It works for a sequential file

Index | Data (on disk), Sequential; one block per line
1 | 1 1500 … ; 2 1200 …
3 | 3 2100 … ; 4 1800 …
6 | 6 2300 … ; 8 1900 …
9 | 9 1200 …
… | …
Index at the Block Level
What’s the salary of the employee with the E# of 9?

SELECT Salary FROM R
WHERE E# = 9

Index | Data (on disk), Non-sequential; one block per line
1 | 1 1500 … ; 3 2100 …
4 | 4 1800 … ; 2 1200 …
6 | 6 2300 … ; 9 1200 …
8 | 8 1900 …
… | …

Target is missed!
The lookup follows the entry with the largest key not exceeding 9 (key 8), but the record with E# = 9 is stored in the block indexed by key 6.
It does not work for a non-sequential file!
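A minimal Python sketch of the miss. It uses the non-sequential block layout from the slide and the same block-level lookup rule as for a sequential file (follow the last entry whose key is not larger than the target). The function names are hypothetical, and the index is assumed to fit in one block.

```python
from bisect import bisect_right

# Non-sequential file from the slide: records are NOT sorted by E#.
DATA_BLOCKS = [
    [(1, 1500), (3, 2100)],
    [(4, 1800), (2, 1200)],
    [(6, 2300), (9, 1200)],
    [(8, 1900)],
]


def build_block_index(data_blocks):
    """One (first key in block, block number) entry per data block."""
    return [(block[0][0], block_no) for block_no, block in enumerate(data_blocks)]


def block_lookup(index, data_blocks, target):
    """Return (salary or None, number of block reads)."""
    ios = 1  # assumption: the whole index sits in a single index block
    keys = [key for key, _ in index]  # 1, 4, 6, 8 (sorted in this example)
    pos = bisect_right(keys, target) - 1
    if pos < 0:
        return None, ios
    ios += 1
    for e_no, salary in data_blocks[index[pos][1]]:
        if e_no == target:
            return salary, ios
    return None, ios


INDEX = build_block_index(DATA_BLOCKS)
# The entry with key 8 is followed, but E# = 9 lives in the block indexed by 6.
print(block_lookup(INDEX, DATA_BLOCKS, 9))
```

The record exists in the file, yet the lookup reports nothing, because the first key of a block says nothing about the other keys in that block when the file is not sorted.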
Index at the Record Level
Index | Data (on disk), Sequential
… | …
Index at the Record Level
Index (sorted by key): 1, 2, 3, 4, …
Data (on disk), Non-sequential; one block per line
1 1500 … ; 3 2100 …
4 1800 … ; 2 1200 …
…
Index at the Record Level
What’s the salary of the employee with the E# of 9?

SELECT Salary FROM R
WHERE E# = 9

Index (sorted by key): 1, 2, 3, 4, 6, 8, 9, …
Data (on disk), Non-sequential; one block per line
1 1500 … ; 3 2100 …
4 1800 … ; 2 1200 …
6 2300 … ; 9 1200 …
8 1900 …
…
Index at the Record Level
Can we ALWAYS build an index at the record level?
Yes (for both sequential and non-sequential data files)
Dense index: …
Sparse index:
1. Block level
2. Smaller index (fewer blocks)
3. For sequential file only
Index | Data (on disk)
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | … … …

1. Find the target record
2. Remove the target record
3. Shift the following records forward
4. Update the index
Done!
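The four steps can be sketched in Python as follows. The slides do not say which record is removed, so the target E# = 6 is a hypothetical choice, and the helper names are assumptions. Shifting is modeled by repacking the remaining records into blocks of 2, and the index is simply rebuilt, which is a simplification.

```python
RECORDS_PER_BLOCK = 2

# Table R as (E#, Salary) pairs, 2 records per block, sorted by E#.
DATA_BLOCKS = [
    [(1, 1500), (2, 1200)],
    [(3, 2100), (4, 1800)],
    [(6, 2300), (8, 1900)],
    [(9, 1200)],
]


def pack(records, per_block=RECORDS_PER_BLOCK):
    """Lay records out block by block, leaving no gaps."""
    return [records[i:i + per_block] for i in range(0, len(records), per_block)]


def dense_index(blocks):
    """One (key, block number) entry per record, sorted by key."""
    return sorted((e_no, block_no)
                  for block_no, block in enumerate(blocks)
                  for e_no, _salary in block)


def delete_record(blocks, target):
    """Return (new blocks, new index) after deleting the record with key target."""
    records = [rec for block in blocks for rec in block]
    remaining = [rec for rec in records if rec[0] != target]  # steps 1 and 2
    new_blocks = pack(remaining)  # step 3: following records move forward
    return new_blocks, dense_index(new_blocks)  # step 4: update the index


# Hypothetical target: E# = 6.
NEW_BLOCKS, NEW_INDEX = delete_record(DATA_BLOCKS, 6)
print(NEW_BLOCKS)
print(NEW_INDEX)
```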
Index | Data (on disk)
… | 6 2300 …
… | 8 1900 …
… | 9 1200 …
… | …

Shift the following records forward
Done!
… pointer

Index | Data (on disk)
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | … … …

1. Find the place to insert the target record
2. Shift some records backward
3. Insert the target record
4. Update the index
Done!

Index | Data (on disk)
5 | 5 … …
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | …
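A minimal Python sketch of the insertion of the record with E# = 5 into the sequential file. The salary of that record is elided in the slides, so `None` stands in for it. The helper names are assumptions, and the shift is modeled by repacking all records into blocks of 2, which is a simplification.

```python
from bisect import bisect_left

RECORDS_PER_BLOCK = 2

# Table R as (E#, Salary) pairs, 2 records per block, sorted by E#.
DATA_BLOCKS = [
    [(1, 1500), (2, 1200)],
    [(3, 2100), (4, 1800)],
    [(6, 2300), (8, 1900)],
    [(9, 1200)],
]


def pack(records, per_block=RECORDS_PER_BLOCK):
    """Lay records out block by block, leaving no gaps."""
    return [records[i:i + per_block] for i in range(0, len(records), per_block)]


def dense_index(blocks):
    """One (key, block number) entry per record, sorted by key."""
    return sorted((e_no, block_no)
                  for block_no, block in enumerate(blocks)
                  for e_no, _salary in block)


def insert_record(blocks, record):
    """Return (new blocks, new index) after inserting record in key order."""
    records = [rec for block in blocks for rec in block]
    pos = bisect_left([e_no for e_no, _ in records], record[0])  # find the place
    records.insert(pos, record)  # later records move back by one slot
    new_blocks = pack(records)
    return new_blocks, dense_index(new_blocks)  # update the index


# The salary of E# = 5 is not given in the slides; None is a placeholder.
NEW_BLOCKS, NEW_INDEX = insert_record(DATA_BLOCKS, (5, None))
print(NEW_BLOCKS)
print(NEW_INDEX)
```

After the insertion, 5 shares a block with 6, and 8 shares a block with 9, as in the final frame.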
Similar to the case of dense index
Index | Data (on disk)
… | …
Similar keys are stored nearby (i.e., clustered)
Index | Data (on disk), Sequential (wrt E#)
… | …
Clustered Index
• Clustered index:
  • Sequential file (wrt a key)
  • Index is on the same key

Clustered index | Data (on disk), Sequential (wrt E#); one block per line
1 | 1 1500 … ; 2 1200 …
3 | 3 2100 … ; 4 1800 …
6 | 6 2300 … ; 8 1900 …
9 | 9 1200 …
… | …
Create a Clustered Index in SQL Server
… Key 2)
Index | Data (on disk), Non-Sequential
… | …
Similar keys may be scattered (i.e., non-clustered)
Non-clustered Index
• A non-clustered index is an index that is not a clustered one.
  • Case 1: Non-sequential file
  • Case 2: Sequential file (wrt Key 1) but the index is on Key 2

Non-clustered index (on Salary, sorted by key): 1200, 1500, 1800, 1900, 2100, 2300, 2500, …
Data (on disk), Sequential (E#); one block per line
1 1200 … ; 2 1800 …
3 1900 … ; 4 1500 …
6 2100 … ; 8 2500 …
9 2300 …
…

Similar keys may be scattered (i.e., non-clustered)
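A minimal Python sketch of why scattering matters. It uses the data of Case 2 (file sorted by E#, second field Salary) and counts how many data blocks hold the records of a key range. The two ranges are hypothetical choices that each match four records, and `blocks_touched` is an assumed helper name.

```python
# Case 2 data from the slide as (E#, Salary): sequential wrt E#, not wrt Salary.
DATA_BLOCKS = [
    [(1, 1200), (2, 1800)],
    [(3, 1900), (4, 1500)],
    [(6, 2100), (8, 2500)],
    [(9, 2300)],
]

E_NO, SALARY = 0, 1  # field positions inside a record


def blocks_touched(data_blocks, field, low, high):
    """Block numbers holding at least one record with low <= record[field] <= high."""
    return sorted({block_no
                   for block_no, block in enumerate(data_blocks)
                   for record in block
                   if low <= record[field] <= high})


# Four neighbouring E# values (3, 4, 6, 8): stored nearby.
print(blocks_touched(DATA_BLOCKS, E_NO, 3, 8))
# Four neighbouring Salary values (1500, 1800, 1900, 2100): scattered.
print(blocks_touched(DATA_BLOCKS, SALARY, 1500, 2100))
```

Both ranges select four records, but the range on the clustering key needs fewer data blocks than the range on the other key.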
Create a Non-clustered Index in SQL Server
WHERE E# … 3 and E# … 8

Index | Data (on disk)
… | …
9 | 9 1200 …
… | … … …

WHERE E# … 3 and E# … 8

Index | Data (on disk)
… | …
9 | 8 1900 …
… | …
Index | Data (on disk)
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | … … …

A few steps (locating, updating, shifting …)

At the end
Index | Data (on disk)
5 | 5 … …
6 | 6 2300 …
8 | 8 1900 …
9 | 9 1200 …
… | …
… target record

Index (sorted by key): …, 6, 8, 9
Data (on disk)
6 2300 …
9 1200 …
8 1900 …

1. Insert the target record
2. Update the index
Done!

Index (sorted by key): …, 5, 6, 8, 9
Data (on disk)
6 2300 …
9 1200 …
8 1900 …
5 … …
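A minimal Python sketch of this insertion into a non-sequential file: the record is appended at the end of the data, and only the sorted index gets a new entry in key order. The salary of E# = 5 is elided in the slides, so `None` stands in for it. The helper names are assumptions.

```python
from bisect import bisect_left

RECORDS_PER_BLOCK = 2

# Non-sequential file from the slides as (E#, Salary) pairs.
DATA_BLOCKS = [
    [(1, 1500), (3, 2100)],
    [(4, 1800), (2, 1200)],
    [(6, 2300), (9, 1200)],
    [(8, 1900)],
]


def build_dense_index(blocks):
    """One (key, block number) entry per record, sorted by key."""
    return sorted((e_no, block_no)
                  for block_no, block in enumerate(blocks)
                  for e_no, _salary in block)


def append_record(blocks, index, record, per_block=RECORDS_PER_BLOCK):
    """Append record at the end of the file and update the index in place."""
    if blocks and len(blocks[-1]) < per_block:
        blocks[-1].append(record)  # free slot in the last block
    else:
        blocks.append([record])  # start a new block
    entry = (record[0], len(blocks) - 1)
    index.insert(bisect_left(index, entry), entry)  # keep the index sorted


INDEX = build_dense_index(DATA_BLOCKS)
# The salary of E# = 5 is not given in the slides; None is a placeholder.
append_record(DATA_BLOCKS, INDEX, (5, None))
print(DATA_BLOCKS[-1])
print(INDEX)
```

No data record is shifted here; the only ordered structure that changes is the index.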
Build an index of indexes!!

Search for the record with key = 6

Index (2nd level): 1, 6, …
Index (1st level): 1, 2, 3, 4, 6, 8, 9, …
Data (on disk), one block per line
1 1500 … ; 2 1200 …
3 2100 … ; 4 1800 …
6 2300 … ; 8 1900 …
9 1200 … ; … … …
…

3 I/Os (even if there are many blocks at the 1st level)
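A minimal Python sketch of a two-level lookup. It assumes 4 entries per index block, a 1st-level index at the record level, and a 2nd-level index with one entry per 1st-level block that fits in a single block. The function names are hypothetical.

```python
from bisect import bisect_right

ENTRIES_PER_INDEX_BLOCK = 4

# Table R as (E#, Salary) pairs, 2 records per block, sorted by E#.
DATA_BLOCKS = [
    [(1, 1500), (2, 1200)],
    [(3, 2100), (4, 1800)],
    [(6, 2300), (8, 1900)],
    [(9, 1200)],
]


def build_first_level(data_blocks, per_block=ENTRIES_PER_INDEX_BLOCK):
    """Record-level index: (key, data block number), packed into index blocks."""
    entries = sorted((e_no, block_no)
                     for block_no, block in enumerate(data_blocks)
                     for e_no, _salary in block)
    return [entries[i:i + per_block] for i in range(0, len(entries), per_block)]


def build_second_level(first_level):
    """One (first key, 1st-level block number) entry per 1st-level index block."""
    return [(block[0][0], block_no) for block_no, block in enumerate(first_level)]


def two_level_lookup(second_level, first_level, data_blocks, target):
    """Return (salary or None, number of block reads)."""
    ios = 1  # assumption: the 2nd-level index fits in one block
    keys = [key for key, _ in second_level]
    pos = bisect_right(keys, target) - 1
    if pos < 0:
        return None, ios
    ios += 1  # read exactly one 1st-level index block
    for key, block_no in first_level[second_level[pos][1]]:
        if key == target:
            ios += 1  # read the data block
            return dict(data_blocks[block_no])[target], ios
    return None, ios


LEVEL1 = build_first_level(DATA_BLOCKS)
LEVEL2 = build_second_level(LEVEL1)
print(two_level_lookup(LEVEL2, LEVEL1, DATA_BLOCKS, 6))
```

The cost stays at one read per index level plus one data read, however many blocks the 1st level has, as long as the 2nd level fits in one block.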
Next lecture:
Lecture 07 and 08: B Tree Index