5b Tree Indexes
Alvin Cheung
Fall 2022
Reading: R & G Chapter 10
Simple Idea?
[Figure: input heap file with pages (3, 4, 5), (1, 2, 7), (8, 6, 9), (10, _, _)]
• Step 1: Sort heap file & leave some space
• Pages physically stored in logical order (sequential access)
• Maintenance as new records are added/deleted is a pain: in the worst case it takes B page updates, where B is the number of pages in the file (move everything down or up)
[Figure: sorted heap file with free space left on each page: (1, 2, _), (3, 4, _), (5, 6, _), (7, 8, _), (9, 10, _)]
• Step 2: Use binary search on this sorted heap file: log2(B) pages read
• Fan-out of 2 → deep tree → lots of I/Os
• Examine entire records just to read key during search: would prefer log2(K), where K is the number of pages needed to store just the keys (K << B)
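To make the log2(B) page-read cost concrete, here is a minimal sketch that counts page reads during a binary search over sorted pages. The function name `binary_search_pages` and the list-of-lists page layout are assumptions made for illustration, not part of the slides.

```python
def binary_search_pages(pages, key):
    """Binary search over sorted, non-empty pages.

    Returns (found, pages_read). Each loop iteration models one page I/O.
    """
    lo, hi = 0, len(pages) - 1
    pages_read = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        page = pages[mid]            # reading this page costs one I/O
        pages_read += 1
        if key < page[0]:
            hi = mid - 1             # key can only be on an earlier page
        elif key > page[-1]:
            lo = mid + 1             # key can only be on a later page
        else:
            return key in page, pages_read
    return False, pages_read


# Hypothetical sorted file with B = 5 pages (free slots omitted).
pages = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
found, reads = binary_search_pages(pages, 9)
print(found, reads)
```

With B = 5 pages the search touches at most floor(log2(5)) + 1 = 3 pages, and every probe loads whole records even though only the key is inspected.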
Let’s fix these assumptions
• Idea: Keep separate (compact) key lookup pages, laid out sequentially
• Maintain key → recordID mapping [We’ll revisit this later]
• No need to sort heap file anymore! Just sort key lookup pages
• Use binary search on lookup pages as opposed to on all of the data pages
• Still have a deep tree due to fan-out of 2 → lots of I/Os
• Also, maintenance of the key lookup pages is a pain! Worst case K updates
[Figure: sorted key lookup entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*; each entry holds a (Page Id, Slot Id) pointer into the unsorted heap file, Pages 1 to 4, with records (20, Tim) (7, Dan) (5, Kay) (3, Jim) (27, Joe) (34, Kit) (1, Kim) (42, Hal)]
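The key → recordID mapping can be sketched as follows. The records are the ones shown in the figure, but their assignment to pages and slots is an assumption made here (two records per page, in the order listed), and the names `heap_pages`, `lookup`, and `fetch` are hypothetical.

```python
from bisect import bisect_left

# Unsorted heap file: page id -> list of records, slot id = list position.
# The placement of records on pages is assumed for illustration.
heap_pages = {
    1: [(20, "Tim"), (7, "Dan")],
    2: [(5, "Kay"), (3, "Jim")],
    3: [(27, "Joe"), (34, "Kit")],
    4: [(1, "Kim"), (42, "Hal")],
}

# Compact, sorted lookup entries: (key, (page_id, slot_id)).
lookup = sorted(
    (record[0], (page_id, slot_id))
    for page_id, page in heap_pages.items()
    for slot_id, record in enumerate(page)
)
keys = [key for key, _ in lookup]


def fetch(key):
    """Binary search the sorted keys, then follow the record id into the heap."""
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None
    page_id, slot_id = lookup[i][1]
    return heap_pages[page_id][slot_id]


print(fetch(27))
```

Only the small lookup entries are kept sorted; the heap file itself stays in arrival order.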
Let’s fix these assumptions, take 2
• Idea: repeat the process!
• Lookup pages for the lookup pages
• And then lookup pages for the lookup pages for the lookup pages, …
• Let’s set fanout to be >> 2
• That is essentially the idea behind B+ Trees …
• We’ll find out why the pointers are helpful later
[Figure: root node (Page 1) with key 17; below it, Page 4 with keys 5 13 and Page 6 with keys 24 30; leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*, each holding a (Page Id, Slot Id) pointer into heap file Pages 1 to 4 with records (20, Tim) (7, Dan) (5, Kay) (3, Jim) (27, Joe) (34, Kit) (1, Kim) (42, Hal)]
Enter the B+ Tree, More Formally
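• Node[…, (KL, PL), (KR, PR), …] means that pointer PL leads to the values v with KL <= v < KR
• Example from the figure: the pointer between keys 5 and 13 leads to values v with 5 <= v < 13
[Figure: index node with values and pointers above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*]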
[Figure: three-level tree with the root node, Level 2, and Level 3; each leaf entry maps a key → pointer to record]
• At these capacities
• Height 1: 2000 (pointers from root) x 2000 (entries per leaf) = 2000^2 = 4,000,000
• Height 2: 2000 (pointers from root) x 2000 (pointers from level 2) x 2000 (entries per leaf) = 2000^3 = 8,000,000,000 records!!
• Core takeaway: Even depths of 3 allow us to index a massive # of records!
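The arithmetic above generalizes to any height. A small sketch, assuming every node is completely full and using the slide's figures of 2000 pointers per index node and 2000 entries per leaf as defaults (`max_records` is a hypothetical helper name):

```python
def max_records(height, fanout=2000, entries_per_leaf=2000):
    """Upper bound on indexed records when every node is completely full.

    `height` counts the levels of index nodes above the leaves.
    """
    return fanout ** height * entries_per_leaf


for h in (1, 2, 3):
    print(h, f"{max_records(h):,}")
```

Real trees are rarely 100% full, so actual capacity at a given height is lower, but the growth by a factor of the fanout per level is the point.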
Searching the B+ Tree
[Figure: root node 17 (Page 1) above index pages 5 13 (Page 4) and 24 30 (Page 6), above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*]
• Procedure:
• Find split on each node (Binary Search)
• Follow pointer to next node
Searching the B+ Tree: Find 27
[Figure: root node 17 (Page 1) above index pages 5 13 (Page 4) and 24 30 (Page 6), above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*]
• Find key = 27
• Find split on each node (Binary Search)
• Follow pointer to next node
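The search procedure can be sketched in a few lines. The `Leaf` and `Index` classes are hypothetical in-memory stand-ins for pages, and the grouping of leaf entries into leaves is read off the figure's keys (it is an assumption about the figure, not stated in the text). `bisect_right` performs the per-node binary search: it picks the child whose range satisfies KL <= key < KR.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys


class Index:
    def __init__(self, keys, children):
        self.keys = keys              # len(children) == len(keys) + 1
        self.children = children


def search(node, key):
    """Return (found, list of index-node key lists visited on the way down)."""
    path = []
    while isinstance(node, Index):
        path.append(node.keys)
        # Find the split in this node, then follow the pointer.
        node = node.children[bisect_right(node.keys, key)]
    return key in node.keys, path


# Tree from the figure (leaf boundaries assumed).
left = Index([5, 13], [Leaf([2, 3]), Leaf([5, 7, 8]), Leaf([14, 16])])
right = Index([24, 30],
              [Leaf([19, 20, 22]), Leaf([24, 27, 29]), Leaf([33, 34, 38, 39])])
root = Index([17], [left, right])

found, path = search(root, 27)
print(found, path)
```

Finding 27 visits the root (17), then the index node 24 30, then one leaf: one page per level.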
Searching the B+ Tree: Fetch Data
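[Figure: root node 17 (Page 1) above index pages 5 13 (Page 4) and 24 30 (Page 6), above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*; the (Page Id, Slot Id) pointer in the matching leaf entry is followed into the heap file with records (20, Tim) (7, Dan) (5, Kay) (3, Jim) (27, Joe) (34, Kit) (1, Kim) (42, Hal)]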
Searching the B+ Tree: Find 27 and up
[Figure: root node 17 (Page 1) above index pages 5 13 (Page 4) and 24 30 (Page 6), above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*]
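A "27 and up" query can be sketched as one point search followed by a walk along the leaf level. This assumes leaves are chained left to right with next pointers, which is a common B+ tree design and is how the pointers in the figures are read here; `Leaf`, `link`, and `scan_from` are hypothetical names.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None              # pointer to the right sibling leaf


def link(leaves):
    """Chain leaves left to right through their next pointers."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves


def scan_from(leaf, low):
    """Collect every key >= low, starting at the leaf a search for low reached."""
    out = []
    while leaf is not None:
        out.extend(k for k in leaf.keys if k >= low)
        leaf = leaf.next              # sequential hop, no return to the root
    return out


# Right-hand leaves of the figure's tree (leaf boundaries assumed).
leaves = link([Leaf([19, 20, 22]), Leaf([24, 27, 29]), Leaf([33, 34, 38, 39])])
start = leaves[1]                     # the leaf a point search for 27 ends in
result = scan_from(start, 27)
print(result)
```

After the initial descent, the scan never revisits index nodes.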
[Figure sequence: B+ tree whose root node holds 13 17 24 30 above leaf entries 2* 3* 5* 7* 14* 16* 19* 20* 24* 25* 29* 33* 34* 38* 39*. A new entry 26* is placed after 24* 25* 29* and the leaf is then re-sorted to 24* 25* 26* 29*.]
[Figure sequence: the same root above leaf entries 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*. Inserting 8* overflows the leaf 2* 3* 5* 7*, which splits into 2* 3* and 5* 7* 8*. The new leaf 5* 7* 8* says "I am an orphan!" until key 5 is added to the root node, giving 5 13 17 24 30.]
• Copy up from leaf the middle key and pointer to the orphan leaf
• This is what we need to access it
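The leaf split with copy up can be sketched as below, assuming a tree of order d in which a leaf holds at most 2d entries (d = 2 matches the figures). `insert_into_leaf` is a hypothetical helper that works on plain key lists rather than pages.

```python
from bisect import insort


def insert_into_leaf(keys, key, d):
    """Insert key into a sorted leaf that may hold at most 2d entries.

    Returns (left, right, copied_up). Without a split, right and
    copied_up are None. On a split, left keeps d entries, right gets
    d + 1, and the first key of right is COPIED up: it stays in the
    leaf and also goes to the parent with a pointer to right.
    """
    keys = list(keys)
    insort(keys, key)
    if len(keys) <= 2 * d:
        return keys, None, None
    left, right = keys[:d], keys[d:]
    return left, right, right[0]


print(insert_into_leaf([24, 25, 29], 26, d=2))   # fits, no split
print(insert_into_leaf([2, 3, 5, 7], 8, d=2))    # overflows, splits
```

The copied-up key must remain in the leaf because leaves hold the actual entries; the parent only needs it as a separator.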
Inserting 8* into a B+ Tree: Split Parent, Part 1
[Figure: parent node now holding 5 13 17 24 30 above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*]
• Copy up from leaf the middle key and pointer to the orphan leaf
• No room in parent? (Parent now has 2d+1 instead of 2d)
• Recursively split index nodes
• Redistribute the rightmost d+1 keys
Inserting 8* into a B+ Tree: Split Parent, Part 2
[Figure: parent node 5 13 17 24 30 being split, above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*]
• Copy up from leaf the middle key and pointer to the orphan leaf
• No room in parent? Recursively split index nodes
• Redistribute the rightmost d+1 keys
• Not enough: we now have two roots!
Inserting 8* into a B+ Tree: Root Grows Up
[Figure: before, root node 5 13 17 24 30; after, two index nodes 5 13 and 24 30; both above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*]
• Net effect
• d keys on the left and right => invariant satisfied!
• middle key pushed up
• Consolidate 5* into left node
Inserting 8* into a B+ Tree: Root Grows Up, Pt 3
[Figure: new root node 17 above index nodes 5 13 and 24 30, above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*]
• Net effect
• d keys on the left and right
• middle key pushed up
• Here, we ended up creating a new root and increasing depth => rare
Copy up vs Push up!
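[Figure: root node 17 above index nodes 5 13 and 24 30, above leaf entries 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*]
• Copy up (leaf split): key 5 appears in the index and still appears in a leaf as 5*
• Push up (index split): key 17 appears only in the root, not in either child index node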
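For contrast with a leaf split, here is a sketch of splitting an overfull index node, where the middle key is pushed up instead of copied. It assumes order d, so the overfull node holds 2d + 1 keys and 2d + 2 child pointers; `split_index_node` and the child labels are hypothetical.

```python
def split_index_node(keys, children, d):
    """Split an overfull index node with 2d + 1 keys and 2d + 2 children.

    Returns (left, middle, right). left and right are (keys, children)
    pairs with d keys each. The middle key is PUSHED up: it moves to the
    parent and appears in neither half.
    """
    if len(keys) != 2 * d + 1 or len(children) != 2 * d + 2:
        raise ValueError("node is not overfull by exactly one key")
    middle = keys[d]
    left = (keys[:d], children[:d + 1])
    right = (keys[d + 1:], children[d + 1:])
    return left, middle, right


# Overfull root from the slides; leaves are stood in for by labels.
keys = [5, 13, 17, 24, 30]
children = ["leaf0", "leaf1", "leaf2", "leaf3", "leaf4", "leaf5"]
left, middle, right = split_index_node(keys, children, d=2)
print(left[0], middle, right[0])
```

An index key is only a separator, so once it lives in the parent there is no reason to keep a copy below. When the node being split is the root, `middle` becomes the sole key of a new root and the tree grows one level.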
We will skip deletion
• In practice, occupancy invariant often not enforced during deletion
• Just delete leaf entries and leave space
• If new inserts come, great
• This is common
[Figure sequence: bulk loading the sorted leaf entries 1* 2* 3* … 24*. The index grows along its right edge: first a single node 4 7 10 13; then root 10 above nodes 4 7 and 13 16; then root 10 above nodes 4 7 and 13 16 19 22.]
• Benefits of bulk loading: better
• Cache utilization than insertion into random locations
• Utilization of leaf nodes (and therefore shallower tree)
• Layout of leaf pages (more sequential)
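A minimal bulk-loading sketch follows. Note that it swaps in a simpler technique than the one the figures appear to show: it packs full leaves from the sorted input and then builds each index level bottom-up, rather than inserting along the right edge and splitting. The dict-based node layout and the parameters `leaf_capacity` and `fanout` are assumptions for illustration.

```python
def bulk_load(sorted_keys, leaf_capacity, fanout):
    """Build a tree bottom-up from already-sorted keys.

    Returns (root, height), where height counts index levels.
    Simplification: the last node on a level may be underfull.
    """
    if not sorted_keys:
        raise ValueError("need at least one key")
    # Leaf level: fill pages left to right in key order.
    nodes = [
        {"min": sorted_keys[i], "keys": sorted_keys[i:i + leaf_capacity],
         "children": None}
        for i in range(0, len(sorted_keys), leaf_capacity)
    ]
    height = 0
    while len(nodes) > 1:
        parents = []
        for i in range(0, len(nodes), fanout):
            group = nodes[i:i + fanout]
            parents.append({
                "min": group[0]["min"],
                # Separator = smallest key in each child except the first.
                "keys": [child["min"] for child in group[1:]],
                "children": group,
            })
        nodes = parents
        height += 1
    return nodes[0], height


root, height = bulk_load(list(range(1, 25)), leaf_capacity=3, fanout=5)
print(height, root["keys"], [c["keys"] for c in root["children"]])
```

Because the input is sorted, every page is written once, in order, which is where the sequential layout and high leaf utilization come from.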
Summary
• B+ Tree is a powerful dynamic indexing structure
• Inserts/deletes leave tree height-balanced; log_F(N) cost
• High fanout (F) means height rarely more than 3 or 4.
• Higher levels stay in cache, avoiding expensive disk I/O
• Almost always better than maintaining a sorted file.
• Widely used in DBMSs!
• Bulk loading can be much faster than repeated inserts for creating a
B+ tree on a large data set.