B Plus Tree.pptx
Introduction of B+ Tree
• A B+ tree is a variation of the B-tree data structure.
• In a B+ tree, data pointers are stored only at the leaf nodes of the tree.
• In a B+ tree, the structure of a leaf node differs from the structure of the internal
nodes.
• The leaf nodes have an entry for every value of the search field, along with
a data pointer to the record (or to the block that contains this record).
• The leaf nodes of the B+ tree are linked together to provide ordered access
on the search field to the records.
• Internal nodes of a B+ tree are used to guide the search. Some search field
values from the leaf nodes are repeated in the internal nodes of the B+
tree.
Features of B+ Trees
• Balanced: B+ Trees are self-balancing: as data is added or removed, the tree
adjusts itself so that all leaf nodes stay at the same depth. Search time therefore grows only
logarithmically with the number of keys and is the same for every key.
• Multi-level: B+ Trees are multi-level data structures, with a root node at the top and one or
more levels of internal nodes below it. The leaf nodes at the bottom level contain the keys
and the pointers to the actual data.
• Ordered: B+ Trees maintain the order of the keys in the tree, which makes it easy to perform
range queries and other operations that require sorted data.
• Fan-out: B+ Trees have a high fan-out, which means that each node can have many child
nodes. This reduces the height of the tree and increases the efficiency of searching and
indexing operations.
• Cache-friendly: B+ Trees are designed to be cache-friendly, which means that they can take
advantage of the caching mechanisms in modern computer architectures to improve
performance.
• Disk-oriented: B+ Trees are often used for disk-based storage systems because they are
efficient at storing and retrieving data from disk.
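The effect of fan-out on height can be checked with a little arithmetic. The sketch below is only an illustration: the function name levels_needed and the fan-out values are assumptions, and every node is assumed to be completely full.

```python
def levels_needed(num_keys: int, fan_out: int, leaf_capacity: int) -> int:
    """Number of internal levels needed above the leaves, assuming every
    node is completely full (a best case, used only for illustration)."""
    levels = 0
    capacity = leaf_capacity          # keys reachable with leaf nodes only
    while capacity < num_keys:
        capacity *= fan_out           # one more internal level multiplies the reach
        levels += 1
    return levels


# Compare a binary-like fan-out with fan-outs closer to a disk-based index.
for fan_out in (2, 10, 100):
    print(fan_out, levels_needed(1_000_000, fan_out, leaf_capacity=fan_out))
```

Under these assumptions, one million keys need 19 internal levels at fan-out 2 but only 2 at fan-out 100, which is why a high fan-out keeps the number of disk reads per search small.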
Why Use B+ Tree?
• B+ Trees are well suited to storage systems
with slow data access because they minimize
I/O operations and make disk access
efficient.
• B+ Trees are a good choice for database systems
and applications needing quick data retrieval
because of their balanced structure, which
gives predictable performance across a variety
of operations and supports efficient range-based
queries.
Difference between B+ Tree and B Tree
Parameters | B+ Tree | B Tree
Structure | Separate leaf nodes for data storage and internal nodes for indexing | Nodes store both keys and data values
Leaf Nodes | Leaf nodes form a linked list for efficient range-based queries | Leaf nodes do not form a linked list
Order | Higher order (more keys) | Lower order (fewer keys)
Key Duplication | Typically allows key duplication in leaf nodes | Usually does not allow key duplication
Disk Access | Better disk access due to sequential reads in a linked list structure | More disk I/O due to non-sequential reads in internal nodes
Applications | Database systems, file systems, where range queries are common | In-memory data structures, databases, general-purpose use
Performance | Better performance for range queries and bulk data retrieval | Balanced performance for search, insert, and delete operations
Memory Usage | Requires more memory for internal nodes | Requires less memory as keys and values are stored in the same node
Implementation of B+ Tree
• B-trees and B+ trees are generally employed to implement dynamic multilevel indexing.
The drawback of the B-tree is that it stores the data pointer for a key value alongside that
key in the node. Fewer entries then fit in each node, which increases the number of levels in
the B-tree and hence the search time for a
record.
• B+ tree eliminates the above drawback by storing data pointers only at the leaf nodes of the
tree.
• As data pointers are present only at the leaf nodes, the leaf nodes must necessarily store all
the key values along with their corresponding data pointers to the disk file block, in order to
access them.
• Moreover, the leaf nodes are linked to provide ordered access to the records. The leaf
nodes therefore form the first level of the index, with the internal nodes forming the other
levels of a multilevel index. Some of the key values of the leaf nodes also appear in the
internal nodes, simply to guide the search for a record.
• It is apparent that a B+ tree, unlike a B-tree, has two orders, ‘a’ and ‘b’, one for the internal
nodes and the other for the external (or leaf) nodes.
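The two orders follow from how many entries fit in one disk block. The sketch below works this out under stated assumptions: the block, key and pointer sizes are illustrative values, not figures from the slides, and the function names are hypothetical.

```python
def btree_order(block: int, key: int, data_ptr: int, tree_ptr: int) -> int:
    """Largest p with p*tree_ptr + (p-1)*(key + data_ptr) <= block.
    A B-tree node stores a data pointer next to every key."""
    return (block + key + data_ptr) // (tree_ptr + key + data_ptr)


def bplus_internal_order(block: int, key: int, tree_ptr: int) -> int:
    """Largest a with a*tree_ptr + (a-1)*key <= block.
    Internal B+ tree nodes store keys and tree pointers only."""
    return (block + key) // (tree_ptr + key)


def bplus_leaf_order(block: int, key: int, data_ptr: int, next_ptr: int) -> int:
    """Largest b with b*(key + data_ptr) + next_ptr <= block.
    Each leaf also needs room for one Pnext pointer."""
    return (block - next_ptr) // (key + data_ptr)


# Assumed sizes in bytes: 512-byte block, 9-byte key, 7-byte data pointer,
# 6-byte tree pointer (also used for Pnext).
print(btree_order(512, 9, 7, 6))            # B-tree node
print(bplus_internal_order(512, 9, 6))      # order 'a'
print(bplus_leaf_order(512, 9, 7, 6))       # order 'b'
```

With these assumed sizes a B-tree node holds 24 pointers, while a B+ tree internal node holds 34 and a leaf holds 31 entries. Dropping the data pointers from the internal nodes is what raises the fan-out.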
Structure of B+ Trees
B+ Trees contain two types of nodes:
• Internal Nodes: every internal node other than the root has
at least ceil(n/2) tree pointers, and no internal node has more than n.
• Leaf Nodes: a leaf node has at most n pointers.
The Structure of the Internal Nodes of
a B+ Tree of Order ‘a’
• Each internal node is of the form: <P1, K1, P2, K2, ..., Pc-1, Kc-1, Pc>
where c <= a, each Pi is a tree pointer (i.e., it points to another
node of the tree), and each Ki is a key value.
• Every internal node has: K1 < K2 < ... < Kc-1
• For each search field value ‘X’ in the sub-tree pointed at by Pi, the
following condition holds: Ki-1 < X <= Ki for 1 < i < c; X <= Ki for i = 1; and Ki-1 < X for
i = c.
• Each internal node has at most ‘a’ tree pointers.
• The root node has at least two tree pointers, while the other
internal nodes have at least ceil(a/2) tree pointers each.
• If an internal node has ‘c’ pointers, c <= a, then it has ‘c – 1’ key
values.
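The internal-node rules above can be sketched as a small class. This is a minimal illustration rather than a definitive implementation: the names InternalNode, child_for and is_valid_internal are assumptions, and the child pointers are plain strings for readability.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class InternalNode:
    keys: list       # K1 < K2 < ... < Kc-1
    children: list   # P1 ... Pc, always one more than the keys

    def child_for(self, x):
        """Return the pointer Pi whose sub-tree may contain x.
        bisect_left counts the keys strictly smaller than x, which
        matches the rule Ki-1 < X <= Ki from the slide."""
        return self.children[bisect_left(self.keys, x)]


def is_valid_internal(node: InternalNode, a: int, is_root: bool = False) -> bool:
    """Check the pointer-count and ordering rules for a tree of order a."""
    c = len(node.children)
    min_ptrs = 2 if is_root else -(-a // 2)      # ceil(a/2) without floats
    sorted_keys = all(k1 < k2 for k1, k2 in zip(node.keys, node.keys[1:]))
    return len(node.keys) == c - 1 and min_ptrs <= c <= a and sorted_keys


node = InternalNode(keys=[50, 70], children=["P1", "P2", "P3"])
print(node.child_for(50), node.child_for(58), node.child_for(71))
print(is_valid_internal(node, a=3))
```

Here 50 is routed to P1 because the rule uses X <= Ki, 58 to P2, and 71 to P3.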
The Structure of the Leaf Nodes of a
B+ Tree of Order ‘b’
• Each leaf node is of the form: <<K1, D1>, <K2, D2>,
..., <Kc-1, Dc-1>, Pnext> where c <= b, each Di is
a data pointer (i.e., it points to the actual record on
disk whose key value is Ki, or to a disk file block
containing that record), each Ki is a key
value, and Pnext points to the next leaf node in the B+
tree.
• Every leaf node has: K1 < K2 < ... < Kc-1, c <= b
• Each leaf node has at least ceil(b/2) values.
• All leaf nodes are at the same level.
Using the Pnext pointer, it is possible to traverse all the leaf
nodes just like a linked list, thereby achieving ordered
access to the records stored on disk.
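The leaf chain can be sketched as a linked list that supports range scans. The class name LeafNode, the function scan_from, and the string record names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeafNode:
    entries: list                       # [(key, data_pointer), ...] sorted by key
    next: Optional["LeafNode"] = None   # Pnext


def scan_from(leaf: Optional[LeafNode], low, high):
    """Yield (key, data_pointer) pairs with low <= key <= high by
    following Pnext from the given leaf."""
    while leaf is not None:
        for key, ptr in leaf.entries:
            if key > high:
                return                  # keys are ordered, so nothing further matches
            if key >= low:
                yield key, ptr
        leaf = leaf.next


# Three hypothetical leaves linked left to right.
third = LeafNode([(50, "r50"), (58, "r58")])
second = LeafNode([(30, "r30"), (40, "r40")], next=third)
first = LeafNode([(10, "r10"), (20, "r20")], next=second)

print(list(scan_from(first, 20, 50)))
```

In a full tree the scan would start at the leaf found by searching for the lower bound, rather than at the first leaf.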
Searching a Record in B+ Trees
• Let us find 58 in the B+ Tree. Start at the root
node, then move down to the leaf node that might contain a
record with key 58.
• In the image given, 58 lies between 50 and 70. Therefore, the search
follows the pointer to the third leaf node and finds 58 there.
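The search can be sketched in a few lines. Since the slide image is not reproduced here, the tree below is a hypothetical one in which 58 falls between the separators 50 and 70. The class Node, the function search, and the record names are all assumptions.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children   # list of Node for internal nodes, None for leaves
        self.records = records     # leaf only: records parallel to keys


def search(node, key):
    """Descend from the root to one leaf, then look for the key there."""
    while node.children is not None:
        # Follow Pi with Ki-1 < key <= Ki (the internal-node rule).
        node = node.children[bisect_left(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]
    return None


# Hypothetical two-level tree: one root and four leaves.
leaves = [
    Node([10, 30], records=["r10", "r30"]),
    Node([40, 50], records=["r40", "r50"]),
    Node([58, 70], records=["r58", "r70"]),
    Node([80, 90], records=["r80", "r90"]),
]
root = Node([30, 50, 70], children=leaves)

print(search(root, 58))   # found in the third leaf
print(search(root, 60))   # not present
```

Every lookup visits exactly one node per level, whether or not the key exists.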
Insertion in a B+ tree
During insertion, the following properties of the B+ Tree must be maintained:
• Each node can have a maximum of M children, and each node except the root has at
least ceil(M/2) children.
• Each node can contain a maximum of M – 1 keys, and each node except the root a minimum
of ceil(M/2) – 1 keys.
• The root has at least two children (unless it is a leaf) and at least one search key.
• During insertion, a node overflows when it contains more
than M – 1 search key values.
• Here M is the order of the B+ tree.
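The bounds above are simple arithmetic on M. A small helper makes them concrete. The names node_limits and overflows are assumptions for illustration.

```python
import math


def node_limits(M: int) -> dict:
    """Child and key bounds for a non-root node of a B+ tree of order M."""
    return {
        "max_children": M,
        "min_children": math.ceil(M / 2),
        "max_keys": M - 1,
        "min_keys": math.ceil(M / 2) - 1,
    }


def overflows(num_keys: int, M: int) -> bool:
    """A node overflows when it holds more than M - 1 keys."""
    return num_keys > M - 1


print(node_limits(3))
print(node_limits(5))
print(overflows(3, 3))
```

For M = 3 a node holds 1 to 2 keys, and for M = 5 it holds 2 to 4 keys.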
Steps for insertion in B+ Tree
1. Every element is inserted into a leaf node. So,
go to the appropriate leaf node.
2. Insert the key into the leaf node in increasing
order if this causes no overflow. If there is
an overflow, follow the
steps below to resolve it
while maintaining the B+ Tree properties.
Properties for insertion B+ Tree
Case 1: Overflow in leaf node
• Split the leaf node into two nodes.
• The first node contains ceil((M-1)/2) values.
• The second node contains the remaining values.
• Copy the smallest search key value from the
second node to the parent node (right-biased).
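The leaf-split rule can be sketched as a single function. The name split_leaf is an assumption, and the leaf is modelled as a plain sorted list of keys without data pointers.

```python
import math


def split_leaf(keys: list, M: int):
    """Split an overflowed leaf (M keys, one more than the maximum M - 1).
    Returns (first_node, second_node, key_copied_to_parent)."""
    cut = math.ceil((M - 1) / 2)
    first, second = keys[:cut], keys[cut:]
    # The separator is copied, not moved: it stays in the second leaf.
    return first, second, second[0]


print(split_leaf([6, 16, 26], M=3))
print(split_leaf([1, 3, 5, 7, 8], M=5))
```

The copied key remains in the second leaf, so every key is still present at the leaf level after the split.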
Figure: inserting 8 into a B+ Tree of order 5
Case 2: Overflow in non-leaf node
• Split the non-leaf node into two nodes.
• The first node contains ceil(M/2)-1 values.
• Move the smallest among the remaining values to the
parent.
• The second node contains the remaining keys.
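The non-leaf split differs from the leaf split in that the middle key is moved up rather than copied. A minimal sketch follows. The name split_internal is an assumption, and child pointers are shown as strings.

```python
import math


def split_internal(keys: list, children: list, M: int):
    """Split an overflowed non-leaf node (M keys, M + 1 children).
    Returns ((first_keys, first_children), key_moved_up,
             (second_keys, second_children))."""
    cut = math.ceil(M / 2) - 1
    up = keys[cut]                       # moved to the parent, kept in neither half
    first = (keys[:cut], children[:cut + 1])
    second = (keys[cut + 1:], children[cut + 1:])
    return first, up, second


print(split_internal([16, 26, 36], ["A", "B", "C", "D"], M=3))
```

Each half keeps one more child pointer than it has keys, so both halves are valid internal nodes.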
Figure: inserting 15 into a B+ Tree of order 5
Example to illustrate insertion on a B+
tree
Problem: Insert the following key values
6, 16, 26, 36, 46 on a B+ tree with order = 3.
Solution:
Step 1: The order is 3, so a node can hold
at most 2 search key values. As
insertion happens only at a leaf node in a B+
tree, insert the search key values 6 and 16 in
increasing order in the node.
Step 2: We cannot insert 26 in the same node, as
it causes an overflow in the leaf node. We have
to split the leaf node according to the rules. The first
part contains ceil((3-1)/2) values, i.e., only 6. The
second node contains the remaining values,
i.e., 16 and 26. Then copy the smallest
search key value from the second node to the
parent node, i.e., 16.
Step 3: The next value, 36, is to be
inserted after 26, but this causes an
overflow in that leaf node. Follow the
same steps to split the node. The first part
contains ceil((3-1)/2) values, i.e., only 16. The second
node contains the remaining values, i.e., 26 and 36.
Then copy the smallest search key value from
the second node to the parent node, i.e., 26.
Step 4: Now we have to insert 46 after 36, but
this causes an overflow in the leaf node, so we split the node according
to the rules. The first part contains 26 and the second part
contains 36 and 46. We also have to copy 36 to the parent
node, but this causes an overflow there, as only two search key values can be
accommodated in a node. Now follow the steps for overflow
in a non-leaf node.
The first node contains ceil(3/2)-1 values, i.e., ‘16’.
Move the smallest among the remaining keys to the parent, i.e., ‘26’ becomes the
new root node.
The second node contains the remaining key, i.e., ‘36’, and the
leaf nodes remain the same.
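The four steps can be replayed with a compact insertion sketch. This is one possible implementation under stated assumptions, not a definitive one: the names Node, BPlusTree and levels are hypothetical, only keys are stored (no data pointers or Pnext links), and keys equal to a separator are routed to the right child because the split rule copies the smallest key of the second node upward.

```python
import math
from bisect import bisect_right, insort


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children          # None marks a leaf


class BPlusTree:
    def __init__(self, order: int):
        self.M = order
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                         # the root overflowed: add a new root
            sep, right = split
            self.root = Node([sep], [self.root, right])

    def _insert(self, node, key):
        """Insert below node; return (separator, new_right_node) on a split."""
        if node.children is None:         # Case 1: leaf
            insort(node.keys, key)
            if len(node.keys) <= self.M - 1:
                return None
            cut = math.ceil((self.M - 1) / 2)
            right = Node(node.keys[cut:])
            node.keys = node.keys[:cut]
            return right.keys[0], right   # separator is copied up
        i = bisect_right(node.keys, key)  # keys equal to a separator go right
        split = self._insert(node.children[i], key)
        if split is None:
            return None
        sep, right = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, right)
        if len(node.keys) <= self.M - 1:
            return None
        cut = math.ceil(self.M / 2) - 1   # Case 2: non-leaf overflow
        up = node.keys[cut]               # separator is moved up
        new_right = Node(node.keys[cut + 1:], node.children[cut + 1:])
        node.keys, node.children = node.keys[:cut], node.children[:cut + 1]
        return up, new_right


def levels(tree):
    """Keys of every node, level by level from the root down."""
    out, level = [], [tree.root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level if n.children for c in n.children]
    return out


tree = BPlusTree(order=3)
for k in (6, 16, 26, 36, 46):
    tree.insert(k)
print(levels(tree))
```

After the five insertions the root holds 26, the two internal nodes hold 16 and 36, and the leaves are [6], [16], [26] and [36, 46], matching Step 4.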
Deletion in B+ Trees
• Deletion in B+ Trees is not just deletion; it is a combined
process of searching, deletion, and balancing.
• In the last step of the deletion process, it is mandatory to
rebalance the B+ Tree; otherwise, it no longer satisfies the properties of a B+
Tree.
• Deletion removes a key-value pair from the appropriate leaf
node. The internal nodes are then updated so that the tree
keeps its properties after the deletion.
The deletion strategy for the B+ tree
• Locate the key to be deleted in the leaf nodes.
• Delete the key and its associated value if the key is found in a leaf
node.
• One of the following steps should be taken if the node underflows
(the number of keys is less than half the maximum allowed):
– Borrow a key from a sibling node if the sibling contains more keys
than the required minimum.
– If every sibling node holds only the minimum number of keys, merge
the underflowing node with one of its siblings and update the parent
node as necessary.
• After a merge, remove all references to the removed leaf node from the internal nodes of
the tree.
• If the root node becomes empty, remove it and make its remaining child the new
root.
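The borrow and merge steps can be sketched at the leaf level. This is a simplified illustration under several assumptions: the tree has a single parent whose children are all leaves, the parent is modelled as a list of separators, the minimum is taken as ceil(M/2) - 1 keys with M >= 3, and underflow of the parent itself is not handled. The name delete_key is hypothetical.

```python
import math
from bisect import bisect_right


def delete_key(separators: list, leaves: list, key, M: int) -> bool:
    """Delete key from a parent (separators) with leaf children (leaves).
    Both lists are modified in place. Returns False if the key is absent."""
    min_keys = math.ceil(M / 2) - 1
    i = bisect_right(separators, key)          # leaf that may hold the key
    leaf = leaves[i]
    if key not in leaf:
        return False
    leaf.remove(key)
    if len(leaf) >= min_keys or len(leaves) == 1:
        return True                            # no underflow
    left = leaves[i - 1] if i > 0 else None
    right = leaves[i + 1] if i + 1 < len(leaves) else None
    if left is not None and len(left) > min_keys:
        leaf.insert(0, left.pop())             # borrow the largest key from the left
        separators[i - 1] = leaf[0]
    elif right is not None and len(right) > min_keys:
        leaf.append(right.pop(0))              # borrow the smallest key from the right
        separators[i] = right[0]
    elif left is not None:
        left.extend(leaf)                      # merge into the left sibling
        del leaves[i]
        del separators[i - 1]                  # drop the reference to the removed leaf
    else:
        leaf.extend(right)                     # merge the right sibling into this leaf
        del leaves[i + 1]
        del separators[i]
    return True


# Hypothetical order-5 tree: each leaf needs at least 2 keys.
separators = [20, 40]
leaves = [[5, 10, 15], [20, 30], [40, 50]]
delete_key(separators, leaves, 30, M=5)        # underflow fixed by borrowing 15
print(separators, leaves)
delete_key(separators, leaves, 50, M=5)        # no sibling can lend: merge
print(separators, leaves)
```

A complete implementation would apply the same borrow-or-merge logic to internal nodes, and would replace the root when it is left with a single child.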
Illustration
• Look in the leaf nodes for the key number 5. The node [1 3 5] contains the
key.
• Remove the value associated with the key 5, creating the node [1 3].
• The node [1 3] underflows because it contains fewer keys than the
minimum number required. From its sibling node, we can obtain a key [3
4]. We borrow key 4 in this instance, resulting in the nodes [1 3 4] and [5
7].
• Remove all references to the deleted leaf node from the internal nodes of
the tree. We must delete the reference to the node [1 3 5] from its parent
node [1 3 4 11] because it was merged with the node [5 7]. The node [1 3
4 11] is the consequence of this.
• The deletion is finished since the root node [11 14 20] is not empty.
Advantages of B+ Trees
• A B+ tree with ‘l’ levels can store more entries in its internal nodes
than a B-tree with the same ‘l’ levels. This
significantly improves the search time for any given
key. Having fewer levels, together with the Pnext pointers, means
that B+ trees are very quick and efficient at accessing records
on disk.
• Data stored in a B+ tree can be accessed both sequentially and
directly.
• Every record takes the same number of disk accesses to fetch, because all leaf nodes are at the same level.
• B+ trees have redundant search keys, whereas in a B-tree search keys cannot be
stored repeatedly.
Disadvantages of B+ Trees
• The major drawback of the B-tree is the difficulty
of traversing the keys sequentially. The B+ tree
retains the rapid random access property of
the B-tree while also allowing rapid sequential
access.
Application of B+ Trees
• Multilevel Indexing
• Faster operations on the tree (insertion,
deletion, search)
• Database indexing