
unit5_trie

Unit 5

 Balanced Tree
 AVL Tree
 Red Black Tree
 Multi way search Tree
 B- Tree
 Binary Trie
 Multi-way Trie
 Suffix Tree
Trie data structure
What is a Trie data structure?

 The word "trie" is taken from the middle of the word "retrieval".


 A trie is an ordered, tree-based data structure that stores a set of strings.
 Each node holds one child pointer for every character of the
alphabet.
 It can find a word in the dictionary by following the word's prefix, one character at a time.
 For example, if we assume that all strings are formed from the letters 'a' to
'z' in the English alphabet, each trie node can have a maximum
of 26 pointers.

 Trie is also known as the digital tree or prefix tree. The position of a node in
the Trie determines the key with which that node is connected.
Properties of the Trie for a set of
strings:
 The root node of the trie always represents the null node (the empty string).
 The children of each node are sorted alphabetically.
 Each node can have a maximum of 26 children (A to Z).
 Each node (except the root) stores one letter of the alphabet.
Basic operations of Trie
 There are three operations in the Trie:
 Insertion of a node
 Searching a node
 Deletion of a node

Insertion of a node in the Trie


 The first operation is to insert a new key into the trie.
 Every letter of the input key (word) is inserted as an individual trie node.
 Note that the children of a node point to the next level of trie nodes.

 Each character of the key acts as an index into the children array.


 If the present node already has a reference to the present letter, set the present
node to that referenced node. Otherwise, create a new node, set its letter to
the present letter, and set the present node to this new node.
 The length of the key determines how deep the insertion goes, so the longest key determines the depth of the trie.
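The insertion steps above can be sketched as follows. This is a minimal illustration, assuming keys contain only the lower-case letters 'a' to 'z'; the names `TrieNode`, `children`, `is_end`, and `insert` are chosen for this sketch and do not come from the slides.

```python
class TrieNode:
    """One trie node: 26 child slots (one per letter) and an end-of-word flag."""

    def __init__(self):
        self.children = [None] * 26   # index 0 -> 'a', ..., index 25 -> 'z'
        self.is_end = False           # True if a key ends at this node


def insert(root, key):
    """Insert `key` below `root`, creating nodes only where none exist yet."""
    node = root
    for ch in key:
        index = ord(ch) - ord("a")    # the character acts as an index into children
        if node.children[index] is None:
            node.children[index] = TrieNode()
        node = node.children[index]   # move down one level
    node.is_end = True                # mark the end of the inserted key


root = TrieNode()
for word in ["tree", "trie", "try"]:
    insert(root, word)
```

All three words share the nodes for "t" and "r", so only the differing letters create new nodes.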

 Searching a node in Trie


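 The second operation is to search for a key in the trie. Searching is
similar to insertion: starting at the root, follow the child pointer for each letter of the key.
If a pointer is missing, the key is not in the trie; if every letter is matched and the final
node marks the end of a word, the key is found.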
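A search can be sketched as below, again assuming lower-case keys and a per-node end-of-word flag. The names `TrieNode`, `insert`, and `search` are hypothetical and chosen for this example only.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26   # one slot per letter 'a'..'z'
        self.is_end = False           # True if a key ends here


def insert(root, key):
    node = root
    for ch in key:
        index = ord(ch) - ord("a")
        if node.children[index] is None:
            node.children[index] = TrieNode()
        node = node.children[index]
    node.is_end = True


def search(root, key):
    """Return True only if `key` was inserted as a whole word."""
    node = root
    for ch in key:
        node = node.children[ord(ch) - ord("a")]
        if node is None:              # missing pointer: the key is absent
            return False
    return node.is_end                # a bare prefix does not count as a key


root = TrieNode()
insert(root, "trie")
insert(root, "tree")
print(search(root, "trie"))   # True
print(search(root, "tri"))    # False: "tri" is only a prefix
```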

 Deletion of a node in the Trie


 The third operation is the deletion of a key from the trie. Before we begin the
implementation, it is important to understand some points:
 If the key is not found in the trie, the delete operation stops and leaves the trie unchanged.
 If the key is found in the trie, delete it, removing only the nodes that no other key shares.
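One possible way to implement deletion is a recursive walk that unmarks the key and then prunes nodes on the way back up. This is a sketch under the same assumptions as before (lower-case keys, an end-of-word flag); the helper names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26
        self.is_end = False


def insert(root, key):
    node = root
    for ch in key:
        index = ord(ch) - ord("a")
        if node.children[index] is None:
            node.children[index] = TrieNode()
        node = node.children[index]
    node.is_end = True


def search(root, key):
    node = root
    for ch in key:
        node = node.children[ord(ch) - ord("a")]
        if node is None:
            return False
    return node.is_end


def delete(node, key, depth=0):
    """Remove `key`; return True if `node` is now unused and may be pruned."""
    if depth == len(key):
        if not node.is_end:
            return False                  # key not in the trie: change nothing
        node.is_end = False
        return not any(node.children)     # prune only if no longer key passes here
    index = ord(key[depth]) - ord("a")
    child = node.children[index]
    if child is None:
        return False                      # key not in the trie: stop
    if delete(child, key, depth + 1):
        node.children[index] = None       # drop the unused child
        return not node.is_end and not any(node.children)
    return False


root = TrieNode()
for word in ["tree", "trie"]:
    insert(root, word)
delete(root, "tree")
print(search(root, "tree"), search(root, "trie"))   # False True
```

The nodes for "t" and "r" survive because "trie" still uses them.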
Applications of Trie
1. Spell Checker
 Spell checking is a three-step process: look for the word in a dictionary, generate
possible suggestions, and then sort the suggestions with the most likely word at the
top.
 A trie is used to store the words of the dictionary. A spell checker can then be implemented
efficiently by searching for words in this data structure. Using a trie not only
makes it easy to look up a word in the dictionary, but also makes it simple to build an algorithm
that collects relevant words or suggestions.
2. Auto-complete
 Auto-complete functionality is widely used in text editors, mobile applications, and on the
Internet. A trie provides a simple way to find words that complete the user's input, for the
following reasons:
 It provides an alphabetical filter of entries by the key of the node.
 We only trace pointers to reach the node that represents the string entered by the user.
 As soon as you start typing, it tries to complete your input.
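A prefix lookup of this kind might look like the sketch below. For brevity it uses nested dictionaries instead of fixed 26-slot arrays, and the `"$"` end-of-word marker and the names `build_trie` and `complete` are assumptions made for this example.

```python
def build_trie(words):
    """Build a dictionary-based trie; "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True              # assumes "$" never occurs inside a key
    return root


def complete(root, prefix):
    """Return every stored word that starts with `prefix`, in alphabetical order."""
    node = root
    for ch in prefix:                 # trace pointers down to the prefix node
        if ch not in node:
            return []
        node = node[ch]
    results = []

    def walk(current, path):
        if "$" in current:
            results.append(prefix + path)
        for ch in sorted(k for k in current if k != "$"):
            walk(current[ch], path + ch)

    walk(node, "")
    return results


trie = build_trie(["car", "card", "care", "cat", "dog"])
print(complete(trie, "car"))   # ['car', 'card', 'care']
```

Visiting children in sorted order is what yields the alphabetical filter mentioned above.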

3. Browser history
 A trie is also used to complete URLs in the browser. The browser keeps a history of the
URLs of the websites you've visited.

Advantages of Trie
 Insertion and search take time proportional to the length of the key, which can be
faster than a binary search tree and faster than a hash table in its worst case (many collisions).
 It provides an alphabetical filter of entries by the key of the node.

Disadvantages of Trie
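 It requires more memory to store the strings, because each node holds a pointer for every character of the alphabet.
 In practice it can be slower than a hash table for exact lookups, since each character of the key requires following a pointer.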
Multiway tries
 A binary trie uses radix search with radix 2; a multiway trie uses radix search with
radix R > 2.
 Multiway tries are sometimes called R-ary tries.

 If each digit in a key has r bits, the radix is R = 2^r, and if keys have at most B bits,
the worst-case number of comparisons would be only B/r.
 However, to implement this idea, a node in the trie must be able to have as many
as R children.
 Examples:
 Keys are words made up of lower-case letters in English. There are 26 different lower-
case letters in English, so an R-ary trie with R=26 could hold these keys. (This specific
variant is sometimes called an "alphabet trie".)
 Keys are decimal integers made up of decimal digits. There are 10 different decimal
digits, so an R-ary trie with R=10 could hold these keys.
 Keys are 128-bit IEEE quadruple-precision floating point numbers. Consider each as made up of
32 4-bit nybbles. There are 2^4 = 16 different nybble values, so an R-ary trie with R=16
could hold these keys (note that lexicographic ordering of such keys is not the same as
their numeric ordering).
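To make the relation between B, r, and R concrete, the sketch below splits an integer key into r-bit digits, most significant digit first. Each digit would select one of R = 2^r children at one level of the trie. The function name `radix_digits` and the 16-bit sample key are assumptions chosen for illustration.

```python
def radix_digits(key, total_bits, bits_per_digit):
    """Split a non-negative integer of `total_bits` bits into digits of
    `bits_per_digit` bits each, most significant digit first."""
    if total_bits % bits_per_digit != 0:
        raise ValueError("total_bits must be a multiple of bits_per_digit")
    count = total_bits // bits_per_digit          # B / r levels in the trie
    mask = (1 << bits_per_digit) - 1              # selects the low r bits
    return [
        (key >> (bits_per_digit * (count - 1 - i))) & mask
        for i in range(count)
    ]


# B = 16 bits, r = 4 bits per digit, so R = 2**4 = 16 and B/r = 4 levels.
print(radix_digits(0xBEEF, 16, 4))        # [11, 14, 14, 15]
# With r = 1 (a binary trie) the same key needs 16 levels.
print(len(radix_digits(0xBEEF, 16, 1)))   # 16
```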
Suffix tree
 In algorithms for string processing and pattern matching, a suffix tree is a
data structure that compactly represents all the suffixes of a given string. It allows
for quick pattern searching and other string-related operations.
 It was introduced by Weiner in 1973; Ukkonen's 1995 algorithm builds it online in linear
time for a fixed-size alphabet. It is now a key idea in bioinformatics and computer science.
 A suffix tree is simply a compressed version of a trie.
 It is a trie that has all of a string's suffixes compressed into it.
 Suffix trees can be used to address several string-related issues.
 Pattern matching, spotting distinctive substrings within a string, and figuring out
the longest palindrome are a few of these issues.
 A suffix is a substring that consists of all the characters in the string from a
particular location to the very end.
 For instance, the suffixes of the string "banana" are "banana", "anana", "nana", "ana",
"na", and "a". These suffixes are all stored in a tree-like data structure called a suffix tree.
An ordered tree data structure called a trie is effective at storing a dynamic set of strings.
In an uncompressed suffix trie, each edge corresponds to a single character; in the suffix tree,
chains of single-child nodes are merged, so an edge can be labeled with several characters.
The paths from the root to the leaves spell out the suffixes of the original string.
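The sketch below lists the suffixes of a string and stores them in a plain, uncompressed suffix trie built from nested dictionaries. It is meant only to illustrate the idea: it is not a compressed suffix tree and not Ukkonen's algorithm, and it uses quadratic space. The function names are hypothetical.

```python
def suffixes(text):
    """Return every suffix of `text`, longest first."""
    return [text[i:] for i in range(len(text))]


def build_suffix_trie(text):
    """Insert every suffix into a dictionary-based trie (one character per edge)."""
    root = {}
    for suffix in suffixes(text):
        node = root
        for ch in suffix:
            node = node.setdefault(ch, {})
    return root


def contains(root, pattern):
    """A pattern occurs in the text if and only if it is a prefix of some suffix."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


print(suffixes("banana"))       # ['banana', 'anana', 'nana', 'ana', 'na', 'a']
trie = build_suffix_trie("banana")
print(contains(trie, "nan"))    # True
print(contains(trie, "nab"))    # False
```

Pattern matching takes time proportional to the pattern length, independent of the text length, which is the property that suffix trees keep while using far less space.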
[Figure: compressed trie for the given set of strings]
What is a B Tree?

 The B Tree is a special type of multiway search
tree, commonly known as the M-way tree, which
balances itself.
[Figure: B Tree of order 3]
 Because of their balanced structure, these trees are
commonly utilized to operate and manage immense
databases and simplify searches.
 In a B Tree of order n, each node can have at most n child
nodes.
 B Tree is an example of Multilevel Indexing in a
Database Management System (DBMS). Leaf and
Internal nodes will both have record references.
 B Tree is known as Balanced Stored Tree because all
the leaf nodes are at the same level.
Rules of the B Tree
1. All the leaf nodes are at the same level.
2. The B Tree data structure is defined by the term minimum
degree 'd'. The value of 'd' depends on the size of the disk
block.
3. Every node, excluding the root, must contain at least d-1
keys. The root node may contain as few as 1 key.
4. All nodes (including the root node) may contain at
most (2d-1) keys.
5. The number of children of an internal node is equal to the
number of keys present in it plus 1.
6. All keys of a node are sorted in ascending order. The child
between two keys, k1 and k2, contains all the keys
in the range between k1 and k2.
7. Unlike the Binary Search Tree, the B Tree data structure
grows and shrinks from the root, whereas the Binary
Search Tree grows and shrinks downward, at the leaves.
8. Similar to self-balancing binary search trees, the
time complexity of the B Tree data structure for
operations like searching, insertion, and deletion
is O(log n).
9. The insertion of a new key in the B Tree happens only at a
leaf node.
[Figure: B Tree of order 5]
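As a small worked example of rules 3 to 6, the sketch below computes the allowed number of keys for a given minimum degree d and checks a single node against those bounds. The function names `key_bounds` and `node_is_valid` are hypothetical and chosen for this illustration.

```python
def key_bounds(d, is_root=False):
    """Return (min_keys, max_keys) for a node in a B Tree of minimum degree d."""
    min_keys = 1 if is_root else d - 1   # rule 3
    max_keys = 2 * d - 1                 # rule 4
    return min_keys, max_keys


def node_is_valid(keys, num_children, d, is_root=False):
    """Check one node's key count, key order, and child count."""
    low, high = key_bounds(d, is_root)
    if not low <= len(keys) <= high:
        return False
    if list(keys) != sorted(keys):       # rule 6: keys in ascending order
        return False
    # rule 5: an internal node has one more child than keys; a leaf has none
    return num_children in (0, len(keys) + 1)


print(key_bounds(3))                          # (2, 5)
print(node_is_valid([10, 20], 3, d=3))        # True
print(node_is_valid([10], 2, d=3))            # False: too few keys for a non-root node
```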
Rules of the B Tree
Every B Tree depends upon a positive constant integer known
as MINIMUM, which is utilized in order to determine the number
of data elements that can be held in a single node.
Rule 1: The root can have as few as only one data element (or
even no data elements if it also has no children); every other node
has at least MINIMUM data elements.
Rule 2: The maximum number of data elements stored in a
node is twice the value of MINIMUM.
Rule 3: The data elements of each node of the B Tree are stored
in a partially filled array, sorted from the smallest data element
(at index 0) to the largest data element (at the final utilized
position of the array).
Rule 4: The total number of subtrees below a non-leaf node is
always one more than the number of data elements in that node.
The subtrees are numbered subtree 0, subtree 1, ...
Rule 5: With respect to any non-leaf node:
 A data element at index i is greater than all the data
elements in subtree number i of the node, and
 A data element at index i is less than all the data elements in
subtree number i+1 of the node.
Rule 6: Every leaf in a B Tree has the same depth. Thus, it
ensures that a B Tree prevents the problem of an unbalanced
tree.
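The six rules can be turned into a recursive checker, sketched below. The value MINIMUM = 2, the class `BTreeNode`, and the function `check` are assumptions made for this example; the sketch only validates an existing tree and does not insert or delete.

```python
MINIMUM = 2  # hypothetical value, chosen only for this sketch


class BTreeNode:
    def __init__(self, data, subtrees=None):
        self.data = data                 # data elements, smallest first (Rule 3)
        self.subtrees = subtrees or []   # empty list for a leaf


def check(node, is_root=True, low=None, high=None):
    """Return the depth of the leaves below `node`; raise ValueError on a broken rule."""
    count = len(node.data)
    if count > 2 * MINIMUM:
        raise ValueError("Rule 2: too many data elements")
    if is_root:
        if count == 0 and node.subtrees:
            raise ValueError("Rule 1: an empty root must have no children")
    elif count < MINIMUM:
        raise ValueError("Rule 1: too few data elements")
    if node.data != sorted(node.data):
        raise ValueError("Rule 3: data elements are not sorted")
    for item in node.data:               # bounds inherited from the ancestors
        if (low is not None and item <= low) or (high is not None and item >= high):
            raise ValueError("Rule 5: element outside the range of its subtree")
    if not node.subtrees:
        return 0                         # a leaf
    if len(node.subtrees) != count + 1:
        raise ValueError("Rule 4: wrong number of subtrees")
    bounds = [low] + node.data + [high]  # subtree i lies between bounds[i] and bounds[i+1]
    depths = {
        check(child, False, bounds[i], bounds[i + 1])
        for i, child in enumerate(node.subtrees)
    }
    if len(depths) != 1:
        raise ValueError("Rule 6: leaves at different depths")
    return depths.pop() + 1


tree = BTreeNode([10, 20], [BTreeNode([1, 5]), BTreeNode([12, 15]), BTreeNode([25, 30, 35])])
print(check(tree))   # 1
```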
Operations on a B Tree data structure

 In order to ensure that none of the properties of a B Tree data structure are
violated during the operations, nodes of the B Tree may be split or joined. The
following are some operations that we can perform on a B Tree:
 Searching a data element in B Tree
 Insertion of a data element in B Tree
 Deletion of a data element in B Tree
Searching Operation on a B Tree
 Step 1: The search begins from the root node. Compare the search
element, k, with the keys in the root.
 Step 1.1: If the root node contains the element k, the search is complete.
 Step 1.2: If the element k is less than the first key in the root, move to
the leftmost child and search that child recursively.
 Step 1.3.1: If the root has only one key (and so only two children) and k is greater than it,
move to the rightmost child and search that child recursively.
 Step 1.3.2: If the root has more than one key, compare k with the remaining keys and search
the child that lies between the two keys bracketing k, or the rightmost child if k is greater than every key.

 Step 2: If the search reaches a leaf node without finding k, then
the search element is not present in the B Tree.
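The search steps above can be sketched as follows. The node layout (`keys`, `children`), the function name `search`, and the small sample tree are assumptions made for this example; the tree is not the one from the slide's figure. Within a node, `bisect_left` from the standard library finds the first key that is not smaller than k.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys of this node
        self.children = children or []    # empty list for a leaf


def search(node, k):
    """Return the node that contains k, or None if k is not in the tree."""
    while node is not None:
        i = bisect_left(node.keys, k)     # first index with keys[i] >= k
        if i < len(node.keys) and node.keys[i] == k:
            return node                   # Step 1.1: found
        if not node.children:
            return None                   # Step 2: reached a leaf without finding k
        node = node.children[i]           # Steps 1.2 and 1.3: descend into child i
    return None


# Hypothetical sample tree.
root = BTreeNode([20, 40], [BTreeNode([5, 10]), BTreeNode([25, 34]), BTreeNode([50, 60])])
print(search(root, 34).keys)   # [25, 34]
print(search(root, 35))        # None
```

Child i holds exactly the keys that fall between keys[i-1] and keys[i], so the index returned by `bisect_left` is also the index of the child to descend into.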
Let us visualize the above steps with the help of an example.
Suppose that we wanted to search for a key k=34 in the following B Tree:

We compared the key with four different values in the above example until we found
it. In general, a search visits at most one node per level, and the height of a B Tree with n keys
is O(log n), so the time complexity of the search operation in a B Tree is O(log n).
