CS202 Unit5 Slides
V R BADRI PRASAD
Department of Computer Science & Engineering
DATA STRUCTURES AND ITS APPLICATIONS
TRIE Trees – An Introduction
• A trie is a digital search tree; it need not be implemented as a binary tree.
• Each node in the tree can contain ‘m’ pointers, corresponding to the ‘m’ possible
symbols in each position of the key.
• Generally used to store strings.
• A trie, pronounced “try”, is a tree that exploits some structure in the keys.
- E.g., if the keys are strings, a binary search tree would compare entire strings,
but a trie would look at their individual characters.
- A trie is a tree where each node stores a bit indicating whether the string
spelled out to this point is in the set.
- Examples:
TRIE Trees – Numeric Keys: Example 2
[Figure: trie over numeric keys. Values visible in the slide text include 111, 180, 185, 195, 867 and 1867; the tree layout is not recoverable.]
TRIE Trees – Numeric Keys: Example 1
[Figure: trie nodes with one pointer per letter A to Z; an eok (end of key) pointer marks where a key ends.]
TRIE Trees – An Introduction
• Each node has an extra pointer corresponding to eok (end of key), or a flag with each pointer
indicating that it points to a record rather than to a tree node (normally the $ symbol is used).
• A pointer in the node is associated with a particular symbol value based on its
position in the node.
• The first pointer corresponds to the lowest value.
• The second pointer corresponds to the second lowest, and so forth.
• This way of implementing a digital search tree is called a trie.
• The word "trie" is taken from the word "retrieval".
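The positional rule above (pointer i belongs to the i-th lowest symbol) can be sketched in C for numeric keys. This is a minimal illustration under stated assumptions, not the course's implementation: the names `digitnode`, `digit_newnode`, `digit_insert` and `digit_search` are hypothetical, keys are strings of decimal digits, and a flag in each node stands in for the eok pointer.

```c
#include <stdlib.h>

#define M 10  /* m = 10 possible symbols ('0'..'9') in each key position */

/* Hypothetical node type: child[i] belongs to the i-th lowest symbol. */
struct digitnode {
    struct digitnode *child[M];
    int eok;  /* end-of-key flag, playing the role of the eok ($) pointer */
};

/* Allocate a node with all pointers NULL; returns NULL if malloc fails. */
struct digitnode *digit_newnode(void)
{
    struct digitnode *n = malloc(sizeof *n);
    int i;
    if (n == NULL)
        return NULL;
    for (i = 0; i < M; i++)
        n->child[i] = NULL;
    n->eok = 0;
    return n;
}

/* Insert a key made of decimal digits; returns 0 on success, -1 on failure. */
int digit_insert(struct digitnode *root, const char *key)
{
    struct digitnode *curr = root;
    int i;
    for (i = 0; key[i] != '\0'; i++) {
        int index = key[i] - '0';  /* position of the symbol inside the node */
        if (index < 0 || index >= M)
            return -1;             /* not a digit */
        if (curr->child[index] == NULL) {
            curr->child[index] = digit_newnode();
            if (curr->child[index] == NULL)
                return -1;         /* out of memory */
        }
        curr = curr->child[index];
    }
    curr->eok = 1;                 /* a key ends at this node */
    return 0;
}

/* Returns 1 if key was inserted earlier, 0 otherwise. */
int digit_search(const struct digitnode *root, const char *key)
{
    const struct digitnode *curr = root;
    int i;
    for (i = 0; key[i] != '\0'; i++) {
        int index = key[i] - '0';
        if (index < 0 || index >= M || curr->child[index] == NULL)
            return 0;
        curr = curr->child[index];
    }
    return curr->eok;
}
```

After inserting "180" and "185", searching for "18" fails even though the path exists, because no key ends at that node.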
TRIE Trees – Structure
• Tries are a specialized and useful data structure based on the prefixes of strings.
• Strings are stored in a top-to-bottom manner on the basis of their prefixes.
• All prefixes of length 1 are stored at level 1, all prefixes of length 2 are stored at
level 2, and so on.
Suffix Trie:
• A suffix trie stores every suffix of a string, which allows many
kinds of queries to be answered quickly.
• Example:
Text is “banana$”, where ‘$’ is the string-terminating character.
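A suffix trie for a text such as banana$ can be sketched by inserting every suffix into an ordinary trie; a pattern is then a substring of the text exactly when it labels a path starting at the root. The code below is a minimal sketch, not the slides' implementation: the names `sfxnode`, `sym_index`, `sfx_build` and `sfx_contains` are hypothetical, and the alphabet is assumed to be lowercase a to z plus '$' in an ASCII-compatible character set.

```c
#include <stdlib.h>
#include <string.h>

#define SYMBOLS 27  /* assumed alphabet: 'a'..'z' plus the terminator '$' */

struct sfxnode {
    struct sfxnode *child[SYMBOLS];
};

/* Map a symbol to its pointer position; -1 for unsupported symbols. */
int sym_index(char c)
{
    if (c == '$')
        return 26;
    if (c >= 'a' && c <= 'z')
        return c - 'a';
    return -1;
}

struct sfxnode *sfx_newnode(void)
{
    struct sfxnode *n = malloc(sizeof *n);
    int i;
    if (n != NULL)
        for (i = 0; i < SYMBOLS; i++)
            n->child[i] = NULL;
    return n;
}

/* Build a suffix trie by inserting every suffix of text.
   Returns NULL on an unsupported symbol or allocation failure
   (nodes built so far are not freed in this sketch). */
struct sfxnode *sfx_build(const char *text)
{
    struct sfxnode *root = sfx_newnode();
    size_t start, i, len = strlen(text);
    if (root == NULL)
        return NULL;
    for (start = 0; start < len; start++) {
        struct sfxnode *curr = root;
        for (i = start; i < len; i++) {
            int index = sym_index(text[i]);
            if (index < 0)
                return NULL;
            if (curr->child[index] == NULL) {
                curr->child[index] = sfx_newnode();
                if (curr->child[index] == NULL)
                    return NULL;
            }
            curr = curr->child[index];
        }
    }
    return root;
}

/* Returns 1 if pattern occurs somewhere in the text, 0 otherwise. */
int sfx_contains(const struct sfxnode *root, const char *pattern)
{
    const struct sfxnode *curr = root;
    size_t i;
    for (i = 0; pattern[i] != '\0'; i++) {
        int index = sym_index(pattern[i]);
        if (index < 0 || curr->child[index] == NULL)
            return 0;
        curr = curr->child[index];
    }
    return 1;
}
```

Building the trie this way costs time and space quadratic in the text length, which is the motivation for the compressed form.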
Suffix Trie – Building
• Example 1: S, the set of strings to include in the suffix trie for the word “minimize”:
e, ze, ize, mize, imize, nimize, inimize, minimize
Suffix Trie – Building – for the word minimize
[Figure: suffix trie containing every string in S (e, ze, ize, mize, imize, nimize, inimize, minimize), with one character per edge.]
Suffix Trie – Compressed Trie
[Figure: compressed suffix trie for minimize; edge labels visible in the slide text are e, i, mi, ze, mize and nimize.]
Suffix Trie – Compressed Trie – using numbers
[Figure: compressed trie with each edge label written as a pair of numbers; pairs visible in the slide text are 7,7; 6,7; 5,5; 2,7; 4,5.]
Example – Banana$
Suffix Trees – Introduction
[Figure: suffix tree.]
Search for a substring in a Suffix Tree
Applications:
• English dictionary
• Predictive text
• Auto-complete dictionaries found on mobile phones and other gadgets.
Advantages:
• Lookup is typically faster than in a BST, since the cost depends on the key length rather than on the number of keys stored.
• All the strings can easily be printed in alphabetical order.
• Prefix search can be done (auto-complete).
Disadvantages:
• A lot of memory is needed to store the strings.
• Every node stores many pointers, many of which may be NULL.
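Two of the advantages listed above, alphabetical printing and prefix search, come from the same depth-first walk that visits child pointers in the order a to z. The following is a hedged sketch: `collect_sorted` and `complete` are hypothetical helper names, keys are assumed to be lowercase a to z and shorter than 64 characters, and the output buffer is assumed to be large enough.

```c
#include <stdlib.h>
#include <string.h>

struct trienode {
    struct trienode *child[26];
    int endofword;
};

struct trienode *getnode(void)
{
    struct trienode *n = malloc(sizeof *n);
    int i;
    if (n != NULL) {
        for (i = 0; i < 26; i++)
            n->child[i] = NULL;
        n->endofword = 0;
    }
    return n;
}

/* Insert a lowercase key; silently stops if memory runs out. */
void insert(struct trienode *root, const char *key)
{
    struct trienode *curr = root;
    int i;
    for (i = 0; key[i] != '\0'; i++) {
        int index = key[i] - 'a';
        if (curr->child[index] == NULL) {
            curr->child[index] = getnode();
            if (curr->child[index] == NULL)
                return;
        }
        curr = curr->child[index];
    }
    curr->endofword = 1;
}

/* Depth-first walk in the order a..z, so words are appended in sorted order.
   word holds the path from the root; out receives the words, each followed
   by one space. */
void collect_sorted(const struct trienode *node, char *word, int depth, char *out)
{
    int i;
    if (node->endofword) {
        word[depth] = '\0';
        strcat(out, word);
        strcat(out, " ");
    }
    for (i = 0; i < 26; i++) {
        if (node->child[i] != NULL) {
            word[depth] = (char)('a' + i);
            collect_sorted(node->child[i], word, depth + 1, out);
        }
    }
}

/* Auto-complete: walk down the prefix, then collect everything below it.
   An empty prefix lists every stored word in alphabetical order. */
void complete(const struct trienode *root, const char *prefix, char *out)
{
    char word[64];
    const struct trienode *curr = root;
    int i;
    out[0] = '\0';
    for (i = 0; prefix[i] != '\0'; i++) {
        curr = curr->child[prefix[i] - 'a'];
        if (curr == NULL)
            return;  /* no stored word starts with this prefix */
        word[i] = prefix[i];
    }
    collect_sorted(curr, word, i, out);
}
```

With the words try, tree, trie and bst stored, `complete(root, "tr", out)` leaves "tree trie try " in `out`.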
TRIE Trees – Implementation
struct trienode
{
    struct trienode *child[26];  /* fields F1..F26, one per letter A..Z: address of the next node */
    int endofword;               /* end of word / key field (eok, $) */
};
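TRIE Trees – Implementation
/* Needs <stdlib.h>. The body of getnode() and the header of insert() are missing
   from the slide text; they are reconstructed here from the figure, which shows a
   new root node with all 26 pointers NULL and endofword 0. The name insert is assumed. */
struct trienode *getnode()
{
    int i;
    struct trienode *temp = malloc(sizeof(struct trienode));
    for (i = 0; i < 26; i++)
        temp->child[i] = NULL;
    temp->endofword = 0;
    return temp;
}

void insert(struct trienode *root, char *key)
{
    int i, index;
    struct trienode *curr;
    curr = root;
    for (i = 0; key[i] != '\0'; i++)
    {
        index = key[i] - 'a';                /* 'a' maps to 0, ..., 'z' maps to 25 */
        if (curr->child[index] == NULL)
            curr->child[index] = getnode();  /* create the missing node */
        curr = curr->child[index];
    }
    curr->endofword = 1;                     /* mark the end of the key */
}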
TRIE Trees – Implementation
[Figure: chain of nodes created while inserting a key, starting at root. Each node has child pointers A to Z and an endofword field of 0; pointers that are not on the path of the key are NULL.]
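TRIE Trees – Deletion Operation
/* The function header and the helpers push(), pop() and check() are not shown in the
   slide text. The name delete_key and the type struct stackitem are assumed here.
   push(node, index) and pop() maintain a stack of (node m, child index) pairs along
   the path of the key; check(node) returns the number of non-NULL children of node. */
void delete_key(struct trienode *root, char *key)
{
    int i, index, k;
    struct trienode *curr = root;
    struct stackitem x;
    for (i = 0; key[i] != '\0'; i++)
    {
        index = key[i] - 'a';               /* same mapping as in insertion */
        if (curr->child[index] == NULL)
        {
            printf("The word not found..\n");
            return;
        }
        push(curr, index);
        curr = curr->child[index];
    }
    curr->endofword = 0;                    /* the key no longer ends here */
    push(curr, -1);
    while (1)
    {
        x = pop();
        if (x.index != -1)
            x.m->child[x.index] = NULL;     /* unlink the child freed in the previous step */
        if (x.m == root)                    /* never free the root */
            break;
        k = check(x.m);
        if ((k >= 1) || (x.m->endofword == 1))
            break;                          /* node is still needed by another key */
        else
            free(x.m);
    }
    return;
}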
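The deletion idea (clear the end-of-word flag, then walk back up the path freeing nodes that no longer lead to any key) can be tried out in a self-contained form. This is a sketch under assumptions rather than the course code: the stack is replaced by two local arrays, `delete_key` and `MAXLEN` are hypothetical names, keys are lowercase a to z, and `check` is written out as a child counter.

```c
#include <stdlib.h>

#define MAXLEN 64  /* assumed upper bound on key length */

struct trienode {
    struct trienode *child[26];
    int endofword;
};

struct trienode *getnode(void)
{
    struct trienode *n = malloc(sizeof *n);
    int i;
    if (n != NULL) {
        for (i = 0; i < 26; i++)
            n->child[i] = NULL;
        n->endofword = 0;
    }
    return n;
}

void insert(struct trienode *root, const char *key)
{
    struct trienode *curr = root;
    int i;
    for (i = 0; key[i] != '\0'; i++) {
        int index = key[i] - 'a';
        if (curr->child[index] == NULL) {
            curr->child[index] = getnode();
            if (curr->child[index] == NULL)
                return;  /* out of memory */
        }
        curr = curr->child[index];
    }
    curr->endofword = 1;
}

int search(const struct trienode *root, const char *key)
{
    const struct trienode *curr = root;
    int i;
    for (i = 0; key[i] != '\0'; i++) {
        int index = key[i] - 'a';
        if (index < 0 || index >= 26 || curr->child[index] == NULL)
            return 0;
        curr = curr->child[index];
    }
    return curr->endofword;
}

/* Number of non-NULL children of a node. */
int check(const struct trienode *node)
{
    int i, k = 0;
    for (i = 0; i < 26; i++)
        if (node->child[i] != NULL)
            k++;
    return k;
}

/* Returns 1 if the key was deleted, 0 if it was not stored. */
int delete_key(struct trienode *root, const char *key)
{
    struct trienode *path[MAXLEN];  /* path[i]: node reached after i symbols */
    int idx[MAXLEN];                /* idx[i]: child index taken from path[i] */
    struct trienode *curr = root;
    int i, depth;
    for (i = 0; key[i] != '\0'; i++) {
        int index = key[i] - 'a';
        if (i >= MAXLEN || index < 0 || index >= 26 || curr->child[index] == NULL)
            return 0;
        path[i] = curr;
        idx[i] = index;
        curr = curr->child[index];
    }
    if (!curr->endofword)
        return 0;                   /* path exists, but no key ends here */
    curr->endofword = 0;
    depth = i;
    /* Walk back up; the root (depth 0) is never freed. */
    while (depth > 0 && check(curr) == 0 && curr->endofword == 0) {
        free(curr);
        depth--;
        curr = path[depth];
        curr->child[idx[depth]] = NULL;  /* unlink the node just freed */
    }
    return 1;
}
```

Deleting "cart" from a trie that also holds "car" frees only the node for the final t, because the node for r still marks the end of a key.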
TRIE Trees – Search Operation
int search(struct trienode *root, char *key)
{
    int i, index;
    struct trienode *curr;
    curr = root;
    for (i = 0; key[i] != '\0'; i++)
    {
        index = key[i] - 'a';            /* map 'a'..'z' to 0..25, as in insertion */
        if (curr->child[index] == NULL)
            return 0;                    /* path breaks: key not present */
        curr = curr->child[index];
    }
    if (curr->endofword == 1)
        return 1;                        /* a key ends exactly here */
    return 0;                            /* key is only a prefix of stored keys */
}
Introduction to Hashing :
- Hash Function
- Hash Table
- Creation of Hash Table
Introduction to Hashing
• Used for implementing dictionaries.
• Operations take roughly equal (constant) time on average, regardless of the number of keys stored.
• An efficient technique for retrieval of data is one that needs few comparisons.
• A hash table, or a hash map, is a data structure that associates keys (names) with values (attributes).
• A hash function is used to map keys to locations in the hash table.
• A key is stored at a memory location whose address is computed using the hash function.
Example:
• Consider a key 496005. Suppose the hash table has 10 memory locations; then the key is stored at the
location whose address is computed using the hash function key mod 10.
Address (index) is: 496005 mod 10 = 5.
The data 496005 is stored at the location with index five.
Hashing – Hash Function and Hash Table
• A good hash function is one that distributes keys evenly among all slots / indexes (locations).
• Design of a hash function is more an art than a science.
KEY → HASH FUNCTION → INDEX → Hash Table
• Consider key elements as 34, 46, 72, 15, 18, 26, 93, with hash function key mod 5.
• The first five keys go to indexes 4, 1, 2, 0 and 3. The next key, 26, maps to index 1, which is already occupied by 46.
• The same is true for the next data item, 93, as the location with index 3 is also occupied.
This results in a clash (collision).
The problem can be resolved by
• Increasing the memory capacity.
• Overcoming the collision using
  • Separate Chaining (open hashing)
  • Open Addressing (closed hashing):
    • Linear Probing
    • Quadratic Probing
    • Double Hashing
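The clash described above can be reproduced with a few lines of C. This is a small sketch assuming a table of 5 slots and the hash function key mod 5 for non-negative keys; `place_keys` is a hypothetical helper that stores keys with no collision handling at all and reports which keys clashed.

```c
#define SIZE 5  /* assumed number of slots in the hash table */

/* Division-method hash function: key mod SIZE (assumes key >= 0). */
int hash(int key)
{
    return key % SIZE;
}

/* Place n keys into table[] without any collision handling.
   Keys whose slot is already occupied are copied to clashed[];
   the return value is the number of such keys. */
int place_keys(const int *keys, int n, int *table, int *clashed)
{
    int used[SIZE] = {0};
    int i, clashes = 0;
    for (i = 0; i < n; i++) {
        int index = hash(keys[i]);
        if (used[index]) {
            clashed[clashes++] = keys[i];  /* slot taken: collision */
        } else {
            table[index] = keys[i];
            used[index] = 1;
        }
    }
    return clashes;
}
```

For the keys 34, 46, 72, 15, 18, 26, 93 the function reports two clashes, 26 and 93, which is why a collision-resolution scheme is needed.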
Hashing – Separate Chaining
Initially, the hash table contains NULL in the address field of every index.
• Consider key elements as 34, 46, 72, 15, 18.
• Hash function is key mod 5.
• 34 mod 5 = 4, 34 is stored at index 4.
• 46 mod 5 = 1, 46 is stored at index 1.
• 72 mod 5 = 2, 72 is stored at index 2.
• 15 mod 5 = 0, 15 is stored at index 0.
• 18 mod 5 = 3, 18 is stored at index 3.
Hash Table (after all five insertions)
Index   Address
0       15
1       46
2       72
3       18
4       34
Hashing :
- Insert Operation
- Display Operation
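Hashing – Separate Chaining
struct node
{
    int key;             /* e.g. 15 */
    char name[100];      /* name stored with the key */
    struct node *next;   /* next node in the same chain */
};
struct hash
{
    struct node *head;   /* address of the first node in the chain */
    int count;           /* number of nodes in the chain */
};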
Hashing: Insert Operation
/* Needs <stdlib.h> and <string.h>. */
void insert_to_hash(struct hash *ht, int size, int key, char *name)
{
    int index;
    struct node *temp;
    temp = malloc(sizeof(struct node));   /* allocate and fill the new node */
    temp->key = key;
    strcpy(temp->name, name);
    index = key % size;
    /* Insert the node at the beginning of the singly linked list. */
    temp->next = ht[index].head;
    ht[index].head = temp;
    ht[index].count++;
}
• 34 mod 5 = 4, 34 is stored at index 4.
• 44 mod 5 = 4, 44 is stored at index 4.
• 54 mod 5 = 4, 54 is stored at index 4.
Hash Table (after inserting 34, 44 and 54)
Index   Count   Address
0       0       NULL
1       0       NULL
2       0       NULL
3       0       NULL
4       3       54 -> 44 -> 34
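The chaining steps for 34, 44 and 54 can be run end to end with the sketch below. It follows the slide's `struct node`, `struct hash` and `insert_to_hash`, with a few assumptions made for safety: the function returns an int status instead of void, the name is copied with a length limit, and `find_in_hash` is a hypothetical lookup helper that does not appear in the slides.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    int key;
    char name[100];
    struct node *next;
};

struct hash {
    struct node *head;  /* first node of the chain for this index */
    int count;          /* number of nodes in the chain */
};

/* Insert at the head of the chain for key % size (assumes key >= 0).
   Returns 0 on success, -1 if memory allocation fails. */
int insert_to_hash(struct hash *ht, int size, int key, const char *name)
{
    int index = key % size;
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return -1;
    temp->key = key;
    strncpy(temp->name, name, sizeof temp->name - 1);
    temp->name[sizeof temp->name - 1] = '\0';
    temp->next = ht[index].head;  /* new node points at the old head */
    ht[index].head = temp;        /* new node becomes the head */
    ht[index].count++;
    return 0;
}

/* Hypothetical helper: return the node holding key, or NULL if absent. */
struct node *find_in_hash(struct hash *ht, int size, int key)
{
    struct node *temp = ht[key % size].head;
    while (temp != NULL && temp->key != key)
        temp = temp->next;
    return temp;
}
```

Because each new node is placed at the head, the chain at index 4 reads 54, 44, 34, the reverse of the insertion order.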
Hashing – Separate Chaining – Display Operation
void display(struct hash *ht, int size)
{
    int i;
    struct node *temp;
    printf("\n");
    for (i = 0; i < size; i++)
    {
        printf("%d : ", i);
        if (ht[i].head != NULL)
        {
            temp = ht[i].head;
            while (temp != NULL)          /* walk the chain at index i */
            {
                printf("%d", temp->key);
                printf("%s->", temp->name);
                temp = temp->next;
            }
        }
        printf("\n");
    }
}
Hash Table
Index   Count   Address
0       0       NULL
1       0       NULL
2       0       NULL
3       0       NULL
4       3       54 -> 44 -> 34
Display Output (names omitted):
0 :
1 :
2 :
3 :
4 : 54 -> 44 -> 34
Closed Hashing – Linear Probing
Allocation of memory
Hash table rows visible in the slide text:
Index   Key   Name   Mark
2       71    ABC    0
3       --    --     0
4       34    MNP    1
• Index is 71 % 5 = 1.
• Since index ‘1’ is a non-empty location, search for the first
empty location in the sequence,
i.e., the location with index value 1 + 1 = 2.
• Increment count by 1:
(*count)++;
return;
}
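The probing step described above (71 hashes to index 1, which is occupied, so the next location in sequence is tried) can be sketched as a complete insert routine. The names `struct slot` and `insert_linear` are assumptions, since the slide text preserves only the tail of the original function; mark is 1 for an occupied slot and 0 otherwise, and keys are assumed to be non-negative.

```c
#include <string.h>

/* Assumed slot layout, matching the Key / Name / Mark columns. */
struct slot {
    int key;
    char name[100];
    int mark;  /* 1 = occupied, 0 = free */
};

/* Insert with linear probing. Returns the index used, or -1 if the
   table is full. *count tracks the number of stored elements. */
int insert_linear(struct slot *ht, int size, int *count, int key, const char *name)
{
    int index, i;
    if (*count == size)
        return -1;                    /* no free location left */
    index = key % size;               /* home location */
    for (i = 0; i < size; i++) {
        if (ht[index].mark == 0) {    /* first free location in the sequence */
            ht[index].key = key;
            strncpy(ht[index].name, name, sizeof ht[index].name - 1);
            ht[index].name[sizeof ht[index].name - 1] = '\0';
            ht[index].mark = 1;
            (*count)++;
            return index;
        }
        index = (index + 1) % size;   /* try the next consecutive location */
    }
    return -1;
}
```

With a table of size 5 already holding 15 at index 0 and 46 at index 1, inserting 71 returns index 2.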
Hashing – Linear Probing – Deletion of an element
// If mark is 0, the element is not present at that location of the hash table.
// Otherwise, search for the element to be deleted:
index = key % size;
i = 0;
while (i < *count)
{
    if (ht[index].mark == 1)
    {   // indicates element is present
        if (ht[index].key == key)   // if found
        {
            ht[index].mark = 0;     // delete
            (*count)--;
            return;
        }
        i++;
    }
    index = (index + 1) % size;     // search for the element in the consecutive memory location
}
Hash Table
Key   Name   Mark
15    ABC    1
46    DEF    1
71    GHI    1
18    JKL    1
34    MNP    1
Closed Hashing – Quadratic Probing
Hashing – Quadratic Probing – Deletion of an element
if (*count == 0)
{
    printf("table empty..\n");
    return;
}
// If mark is 0, the element is not present at that location of the hash table.
// Otherwise, search for the element to be deleted:
index = key % size;
i = index;                          // home location
for (h = 1; h <= size; h++)
{
    if (ht[index].mark == 1)
    {   // indicates element is present
        if (ht[index].key == key)   // if found
        {
            printf("key Found and deleted..");
            ht[index].mark = 0;     // delete
            (*count)--;             // keep count in step with the table
            return;
        }
    }
    index = (i + h * h) % size;     // next location in the quadratic probe sequence
}
Hash Table
Key   Name   Mark
15    ABC    1
46    DEF    0
71    GHI    1
18    JKL    1
34    MNP    1
If key is 46, then the key is found.
Then, the mark field is set to 0.
This indicates that the element is deleted.
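Quadratic probing examines the locations at offsets 1, 4, 9, ... from the home index instead of consecutive locations. The sketch below shows insertion and deletion under that rule. The names `struct slot`, `insert_quadratic` and `delete_quadratic` are assumptions, keys are assumed non-negative, the name field is left out for brevity, and the table size is assumed small enough that h * h does not overflow. Note that a quadratic sequence does not necessarily visit every slot, so an insert can fail even when a free slot exists.

```c
/* Assumed slot layout; the name field is omitted in this sketch. */
struct slot {
    int key;
    int mark;  /* 1 = occupied, 0 = free or deleted */
};

/* Insert with quadratic probing: try (home + h*h) % size for h = 0, 1, 2, ...
   Returns the index used, or -1 if no free slot lies on the probe sequence. */
int insert_quadratic(struct slot *ht, int size, int *count, int key)
{
    int home = key % size;
    int h;
    for (h = 0; h < size; h++) {
        int index = (home + h * h) % size;
        if (ht[index].mark == 0) {
            ht[index].key = key;
            ht[index].mark = 1;
            (*count)++;
            return index;
        }
    }
    return -1;
}

/* Delete by following the same probe sequence. Returns the index that
   was cleared, or -1 if the key was not found. */
int delete_quadratic(struct slot *ht, int size, int *count, int key)
{
    int home = key % size;
    int h;
    for (h = 0; h < size; h++) {
        int index = (home + h * h) % size;
        if (ht[index].mark == 1 && ht[index].key == key) {
            ht[index].mark = 0;  /* mark 0 indicates the element is deleted */
            (*count)--;
            return index;
        }
    }
    return -1;
}
```

The search keeps probing past slots whose mark is 0, so 71 (home index 1, stored at index 2) is still found after 46 has been deleted from index 1.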
Double Hashing
Closed Hashing – Double Hashing
Index   0   1   2   3   4   5   6   7   8   9   10  11  12
Key     --  --  41  --  --  18  --  --  --  22  --  --  --
• Key – 18: h1(key) gives 5 as index / hash. Go to location 5. It is free. Assign 18 to location 5.
• Key – 41: h1(key) gives 2 as index / hash. Go to location 2. It is free. Assign 41 to location 2.
• Key – 22: h1(key) gives 9 as index / hash. Go to location 9. It is free. Assign 22 to location 9.
• Key – 44: h1(key) gives 5 as index / hash. Go to location 5. It is not free.
Use the double hashing function: index / hash = (hash1(key) + j * hash2(key)) % 13, with j = 1 as this is the first collision.
Here hash2(44) = 5, so index / hash = (5 + 1 * 5) % 13 = 10.
Closed Hashing – Double Hashing
Key   h1(key)   h2(key)   Double hash(key)
44    5         5         10
Index   0   1   2   3   4   5   6   7   8   9   10  11  12
Key     --  --  41  --  --  18  --  --  --  22  44  --  --
Use the double hashing function: index / hash = (hash1(key) + j * hash2(key)) % 13, with j = 1 as this is the first collision.
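The double hashing walk-through can be checked with a short C sketch. The slides do not state the two hash functions, so both are assumptions here: hash1(key) = key % 13 matches the indexes given for 18, 41 and 44, and hash2(key) = 7 - (key % 7) is one common choice that reproduces hash2(44) = 5. The names `insert_double`, `TABLE_SIZE` and `EMPTY` are also hypothetical, and keys are assumed to be non-negative.

```c
#define TABLE_SIZE 13
#define EMPTY (-1)  /* marker for a free location (assumes keys are >= 0) */

/* ASSUMPTION: primary hash is key mod table size. */
int hash1(int key)
{
    return key % TABLE_SIZE;
}

/* ASSUMPTION: secondary hash 7 - (key mod 7); never 0, and it gives
   hash2(44) = 5 as in the worked example. */
int hash2(int key)
{
    return 7 - (key % 7);
}

/* Insert with double hashing: probe (hash1 + j * hash2) % TABLE_SIZE for
   j = 0, 1, 2, ... Returns the index used, or -1 if the table is full. */
int insert_double(int *table, int key)
{
    int j;
    for (j = 0; j < TABLE_SIZE; j++) {
        int index = (hash1(key) + j * hash2(key)) % TABLE_SIZE;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}
```

Since the table size 13 is prime and hash2 lies between 1 and 7, the probe sequence visits every location before repeating. Inserting 18, 41, 22 and 44 in that order places them at indexes 5, 2, 9 and 10.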