Data Structures Using C++ 2E: Searching and Hashing Algorithms

The document discusses various algorithms for sorting data structures, including selection sort, insertion sort, Shellsort, quicksort, mergesort, and heapsort. It provides examples of implementing sorting functions in C++ classes and compares the performance of different sorting algorithms in terms of time complexity, such as how many key comparisons are required in the worst, best, and average cases. Priority queues and their implementation are also covered.

Uploaded by

sania ejaz

Data Structures Using C++ 2E

Chapter 9
Searching and Hashing Algorithms
Hashing (cont’d.)
• Overflow and collision occur at the same time
– If r = 1 (bucket size = one)
• Choosing a hash function
– Main objectives
• Choose an easy to compute hash function
• Minimize number of collisions
• Let HTSize denote the size of the hash table (the size of the
array holding the hash table)
– Assume bucket size = one
• Each bucket can hold one item
• Overflow and collision occur simultaneously



Hash Functions: Some Examples
• Mid-square
• Folding
• Division (modular arithmetic)
– In C++
• h(X) = iX % HTSize, where iX is the key X converted to an integer
– C++ function
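The division method can be sketched as a small function. This is only an illustration: the name hashFunction, the std::string key type, and the choice of summing character codes to obtain iX are assumptions, not necessarily the function the slide refers to.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: converts a string key X to an integer iX by summing
// its character codes, then reduces iX modulo the table size.
int hashFunction(const std::string& key, int htSize)
{
    assert(htSize > 0);               // the table must have at least one slot
    int sum = 0;
    for (unsigned char ch : key)
        sum += static_cast<int>(ch);  // iX: sum of the character codes
    return sum % htSize;              // h(X) = iX % HTSize
}
```

A prime table size such as 101 is a common choice with this method because it tends to spread the remainders more evenly.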



Collision Resolution

• Desirable to minimize number of collisions


– Collisions unavoidable in reality
• Hash function always maps a larger domain onto a
smaller range
• Collision resolution technique categories
– Open addressing (closed hashing)
• Data stored within the hash table
– Chaining (open hashing)
• Data organized in linked lists
• Hash table: array of pointers to the linked lists



Collision Resolution: Open Addressing

• Data stored within the hash table


– For each key X, h(X) gives index in the array
• Where the item with key X is likely to be stored



Linear Probing
• Starting at location t
– Search array sequentially to find next available slot
• Assume circular array
– If lower portion of array full
• Can continue search in top portion of array using mod
operator
– Starting at t, check array locations using probe
sequence
• t, (t + 1) % HTSize, (t + 2) % HTSize, . . ., (t + j) %
HTSize



Linear Probing (cont’d.)
• The next array slot is given by
– (h(X) + j) % HTSize for the jth probe
• See Example 9-4
• C++ code implementing linear probing
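A minimal sketch of insertion and search with linear probing follows. It assumes non-negative integer keys, h(X) = X % HTSize, and a sentinel value EMPTY for free slots; the function names are hypothetical, and deletion is not handled here.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // assumed sentinel marking a free slot; keys are non-negative

// Inserts key using linear probing. Returns the slot used, or -1 if the table is full.
int linearProbeInsert(std::vector<int>& table, int key)
{
    assert(!table.empty() && key >= 0);
    const int htSize = static_cast<int>(table.size());
    const int home = key % htSize;            // h(X)
    for (int j = 0; j < htSize; ++j) {
        int slot = (home + j) % htSize;       // jth probe, wrapping around the array
        if (table[slot] == EMPTY || table[slot] == key) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;                                // every slot is occupied
}

// Returns the slot holding key, or -1 if key is not in the table.
int linearProbeSearch(const std::vector<int>& table, int key)
{
    assert(!table.empty() && key >= 0);
    const int htSize = static_cast<int>(table.size());
    const int home = key % htSize;
    for (int j = 0; j < htSize; ++j) {
        int slot = (home + j) % htSize;
        if (table[slot] == EMPTY) return -1;  // a free slot ends the probe sequence
        if (table[slot] == key) return slot;
    }
    return -1;
}
```

With a table of size 10, the keys 15, 25, and 35 all hash to slot 5 and end up in slots 5, 6, and 7, which is the clustering effect described on the next slide.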



Linear Probing (cont’d.)
• Causes clustering
– More and more new keys would likely be hashed to
the array slots already occupied

FIGURE 9-5 Hash table of size 20

FIGURE 9-6 Hash table of size 20 with certain positions occupied

FIGURE 9-7 Hash table of size 20 with certain positions occupied



Linear Probing (cont’d.)
• Improving linear probing
– Skip array positions by fixed constant (c) instead of
one
– New hash address: (h(X) + j * c) % HTSize for the jth probe
• If c = 2 and h(X) = 2k (h(X) even)
– Only even-numbered array positions visited
• If c = 2 and h(X) = 2k + 1 (h(X) odd)
– Only odd-numbered array positions visited
• To visit all the array positions
– Constant c must be relatively prime to HTSize
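The effect of the step constant can be checked with a small experiment. The sketch below assumes the jth probe address is (h(X) + j * c) % HTSize; the function name slotsVisited is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Counts how many distinct slots the probe sequence (home + j * c) % htSize
// reaches for j = 0 .. htSize - 1.
int slotsVisited(int home, int c, int htSize)
{
    assert(htSize > 0 && c > 0 && home >= 0);
    std::vector<bool> seen(htSize, false);
    int count = 0;
    for (int j = 0; j < htSize; ++j) {
        int slot = (home + j * c) % htSize;
        if (!seen[slot]) {
            seen[slot] = true;
            ++count;
        }
    }
    return count;
}
```

For a table of size 20, a step of c = 2 reaches only 10 slots, while c = 3 (relatively prime to 20) reaches all 20.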



Random Probing

• Uses random number generator to find next available slot
– ith slot in probe sequence: (h(X) + r_i) % HTSize
• Where r_i is the ith value in a random permutation of the
numbers 1 to HTSize – 1
– All insertions, searches use the same random number
sequence
• See Example 9-5
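One way to obtain a reusable random permutation is to shuffle the numbers 1 to HTSize – 1 once with a fixed seed. The sketch below assumes that approach; the function names and the seed parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Builds a random permutation r_1 .. r_(htSize-1) of the numbers 1 .. htSize-1.
// A fixed seed keeps the sequence identical for every insertion and search.
std::vector<int> makeProbeOffsets(int htSize, unsigned seed)
{
    assert(htSize > 1);
    std::vector<int> offsets(htSize - 1);
    std::iota(offsets.begin(), offsets.end(), 1);  // 1, 2, ..., htSize - 1
    std::mt19937 gen(seed);
    std::shuffle(offsets.begin(), offsets.end(), gen);
    return offsets;
}

// ith slot in the probe sequence (i >= 1): (h(X) + r_i) % htSize.
int randomProbeSlot(int home, int i, const std::vector<int>& offsets)
{
    const int htSize = static_cast<int>(offsets.size()) + 1;
    assert(i >= 1 && i < htSize);
    return (home + offsets[i - 1]) % htSize;
}
```

Because the offsets are a permutation of 1 to HTSize – 1, the probe sequence reaches every slot other than the home position exactly once.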



Rehashing

• If collision occurs with hash function h
– Use a series of hash functions: h1, h2, . . ., hs
– If collision occurs at h(X)
• Array slots hi(X), 1 <= i <= s examined



Deletion: Open Addressing
• Designing a class as an ADT
– Implement hashing using quadratic probing
• Use two arrays
– One stores the data
– One uses indexStatusList as described in the
previous section
• Indicates whether a position in the hash table is free,
occupied, or used previously
• See code on pages 521 and 522
– Class template implementing hashing as an ADT
– Definition of function insert
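The two-array design can be sketched as follows. This is not the class template from the pages cited above: the struct name, the int key type, h(X) = X % HTSize, and the status codes (0 = free, 1 = occupied, -1 = used previously) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Sketch of a hash table with quadratic probing and a status array.
// Assumed status codes: 0 = free, 1 = occupied, -1 = used previously.
struct ProbingTable {
    std::vector<int> data;
    std::vector<int> indexStatusList;

    explicit ProbingTable(int htSize) : data(htSize, 0), indexStatusList(htSize, 0)
    {
        assert(htSize > 0);
    }

    int size() const { return static_cast<int>(data.size()); }

    // Returns the slot holding key, or -1 if key is absent.
    int search(int key) const
    {
        assert(key >= 0);
        const int home = key % size();
        for (int j = 0; j < size(); ++j) {
            int slot = (home + j * j) % size();         // quadratic probe
            if (indexStatusList[slot] == 0) return -1;  // never used: stop here
            if (indexStatusList[slot] == 1 && data[slot] == key) return slot;
        }
        return -1;
    }

    // Stores key in the first probed slot that is not occupied.
    // Returns false if no such slot is found.
    bool insert(int key)
    {
        if (search(key) != -1) return true;             // already present
        const int home = key % size();
        for (int j = 0; j < size(); ++j) {
            int slot = (home + j * j) % size();
            if (indexStatusList[slot] != 1) {
                data[slot] = key;
                indexStatusList[slot] = 1;
                return true;
            }
        }
        return false;
    }

    // Marks the slot as used previously so later searches probe past it.
    void remove(int key)
    {
        int slot = search(key);
        if (slot != -1) indexStatusList[slot] = -1;
    }
};
```

Marking a removed slot as used previously, rather than free, keeps items that were inserted after it on the same probe sequence reachable.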



Collision Resolution: Chaining (Open
Hashing)
• Hash table HT: array of pointers
– For each j, where 0 <= j <= HTSize – 1
• HT[j] is a pointer to a linked list
• Hash table size (HTSize): less than or equal to the
number of items
FIGURE 9-10 Linked hash table



Collision Resolution: Chaining (cont’d.)

• Item insertion and collision


– For each key X (in the item)
• First find h(X) = t, where 0 <= t <= HTSize – 1
• Item with this key inserted in linked list pointed to by
HT[t]
– For nonidentical keys X1 and X2
• If h(X1) = h(X2)
– Items with keys X1 and X2 inserted in same linked list
• Collision handled quickly, effectively



Collision Resolution: Chaining (cont’d.)
• Search
– Determine whether item R with key X is in the hash
table
• First calculate h(X)
– Example: h(X) = t
• Linked list pointed to by HT[t] searched sequentially
• Deletion
– Delete item R from the hash table
• Search hash table to find where in a linked list R exists
• Adjust pointers at appropriate locations
• Deallocate memory occupied by R
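Insertion, search, and deletion with chaining can be sketched with a vector of singly linked lists. The struct name, the int key type, and h(X) = X % HTSize are assumptions; std::forward_list stands in for a hand-written linked list and handles the pointer adjustment and deallocation on removal.

```cpp
#include <cassert>
#include <forward_list>
#include <vector>

// Sketch of a chained hash table: HT[t] is the linked list of keys with h(X) = t.
struct ChainedTable {
    std::vector<std::forward_list<int>> HT;

    explicit ChainedTable(int htSize) : HT(htSize) { assert(htSize > 0); }

    // Assumed hash function; keys are taken to be non-negative.
    int h(int key) const { return key % static_cast<int>(HT.size()); }

    // Adds the key to the front of the list at its home position.
    void insert(int key) { HT[h(key)].push_front(key); }

    // Sequentially searches only the list pointed to by HT[h(key)].
    bool search(int key) const
    {
        for (int item : HT[h(key)])
            if (item == key) return true;
        return false;
    }

    // Unlinks and frees every node in the list that holds key.
    void remove(int key) { HT[h(key)].remove(key); }
};
```

With a table of size 7, the keys 10 and 17 both hash to position 3 and simply share one list, so the collision needs no probing.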



Collision Resolution: Chaining (cont’d.)

• Overflow
– No longer a concern
• Data stored in linked lists
• Memory space to store data allocated dynamically
– Hash table size
• No longer needs to be greater than number of items
– If hash table size is less than the number of items
• Some linked lists contain more than one item
• With a good hash function, the average linked list length
stays small (search is efficient)



Collision Resolution: Chaining (cont’d.)

• Advantages of chaining
– Item insertion and deletion: straightforward
– Efficient hash function
• Few keys hashed to same home position
• Short linked list (on average)
– Shorter search length
• If item size is large
– Saves a considerable amount of space



Collision Resolution: Chaining (cont’d.)
• Disadvantage of chaining
– Small item size wastes space
• Example: 1000 items, each requiring one word of
storage
– Chaining
• Requires 3000 words of storage (assuming one word each for
1000 table pointers, 1000 items, and 1000 node links)
– Quadratic probing
• If hash table size is twice the number of items: 2000 words
• If table size is three times the number of items
– Keys reasonably spread out
– Results in fewer collisions



Hashing Analysis

• Load factor
– Parameter α

TABLE 9-5 Number of comparisons in hashing



Summary
• Sequential search
– Order n
• Ordered lists
– Elements ordered according to some criteria
• Binary search
– Order log₂ n
• Hashing
– Data organized using a hash table
– Apply hash function to determine if item with a key is
in the table
– Two ways to organize data
Summary (cont’d.)

• Hash functions
– Mid-square
– Folding
– Division (modular arithmetic)
• Collision resolution technique categories
– Open addressing (closed hashing)
– Chaining (open hashing)
• Search analysis
– Review number of key comparisons
– Worst case, best case, average case
Data Structures Using C++ 2E

Chapter 10
Sorting Algorithms
Objectives

• Learn the various sorting algorithms


• Explore how to implement selection sort, insertion
sort, Shellsort, quicksort, mergesort, and heapsort
• Discover how the sorting algorithms discussed in
this chapter perform
• Learn how priority queues are implemented



Sorting Algorithms

• Several types in the literature


– Discussion includes most common algorithms
• Analysis
– Provides a comparison of algorithm performance
• Functions implementing sorting algorithms
– Included as public members of related class



Selection Sort: Array-Based Lists

• List sorted by selecting elements in the list


– Select elements one at a time
– Move elements to their proper positions
• Selection sort operation
– Find location of the smallest element in unsorted list
portion
• Move it to top of unsorted portion of the list
– First time: locate smallest item in the entire list
– Second time: locate smallest item in the list starting
from the second element in the list, and so on



FIGURE 10-1 List of 8 elements

FIGURE 10-2 Elements of list during the first iteration

FIGURE 10-3 Elements of list during the second iteration


Selection Sort: Array-Based Lists
(cont’d.)
• Selection sort steps
– In the unsorted portion of the list
• Find location of smallest element
• Move smallest element to beginning of the unsorted list
• Keep track of unsorted list portion with a for loop



Selection Sort: Array-Based Lists
(cont’d.)
• Given: starting index, first, ending index, last
– C++ function returns index of the smallest element in
list[first]...list[last]



Selection Sort: Array-Based Lists
(cont’d.)
• Function swap
• Definition of function selectionSort
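The three functions named on these slides can be sketched as free functions on a std::vector<int>. The slides place them in a class; the free-function form, the int element type, and the exact signatures here are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <vector>

// Returns the index of the smallest element in list[first]...list[last].
int minLocation(const std::vector<int>& list, int first, int last)
{
    assert(0 <= first && first <= last && last < static_cast<int>(list.size()));
    int minIndex = first;
    for (int loc = first + 1; loc <= last; ++loc)
        if (list[loc] < list[minIndex])
            minIndex = loc;
    return minIndex;
}

// Exchanges list[first] and list[second] using three item assignments.
void swap(std::vector<int>& list, int first, int second)
{
    int temp = list[first];
    list[first] = list[second];
    list[second] = temp;
}

// Repeatedly moves the smallest element of the unsorted portion to its front.
void selectionSort(std::vector<int>& list)
{
    const int length = static_cast<int>(list.size());
    for (int loc = 0; loc < length - 1; ++loc) {
        int minIndex = minLocation(list, loc, length - 1);
        swap(list, loc, minIndex);
    }
}
```

The for loop index loc marks the start of the unsorted portion, so everything before it is already in its final position.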



Selection Sort: Array-Based Lists
(cont’d.)
• Add functions to implement selection sort in the
definition of class arrayListType



Analysis: Selection Sort

• Search algorithms
– Concerned with number of key (item) comparisons
• Sorting algorithms
– Concerned with number of key comparisons and
number of data movements
• Analysis of selection sort
– Function swap
• Number of item assignments: 3(n-1)
– Function minLocation
• Number of key comparisons: n(n-1)/2 = O(n²)
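These counts follow from a short sum, sketched here under the assumption that the list has n elements, that swap is called once per pass (n - 1 passes, three assignments each), and that minLocation compares every remaining element of the unsorted portion once per pass:

```latex
\begin{aligned}
\text{item assignments} &= 3\,(n-1) \\
\text{key comparisons}  &= (n-1) + (n-2) + \cdots + 2 + 1
                         = \frac{n(n-1)}{2} = O(n^2)
\end{aligned}
```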



Insertion Sort: Array-Based Lists

• Attempts to reduce the high number of key comparisons in selection sort
• Sorts list by moving each element to its proper place
• Given list of length eight

FIGURE 10-4 list



Insertion Sort: Array-Based Lists
(cont’d.)
• Elements list[0], list[1], list[2], list[3]
are in order
• Consider element list[4]
– First element of unsorted list

FIGURE 10-5 list elements while moving list[4] to its proper place



Insertion Sort: Array-Based Lists
(cont’d.)
• Array containing list divided into two sublists
– Upper (already sorted) and lower (still to be placed)
• Index firstOutOfOrder
– Points to first element in the lower sublist



Insertion Sort: Array-Based Lists
(cont’d.)
• length = 8
• Initialize firstOutOfOrder to one

FIGURE 10-6 firstOutOfOrder = 1



Insertion Sort: Array-Based Lists
(cont’d.)
• list[firstOutOfOrder] = 7
• list[firstOutOfOrder - 1] = 13
– 7 < 13
• Expression in if statement evaluates to true
– Execute body of if statement
• temp = list[firstOutOfOrder] = 7
• location = firstOutOfOrder = 1
– Execute the do...while loop
• list[1] = list[0] = 13
• location = 0



Insertion Sort: Array-Based Lists
(cont’d.)
• do...while loop terminates
– Because location = 0
• Copy temp into list[location] (list[0])

FIGURE 10-7 list after the first iteration of insertion sort



Insertion Sort: Array-Based Lists
(cont’d.)
• Suppose list given in Figure 10-8(a)
– Walk through code

FIGURE 10-8 list elements while moving list[4] to its proper place



Insertion Sort: Array-Based Lists
(cont’d.)
• Suppose list given in Figure 10-9
– Walk through code

FIGURE 10-9 First out-of-order element is at position 5



Insertion Sort: Array-Based Lists
(cont’d.)
• C++ function implementing previous algorithm
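The walkthrough on the preceding slides (the if test, temp, location, and the do...while loop) can be sketched as a free function on a std::vector<int>. The free-function form and the int element type are assumptions; the slides describe a member function of a list class.

```cpp
#include <cassert>
#include <vector>

// Sorts list in ascending order by inserting each out-of-order element
// into its proper place within the sorted upper portion.
void insertionSort(std::vector<int>& list)
{
    const int length = static_cast<int>(list.size());
    for (int firstOutOfOrder = 1; firstOutOfOrder < length; ++firstOutOfOrder) {
        if (list[firstOutOfOrder] < list[firstOutOfOrder - 1]) {
            int temp = list[firstOutOfOrder];
            int location = firstOutOfOrder;
            do {
                list[location] = list[location - 1];  // shift the larger element one slot right
                --location;
            } while (location > 0 && list[location - 1] > temp);
            assert(location >= 0);                    // location never drops below index 0
            list[location] = temp;                    // copy temp into its proper place
        }
    }
}
```

For a list that begins 13, 7, the first iteration copies 13 into list[1], stops with location = 0, and stores 7 in list[0], matching the trace on the earlier slides.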



Insertion Sort: Linked List-Based Lists
• If list stored in an array
– Traverse list in either direction using index variable
• If list stored in a linked list
– Traverse list in only one direction
• Starting at first node: links only in one direction

FIGURE 10-10 Linked list



Insertion Sort: Linked List-Based Lists
(cont’d.)
• firstOutOfOrder
– Pointer to node to be moved to its proper location
• lastInOrder
– Pointer to last node of the sorted portion of the list

FIGURE 10-11 Linked list and pointers lastInOrder and firstOutOfOrder



Insertion Sort: Linked List-Based Lists
(cont’d.)
• Compare firstOutOfOrder info with first node
info
– If firstOutOfOrder info smaller than first node
info
• firstOutOfOrder moved before first node
– Otherwise, search list starting at second node to find
location where to move firstOutOfOrder
• Search list using two pointers
– current
– trailCurrent: points to node just before current
• Handle any special cases





Insertion Sort: Linked List-Based Lists
(cont’d.)
• Case 1
– firstOutOfOrder->info less than first->info
• Node firstOutOfOrder moved before first
• Adjust necessary links

FIGURE 10-13 Linked list after moving the node with info 8 to the beginning



Insertion Sort: Linked List-Based Lists
(cont’d.)
• Review Case 2 on page 546
• Review Case 3 on page 546
• Review function linkedInsertionSort on page
547
– Implements previous algorithm
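A sketch of insertion sort on a singly linked list, using the pointer names from these slides, is shown below. It is not the function from the page cited above: the nodeType layout, the int info field, and returning the new head (instead of updating a class member) are assumptions.

```cpp
#include <cassert>

struct nodeType {
    int info;
    nodeType* link;
};

// Sorts the list headed by first in ascending order and returns the new head.
nodeType* linkedInsertionSort(nodeType* first)
{
    if (first == nullptr || first->link == nullptr)
        return first;                        // zero or one node: already sorted

    nodeType* lastInOrder = first;
    while (lastInOrder->link != nullptr) {
        nodeType* firstOutOfOrder = lastInOrder->link;

        if (firstOutOfOrder->info < first->info) {
            // Move the node before first and adjust the links.
            lastInOrder->link = firstOutOfOrder->link;
            firstOutOfOrder->link = first;
            first = firstOutOfOrder;
        } else {
            // Search the sorted portion, starting at the second node.
            nodeType* trailCurrent = first;
            nodeType* current = first->link;
            while (current->info < firstOutOfOrder->info) {
                trailCurrent = current;
                current = current->link;
            }
            assert(current != nullptr);      // the search stops at firstOutOfOrder at the latest

            if (current != firstOutOfOrder) {
                // The node belongs between trailCurrent and current.
                lastInOrder->link = firstOutOfOrder->link;
                firstOutOfOrder->link = current;
                trailCurrent->link = firstOutOfOrder;
            } else {
                // The node is already in place; the sorted portion grows by one.
                lastInOrder = lastInOrder->link;
            }
        }
    }
    return first;
}
```

Only links are changed; no node is copied or reallocated, which is why the list can be sorted even though it can be traversed in one direction only.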



Analysis: Insertion Sort

TABLE 10-1 Average-case behavior of the selection sort and insertion sort for a list of length n
