
Module 5

Searching
Searching is the fundamental process of locating a specific element or item within a
collection of data. This collection of data can take various forms, such as arrays, lists,
trees, or other structured representations. Searching plays an important role in various
computational tasks and real-world applications, including information retrieval, data
analysis, decision-making processes, and more. Two popular search methods are Linear
Search and Binary Search.

Sequential Searching
 Linear search is also called the sequential search algorithm. It is the simplest searching algorithm.
 In linear search, we traverse the list from the beginning and compare each element of the list with the item whose location is to be found. If a match is found, the location of the item is returned; otherwise, the algorithm returns NULL.
 It is widely used to search for an element in an unordered list.
 The worst-case time complexity of linear search is O(n).

Algorithm
Linear_Search(a, n, val) // 'a' is the given array, 'n' is the size of the given array, 'val' is the value to search
Step 1: set pos = -1
Step 2: set i = 1
Step 3: repeat step 4 while i <= n
Step 4: if a[i] == val
set pos = i
print pos
go to step 6
[end of if]
set i = i + 1
[end of loop]
Step 5: if pos = -1
print "value is not present in the array "
[end of if]
Step 6: exit
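
A minimal C version of this algorithm (0-indexed, returning -1 rather than NULL when the value is absent):

#include <stdio.h>

/* Return the index of val in a[0..n-1], or -1 if it is not present. */
int linear_search(const int a[], int n, int val) {
    for (int i = 0; i < n; i++) {
        if (a[i] == val)
            return i;                       /* match found: report its position */
    }
    return -1;                              /* traversed the whole list without a match */
}

int main(void) {
    int a[] = {7, 3, 9, 1, 5};
    int pos = linear_search(a, 5, 9);
    if (pos == -1)
        printf("value is not present in the array\n");
    else
        printf("found at index %d\n", pos);
    return 0;
}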

Binary Search
 Binary search is a search algorithm used to find the position of a target value
within a sorted array.
 It follows the divide and conquer approach, in which the list is divided into two halves and the item is compared with the middle element of the list. If a match is found, the location of the middle element is returned; otherwise, we search in one of the two halves, depending on the result of the comparison.
 NOTE: Binary search can only be applied to sorted array elements. If the list elements are not arranged in sorted order, we must first sort them.
 The time complexity of binary search is O(log n), where n is the number of elements
in the array.

Steps:
 Divide the search space into two halves by finding the middle index mid = (low + high) / 2.

 Compare the middle element of the search space with the key.
 If the key is found at the middle element, the process is terminated.
 If the key is not found at the middle element, choose which half will be used as the next search space.
 If the key is smaller than the middle element, the left half is used for the next search.
 If the key is larger than the middle element, the right half is used for the next search.
 This process is continued until the key is found or the total search space is
exhausted.
Algorithm
Binary_Search(a, lower_bound, upper_bound, val) // 'a' is the given array, 'lower_bound' is the
index of the first array element, 'upper_bound' is the index of the last array element, 'val' is the
value to search
Step 1: set beg = lower_bound, end = upper_bound, pos = - 1
Step 2: repeat steps 3 and 4 while beg <=end
Step 3: set mid = (beg + end)/2
Step 4: if a[mid] = val
set pos = mid
print pos
go to step 6
else if a[mid] > val
set end = mid - 1
else
set beg = mid + 1
[end of if]
[end of loop]
Step 5: if pos = -1
print "value is not present in the array"
[end of if]
Step 6: exit
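
A C sketch of the same algorithm (0-indexed; the array must already be sorted in ascending order):

/* Return the index of val in the sorted array a[0..n-1], or -1 if absent. */
int binary_search(const int a[], int n, int val) {
    int beg = 0, end = n - 1;
    while (beg <= end) {
        int mid = beg + (end - beg) / 2;    /* same as (beg+end)/2, but avoids overflow */
        if (a[mid] == val)
            return mid;                     /* match found at the middle element */
        else if (a[mid] > val)
            end = mid - 1;                  /* continue in the left half */
        else
            beg = mid + 1;                  /* continue in the right half */
    }
    return -1;                              /* search space exhausted */
}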

Hashing
Hashing is a technique used in data structures to store and retrieve data efficiently. It involves using a hash function to map data items to positions in a fixed-size array, which is called a hash table.
Hash Table
A hash table is also known as a hash map. It is a data structure that stores key-value pairs.
It uses a hash function to map keys to positions in a fixed-size array. This allows for faster search, insertion, and deletion operations.

What is a Hash Key?
A hash key (also known as a hash value or hash code) is a fixed-size numerical or alphanumeric representation generated by a hashing algorithm. It is derived from the input data, such as a text string or a file, through a process known as hashing.
Hash Function
The hash function is a function that takes a key and returns an index into the hash table.
The goal of a hash function is to distribute keys evenly across the hash table, minimizing
collisions (when two keys map to the same index).
Types of Hash functions
There are many hash functions that use numeric or alphanumeric keys.
1. Division Method.
2. Mid Square Method.
3. Folding Method.
4. Multiplication Method.
1. Division Method:
This is the simplest and easiest method to generate a hash value. The hash function divides the value k by M and then uses the remainder obtained.
Formula:
h(K) = k mod M
Here,
k is the key value, and
M is the size of the hash table.
It is best if M is a prime number, as that helps distribute the keys more uniformly. The hash function depends on the remainder of the division.
Example:
k = 12345
M = 95
h(12345) = 12345 mod 95
= 90
k = 1276
M = 11
h(1276) = 1276 mod 11
= 0
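
In C, the division method is a one-liner (the helper name hash_division is just illustrative):

/* Division-method hash: h(k) = k mod M, where M is the table size. */
unsigned int hash_division(unsigned int k, unsigned int M) {
    return k % M;
}
/* hash_division(12345, 95) == 90 and hash_division(1276, 11) == 0,
   matching the worked examples above. */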
2. Mid Square Method:
The mid-square method is a very good hashing method. It involves two steps to compute the hash value:
1. Square the value of the key k, i.e. compute k².
2. Extract the middle r digits of k² as the hash value.
Formula:
h(K) = the middle r digits of k²
Here,
k is the key value.
The value of r can be decided based on the size of the table.
Example:
Suppose the hash table has 100 memory locations. So r = 2 because two digits are
required to map the key to the memory location.
k = 60
k x k = 60 x 60 = 3600
The middle two digits of 3600 are 60, so the hash value obtained is h(60) = 60.
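
A possible C sketch of the mid-square method (the helper name hash_midsquare and the use of unsigned long long for k² are illustrative assumptions):

/* Mid-square hash: square k and keep the middle r digits of the result. */
unsigned int hash_midsquare(unsigned int k, int r) {
    unsigned long long sq = (unsigned long long)k * k;  /* 60 -> 3600 */
    int digits = 0;
    for (unsigned long long t = sq; t > 0; t /= 10)
        digits++;                           /* count digits: 3600 has 4 */
    for (int i = 0; i < (digits - r) / 2; i++)
        sq /= 10;                           /* drop digits on the right: 3600 -> 360 */
    unsigned long long mod = 1;
    for (int i = 0; i < r; i++)
        mod *= 10;                          /* 10^r */
    return (unsigned int)(sq % mod);        /* keep r digits: hash_midsquare(60, 2) == 60 */
}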
3. Digit Folding Method:
This method involves two steps:
1. Divide the key value k into a number of parts k1, k2, k3, ..., kn, where each part has the same number of digits except for the last part, which can have fewer digits than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring the last carry, if any.
Formula:
k = k1, k2, k3, k4, ….., kn
s = k1+ k2 + k3 + k4 +….+ kn
h(K)= s
Here,
s is obtained by adding the parts of the key k
Example:
k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(K) = 51
Note:
The number of digits in each part varies depending upon the size of the hash table. For example, if the size of the hash table is 100, then each part must have two digits, except for the last part, which can have fewer.
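
One way to sketch the folding method in C, splitting the key's digits from the left as in the example above (the helper name hash_folding is illustrative):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Digit-folding hash: split the key into parts of r digits from the left
   (the last part may be shorter) and add the parts together. */
unsigned int hash_folding(unsigned long long k, int r) {
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%llu", k);  /* "12345", len = 5 */
    unsigned int sum = 0;
    for (int i = 0; i < len; i += r) {
        char part[16] = {0};
        int take = (len - i < r) ? (len - i) : r;    /* last part may be shorter */
        memcpy(part, buf + i, take);                 /* parts: "12", "34", "5" */
        sum += (unsigned int)atoi(part);
    }
    return sum;                                      /* 12 + 34 + 5 = 51 */
}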
4. Multiplication Method
This method involves the following steps:
1. Choose a constant value A such that 0 < A < 1.
2. Multiply the key value with A.
3. Extract the fractional part of kA.
4. Multiply the result of the above step by the size of the hash table i.e. M.
5. The resulting hash value is obtained by taking the floor of the result obtained in
step 4.
Formula:
h(K) = floor (M (kA mod 1))
Here,
M is the size of the hash table.
k is the key value.
A is a constant value.
Example:
k = 12345
A = 0.357840
M = 100
h(12345) = floor[ 100 (12345*0.357840 mod 1)]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ]
= 53
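
A minimal C sketch of the multiplication method (hash_multiplication is an illustrative name; link with -lm for floor):

#include <math.h>

/* Multiplication-method hash: h(k) = floor(M * (k*A mod 1)). */
unsigned int hash_multiplication(unsigned int k, unsigned int M, double A) {
    double frac = k * A - floor(k * A);     /* fractional part of kA */
    return (unsigned int)floor(M * frac);
}
/* hash_multiplication(12345, 100, 0.357840) == 53, as in the example above. */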
What is a Hash Collision?
A hash collision occurs when two different keys map to the same index in a hash table. This can happen even with a good hash function, especially when the hash table is nearly full or the keys are similar.

Causes of Hash Collisions:


 Poor Hash Function: A hash function that does not distribute keys evenly across
the hash table can lead to more collisions.
 Similar Keys: Keys that are similar in value or structure are more likely to
collide.

Collision Resolution Techniques


o Chaining: In this technique, each hash table slot contains a linked list of all the values
that have the same hash value. This technique is simple and easy to implement, but
it can lead to poor performance when the linked lists become too long.
o Open addressing: In this technique, when a collision occurs, the algorithm searches for an empty slot in the hash table by probing successive slots until an empty slot is found (see the sketch after this list).
o Double hashing: This is a variation of open addressing that uses a second hash
function to determine the next slot to probe when a collision occurs.
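
As a concrete illustration of open addressing, here is a minimal C sketch of insertion with linear probing, assuming integer keys and -1 as a sentinel for an empty slot:

#define TABLE_SIZE 11
#define EMPTY -1

int table[TABLE_SIZE];

void init_table(void) {
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;                   /* mark every slot empty */
}

/* Linear probing: on a collision, try successive slots until an empty
   one is found. Returns the slot used, or -1 if the table is full. */
int insert(int key) {
    int start = key % TABLE_SIZE;           /* division-method home slot */
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (start + i) % TABLE_SIZE;  /* probe sequence: start, start+1, ... */
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;                              /* no empty slot left */
}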

Applications of Hashing:

 Databases: Storing and retrieving data based on unique keys


 Caching: Storing frequently accessed data for faster retrieval
 Symbol Tables: Mapping identifiers to their values in programming languages
 Network Routing: Determining the best path for data packets
Sorting
Sorting refers to arranging data in a particular order, typically ascending or descending according to some criterion, such as numerical or lexicographical order. A sorting algorithm specifies the way to arrange data in a particular order.
Sorting Terminology:
 In-place Sorting: An in-place sorting algorithm uses constant extra space to produce the output; it sorts the list only by modifying the order of the elements within the list. Examples: Selection Sort, Bubble Sort, Insertion Sort and Heap Sort.
 Internal Sorting: Internal sorting is when all the data is placed in main (internal) memory, so the input size is limited by the available memory. Examples: heap sort, bubble sort, selection sort, quick sort, shell sort, insertion sort.
 External Sorting: When all the data that needs to be sorted cannot be placed in memory at one time, the sorting is called external sorting. External sorting is used for massive amounts of data. Examples: Merge sort, Tag sort, Polyphase sort, Four tape sort, External radix sort, etc.
 Stable Sorting: A sort is stable when equal elements appear in the sorted output in the same relative order as in the input, without changing their positions. Examples: Merge Sort, Insertion Sort, Bubble Sort.
 Unstable Sorting: A sort is unstable when equal elements may appear in the sorted output in a different relative order than in the input. Examples: Quick Sort, Heap Sort, Shell Sort.

Bubble Sort(sinking sort)


Bubble sort is the simplest sorting algorithm. It works by repeatedly swapping adjacent elements if they are in the wrong order. This algorithm is not suitable for large data sets, as its average and worst-case time complexity is O(N²).
In the Bubble Sort algorithm:
 Traverse from the left and compare adjacent elements; the larger one is placed on the right side.
 In this way, the largest element is moved to the rightmost end in the first pass.
 This process is then continued to find the second largest element and place it, and so on, until the data is sorted.

Advantages of Bubble Sort:


 Bubble sort is easy to understand and implement.
 It does not require any additional memory space.
 It is a stable sorting algorithm, meaning that elements with the same key value
maintain their relative order in the sorted output.
Disadvantages of Bubble Sort:
 Bubble sort has a time complexity of O(N²), which makes it very slow for large data sets.
 Bubble sort is a comparison-based sorting algorithm, which means it requires a comparison operator to determine the relative order of elements in the input data set. This can limit the efficiency of the algorithm in certain cases.
Algorithm
arr is an array of n elements. The assumed swap function in the algorithm will swap the values of the given array elements.
begin BubbleSort(arr, n)
for i = 0 to n-2
    for j = 0 to n-2-i
        if arr[j] > arr[j+1]
            swap(arr[j], arr[j+1])
        end if
    end for
end for
return arr
end BubbleSort
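
The corresponding C implementation, with the nested passes written out:

/* Bubble sort: after pass i, the largest i+1 elements are in their final place. */
void bubble_sort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        for (int j = 0; j < n - 1 - i; j++) {
            if (arr[j] > arr[j + 1]) {      /* adjacent pair in the wrong order */
                int tmp = arr[j];           /* swap them */
                arr[j] = arr[j + 1];
                arr[j + 1] = tmp;
            }
        }
    }
}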

Selection Sort
 In selection sort, the smallest value among the unsorted elements of the array is selected in every pass and moved to its appropriate position in the array.
 It is an in-place comparison sorting algorithm.
 In this algorithm, the array is divided into two parts: the first is the sorted part, and the other is the unsorted part.
 Initially, the sorted part of the array is empty, and unsorted part is the given array. Sorted
part is placed at the left, while the unsorted part is placed at the right.
 In selection sort, the smallest element is selected from the unsorted part and placed at the first position. After that, the second smallest element is selected and placed in the second position. The process continues until the array is entirely sorted.
 The average and worst-case complexity of selection sort is O(n²), where n is the number of items. Due to this, it is not suitable for large data sets.
 Selection sort is generally used when -
o A small array is to be sorted
o Swapping cost doesn't matter
o It is compulsory to check all elements
Algorithm
SELECTION SORT(arr, n)
Step 1: Repeat Steps 2 and 3 for i = 0 to n-1
Step 2: CALL SMALLEST(arr, i, n, pos)
Step 3: SWAP arr[i] with arr[pos]
[END OF LOOP]
Step 4: EXIT

SMALLEST (arr, i, n, pos)


Step 1: [INITIALIZE] SET SMALL = arr[i]
Step 2: [INITIALIZE] SET pos = i
Step 3: Repeat for j = i+1 to n-1
if (SMALL > arr[j])
SET SMALL = arr[j]
SET pos = j
[END OF if]
[END OF LOOP]
Step 4: RETURN pos
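
An equivalent 0-indexed C version, with the SMALLEST helper folded into the inner loop:

/* Selection sort: grow the sorted part on the left by one element per pass. */
void selection_sort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int pos = i;                        /* index of the smallest unsorted element */
        for (int j = i + 1; j < n; j++) {
            if (arr[j] < arr[pos])
                pos = j;
        }
        int tmp = arr[i];                   /* swap it into position i */
        arr[i] = arr[pos];
        arr[pos] = tmp;
    }
}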

Insertion sort
 It is a simple sorting algorithm that works by iteratively inserting each element of
an unsorted list into its correct position in a sorted portion of the list.
 It is a stable sorting algorithm, meaning that elements with equal values maintain
their relative order in the sorted output.
 Insertion sort is like sorting playing cards in your hands. You split the cards into
two groups: the sorted cards and the unsorted cards. Then, you pick a card from the
unsorted group and put it in the right place in the sorted group.
 Insertion sort is a simple sorting algorithm that works by building a sorted array
one element at a time.
 It is considered an “in-place” sorting algorithm, meaning it doesn’t require any
additional memory space beyond the original array.
 Time Complexity: O(N²)

To achieve insertion sort, follow these steps:


 We have to start with the second element of the array, as the first element is assumed to be sorted.
 Compare the second element with the first element; if the second element is smaller, swap them.
 Move to the third element and compare it with the second element, then the first element, swapping as necessary to put it in the correct position among the first three elements.
 Continue this process, comparing each element with the ones before it and
swapping as needed to place it in the correct position among the sorted elements.
 Repeat until the entire array is sorted.

Insertion Sort Algorithm


Step 1 − If it is the first element, it is already sorted.
Step 2 − Pick next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that is greater than the value to be
sorted
Step 5 − Insert the value
Step 6 − Repeat until list is sorted

Pseudocode
Algorithm: Insertion-Sort(A)
for j = 2 to A.length
    key = A[j]
    i = j - 1
    while i > 0 and A[i] > key
        A[i + 1] = A[i]
        i = i - 1
    A[i + 1] = key
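
A 0-indexed C translation of this pseudocode:

/* Insertion sort: insert each element into the sorted prefix on its left. */
void insertion_sort(int A[], int n) {
    for (int j = 1; j < n; j++) {
        int key = A[j];                     /* element to insert */
        int i = j - 1;
        while (i >= 0 && A[i] > key) {
            A[i + 1] = A[i];                /* shift larger elements one place right */
            i = i - 1;
        }
        A[i + 1] = key;                     /* drop the key into its correct position */
    }
}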

Advantages of Insertion Sort:


 Simple and easy to implement.
 Stable sorting algorithm.
 Efficient for small lists and nearly sorted lists.
 Space-efficient.
Disadvantages of Insertion Sort:
 Inefficient for large lists.
 Not as efficient as other sorting algorithms (e.g., merge sort, quick sort) for most
cases.
Quick Sort
 Quicksort is a widely used sorting algorithm that makes O(n log n) comparisons in the average case when sorting an array of n elements.

 It is a fast and highly efficient sorting algorithm.

 This algorithm follows the divide and conquer approach.

 Divide and conquer is a technique of breaking down the algorithms into subproblems,
then solving the subproblems, and combining the results back together to solve the
original problem.

 The key process in quicksort is the partition() operation. The target of partitioning is to place the pivot (any element can be chosen as the pivot) at its correct position in the sorted array, putting all smaller elements to the left of the pivot and all greater elements to the right.
 Partitioning is applied recursively on each side of the pivot after the pivot is placed in its correct position, which finally sorts the array.
Choosing the pivot
Picking a good pivot is necessary for the fast implementation of quicksort. Some of the ways
of choosing a pivot are as follows -

o Pivot can be random from the given array.


o Pivot can either be the rightmost element or the leftmost element of the given array.
o Select median as the pivot element.

Partition Algorithm:
The logic is simple: we start from the leftmost element and keep track of the index of smaller (or equal) elements as i. While traversing, if we find a smaller element, we swap the current element with arr[i]; otherwise, we ignore the current element.
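
A C sketch of this partition logic (the Lomuto scheme, with the rightmost element as the pivot) together with the recursive quicksort driver:

/* Place the pivot (rightmost element) at its final position; smaller
   elements end up to its left, greater elements to its right. */
int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;                        /* boundary of the "smaller or equal" region */
    for (int j = low; j < high; j++) {
        if (arr[j] <= pivot) {
            i++;
            int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
        }
    }
    int tmp = arr[i + 1]; arr[i + 1] = arr[high]; arr[high] = tmp;
    return i + 1;                           /* final index of the pivot */
}

void quick_sort(int arr[], int low, int high) {
    if (low < high) {
        int p = partition(arr, low, high);
        quick_sort(arr, low, p - 1);        /* recurse on the left side of the pivot */
        quick_sort(arr, p + 1, high);       /* recurse on the right side */
    }
}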
Advantages of Quick Sort:
 It is a divide-and-conquer algorithm that makes it easier to solve problems.
 It is efficient on large data sets.
 It has a low overhead, as it only requires a small amount of memory to function.
Disadvantages of Quick Sort:
 It has a worst-case time complexity of O(N²), which occurs when the pivot is chosen poorly.
 It is not a good choice for small data sets.
 It is not a stable sort, meaning that if two elements have the same key, their
relative order will not be preserved in the sorted output in case of quick sort,
because here we are swapping elements according to the pivot’s position
(without considering their original positions).

Merge Sort

 Merge sort is similar to the quick sort algorithm as it uses the divide and conquer
approach to sort the elements.

 It divides the given list into two equal halves, calls itself for the two halves and then
merges the two sorted halves.

 We have to define the merge() function to perform the merging.

 The sub-lists are divided again and again into halves until each list cannot be divided further. Then we combine pairs of one-element lists into two-element lists, sorting them in the process. The sorted two-element lists are merged into four-element lists, and so on, until we get the fully sorted list.
MergeSort(arr):
1. If the length of the array is 1 or less, return the array as it is already sorted.
2. Divide the array into two halves, let's call them left and right.
3. Recursively call MergeSort on the left half.
4. Recursively call MergeSort on the right half.
5. Merge the two sorted halves:
5.1 Initialize empty arrays to store the merged result.
5.2 Compare the elements from the left and right halves, and add the
smaller (or larger, for descending order) element to the merged array.
5.3 Continue this process until one of the halves is exhausted.
5.4 Add the remaining elements from the non-empty half to the merged array.
6. Return the merged array.
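
A C sketch following these steps; the temporary buffer in merge() is where merge sort's extra memory goes:

#include <stdlib.h>
#include <string.h>

/* Merge the two sorted halves arr[l..m] and arr[m+1..r]. */
static void merge(int arr[], int l, int m, int r) {
    int n = r - l + 1;
    int *tmp = malloc(n * sizeof *tmp);     /* extra memory: merge sort is not in-place */
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r)                /* take the smaller front element; <= keeps it stable */
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= m) tmp[k++] = arr[i++];     /* copy whatever remains in either half */
    while (j <= r) tmp[k++] = arr[j++];
    memcpy(arr + l, tmp, n * sizeof *tmp);
    free(tmp);
}

void merge_sort(int arr[], int l, int r) {
    if (l >= r) return;                     /* 0 or 1 element: already sorted */
    int m = l + (r - l) / 2;
    merge_sort(arr, l, m);                  /* sort the left half */
    merge_sort(arr, m + 1, r);              /* sort the right half */
    merge(arr, l, m, r);                    /* merge the sorted halves */
}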
Advantages of Merge Sort:
 Stability: Merge sort is a stable sorting algorithm, which means it maintains the
relative order of equal elements in the input array.
 Guaranteed worst-case performance: Merge sort has a worst-case time complexity of O(N log N), which means it performs well even on large datasets.
 Simple to implement: The divide-and-conquer approach is straightforward.
Disadvantage of Merge Sort:
 Space complexity: Merge sort requires additional memory to store the merged
sub-arrays during the sorting process.
 Not in-place: Merge sort is not an in-place sorting algorithm, which means it
requires additional memory to store the sorted data. This can be a disadvantage
in applications where memory usage is a concern.

What is the difference between Merge Sort and Quick Sort?


Merge Sort splits the list in half, sorts each part, and then merges them. It keeps the order of
equal elements. Quick Sort picks a 'pivot' and arranges the list based on whether elements are
smaller or larger than this pivot. Quick Sort is usually faster for arrays, but can sometimes
take longer in worst-case scenarios. Plus, unlike Merge Sort, Quick Sort doesn't always
maintain the original order of equal elements.
