Chapter 1+DSTRU
Data Structures and Algorithms
A data structure should be seen as a logical concept that must address two fundamental concerns.
Classification of Data Structures:
Primitive data structure: A structure used to represent the standard data types of any one of the computer languages.
Variables, arrays, pointers, structures, unions, etc. are examples of primitive data structures.
Compound Data structure:
A compound data structure is constructed from one or more primitive data structures and has a
specific functionality. It can be designed by the user. It can be
classified as
1. Add an element
2. Delete an element
3. Traverse
4. Sort the list of elements
5. Search for a data element
For example Stack, Queue, Tables, List, and Linked Lists.
An abstract data type, sometimes abbreviated ADT, is a logical description of how we view
the data and the operations that are allowed without regard to how they will be
implemented. This means that we are concerned only with what the data represents and not
with how it will eventually be constructed. By providing this level of abstraction, we are
creating an encapsulation around the data. The idea is that by encapsulating the details of the
implementation, we are hiding them from the user’s view. This is called information hiding.
The implementation of an abstract data type, often referred to as a data structure, will
require that we provide a physical view of the data using some collection of programming
constructs and primitive data types.
[Fig. 1.2: Abstract Data Type (ADT)]
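To make the difference between an ADT and its implementation concrete, here is a minimal sketch in Python. The class name Stack and the method names push, pop, peek and is_empty are illustrative assumptions, not taken from the text. The Python list used internally is only one possible physical representation, hidden behind the interface.

```python
class Stack:
    """A stack ADT: callers see the operations, not the storage."""

    def __init__(self):
        # Hidden representation (assumption: a Python list)
        self._items = []

    def push(self, item):
        """Add an element on top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top element."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        """Return the top element without removing it."""
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def is_empty(self):
        return not self._items


s = Stack()
s.push(10)
s.push(20)
print(s.pop())       # 20
print(s.peek())      # 10
print(s.is_empty())  # False
```

The internal list could be replaced by a linked list without changing any code that uses the four operations, which is the information hiding described above.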
Algorithms:
1. Input Step
2. Assignment Step
3. Decision Step
4. Repetitive Step
5. Output Step
4. Effectiveness: the operations of the algorithm must be basic enough to be put down
on pencil and paper. They should not be too complex to warrant writing another
algorithm for the operation.
5. Input-Output: The algorithm must have certain initial and precise inputs, and
outputs that may be generated both at its intermediate and final steps.
1. To save time (Time Complexity): A program that runs faster is a better program.
2. To save space (Space Complexity): A program that saves space over a competing
program is considered desirable.
Efficiency of Algorithms:
The performances of algorithms can be measured on the scales of time and space. The
performance of a program is the amount of computer memory and time needed to run a
program. We use two approaches to determine the performance of a program. One is
analytical and the other is experimental. In performance analysis we use analytical methods,
while in performance measurement we conduct experiments.
Analyzing Algorithms
Suppose M is an algorithm, and suppose n is the size of the input data. Clearly the complexity
f(n) of M increases as n increases. It is usually the rate of increase of f(n) that we want to
examine, and this is done by comparing f(n) with some standard functions. The most common
computing times are
O(1), O(log2 n), O(n), O(n log2 n), O(n^2), O(n^3), O(2^n)
Example:
Chapter 1: Introduction to Data Structures
The total frequency counts of the program segments A, B and C, given by 1, (3n+1) and
(3n^2+3n+1) respectively, are expressed as O(1), O(n) and O(n^2). These are referred to as the
time complexities of the program segments since they are indicative of the running times of
the program segments. In a similar manner, the space complexity of a program, which is nothing
but the amount of memory it requires for its execution, can also be expressed in mathematical
notation.
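The program segments A, B and C are not reproduced here, so the following sketch uses hypothetical loop shapes (a single loop and a doubly nested loop) to show how a frequency count grows with n. It counts only executions of the innermost statement, so it illustrates the O(n) and O(n^2) growth rates rather than the exact counts 3n+1 and 3n^2+3n+1.

```python
def count_single_loop(n):
    """Hypothetical segment with one loop: the body runs n times."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_nested_loop(n):
    """Hypothetical segment with two nested loops: the body runs n * n times."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


for n in (10, 100):
    print(n, count_single_loop(n), count_nested_loop(n))
```

Multiplying n by 10 multiplies the first count by 10 and the second by 100, which is what the O(n) and O(n^2) labels summarize.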
Asymptotic Notations:
Asymptotic notation is often used to describe how the size of the input data affects an algorithm’s usage of
computational resources. The running time of an algorithm is described as a function of the input size n
for large n.
Big oh(O): Definition: f(n) = O(g(n)) (read as f of n is big oh of g of n) if there exist a positive
integer n0 and a positive number c such that |f(n)| ≤ c|g(n)| for all n ≥ n0. Here g(n) is the
upper bound of the function f(n).
Theta(Θ): Definition: f(n) = Θ(g(n)) (read as f of n is theta of g of n) if there exist a positive
integer n0 and two positive constants c1 and c2 such that c1|g(n)| ≤ |f(n)| ≤ c2|g(n)| for all
n ≥ n0. The function g(n) is both an upper bound and a lower bound for the function f(n) for all
values of n ≥ n0.
Little oh(o): Definition: f(n) = o(g(n)) (read as f of n is little oh of g of n) if f(n) = O(g(n)) and
f(n) ≠ Ω(g(n)).
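As a small illustration of the Big oh definition, the sketch below checks numerically that f(n) = 3n + 1 is O(n). The witnesses c = 4 and n0 = 1 are chosen for this example only. A finite check like this does not prove the bound; here it also follows algebraically, since 3n + 1 ≤ 4n whenever n ≥ 1.

```python
def f(n):
    return 3 * n + 1


def g(n):
    return n


# Candidate witnesses for f(n) = O(g(n)) (assumed for this example)
c, n0 = 4, 1


def bound_holds(limit=1000):
    """Check |f(n)| <= c * |g(n)| for every n in [n0, limit]."""
    return all(abs(f(n)) <= c * abs(g(n)) for n in range(n0, limit + 1))


print(bound_holds())
```

The bound fails at n = 0, where f(0) = 1 but c * g(0) = 0, which is why the definition only requires it for all n ≥ n0.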
Time Complexity:
S.No   log2 n   n   n log2 n   n^2   n^3   2^n
2.     1        2   2          4     8     4
3.     2        4   8          16    64    16
4.     3        8   24         64    512   256
Computational Time(CPU consumption).
Recursive Algorithms:
GCD Design: Given two integers a and b, the greatest common divisor is recursively found
using the formula
gcd(a, b) = a                  if b = 0      (base case)
          = b                  if a = 0      (base case)
          = gcd(b, a mod b)    otherwise     (general case)
Fibonacci Design: To start a fibonacci series, we need to know the first two numbers.
Fibonacci(n) = 0                                  if n = 0    (base case)
             = 1                                  if n = 1    (base case)
             = Fibonacci(n-1) + Fibonacci(n-2)    otherwise   (general case)
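The two recursive designs above translate almost line for line into Python. This is a minimal sketch that assumes non-negative integer arguments; the function names gcd and fibonacci are chosen here for illustration.

```python
def gcd(a, b):
    """Greatest common divisor, following the recursive design."""
    if b == 0:              # base case
        return a
    if a == 0:              # base case
        return b
    return gcd(b, a % b)    # general case


def fibonacci(n):
    """n-th Fibonacci number, following the recursive design."""
    if n == 0:              # base case
        return 0
    if n == 1:              # base case
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)   # general case


print(gcd(48, 18))                        # 6
print([fibonacci(i) for i in range(8)])   # [0, 1, 1, 2, 3, 5, 8, 13]
```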
1. A function is said to be recursive if it calls itself again and again within its body
whereas iterative functions are loop based imperative functions.
3. Recursion uses more memory than iteration as its concept is based on stacks.
7. While using recursion, a separate activation record is created on the stack for each call,
whereas in iteration everything is done in one activation record.
8. Infinite recursion can crash the system whereas infinite looping uses CPU cycles
repeatedly.
Types of Recursion:
Recursion is of two types depending on whether a function calls itself from within itself or
whether two functions call one another mutually. The former is called direct recursion and
the latter is called indirect recursion. Thus there are two types of recursion:
Direct Recursion
Indirect Recursion
Linear Recursion
Binary Recursion
Multiple Recursion
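The difference between direct and indirect recursion can be sketched as follows. The function names countdown, is_even and is_odd are hypothetical examples chosen for illustration, and non-negative integer arguments are assumed.

```python
def countdown(n):
    """Direct recursion: the function calls itself."""
    if n == 0:
        return []
    return [n] + countdown(n - 1)


def is_even(n):
    """Indirect recursion: is_even calls is_odd, which calls is_even."""
    if n == 0:
        return True
    return is_odd(n - 1)


def is_odd(n):
    if n == 0:
        return False
    return is_even(n - 1)


print(countdown(3))   # [3, 2, 1]
print(is_even(10))    # True
print(is_odd(10))     # False
```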
Linear Recursion:
It is the most common type of recursion, in which a function calls itself repeatedly until the base
condition [termination case] is reached. Once the base case is reached, the results are returned to
the calling function. If a recursive function makes only one recursive call each time it runs, it is
called linear recursion.
Binary Recursion:
Some recursive functions don't just have one call to themselves; they have two (or more).
Functions with two recursive calls are referred to as binary recursive functions.
Example1: The Fibonacci function fib provides a classic example of binary recursion. The
Fibonacci numbers can be defined by the rule:
fib(n) = 0                      if n is 0,
       = 1                      if n is 1,
       = fib(n-1) + fib(n-2)    otherwise.
Fib(0) = 0
Fib(1) = 1
Fib(2) = Fib(1) + Fib(0) = 1
Fib(3) = Fib(2) + Fib(1) = 2
Fib(4) = Fib(3) + Fib(2) = 3
Fib(5) = Fib(4) + Fib(3) = 5
Fib(6) = Fib(5) + Fib(4) = 8
nterms = 10
n1 = 0
n2 = 1
count = 0
if nterms == 1:
    print("Fibonacci sequence upto", nterms, ":")
    print(n1)
else:
    print("Fibonacci sequence upto", nterms, ":")
    # generate the series iteratively
    while count < nterms:
        print(n1, end=' , ')
        nth = n1 + n2
        # update values
        n1 = n2
        n2 = nth
        count += 1
Tail Recursion:
Tail recursion is a form of linear recursion. In tail recursion, the recursive call is the last thing
the function does. Often, the value of the recursive call is returned. As such, tail recursive
functions can often be easily implemented in an iterative manner; by taking out the recursive
call and replacing it with a loop, the same effect can generally be achieved. In fact, a good
compiler can recognize tail recursion and convert it to iteration in order to optimize the
performance of the code.
A good example of a tail recursive function is a function to compute the GCD, or Greatest
Common Divisor, of two numbers:
def factorial(n):
    if n == 0: return 1
    else: return factorial(n-1) * n
Input: integer n ≥ 0
Output: n!
GCD(m, n)
mod n)
Time-Complexity: O(ln n)
Fibonacci(n)
Input: integer n ≥ 0
1. if n=1 or n=2
2. then Fibonacci(n)=1
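As a sketch of the idea that a tail call can be replaced by a loop, here is a GCD function written both ways. The names gcd_tail and gcd_loop are chosen for illustration, and non-negative integer inputs are assumed. Note that the standard Python interpreter (CPython) does not perform tail-call optimization, so in Python the loop version is the one that avoids growing the call stack.

```python
def gcd_tail(m, n):
    """Tail recursive: the recursive call is the last thing done."""
    if n == 0:
        return m
    return gcd_tail(n, m % n)


def gcd_loop(m, n):
    """Same computation with the tail call replaced by a loop."""
    while n != 0:
        m, n = n, m % n
    return m


print(gcd_tail(48, 18))   # 6
print(gcd_loop(48, 18))   # 6
```

By contrast, a factorial written as factorial(n-1) * n is linear recursion but not tail recursion, because the multiplication still has to happen after the recursive call returns.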
Towers of Hanoi
Input: The aim of the Tower of Hanoi problem is to move the initial n different-sized disks
from needle A to needle C using a temporary needle B. The rule is that no larger disk may be
placed above a smaller disk on any needle at any time, and only the top disk of a needle may
be moved at a time, from any needle to any other needle.
Output:
if n == 1:
rod",to_rod return
from_rod)
n=4
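The listing above is incomplete, so the following is a hedged sketch of the usual recursive solution rather than a reconstruction of the original code. The function name tower_of_hanoi, the parameter aux_rod and the moves list are assumptions made for this example; from_rod and to_rod follow the names visible in the fragment.

```python
def tower_of_hanoi(n, from_rod, to_rod, aux_rod, moves=None):
    """Collect the moves that transfer n disks from from_rod to to_rod."""
    if moves is None:
        moves = []
    if n == 1:
        moves.append((1, from_rod, to_rod))
        return moves
    # Move the top n-1 disks out of the way, onto the auxiliary rod
    tower_of_hanoi(n - 1, from_rod, aux_rod, to_rod, moves)
    # Move the largest disk to its destination
    moves.append((n, from_rod, to_rod))
    # Move the n-1 disks from the auxiliary rod onto the largest disk
    tower_of_hanoi(n - 1, aux_rod, to_rod, from_rod, moves)
    return moves


moves = tower_of_hanoi(4, "A", "C", "B")
for disk, src, dst in moves:
    print("Move disk", disk, "from rod", src, "to rod", dst)
print(len(moves))   # 15
```

For n disks this scheme makes 2^n - 1 moves, which gives 15 for n = 4.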
Searching Techniques:
Linear Search: Searching is the process of finding a particular data item in a collection of
data items based on specific criteria. Every day we perform web searches to locate data items
contained in various pages. A search is typically performed using a search key, and it answers
either True or False depending on whether the item is present in the list or not. Linear search is
the simplest sequential search algorithm: it iterates over the sequence and checks one item at a
time until the desired item is found or all items have been examined. In Python, the in operator
is used to find the desired item in a sequence of items. The in operator makes the searching task
simpler and hides the inner working details.
Consider an unsorted single dimensional array of integers and we need to check whether 31 is
present in the array or not, then search begins with the first element. As the first element
doesn't contain the desired value, then the next element is compared to value 31 and this
process continues until the desired element is found in the sixth position. Similarly, if we want
to search for 8 in the same array, then the search begins in the same manner, starting with the
first element until the desired element is found. In linear search, we cannot conclude that a
given search value is absent from the sequence until the entire array has been traversed.
Source Code:
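The original listing is not present here, so the following is a minimal sketch of linear search under stated assumptions. The function name linear_search and the sample array are hypothetical; the sample is built so that 31 sits in the sixth position and 8 is absent, matching the description above.

```python
def linear_search(values, target):
    """Return True if target occurs in values, checking one item at a time."""
    for item in values:
        if item == target:
            return True
    # Every item was examined without a match
    return False


# Hypothetical unsorted array; 31 is in the sixth position
sample = [10, 51, 2, 18, 4, 31, 13, 5, 23, 64, 29]

print(linear_search(sample, 31))   # True
print(linear_search(sample, 8))    # False
print(31 in sample)                # True, the built-in equivalent
```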
Binary Search: In Binary search algorithm, the target key is examined in a sorted sequence
and this algorithm starts searching with the middle item of the sorted sequence.
a. If the middle item is the target value, then the search item is found and it returns True.
b. If the target item < middle item, then search for the target value in the first half of the list.
c. If the target item > middle item, then search for the target value in the second half of the
list.
In binary search, as the list is ordered, we can eliminate half of the values in the list in each
iteration. Consider an example: suppose we want to search for 10 in a sorted array of elements.
We first determine the middle element of the array. As the middle item contains 18, which is
greater than the target value 10, we can discard the second half of the list and repeat the
process on the first half of the array. This process is repeated until the desired target item is
located in the list. If the item is found then it returns True, otherwise False.
Source Code:
array = [1, 2, 3, 4, 5, 6, 7, 8, 9]

def binary_search(searchfor, array):
    lowerbound = 0
    upperbound = len(array) - 1
    found = False
    while found == False and lowerbound <= upperbound:
        midpoint = (lowerbound + upperbound) // 2
        if array[midpoint] == searchfor:
            found = True
            return found
        elif array[midpoint] < searchfor:
            lowerbound = midpoint + 1
        else:
            upperbound = midpoint - 1
    return found
Time Complexity of Binary Search:
In binary search, each comparison eliminates about half of the items from the list. Consider a list
with n items: about n/2 items will be eliminated after the first comparison. After the second
comparison, about n/4 more items of the list will be eliminated. If this process is repeated
several times, there will be just one item left in the list. The number of comparisons i required
to reach this point satisfies n/2^i = 1. Solving for i gives i = log2 n. The maximum number of
comparisons is logarithmic in nature, hence the time complexity of binary search is O(log n).
Fibonacci Search: It is a comparison-based technique that uses Fibonacci numbers to search for
an element in a sorted array. It follows the divide and conquer approach and has O(log n) time
complexity. Let the element to be searched for be x. The idea is to first find the smallest
Fibonacci number that is greater than or equal to the length of the given array. Let this be the
m-th Fibonacci number, fib(m). Use the (m-2)th Fibonacci number as the index and call it i, then
compare a[i] with x: if they are equal, return i. Otherwise, if x is greater, search the sub array
after i, else search the sub array before i.
Source Code:
# Check if fibMMm2 is a valid location
i = min(offset+fibMMm2, n-1)
# Driver Code
arr = [10, 22, 35, 40, 45, 50, 80, 82, 85, 90, 100]
n = len(arr)
x = 80
print("Found at index:", fibMonaccianSearch(arr, x, n))
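Only part of the listing survives above, so the following is a sketch of one common way to write the search, not the original code. It keeps the names that appear in the fragment (fibMonaccianSearch, fibMMm2, offset); the names fibMMm1 and fibM and the bounds check before the final comparison are assumptions made here.

```python
def fibMonaccianSearch(arr, x, n):
    """Return the index of x in the sorted list arr of length n, or -1."""
    fibMMm2 = 0                    # (m-2)th Fibonacci number
    fibMMm1 = 1                    # (m-1)th Fibonacci number
    fibM = fibMMm2 + fibMMm1       # m-th Fibonacci number

    # Find the smallest Fibonacci number >= n
    while fibM < n:
        fibMMm2 = fibMMm1
        fibMMm1 = fibM
        fibM = fibMMm2 + fibMMm1

    offset = -1                    # everything up to offset is eliminated
    while fibM > 1:
        # Check if fibMMm2 is a valid location
        i = min(offset + fibMMm2, n - 1)
        if arr[i] < x:
            # Discard the part before i: step one Fibonacci number down
            fibM = fibMMm1
            fibMMm1 = fibMMm2
            fibMMm2 = fibM - fibMMm1
            offset = i
        elif arr[i] > x:
            # Discard the part after i: step two Fibonacci numbers down
            fibM = fibMMm2
            fibMMm1 = fibMMm1 - fibMMm2
            fibMMm2 = fibM - fibMMm1
        else:
            return i

    # One element may remain to be compared
    if fibMMm1 and offset + 1 < n and arr[offset + 1] == x:
        return offset + 1
    return -1


arr = [10, 22, 35, 40, 45, 50, 80, 82, 85, 90, 100]
print("Found at index:", fibMonaccianSearch(arr, 80, len(arr)))   # Found at index: 6
```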
Time Complexity of Fibonacci Search:
Time complexity for Fibonacci search is O(log2 n)
Sorting Techniques:
Sorting in general refers to various methods of arranging or ordering things based on criteria
(numerical, chronological, alphabetical, hierarchical etc.). There are many approaches to
sorting data and each has its own merits and demerits.
Bubble Sort:
This sorting technique is also known as exchange sort. It arranges values by iterating
over the list several times, and in each iteration the larger value bubbles up to the end of
the list. This algorithm uses multiple passes, and in each pass the first and second data items
are compared. If the first data item is bigger than the second, then the two items are swapped.
Next, the items in the second and third positions are compared, and if the first one is larger than
the second, then they are swapped; otherwise their order is left unchanged. This process continues
for each successive pair of data items until all items are sorted.
[End of Inner Loop]
Source Code:
def bubbleSort(arr):
    n = len(arr)
                arr[j], arr[j+1] = arr[j+1], arr[j]
bubbleSort(arr)
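Since the listing above is incomplete, here is a self-contained sketch of bubble sort. The function name bubble_sort and the swapped flag are choices made for this example: the flag stops the algorithm after the first pass that makes no swap, which is a common refinement and not necessarily what the original code did.

```python
def bubble_sort(arr):
    """Sort arr in place in ascending order and return it."""
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        # After i passes, the last i items are already in place
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            # A whole pass without a swap means the list is sorted
            break
    return arr


print(bubble_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]
```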
Step-by-step example:
Let us take the array of numbers "5 1 4 2 8", and sort the array from lowest number to greatest
number using bubble sort. In each step, one adjacent pair of elements is compared, moving from
left to right. Three passes will be required.
First Pass:
( 5 1 4 2 8 ) → ( 1 5 4 2 8 ), Here, the algorithm compares the first two elements, and swaps since 5 > 1.
( 1 5 4 2 8 ) → ( 1 4 5 2 8 ), Swap since 5 > 4.
( 1 4 5 2 8 ) → ( 1 4 2 5 8 ), Swap since 5 > 2.
( 1 4 2 5 8 ) → ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), the algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) → ( 1 4 2 5 8 )
( 1 4 2 5 8 ) → ( 1 2 4 5 8 ), Swap since 4 > 2.
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The
algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
( 1 2 4 5 8 ) → ( 1 2 4 5 8 )
Time Complexity:
The efficiency of the basic bubble sort algorithm depends only on the number of data items in the
array and is independent of their initial arrangement. If an array contains n data items, then the
outer loop executes n-1 times, as the algorithm requires n-1 passes. In the first pass, the inner
loop is executed n-1 times; in the second pass, n-2 times; in the third pass, n-3 times and so on.
The total number of iterations is (n-1) + (n-2) + ... + 1 = n(n-1)/2, resulting in a run time of O(n^2).
Selection Sort:
Selection sort is one of the simplest sorting algorithms. It sorts the elements in
an array by finding the minimum element of the unsorted part in each pass and placing it at the
beginning. This sorting technique improves over bubble sort by making only one exchange in
each pass. It maintains two sub arrays: one which is already sorted and one which is unsorted.
In each iteration the minimum element (for ascending order) is picked from the unsorted sub
array and moved to the sorted sub array.
Source Code:
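No listing is present at this point, so the following is a minimal sketch of selection sort as described above. The function name selection_sort and the variable min_index are chosen for illustration; the sample array is the one used in the step-by-step example.

```python
def selection_sort(arr):
    """Sort arr in place in ascending order and return it."""
    n = len(arr)
    for i in range(n - 1):
        # Find the minimum element of the unsorted part arr[i:]
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        # At most one exchange per pass
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr


print(selection_sort([64, 25, 12, 22, 11]))   # [11, 12, 22, 25, 64]
```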
Step-by-step example:
64 25 12 22 11
11 25 12 22 64
11 12 25 22 64
11 12 22 25 64
11 12 22 25 64
Time Complexity:
Selection sort is not difficult to analyze compared to other sorting algorithms since none of the
loops depend on the data in the array. Selecting the lowest element requires scanning all n
elements (this takes n − 1 comparisons) and then swapping it into the first position. Finding the
next lowest element requires scanning the remaining n − 1 elements and so on, for
(n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1) / 2 ∈ O(n^2) comparisons. Each of these scans requires
one swap, for a total of n − 1 swaps (the final element is already in place).
Average Case Performance O(n^2)
Insertion Sort:
Insertion sort considers the elements one at a time, inserting each in its suitable place among
those already considered (keeping them sorted). Insertion sort is an example of an incremental
algorithm. It builds the sorted sequence one element at a time. This is the technique commonly
used to sort a hand of playing cards. Insertion sort provides several advantages:
Simple implementation
Adaptive (i.e., efficient) for data sets that are already substantially sorted: the time
complexity is O(n + d), where d is the number of inversions
More efficient in practice than most other simple quadratic (i.e., O(n^2))
algorithms such as selection sort or bubble sort; the best case (nearly sorted input)
is O(n)
Stable; i.e., does not change the relative order of elements with equal keys
In-place; i.e., only requires a constant amount O(1) of additional memory space
Source Code:
# Function to do insertion sort
def insertionSort(arr):
    # Traverse through 1 to len(arr)
    for i in range(1, len(arr)):
        key = arr[i]
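The listing above stops after key = arr[i]. A complete version might look like the following sketch; the shifting loop is the usual way to finish the algorithm and is an assumption here, as is the function name insertion_sort. The sample values are the ones shown in the Output section of this topic.

```python
def insertion_sort(arr):
    """Sort arr in place in ascending order and return it."""
    # Traverse through 1 to len(arr)
    for i in range(1, len(arr)):
        key = arr[i]
        # Shift elements of the sorted part arr[0..i-1] that are
        # greater than key one position to the right
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        # Insert key into the gap that was opened
        arr[j + 1] = key
    return arr


print(insertion_sort([1, 65, 0, 32, 66]))   # [0, 1, 32, 65, 66]
```

Because the comparison is a strict arr[j] > key, equal keys are never moved past each other, which is what makes the sort stable.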
Step-by-step example:
Suppose you want to sort the elements in ascending order, as in the above figure. Then,
1. The second element of an array is compared with the elements that appear before it
(only first element in this case). If the second element is smaller than first element,
second element is inserted in the position of first element. After first step, first two
elements of an array will be sorted.
2. The third element of an array is compared with the elements that appears before it
(first and second element). If third element is smaller than first element, it is
inserted in the position of first element. If third element is larger than first element
but, smaller than second element, it is inserted in the position of second element. If
third element is larger than both the elements, it is kept in the position as it is.
After second step, first three elements of an array will be sorted.
3. Similarly, the fourth element of an array is compared with the elements that appear
before it (first, second and third element) and the same procedure is applied and
that element is inserted in the proper position. After third step, first four elements
of an array will be sorted.
If there are n elements to be sorted, then this procedure is repeated n-1 times to get the sorted
array.
Time Complexity:
Output:
Enter no of elements:5
Enter elements:1 65 0 32 66
Quick Sort :
Quick sort is a divide and conquer algorithm. Quick sort first divides a large list into two
smaller sub-lists: the low elements and the high elements. Quick sort can then recursively sort
the sub-lists.
2. Reorder the list so that all elements with values less than the pivot come before
the pivot, while all elements with values greater than the pivot come after it (equal
values can go either way). After this partitioning, the pivot is in its final position.
This is called the partition operation.
3. Recursively apply the above steps to the sub-list of elements with smaller values and
separately the sub-list of elements with greater values.
The base case of the recursion is lists of size zero or one, which never need to be sorted.
Quick sort, or partition-exchange sort, is a sorting algorithm developed by Tony Hoare that,
on average, makes O(n log n) comparisons to sort n items. In the worst case, it makes O(n^2)
comparisons, though this behavior is rare. Quick sort is often faster in practice than other O(n
log n) algorithms. It works by first partitioning the array around a pivot value and
then dealing with the 2 smaller partitions separately. Partitioning is the most complex part of
quick sort. The simplest thing is to use the first value in the array, a[l] (or a[0] as l = 0 to begin
with) as the pivot. After the partitioning, all values to the left of the pivot are <= pivot and all
values to the right are > pivot. The same procedure is repeated for the two remaining sub lists
and so on recursively until we have the entire list sorted.
Advantages:
Does not need additional memory (the sorting takes place in the array - this is called
in-place processing).
Source Code:
    arr[i+1], arr[high] = arr[high], arr[i+1]
    return (i+1)
# Ending index
# Function to do Quick sort
def quickSort(arr, low, high):
    if low < high:
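Only the end of the partition step and the start of quickSort survive above. The sketch below fills in the rest under an assumption: the surviving swap with arr[high] suggests that the last element is used as the pivot, so that scheme is used here, whereas the prose describes taking the first element. The function names partition and quick_sort are chosen for this example, and the sample data is the array from the step-by-step example.

```python
def partition(arr, low, high):
    """Place the pivot arr[high] in its final position and return that index."""
    pivot = arr[high]
    i = low - 1                      # end of the "<= pivot" region
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    # Put the pivot just after the "<= pivot" region
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def quick_sort(arr, low, high):
    """Sort arr[low..high] in place."""
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)      # elements before the pivot
        quick_sort(arr, p + 1, high)     # elements after the pivot


data = [38, 8, 16, 6, 79, 57, 24, 56, 2, 58, 4, 70, 45]
quick_sort(data, 0, len(data) - 1)
print(data)
```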
Step-by-step example:
1      2   3   4   5   6   7   8   9   10  11  12  13   Remarks
38     08  16  06  79  57  24  56  02  58  04  70  45
Pivot  08  16  06  Up  57  24  56  02  58  Dn  70  45   Swap up and down
Pivot  08  16  06  04  57  24  56  02  58  79  70  45
Pivot  08  16  06  04  Up  24  56  Dn  58  79  70  45   Swap up and down
Pivot  08  16  06  04  02  24  56  57  58  79  70  45
Pivot  08  16  06  04  02  Dn  Up  57  58  79  70  45   Swap pivot and down
24     08  16  06  04  02  38  56  57  58  79  70  45
Pivot  08  16  06  04  Dn  Up  56  57  58  79  70  45   Swap pivot and down
(02 08 16 06 04 24) 38 (56 57 58 79 70 45)
Pivot  08  16  06  04  Up  Dn
Pivot  Up  06  Dn                      Swap up and down
Pivot  04  06  16
Pivot  04  Dn  Up                      Swap pivot and down
06  04  08
Pivot  Dn  Up                          Swap pivot and down
04  06
Pivot  45  58  79  70  57
Pivot  Dn  Up  79  70  57              Swap pivot and down
(45) 56 (58 79 70 57)
Pivot  Up  70  Dn                      Swap up and down
Pivot  57  70  79
Pivot  Dn  Up  79                      Swap down and pivot
(57) 58 (70 79)
Pivot  Up                              Swap pivot and down
Time Complexity:
Best Case Performance (nearly) O(n log2 n)
Worst Case Performance O(n^2)
Average Case Performance O(n log2 n)
Merge Sort:
Merge sort is based on the divide and conquer method. It takes the list to be sorted and divides it
in half to create two unsorted lists. The two unsorted lists are then sorted and merged to get
a sorted list. The two unsorted lists are sorted by repeatedly calling the merge-sort
algorithm; we eventually get lists of size 1, which are already sorted. The lists of size 1 are
then merged.
1. Divide the input which we have to sort into two parts in the middle. Call it the left part
and right part.
2. Sort each of them separately. Note that here sort does not mean to sort it using some
other method. We use the same function recursively.
Input the total number of elements that are there in an array (number_of_elements). Input the
array (array[number_of_elements]). Then call the function MergeSort() to sort the input array.
MergeSort() function sorts the array in the range [left,right] i.e. from index left to index right
inclusive. Merge() function merges the two sorted parts. Sorted parts will be from [left, mid] and
[mid+1, right]. After merging output the sorted array.
MergeSort() function:
It takes the array and the left-most and right-most indices of the range to be sorted as arguments.
The middle index (mid) is calculated as (left + right)/2, using integer division. Sorting is needed
only when left<right, because when left=right the range holds a single element and is already
sorted. Sort the left part by calling MergeSort() again over the left part,
MergeSort(array,left,mid), and the right part by the recursive call
MergeSort(array,mid + 1, right). Lastly, merge the two parts using the Merge function.
Merge() function:
It takes the array, left-most , middle and right-most index of the array to be merged as
arguments. Finally copy back the sorted array to the original array.
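The index-based MergeSort() and Merge() functions described above can be sketched as follows. The Python names merge_sort and merge, the temporary list merged and the sample values are assumptions made for this example.

```python
def merge(array, left, mid, right):
    """Merge the sorted ranges array[left..mid] and array[mid+1..right]."""
    merged = []
    i, j = left, mid + 1
    while i <= mid and j <= right:
        if array[i] <= array[j]:
            merged.append(array[i])
            i += 1
        else:
            merged.append(array[j])
            j += 1
    # Copy whatever remains of either part
    merged.extend(array[i:mid + 1])
    merged.extend(array[j:right + 1])
    # Finally copy the merged result back into the original array
    array[left:right + 1] = merged


def merge_sort(array, left, right):
    """Sort array[left..right] (inclusive) in place."""
    if left < right:
        mid = (left + right) // 2
        merge_sort(array, left, mid)
        merge_sort(array, mid + 1, right)
        merge(array, left, mid, right)


values = [12, 11, 13, 5, 6, 7]   # hypothetical sample input
merge_sort(values, 0, len(values) - 1)
print(values)   # [5, 6, 7, 11, 12, 13]
```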
Source Code:
    result = []
    i, j = 0, 0
            i += 1
        else:
            result.append(right[j])
            j += 1
        if i == len(left) or j == len(right):
            result.extend(left[i:] or right[j:])
            break
    return result

def mergesort(list):
    if len(list) < 2:
        return list
    middle = int(len(list)/2)
    left = mergesort(list[:middle])
    right = mergesort(list[middle:])
    return merge(left,
5, 6, 7]
print("Given array is")
print(seq);
print("\n")
print("Sorted array is")
print(mergesort(seq))
Step-by-step example:
Merge Sort Example
Time Complexity:
O(n log2 n)