DSA Unit 1

The document outlines the topics covered in 5 units of a course on data structures and algorithms. Unit I introduces recursion, asymptotic notations, and sorting algorithms. Unit II covers linear data structures like lists, stacks, queues, and their array and linked implementations. Unit III discusses non-linear data structures like binary trees, binary search trees, and AVL trees. Unit IV focuses on priority queues, heaps, hashing, and rehashing. Unit V presents disjoint set data structures and graph algorithms. The document also lists references for further reading.

Uploaded by

Harish ram
Copyright
© © All Rights Reserved

Unit I

INTRODUCTION
Mathematics Review, Introduction to recursion,
Asymptotic notations, sequential search,
Binary Search - sorting algorithms: Insertion
sort, shell sort, merge sort, quick sort.
Unit II
LINEAR DATA STRUCTURES
List ADT - Stack ADT - Queue ADT - Array and
Linked Implementations - Applications.
Unit III
NON LINEAR DATA STRUCTURES
Binary Trees - Binary Search Tree - Adelson-Velsky
and Landis (AVL) Trees - Tree Traversals -
B-Trees.
Unit IV
PRIORITY QUEUE AND HASHING
Priority Queue - Binary Heap - Heapsort - Hash
functions - separate chaining, open addressing
- rehashing - Extendible hashing.
Unit V
DISJOINT SET ADT AND GRAPH ALGORITHMS
Basic data structure - smart union algorithms -
path compression - Topological sort - Shortest
path algorithms - Minimum spanning tree.
References
1. Mark Allen Weiss, Data Structures and Algorithm
Analysis in C, Second Edition, Pearson Education,
2015.
2. Thomas H Cormen, Charles E Leiserson, Ronald L
Rivest, Clifford Stein, Introduction to Algorithms,
Third Edition, MIT Press, 2014.
3. Ellis Horowitz, Sartaj Sahni, Susan Anderson Freed,
Fundamentals of Data Structures in C, Second
Edition, Universities Press, 2008.
4. Gilberg, Data Structures: A Pseudocode Approach
with C, Second Edition, Cengage Learning, 2007.
NEED FOR DATA
STRUCTURES AND
MATHEMATICS REVIEW
INTRODUCTION
• Typically, a computer is a data processing
machine.
• Almost all general-purpose computing
machines contain the following components:
– Secondary storage(Hard Disk)
– Primary Storage(Main memory/RAM)
– Processing unit
Contd…
• Secondary storage operates at a lower
speed than primary storage, but the data
stored in it persists even without a power
supply.
• In contrast, the data stored in
primary storage is wiped out when the power
is disconnected.
Contd…
• Generally, data is stored on the secondary
storage device and is brought into main
memory when processing is required.
• Since the processor works at high speed, it is
more convenient for it to work with main
memory than with the secondary storage
device.
Data Structures
“Clever” ways to organize information in order
to enable efficient computation
– What do we mean by clever?
– What do we mean by efficient?
Contd..
• The field of data structures mainly deals with
organization of data in the main
memory.
• What is a data structure?
– It defines how the data is
organized / structured
Terminology
• Abstract Data Type (ADT)
– Mathematical description of an object with set of
operations on the object. Useful building block.
• Algorithm
– A high level, language independent, description of a step-
by-step process
• Data structure
– A specific family of algorithms for implementing an abstract
data type.
• Implementation of data structure
– A specific implementation in a specific language

Terminology examples
• A stack is an abstract data type supporting push,
pop and isEmpty operations
• A stack data structure could use an array, a linked
list, or anything that can hold data
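To make the ADT versus data structure distinction concrete, here is a minimal sketch of one possible array-based stack in C. The names `Stack`, `stack_init`, `stack_push`, `stack_pop`, `stack_is_empty` and the fixed capacity `STACK_CAPACITY` are assumptions chosen for illustration; they do not come from the slides.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 100  /* assumed fixed capacity for this sketch */

/* Array-backed stack of ints; `top` counts the stored items. */
typedef struct {
    int items[STACK_CAPACITY];
    size_t top;
} Stack;

void stack_init(Stack *s) { s->top = 0; }

bool stack_is_empty(const Stack *s) { return s->top == 0; }

/* Returns false (and stores nothing) when the stack is full. */
bool stack_push(Stack *s, int value)
{
    if (s->top == STACK_CAPACITY)
        return false;
    s->items[s->top++] = value;
    return true;
}

/* Returns false when the stack is empty; otherwise writes the popped item to *out. */
bool stack_pop(Stack *s, int *out)
{
    if (s->top == 0)
        return false;
    *out = s->items[--s->top];
    return true;
}
```

Code that only calls push, pop and isEmpty would not need to change if the array were swapped for a linked list, which is the point of separating the ADT from its implementation.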

Concepts vs. Mechanisms
Concepts
• Abstract
• Pseudocode
• Algorithm
– A sequence of high-level, language independent
operations, which may act upon an abstracted view of
data.
• Abstract Data Type (ADT)
– A mathematical description of an object and the set of
operations on the object.
Mechanisms
• Concrete
• Specific programming language
• Program
– A sequence of operations in a specific
programming language, which may act upon real data in
the form of numbers, images, sound, etc.
• Data structure
– A specific way in which a program’s data is
represented, which reflects the programmer’s design
choices/goals.
Need for Data Structure
• To achieve efficiency in computation

General idea:
When data is organized properly,
processing / computation on it
becomes easy.
Definition
• A data structure is a way of organizing data
elements in computer memory in a
manner that makes it convenient to perform
operations on them.
• When elements of data are organized
together in terms of some relationship
among the elements, the organization is
called a data structure (DS).
Classification of data Structures
• Primitive data type
– Int
– Float
– Character
• Non primitive data type
– Arrays
– Strings
Contd..
• Abstract Data type
– Focuses on what it does and ignores how it does
its job.
– Considered apart from the detailed specifications
or implementation
– A non primitive data type with a set of operations
defined on it
– Examples: Stack, Queue
Abstract Data Types
Abstract Data Type (ADT): a definition for a data
type solely in terms of a set of values and a set
of operations on that data type.
Each ADT operation is defined by its inputs and
outputs.
Encapsulation: Hide implementation details.
An ADT is a mathematical model with a collection of
operations defined on that model.
Data Structure
• A data structure is the physical implementation of
an ADT.
– Each operation associated with the ADT is
implemented by one or more subroutines in the
implementation.
• Data structure usually refers to an organization
for data in main memory.
• File structure is an organization for data on
peripheral storage, such as a disk drive.
Logical vs. Physical Form
Data items have both a logical and a physical
form.
Logical form: definition of the data item within
an ADT.
– Ex: Integers in the mathematical sense: +, -
Physical form: implementation of the data item
within a data structure.
– Ex: 16/32 bit integers, overflow.
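The gap between the logical and the physical form of an integer can be seen in a few lines of C. This is a small sketch; the helper name `add_u16` is hypothetical, and it assumes a platform where `int` is wider than 16 bits.

```c
#include <stdint.h>

/* Adds two 16-bit unsigned values. The physical result wraps modulo 2^16,
   unlike addition on mathematical (logical) integers. */
uint16_t add_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}
```

Mathematically 65535 + 1 is 65536, but `add_u16(65535, 1)` yields 0 because only 16 bits are stored.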
Algorithms and Programs
Algorithm: a method or a process followed to
solve a problem.
– A recipe.
An algorithm takes the input to a problem
(function) and transforms it to the output.
– A mapping of input to output.
A problem can have many algorithms.
Algorithms: how to access data for a result
Algorithms: how to provide a smart solution
Algorithm Properties
An algorithm possesses the following
properties:
– It must be correct.
– It must be composed of a series of concrete steps.
– There can be no ambiguity as to which step will be
performed next.
– It must be composed of a finite number of steps.
– It must terminate.

A computer program is an instance, or
concrete representation, of an algorithm in
some programming language.
Importance of Data Structures
• Data Structures + Algorithms = Programs

• Most important steps in program design
– Selecting an appropriate and efficient data
structure
– Selecting an efficient algorithm
BENEFITS
1. It gives different levels of organization of
data.
2. It tells how data can be stored and
accessed at an elementary level.
3. Efficient utilization of memory.
4. It lets us analyze the running time of an
algorithm on different inputs so that we can
identify an efficient algorithm which saves
time while maintaining scalability.
ASSESSMENT
• The primary goal of the computer is
– a) to perform calculations
– b) to store information
– c) to retrieve information
– d) both b and c

• Which one of the following allows you to measure the inherent difficulty of a
problem?
– a) Asymptotic analysis
– b) Program
– c) Algorithm
– d) Semantic analysis
Contd…
• The performance of an algorithm often depends on
– a) Program
– b) Algorithm
– c) Data structure
– d) All of the above

• An array element is accessed using
– a) a first-in first-out approach
– b) the dot operator
– c) a member name
– d) an index number
Mathematics review
• Exponents
• Logarithms
• Series
• Modular arithmetic
• The P word
Exponents
The exponent of a number says how many times to use the
number in a multiplication.
In $8^2$ the "2" says to use 8 twice in a multiplication,
so $8^2 = 8 \times 8 = 64$.
Rules:
$X^A X^B = X^{A+B}$
$X^A / X^B = X^{A-B}$
$(X^A)^B = X^{AB}$
$X^N + X^N = 2X^N$
$2^N + 2^N = 2^{N+1}$
LOGARITHMS
In CS, all logarithms are to the base 2 unless specified.
Defn:
$X^A = B$ if and only if $\log_X B = A$. For example, $2^3 = 8$ in log form is $\log_2 8 = 3$.

Theorem:
$\log_A B = \log_C B / \log_C A$, for $A, B, C > 0$, $A \neq 1$
$\log AB = \log A + \log B$
$\log A/B = \log A - \log B$
$\log(A^B) = B \log A$
$\log 1 = 0$, $\log 2 = 1$, $\log 1{,}024 = 10$
Series
Modular arithmetic
For a positive integer n, two integers a and b are
said to be congruent modulo n, written
$a \equiv b \pmod{n}$,
if their difference a − b is an integer multiple of n (that is, n divides a − b). The number
n is called the modulus of the congruence. For example, $38 \equiv 14 \pmod{12}$, since 38 − 14 = 24 is a multiple of 12.

If $a \equiv b \pmod{n}$, then it also follows that
$a + c \equiv b + c \pmod{n}$
$ad \equiv bd \pmod{n}$
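The definition of congruence translates directly into a divisibility test. The sketch below uses a hypothetical helper `congruent`; it assumes n > 0 and that a − b does not overflow a `long`.

```c
#include <stdbool.h>

/* True when a and b are congruent modulo n (n > 0), i.e. n divides a - b. */
bool congruent(long a, long b, long n)
{
    return (a - b) % n == 0;
}
```

With this helper, `congruent(38, 14, 12)` holds, and it still holds after adding the same value to both sides or multiplying both sides by the same value.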
The P word
Proof by induction
1. Proving a base case : establishing that a theorem is true for
some small values.
2. Inductive hypothesis : the theorem is assumed to be true for all
cases up to some limit k.
3. Given this assumption, show that the theorem is true for k+ 1.
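As an illustration of the three steps, here is a short worked proof. The claim itself (the sum of the first n positive integers) is chosen for this example and is not taken from the slides.

```latex
\textbf{Claim: } \sum_{i=1}^{n} i = \frac{n(n+1)}{2} \quad \text{for all } n \ge 1.

\textbf{Base case } (n = 1): \quad \sum_{i=1}^{1} i = 1 = \frac{1 \cdot 2}{2}.

\textbf{Inductive hypothesis: } \sum_{i=1}^{k} i = \frac{k(k+1)}{2} \text{ for some } k \ge 1.

\textbf{Inductive step: } \sum_{i=1}^{k+1} i
  = \frac{k(k+1)}{2} + (k+1)
  = \frac{(k+1)(k+2)}{2},
\text{ which is the claim for } n = k+1.
```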

 
Recursion
• A function that is defined in terms of itself is called
recursive.
• A recursive call will keep on being made until a base
case is reached.
• A value for which the function is directly known is called a
base case.
• Remember:
Recursion is meaningless without a base case.
int factorial(int n)
{
    if (n == 0 || n == 1)
        return 1; // base case
    else
        return n * factorial(n - 1);
}
FAQ -1
1. Which data structure is used to perform recursion?
a. Queue b. Stack c. Linked List d. Tree

2. What happens if the base condition is not defined in
recursion?
a) Stack underflow b) Stack overflow
c) None of these d) Both a and b
Two essential conditions
• Each time a function calls itself, it must be
“nearer”, in some sense, to a solution.
• There must be a decision criterion for
stopping the process or computation.
Fundamental rules of recursion
• Base cases (some cases must be solved without recursion)
• Making progress (the recursive call must always be to a case
that makes progress towards a base case)
• Design rule (assume that all recursive calls work)
• Compound interest rule (never duplicate work by solving the
same instance of a problem in separate recursive calls)
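The compound interest rule is easiest to see on Fibonacci numbers, an example assumed here for illustration. In the sketch below, `fib_naive` breaks the rule because its two recursive calls recompute the same subproblems, while `fib_iter` computes each value once. Both function names are hypothetical.

```c
/* Naive version: fib_naive(n - 1) and fib_naive(n - 2) both recompute
   the same smaller instances, so the work grows exponentially with n. */
unsigned long fib_naive(unsigned n)
{
    if (n <= 1)
        return n;                 /* base cases: fib(0) = 0, fib(1) = 1 */
    return fib_naive(n - 1) + fib_naive(n - 2);
}

/* Iterative version: each Fibonacci number is computed exactly once. */
unsigned long fib_iter(unsigned n)
{
    unsigned long prev = 0, curr = 1;
    if (n == 0)
        return 0;
    for (unsigned i = 2; i <= n; i++) {
        unsigned long next = prev + curr;
        prev = curr;
        curr = next;
    }
    return curr;
}
```

Both functions return the same values; only the amount of repeated work differs.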
General algorithm model for any recursive
procedure
• Step 1: Save the parameters, local variables and return
address.
• Step 2: If the base criterion has been reached, then
perform the final computation and go to step 3;
otherwise, perform the partial computation and go to
step 1 (initiate a recursive call).
• Step 3: Restore the most recently saved parameters,
local variables and return address. Go to this return
address.
Two types of recursion
• Recursively defined functions (primitive recursive functions)
– Example: Factorial function
• Non primitive recursive functions
– Example: Ackermann’s function
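For reference, a sketch of Ackermann's function in its common two-argument form is given below. The slides do not state the definition, so the form used here is an assumption; the values grow so quickly that only very small arguments are practical.

```c
/* Ackermann's function A(m, n) for non-negative arguments
   (the common two-argument form; assumed, not taken from the slides). */
unsigned long ackermann(unsigned long m, unsigned long n)
{
    if (m == 0)
        return n + 1;                              /* base case */
    if (n == 0)
        return ackermann(m - 1, 1);
    return ackermann(m - 1, ackermann(m, n - 1));  /* nested recursive call */
}
```

The nested call in the last line, where the result of one recursive call is the argument of another, is what distinguishes it from simple recursions such as factorial.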
Example for recursion
#include <stdio.h>
int fact(int);
int main(){
  int num,f;
  printf("\nEnter a number: ");
  scanf("%d",&num);
  f=fact(num);
  printf("\nFactorial of %d is: %d",num,f);
  return 0;
}
int fact(int n){
   if(n<=1)            /* base case: 0! = 1! = 1 */
       return 1;
   else
       return(n*fact(n-1));
}
FAQ 2
#include <stdio.h>
 int fun(int n)
{
    if (n == 4)
       return n;
    else return 2*fun(n+1);
}
 int main()
{
   printf("%d ", fun(2));
   return 0;
}
 
a.4 b.8 c.16 d.Runtime error
SO1- Algorithms
• What is an Algorithm?
– A clearly specified set of simple instructions to be
followed to solve a problem
• Takes a value, or set of values, as input and
• produces a value, or set of values, as output
– May be specified
• In English
• As a computer program
• As pseudo-code
SO2 - Why Analysis of
Algorithms?
• If we want to go from city A to city B, there can be many ways
of doing this: by flight, by bus, by train and also by cycle.
• Depending on the availability and convenience we choose the
one which suits us.
• Similarly, in computer science multiple algorithms
can exist for solving the same problem.
• Example: To sort the given numbers, there are many sorting
algorithms available (insertion sort, merge sort, selection sort,
etc.)
How to analyze the algorithms?

• Analysis is the task of determining how much computing
time and storage an algorithm needs.
• The efficiency of an algorithm depends on its
performance, which is based on the
computing time and space it uses.
What is Running Time Analysis?
• It is the process of determining how processing time increases as the size of the
problem increases.
• Input size is the number of elements in the input.
In general, we encounter the following types of inputs.
– Size of an array
– Polynomial degree
– Number of elements in a matrix
– Number of bits in binary representation of the input
– Vertices and edges in a graph
Theoretical analysis of time efficiency
Time efficiency is analyzed by determining the
number of repetitions of the basic operation as a
function of input size.
• Basic operation: the operation that contributes the
most towards the running time of the algorithm

$T(n) \approx c_{op} \, C(n)$

where n is the input size, T(n) is the running time, $c_{op}$ is the
execution time (cost) of the basic operation, and C(n) is the number of
times the basic operation is executed.

Note: Different basic operations may cost differently!
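A short worked example shows how the estimate "running time T(n) is roughly the cost of the basic operation times the count C(n)" is used. The count C(n) = n(n-1)/2 below is an assumed example, not taken from the slides. Doubling the input size makes such an algorithm about four times slower, and the unknown operation cost cancels out:

```latex
\frac{T(2n)}{T(n)} \approx \frac{c_{op}\, C(2n)}{c_{op}\, C(n)}
  = \frac{\tfrac{1}{2}(2n)(2n-1)}{\tfrac{1}{2}\, n(n-1)}
  = \frac{2(2n-1)}{n-1} \approx 4 \quad \text{for large } n
```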


Types of Analysis
• Worst case
– Defines the input for which the algorithm takes the longest time.
– Input is the one for which the algorithm runs the slowest.
• Best case
– Defines the input for which the algorithm takes the least time.
– Input is the one for which the algorithm runs the fastest.
• Average case
– Provides a prediction about the running time of the algorithm
– Assumes that the input is random

Lower Bound <= Average Time <= Upper Bound


SO3 - Big Oh Notation
• This notation gives an upper bound on the given function; in practice we
quote the tightest upper bound we can find.
• That means, at larger values of n, the upper bound of f(n) is g(n),
written f(n) = O(g(n)).
• Example: if $f(n) = n^4 + 100n^2 + 10n + 6$ then
$g(n) = n^4$
• That means g(n) gives the maximum rate of growth for f(n) at larger
values of n.
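The example can be checked against the usual formal definition of Big Oh (f(n) is O(g(n)) if f(n) <= c g(n) for all n >= n0, for some positive constants c and n0). The constants c = 117 and n0 = 1 below are one valid choice worked out for this illustration; many others work equally well.

```latex
\text{For } n \ge 1:\quad
n^4 + 100n^2 + 10n + 6
  \;\le\; n^4 + 100n^4 + 10n^4 + 6n^4
  \;=\; 117\, n^4
\quad\Longrightarrow\quad f(n) = O(n^4) \text{ with } c = 117,\ n_0 = 1.
```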
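O-notation
f(n) = O(g(n)) if there exist positive constants c and n0 such that
0 <= f(n) <= c g(n) for all n >= n0.
Big Omega (Ω) Notation
• This notation gives a lower bound on the given function:
f(n) = Ω(g(n)) if there exist positive constants c and n0 such that
0 <= c g(n) <= f(n) for all n >= n0.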
Big Theta Notation
• This notation decides whether the upper and lower
bounds of a given function (algorithm) are the same or not.
• The average running time of an algorithm is always
between the lower bound and the upper bound.
• If the upper bound (O) and lower bound (Ω) give the
same result, then Θ notation will also have the same rate of
growth.
Θ-notation
f(n) = Θ(g(n)) if there exist positive constants c1, c2 and n0 such that
0 <= c1 g(n) <= f(n) <= c2 g(n) for all n >= n0.
SO-4 Guidelines for Asymptotic Analysis

• Big O notation (O) ignores all constant factors
so that you're left with an upper bound on
growth rate.
• For a single line statement like assignment,
where the running time is independent of the
input size n, the time complexity would be O(1)
int index = 5; //constant time
int item = list[index]; //constant time
Guidelines for Asymptotic Analysis
• For Loop
– The running time of a loop is, at most, the running time
of the statements inside the loop (including tests)
multiplied by the number of iterations.
– Example: for(i=1;i<=n;i++) // executes n times
m=m+1; // constant c
Total time = c x n = cn => O(n) // ignore constant
Guidelines for Asymptotic Analysis
• Nested loops:
– Analyze from inside out. Total running time is the
product of the sizes of all the loops.
– Example: for(i=1;i<=n;i++)
{
    for(j=1;j<=n;j++)
    {
        k=k+1;
    }
}
Total time = c x n x n = cn^2 => O(n^2)
Consecutive statements
• Add the time complexities of each statement.
x = x + 1; //constant time
//executed n times
for (i=1; i<=n; i++)
{ m = m + 2; //constant time
}
//outer loop executed n times
for (i=1; i<=n; i++)
{ //inner loop executed n times
for (j=1; j<=n; j++)
{ k = k+1; } //constant time
}
Total time = c0 + c1 n + c2 n^2 = O(n^2)
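One way to check the sum for consecutive statements is to count how many constant-time statements actually run. The sketch below mirrors the three pieces of code above; the function name `count_steps` is hypothetical, and each counted statement is treated as having unit cost.

```c
/* Counts the constant-time statements executed by a single statement,
   a single loop, and a nested loop run one after another. */
unsigned long count_steps(unsigned long n)
{
    unsigned long steps = 0;
    unsigned long i, j;

    steps++;                       /* x = x + 1: runs once */

    for (i = 1; i <= n; i++)
        steps++;                   /* m = m + 2: runs n times */

    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            steps++;               /* k = k + 1: runs n * n times */

    return steps;                  /* 1 + n + n*n */
}
```

For n = 10 the count is 1 + 10 + 100 = 111, and the n*n term dominates as n grows, which is why the total is O(n^2).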
If-then-else statements
if(length()!=otherstack.length())
{
return false;
}
else
{
for(int n=0;n<length();n++)
{
if(!list[n].equals(otherstack.list[n]))
return false;
}
}
Total time = c0+c1+(c2+c3)*n=O(n)
SO5- Algorithm using loops
i = 0;
while (i<N) {
X=X+Y; // O(1)
for loop // O(N), just an example...
i++; // O(1)
}
• The body of the while loop: O(N)
• Loop is executed: N times
N x O(N) = O(N^2)
if (i<j)
for ( i=0; i<N; i++ )
X = X+i;
else
X=0;

Max ( O(N), O(1) ) = O (N)


Logarithmic complexity
• An algorithm is O(log n) if it takes a constant time
to cut the problem size by a fraction (usually by
1/2).
• Example: for(i=1;i<n;) { i=i*2; }
• If we observe carefully, the value of i is doubling
every time. Initially i=1, in next step i=2, and in
subsequent steps i=4,8 and so on.
Logarithmic complexity
• Let us assume that the loop executes
k times. That means at the kth step $2^k = n$ and we
come out of the loop. So, if we take the logarithm on
both sides, $k = \log_2 n$, and the total time is O(log n).
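The logarithmic behaviour can be observed by counting the iterations of the doubling loop. This is a small sketch; the function name `doubling_iterations` is hypothetical, and it assumes n is small enough that `i * 2` does not overflow.

```c
/* Returns how many times the body of `for (i = 1; i < n; i = i * 2)` runs.
   Assumes n is small enough that i * 2 never overflows. */
unsigned doubling_iterations(unsigned long n)
{
    unsigned count = 0;
    for (unsigned long i = 1; i < n; i = i * 2)
        count++;
    return count;
}
```

For n = 1024 the loop body runs 10 times, matching log2 of 1024 = 10; for values of n that are not powers of two the count is log2 n rounded up.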
Basic asymptotic efficiency
classes
1            constant
log n        logarithmic
n            linear
n log n      n-log-n
n^2          quadratic
n^3          cubic
2^n          exponential
n!           factorial
Values of some important functions as n → ∞
Assessment

1. Loops that contain loops are known as ____________.
a. nested loops
b. loops
c. loop within loops
d. array loop

2. The efficiency of a logarithmic loop is _______________.
a. log n
b. n log n
c. n!
d. n
Contd..
3. The efficiency of a linear loop is __________.
a. n log n
b. n!
c. n
d. log n

4. The efficiency of a quadratic loop is __________.
a. n
b. n^2
c. n log n
d. log n
Contd..
5. The simplification of efficiency is known as
____________.
a. big-O notation
b. big-Ω notation
c. big-C notation
d. small-O notation
Searching
• Searching through a lot of records for a specific record or set
of records
• Placing records in order, which we call sorting
• Searching algorithms
– Sequential (Linear) Search
• Ordered
• Unordered
– Binary Search
Linear Search
• Searching is the process of determining whether or not a
given value exists in a data structure or a storage medium.
• Two searching methods on one-dimensional arrays:
linear search and binary search.
• The linear (or sequential) search algorithm on an array is:
– Sequentially scan the array, comparing each array item with the searched value.
– If a match is found, return the index of the matched element; otherwise return -1.
• Note: linear search can be applied to both sorted and unsorted
arrays.
Sequential Search on an Ordered File
Basic algorithm:
Procedure ls(l,n,k)
{
    i=0;
    while ((i<n) and (k>l[i]))   // stop early once l[i] >= k
    {
        i=i+1;
    }
    if ((i<n) and (k==l[i]))     // check i<n before reading l[i]
        printf("key is found");
    else
        printf("key is not found");
}
Example: 16,18,56,78,89,90,100
search element:78
Sequential Search on an Unordered File
Basic algorithm:
Procedure ls(l,n,k)
{
    i=0;
    while ((i<n) and (l[i]!= k))
    {
        i=i+1;
    }
    if ((i<n) and (k==l[i]))     // check i<n before reading l[i]
        printf("key is found");
    else
        printf("key is not found");
}
Example: 23,14,98,45,67,53
search element: 53
Linear Search
#include <stdio.h>
int main()
{
    int a[10], k, i, n;
    printf("Enter the number of elements in array\n");
    scanf("%d", &n);
    printf("Enter the numbers:");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    printf("Enter the number to search\n");
    scanf("%d", &k);
    for (i = 0; i < n; i++) {
        if (a[i] == k) { /* if required element found */
            printf("%d is present at location %d.\n", k, i+1);
            break;
        }
    }
    if (i == n)
        printf("%d is not present in array.\n", k);
    return 0;
}
Search Algorithms
How a Binary Search Works
• Always look at the center
value. Each time you get
to discard half of the
remaining list.

Is this fast?
Binary Search
O(log_2 n)
• A binary search looks for an item
in a list using a divide-and-conquer
strategy
Binary Search
• Binary search algorithm assumes that the items in
the array being searched are sorted
• The algorithm begins at the middle of the array in a
binary search
• If the item for which we are searching is less than
the item in the middle, we know that the item won’t
be in the second half of the array
• Once again we examine the “middle” element
• The process continues with each comparison cutting
in half the portion of the array where the item might
be
Binary Search
bool BinSearch(double list[ ], int n, double item, int&index)
{
    int left=0;
    int right=n-1;
    int mid;
    while(left<=right)
    {
        mid=(left+right)/2;
        if (item> list [mid])
        {
            left=mid+1;
        }
        else if (item< list [mid])
        {
            right=mid-1;
        }
        else
        {
            item= list [mid];
            index=mid;
            return true;
        }
    }// while
    return false;
}
Using Recursion
#include <stdio.h>
int binary(int a[],int n,int m,int l,int u){
     int mid;
     if(l<=u){
          mid=(l+u)/2;
          if(m==a[mid])
              return 1;          /* found */
          else if(m<a[mid])
              return binary(a,n,m,l,mid-1);
          else
              return binary(a,n,m,mid+1,u);
     }
     return 0;                   /* empty range: not found */
}
int main(){
    int a[10],i,n,m,c,l,u;
    printf("Enter the size of an array: ");
    scanf("%d",&n);
    printf("Enter the elements of the array: ");
    for(i=0;i<n;i++){
         scanf("%d",&a[i]);
    }
    printf("Enter the number to be searched: ");
    scanf("%d",&m);
    l=0,u=n-1;
    c=binary(a,n,m,l,u);
    if(c==0)
         printf("Number is not found.");
    else
         printf("Number is found.");
    return 0;
}
Binary Search
ASSESSMENT

Binary Search is best for
a) unsorted records in a linked list
b) unsorted records in an array
c) sorted records in a linked list
d) sorted records in an array
Contd..
Assume that a recursive binary search
routine is a part of a larger program. Then,
the best method to pass the input in order to
minimize the storage required is
a) pass by value
b) pass by reference
c) declare it in a global array variable
CONTD..
Assuming that the probability of a request
for a particular record is the same as for any
other record, the average and worst case
search times of sequential search are
proportional to
a) O(log n) base 2
b) O(log n) base 10
c) O(n)
d) O(n x n)
Classification of sorting
Internal sorting
• The data resides in the main memory (MM) of the computer.
Disadvantages
• The amount of main memory available may be smaller than the amount of
data
• Main memory is a volatile device
External sorting
• The data is stored on the secondary storage device
Various sorting techniques
• Internal sorting
– Bubble sort
– Insertion sort
– Shell sort
– Selection sort
– Merge sort
– Radix sort
– Heap sort
– Quick sort
• External sorting
– Multiway merge
– Polyphase merge
– Replacement selection
INSERTION SORTING
• Real life example:
– An example of an insertion sort occurs in
everyday life while playing cards.
– To sort the cards in your hand you extract a card,
shift the remaining cards, and then insert the
extracted card in the correct place.
– This process is repeated until all the cards are in
the correct sequence.
Insertion Sort: Idea
1. We have two groups of items:
– sorted group, and
– unsorted group
2. Initially, all items are in the unsorted group and the sorted group
is empty.
– We assume that items in the unsorted group are unsorted.
– We have to keep items in the sorted group sorted.
3. Pick any item from the unsorted group, then insert the item at the right position
in the sorted group to maintain the sorted property.
4. Repeat the process until the unsorted group becomes empty.
Insertion Sort: Example

40 2 1 43 3 65 0 -1 58 3 42 4

2 40 1 43 3 65 0 -1 58 3 42 4

1 2 40 43 3 65 0 -1 58 3 42 4
Insertion Sort: Example

1 2 40 43 3 65 0 -1 58 3 42 4

1 2 3 40 43 65 0 -1 58 3 42 4

1 2 3 40 43 65 0 -1 58 3 42 4
Insertion Sort: Example

1 2 3 40 43 65 0 -1 58 3 42 4

0 1 2 3 40 43 65 -1 58 3 42 4

-1 0 1 2 3 40 43 65 58 3 42 4
Insertion Sort: Example
-1 0 1 2 3 40 43 58 65 3 42 4

-1 0 1 2 3 3 40 43 58 65 42 4

-1 0 1 2 3 3 40 42 43 58 65 4

-1 0 1 2 3 3 4 40 42 43 58 65
Algorithm
for( i=1;i<=n-1;i++)
{
    temp=a[i];
    j=i-1;
    while( j>=0 && a[j]>temp)
    {
        a[j+1] = a[j];
        j=j-1;
    }
    a[j+1]=temp;
}
Example: 70, 30, 20, 50, 60, 10
Insertion Sort: Analysis
• Running time analysis:
– Worst case: O(N^2)
– Best case: O(N)
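The gap between the best and worst case can be observed by counting element shifts. The sketch below wraps the insertion sort loop in a function; the name `insertion_sort_count` and the shift counter are additions made for this illustration.

```c
/* Sorts a[0..n-1] in place with insertion sort and returns the number
   of element shifts performed. */
unsigned long insertion_sort_count(int a[], int n)
{
    unsigned long shifts = 0;
    for (int i = 1; i <= n - 1; i++) {
        int temp = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > temp) {
            a[j + 1] = a[j];      /* shift the larger element one place right */
            j = j - 1;
            shifts++;
        }
        a[j + 1] = temp;          /* insert into the sorted part */
    }
    return shifts;
}
```

An already sorted array of 6 items needs 0 shifts (best case, one comparison per item), while a reverse-sorted array of 6 items needs 6*5/2 = 15 shifts (worst case, quadratic growth).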
FA Questions
1. ____________ is an operation that segregates items into groups
according to a specified criterion.
a. Sorting b. Searching c. Inserting d. Deleting
2. Sorting that is done in _________ is called internal sorting.
a. main memory b. secondary memory c. auxiliary memory
d. disks
3. __________ is an internal sorting algorithm
a. poly-phase merge sort b. replacement selection
c. insertion sort d. 2-way merge sort
Shell Sort -General Description
– Divides an array into several smaller non-contiguous
segments
– The distance between successive elements in one
segment is called a gap.
– Each segment is sorted within itself using insertion sort.
– Then resegment into larger segments (smaller gaps)
and repeat sort.
– Continue until only one segment (gap = 1) - final sort
finishes array sorting.
Algorithm
for( int k = n/2; k > 0; k /= 2 )
    for( i = k; i<n; i++ )
    {
        t = a[i];
        for( j = i; j >= k; j -= k)
            if( t < a[j-k] )
                a[j] = a[j-k];
            else
                break;
        a[j] = t;
    }
Example: 25, 57, 48, 37, 12, 92, 89, 33
Assessment
1.Diminishing increment sort is a ____________
sort.
a) Bubble b) insertion c) selection d) shell
Ans:d
2.Shell sort is a _____ sorting.
a) Internal b) external c) searching d) merging
Ans:a
Divide and Conquer
• Recursive in structure
– Divide the problem into sub-problems that are
similar to the original but smaller in size
– Conquer the sub-problems by solving them
recursively. If they are small enough, just solve
them in a straightforward manner.
– Combine the solutions to create a solution to
the original problem
An Example: Merge Sort
Sorting Problem: Sort a sequence of n elements into
non-decreasing order.
• Divide: Divide the n-element sequence to be
sorted into two subsequences of n/2 elements
each
• Conquer: Sort the two subsequences recursively
using merge sort.
• Combine: Merge the two sorted subsequences to
produce the sorted answer.
Mergesort
•Mergesort (divide-and-conquer)
– Divide array into two halves.

A L G O R I T H M S

A L G O R I T H M S divide
Mergesort
•Mergesort (divide-and-conquer)
– Divide array into two halves.
– Recursively sort each half.

A L G O R I T H M S

A L G O R I T H M S divide

A G L O R H I M S T sort
Mergesort
•Mergesort (divide-and-conquer)
– Divide array into two halves.
– Recursively sort each half.
– Merge two halves to make sorted whole.
A L G O R I T H M S

A L G O R I T H M S divide

A G L O R H I M S T sort

A G H I L M O R S T merge
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G H auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G H I auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G H I L auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G H I L M auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G H I L M O auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

smallest smallest

A G L O R H I M S T

A G H I L M O R auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.
first half
exhausted smallest

A G L O R H I M S T

A G H I L M O R S auxiliary array
Merging
•Merge.
– Keep track of smallest element in each sorted half.
– Insert smallest of two elements into auxiliary array.
– Repeat until done.

first half
exhausted smallest

A G L O R H I M S T

A G H I L M O R S T auxiliary array
Algorithm
sort(a,l,h)
{
    if (l<h)
    {
        m=(l+h)/2;
        sort(a,l,m);
        sort(a,m+1,h);
        combine(a,l,m,h);
    }
}
combine(a,l,m,h)
{
    k=l;    //index for temp
    i=l;    //index for left sublist
    j=m+1;  //index for right sublist
    while(i<=m && j<=h)
    {
        if(a[i]<=a[j]) // smaller element is in left
        { t[k]=a[i]; i=i+1; k=k+1; }
        else           // smaller element is in right
        { t[k]=a[j]; j=j+1; k=k+1; }
    }
    //copy remaining elements of left to temp
    while(i<=m)
    { t[k]=a[i]; i=i+1; k=k+1; }
    //copy remaining elements of right to temp
    while(j<=h)
    { t[k]=a[j]; j=j+1; k=k+1; }
    //copy the merged run from temp back into a
    for(k=l;k<=h;k++)
        a[k]=t[k];
}
Analysis of Merge Sort
• Running time T(n) of Merge Sort:
• Divide: computing the middle takes Θ(1)
• Conquer: solving 2 sub problems takes 2T(n/2)
• Combine: merging n elements takes Θ(n)
• Total:
T(n) = Θ(1) if n = 1
T(n) = 2T(n/2) + Θ(n) if n > 1
⇒ T(n) = Θ(n lg n)

Do by yourself
• 10,3,54,23,25,5,75,1,453,36,68,51

• 10,5,7,6,1,4,8,3,2,9
Assessment

1. Merge sort uses
a. Divide and conquer strategy
b. Backtracking approach
c. Heuristic search
d. Greedy approach
Ans:a
Quick sort
Quick sort
• Also called sorting by exchange or transposition,
or partition exchange sort.
• Choose some element called a pivot
• Perform a sequence of exchanges so that
– All elements that are less than this pivot are to its left
and
– All elements that are greater than the pivot are to its
right.
• This divides the (sub)list into two smaller sublists,
each of which may then be sorted independently
in the same way.
Basic Idea
• Pick one element in the array, which will be the
pivot.
• Make one pass through the array, called a
partition step, re-arranging the entries so that:
• the pivot is in its proper place.
• entries smaller than the pivot are to the left of the pivot.
• entries larger than the pivot are to its right.
• Recursively apply quicksort to the part of the
array that is to the left of the pivot,
and to the right part of the array.
Choosing the pivot
• Choosing the pivot is an essential step.
Depending on the pivot the algorithm may run very fast, or
in quadratic time:
– Some fixed element: e.g. the first, the last, the one in the middle.
This is a bad choice - the pivot may turn out to be the smallest or the
largest element, and then one of the partitions will be empty.
– Randomly chosen (by a random number generator) - avoids consistently bad
partitions, but generating random numbers is expensive, so it is still not
the preferred choice.
– The median of the array (if the array has N numbers, the median
is the ⌈N/2⌉th largest number). This is difficult to compute -
it increases the complexity.
– The median-of-three choice: take the first, the last and the
middle element, and choose the median of these three elements.
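The median-of-three choice can be sketched as a small helper that returns the index of the chosen pivot. The function name `median_of_three` and the decision to return an index rather than rearrange the three elements are assumptions made for this illustration.

```c
/* Returns the index (lo, mid, or hi) that holds the median of
   a[lo], a[mid] and a[hi], where mid is the middle position. */
int median_of_three(const int a[], int lo, int hi)
{
    int mid = lo + (hi - lo) / 2;
    if (a[lo] < a[mid]) {
        if (a[mid] < a[hi]) return mid;      /* a[lo] < a[mid] < a[hi] */
        return (a[lo] < a[hi]) ? hi : lo;    /* a[mid] is the largest */
    } else {
        if (a[lo] < a[hi]) return lo;        /* a[mid] <= a[lo] < a[hi] */
        return (a[mid] < a[hi]) ? hi : mid;  /* a[lo] is the largest */
    }
}
```

For the array 42 23 74 11 65 58 94 36 99 87, the first, middle and last elements are 42, 65 and 87, so the helper picks index 4 (the value 65) as the pivot.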
Algorithm
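quicksort(a,lb,ub)
{
    if (lb<ub)
    {
        i=lb;
        j=ub+1;
        pivot=a[lb];                    // first element is the pivot
        while (1)
        {
            do { i=i+1; } while (i<=ub && a[i]<pivot); // scan the keys from L to R
            do { j=j-1; } while (a[j]>pivot);          // scan the keys from R to L
            if (i<j)
            { t=a[i]; a[i]=a[j]; a[j]=t; }             // swap a[i] and a[j]
            else
                break;                                 // i has crossed j
        }
        t=a[lb]; a[lb]=a[j]; a[j]=t;    // swap a[lb] and a[j]: pivot is now in place
        quicksort(a,lb,j-1);
        quicksort(a,j+1,ub);
    }
}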
Example
42 23 74 11 65 58 94 36 99 87
ASSESSMENT
1. ____________ is a more efficient exchange sorting scheme than bubble sort.
a. quick sort b. merge sort c. heap sort d. radix sort
Ans:a
2. A __________ uses a divide and conquer strategy
a. quick sort b. merge sort c. both a &b d. heap sort
Ans:c
3. Pivot element is used in _____ sort
a. merge sort b.quick sort c. heap sort d.  selection sort
Ans:b
CONTD..
4. __________ is the fastest sorting algorithm
a.      Merge b.      Quick c.       Bubble d.      Insertion
Ans:b
5. The _________ sort is faster than merge sort.
a.      Insertion b.      Heap c.       Quick d.      Radix
Ans:c
Do by Yourself
• 8, 3, 25, 6, 10, 17, 1, 2, 18, 5
• 50 30 10 90 80 20 40 70
