CD3291 - DSA - Book
COURSE OBJECTIVES:
● To understand the concepts of ADTs
● To design linear data structures – lists, stacks, and queues
● To understand sorting, searching and hashing algorithms
● To apply Tree and Graph structures
UNIT I ABSTRACT DATA TYPES 9
Abstract Data Types (ADTs) – ADTs and classes – introduction to OOP – classes in
Python – inheritance – namespaces – shallow and deep copying. Introduction to analysis of
algorithms – asymptotic notations – recursion – analyzing recursive algorithms
UNIT II LINEAR STRUCTURES 9
List ADT – array-based implementations – linked list implementations – singly linked lists – circularly linked lists – doubly linked lists – applications of lists – Stack ADT – Queue ADT – double ended queues
UNIT III SORTING AND SEARCHING 9
Bubble sort – selection sort – insertion sort – merge sort – quick sort – linear search – binary search – hashing – hash functions – collision handling – load factors, rehashing, and efficiency
UNIT IV TREE STRUCTURES 9
Tree ADT – Binary Tree ADT – tree traversals – binary search trees – AVL trees –
heaps – multiway search trees
UNIT V GRAPH STRUCTURES 9
Graph ADT – representations of graph – graph traversals – DAG – topological
ordering – shortest paths – minimum spanning trees
TOTAL: 45 HOURS
COURSE OUTCOMES:
At the end of the course, the student should be able to:
UNIT I - ABSTRACT DATA TYPES
Variables
A variable is a name that holds data, much like a placeholder in a mathematical equation.
Ex: x² + 2y − 2 = 1
Here x and y can take different values; in the same way, a variable in a program names a memory location whose contents can change.
Data Types
The number of bits allocated for each primitive data type depends on the
programming languages, the compiler and the operating system.
For the same primitive data type, different languages may use different sizes.
Depending on the size of the data types, the total available values (domain) will also change.
For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), then the total possible values are -32,768 to +32,767 (-2^15 to 2^15 - 1). If it takes 4 bytes (32 bits), then the possible values are between -2,147,483,648 and +2,147,483,647 (-2^31 to 2^31 - 1). The same is the case with other data types.
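The two ranges above follow from the two's-complement formula; a quick sketch to verify them (the helper name is illustrative):

```python
def int_range(bits):
    """Range of an n-bit signed (two's complement) integer."""
    return -2 ** (bits - 1), 2 ** (bits - 1) - 1

print(int_range(16))   # (-32768, 32767)
print(int_range(32))   # (-2147483648, 2147483647)
```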
User defined data types
If the system-defined data types are not enough, then most programming languages allow the users to define their own data types, called user-defined data types.
Good examples of user-defined data types are structures in C/C++ and classes in Java.
For example, in the snippet below, we are combining many system-defined data
types and calling the user defined data type by the name “newType”.
This gives more flexibility and comfort in dealing with computer memory.
struct newType
{
    int data1;
    float data2;
    ...
    char datan;
};
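In Python, the closest analogue of such a struct is a class; a minimal sketch (the field names mirror the struct above, the class name is illustrative):

```python
class NewType:
    """Groups several system-defined types into one user-defined type."""
    def __init__(self, data1, data2, datan):
        self.data1 = data1   # an int
        self.data2 = data2   # a float
        self.datan = datan   # a one-character string

record = NewType(10, 2.5, 'x')
print(record.data1, record.data2, record.datan)   # 10 2.5 x
```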
Data Structures
Data structure is a particular way of storing and organizing data in a computer so that
it can be used efficiently.
A data structure is a special format for organizing and storing data. General data
structure types include arrays, files, linked lists, stacks, queues, trees, graphs and so on.
Depending on the organization of the elements, data structures are classified into two types:
1) Linear data structures: Elements are accessed in a sequential order, but it is not compulsory to store all elements sequentially. Examples: Linked Lists, Stacks and Queues.
2) Non – linear data structures: Elements of this data structure are stored/accessed
in a non-linear order. Examples: Trees and graphs.
Abstract Data Types (ADTs)
For user-defined data types we also need to define operations. The implementation
for these operations can be done when we want to actually use them. That means, in
general, user defined data types are defined along with their operations.
To simplify the process of solving problems, we combine the data structures with
their operations and we call this Abstract Data Types (ADTs). An ADT consists of two parts:
1. Declaration of data
2. Declaration of operations
Commonly used ADTs include: Linked Lists, Stacks, Queues, Priority Queues, Binary
Trees, Dictionaries, Disjoint Sets (Union and Find), Hash Tables, Graphs, and many others.
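As a sketch of the idea (not code from the text), a Stack ADT pairs its data declaration with its operation declarations:

```python
class Stack:
    """Stack ADT: the data and its operations packaged together."""
    def __init__(self):
        self._data = []                 # declaration of data

    def push(self, e):                  # declarations of operations
        self._data.append(e)

    def pop(self):
        return self._data.pop()

    def is_empty(self):
        return len(self._data) == 0

s = Stack()
s.push(1)
s.push(2)
print(s.pop())        # 2  (last in, first out)
```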
Robustness
A correct program produces the right output for all the anticipated inputs in the program’s application. In addition, we want software to be robust, that is, capable of handling unexpected inputs that are not explicitly defined for its application.
Adaptability
Software, therefore, needs to be able to evolve over time in response to changing
conditions in its environment. Thus, another important goal of quality software is that it
achieves adaptability (also called evolvability).
Related to this concept is portability, which is the ability of software to run with
minimal change on different hardware and operating system platforms. An advantage
of writing software in Python is the portability provided by the language itself.
Reusability
Developing quality software can be an expensive enterprise, and its cost can be
offset somewhat if the software is designed in a way that makes it easily reusable in future
applications.
Such reuse should be done with care, however, for one of the major sources of
software errors in the Therac-25 came from inappropriate reuse of Therac-20 software.
Object-Oriented Design Principles
The chief principles of the object-oriented approach are:
Modularity
Abstraction
Encapsulation
Modularity
Modularity refers to an organizing principle in which different components of a
software system are divided into separate functional units.
Python’s standard libraries include, for example, the math module, which provides
definitions for key mathematical constants and functions, and the os module, which provides
support for interacting with the operating system.
Abstraction
Applying the abstraction paradigm to the design of data structures gives rise to
abstract data types (ADTs). An ADT is a mathematical model of a data structure that
specifies the type of data stored, the operations supported on them, and the types of
parameters of the operations, the collective set of behaviours supported by an ADT
designed as public interface.
Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly create
an instance of that class), but it defines one or more common methods that all
implementations of the abstraction must have.
An ABC is realized by one or more concrete classes that inherit from the abstract base class while providing implementations for those methods declared by the ABC.
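A minimal sketch of this mechanism using Python's abc module (the class names here are illustrative, not from the text):

```python
from abc import ABC, abstractmethod

class Shape(ABC):                 # abstract base class: cannot be instantiated
    @abstractmethod
    def area(self):
        """Every concrete shape must implement area."""

class Square(Shape):              # concrete class realizing the ABC
    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side * self._side

print(Square(3).area())   # 9
# Shape() would raise TypeError: can't instantiate abstract class
```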
Encapsulation
Encapsulation yields robustness and adaptability, for it allows the implementation
details of parts of a program to change without adversely affecting other parts, thereby
making it easier to fix bugs or add new functionality with relatively local changes to a
component.
Class Definitions
A class provides a set of behaviours in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.
A class also serves as a blueprint for its instances, effectively determining the way that state information for each instance is represented in the form of attributes (also known as fields, instance variables, or data members).
Defining a Class:
Just as function definitions begin with the def keyword in Python, class definitions begin with the class keyword.
The first string inside the class is called docstring and has a brief description of the
class. Although not mandatory, this is highly recommended.
class MyNewClass:
    '''This is a docstring. I have created a new class'''
    pass
A class creates a new local namespace where all its attributes are defined. Attributes
may be data or functions.
There are also special attributes that begin with double underscores __. For example, __doc__ gives us the docstring of that class.
As soon as we define a class, a new class object is created with the same name.
This class object allows us to access the different attributes as well as to instantiate new
objects of that class.
Example:
class Person:
    "This is a person class"
    age = 10

    def greet(self):
        print('Hello')

print(Person.age)
print(Person.greet)
print(Person.__doc__)
Output:
10
<function Person.greet at 0x7fc78c6e8160>
This is a person class
We saw that the class object could be used to access different attributes. It can also be used to create new object instances (instantiation) of that class. The procedure to create an object is similar to a function call:
harry = Person()
This will create a new object instance named harry. We can access the attributes of objects using the object name as prefix.
Example:
class CreditCard:
    def __init__(self, customer, bank, acnt, limit):
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0

    def get_customer(self):
        return self._customer

    def get_bank(self):
        return self._bank
The Constructor
The __init__ method serves as the constructor of the class. Its primary responsibility is to establish the state of a newly created credit card object with appropriate instance variables.
Encapsulation
A single leading underscore in the name of a data member, such as _balance, implies that it is intended as non-public. Users of a class should not directly access such members.
Additional Methods
The most interesting behaviours in our class are charge and make_payment. The charge method typically adds the given price to the credit card balance, to reflect a purchase of said price by the customer.
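A self-contained sketch of these two methods (the limit check and return values follow the usual convention; treat the details as illustrative rather than the text's exact code):

```python
class CreditCard:
    def __init__(self, customer, bank, acnt, limit):
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0

    def charge(self, price):
        """Add price to the balance; deny the charge if it would exceed the limit."""
        if price + self._balance > self._limit:
            return False          # charge denied
        self._balance += price
        return True               # charge processed

    def make_payment(self, amount):
        """Process a customer payment that reduces the balance."""
        self._balance -= amount

cc = CreditCard('John', 'Bank', '1234', 1000)
print(cc.charge(500))     # True
print(cc.charge(600))     # False (would exceed the limit)
```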
Inheritance:
There are two ways in which a subclass can differentiate itself from its superclass. A
subclass may specialize an existing behaviour by providing a new implementation that
overrides an existing method. A subclass may also extend its superclass by providing brand
new methods.
Syntax:
class BaseClass:
    # body of base class

class DerivedClass(BaseClass):
    # body of derived class
Derived class inherits features from the base class where new features can be added
to it. This results in re-usability of code.
Types of Inheritance
Depending upon the number of child and parent classes involved, there are four types of inheritance in Python.
Single Inheritance
When a child class inherits only a single parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

ob = Child()
ob.func1()
ob.func2()
Multiple Inheritance
When a child class inherits from more than one parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Parent2:
    def func2(self):
        print("this is function 2")

class Child(Parent, Parent2):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob.func3()
Multilevel Inheritance
When a child class becomes a parent class for another child class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child2(Child):
    def func3(self):
        print("this is function 3")

ob = Child2()
ob.func1()
ob.func2()
ob.func3()
Hierarchical Inheritance
Hierarchical inheritance involves more than one child class inheriting from the same base or parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child1(Parent):
    def func3(self):
        print("this is function 3")

class Child3(Parent, Child1):
    def func4(self):
        print("this is function 4")

ob = Child3()
ob.func1()
Whenever an identifier is assigned a value, that definition is made with a specific scope. Top-level assignments are typically made in what is known as global scope. Assignments made within the body of a function typically have a scope that is local to that function call. Therefore, an assignment x = 5 within a function has no effect on the identifier x in the broader scope. Each distinct scope in Python is represented using an abstraction known as a namespace. A namespace manages all identifiers that are currently defined in a given scope.
The process of determining the value associated with an identifier is known as name
resolution.
In a Python program, there are three types of namespaces:
1. Built-In
2. Global
3. Local
a. Built-in Namespace in Python
This namespace gets created when the interpreter starts. It stores all the keywords or
the built-in names. This is the superset of all the Namespaces. This is the reason we can
use print, True, etc. from any part of the code.
b. Global Namespace in Python
This is the namespace that holds all the global objects. This namespace gets created when the program starts running and exists till the end of the execution.
c. Local Namespace in Python
This is the namespace that generally exists for some part of the time during the execution of the program. It stores the names of the objects defined inside a function. These namespaces exist as long as the functions exist. This is the reason we cannot globally access a variable created inside a function.
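A short sketch (the variable names are illustrative) showing a global name being visible inside a function while a local name is not visible outside it:

```python
var1 = "Python"           # lives in the global namespace

def show():
    var2 = "Geeks"        # lives in the local namespace of this call
    print(var1, var2)     # globals are visible inside the function

show()                    # prints: Python Geeks
# print(var2)             # would raise NameError: var2 existed only inside show()
```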
Copy an Object in Python
In Python, we use the = operator to create a copy of an object. However, it only creates a new variable that shares the reference of the original object.
Let's take an example where we create a list named old_list and pass an object
reference to new_list using = operator.
Example:
# Copy using = operator
old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 'a']]
new_list = old_list

new_list[2][2] = 9

print('Old List:', old_list)
print('ID of Old List:', id(old_list))
print('New List:', new_list)
print('ID of New List:', id(new_list))

The output will be:

Old List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of Old List: 140673303268168
New List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of New List: 140673303268168
As the output shows, both variables old_list and new_list share the same id, i.e. 140673303268168. So changes made through either new_list or old_list are visible in both. Essentially, sometimes you may want to keep the original values unchanged and only modify the new values, or vice versa. For this, Python provides two ways to create copies:
Shallow Copy
Deep Copy
To make these copies work, the copy module is used.
For example:
import copy
copy.copy(x)
copy.deepcopy(x)
Here, copy() returns a shallow copy of x. Similarly, deepcopy() returns a deep copy of x.
Shallow Copy
A shallow copy creates a new object which stores the references of the original elements. So, a shallow copy doesn't create a copy of nested objects; instead it just copies the references of the nested objects. This means the copy process does not recurse, i.e. it does not create copies of the nested objects themselves.
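The code for the shallow-copy example discussed below is missing from this text; the following sketch is consistent with the surrounding description (the list values and the 'AA' assignment are taken from that discussion):

```python
import copy

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)   # shallow copy: new outer list, shared sublists

old_list.append([4, 4, 4])       # new sublist appears only in old_list
old_list[1][1] = 'AA'            # shared sublist: change is visible in both

print("Old list:", old_list)     # [[1, 1, 1], [2, 'AA', 2], [3, 3, 3], [4, 4, 4]]
print("New list:", new_list)     # [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]
```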
In the above program, a shallow copy of old_list is created. The new_list contains references to the original nested objects stored in old_list. Then we append a new sublist, i.e. [4, 4, 4], to old_list. This new sublist is not copied into new_list.
New list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]
In the above program, the change to old_list, i.e. old_list[1][1] = 'AA', is visible in both old_list and new_list at index [1][1]. This is because both lists share the references of the same nested objects.
Deep Copy
A deep copy creates a new object and recursively adds the copies of nested objects
present in the original elements.
In the above program, changes to any nested objects in the original object old_list do not appear in the copy new_list, because the nested objects themselves were copied.
Example: Adding a new nested object in the list using Deep copy
import copy
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)
old_list[1][0] = 'BB'
print("Old list:", old_list)
print("New list:", new_list)
Output:
Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
In the above program, the change in old_list appears only in old_list. This means both old_list and new_list are independent. This is because old_list was recursively copied, which is true for all its nested objects.
Introduction to Analysis of Algorithms
Experimental running times of two algorithms are difficult to directly compare unless
the experiments are performed in the same hardware and software environments.
Experiments can be done only on a limited set of test inputs; hence, they leave out
the running times of inputs not included in the experiment (and these inputs may be
important).
An algorithm must be fully implemented in order to execute it and study its running time experimentally.
Asymptotic analysis overcomes these limitations:
1. It allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.
Types of Analysis
Algorithm analysis considers the inputs for which the algorithm takes less time (performs well) and the inputs for which the algorithm takes a long time.
Worst case
Defines the input for which the algorithm takes a long time (slowest time to
complete).
Input is the one for which the algorithm runs the slowest.
Best case
Defines the input for which the algorithm takes the least time (fastest time to
complete).
Input is the one for which the algorithm runs the fastest.
Average case
Provides a prediction about the running time of the algorithm.
Run the algorithm many times, using many different inputs, take the total running time and divide it by the number of trials.
Assumes that the input is random.
Asymptotic Notation
For the best, average and worst cases, we need to identify the upper and lower
bounds. To represent these upper and lower bounds, we need some kind of syntax,
represented in the form of function f(n).
Big-O Notation
This notation gives the tight upper bound of the given function. Generally, it is represented as f(n) = O(g(n)). That means, at larger values of n, the upper bound of f(n) is g(n).
For example, if f(n) = n^4 + 100n^2 + 10n + 50 is the given algorithm, then n^4 is g(n). That means g(n) gives the maximum rate of growth for f(n) at larger values of n.
Big-O Examples
For example, 100n + 5 = O(n), since 100n + 5 ≤ 101n for all n ≥ 5; similarly, 3n^2 + 2n = O(n^2).
Omega (Ω) Notation
This notation gives the tight lower bound of the given algorithm, and we represent it as f(n) = Ω(g(n)). That means, at larger values of n, the tight lower bound of f(n) is g(n).
For example, if f(n) = 100n^2 + 10n + 50, then f(n) = Ω(n^2).
Theta (Θ) Notation
There is a notation that allows us to say that two functions grow at the same rate, up to constant factors. f(n) is Θ(g(n)), pronounced “f(n) is big-Theta of g(n)”, if f(n) is O(g(n)) and f(n) is Ω(g(n)); that is, there are real constants c′ > 0 and c′′ > 0, and an integer constant n0 ≥ 1, such that c′·g(n) ≤ f(n) ≤ c′′·g(n), for n ≥ n0.
There are some general rules to help us determine the running time of an algorithm.
1) Loops: The running time of a loop is, at most, the running time of the statements inside
the loop (including tests) multiplied by the number of iterations.
2) Nested loops: Analyze from the inside out. Total running time is the product of the sizes
of all the loops.
Example:
// outer loop executed n times
for (i = 1; i <= n; i++)
    // inner loop executed n times
    for (j = 1; j <= n; j++)
        k = k + 1;  // constant time, c
Total time = c × n × n = c·n^2 = O(n^2).
3) Consecutive statements: Add the time complexities of each statement.
Example:
x = x + 1;  // constant time
// executed n times
for (i = 1; i <= n; i++)
    m = m + 2;  // constant time, c
// outer loop executed n times
for (i = 1; i <= n; i++)
    // inner loop executed n times
    for (j = 1; j <= n; j++)
        k = k + 1;  // constant time, c
Total time = c0 + c1·n + c2·n^2 = O(n^2).
4) If-then-else statements:
Worst-case running time: the test, plus either the then part or the else part (whichever is the
larger).
// test: constant
if (length() == 0)
    return false;
else
    // loop executed n times
    for (int n = 0; n < length(); n++)
        // test: constant
        if (!list[n].equals(otherList.list[n]))
            return false;
Total time = c0 + c1 + (c2 + c3) × n = O(n).
Recursion
Recursion is a technique by which a function makes one or more calls to itself during
execution, until the condition gets satisfied. Recursion provides a powerful alternative for
performing repetitive tasks.
1) The Factorial Function
The factorial of a positive integer n, denoted n!, is defined as the product of the integers from 1 to n. If n = 0, then n! is defined as 1 by convention. More formally, for any integer n ≥ 0,
n! = 1 if n = 0, and n! = n · (n−1) · (n−2) ··· 3 · 2 · 1 if n ≥ 1.
The factorial function is used to find the number of ways in which n distinct items can
be arranged into a sequence, that is, the number of permutations of n items. For example,
the three characters a, b, and c can be arranged in 3! = 3 · 2 · 1 = 6 ways: abc, acb, bac,
bca, cab, and cba.
Recursion is not just a mathematical notation; we can use recursion to design a Python implementation of a factorial function, as shown in Code Fragment 4.1.
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
Trace for the factorial function is,
2) Drawing an English Ruler
This example draws the markings of a typical English ruler. For each inch, we place a tick with a numeric label. We denote the length of the tick designating a whole inch as the major tick length. Between the marks for whole inches, the ruler contains a series of minor ticks, placed at intervals of 1/2 inch, 1/4 inch, and so on.
Python Code:
def draw_line(tick_length, tick_label=''):
    """Draw one line with given tick length (followed by optional label)."""
    line = '-' * tick_length
    if tick_label:
        line += ' ' + tick_label
    print(line)

def draw_interval(center_length):
    """Draw tick interval based upon a central tick length."""
    if center_length > 0:
        draw_interval(center_length - 1)
        draw_line(center_length)
        draw_interval(center_length - 1)

def draw_ruler(num_inches, major_length):
    """Draw English ruler with given number of inches, major tick length."""
    draw_line(major_length, '0')
    for j in range(1, 1 + num_inches):
        draw_interval(major_length - 1)
        draw_line(major_length, str(j))

draw_ruler(2, 4)
Trace of the English Ruler Code:
3) Binary Search
Binary search, that is used to efficiently locate a target value within a sorted
sequence of n elements.
The algorithm maintains two parameters, low and high, such that all the candidate
entries have index at least low and at most high. Initially, low = 0 and high = n− 1. We then
compare the target value to the median candidate, that is, the item data[mid] with index
If the target equals data[mid], then we have found the item we are looking for,
and the search terminates successfully.
If target < data[mid], then we recur on the first half of the sequence, that is, on
the interval of indices from low to mid−1.
If target > data[mid], then we recur on the second half of the sequence, that
is, on the interval of indices from mid+1 to high.
An unsuccessful search occurs if low > high, as the interval [low,high] is
empty.
This algorithm is known as binary search. Whereas sequential search runs in O(n)
time, the more efficient binary search runs in O(logn) time.
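The steps above can be sketched as a recursive Python function (a standard formulation; the parameter names follow the description):

```python
def binary_search(data, target, low, high):
    """Return True if target is found in the sorted list data[low..high]."""
    if low > high:
        return False                  # interval is empty; unsuccessful search
    mid = (low + high) // 2
    if target == data[mid]:
        return True                   # found a match
    elif target < data[mid]:
        return binary_search(data, target, low, mid - 1)    # recur on left half
    else:
        return binary_search(data, target, mid + 1, high)   # recur on right half

data = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
print(binary_search(data, 22, 0, len(data) - 1))   # True
print(binary_search(data, 23, 0, len(data) - 1))   # False
```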
4) File Systems
Modern operating systems define file-system directories (which are also sometimes
called “folders”) in a recursive way. Namely, a file system consists of a top-level directory,
and the contents of this directory consists of files and other directories, which in turn can
contain files and other directories, and so on.
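The recursive structure described above can be sketched with Python's os module (a standard formulation, matching the disk_usage function discussed later in the analysis):

```python
import os

def disk_usage(path):
    """Return number of bytes used by a file/folder and any descendants."""
    total = os.path.getsize(path)            # account for direct usage
    if os.path.isdir(path):                  # if this is a directory,
        for filename in os.listdir(path):    # then for each child:
            childpath = os.path.join(path, filename)
            total += disk_usage(childpath)   # add child's usage to total
    return total
```

For example, disk_usage('.') reports the cumulative size of the current directory and everything beneath it.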
Analyzing Recursive Algorithm
With a recursive algorithm, we will account for each operation that is performed
based upon the particular activation of the function that manages the flow of control at the
time it is executed. Stated another way, for each invocation of the function, we only account
for the number of operations that are performed within the body of that activation. We can
then account for the overall number of operations that are executed as part of the recursive
algorithm by taking the sum, over all activations, of the number of operations that take place
during each individual activation.
Computing Factorials
It is relatively easy to analyze the efficiency of our function for computing factorials. A sample recursion trace shows one activation per call: to compute factorial(n), there are a total of n + 1 activations, each of which accounts for O(1) operations; hence, the overall running time of factorial(n) is O(n).
In analyzing the English ruler application, the fundamental question is how many total lines of output are generated by an initial call to draw_interval(c), where c denotes the center length. This is a reasonable benchmark for the overall efficiency of the algorithm, as each line of output is based upon a call to the draw_line utility, and each recursive call to draw_interval with nonzero parameter makes exactly one direct call to draw_line. Some intuition may be gained by examining the source code and the recursion trace. We know that a call to draw_interval(c) for c > 0 spawns two calls to draw_interval(c−1) and a single call to draw_line. We will rely on this intuition to prove the following claim.
Proposition 4.1: For c ≥ 0, a call to draw_interval(c) results in precisely 2^c − 1 lines of output.
Justification:
In fact, induction is a natural mathematical technique for proving the correctness and efficiency of a recursive process. In the case of the ruler, we note that an application of draw_interval(0) generates no output, and that 2^0 − 1 = 1 − 1 = 0. This serves as a base case for our claim. More generally, the number of lines printed by draw_interval(c) is one more than twice the number generated by a call to draw_interval(c−1), as one center line is printed between two such recursive calls. By induction, we have that the number of lines is thus 1 + 2·(2^(c−1) − 1) = 1 + 2^c − 2 = 2^c − 1. This proof is indicative of a more mathematically rigorous tool, known as a recurrence equation, that can be used to analyze the running time of a recursive algorithm.
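The recurrence in the proof can be checked empirically with a small counter (a sketch; lines(c) mirrors the two recursive calls plus one center line of draw_interval):

```python
def lines(c):
    """Lines of output from draw_interval(c): two half-intervals plus one center line."""
    if c == 0:
        return 0
    return 2 * lines(c - 1) + 1

for c in range(6):
    assert lines(c) == 2 ** c - 1   # matches Proposition 4.1

print(lines(4))   # 15
```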
Considering the running time of the binary search algorithm, a constant number of primitive operations are executed at each recursive call of binary search. Hence, the running time is proportional to the number of recursive calls performed. At most ⌊log n⌋ + 1 recursive calls are made during a binary search of a sequence having n elements, leading to the following claim: the binary search algorithm runs in O(log n) time for a sorted sequence with n elements.
Justification:
With each recursive call, the number of candidate entries still to be searched is given by the value high − low + 1. Moreover, the number of remaining candidates is reduced by at least one half with each recursive call. Specifically, from the definition of mid, the number of remaining candidates is either
(mid − 1) − low + 1 = ⌊(low + high)/2⌋ − low ≤ (high − low + 1)/2, or
high − (mid + 1) + 1 = high − ⌊(low + high)/2⌋ ≤ (high − low + 1)/2.
Initially, the number of candidates is n; after the first call in a binary search, it is at most n/2; after the second call, it is at most n/4; and so on. In general, after the j-th call in a binary search, the number of candidate entries remaining is at most n/2^j. In the worst case (an unsuccessful search), the recursive calls stop when there are no more candidate entries. Hence, the maximum number of recursive calls performed is the smallest integer r such that n/2^r < 1. In other words (recalling that we omit a logarithm’s base when it is 2), r > log n. Thus, we have r = ⌊log n⌋ + 1, which implies that binary search runs in O(log n) time.
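This bound can be sanity-checked by counting recursive calls in an instrumented binary search (a sketch; the counting wrapper is illustrative, and it also counts the final call on an empty interval, so the bound checked is ⌊log n⌋ + 2):

```python
import math

def search_calls(data, target, low, high):
    """Return number of binary-search calls made, including the empty-interval base case."""
    if low > high:
        return 1
    mid = (low + high) // 2
    if target == data[mid]:
        return 1
    if target < data[mid]:
        return 1 + search_calls(data, target, low, mid - 1)
    return 1 + search_calls(data, target, mid + 1, high)

n = 1000
calls = search_calls(list(range(n)), -1, 0, n - 1)   # worst case: unsuccessful search
print(calls, math.floor(math.log2(n)) + 2)           # call count stays within the bound
```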
To characterize the “problem size” for our analysis, we let n denote the number of file-system entries in the portion of the file system that is considered. (For example, the file system portrayed in Figure 4.6 has n = 19 entries.) To characterize the cumulative time spent for an initial call to the disk_usage function, we must analyze the total number of recursive invocations that are made, as well as the number of operations that are executed within those invocations.
Intuitively, a call to disk_usage for a particular entry e of the file system is only made from the directory containing e, and that entry will only be explored once.
Each iteration of that loop makes a recursive call to disk_usage, and yet we have already concluded that there are a total of n calls to disk_usage (including the original call). We therefore conclude that there are O(n) recursive calls, each of which uses O(1) time outside the loop, and that the overall number of operations due to the loop is O(n). Summing all of these bounds, the overall number of operations is O(n).
Unit I – 2 Mark Questions with Answers
The number of bits allocated for each primitive data type depends on the
programming languages, the compiler and the operating system.
For the same primitive data type, different languages may use different sizes.
Depending on the size of the data types, the total available values (domain) will also change.
For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), then the total possible values are -32,768 to +32,767 (-2^15 to 2^15 - 1). If it takes 4 bytes (32 bits), then the possible values are between -2,147,483,648 and +2,147,483,647 (-2^31 to 2^31 - 1). The same is the case with other data types.
5. What are Data Structures?
Data structure is a particular way of storing and organizing data in a computer so that
it can be used efficiently.
A data structure is a special format for organizing and storing data. General data
structure types include arrays, files, linked lists, stacks, queues, trees, graphs and so on.
Depending on the organization of the elements, data structures are classified into two types:
1) Linear data structures: Elements are accessed in a sequential order but it is not
compulsory to store all elements sequentially. Examples: Linked Lists, Stacks and Queues.
2) Non – linear data structures: Elements of this data structure are stored/accessed
in a non-linear order. Examples: Trees and graphs.
7. What are the characteristics of quality software?
Robustness
Adaptability
Reusability
Modularity
9. Is Python Adaptable?
Software needs to be able to evolve over time in response to changing conditions in its environment. Thus, another important goal of quality software is that it achieves adaptability (also called evolvability).
Related to this concept is portability, which is the ability of software to run with
minimal change on different hardware and operating system platforms. An advantage of
writing software in Python is the portability provided by the language itself.
Such reuse should be done with care, however, for one of the major sources of
software errors in the Therac-25 came from inappropriate reuse of Therac-20 software.
Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly create
an instance of that class), but it defines one or more common methods that all
implementations of the abstraction must have.
An ABC is realized by one or more concrete classes that inherit from the abstract base class while providing implementations for those methods declared by the ABC.
14. What is Encapsulation?
Wrapping up of data and the methods that operate on it into a single unit is called encapsulation. Different components of a software system should not reveal the internal details of their respective implementations.
Encapsulation yields robustness and adaptability, for it allows the implementation details of parts of a program to change without adversely affecting other parts, making it easier to fix bugs or add new functionality with relatively local changes to a component.
15. What is a Class?
A class is a collection of objects. A class contains the blueprints or the prototype from which the objects are created. It is a logical entity that contains some attributes and methods.
A class provides a set of behaviors in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.
A class determines the way that state information for each instance is represented, in the form of attributes (also known as fields, instance variables, or data members).
Example:
class Person:
    def __init__(self, a, b):
        print("Sum=", a + b)

obj = Person(2, 3)
Output:
Sum= 5
16. What is a Constructor?
__init__ is a special method that serves as the constructor of the class. Its primary responsibility is to establish the state of a newly created object with appropriate instance variables.
Example:
class Person:
    def __init__(self, a, b):
        print("Sum=", a + b)

obj = Person(2, 3)
Output:
Sum= 5
17. What is Inheritance?
This allows a new class to be defined based upon an existing class as the starting
point. In object-oriented terminology, the existing class is typically described as the base
class, parent class, or superclass, while the newly defined class is known as the subclass or
child class.
Syntax:
class BaseClass:
    # body of base class

class DerivedClass(BaseClass):
    # body of derived class
Derived class inherits features from the base class where new features can be added
to it. This results in re-usability of code.
20. What is Multiple Inheritance?
When a child class inherits from more than one parent class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Parent2:
    def func2(self):
        print("this is function 2")

class Child(Parent, Parent2):
    def func3(self):
        print("this is function 3")

ob = Child()
ob.func1()
ob.func2()
ob.func3()
22. What is Hierarchical Inheritance?
Hierarchical inheritance involves multiple inheritance from the same base or parent
class.
Example:
class Parent:
    def func1(self):
        print("this is function 1")

class Child(Parent):
    def func2(self):
        print("this is function 2")

class Child1(Parent):
    def func3(self):
        print("this is function 3")

class Child3(Parent, Child1):
    def func4(self):
        print("this is function 4")

ob = Child3()
ob.func1()
Python has three types of namespaces:
1. Built-In
2. Global
3. Local
25. What is Python Global Namespace?
This is the namespace that holds all the global objects. This namespace gets created
when the program starts running and exists till the end of the execution.
Built-In Namespace:
This namespace gets created when the interpreter starts. It stores all the keywords and
built-in names. It is the superset of all the namespaces; this is the reason we can
use print, True, etc. from any part of the code.
Local Namespace:
This is the namespace that generally exists for some part of the time during the
execution of the program. It stores the names of the objects created inside a function.
These namespaces exist as long as the function executes. This is the reason we cannot
globally access a variable created inside a function.
Output:
NameError: name 'var2' is not defined
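The code for this example is missing; a minimal sketch that reproduces the NameError above (the name var2 is taken from the output):

```python
# var2 lives only in the function's local namespace, so it cannot be
# reached from the global namespace.
def func():
    var2 = 10        # var2 exists only while func() is running
    return var2

print(func())        # the value is accessible through the function
try:
    print(var2)      # var2 is not in the global namespace
except NameError as e:
    print(e)         # name 'var2' is not defined
```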
o Shallow Copy
o Deep Copy
To make these copies work, the copy module is used.
For example:
import copy
copy.copy(x)          # shallow copy
copy.deepcopy(x)      # deep copy
31. Give the procedure to create a new object from original elements.
A deep copy creates a new object and recursively adds copies of the nested objects
present in the original elements.
print("Old list:", old_list)
print("New list:", new_list)
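The two print statements above belong to an example whose body was lost; a minimal reconstruction (the list values are assumed):

```python
import copy

# Deep copy recursively copies the nested lists, so mutating the copy
# leaves the original intact.
old_list = [[1, 2], [3, 4]]
new_list = copy.deepcopy(old_list)
new_list[0][0] = 99            # change only the copy

print("Old list:", old_list)   # Old list: [[1, 2], [3, 4]]
print("New list:", new_list)   # New list: [[99, 2], [3, 4]]
```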
32. What is the use of Algorithm analysis?
Algorithm analysis helps us to determine which algorithm is most efficient in terms of
time and space consumed.
Worst case
Defines the input for which the algorithm takes the longest time to complete; the
input is the one for which the algorithm runs the slowest.
Best case
Defines the input for which the algorithm takes the least time to complete; the
input is the one for which the algorithm runs the fastest.
Average case
Provides a prediction about the running time of the algorithm: run the algorithm
many times, using many different randomly chosen inputs, and divide the total time by
the number of trials.
Lower Bound <= Average Time <= Upper Bound
There are some general rules to help us determine the running time of an algorithm.
1) Loops: The running time of a loop is, at most, the running time of the statements inside
the loop (including tests) multiplied by the number of iterations.
38. How will you analyse the running time complexity of Nested Loops?
Nested loops: Analyze from the inside out. The total running time is the running time
of the innermost statement multiplied by the product of the sizes of all the loops.
Example:
for i in range(n):        # outer loop
    for j in range(n):    # inner loop
        k = k + 1         # constant time, c
Total time = c × n × n = cn² = O(n²).
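The same count can be confirmed empirically with a small Python sketch (the function name is illustrative):

```python
# Counting the executions of the inner statement confirms the O(n^2) analysis.
def count_ops(n):
    k = 0
    for i in range(n):          # outer loop: n iterations
        for j in range(n):      # inner loop: n iterations each
            k = k + 1           # constant-time statement
    return k

print(count_ops(10))   # 100 = 10 * 10
```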
39. How will you analyse the running time complexity of Consecutive Statements?
Consecutive statements: their running times simply add, so the total is dominated by
the largest term (the maximum is the one that counts).
40. How will you analyse the running time complexity of if-then-else statements?
If-then-else statements - Worst-case running time: the time of the test, plus the
larger of the running times of the then part and the else part.
if length() == 0:                              # test: constant time
    return False
else:
    for n in range(length()):                  # loop: n iterations
        if not list[n] == otherList.list[n]:   # constant-time body
            return False
Total time = c0 + c1 + (c2 + c3) × n = O(n).
Recursion is a technique by which a function makes one or more calls to itself during
execution, until a base condition is satisfied. Recursion provides a powerful
alternative for performing repetitive tasks.
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
42. Give the procedure for drawing an English Ruler using Recursion.
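The answer is missing here; a standard recursive sketch of the English-ruler procedure (following the usual textbook formulation) is:

```python
def draw_line(tick_length, tick_label=''):
    # Draw one line with the given tick length, followed by an optional label.
    line = '-' * tick_length
    if tick_label:
        line += ' ' + tick_label
    print(line)

def draw_interval(center_length):
    # Draw the interval based on a central tick length.
    if center_length > 0:                    # stop when the length drops to 0
        draw_interval(center_length - 1)     # recursively draw the top ticks
        draw_line(center_length)             # draw the center tick
        draw_interval(center_length - 1)     # recursively draw the bottom ticks

def draw_ruler(num_inches, major_length):
    # Draw an English ruler with the given number of inches and major tick length.
    draw_line(major_length, '0')             # draw the inch-0 line
    for j in range(1, 1 + num_inches):
        draw_interval(major_length - 1)      # draw the interior ticks
        draw_line(major_length, str(j))      # draw the inch line and label

draw_ruler(2, 3)
```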
UNIT II LINEAR STRUCTURES
List ADT:
Array Implementation:
Basic structure for storing and accessing a collection of data is the array. A one-
dimensional array is a collection of contiguous elements in which individual elements are
identified by a unique integer subscript starting with zero. Once an array is created, its size
cannot be changed.
Python’s list structure is a mutable sequence container that can change size as
items are added or removed. It is an abstract data type that is implemented using an array
structure to store the items contained in the list.
Appending Items
pyList.append( 50 )
If there is room in the array, the item is stored in the next available slot of the array
and the length field is incremented by one.
pyList.append( 18 )
pyList.append( 64 )
pyList.append( 6)
After the second statement is executed, the array becomes full and there is no
available space to add more values.
By definition, a list can contain any number of items and never becomes full. Thus,
when the third statement is executed, the array will have to be expanded to make room for
value 6. Array cannot change size once it has been created. To allow for the expansion of
the list, the following steps have to be performed:
(1) A new array with a larger capacity is created,
(2) The items from the original array are copied to the new array,
(3) The new larger array is set as the data structure for the list,
(4) The original smaller array is destroyed.
After the array has been expanded, the value can be appended to the end of the list.
Extending a List
A list can be appended to a second list using the extend() method as shown in the
following example:
pyListA = [ 34, 12 ]
pyListB = [ 4, 6, 31, 9 ]
pyListA.extend( pyListB )
If the list being extended has the capacity to store all of the elements from the
second list, the elements are simply copied, element by element. If there is not enough
capacity for all of the elements, the underlying array has to be expanded as was done with
the append() method.
Inserting Items
An item can be inserted anywhere within the list using the insert() method. In the
following example pyList.insert( 3, 79 ) we insert the value 79 at index position 3. Since there
is already an item at that position, we must make room for the new item by shifting all of the
items down one position starting with the item at index position 3. After shifting the items, the
value 79 is then inserted at position 3.
Removing Items
An item can be removed from any position within the list using the pop() method.
Consider the following code segment, which removes both the first and last items from the
sample list:
The first statement removes the first item from the list. After the item is removed,
typically by setting the reference variable to None, the items following it within the array are
shifted down, from left to right, to close the gap. Finally, the length of the list is decremented
to reflect the smaller size.
The second pop() operation in the example code removes the last item from the list.
Since there are no items following the last one, the only operations required are to remove
the item and decrement the size of the list. After removing an item from the list, the size of
the array may be reduced using a technique similar to that for expansion. This reduction
occurs when the number of available slots in the internal array falls below a certain
threshold. For example, when more than half of the array elements are empty, the size of the
array may be cut in half.
List Slice
Slicing is an operation that creates a new list consisting of a contiguous subset of
elements from the original list. The original list is not modified by this operation.
Instead, references to the corresponding elements are copied and stored in the new list.
In Python, slicing is performed on a list using the colon operator, specifying the index
of the first element and the index one past the last element of the subset. Consider the
following example code segment, which creates a slice from our sample list: aSlice = theVector[2:3]
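A short sketch of this behaviour (the sample values are assumed):

```python
# theVector[2:3] copies the elements from index 2 up to (but not
# including) index 3 into a new list; the original is untouched.
theVector = [12, 4, 6, 31, 9, 18]   # sample values (assumed)
aSlice = theVector[2:3]

print(aSlice)        # [6] -- one element, the one at index 2
print(theVector)     # the original list is unchanged
```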
Python Code
import ctypes

class dy_array:
    def __init__(self):
        self.n = 0                      # number of elements stored
        self.capacity = 1               # current capacity of the array
        self.Arr = self.makearray(self.capacity)
    def makearray(self, c):
        # low-level array of c object references
        return (c * ctypes.py_object)()
    def findlength(self, obj):
        # count the items in an iterable
        count = 0
        for i in obj:
            count += 1
        return count
    def getitem(self, x):
        # return the index of value x, if present
        for i in range(self.n):
            if self.Arr[i] == x:
                return i
        print("Data Not Found")
    def append(self, obj):
        if self.n == self.capacity:     # grow when full
            self.resize(2 * self.capacity)
        self.Arr[self.n] = obj
        self.n += 1
    def resize(self, c):
        B = self.makearray(c)           # new, larger array
        for i in range(self.n):         # copy the existing items
            B[i] = self.Arr[i]
        self.Arr = B
        self.capacity = c
    def insert(self, pos, val):
        if self.n == self.capacity:
            self.resize(2 * self.capacity)
        for i in range(self.n, pos, -1):    # shift items to the right
            self.Arr[i] = self.Arr[i - 1]
        self.Arr[pos] = val
        self.n += 1
    def extend(self, val):
        length = self.findlength(val)
        for i in range(length):
            self.append(val[i])
    def remove(self, val):
        for i in range(self.n):
            if self.Arr[i] == val:
                for j in range(i, self.n - 1):   # shift items to the left
                    self.Arr[j] = self.Arr[j + 1]
                self.n -= 1
                break
    def disp(self):
        for i in range(self.n):
            print(self.Arr[i])
Linked List Implementation:
Algorithm remove first(L):
if L.head is None then
Indicate an error: the list is empty.
L.head = L.head.next {make head point to next node (or None)}
L.size = L.size−1 {decrement the node count}
class L:
    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0
    def len(self):
        return self.size
    def insert_first(self, data):
        newnode = L.Node(data)
        newnode.next = self.head
        self.head = newnode
        if self.tail == None:
            self.tail = newnode
        self.size += 1
    def insert_last(self, data):
        newnode = L.Node(data)
        if self.tail == None:          # empty list
            self.head = self.tail = newnode
        else:
            self.tail.next = newnode
            self.tail = newnode
        self.size += 1
    def remove_first(self):
        if self.head == None:
            print("Invalid")
        else:
            self.head = self.head.next
            if self.head == None:      # the list became empty
                self.tail = None
            self.size -= 1
    def display(self):
        n = self.head
        for i in range(self.size):
            print(n.data)
            n = n.next
    def length(self):
        return self.size
Circularly Linked List:
A circularly linked list is a collection of nodes that collectively form a linear
sequence, with the next of the tail node pointing back to the head of the list.
A circularly linked list provides a more general model than a standard linked list for
data sets that are cyclic, that is, which do not have any particular notion of a beginning and
end.
(figure: a circular linked list with nodes A1, A2, A3, where the next of A3 points back to A1)
def __init__(self):
self.head=None
self.tail=None
self.size=0
Insert at first:
def insert_first(self, data):
    newnode = L.Node(data)
    if self.head == None:            # empty list
        self.head = newnode
        self.tail = newnode
    else:
        newnode.next = self.head
        self.head = newnode
    self.tail.next = self.head       # keep the list circular
    self.size += 1
Insert at Last:
def insert_last(self, data):
    newnode = L.Node(data)
    if self.tail == None:            # empty list
        self.head = self.tail = newnode
    else:
        self.tail.next = newnode
        self.tail = newnode
    self.tail.next = self.head       # keep the list circular
    self.size += 1
Remove First:
def remove_first(self):
    if self.head == None:
        print("Invalid")
    elif self.head == self.tail:     # only one node in the list
        self.head = self.tail = None
        self.size -= 1
    else:
        self.tail.next = self.head.next
        self.head = self.head.next
        self.size -= 1
def length(self):
    return self.size
Doubly Linked Lists
A linked list in which each node keeps an explicit reference to the node before it
and a reference to the node after it is known as a doubly linked list.
Advantages:
Deletion operation is easier.
Finding the predecessor and successor of a node is easier.
Algorithm
1. Create a newnode
2. If there is no list already, make newnode as Head and Tail.
3. else Find the node predata (the node after which the new node is inserted).
4. Update,
newnode.next = predata.next
predata.next.prev = newnode
newnode.prev = predata
predata.next = newnode
5. Increase the size.
Deletion in Doubly Linked List
Algorithm:
class List:
    class Node:
        def __init__(self, data):
            self.data = data
            self.prev = None
            self.next = None
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0
    def len(self):
        return self.size
    def insert(self, predata, data):
        newnode = List.Node(data)
        if self.head == None:                  # empty list
            self.head = newnode
            self.tail = newnode
            self.size += 1
        else:
            temp = self.head
            while temp != None:
                if temp.data == predata:       # insert after this node
                    newnode.next = temp.next
                    if temp.next != None:
                        temp.next.prev = newnode
                    else:
                        self.tail = newnode    # inserted after the tail
                    temp.next = newnode
                    newnode.prev = temp
                    self.size += 1
                    break
                temp = temp.next
            else:
                print("data not found")
    def remove(self, x):
        temp = self.head
        if self.head == None:
            print("Empty List")
            return
        elif self.head.data == x and self.size == 1:
            self.head = None                   # removing the only node
            self.tail = None
            self.size -= 1
            return
        elif self.head.data == x:
            self.head = self.head.next         # removing the first node
            self.head.prev = None
            self.size -= 1
            return
        else:
            while temp != None and temp.data != x:
                temp = temp.next
            if temp == None:
                print("Data not found")
                return
            temp.prev.next = temp.next
            if temp.next != None:
                temp.next.prev = temp.prev
            else:
                self.tail = temp.prev          # removing the last node
            self.size -= 1
    def display(self):
        if self.head == None:
            print("List empty")
        else:
            n = self.head
            for i in range(self.size):
                print(n.data)
                n = n.next
Stack ADT
A stack is an ordered list in which all insertions and deletions are made at one
end, called the top.
Stack is a list with the restriction that insertions and deletions can be performed in
only one position, namely the end of the list called Top.
It follows LIFO approach. LIFO represents “Last In First Out”. The basic
operations are push and pop.
Push - equivalent to insert. Pop - deletes the most recently inserted element.
Stack Model:
(figure: a stack holding elements 3, 10, 6, 4, 5, with 3 at the Top (TOS); Push(x) and Pop both operate at the top)
Implementation of Stack:
There are two methods of implementing stack operations.
Array implementation
Linked List implementation
Push Operation:
The process of putting a new data element onto the stack is known as a Push
Operation.
Push operation involves a series of steps –
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exit.
Step 3 − If the stack is not full, increments top to point next empty space.
Step 4 − Adds data element to the stack location, where top is pointing.
Pop Operation:
POP operation is performed on the stack to remove items from the stack
Pop operation involves a series of steps –
Step 1 − Check if top == -1; if so, the stack is empty (underflow) and the operation exits with an error.
Step 2 − Otherwise, access the element that top is pointing to: num = stk[top]
Step 3 − Decrease top by 1: top = top - 1
isFull():
To check whether the stack is full or not before every push operation.
isEmpty():
To check whether the stack is Empty or not before every pop operation.
Linked List implementation of a Stack:
class Stack:
    class node:
        def __init__(self, data):
            self.data = data
            self.next = None
    def __init__(self):
        self.top = None
        self.size = 0
    def push(self, data):
        newnode = Stack.node(data)
        if self.top == None:
            self.top = newnode
        else:
            newnode.next = self.top
            self.top = newnode
        self.size += 1
    def pop(self):
        if self.isempty():
            print("Stack is empty")
        else:
            self.top = self.top.next
            self.size -= 1
    def isFull(self):
        # MaxSize must be defined if the stack is to be bounded; a
        # linked stack otherwise never becomes full
        return self.size == MaxSize
    def isempty(self):
        return self.size == 0
    def length(self):
        return self.size
    def display(self):
        temp = self.top
        for i in range(self.size):
            print(temp.data)
            temp = temp.next
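The same LIFO behaviour can be sketched with Python's built-in list, which supports push and pop at one end:

```python
stack = []            # a Python list used as a stack
stack.append(3)       # push 3
stack.append(10)      # push 10
stack.append(6)       # push 6

print(stack.pop())    # 6  -- last in, first out
print(stack.pop())    # 10
print(stack)          # [3]
```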
Queue ADT
Queue is an ordered collection of data items. It deletes items at the front of the
queue and inserts items at the rear. It has a FIFO structure, i.e. "First In First Out".
Queue Model:
Queue Operations:
1. Enqueue:
To add an item to the queue. If the queue is full, then it is said to be an Overflow
condition.
Code for Enqueue:
def Enqueue(self, data):
    newnode = Queue.Node(data)
    if self.size == 0:
        self.Front = self.Rear = newnode
    else:
        self.Rear.next = newnode
        self.Rear = newnode
    self.size += 1
2. Dequeue:
Dequeue: Removes an item from the queue. The items are popped in the same
order in which they are pushed. If the queue is empty, then it is said to be an Underflow
condition.
Code for Dequeue:
def Dequeue(self):
    if self.size == 0:
        print("Queue is Empty")
    else:
        self.Front = self.Front.next
        if self.Front == None:        # the queue became empty
            self.Rear = None
        self.size -= 1
3. isFull:
To check whether the queue is full before every Enqueue Operation.
Code for isFull:
def isFull(self):
    return self.size == MaxSize    # MaxSize applies to a bounded queue
class Queue:
    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None
    def __init__(self):
        self.Front = None
        self.Rear = None
        self.size = 0
    def Enqueue(self, data):
        newnode = Queue.Node(data)
        if self.size == 0:
            self.Front = self.Rear = newnode
        else:
            self.Rear.next = newnode
            self.Rear = newnode
        self.size += 1
    def Dequeue(self):
        if self.size == 0:
            print("Queue is Empty")
        else:
            self.Front = self.Front.next
            if self.Front == None:    # the queue became empty
                self.Rear = None
            self.size -= 1
    def isempty(self):
        return self.size == 0
    def length(self):
        return self.size
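A short usage sketch of FIFO behaviour with collections.deque from the standard library:

```python
from collections import deque

queue = deque()        # collections.deque gives O(1) operations at both ends
queue.append(10)       # enqueue at the rear
queue.append(20)
queue.append(30)

print(queue.popleft()) # 10 -- first in, first out
print(queue.popleft()) # 20
print(list(queue))     # [30]
```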
Double Ended Queue – Deque:
Data structure that supports insertion and deletion at both the front and the back of
the queue is called a double ended queue, or deque.
class Deque:
    class Node:
        def __init__(self, data):
            self.data = data
            self.prev = None
            self.next = None
    def __init__(self):
        self.Front = None
        self.Rear = None
        self.size = 0
    def Enqueue_Front(self, data):
        newnode = Deque.Node(data)
        if self.isempty():
            self.Front = self.Rear = newnode
        else:
            newnode.next = self.Front
            self.Front.prev = newnode
            self.Front = newnode
        self.size += 1
    def Enqueue_Rear(self, data):
        newnode = Deque.Node(data)
        if self.isempty():
            self.Front = self.Rear = newnode
        else:
            self.Rear.next = newnode
            newnode.prev = self.Rear
            self.Rear = newnode
        self.size += 1
    def Dequeue_Front(self):
        if self.isempty():
            print("Deque is Empty")
        else:
            self.Front = self.Front.next
            if self.Front == None:         # the deque became empty
                self.Rear = None
            else:
                self.Front.prev = None
            self.size -= 1
    def Dequeue_Rear(self):
        if self.isempty():
            print("Deque is Empty")
        else:
            self.Rear = self.Rear.prev
            if self.Rear == None:          # the deque became empty
                self.Front = None
            else:
                self.Rear.next = None
            self.size -= 1
    def isempty(self):
        return self.size == 0
    def display(self):
        temp = self.Front
        for i in range(self.size):
            print(temp.data)
            temp = temp.next
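A short sketch of deque behaviour using the standard library's collections.deque:

```python
from collections import deque

d = deque()
d.append(1)        # enqueue at the rear
d.appendleft(2)    # enqueue at the front
d.append(3)        # rear again -> deque is [2, 1, 3]

print(d.pop())     # 3 -- removed from the rear
print(d.popleft()) # 2 -- removed from the front
print(list(d))     # [1]
```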
UNIT II – 2 Marks Questions with Answers
5. What is abstract data type? What are all not concerned in an ADT?
An abstract data type is a triple (D, F, A), where D is the set of domains, F is the
set of functions, and A is the set of axioms, in which only what is to be done is
mentioned but how it is to be done is not mentioned.
Thus an ADT is not concerned with implementation details.
6. List out the areas in which data structures are applied extensively.
Following are the areas in which data structures are applied extensively.
Operating system- the data structures like priority queues are used for
scheduling the jobs in the operating system.
Compiler design- the tree data structure is used in parsing the source program.
Stack data structure is used in handling recursive calls.
Database management system- The file data structure is used in database
management systems. Sorting and searching techniques can be applied on these
data in the file.
Numerical analysis package- the array is used to perform the numerical
analysis on the given set of data.
Graphics- the array and the linked list are useful in graphics applications.
Artificial intelligence- the graph and trees are used for the applications like
building expression trees, game playing.
12. State the properties of LIST abstract data type with suitable example.
Various properties of the LIST abstract data type are:
It is a linear data structure in which the elements are arranged adjacent to each other.
It allows storing a single-variable polynomial.
If the LIST is implemented using dynamic memory then it is called a linked list.
Examples of LIST-based structures are stacks, queues, and linked lists.
13. State the advantages of circular lists over doubly linked list.
In a circular list the next pointer of the last node points to the head node, whereas
in a doubly linked list each node has two pointers: one previous pointer and one next
pointer. The main advantage of the circular list over the doubly linked list is that with
the help of a single pointer field we can reach the head node quickly. Hence some amount
of memory gets saved, because in a circular list only one pointer field is reserved per node.
14. What are the advantages of doubly linked list over singly linked list?
The doubly linked list has two pointer fields: one is the previous link field and
the other is the next link field. Because of these two pointer fields we can access any
neighbouring node efficiently, whereas in a singly linked list there is only one pointer
field, which stores the forward pointer.
The linked list makes use of dynamic memory allocation. Hence the user can
allocate or deallocate memory as per his requirements. On the other hand, the array
makes use of static memory allocation. Hence there are chances of wastage of
memory, or shortage of memory for allocation.
17. What is the circular linked list?
The circular linked list is a kind of linked list in which the last node is connected to the
first node or head node of the linked list.
Singly circular linked list
(figure: nodes A1, A2, A3 where the next of A3 points back to A1)
The various operations that are performed on the stack are:
CREATE(S) – creates S as an empty stack.
PUSH(S,X) – adds the element X to the top of the stack.
POP(S) – deletes the topmost element from the stack.
TOP(S) – returns the value of the top element of the stack.
ISEMPTY(S) – returns true if the stack is empty, else false.
ISFULL(S) – returns true if the stack is full, else false.
28. Write down the function to insert an element into a queue, in which the queue is
implemented as an array.
void enqueue(int X, Queue Q)
{
    if (IsFull(Q))
        Error("Full queue");
    else
    {
        Q->Size++;
        Q->Rear = Q->Rear + 1;
        Q->Array[Q->Rear] = X;
    }
}
UNIT – III SORTING AND SEARCHING
Bubble sort – selection sort – insertion sort – merge sort – quick sort – linear
search – binary search – hashing – hash functions – collision handling – load factors,
rehashing, and efficiency.
Sorting:
1) Bubble Sort:
There is a simple, but inefficient algorithm, called bubble-sort, for sorting a list L of n
comparable elements. This algorithm scans the list n−1 times, where, in each scan, the
algorithm compares the current element with the next one and swaps them if they are out of
order.
This algorithm uses multiple passes and in each pass the first and second data items
are compared.
If the first data item is bigger than the second, then the two items are swapped.
Next the items in second and third position are compared and if the first one is
larger than the second, then they are swapped, otherwise no change in their
order.
This process continues for each successive pair of data items until all items are
sorted.
Algorithm:
#Bubble Sort
def BubbleSort(A):
    for i in range(len(A)):
        for j in range(len(A) - 1):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
    print(A)

BubbleSort([4, 2, 7, 3, 1, 8, 9])
Output:
[1, 2, 3, 4, 7, 8, 9]
Step-by-step example:
Let us take the array of numbers "6 2 5 3 9", and sort the array from lowest number
to greatest number using bubble sort.
In each step, elements written in bold are being compared. Three passes will be
required.
Time Complexity:
The efficiency of the bubble sort algorithm is independent of the initial arrangement
of the data items in the array. If the array contains n data items, then the outer loop
executes n-1 times, as the algorithm requires n-1 passes.
In the first pass, the inner loop is executed n-1 times; in the second pass, n-2 times;
in the third pass, n-3 times, and so on. The total number of iterations of the inner
loop is the sum of the first n-1 integers, which equals n(n-1)/2, resulting in a run
time of O(n²).
Worst Case Performance O(n²)
Best Case Performance O(n²)
Average Case Performance O(n²)
2) Selection Sort
Selection sort is one of the simplest sorting algorithms. It sorts the elements in an
array by finding the minimum element of the unsorted part in each pass and keeping it
at the beginning.
This sorting technique improves over bubble sort by making only one exchange in
each pass. It maintains two subarrays: one which is already sorted and one which is
unsorted. In each iteration, the minimum element (for ascending order) is picked from
the unsorted subarray and moved to the sorted subarray.
Python Code
# Selection Sort
def SelectionSort(A):
    for i in range(len(A)):
        min_index = i
        for j in range(i + 1, len(A)):
            if A[min_index] > A[j]:
                min_index = j
        A[i], A[min_index] = A[min_index], A[i]
    print(A)

SelectionSort([3, 20, 1, 4, 5, 2])
Step-by-step example:
Time Complexity:
Selection sort is not difficult to analyse compared to other sorting algorithms since
none of the loops depend on the data in the array. Selecting the lowest element requires
scanning all n elements (this takes n − 1 comparisons) and then swapping it into the first
position. Finding the next lowest element requires scanning the remaining n − 1 elements
and so on, for (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ O(n²) comparisons. Each of
these scans requires one swap for n − 1 elements (the final element is already in place).
Worst Case Performance O(n²)
Best Case Performance O(n²)
Average Case Performance O(n²)
3) Insertion Sort:
We start with the first element in the array. One element by itself is already sorted.
Then we consider the next element in the array. If it is smaller than the first, we swap them.
Next we consider the third element in the array. We swap it leftward until it is in its
proper order with the first two elements. We then consider the fourth element, and swap it
leftward until it is in the proper order with the first three.
We continue in this manner with the fifth element, the sixth, and so on, until the whole
array is sorted.
Algorithm InsertionSort(A):
Input: An array A of n comparable elements
Output: The array A with elements rearranged in nondecreasing order
for k from 1 to n − 1 do
Insert A[k] at its proper location within A[0], A[1], ..., A[k].
Step-by-step example:
Algorithm:
def InsertionSort(A):
    for j in range(1, len(A)):
        i = j
        while i > 0 and A[i] < A[i - 1]:
            A[i], A[i - 1] = A[i - 1], A[i]
            i -= 1
    print(A)

InsertionSort([5, 14, 30, 2, 1])
Time Complexity:
Worst Case Performance O(n²)
Best Case Performance (nearly sorted input) O(n)
Average Case Performance O(n²)
4) Merge Sort:
Merge sort is based on the divide and conquer method. It takes the list to be sorted
and divides it in half to create two unsorted lists. The two unsorted lists are then
sorted and merged to get a sorted list. By continually calling the Partition routine on
the two unsorted lists, we eventually get lists of size 1, which are already sorted.
The two lists of size 1 are then merged.
1. Divide the input which we have to sort into two parts in the middle. Call it the left
part and right part.
2. Sort each of them separately. Note that here sort does not mean to sort it using
some other method. We use the same function recursively.
3. Then merge the two sorted parts.
Step-by-step Example:
Algorithm:
def merge(S1, S2, S):
    i = j = 0
    while i + j < len(S):
        if j == len(S2) or (i < len(S1) and S1[i] < S2[j]):
            S[i + j] = S1[i]
            i += 1
        else:
            S[i + j] = S2[j]
            j += 1

def Partition(S):
    n = len(S)
    if n < 2:
        return
    mid = n // 2
    S1 = S[0:mid]        # copy of first half
    S2 = S[mid:n]        # copy of second half
    Partition(S1)        # sort copy of first half
    Partition(S2)        # sort copy of second half
    merge(S1, S2, S)     # merge the sorted halves back into S

A = [85, 24, 63, 450, 170, 31, 96, 50]
Partition(A)
print("Sorted Array is")
for i in range(len(A)):
    print(A[i])
5) Quick Sort:
The quick sort algorithm also uses the divide and conquer strategy. But unlike
the merge sort, which splits the sequence of keys at the midpoint, the quick sort partitions
the sequence by dividing it into two segments based on a selected pivot key. In addition, the
quick sort can be implemented to work with virtual sub sequences without the need for
temporary storage.
Quick sort is a divide and conquer algorithm. Quick sort first divides a large list
into two smaller sublists: the low elements and the high elements. Quick sort can then
recursively sort the sub-lists.
Step-by-step Example 1:
Sort [15, 6, 8, 4, 11, 9, 2, 1, 5], taking the last element as the pivot. left starts
at the first position and right at the position just before the pivot. While left <= right:
advance left past elements smaller than the pivot, retreat right past elements larger
than the pivot, and if left < right, swap the two elements and move both inward
(left = left + 1, right = right - 1). When left and right cross, swap the element at
left with the pivot; the pivot is now locked in its final position. Quicksort is then
applied recursively to the elements on the left of the locked pivot and on the right
of it, eventually producing the sorted array [1, 2, 4, 5, 6, 8, 9, 11, 15].
Example 2:
Algorithm:
def Quicksort(S, l, r):
    if l >= r:
        return
    pivot = S[r]                 # the last element is the pivot
    left = l
    right = r - 1
    while left <= right:
        while left <= right and S[left] < pivot:
            left += 1            # scan for an element >= pivot
        while left <= right and pivot < S[right]:
            right -= 1           # scan for an element <= pivot
        if left <= right:
            S[left], S[right] = S[right], S[left]
            left = left + 1
            right = right - 1
    S[left], S[r] = S[r], S[left]   # put the pivot into its final place
    Quicksort(S, l, left - 1)
    Quicksort(S, left + 1, r)

A = [41, 21, 5, 10, 6, 3]
Quicksort(A, 0, 5)
print(A)
Searching:
1) Linear Search:
When the sequence is unsorted, the standard approach to search for a target value
is to use a loop to examine every element, until either the target is found or the data
set is exhausted. This is known as the linear or sequential search algorithm. It follows
a guess and check pattern: guess that the current item is the target, check it, and move
on to the next item on an incorrect guess. This algorithm runs in O(n) time (i.e., linear
time), since every element is inspected in the worst case.
Example:
List : 10,51,2,18,4,31,13,5,23,64,29 Element to be searched : 31
Algorithm:
#Linear Search
def LinearSearch(List, data):
    for i in range(len(List)):
        if List[i] == data:
            print(data, "present at position", i + 1)
            break
    else:                          # the loop finished without a break
        print("Element not found!")

List = [10, 51, 2, 18, 4, 31, 13, 5, 23, 64, 29]
LinearSearch(List, 31)
Output:
31 present at position 6
2) Binary Search:
Binary search requires the values to be stored in sorted order within an indexable
sequence, such as a Python list. When the sequence is sorted and indexable, there is a
much more efficient algorithm. Initially, low = 0 and high = n − 1. We then compare the
target value to the median candidate, that is, the item data[mid] with index
mid = (low + high) // 2.
Algorithm For Binary Search
def BinarySearch(List, data, low, high):
    if low <= high:
        mid = (low + high) // 2
        if data == List[mid]:
            print(data, "present at position", mid + 1)
        elif data < List[mid]:
            BinarySearch(List, data, low, mid - 1)   # search the left half
        else:
            BinarySearch(List, data, mid + 1, high)  # search the right half
    else:
        print("Element not found!")

List = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # the values must be in sorted order
BinarySearch(List, 1, 0, 9)
Output:
1 present at position 1
Hashing:
What is Hashing?
Hashing in the data structure is a technique of mapping a large chunk of data into
small tables using a hashing function. It is also known as the message digest function. It is a
technique that uniquely identifies a specific item from a collection of similar items. It uses
hash tables to store the data in an array format.
Each value in the array is assigned a unique index number. Hash tables use a
technique to generate these unique index numbers for each value stored in an array
format. This technique is called the hash technique.
Hash table:
The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value.
Each key is mapped into some number in the range 0 to tablesize-1 and placed in
the appropriate cell. In the following example, tablesize is 5, i.e., indices 0 to 4:
21 % 5 = 1 → key 21 is placed in cell 1
32 % 5 = 2 → key 32 is placed in cell 2
18 % 5 = 3 → key 18 is placed in cell 3
Hash function:
A hash function is a key to address transformation which acts upon a given key to
compute the relative position of the key in an array.
The choice of hash function should be simple and it must distribute the data evenly.
Common methods of defining a hash function:
1. Division method: The hash function depends upon the remainder of division.
2. Mid square: In the mid square method, the key is squared and the middle or mid part of
the result is used as the index.
Consider placing a record with key 3111 in a hash table of size 1000:
3111² = 9678321
H(3111) = 783 (the middle 3 digits)
3. Digit Folding: The key is divided into separate parts, and using some simple operation
these parts are combined to produce the hash key.
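The three methods can be sketched as simple Python functions (the digit counts and table sizes are illustrative assumptions):

```python
def division_hash(key, table_size):
    # Division method: the remainder of the key
    return key % table_size

def mid_square_hash(key, digits=3):
    # Mid-square method: the middle digits of the key squared
    squared = str(key * key)
    mid = len(squared) // 2
    start = mid - digits // 2
    return int(squared[start:start + digits])

def folding_hash(key, part_size=2, table_size=100):
    # Digit-folding method: split the key into parts and add them
    s = str(key)
    parts = [int(s[i:i + part_size]) for i in range(0, len(s), part_size)]
    return sum(parts) % table_size

print(division_hash(21, 5))     # 1
print(mid_square_hash(3111))    # 783 (middle 3 digits of 9678321)
print(folding_hash(123456))     # (12 + 34 + 56) % 100 = 2
```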
Collision:
A collision occurs when two different keys hash to the same index in the table.
Separate Chaining:
Separate chaining is a collision resolution technique to keep the list of all elements that
hash to the same value. This is called separate chaining because each hash table element
is a separate chain (linked list). Each linked list contains all the elements whose keys hash
to the same index.
More number of elements can be inserted as it uses linked lists. For ex, insert
18,54,28,25,41,38,36,12,90.
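A minimal separate-chaining sketch using Python lists as the chains, with the keys listed above:

```python
# Each bucket of the table is a Python list acting as the chain.
TABLE_SIZE = 10
table = [[] for _ in range(TABLE_SIZE)]

def insert(key):
    table[key % TABLE_SIZE].append(key)   # colliding keys share a bucket

for k in [18, 54, 28, 25, 41, 38, 36, 12, 90]:
    insert(k)

print(table[8])   # [18, 28, 38] -- keys 18, 28 and 38 all hash to index 8
```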
In the worst case, operations on an individual bucket take time proportional to the
size of the bucket. Assuming we use a good hash function to index the n items of our map
in a bucket array of capacity N, the expected size of a bucket is n/N.
Therefore, if given a good hash function, the core map operations run in O(n/N) time.
The ratio λ = n/N, called the load factor of the hash table, should be bounded by a small
constant, preferably below 1. As long as λ is O(1), the core operations on the hash table
run in O(1) expected time.
Open addressing:
Open addressing requires that the load factor is always at most 1 and that items are
stored directly in the cells of the bucket array itself.
(i) Linear probing - With this approach, if we try to insert an item (k,v) into a bucket A[j]
that is already occupied, where j = h(k), then we next try A[(j+1) mod N]. If A[(j+1) mod N]
is also occupied, then we try A[(j+2) mod N], and so on, until we find an empty bucket that
can accept the new item. Once this bucket is located, we simply insert the item there.
Example: Insert 89, 18, 49, 58, 69 into a hash table of size 10. 89 and 18 go directly
to indices 9 and 8. 49 collides with 89 at index 9 and wraps around to index 0; 58
probes 9 and 0 and lands at index 1; 69 probes 0 and 1 and lands at index 2.
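The probing loop can be sketched as follows (the helper name and the sample keys
are illustrative, not from the text):

```python
def linear_probe_insert(table, key):
    # Probe A[j], A[(j+1) % N], ... until an empty (None) cell is found.
    n = len(table)
    j = key % n
    for i in range(n):
        slot = (j + i) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 10
for key in [89, 18, 49, 58, 69]:
    linear_probe_insert(table, key)
print(table)
```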
(ii) Quadratic probing - Another open addressing strategy, known as quadratic probing,
iteratively tries the buckets A[(h(k) + f(i)) mod N], for i = 0, 1, 2, ..., where f(i) = i², until
finding an empty bucket. As with linear probing, the quadratic probing strategy complicates
the removal operation, but it does avoid the kinds of clustering patterns that occur with
linear probing.
Example: Insert 55, 22, 90, 37, 49, 17, 87 into a hash table of size 11. 22 collides with
55 at index 0, so quadratic probing tries (22 + 1²) % 11 = 1, which is empty.
index: 0  1  2  3  4  5  6  7  8  9  10
key:  55 22 90  -  37 49 17  -  -  -  87
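The same idea in code, changing only the probe step to f(i) = i² (the helper name is
an assumption for illustration):

```python
def quadratic_probe_insert(table, key):
    # Probe A[(h(k) + i*i) % N] for i = 0, 1, 2, ...
    n = len(table)
    h = key % n
    for i in range(n):
        slot = (h + i * i) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty bucket found")

table = [None] * 11
for key in [55, 22, 90, 37, 49, 17, 87]:
    quadratic_probe_insert(table, key)
print(table)
```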
(iii) Double hashing - in which f(i) = i · hash2(X). This formula says that we apply a second
hash function to X and probe at distances hash2(X), 2·hash2(X), and so on.
In this approach, we choose a secondary hash function, hash2, and if the primary hash
function maps some key k to a bucket A[h(k)] that is already occupied, then we iteratively
try the buckets A[(h(k) + f(i)) mod N] next, for i = 1, 2, 3, ..., where f(i) = i · hash2(k). In this
scheme, the secondary hash function is not allowed to evaluate to zero; a common choice
is hash2(k) = q − (k mod q), for some prime number q < N. Also, N should be a prime.
A function such as hash2(X) = R − (X mod R), with R a prime smaller than the table size,
works well.
Example:
Insert 37, 90, 55, 22, 14 into a hash table of size 7 using double hashing with
hash2(X) = 5 − (X mod 5). 37 and 90 hash to indices 2 and 6. 55 collides with 90 at
index 6; hash2(55) = 5, so it is placed at (6 + 5) mod 7 = 4. 22 and 14 go to indices 1 and 0.
index: 0  1  2  3  4  5  6
key:  14 22 37  -  55  -  90
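A sketch of double hashing with the secondary function hash2(k) = q − (k mod q)
and q = 5 (the helper name is an assumption for illustration):

```python
def double_hash_insert(table, key, q=5):
    # Secondary hash h2(k) = q - (k mod q); the probe step is i * h2(k).
    n = len(table)
    step = q - (key % q)
    for i in range(n):
        slot = (key + i * step) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty bucket found")

table = [None] * 7
for key in [37, 90, 55, 22, 14]:
    double_hash_insert(table, key)
print(table)
```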
Linear probing has the best cache performance, but suffers from clustering. One
more advantage of linear probing is that it is easy to compute.
Quadratic probing lies between the two in terms of cache performance and
clustering.
Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.
2 Mark Questions with Answers
1. Define Sorting.
Sorting is the process of arranging the elements of a list in a particular order, either
ascending or descending.
2. What is searching?
It is a process of locating an element stored in a file or array.
Different searching methods are,
a. Linear Search(or Sequential Search)
b. Binary Search
3. What is the advantage of the linear search method?
It is simple and useful when the elements to be searched are not in any definite
order.
4. Specify the space complexity of different sorting algorithms.
Bubble sort, selection sort and insertion sort are in-place and require only O(1)
auxiliary space; merge sort requires O(n) auxiliary space; quick sort requires O(log n)
stack space on average.
9. Define Hashing.
Hashing is the process of mapping a large amount of data to a smaller table with
the help of a hash function. The modulo operator is commonly used to get the index
from the actual data/information.
The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value.
Each key is mapped into some number in the range 0 to tablesize-1 and placed in
the appropriate cell. In the following example, tablesize is 5, i.e., indices 0 to 4.
index 0: empty
index 1: 21 (21 % 5 = 1)
index 2: 32 (32 % 5 = 2)
index 3: 18 (18 % 5 = 3)
index 4: empty
A hash function is a key to address transformation which acts upon a given key to
compute the relative position of the key in an array.
The choice of hash function should be simple and it must distribute the data evenly.
12. Write the importance of hashing.
Hashing supports lookups, insertions and deletions in O(1) expected time,
independent of the number of stored elements, whereas linear search and binary
search take O(n) and O(log n) time respectively.
13. What do you mean by collision in hashing? Name some collision resolution
techniques.
A collision occurs when two different keys hash to the same index of the hash table.
Collision resolution techniques include separate chaining and open addressing
(linear probing, quadratic probing and double hashing).
Separate chaining is a collision resolution technique to keep the list of all elements
that hash to the same value. This is called separate chaining because each hash table
element is a separate chain (linked list). Each linked list contains all the elements whose
keys hash to the same index.
More elements than the table size can be inserted, since each slot holds a linked list.
For example, insert 12, 17, 22, 24 into a table of size 5:
12 % 5 = 2, 17 % 5 = 2, 22 % 5 = 2, 24 % 5 = 4
index 0: empty
index 1: empty
index 2: 12 -> 17 -> 22
index 3: empty
index 4: 24
16. List some advantages and disadvantages of separate chaining.
Advantage:
1. Simple to implement.
2. Hash table never fills up, we can always add more elements to chain.
3. Less sensitive to the hash function or load factors.
4. It is mostly used when it is unknown how many and how frequently keys may be
inserted or deleted.
Disadvantages of separate chaining.
1. Cache performance of chaining is not good as keys are stored using linked list.
Open addressing provides better cache performance as everything is stored in same table.
2. Wastage of Space (Some Parts of hash table are never used)
3. If the chain becomes long, then search time can become O(n) in worst case.
4. Uses extra space for links.
In the linear probing collision resolution strategy, even if the table is relatively empty,
blocks of occupied cells start forming. This effect is known as primary clustering: any
key that hashes into the cluster will require several attempts to resolve the collision,
and will then add to the cluster.
19. What are the types of collision resolution strategies in open addressing?
The strategies are linear probing, quadratic probing and double hashing. If a collision
occurs, alternative cells are tried until an empty cell is found. In linear probing the
alternative cells are searched sequentially; in quadratic probing the alternative cells
are calculated using the formula F(i) = i².
Linear probing has the best cache performance, but suffers from clustering. One
more advantage of linear probing is that it is easy to compute.
Quadratic probing lies between the two in terms of cache performance and
clustering.
Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.
Although quadratic probing eliminates primary clustering, elements that hash to the
same position will probe the same alternative cells. This is known as secondary clustering.
Rehashing builds another table that is about twice as big, with an associated new
hash function, and scans down the entire original hash table, computing the new hash
value for each element and inserting it in the new table.
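The rehashing operation just described can be sketched as follows (linear probing
and the small prime helper are assumptions for illustration):

```python
def next_prime(n):
    # Smallest prime >= n (adequate for small table sizes).
    def is_prime(m):
        return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))
    while not is_prime(n):
        n += 1
    return n

def rehash(old_table):
    # Build a table about twice as big and re-insert every key.
    new_size = next_prime(2 * len(old_table))
    new_table = [None] * new_size
    for key in old_table:
        if key is not None:
            j = key % new_size
            while new_table[j] is not None:   # linear probing
                j = (j + 1) % new_size
            new_table[j] = key
    return new_table

old = [None, 15, 23, 24, None, 13, 6]   # a size-7 table getting full
new = rehash(old)
print(len(new))   # 17, the first prime >= 14
```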
Advantage:
Disadvantage: rehashing slows the program down, since the entire table must be
scanned and rebuilt.
22. What is the need for extendible hashing?
If either open addressing hashing or separate chaining hashing is used, the major
problem is that collisions could cause several blocks to be examined during a Find, even for
a well-distributed hash table. Extendible hashing allows a find to be performed in two disk
accesses. Insertions also require few disk accesses.
26.List out the applications of hashing.
A linear search scans one item at a time, without jumping to any item. Its worst case
complexity is O(n), and the time taken to search keeps increasing as the number of
elements increases.
When storing a large amount of data, linear search and binary search perform
lookups with time complexity of O(n) and O(log n) respectively. As the size of the
dataset increases, these complexities become significantly high, which is not
acceptable.
We need a technique that does not depend on the size of data. Hashing allows
lookups to occur in constant time i.e. O(1).
UNIT IV TREE STRUCTURES
Tree ADT – Binary Tree ADT – tree traversals – binary search trees – AVL trees
– heaps – multiway search trees.
Tree ADT:
A tree is an abstract data type that stores elements hierarchically. With the exception
of the top element, each element in a tree has a parent element and zero or more children
elements.
A tree is usually visualized by placing elements inside ovals or rectangles, and by
drawing the connections between parents and children with straight lines.
Formal Tree Definition
Formally, we define a tree T as a set of nodes storing elements such that the nodes
have a parent-child relationship that satisfies the following properties:
If T is nonempty, it has a special node, called the root of T, which has no parent.
Each node v of T different from the root has a unique parent node w; every node with
parent w is a child of w.
Node Relationships
A node v is external, if v has no children. External nodes are also known as leaves.
A node v is internal if it has one or more children.
A node u is an ancestor of a node v, if u = v or u is an ancestor of the parent of v.
Conversely, we say that a node v is a descendant of a node u if u is an ancestor of
v.
A tree is ordered if there is a meaningful linear order among the children of each
node;
Path: Path refers to the sequence of nodes along the edges of a tree.
Root: The node at the top of the tree is called root. There is only one root per tree
and one path from the root node to any node.
Parent: Any node except the root node has one edge upward to a node called
parent.
Child: The node below a given node connected by its edge downward is called its
child node.
Sub tree: Sub tree represents the descendants of a node.
Traversing: Traversing means passing through nodes in a specific order.
Levels: The level of a node represents the generation of the node. If the root node is
at level 0, then its next child node is at level 1, its grandchild is at level 2, and so on.
Keys: Key represents a value of a node based on which a search operation is to be
carried out for a node.
Siblings: All the nodes that share the same parent are called siblings.
Depth: The depth of a node N is the length of the path from the root to the node N.
Height: The height of a node N is the length of the path from the node to the deepest
leaf.
Properties of Tree:
Every tree has a special node called the root node. The root node can be used to
traverse every node of the tree. It is called root because the tree originated from root
only.
If a tree has N vertices (nodes) then the number of edges is always one less than the
number of nodes (vertices), i.e., N-1. If it has more than N-1 edges it is called a graph,
not a tree.
Every child has only a single parent, but a parent can have multiple children.
Example
def depth(self, p):
    if self.is_root(p):
        return 0
    else:
        return 1 + self.depth(self.parent(p))
Height
The height of a position p in a tree T is also defined recursively:
If p is a leaf, then the height of p is 0.
Otherwise, the height of p is one more than the maximum of the heights of p’s
children. The height of a nonempty tree T is the height of the root of T.
def height(self, p):
    if self.is_leaf(p):
        return 0
    else:
        return 1 + max(self.height(c) for c in self.children(p))
Types of Tree:
1. Binary Tree
Binary tree is the type of tree in which each parent can have at most two children.
The children are referred to as left child or right child.
3. AVL Tree:
AVL tree is a self-balancing binary search tree. In AVL tree, the heights of children of
a node differ by at most 1. The valid balancing factor in AVL tree are 1, 0 and -1. When a
new node is added to the AVL tree and tree becomes unbalanced then rotation is done to
make sure that the tree remains balanced.
4. B-tree
B-tree is another self-balancing search tree that comprises many nodes to keep data
stored in a particular order. Each node has over two child nodes and each node comprises
multiple keys. B-trees are compatible with file systems and databases that can write and
read larger blocks of data.
5. N-ary Tree:
In an N-ary tree, the maximum number of children that a node can have is limited to
N. A binary tree is 2-ary tree as each node in binary tree has at most 2 children. Trie data
structure is one of the most commonly used implementation of N-ary tree. A full N-ary tree is
a tree in which children of a node is either 0 or N. A complete N-ary tree is the tree in which
all the leaf nodes are at the same level.
Advantages of Tree:
The tree reflects structural relationships in the data.
The tree is used to represent hierarchies.
It offers an efficient search and insertion procedure.
The trees are flexible. This allows subtrees to be relocated with minimal effort.
Tree Traversal:
Traversal is a process of visiting all the nodes of a tree and may print their values
too. Because all nodes are connected via edges (links), we always start from the root
(head) node; that is, we cannot randomly access a node in a tree. There are three
ways in which we traverse a tree −
• In-order Traversal
• Pre-order Traversal
• Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to
print all the values it contains.
In-order Traversal
In this traversal method, the left sub tree is visited first, then the root and later the
right sub-tree. We should always remember that every node may represent a sub tree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in an
ascending order.
We start from A, and following in-order traversal, we move to its left subtree B. B is
also traversed in-order. The process goes on until all the nodes are visited. The
output of in-order traversal of this tree will be −
D→B→E→A→F→C→G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
Python Code for inorder traversal:
def Inorder(self):
    if self.left:
        self.left.Inorder()
    print(self.data)
    if self.right:
        self.right.Inorder()
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally
the right subtree.
We start from A, and following pre-order traversal, we first visit A itself and then move
to its left subtree B. B is also traversed pre-order. The process goes on until all the
nodes are visited. The output of pre-order traversal of this tree will be −
A → B → D → E → C → F →G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
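The three steps above can be sketched as a method on a minimal node class (the
class and the sample tree are assumed for illustration):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

    def Preorder(self, out):
        # Visit the root first, then the left and right subtrees.
        out.append(self.data)
        if self.left:
            self.left.Preorder(out)
        if self.right:
            self.right.Preorder(out)

# Tree from the figure: A with children B (D, E) and C (F, G).
root = Node("A")
root.left, root.right = Node("B"), Node("C")
root.left.left, root.left.right = Node("D"), Node("E")
root.right.left, root.right.right = Node("F"), Node("G")
result = []
root.Preorder(result)
print(" -> ".join(result))   # A -> B -> D -> E -> C -> F -> G
```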
Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we
traverse the left subtree, then the right subtree and finally the root node.
We start from A and, following post-order traversal, we first visit the left subtree B. B is
also traversed post-order. The process goes on until all the nodes are visited. The
output of post-order traversal of this tree will be −
D→E→B→F→G→C→A
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
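The steps above can be sketched in the same way (the node class and the sample
tree are assumed for illustration):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

    def Postorder(self, out):
        # Visit the left subtree, then the right subtree, then the root.
        if self.left:
            self.left.Postorder(out)
        if self.right:
            self.right.Postorder(out)
        out.append(self.data)

# Tree from the figure: A with children B (D, E) and C (F, G).
root = Node("A")
root.left, root.right = Node("B"), Node("C")
root.left.left, root.left.right = Node("D"), Node("E")
root.right.left, root.right.right = Node("F"), Node("G")
result = []
root.Postorder(result)
print(" -> ".join(result))   # D -> E -> B -> F -> G -> C -> A
```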
Binary Tree:
A binary tree is an ordered tree with the following properties:
1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the order of children of a node.
The subtree rooted at a left or right child of an internal node v is called a left subtree
or right subtree, respectively, of v.
Figures: Complete Binary Tree, Full Binary Tree, Perfect Binary Tree
Decision trees:
A binary tree can represent the different outcomes that result from answering a
series of yes-or-no questions. Each internal node is associated with a question. Starting at
the root, go to the left or right child of the current node, depending on whether the answer
to the question is “Yes” or “No.” Such binary trees are known as decision trees.
Decision Trees
Arithmetic expression tree:
An arithmetic expression can be represented by a binary tree whose leaves are
associated with variables or constants, and whose internal nodes are associated with one
of the operators +, −, ×, and /. Such a tree is called an arithmetic expression tree.
Properties of Binary Tree
Example:
Algorithm:
def height(self, root):
    if root is None:
        return 0
    l = self.height(root.left)
    r = self.height(root.right)
    return max(l, r) + 1
Parent of 2 is 1. Parent of 5 is 2.
Algorithm:
def parent(self, data):
    if self.data == data:
        print(data, "is the root")
    elif (self.left is not None and self.left.data == data) or \
         (self.right is not None and self.right.data == data):
        print(self.data)
    elif data < self.data and self.left is not None:
        self.left.parent(data)
    elif data > self.data and self.right is not None:
        self.right.parent(data)
    else:
        print("No such data")
Inserting a New node:
For inserting a node in a binary tree you will have to check the following conditions:
If a node in the binary tree does not have its left child, then insert the given node (the
one that we have to insert) as its left child.
If a node in the binary tree does not have its right child then insert the given node as
its right child.
If the above-given conditions do not apply then search for the node which does not
have a child at all and insert the given node there.
Example:
Algorithm:
def insert(self, data):
    if self.data:
        if data < self.data:
            if self.left is None:
                self.left = Node(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = Node(data)
            else:
                self.right.insert(data)
    else:
        self.data = data
Finding / Searching a node:
Procedure:
a) It checks whether the root is null, which means the tree is empty.
b) If the tree is not empty, it will compare root’s data with value. If they are equal, it
will set the flag to true and return.
c) Traverse left subtree by calling searchNode() recursively and check whether the
value is present in left subtree.
d) Traverse right subtree by calling searchNode() recursively and check whether the
value is present in the right subtree.
Algorithm:
def search(self, data):
    if self.data == data:
        print("Data found")
    elif data < self.data and self.left is not None:
        self.left.search(data)
    elif data > self.data and self.right is not None:
        self.right.search(data)
    else:
        print("Not Found")
Binary Search Tree:
A binary tree whose elements are stored in an ordered fashion is called a binary
search tree. A binary search tree for a set S is a binary tree T such that, for each position
p of T, the keys in the left subtree of p are less than the key at p, and the keys in the right
subtree of p are greater than or equal to the key at p.
Binary search tree hierarchically represents the sorted order of its keys. An inorder
traversal of a binary search tree visits positions in increasing order of their keys.
Algorithm for Searching:
def search(self, data):
    if self.data == data:
        print("Data found")
    elif data < self.data and self.left is not None:
        self.left.search(data)
    elif data > self.data and self.right is not None:
        self.right.search(data)
    else:
        print("Not Found")
Algorithm for Binary Search Tree Insertion
def insert(self, data):
    if self.data:
        if data < self.data:
            if self.left is None:
                self.left = BinarySearchTree(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = BinarySearchTree(data)
            else:
                self.right.insert(data)
    else:
        self.data = data
Figure: Before and After deleting 7
Algorithm for Deletion in BST:
def delete(root, value):
    # returns the new root of the subtree after removing 'value'
    if root is None:
        return root
    if value < root.value:
        root.left = delete(root.left, value)
    elif value > root.value:
        root.right = delete(root.right, value)
    else:
        if root.left is None:
            return root.right
        elif root.right is None:
            return root.left
        # two children: copy the in-order successor (smallest value
        # in the right subtree), then delete that successor
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.value = succ.value
        root.right = delete(root.right, succ.value)
    return root
class BinarySearchTree:
    def __init__(self, data):
        self.left = None
        self.data = data
        self.right = None
        self.root = self.data

    # 4,3,5,1,2
    def insert(self, data):
        if self.data:
            if data < self.data:
                if self.left is None:
                    self.left = BinarySearchTree(data)
                else:
                    self.left.insert(data)
            elif data > self.data:
                if self.right is None:
                    self.right = BinarySearchTree(data)
                else:
                    self.right.insert(data)
        else:
            self.data = data

    def findMin(self):
        if self.data:
            if self.left is not None:
                self.left.findMin()
            else:
                print(self.data)
        else:
            print("Tree Not Found")

    def findMax(self):
        if self.data:
            if self.right is not None:
                self.right.findMax()
            else:
                print(self.data)
        else:
            print("Tree Not Found")

    def parent(self, data):
        if self.data == data:
            print(data, "is the root")
        elif (self.left is not None and self.left.data == data) or \
             (self.right is not None and self.right.data == data):
            print(self.data)
        elif data < self.data and self.left is not None:
            self.left.parent(data)
        elif data > self.data and self.right is not None:
            self.right.parent(data)
        else:
            print("No such data")

    def inorder(self):
        if self.data is not None:
            if self.left is not None:
                self.left.inorder()
            print(self.data)
            if self.right is not None:
                self.right.inorder()
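A compact, self-contained sketch of the same insert-and-traverse logic, using the
key sequence 4, 3, 5, 1, 2 (the shortened class name and list-collecting traversal are
choices made for illustration):

```python
class BST:
    def __init__(self, data):
        self.data, self.left, self.right = data, None, None

    def insert(self, data):
        # smaller keys go left, larger keys go right
        if data < self.data:
            if self.left is None:
                self.left = BST(data)
            else:
                self.left.insert(data)
        elif data > self.data:
            if self.right is None:
                self.right = BST(data)
            else:
                self.right.insert(data)

    def inorder(self, out):
        # left subtree, node, right subtree: yields keys in sorted order
        if self.left:
            self.left.inorder(out)
        out.append(self.data)
        if self.right:
            self.right.inorder(out)

t = BST(4)
for key in (3, 5, 1, 2):
    t.insert(key)
result = []
t.inorder(result)
print(result)   # [1, 2, 3, 4, 5]
```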
AVL Tree
A tree is called an AVL tree if each node of the tree possesses one of the following
properties:
A node is called left heavy if the longest path in its left subtree is one longer than the
longest path of its right subtree
A node is called right heavy if the longest path in the right subtree is one longer than
the path in its left subtree
A node is called balanced if the longest path in both the right and left subtree are
equal.
AVL tree is a height-balanced tree where the difference between the heights of the
right subtree and left subtree of every node is either -1, 0 or 1. The difference between the
heights of the subtrees is maintained by a factor called the balance factor. Therefore, we
can define an AVL tree as a balanced binary search tree where the balance factor of every
node in the tree is either -1, 0, or +1. Here, the balance factor is calculated by the formula:
balance factor = height(left subtree) − height(right subtree)
As AVL is the height-balanced tree, it helps to control the height of the binary search
tree and further help the tree to prevent skewing. When the binary tree gets skewed, the
running time complexity becomes the worst-case scenario, i.e., O(n), but in the case of the
AVL tree, the time complexity remains O(log n). Therefore, it is always advisable to use an AVL
tree rather than a binary search tree.
Every AVL Tree is a binary search tree but every Binary Search Tree need not be
AVL Tree.
AVL Rotation
When certain operations like insertion and deletion are performed on the AVL tree,
the balance factor of the tree may get affected. If after the insertion or deletion of the
element, the balance factor of any node is affected then this problem is overcome by using
rotation. Therefore, rotation is used to restore the balance of the search tree. Rotation is
the method of moving the nodes of the tree either to the left or to the right to make the
tree a height-balanced tree.
There are two categories of rotation, each of which is further divided into two parts:
1) Single Rotation
Single rotation switches the roles of the parent and child while maintaining the search
order. We rotate the node and its child, the child becomes a parent.
#Python code for the two single rotations
def rRotate(self, z):
    # right rotation: the left child y becomes the new subtree root
    y = z.left
    z.left, y.right = y.right, z
    z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
    y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
    return y

def lRotate(self, z):
    # left rotation: the right child y becomes the new subtree root
    y = z.right
    z.right, y.left = y.left, z
    z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
    y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
    return y
2) Double Rotation
A single rotation does not fix the LR and RL cases. For these, we require a double
rotation involving three nodes. Therefore, a double rotation is equivalent to a sequence
of two single rotations.
LR(Left-Right) Rotation
The LR rotation is the process where we perform a single left rotation followed by a
single right rotation. Therefore, first, every node moves towards the left and then the node of
this new tree moves one position towards the right. Let us see the below example
RL (Right-Left) Rotation
The RL rotation is the process where we perform a single right rotation followed by a
single left rotation. Therefore, first, every node moves towards the right and then the node of
this new tree moves one position towards the left. Let us see the below example
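Both double rotations can be sketched as two single rotations in sequence (the
minimal node class, the helper names, and the sample keys below are assumptions
for illustration):

```python
class N:
    # hypothetical minimal AVL node; height is recomputed after rotations
    def __init__(self, value):
        self.value, self.left, self.right, self.height = value, None, None, 1

def h(n):
    return n.height if n else 0

def right_rotate(z):
    y = z.left
    z.left, y.right = y.right, z
    z.height = 1 + max(h(z.left), h(z.right))
    y.height = 1 + max(h(y.left), h(y.right))
    return y

def left_rotate(z):
    y = z.right
    z.right, y.left = y.left, z
    z.height = 1 + max(h(z.left), h(z.right))
    y.height = 1 + max(h(y.left), h(y.right))
    return y

def lr_rotate(z):
    # LR case: left-rotate the left child, then right-rotate the node
    z.left = left_rotate(z.left)
    return right_rotate(z)

def rl_rotate(z):
    # RL case: right-rotate the right child, then left-rotate the node
    z.right = right_rotate(z.right)
    return left_rotate(z)

# LR example: 30 with left child 10 whose right child is 20.
z = N(30); z.left = N(10); z.left.right = N(20)
root = lr_rotate(z)
print(root.value, root.left.value, root.right.value)   # 20 10 30
```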
Operations In AVL Tree
There are 2 major operations performed on the AVL tree
1. Insertion Operation
2. Deletion Operation
1. Find the appropriate empty subtree where the new value should be added by
comparing the values in the tree
2. Create a new node at the empty subtree
3. The new node is a leaf and thus will have a balance factor of zero
4. Return to the parent node and adjust the balance factor of each node through the
rotation process and continue it until we are back at the root. Remember that the
modification of the balance factor must happen in a bottom-up fashion
Example:
The root node is added as shown in the below figure
The next node is added to the root node as shown below. Here the tree is balanced
Then, The right child is added to the parent node. Here, the balance factor of the tree is
changed, therefore, the LL rotation is performed and the tree becomes a balanced tree
Later, one more right child is added to the new tree as shown below
Again further, one more right child is added and the balance factor of the tree is changed.
Therefore, again LL rotation is performed on the tree and the balance factor of the tree is
restored as shown in the below figure
if b > 1 and key < root.left.value:
    return self.rRotate(root)
if b < -1 and key > root.right.value:
    return self.lRotate(root)
if b > 1 and key > root.left.value:
    root.left = self.lRotate(root.left)
    return self.rRotate(root)
if b < -1 and key < root.right.value:
    root.right = self.rRotate(root.right)
    return self.lRotate(root)
return root
Deletion Operation In AVL
The deletion operation in the AVL tree is the same as the deletion operation in BST.
In the AVL tree, the node is always deleted as a leaf node and after the deletion of the node,
the balance factor of each node is modified accordingly. Rotation operations are used to
modify the balance factor of each node. The algorithm steps of deletion operation in an AVL
tree are:
Example:
Let us consider the below AVL tree with the given balance factor as shown in the
figure below
Here, we have to delete the node '25' from the tree. As the node to be deleted does
not have any child node, we will simply remove the node from the tree
After removal of the node, the balance factor of the tree is changed and, therefore,
rotation is performed to restore the balance factor of the tree and create a perfectly
balanced tree.
class treeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 1

class AVLTree:
    # (rebalancing step at the end of the delete method)
    b = self.getBal(root)
    if b > 1 and key < root.left.value:
        return self.rRotate(root)
    if b < -1 and key > root.right.value:
        return self.lRotate(root)
    if b > 1 and key > root.left.value:
        root.left = self.lRotate(root.left)
        return self.rRotate(root)
    if b < -1 and key < root.right.value:
        root.right = self.rRotate(root.right)
        return self.lRotate(root)
    return root
def preOrder(self, root):
    if not root:
        return
    print("{0} ".format(root.value), end="")
    self.preOrder(root.left)
    self.preOrder(root.right)
Tree = AVLTree()
root = None
root = Tree.insert(root, 1)
root = Tree.insert(root, 2)
root = Tree.insert(root, 3)
root = Tree.insert(root, 4)
root = Tree.insert(root, 5)
root = Tree.insert(root, 6)
Heap:
A heap is a data structure that follows the complete binary tree property and satisfies
the heap property. Therefore, it is also known as a binary heap. As we all know, a
complete binary tree is a tree with every level filled and all the nodes as far left as
possible; only the last level may be partially filled.
In the heap data structure, we assign key-value or weight to every node of the tree.
Now, the root node key value is compared with the children’s nodes and then the tree is
arranged accordingly into two categories i.e., max-heap and min-heap.
Heapify:
The process of creating a heap data structure using the binary tree is called Heapify.
The heapify process is used to create the Max-Heap or the Min-Heap.
Min Heap
When the value of each internal node is smaller than the values of its children nodes,
the tree satisfies the min-heap property. Also, in the min-heap, the value of the root node
is the smallest among all the other nodes of the tree. Therefore, if “a” has a child node “b”,
then key(a) ≤ key(b).
#Python Code
def min_heapify(A, k):
    l = left(k)
    r = right(k)
    if l < len(A) and A[l] < A[k]:
        smallest = l
    else:
        smallest = k
    if r < len(A) and A[r] < A[smallest]:
        smallest = r
    if smallest != k:
        A[k], A[smallest] = A[smallest], A[k]
        min_heapify(A, smallest)

def left(k):
    return 2 * k + 1

def right(k):
    return 2 * k + 2

def build_min_heap(A):
    n = int((len(A) // 2) - 1)
    for k in range(n, -1, -1):
        min_heapify(A, k)

A = [3, 9, 2, 1, 4, 5]
build_min_heap(A)
print(A)
Max Heap
When the value of each internal node is greater than the values of its children nodes,
the tree satisfies the max-heap property. Also, in the max-heap, the value of the root node
is the greatest among all the other nodes of the tree. Therefore, if “a” has a child node “b”,
then key(a) ≥ key(b).
#Python Code
def max_heapify(A, k):
    l = left(k)
    r = right(k)
    if l < len(A) and A[l] > A[k]:
        largest = l
    else:
        largest = k
    if r < len(A) and A[r] > A[largest]:
        largest = r
    if largest != k:
        A[k], A[largest] = A[largest], A[k]
        max_heapify(A, largest)

def left(k):
    return 2 * k + 1

def right(k):
    return 2 * k + 2

def build_max_heap(A):
    n = int((len(A) // 2) - 1)
    for k in range(n, -1, -1):
        max_heapify(A, k)

A = [3, 9, 2, 1, 4, 5]
build_max_heap(A)
print(A)
Time complexity
Each call to heapify costs O(log n), and building the heap makes O(n) such calls,
giving a simple overall bound of O(n log n); a more careful analysis shows that building
a heap bottom-up in fact takes only O(n) time.
Applications of Heap
Multiway Search Tree:
A multiway search tree or (2-4) Search tree is one with nodes that have two or more
children. Each internal nodes may have more than two children. Root can have maximum of
two children.
Properties:
Each internal node of T has at least two children. That is, each internal node
is a d-node such that d ≥ 2.
Each internal d-node w of T with children c1, ..., cd stores an ordered set of d−1
key-value pairs (k1, v1), ..., (kd−1, vd−1), where k1 ≤ ··· ≤ kd−1.
Let us conventionally define k0 = −∞ and kd = +∞. For each item (k,v) stored
at a node in the subtree of w rooted at ci, i = 1, ..., d, we have that ki−1 ≤ k ≤ ki.
Example:
(2,4)-Tree Operations:
A multiway search tree that keeps the secondary data structures stored at each node
small and also keeps the primary multiway tree balanced is the (2,4) tree, which is
sometimes called a 2-4 tree or 2-3-4 tree. This data structure achieves these goals by
maintaining two simple properties,
Size Property: Every internal node has at most four children.
Depth Property: All the external nodes have the same depth
Insertion in (2-4) Tree:
Deletion in (2-4) Tree:
2 Mark Questions with Answers
1. What is tree?
A tree is an abstract data type that stores elements hierarchically. With the exception
of the top element, each element in a tree has a parent element and zero or more children
elements.
2. What is sibling?
Two nodes that are children of the same parent are siblings.
6. Give the array representation of the given binary tree?
12. What is a strictly binary tree?
The binary tree, in which every non-leaf node has nonempty left and right sub-trees,
is called a strictly binary tree.
16. Write the in-order, pre-order, post-order and Breadth-First or Level Order Traversal
for the given tree.
Inorder : 3 7 8 6 11 2 5 4 9
Preorder : 2 7 3 6 8 11 5 9 4
Postorder : 3 8 11 6 7 4 9 5 2
18. What is the Degree of a node in a tree? What is the Degree of A and B for the given tree?
20. What is Height of a node in a tree? What is the height of E in the given node?
If ‘p’ is a leaf, then the height of p is 0.
Otherwise, the height of p is one more than the maximum of the heights of p’s
children.
The height of a nonempty tree T is the height of the root of T.
Binary Tree
Let p be a pointer to a node and x be the information. Now, the basic operations are:
i) info(p)
ii) father(p) / parent(p)
iii) left(p)
iv) right(p)
v) brother(p)
vi) isleft(p)
vii) isright(p)
32. What is a binary search tree?
A binary tree in which all the elements in the left sub-tree of a node n are less than
the contents of n, and all the elements in the right sub-tree of n are greater than or equal to
the contents of n is called a binary search tree.
33. How can you say a recursive procedure is more efficient than a non-recursive one?
The automatic stacking and unstacking of activation records make it concise and
efficient, and there are no extraneous parameters and local variables used.
38. What are the applications of Heap?
Heap-implemented priority queues are used in graph algorithms like Prim's
algorithm and Dijkstra's algorithm.
Order statistics: The Heap data structure can be used to efficiently find the kth
smallest (or largest) element in an array.
Priority Queues: Priority queues can be efficiently implemented using Binary Heap
because it supports insert(), delete() and extractmax(), decreaseKey() operations in
O(logn) time.
39. In a binary max heap containing n numbers, the smallest element can be found in
O(n) time.
Time complexity: O(n). In a max heap, the smallest element is always present at a
leaf node, so we need to check all leaf nodes for the minimum value. The worst-case
complexity is therefore O(n).
heapify(iterable) :- This function is used to convert the iterable into a heap data
structure, i.e., into heap order.
heappush(heap, ele) :- This function is used to insert the element mentioned in its
arguments into the heap. The order is adjusted so that the heap structure is maintained.
heappop(heap) :- This function is used to remove and return the smallest element
from the heap. The order is adjusted so that the heap structure is maintained.
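A short example of these three functions from Python's built-in heapq module (a min-heap on a list):

```python
import heapq

data = [5, 1, 9, 3]
heapq.heapify(data)             # rearrange the list into heap order in place
heapq.heappush(data, 0)         # insert 0 while preserving the heap property
smallest = heapq.heappop(data)  # remove and return the smallest element (0)
```

After the pop, the next smallest element (1) sits at the root of the heap, i.e. at data[0].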
Max Heapify
UNIT V - GRAPH STRUCTURES
Graph ADT – representations of graph – graph traversals – DAG – topological
ordering – shortest paths – minimum spanning trees.
Graphs:
A graph is a way of representing relationships that exist between pairs of objects.
That is, a graph is a set of objects, called vertices, together with a collection of pairwise
connections between them, called edges. It can also be represented as G = (V, E).
Vertex − Each node of the graph is represented as a vertex. In the following example, the
labelled circles represent vertices. Thus, A to E are vertices.
Edge − An edge represents a path between two vertices, or a line between two vertices. In
the following example, the lines from A to B, B to E, and so on represent edges.
Adjacency − Two nodes or vertices are adjacent if they are connected to each other through
an edge. In the following example, B is adjacent to A, D is adjacent to B, and so on.
Path − A path represents a sequence of edges between two vertices. In the following
example,
A–B–D–E represents a path from A to E.
Directed path − a path such that all edges are directed and are traversed along their
direction.
Length − The number of edges in a path is called the length of the path in a graph. For
example, the length of the path from A to D in the above graph is 2 because it contains the
two edges (A,B) and (B,D).
Subgraph − A subgraph of a graph G is a graph H whose vertices and edges are subsets of
the vertices and edges of G, respectively.
Tree − A tree is a connected forest, that is, a connected graph without cycles.
Types of Graphs
1. Directed Graph
A directed graph is a graph in which the edges are directed; each edge is unidirectional. In a
directed graph, the edge (A,C) is not the same as (C,A). It is also called a digraph.
2. Undirected Graph
An undirected graph is a graph in which the edges are undirected; each edge is bidirectional.
In an undirected graph, (A,C) = (C,A).
3. Weighted Graph
A weighted graph is a graph in which each edge is assigned a weight or value.
This value is considered the cost/distance of traversing from one vertex to another.
A weighted graph can be either directed or undirected.
4. Complete Graph
A complete graph is a graph in which there is an edge between each pair of vertices, so
there is a path from each vertex to every other vertex. A complete graph with n vertices
has n(n−1)/2 edges.
5. Cyclic Graph
A cyclic graph is a graph that has cycles. A cycle is a path that starts and ends at the
same vertex.
6. Acyclic Graph
An acyclic graph is a graph that does not have any cycles. A directed graph without cycles
is called a Directed Acyclic Graph (DAG).
1. Edge List:
It maintains an unordered list of all edges, but there is no efficient way to locate a
particular edge (u,v), or the set of all edges incident to a vertex v.
Performance of the Adjacency List Structure:
It uses O(n + m) space to represent a graph with n vertices and m edges.
Each individual vertex or edge instance uses O(1) space.
The vertex count and edge count methods run in O(1) time.
The methods vertices and edges run in O(n) and O(m) time, respectively.
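A minimal adjacency-list graph along these lines might look as follows. The class is an illustrative sketch, not the book's full implementation; here edge_count recomputes the total, although an O(1) counter could be kept as the performance summary assumes:

```python
class Graph:
    # minimal adjacency-list representation of an undirected graph
    def __init__(self):
        self.adj = {}               # vertex -> list of neighbouring vertices

    def add_vertex(self, v):
        self.adj.setdefault(v, [])

    def add_edge(self, u, v):
        self.add_vertex(u)
        self.add_vertex(v)
        self.adj[u].append(v)       # store the edge at both endpoints
        self.adj[v].append(u)

    def vertex_count(self):         # O(1)
        return len(self.adj)

    def edge_count(self):           # O(n) here; O(1) if a counter is kept
        return sum(len(ns) for ns in self.adj.values()) // 2
```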
4. Adjacency Matrix Structure:
It provides worst-case O(1) access to a specific edge (u,v) by maintaining an n × n matrix
for a graph with n vertices. Each entry is dedicated to storing a reference to the edge (u,v)
for a particular pair of vertices u and v; if no such edge exists, the entry will be None.
Graph Traversals:
A traversal is a systematic procedure for exploring a graph by examining all of its
vertices and edges. A traversal is efficient if it visits all the vertices and edges in time
proportional to their number, that is, in linear time.
Graph traversal shows the notion of reachability. Reachability in an undirected graph
G includes the following:
• Computing a path from vertex u to vertex v, or reporting that no such path exists.
• Given a start vertex s of G, computing, for every vertex v of G, a path with the
minimum number of edges between s and v, or reporting that no such path exists.
• Testing whether G is connected.
• Computing a spanning tree of G, if G is connected.
• Computing a cycle in G, or reporting that G has no cycles.
Algorithm DFS(G,u): {We assume u has already been marked as visited}
Input: A graph G and a vertex u of G
Output: A collection of vertices reachable from u, with their discovery edges
for each outgoing edge e = (u,v) of u do
    if vertex v has not been visited then
        Mark vertex v as visited (via edge e).
        Recursively call DFS(G,v).
Example:
Running Time of Depth-First Search:
incident_edges(v) takes O(deg(v)) time.
The e.opposite(v) method takes O(1) time.
Checking whether an edge has been explored takes O(1) time.
Hence a DFS traversal of a graph with n vertices and m edges runs in O(n + m) time.
Python Code:
def dfs(visited, graph, node):
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)
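The dfs function above can be exercised on a small sample graph stored as a dict of adjacency lists. The graph below is hypothetical, and this variant collects vertices into a set instead of printing them:

```python
def dfs(visited, graph, node):
    # recursive depth-first search; 'visited' is a set of seen vertices
    if node not in visited:
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# hypothetical sample graph as a dict of adjacency lists
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}
visited = set()
dfs(visited, graph, 'A')
```

Starting from 'A', every vertex of this graph is reachable, so visited ends up containing all four vertices.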
Breadth-First Search
Traversing a connected component of a graph level by level is known as a breadth-first
search (BFS).
Procedure:
A BFS proceeds in rounds and subdivides the vertices into levels.
BFS starts at vertex s, which is at level 0.
In the first round, we paint as “visited” all vertices adjacent to the start vertex s;
these vertices are one step away from the beginning and are placed into level 1.
In the second round, we allow all explorers to go two steps (i.e., edges) away from
the starting vertex. These new vertices, which are adjacent to level 1 vertices and not
previously assigned to a level, are placed into level 2 and marked as “visited.”
This process continues in similar fashion, terminating when no new vertices are
found in a level.
Python Code:
visited = []   # list of visited vertices
queue = []     # FIFO queue of vertices to explore

def bfs(visited, graph, node):
    visited.append(node)
    queue.append(node)
    while queue:
        s = queue.pop(0)
        print(s, end=" ")
        for neighbour in graph[s]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

bfs(visited, graph, 'A')
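A self-contained variant of the BFS code, using collections.deque for an efficient queue and returning the visit order instead of printing. The sample graph is hypothetical:

```python
from collections import deque

def bfs(graph, start):
    # iterative breadth-first search returning vertices in visit order
    visited = [start]
    queue = deque([start])
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)
        for neighbour in graph[s]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
    return order

# hypothetical sample graph as a dict of adjacency lists
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
```

Here bfs(graph, 'A') visits A first (level 0), then its neighbours B and C (level 1), and finally D (level 2).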
Example:
Running time of Breadth First Search:
A BFS traversal of G takes O(n+m) time.
Directed Acyclic Graphs:
Directed graphs without cycles are referred to as Directed Acyclic Graphs (DAGs).
Topological Ordering:
A topological ordering is an ordering such that any directed path in G traverses
vertices in increasing order. Note that a directed graph may have more than one topological
ordering.
Python Code
def topsort(graph):
    # graph: dict mapping each vertex to a list of adjacent vertices
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    queue = [v for v in graph if indegree[v] == 0]
    topnum = []
    while queue:
        v = queue.pop(0)          # a new vertex of indegree zero
        topnum.append(v)
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(topnum) != len(graph):
        raise ValueError("Graph has a cycle")
    return topnum
Example:
Ordering of vertices using Topological Sorting:
Indegree of each vertex before each dequeue:
Vertex  1  2  3  4  5  6  7  8
A       0  0  0  0  0  0  0  0
B       0  0  0  0  0  0  0  0
C       1  0  0  0  0  0  0  0
D       3  2  1  1  0  0  0  0
E       1  1  0  0  0  0  0  0
F       2  2  2  2  1  0  0  0
G       2  2  2  1  1  1  0  0
H       3  3  2  2  2  2  1  0
Resulting topological order: A C E B D F G H
Shortest Paths:
Breadth-first search strategy can be used to find a shortest path from some starting
vertex to every other vertex in a connected graph.
Dijkstra’s Algorithm:
The single-source shortest-path problem is solved by performing a “weighted”
breadth-first search starting at the source vertex s.
In each iteration, the next vertex chosen is the vertex outside the cloud that is closest
to s. The algorithm terminates when no more vertices are outside the cloud.
Applying the greedy method to the single-source shortest-path problem results in an
algorithm known as Dijkstra’s algorithm.
Edge Relaxation:
Procedure:
• Assign the source node as S and Enqueue S.
• Dequeue the vertex S from queue and assign the value of that vertex to be known
and then find its adjacency vertices.
• If the distance of an adjacent vertex is infinity, change its distance to the distance of
its source vertex incremented by 1, and enqueue the vertex.
• Repeat the previous steps until the queue becomes empty.
Algorithm ShortestPath(G,s):
# Input: A weighted graph G with nonnegative edge weights, and a distinguished vertex s of G.
# Output: The length of a shortest path from s to v for each vertex v of G.
# Initialize D[s] = 0 and D[v] = ∞ for each vertex v ≠ s.
# Let a priority queue Q contain all the vertices of G using the D labels as keys.
while Q is not empty do
    u = value returned by Q.remove_min()        # pull a new vertex u into the cloud
    for each vertex v adjacent to u such that v is in Q do
        if D[u] + w(u,v) < D[v] then            # perform the relaxation procedure on edge (u,v)
            D[v] = D[u] + w(u,v)
            Change to D[v] the key of vertex v in Q.
return the label D[v] of each vertex v

Example: Find the shortest path using Dijkstra’s Algorithm.
Solution:
1. v1 is taken as source.
2. Now v1 is a known vertex, marked as known. Its adjacent vertices are v2 and v4; the pv
and dv values are updated:
T[v2].dist = Min(T[v2].dist, T[v1].dist + C(v1,v2)) = Min(∞, 0+2) = 2
T[v4].dist = Min(T[v4].dist, T[v1].dist + C(v1,v4)) = Min(∞, 0+1) = 1
3. Select the vertex with minimum distance among v2 and v4; v4 is marked as a known
vertex. Its adjacent vertices are v3, v5, v6 and v7:
T[v3].dist = Min(T[v3].dist, T[v4].dist + C(v4,v3)) = Min(∞, 1+2) = 3
T[v5].dist = Min(T[v5].dist, T[v4].dist + C(v4,v5)) = Min(∞, 1+2) = 3
T[v6].dist = Min(T[v6].dist, T[v4].dist + C(v4,v6)) = Min(∞, 1+8) = 9
T[v7].dist = Min(T[v7].dist, T[v4].dist + C(v4,v7)) = Min(∞, 1+4) = 5
4. Select the vertex with the shortest distance from the source v1; v2 is the smallest. v2 is
marked as a known vertex. Its adjacent vertices are v4 and v5. The distances from v1 to v4
and v5 through v2 are larger than the previous dv values, so there is no change in the dv
and pv values.
5. Select the next smallest vertex from the source; v3 and v5 are the smallest. The adjacent
vertices of v3 are v1 and v6; v1 is the source, so there is no change in its dv and pv.
T[v6].dist = Min(T[v6].dist, T[v3].dist + C(v3,v6)) = Min(9, 3+5) = 8
The dv and pv values of v6 are updated. The adjacent vertex of v5 is v7; no change in its dv
and pv values.
7. The last vertex v6 is declared as known.
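The pseudocode above can be sketched in Python using the heapq module as the priority queue. The function and variable names are assumed for illustration, and the edge weights below are taken from the computations in the worked example (the example's figure itself is not reproduced here):

```python
import heapq

def dijkstra(graph, s):
    # graph: dict mapping each vertex to a list of (neighbour, weight) pairs
    D = {v: float('inf') for v in graph}
    D[s] = 0
    pq = [(0, s)]                    # priority queue keyed by distance
    while pq:
        d, u = heapq.heappop(pq)
        if d > D[u]:
            continue                 # stale queue entry, skip it
        for v, w in graph[u]:
            if D[u] + w < D[v]:      # edge relaxation
                D[v] = D[u] + w
                heapq.heappush(pq, (D[v], v))
    return D

# weights reconstructed from the worked example's steps
graph = {
    'v1': [('v2', 2), ('v4', 1)],
    'v2': [('v4', 3), ('v5', 10)],
    'v3': [('v1', 4), ('v6', 5)],
    'v4': [('v3', 2), ('v5', 2), ('v6', 8), ('v7', 4)],
    'v5': [('v7', 6)],
    'v6': [],
    'v7': [],
}
```

With this graph, dijkstra(graph, 'v1') reproduces the distances computed above: v2 = 2, v4 = 1, v3 = 3, v5 = 3, v7 = 5 and v6 = 8.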
Minimum Spanning Tree (Prim–Jarnik Algorithm)
Procedure:
o We begin with some vertex s, defining the initial “cloud” of vertices C.
o Then, in each iteration, we choose a minimum-weight edge e = (u,v), connecting
a vertex u in the cloud C to a vertex v outside of C.
o The vertex v is then brought into the cloud C and the process is repeated until
a spanning tree is formed.
Algorithm PrimJarnik(G):
Input: An undirected, weighted, connected graph G with n vertices and m edges
Output: A minimum spanning tree T for G
Pick any vertex s of G

Python Code (G is assumed to be a V × V adjacency matrix, where 0 means “no edge”):
INF = 9999999
V = 5
selected = [False] * V      # vertices already in the tree
no_edge = 0
selected[0] = True          # start from vertex 0
print("Edge : Weight")
while no_edge < V - 1:
    minimum = INF
    x = 0
    y = 0
    for i in range(V):
        if selected[i]:
            for j in range(V):
                if (not selected[j]) and G[i][j]:
                    # j is not yet selected and there is an edge (i, j)
                    if minimum > G[i][j]:
                        minimum = G[i][j]
                        x = i
                        y = j
    print(str(x) + "-" + str(y) + " : " + str(G[x][y]))
    selected[y] = True
    no_edge += 1
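Wrapped as a function, the same matrix-based algorithm can be tried on a small sample graph. The weight matrix below is purely illustrative:

```python
def prim_mst(G):
    # G: V x V adjacency matrix; 0 means "no edge"
    INF = 9999999
    V = len(G)
    selected = [False] * V
    selected[0] = True              # start from vertex 0
    edges = []
    for _ in range(V - 1):
        minimum, x, y = INF, 0, 0
        # scan every edge leaving the current tree
        for i in range(V):
            if selected[i]:
                for j in range(V):
                    if not selected[j] and G[i][j] and G[i][j] < minimum:
                        minimum, x, y = G[i][j], i, j
        edges.append((x, y, G[x][y]))
        selected[y] = True
    return edges

# illustrative 5-vertex weighted graph (symmetric matrix, 0 = no edge)
G = [[0,  9, 75,  0,  0],
     [9,  0, 95, 19, 42],
     [75, 95, 0, 51, 66],
     [0, 19, 51,  0, 31],
     [0, 42, 66, 31,  0]]
```

For this matrix, prim_mst(G) selects the edges 0-1 (9), 1-3 (19), 3-4 (31) and 3-2 (51), for a total weight of 110.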
Example
2 Mark Questions with Answers
1. Define Graph.
A graph is a way of representing relationships that exist between pairs of objects.
That is, a graph is a set of objects, called vertices ‘V’, together with a collection of pairwise
connections between them, called edges ‘E’. It can also be represented as G=(V, E).
5. What is a loop?
An edge of a graph which connects a vertex to itself is called a loop or sling.
6. What is a simple graph?
A simple graph is a graph that has no more than one edge between any pair of
nodes.
11. What is a simple path?
A path in a graph in which the edges are distinct is called a simple path. It is also
called an edge-simple path.
3. Adjacency Map - is very similar to an adjacency list, but the secondary container
of all edges incident to a vertex is organized as a map.
4. Adjacency Matrix - Each entry is dedicated to storing a reference to the edge
(u,v) for a particular pair of vertices u and v; if no such edge exists, the entry will
be None.
21. List the two important key points of depth first search.
i) If a path exists from one node to another node, walk across the edge – exploring the edge.
ii) If no path exists from one specific node to any other node, return to the previous
node where we have been before – backtracking.
23. Differentiate BFS and DFS.
SNo  Concept                BFS                                        DFS
1    Stands for             Breadth First Search.                      Depth First Search.
2    Approach used          Works on the concept of FIFO               Works on the concept of LIFO
                            (First In First Out).                      (Last In First Out).
3    Suitable for           More suitable for searching vertices       More suitable when there are
                            closer to the given source.                solutions away from the source.
4    Time complexity        O(V + E) when an adjacency list is         Also O(V + E) with an adjacency
                            used, O(V^2) with an adjacency matrix.     list, O(V^2) with an adjacency matrix.
5    Visiting of siblings/  Siblings are visited before the            Children are visited before the
     children               children.                                  siblings.
6    Applications           Used in applications such as bipartite     Used in applications such as acyclic
                            graphs and shortest paths.                 graphs and topological ordering.
7    Memory                 Requires more memory.                      Requires less memory.
26. Define biconnectivity.
A connected graph G is said to be biconnected if it remains connected after the removal
of any one vertex and the edges that are incident upon that vertex. Equivalently, a
connected graph is biconnected if it has no articulation points.