Unit 1 Notes
The intimate relationship between data and programs can be traced to the beginning of computing. In any area of application, the input data, the internally stored data, and the output data may each have a unique structure. A data structure is a representation of the logical relationships existing between individual elements of data. In other words, a data structure is a way of organizing data items that considers not only the elements stored but also their relationships to each other. Data structures are the building blocks of a program, and they affect the design of both the structural and functional aspects of a program. Hence the selection of a particular data structure stresses the following two things.
Primitive data structures - These are basic structures and are directly operated upon by the machine instructions. In general, they have different representations on different computers. Integers, floating point numbers, character constants, string constants, pointers, etc. are examples.
Non-primitive data structures - These are more sophisticated data structures, derived from the primitive data structures. The non-primitive data structures emphasize the structuring of a group of homogeneous (same type) or heterogeneous (different type) data items. Arrays, lists and files are examples.
Dr. Dheresh Soni, Asst. Prof., VIT Bhopal University
1. Arrays - An array stores a fixed number of elements of the same type. For example:
int a[10];
Here int specifies the data type of the elements the array stores, "a" is the name of the array, and the number inside the square brackets is the number of elements the array can store; this is also called the size or length of the array.
2. Lists - A list (linear linked list) can be defined as a collection of a variable number of data items. Lists are the most commonly used non-primitive data structures. An element of a list must contain at least two fields: one for storing data or information, and the other for storing the address of the next element. For storing addresses we have a special data structure called a pointer, hence the second field of the list must be of pointer type. Technically, each such element is referred to as a node.
4.1 Linear data structures - Data structures where the data elements are organised in some sequence are called linear data structures. Here operations on the data structure are possible in a sequence. Stack, queue and array are examples of linear data structures.
4.1.2 Queues - Queues are first in first out (FIFO) data structures. In a queue, new elements are added from one end called the REAR end, and elements are always removed from the other end called the FRONT end.
Figure 4. Operations on Queue.
4.2 Non-linear data structures - Data structures where the data elements are not organised in any sequence, but in some arbitrary fashion, are called non-linear data structures. Graph and tree are examples of non-linear data structures.
1. There is a special data item at the top of the hierarchy called the Root of the tree.
2. The remaining data items are partitioned into a number of mutually exclusive subsets, each of which is itself a tree, called a subtree.
3. The tree always grows downward in data structures, unlike natural trees, which grow upwards.
The tree structure organizes the data into branches, which relate the information.
1. Simple Graph
2. Directed Graph
3. Non-directed Graph
4. Connected Graph
5. Non-connected Graph
6. Multi-Graph
4. UPDATION - As the name implies, this operation updates or modifies the data in the data structure. New data may be entered or previously stored data may be deleted.
Operations on Data Structures - Other operations performed on data structures include:
1. SEARCHING - Searching finds the presence of a desired data item in the list of data items. It may also find the locations of all elements that satisfy certain conditions.
Before knowing about the abstract data type model, we should know about
abstraction and encapsulation.
Abstract Data type (ADT) is a type or class for objects whose behaviour is defined
by a set of values and a set of operations. The definition of ADT only mentions
what operations are to be performed but not how these operations will be
implemented. It does not specify how data will be organized in memory and what
algorithms will be used for implementing the operations. It is called “abstract”
because it gives an implementation-independent view. The process of providing
only the essentials and hiding the details is known as abstraction.
The figure below shows the ADT model. There are two types of functions in the ADT model: public functions and private functions. The ADT model also contains the data structures that we are using in a program. In this model, encapsulation is performed first, i.e., all the data is wrapped in a single unit, the ADT. Then abstraction is performed, i.e., showing only the operations that can be performed on the data structure, and which data structures we are using in the program.
Figure 7. ADT
The user of a data type does not need to know how that data type is implemented. For example, we have been using primitive types like int, float and char knowing only what operations can be performed on them, without any idea of how they are implemented. So a user only needs to know what a data type can do, not how it is implemented. Think of an ADT as a black box which hides the inner structure and design of the data type. Now we'll define three ADTs, namely List ADT, Stack ADT and Queue ADT.
1. List ADT - The data is generally stored in key sequence in a list which has a head structure consisting of a count, pointers and the address of a compare function needed to compare the data in the list. The data node contains a pointer to a data structure and a self-referential pointer which points to the next node in the list. The List ADT functions are given below:
Stack ADT - The common Stack ADT functions are:
1. pop() – Remove and return the element at the top of the stack, if it is not empty.
2. push() – Insert an element at one end of the stack called top.
3. peek() – Return the element at the top of the stack without removing it, if the
stack is not empty.
4. size() – Return the number of elements in the stack.
5. isEmpty() – Return true if the stack is empty, otherwise return false.
6. isFull() – Return true if the stack is full, otherwise return false.
5. Queue ADT - The queue abstract data type (ADT) follows the basic design of the stack abstract data type. Each node contains a void pointer to the data and a link pointer to the next element in the queue. The program's responsibility is to allocate memory for storing the data.
Features of ADT:
Abstract data types (ADTs) are a way of encapsulating data and operations on that
data into a single unit. Some of the key features of ADTs include:
1. Abstraction: The user does not need to know the implementation of the data structure; only the essentials are provided.
2. Better Conceptualization: ADT gives us a better conceptualization of the
real world.
3. Robust: The program is robust and has the ability to catch errors.
4. Encapsulation: ADTs hide the internal details of the data and provide a
public interface for users to interact with the data. This allows for easier
maintenance and modification of the data structure.
5. Data Abstraction: ADTs provide a level of abstraction from the
implementation details of the data. Users only need to know the operations
that can be performed on the data, not how those operations are
implemented.
6. Data Structure Independence: ADTs can be implemented using different
data structures, such as arrays or linked lists, without affecting the
functionality of the ADT.
7. Information Hiding: ADTs can protect the integrity of the data by allowing
access only to authorized users and operations. This helps prevent errors and
misuse of the data.
8. Modularity: ADTs can be combined with other ADTs to form larger, more
complex data structures. This allows for greater flexibility and modularity in
programming.
Overall, ADTs provide a powerful tool for organizing and manipulating data in a
structured and efficient manner. Abstract data types (ADTs) have several
advantages and disadvantages that should be considered when deciding to use
them in software development. Here are some of the main advantages and
disadvantages of using ADTs:
Advantages:
Disadvantages:
Introduction to Algorithm
The word algorithm, named after the ninth-century mathematician al-Khwarizmi, is defined as follows: an algorithm is a set of rules for carrying out calculations either by hand or on a machine. It is a sequence of computational steps that transform the input into the output, or a sequence of operations performed on data that have to be organized in data structures. We can also say that an algorithm is an abstraction of a program to be executed on a physical machine.
An algorithm is a step-by-step procedure to solve a particular problem. That is, it is a set of instructions written to carry out certain tasks, while the data structure is the way of organizing the data with their logical relationships retained. To develop a program for an algorithm, we should select an appropriate data structure for that algorithm. Therefore an algorithm and its associated data structures form a program.
Dataflow of an Algorithm
o Problem: A problem can be a real-world problem, or any instance of a real-world problem, for which we need to create a program or a set of instructions. A set of instructions is known as an algorithm.
o Algorithm: An algorithm, which is a step-by-step procedure, is designed for the given problem.
o Input: After designing an algorithm, the required and the desired inputs are
provided to the algorithm.
o Processing unit: The input will be given to the processing unit, and the
processing unit will produce the desired output.
o Output: The output is the outcome or the result of the program.
Characteristics of an Algorithm
o Input: An algorithm has some input values; we can pass zero or more inputs to an algorithm.
o Output: We will get one or more outputs at the end of an algorithm.
o Unambiguity: An algorithm should be unambiguous which means that the
instructions in an algorithm should be clear and simple.
o Finiteness: An algorithm should have finiteness. Here, finiteness means that
the algorithm should contain a limited number of instructions, i.e., the
instructions should be countable.
o Effectiveness: An algorithm should be effective as each instruction in an
algorithm affects the overall process.
o Language independent: An algorithm must be language-independent so that
the instructions in an algorithm can be implemented in any of the languages
with the same output.
Example: Suppose we want to make orange juice; the following steps are required to make it:
Step 6: Store the juice in a fridge for 5 minutes.
Step 7: Now, it's ready to drink.
Example: Algorithm to add two numbers:
Step 1: Start
Step 2: Declare three variables a, b, and sum.
Step 3: Enter the values of a and b.
Step 4: Add the values of a and b and store the result in the sum variable, i.e.,
sum=a+b.
Step 5: Print sum
Step 6: Stop
Factors of an Algorithm - The following are the factors that we need to consider for
designing an algorithm:
o Modularity: If a problem can be broken down into small modules or small steps, which is the basic definition of an algorithm, then this feature has been designed well for the algorithm.
o Correctness: An algorithm is correct when the given inputs produce the desired output; this means the algorithm was designed correctly and the analysis of the algorithm has been done correctly.
o Maintainability: Maintainability means that the algorithm should be designed in a simple, structured way, so that when we redefine the algorithm, no major changes are needed.
o Functionality: It considers the various logical steps needed to solve a real-world problem.
o Robustness: Robustness means how clearly an algorithm can define our problem.
o User-friendly: If the algorithm is not user-friendly, then the designer will not be
able to explain it to the programmer.
o Simplicity: If the algorithm is simple then it is easy to understand.
o Extensibility: If any other algorithm designer or programmer wants to use your
algorithm then it should be extensible.
Importance of Algorithms
Issues of Algorithms - The following are the issues that come while designing an
algorithm:
Algorithm Design
Algorithms can be classified by their main design method. There are several types of algorithms available. Some important algorithms are:
1. Brute Force Algorithm: It is the simplest approach to a problem. A brute force algorithm is the first approach that comes to mind when we see a problem.
4. Searching Algorithm: Searching algorithms are the ones that are used for
searching elements or groups of elements from a particular data structure. They
can be of different types based on their approach or the data structure in which
the element should be found.
7. Divide and Conquer Algorithm: This algorithm breaks a problem into sub-problems, solves each sub-problem, and merges the solutions together to get the final solution. It consists of the following three steps: 1. Divide, 2. Solve, 3. Combine. Example: Merge sort, Quicksort.
8. Greedy Algorithm: In this type of algorithm the solution is built part by part. Each next part is chosen on the basis of its immediate benefit: the option giving the most benefit is selected as the solution for that part. Example: Fractional Knapsack, Activity Selection.
9. Dynamic Programming Algorithm: This algorithm uses the concept of reusing an already found solution to avoid repetitive calculation of the same part of the problem. It divides the problem into smaller overlapping subproblems and solves them. The approach of dynamic programming is similar to divide and conquer. The difference is that whenever we have recursive function calls with the same result, instead of computing them again we store the result in a data structure in the form of a table and retrieve the results from the table. Thus, the overall time complexity is reduced. "Dynamic" means we dynamically decide whether to call a function or retrieve values from the table. Example: 0-1 Knapsack, subset-sum problem.
There are two approaches for designing an algorithm. These approaches include:
1. Top-Down Approach
2. Bottom-up approach
Program Design - There are various ways by which we can specify a program design. Those are -
Pseudo code - It resembles a programming language, but there is no restriction to follow the syntax of any programming language. Pseudo code cannot be compiled; it is just a preceding step in developing code from an algorithm.
Factors such as processor speed and memory size can be considered before implementing the algorithm; these have no effect on the implementation part.
Performance Analysis
Suppose we want to find out the time taken by the following program statement:
x = x + 1
Determining the amount of time required by the above statement in terms of clock time is not possible, because the factors involved are always dynamic and vary from machine to machine. Hence it is not possible to find out an exact figure, and the performance is instead measured in terms of the frequency count.
Definition: The frequency count is a count that denotes how many times a particular statement is executed. For example, consider the following code for counting the frequency count:
void fun()
{
    int a = 10;
    a++;               /* frequency count: 1 */
    printf("%d", a);   /* frequency count: 1 */
}
Algorithm Complexity
The term algorithm complexity measures how many steps are required by the algorithm to solve the given problem. It evaluates the order of the count of operations executed by an algorithm as a function of the input data size. To assess the complexity, the order (approximation) of the count of operations is always considered instead of counting the exact steps.
The complexity can take any form such as constant, logarithmic, linear, n*log(n), quadratic, cubic, exponential, etc. It is nothing but the order (constant, logarithmic, linear and so on) of the number of steps encountered during the completion of a particular algorithm. To be more precise, we often refer to the complexity of an algorithm as its "running time".
The complexity of an algorithm computes the amount of time and space required by the algorithm for an input of size n. The complexity of an algorithm can be divided into two types: time complexity and space complexity.
Space Complexity
S(P) = C + Sp
where C is a constant, i.e., the fixed part; it denotes the space for inputs and outputs. This is the amount of space taken by instructions, variables and identifiers. Sp is the space dependent upon instance characteristics; this is the variable part, whose space requirement depends on the particular problem instance.
There are two types of components that contribute to the space complexity
1. Variables whose size depends upon the particular problem instance being solved. Control statements (such as for, do, while, choice) are used to handle such instances.
2. The recursion stack for handling recursive calls.
Time Complexity –
The time complexity of an algorithm is the process of determining a formula for the total time required for the execution of that algorithm. This calculation is totally independent of implementation and programming language. To determine the time complexity of a particular algorithm, the following steps are carried out.
5. Cubic Complexity: It imposes a complexity of O(n^3). For an input of size N, it executes on the order of N^3 steps on N elements to solve a given problem.
For example, if there exist 100 elements, it is going to execute 1,000,000 steps.
How to calculate time complexity
Start
    s = 0 ........................ 1
    for i = 0; i < n; i++ ........ n + 1
        s = s + i ................ n
    return s ..................... 1
End
Time Complexity
T(n) = 1 + (n + 1) + n + 1
T(n) = 2n + 3
T(n) = O(n)
Start
    for i = 0; i < n; i++ ................ n + 1
        for j = 0; j < n; j++ ............ n * (n + 1)
            C[i][j] = A[i][j] + B[i][j] .. n * n
    return C ............................. 1
End
Time Complexity
T(n) = (n + 1) + n * (n + 1) + n * n + 1
T(n) = n + 1 + n^2 + n + n^2 + 1
T(n) = 2n^2 + 2n + 2
T(n) = O(n^2)
Start
    for i = 0; i < n; i++ ........................... n + 1
        for j = 0; j < n; j++ ....................... n * (n + 1)
            C[i][j] = 0 ............................. n * n
            for k = 0; k < n; k++ ................... n * n * (n + 1)
                C[i][j] = C[i][j] + A[i][k] * B[k][j] ... n * n * n
    return C ........................................ 1
End
Time Complexity
T(n) = (n + 1) + n * (n + 1) + n^2 + n^2 * (n + 1) + n^3 + 1
T(n) = n + 1 + n^2 + n + n^2 + n^3 + n^2 + n^3 + 1
T(n) = 2n^3 + 3n^2 + 2n + 2
T(n) = O(n^3)
Asymptotic notations
To choose the best algorithm, we need to check the efficiency of each algorithm. Efficiency can be measured by computing the time complexity of each algorithm. Asymptotic notation is a shorthand way to represent time complexity. Using asymptotic notations we can express the time complexity as "fastest possible", "slowest possible" or "average time". The notations Ω (omega), θ (theta) and O (big oh) are called asymptotic notations.
1. Big O Notation - Let F(n) and g(n) be two functions. If there exist a constant c > 0 and a value n0 such that F(n) ≤ c * g(n) for all n ≥ n0, then F(n) is big oh of g(n), denoted F(n) ∈ O(g(n)). In other words, F(n) is eventually bounded above by some constant multiple of g(n). For example, let F(n) = 2n + 2 and g(n) = n^2.
For n = 1:
F(n) = 2(1) + 2 = 4
g(n) = (1)^2 = 1
i.e. F(n) > g(n)
For n = 2:
F(n) = 2(2) + 2 = 6
g(n) = (2)^2 = 4
i.e. F(n) > g(n)
For n = 3:
F(n) = 2(3) + 2 = 8
g(n) = (3)^2 = 9
i.e. F(n) < g(n)
Thus for n ≥ 3 we get F(n) ≤ c * g(n) with c = 1, so 2n + 2 ∈ O(n^2).
2. Ω Notation (Omega) - F(n) ∈ Ω(g(n)) if there exists a constant c > 0 such that F(n) ≥ c * g(n) for all sufficiently large n; it gives a lower bound. For example, let F(n) = 2n^2 + 5 and g(n) = 7n.
If n = 0:
F(n) = 2(0)^2 + 5 = 5
g(n) = 7(0) = 0
i.e. F(n) > g(n)
If n = 1:
F(n) = 2(1)^2 + 5 = 7
g(n) = 7(1) = 7
i.e. F(n) = g(n)
If n = 3:
F(n) = 2(3)^2 + 5 = 18 + 5 = 23
g(n) = 7(3) = 21
i.e. F(n) > g(n)
Thus for n ≥ 3 we get F(n) ≥ c * g(n). It can be represented as 2n^2 + 5 ∈ Ω(n).
3. Θ Notation - The theta notation is denoted by Θ. With this method the running time lies between an upper bound and a lower bound. The theta notation is more precise than big oh or omega notation alone, since it bounds the running time from both sides.
Order of Growth
• Best case: Define the input for which the algorithm takes minimum time; in the best case we calculate the lower bound of an algorithm.
• Worst case: Define the input for which the algorithm takes maximum time; in the worst case we calculate the upper bound of an algorithm.
• Average case: In the average case we take all random inputs, calculate the computation time for all of them, and then divide by the total number of inputs. Average case = all random case time / total number of cases.
Best, Worst, and Average Case Efficiency - Let us assume a list of n values stored in an array. Suppose we want to search for a particular element in this list; the algorithm searches for the key element among the n elements by comparing the key element with each element in the list sequentially.
The best case occurs if the first element in the list matches the key element to be searched. The efficiency in that case is expressed as O(1), because only one comparison is enough. Minimum number of comparisons = 1.
Similarly, the worst case in this scenario occurs if the complete list is searched and the element is found only at the end of the list, or is not found at all. The efficiency in that case is expressed as O(n), because n comparisons are required to complete the search. Maximum number of comparisons = n.
Time-space trade-off is basically a situation where either space efficiency (memory utilization) can be achieved at the cost of time, or time efficiency (performance) can be achieved at the cost of memory.
Example 1: Consider programs like compilers, in which a symbol table is used to handle variables and constants. If the entire symbol table is stored in the program, then the time required for searching or storing a variable in the symbol table is reduced, but the memory requirement is higher. On the other hand, if we do not store the symbol table in the program and simply compute the table entries, then memory is reduced but processing time increases.
Example 2: Suppose, in a file, we store the data uncompressed; then reading the data is an efficient job. But if compressed data is stored, then less space is used while more time is required to read it.
Example 3: This is an example of reversing the order of elements; the elements are stored in ascending order and we want them in descending order. This can be done in two ways:
i. Use another array b[ ] in which the elements are arranged in descending order by reading the array a[ ] in the reverse direction. This approach increases the memory used, but the time is reduced.
ii. Apply some extra logic to the same array a[ ] to arrange the elements in descending order in place. This approach reduces the memory used, but the time of execution increases.