
Introduction to Data Structures

Text Books
Aaron M. Tenenbaum, Yedidyah Langsam and Moshe J. Augenstein, “Data Structures Using C and C++”, PHI Learning Private Limited, Delhi, India.
Horowitz and Sahni, “Fundamentals of Data Structures”, Galgotia Publications Pvt Ltd, Delhi, India.
Lipschutz, “Data Structures”, Schaum’s Outline Series, Tata McGraw-Hill Education (India) Pvt. Ltd.
Thareja, “Data Structures Using C”, Oxford Higher Education.
Course Outcomes [CO]
CO1: Understand and analyze the time and space complexity of an algorithm.
CO2: Understand and implement linear structures such as Stack, Queue and Linked List.
CO3: Discuss various tree representations of data and techniques for their traversal.
CO4: Understand and implement various graph representations of data and graph traversal algorithms.
CO5: Apply various searching and sorting techniques.
Syllabus [Unit-1]
Introduction: Basic Terminology, Elementary Data Organization, Built-in Data Types in C. Algorithm, Efficiency of an Algorithm, Time and Space Complexity, Asymptotic notations: Big Oh, Big Theta and Big Omega, Time-Space trade-off. Abstract Data Types (ADT). Arrays: Definition, Single and Multidimensional Arrays, Representation of Arrays: Row Major Order and Column Major Order, Derivation of Index Formulae for 1-D, 2-D, 3-D and n-D Arrays, Application of Arrays, Sparse Matrices and their representations.
Linked lists: Array Implementation and Pointer Implementation of Singly Linked Lists, Doubly Linked Lists, Circularly Linked Lists, Operations on a Linked List: Insertion, Deletion, Traversal; Polynomial Representation and Addition, Subtraction & Multiplication of Single-variable & Two-variable Polynomials.
A data structure is a way of collecting and organizing data so that operations can be performed on it efficiently.
The idea is to reduce the space and time complexity of different tasks.
Choosing a good data structure makes it possible to perform a variety of critical operations effectively.
An efficient data structure also uses minimum memory space and execution time to process the structure.
Data structures represent data elements in terms of some relationship, for better organization and storage.
If you are familiar with object-oriented programming concepts, a class does something similar: it collects different types of data under one single entity.
In simple terms, data structures are structures programmed to store ordered data, so that various operations can be performed on it easily.
It should be designed and implemented in such a
way that it reduces the complexity and increases
the efficiency.
Why is Data Structure important?
Data structures are important because they are used in almost every program or software system.
They help us write efficient, well-structured code and solve problems.
Data can be maintained more easily by encouraging a better design or implementation.
A data structure is a container used to store, arrange and manipulate data, which can then be processed by algorithms.
Basic types of Data Structures
 As discussed above, anything that can store data can be called a data structure; hence Integer, Float, Boolean, Char etc. are all data structures. They are known as Primitive Data Structures.
 We also have complex data structures, which are used to store large and connected data. Some examples of abstract data structures are:
 Arrays
 Linked List
 Tree
 Graph
 Stack, Queue etc.
 All these data structures allow us to perform different operations on
data. We select these data structures based on which type of
operation is required.
Characteristics of Data Structures

Linear: In linear data structures, the data items are arranged in a linear sequence. Example: Array.
Non-Linear: In non-linear data structures, the data items are not in sequence. Example: Tree, Graph.
Homogeneous: In homogeneous data structures, all the elements are of the same type. Example: Array.
Non-Homogeneous: In non-homogeneous data structures, the elements may or may not be of the same type. Example: Structures.
Static: Static data structures are those whose sizes and associated memory locations are fixed at compile time. Example: Array.
Dynamic: Dynamic data structures expand or shrink depending on the program's needs during execution, and their associated memory locations can change. Example: Linked List.
Elementary data organization
Data are simply values or sets of values. A single unit of
value is called a Data item. Data items are divided into
subgroups called Group items.
An Entity is something that has certain attributes which
may be assigned values. An entity with similar attributes
is called an Entity set.
Eg:-
Entity: Employee
Attributes: Name, Age, Phone
Values: "ABC", 42, 9847092568
Entity set: All employees in an organization.
Meaningful or processed data is called information. The
collection of data is organized into the hierarchy of fields,
records and files. A single elementary unit of information
representing an attribute of an entity is called a Field.
Records are the collection of field values of a given entity.
Collection of records of the entities in a given entity set is
called a file. Each record may contain a certain field that
uniquely represents that record. Such a field K is called
a primary key.
Based on their length, records may be classified into two types:
Fixed-length records: All records contain the same data items, with the same amount of space assigned to each item.
Variable-length records: Records may contain data items of different lengths.
What is an Algorithm ?
 An algorithm is a finite set of instructions or logic, written in order, to accomplish a certain predefined task. An algorithm is not the complete code or program; it is just the core logic (solution) of a problem, which can be expressed either as an informal high-level description, as pseudocode, or using a flowchart.
 Every algorithm must satisfy the following properties:
 Input - There should be 0 or more inputs supplied externally to the algorithm.
 Output - There should be at least 1 output obtained.
 Definiteness - Every step of the algorithm should be clear and well defined.
 Finiteness - The algorithm should have a finite number of steps.
 Correctness - Every step of the algorithm must generate a correct output.
Efficiency of an Algorithm
The efficiency of an algorithm depends on its design,
implementation, resources, and memory required by it
for the computation.
An algorithm is said to be efficient and fast, if it takes
less time to execute and consumes less memory space.
The performance of an algorithm is measured on the
basis of following properties :
Time Complexity
Space Complexity
Space Complexity
It is the amount of memory space required by the algorithm,
during the course of its execution. Space complexity must be
taken seriously for multi-user systems and in situations where
limited memory is available.
Auxiliary Space is the extra space or temporary space used by
an algorithm.
Space Complexity of an algorithm is the total space taken by
the algorithm with respect to the input size. Space complexity
includes both Auxiliary space and space used by input.
Note: Space complexity depends on a variety of things such
as the programming language, the compiler, or even the
machine running the algorithm.
GATE-CS-2005

double foo(int n)
{
    int i;
    double sum;
    if (n == 0)
        return 1.0;
    else {
        sum = 0.0;
        for (i = 0; i < n; i++)
            sum += foo(i);
        return sum;
    }
}

 Function foo() is recursive. Its space complexity is O(n), since there can be at most O(n) active function calls (recursion depth) at a time.
Time Complexity
Time complexity is a way to represent the amount of time required by the program to run to completion.
It is generally good practice to keep the required time to a minimum, so that our algorithm completes its execution as quickly as possible.
The time complexity of an algorithm depends on the
behavior of input:
Worst-case
Best-case
Average-case
GATE-CS-2004
Let A[1, ..., n] be an array storing a bit (1 or 0) at each location, and let f(m) be a function whose time complexity is θ(m).

counter = 0;
for (i = 1; i <= n; i++)
{
    if (A[i] == 1)
        counter++;
    else {
        f(counter);
        counter = 0;
    }
}

 The complexity of this program fragment is θ(n): each call f(counter) does θ(counter) work, but counter only counts 1s that have not already been charged to an earlier call, so the total work over the whole loop is proportional to n.
Asymptotic Notations
When analyzing the complexity of an algorithm in terms of time and space, we can rarely give an exact number for the time and space required; instead, we express them using standard notations known as Asymptotic Notations.
When we analyze an algorithm, we generally derive a formula representing the amount of time required for execution: the time the computer takes to run the lines of code, the number of memory accesses, the number of comparisons, the temporary variables occupying memory space, and so on.
Asymptotic notations allow us to analyze an algorithm's running time by identifying its behavior as its input size grows. This is also referred to as the algorithm's growth rate.
For example, suppose some algorithm has a time complexity of T(n) = n² + 3n + 4, a quadratic expression. For large values of n, the 3n + 4 part becomes insignificant compared to the n² part.
For n = 1000, n² is 1,000,000 while 3n + 4 is only 3004.
When we compare the execution times of two algorithms, the constant coefficients of the higher-order terms are also neglected.
An algorithm that takes 200n² time will be faster than one that takes n³ time, for any value of n larger than 200. Since we are only interested in the asymptotic behavior of the growth of the function, the constant factor can be ignored too.
Types of functions
 Logarithmic Function - log n
 Linear Function - an + b
 Quadratic Function - an² + bn + c
 Polynomial Function - a_z nᶻ + … + a₂n² + a₁n + a₀, where z is some constant
 Exponential Function - aⁿ, where a is some constant
GATE-CS-2021
Consider the following three functions.
f1 = 10ⁿ
f2 = n^(log n)
f3 = n^(√n)
Arranged in increasing order of asymptotic growth rate:
f2 < f3 < f1
What is Asymptotic Behavior?
The word asymptotic means approaching a value or curve arbitrarily closely (i.e., as some sort of limit is taken).
In asymptotic notation, we use this idea to ignore the constant factors and insignificant parts of an expression, and so devise a better way of representing the complexity of an algorithm in a single term, so that comparisons between algorithms can be made easily.
Let's take an example to understand this:
 Suppose we have two algorithms, with the following expressions representing the time they require for execution:
 Expression 1: 20n² + 3n − 4
 Expression 2: n³ + 100n − 2
 As per asymptotic notation, we only need to worry about how the function grows as the input n grows, and that depends entirely on n² for Expression 1 and on n³ for Expression 2. Hence, we can clearly say that the algorithm whose running time is represented by Expression 2 will grow faster than the other one, simply by analyzing the highest power and ignoring the constants (20 in 20n²) and the insignificant parts of the expressions (3n − 4 and 100n − 2).
 The main idea behind casting aside the less important parts is to make things manageable.
 All we need to do is first analyze the algorithm to find an expression for its time requirements, and then analyze how that expression grows as the input n grows.
Types of Asymptotic Notations
We use three types of asymptotic notations to represent the
growth of any algorithm, as input increases:
Big Theta (Θ)
Big Oh(O)
Big Omega (Ω)
Tight Bounds: Theta
 By tight bounds we mean that the time complexity represented by the Big-Θ notation acts like an average value, or a range, within which the actual execution time of the algorithm will lie.
 For example, if for some algorithm the time complexity is represented by the expression 3n² + 5n, and we use Big-Θ notation to represent it, then the time complexity is Θ(n²): we ignore the constant coefficient and remove the insignificant part, 5n.
 Here, a complexity of Θ(n²) means that the time for any input n will remain between c1·n² and c2·n², where c1 and c2 are two constants, thereby tightly binding the expression representing the growth of the algorithm.
Let f(n) and g(n) be two nonnegative functions indicating the running times of two algorithms. We say the function g(n) is a tight bound of the function f(n) if there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0. This is denoted as f(n) = Θ(g(n)).
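As a worked check of this definition, using the earlier example (constants chosen by hand):

\[
f(n) = 3n^2 + 5n, \qquad g(n) = n^2 .
\]
Take \(c_1 = 3\), \(c_2 = 4\), \(n_0 = 5\). Then
\[
3n^2 \le 3n^2 + 5n \quad \text{for all } n \ge 0,
\]
\[
3n^2 + 5n \le 4n^2 \iff 5n \le n^2 \iff n \ge 5,
\]
so \(0 \le c_1 g(n) \le f(n) \le c_2 g(n)\) for all \(n \ge n_0\), i.e. \(f(n) = \Theta(n^2)\).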
Upper Bounds: Big-O
 This notation is known as the upper bound of the algorithm, or the worst case of an algorithm.
 It tells us that a certain function will never exceed a specified time for any value of input n.
 The question is why we need this representation when we already
have the big-Θ notation, which represents the tightly bound running
time for any algorithm. Let's take a small example to understand this.
 Consider the linear search algorithm, in which we traverse the array elements one by one to search for a given number.
 In the worst case, starting from the front of the array, we find the element we are searching for at the very end, which leads to a time complexity of n, where n is the total number of elements.
But it can happen that the element we are searching for is the first element of the array, in which case the time complexity is 1.
Saying that the big-Θ (tight bound) time complexity of linear search is Θ(n) means the time required is always proportional to n, which is the right way to represent the average time complexity. When we use big-O notation instead, we state that the time complexity will never exceed n, defining an upper bound: the time can be less than or equal to n, which is the correct representation for the worst case.
This is the reason you will most often see Big-O notation used to represent the time complexity of an algorithm: it usually makes more sense.
Let f(n) and g(n) be two nonnegative functions indicating the running times of two algorithms. We say g(n) is an upper bound of f(n) if there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0. This is denoted as f(n) = Ο(g(n)).
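A small worked instance of this definition (constants chosen by hand):

\[
f(n) = 3n + 2, \qquad g(n) = n .
\]
Take \(c = 4\) and \(n_0 = 2\). For all \(n \ge 2\) we have \(2 \le n\), so
\[
0 \le 3n + 2 \le 3n + n = 4n = c \, g(n),
\]
hence \(f(n) = O(n)\).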
Lower Bounds: Omega
Big Omega notation is used to define the lower bound of
any algorithm or we can say the best case of any
algorithm.
This always indicates the minimum time required for any
algorithm for all input values, therefore the best case of any
algorithm.
In simple words, when we represent the time complexity of an algorithm using big-Ω, we mean that the algorithm will take at least this much time to complete its execution; it can certainly take more time than this.
Let f(n) and g(n) be two nonnegative functions indicating the running times of two algorithms. We say the function g(n) is a lower bound of the function f(n) if there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0. This is denoted as f(n) = Ω(g(n)).
ISRO CS 2015
The time complexity of the following C function is (assume n > 0):

int recursive(int n)
{
    if (n == 1)
        return 1;
    else
        return recursive(n - 1) + recursive(n - 1);
}

 Each call makes two recursive calls on n − 1, giving the recurrence T(n) = 2T(n − 1) + c, so the time complexity is O(2ⁿ).
Best, Average and Worst case Analysis of
Algorithms
We all know that the running time of an algorithm increases (or remains constant) as the input size n increases.
Sometimes, even when the size of the input is the same, the running time varies among different instances of the input.
In that case, we perform best-, average- and worst-case analysis.
The best case gives the minimum time, the worst case
running time gives the maximum time and average case
running time gives the time required on average to execute
the algorithm.
int LinearSearch(int a[], int n, int item) {
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == item) {
            return i;    /* found: return the index */
        }
    }
    return -1;           /* not found */
}
The best case happens when the item we are looking for is in the very first position of the array: Θ(1).
The worst case happens when the item we are searching for is in the last position of the array, or is not in the array at all: Θ(n).
The average case considers all possible inputs and calculates the running time for each; add up all the calculated values and divide the sum by the total number of cases: (1 + 2 + 3 + … + n + (n+1))/(n+1) = O(n).
Complexity And Space-Time Tradeoff
In computer science, a space-time or time-memory
tradeoff is a way of solving a problem or calculation in less
time by using more storage space (or memory), or by solving
a problem in very little space by spending a long time.
Most computers have a large amount of space, but not infinite
space. Also, most people are willing to wait a little while for a
big calculation, but not forever. So if your problem is taking a
long time but not much memory, a space-time tradeoff would
let you use more memory and solve the problem more
quickly. Or, if it could be solved very quickly but requires
more memory than you have, you can try to spend more time
solving the problem in the limited memory.
The most common situation is an algorithm using a lookup table: the answers to some question for every possible value are written down in advance. Writing down the entire lookup table lets you find answers very quickly, but uses a lot of space. The alternative is to calculate each answer on demand without writing anything down, which uses very little space but may take a long time.
A space-time tradeoff can be used with the problem of data
storage. If data is stored uncompressed, it takes more space
but less time than if the data were stored compressed (since
compressing the data decreases the amount of space it
takes, but it takes time to run the compression algorithm).
Larger code size can be traded for program speed when applying loop unrolling (also called loop unwinding). This technique duplicates the loop body, making the program code longer for each iteration of the loop, but saves the computation time needed for jumping back to the beginning of the loop at the end of each iteration.
In the field of cryptography, a space-time tradeoff lets an attacker decrease the exponential time required for a brute-force attack.
Dynamic programming is another example where the time needed to solve a problem can be decreased by using more memory.
Abstract Data Type (ADT)
An ADT is the way we look at a data structure, focusing on
what it does and ignoring how it does its job.
It is a special kind of data type whose behavior is defined by a set of values and a set of operations.
The keyword "abstract" is used because we can use these data types and perform different operations on them, while how those operations work is totally hidden from the user.
An ADT is built from primitive data types, but its operation logic is hidden.
Some examples of ADTs are Stack, Queue and List.
Let us look at some operations of these ADTs −
Stack −
isFull(): check whether the stack is full
isEmpty(): check whether the stack is empty
push(x): push x onto the stack
pop(): delete one element from the top of the stack
peek(): get the topmost element of the stack without removing it
size(): get the number of elements present in the stack
Queue −
isFull(): check whether the queue is full
isEmpty(): check whether the queue is empty
insert(x): add x to the queue at the rear end
delete(): delete one element from the front end of the queue
size(): get the number of elements present in the queue
List −
size(): get the number of elements present in the list
insert(x): insert one element into the list
remove(x): remove the given element from the list
get(i): get the element at position i
replace(x, y): replace the value x with the value y
