Data Structure and Algorithms-By Festo
Data structures are the programmatic way of storing data so that it can be used
efficiently. Almost every enterprise application uses various types of data structures in
one way or another. This tutorial gives you a solid understanding of the data
structures needed to understand the complexity of enterprise-level applications and
the need for algorithms and data structures.
• Interface − Each data structure has an interface. An interface represents the set of
operations that a data structure supports. It provides only the list of supported
operations, the types of parameters they accept, and the return types of these
operations.
• Implementation − The implementation provides the internal representation of a data
structure. It also provides the definition of the algorithms used in the operations of
the data structure.
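The split between interface and implementation can be sketched in Python as follows. The names IntBag, ListBag and HashBag are hypothetical and chosen only for illustration: one interface lists the operations, and two implementations provide different internal representations behind it.

```python
from abc import ABC, abstractmethod


class IntBag(ABC):
    """Interface: only the supported operations, their parameters and return types."""

    @abstractmethod
    def add(self, value: int) -> None:
        """Store one value."""

    @abstractmethod
    def contains(self, value: int) -> bool:
        """Report whether the value has been stored."""


class ListBag(IntBag):
    """Implementation 1: keeps values in a list and scans it one by one."""

    def __init__(self):
        self._items = []

    def add(self, value: int) -> None:
        self._items.append(value)

    def contains(self, value: int) -> bool:
        for item in self._items:
            if item == value:
                return True
        return False


class HashBag(IntBag):
    """Implementation 2: same interface, backed by a hash-based set."""

    def __init__(self):
        self._items = set()

    def add(self, value: int) -> None:
        self._items.add(value)

    def contains(self, value: int) -> bool:
        return value in self._items
```

Code that uses only add() and contains() works with either class, which is the point of separating the two ideas.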
As applications are getting complex and data rich, there are three common problems
that applications face nowadays.
DS Notes
By [email protected]
To solve the above-mentioned problems, data structures come to the rescue. Data can be
organized in a data structure in such a way that not all items need to be
searched, and the required data can be found almost instantly.
• Worst Case − This is the scenario where a particular data structure operation
takes the maximum time it can take. If an operation's worst-case time is ƒ(n), then this
operation will not take more than ƒ(n) time, where ƒ(n) is a function of the input size n.
• Average Case − This is the scenario depicting the average execution time of an
operation of a data structure. If an operation takes ƒ(n) time on average, then m
operations will take about mƒ(n) time.
• Best Case − This is the scenario depicting the least possible execution time of
an operation of a data structure. If an operation's best-case time is ƒ(n), then
this operation will not take less than ƒ(n) time.
Basic Terminology
• Data − Data are values or sets of values.
• Data Item − A data item refers to a single unit of values.
• Group Items − Data items that are divided into sub-items are called group
items.
• Elementary Items − Data items that cannot be divided are called elementary
items.
• Attribute and Entity − An entity is that which contains certain attributes or
properties, which may be assigned values.
• Entity Set − Entities of similar attributes form an entity set.
Characteristics of an Algorithm
Not all procedures can be called an algorithm. An algorithm should have the following
characteristics −
• Unambiguous − Algorithm should be clear and unambiguous. Each of its steps
(or phases), and their inputs/outputs should be clear and must lead to only one
meaning.
• Input − An algorithm should have 0 or more well-defined inputs.
• Output − An algorithm should have 1 or more well-defined outputs, and should
match the desired output.
• Finiteness − Algorithms must terminate after a finite number of steps.
• Feasibility − The algorithm should be feasible with the available resources.
• Independent − An algorithm should have step-by-step directions, which should
be independent of any programming code.
Example
Let's try to learn algorithm-writing by using an example.
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
Algorithms tell the programmers how to code the program. Alternatively, the algorithm
can be written as −
Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ← a + b
Step 4 − display c
Step 5 − STOP
In the design and analysis of algorithms, the second method is usually used to describe an
algorithm. It makes it easy for the analyst to analyze the algorithm while ignoring all
unwanted definitions. The analyst can observe what operations are being used and how the
process is flowing.
Writing step numbers is optional.
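As a minimal sketch, the second form of the algorithm translates almost line by line into Python. The function name add is an assumption made for this illustration; the notes themselves stay independent of any programming language.

```python
def add(a: int, b: int) -> int:
    """Return the sum of a and b, following the START ADD steps."""
    c = a + b      # Step 3: c <- a + b
    return c       # Step 4: the caller displays c


if __name__ == "__main__":
    # Step 2: get values of a and b (fixed here for the demonstration)
    print(add(2, 3))
```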
We design an algorithm to get a solution to a given problem. A problem can be solved
in more than one way.
Hence, many solution algorithms can be derived for a given problem. The next step is
to analyze the proposed solution algorithms and implement the most suitable
solution.
Algorithm Analysis
Efficiency of an algorithm can be analyzed at two different stages, before
implementation and after implementation. They are the following −
• A Priori Analysis − This is a theoretical analysis of an algorithm. The efficiency of
an algorithm is measured by assuming that all other factors, for example,
processor speed, are constant and have no effect on the implementation.
• A Posteriori Analysis − This is an empirical analysis of an algorithm. The
selected algorithm is implemented in a programming language and then
executed on the target machine. In this analysis, actual statistics such as
running time and space required are collected.
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the
execution or running time of various operations involved. The running time of an
operation can be defined as the number of computer instructions executed per
operation.
Algorithm Complexity
Suppose X is an algorithm and n is the size of the input data. The time and space used by
the algorithm X are the two main factors that decide the efficiency of X.
• Time Factor − Time is measured by counting the number of key operations such
as comparisons in the sorting algorithm.
Space Complexity
Space complexity of an algorithm represents the amount of memory space required by
the algorithm in its life cycle. The space required by an algorithm is equal to the sum of
the following two components −
• A fixed part that is a space required to store certain data and variables, that are
independent of the size of the problem. For example, simple variables and
constants used, program size, etc.
• A variable part is a space required by variables, whose size depends on the size
of the problem. For example, dynamic memory allocation, recursion stack
space, etc.
The space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part
and SP(I) is the variable part of the algorithm, which depends on the instance characteristic
I. The following simple example illustrates the concept −
Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - Stop
Here we have three variables A, B, and C and one constant. Hence S(P) = 1 + 3. The
actual space depends on the data types of the variables and the constant, so each
count is multiplied by the size of its type accordingly.
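To make the fixed and variable parts more concrete, here is a small Python sketch with two hypothetical functions (names chosen for illustration). The first needs only a fixed number of variables whatever the input size; the second adds one recursion stack frame per element, so its variable part grows with n.

```python
def sum_iterative(values):
    """Fixed part only: `total` and the loop variable, whatever len(values) is."""
    total = 0
    for v in values:
        total += v
    return total


def sum_recursive(values, i=0):
    """Variable part: one stack frame per element, so extra space grows with n."""
    if i == len(values):
        return 0
    return values[i] + sum_recursive(values, i + 1)
```

Both return the same result; they differ only in how much extra memory they need while running.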
Time Complexity
Time complexity of an algorithm represents the amount of time required by the
algorithm to run to completion. Time requirements can be defined as a numerical
function T(n), where T(n) can be measured as the number of steps, provided each step
consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the total
computational time is T(n) = c ∗ n, where c is the time taken for the addition of two bits.
Here, we observe that T(n) grows linearly as the input size increases.
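The n-bit addition can be sketched in Python to show where T(n) = c ∗ n comes from. The function add_bits and its bit-list representation (least significant bit first) are assumptions made for this illustration; it counts one step per single-bit addition.

```python
def add_bits(a_bits, b_bits):
    """Add two equal-length bit lists, least significant bit first.

    Returns (sum_bits, steps); sum_bits has one extra bit for the final carry.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("both operands must have the same number of bits")
    result = []
    carry = 0
    steps = 0
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry       # one constant-time single-bit addition
        result.append(total % 2)    # sum bit for this position
        carry = total // 2          # carry into the next position
        steps += 1
    result.append(carry)
    return result, steps
```

For two n-bit inputs the loop body runs exactly n times, so the step count grows linearly with the input size.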
Counting Coins
This problem is to count to a desired value by choosing the fewest possible coins, and
the greedy approach forces the algorithm to pick the largest possible coin at each step. If we are
provided coins of ₹ 1, 2, 5 and 10 and we are asked to count ₹ 18, then the greedy
procedure will be −
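1 − Select one ₹ 10 coin, the remaining count is 8
2 − Then select one ₹ 5 coin, the remaining count is 3
3 − Then select one ₹ 2 coin, the remaining count is 1
4 − Finally, select one ₹ 1 coin, which solves the problem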
This seems to work fine: for this count we need to pick only 4 coins. But if
we slightly change the problem, the same approach may not produce
an optimum result.
For a currency system with coins of value 1, 7 and 10, counting coins for the
value 18 will be optimum (10 + 7 + 1), but for a count like 15 it may use more coins than
necessary. For example, the greedy approach will use 10 + 1 + 1 + 1 + 1 + 1, a total of 6
coins, whereas the same problem could be solved by using only 3 coins (7 + 7 + 1).
Hence, we may conclude that the greedy approach picks an immediately optimized
solution and may fail where global optimization is a major concern.
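The greedy procedure described above can be sketched in Python as follows. The function name greedy_coins is hypothetical; it always takes the largest coin that still fits, which is exactly why it can miss the optimum.

```python
def greedy_coins(coins, amount):
    """Repeatedly pick the largest coin that still fits; return the coins used."""
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            picked.append(coin)
    if amount != 0:
        raise ValueError("amount cannot be formed with these coins")
    return picked
```

With coins 1, 2, 5, 10 and the value 18, this sketch picks four coins. With coins 1, 7, 10 and the value 15, it picks six coins even though 7 + 7 + 1 needs only three.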
Divide/Break
This step involves breaking the problem into smaller sub-problems. Sub-problems
should represent a part of the original problem. This step generally takes a recursive
approach to divide the problem until no sub-problem is further divisible. At this stage,
sub-problems become atomic in nature but still represent some part of the actual
problem.
Conquer/Solve
This step receives many smaller sub-problems to be solved. Generally, at this level,
the problems are considered 'solved' on their own.
Merge/Combine
When the smaller sub-problems are solved, this stage recursively combines them until
they form a solution to the original problem. This algorithmic approach works
recursively, and the conquer and merge steps work so closely together that they appear as one.
Examples
The following computer algorithms are based on divide-and-conquer programming
approach −
• Merge Sort
• Quick Sort
• Binary Search
• Strassen's Matrix Multiplication
• Closest pair (points)
There are various ways available to solve any computer problem, but those listed above
are good examples of the divide-and-conquer approach.
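As an illustration of the three stages, here is a minimal merge sort sketch in Python. It is one common way to write the algorithm and is not taken from the notes: the list is divided in half, each half is solved recursively, and the two sorted halves are merged.

```python
def merge_sort(items):
    """Return a new sorted list using divide, conquer and merge."""
    # Divide: a list of 0 or 1 items is atomic and already sorted.
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2

    # Conquer: sort each half recursively.
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge: repeatedly take the smaller front element of the two halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two still has items
    merged.extend(right[j:])
    return merged
```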
Dynamic programming is used for problems that can be divided into
similar sub-problems so that their results can be re-used. Mostly, these algorithms are
used for optimization. Before solving the sub-problem in hand, a dynamic algorithm
examines the results of previously solved sub-problems. The solutions of
sub-problems are combined in order to achieve the best solution.
So we can say that −
• The problem should be divisible into smaller overlapping sub-problems.
• An optimum solution can be achieved by using the optimum solutions of smaller
sub-problems.
• Dynamic algorithms use memoization.
Comparison
In contrast to greedy algorithms, which address local optimization, dynamic
algorithms aim for an overall optimization of the problem.
In contrast to divide and conquer algorithms, where solutions are combined to achieve
an overall solution, dynamic algorithms use the output of a smaller sub-problem and
then try to optimize a bigger sub-problem. Dynamic algorithms use memoization to
remember the output of already solved sub-problems.
Example
The following computer problems can be solved using the dynamic programming approach −
• Fibonacci number series
• Knapsack problem
• Tower of Hanoi
• All pair shortest path by Floyd-Warshall
• Shortest path by Dijkstra
• Project scheduling
Dynamic programming can be used in both a top-down and a bottom-up manner. In
most cases, referring to a previously computed result is cheaper than
recomputing it in terms of CPU cycles.
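The Fibonacci series is a convenient way to sketch both manners in Python. The function names below are hypothetical: the top-down version remembers solved sub-problems in a dictionary (memoization), and the bottom-up version builds from the smallest sub-problems upward.

```python
def fib_top_down(n, memo=None):
    """Top-down: recurse from n, storing each solved sub-problem in memo."""
    if memo is None:
        memo = {}
    if n < 2:
        return n
    if n not in memo:
        memo[n] = fib_top_down(n - 1, memo) + fib_top_down(n - 2, memo)
    return memo[n]


def fib_bottom_up(n):
    """Bottom-up: start from F(0) and F(1) and build up to F(n)."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev
```

Without the memo dictionary, the top-down version would recompute the same sub-problems many times; with it, each value is computed once.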
Data Definition
Data Object
Data Type
A data type is a way to classify various types of data, such as integer, string, etc. It
determines the values that can be used with the corresponding type of data and the
operations that can be performed on it. There are two kinds of data
types, built-in and derived −
Those data types for which a language has built-in support are known as Built-in Data
types. For example, most of the languages provide the following built-in data types.
• Integers
• Boolean (true, false)
• Floating (Decimal numbers)
• Character and Strings
Those data types which are implementation independent as they can be implemented in
one or the other way are known as derived data types. These data types are normally
built by the combination of primary or built-in data types and associated operations on
them. For example −
• List
• Array
• Stack
• Queue
Basic Operations
The data in the data structures are processed by certain operations. The particular data
structure chosen largely depends on the frequency of the operation that needs to be
performed on the data structure.
• Traversing
• Searching
• Insertion
• Deletion
• Sorting
• Merging
An array is a container which can hold a fixed number of items, and these items should be of
the same type. Most data structures make use of arrays to implement their
algorithms. Following are the important terms to understand the concept of an array.
Array Representation
Arrays can be declared in various ways in different languages. For illustration, let's take
C array declaration.
As per the above illustration, following are the important points to be considered.
Basic Operations
In C, when an array is declared with a size and only partially initialized (or has static
storage duration), the elements without an explicit initializer receive the following
default values.
Data Type    Default Value
bool         false
char         0
int          0
float        0.0f
double       0.0
void         (no value)
wchar_t      0
A linked list is a sequence of data structures which are connected together via links.
A linked list is a sequence of links which contain items. Each link contains a connection
to another link. The linked list is the second most-used data structure after the array. Following
are the important terms to understand the concept of a linked list.
• Link − Each link of a linked list can store data called an element.
• Next − Each link of a linked list contains a link to the next link, called Next.
• LinkedList − A linked list contains the connection link to the first link, called
First.
Linked list can be visualized as a chain of nodes, where every node points to the next
node.
As per the above illustration, following are the important points to be considered.
• Each link carries one or more data fields and a link field called next.
• Each link is connected to the following link through its next field.
• The last link carries a null link to mark the end of the list.
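The terms above map onto a small Python sketch. The class names Link and LinkedList follow the terminology of the notes, while the methods insert_first and to_list are hypothetical helpers added for illustration; None plays the role of the null link.

```python
class Link:
    """One link: a data field plus a reference to the next link."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next      # None marks the end of the list


class LinkedList:
    """Holds the connection to the first link."""

    def __init__(self):
        self.first = None

    def insert_first(self, data):
        # The new link points at the old first link and becomes the new first.
        self.first = Link(data, self.first)

    def to_list(self):
        # Follow the next references from first until the null link.
        out = []
        current = self.first
        while current is not None:
            out.append(current.data)
            current = current.next
        return out
```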
Basic Operations
A real-world stack allows operations at one end only. For example, we can place or
remove a card or plate from the top of the stack only. Likewise, Stack ADT allows all
data operations at one end only. At any given time, we can only access the top element
of a stack.
This feature makes it a LIFO data structure. LIFO stands for last-in-first-out. Here, the
element which is placed (inserted or added) last is accessed first. In stack terminology,
the insertion operation is called the PUSH operation and the removal operation is
called the POP operation.
Stack Representation
A stack can be implemented by means of an array, structure, pointer, or linked list.
A stack can either be of fixed size or it may support dynamic resizing. Here,
we are going to implement the stack using arrays, which makes it a fixed-size stack
implementation.
Basic Operations
Stack operations may involve initializing the stack, using it and then de-initializing it.
Apart from these basics, a stack is used for two primary operations: PUSH, which stores
an element on the stack, and POP, which removes the top element from it.
To use a stack efficiently, we need to check the status of stack as well. For the same
purpose, the following functionality is added to stacks −
• peek() − get the top data element of the stack, without removing it.
At all times, we maintain a pointer to the last PUSHed data on the stack. As this pointer
always represents the top of the stack, it is named top. The top pointer provides the top
value of the stack without actually removing it.
Push Operation
The process of putting a new data element onto stack is known as a Push Operation.
Push operation involves a series of steps −
• Step 1 − Checks if the stack is full.
• Step 2 − If the stack is full, produces an error and exits.
• Step 3 − If the stack is not full, increments top to point to the next empty space.
• Step 4 − Adds the data element to the stack location where top is pointing.
• Step 5 − Returns success.
If a linked list is used to implement the stack, then in step 3 we need to allocate
space dynamically.
Pop Operation
Accessing the content while removing it from the stack is known as a Pop Operation.
In an array implementation of the pop() operation, the data element is not actually
removed; instead, top is decremented to a lower position in the stack to point to the
next value. But in a linked-list implementation, pop() actually removes the data element and
deallocates memory space.
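The push and pop behaviour described above can be sketched as a fixed-size, array-backed stack in Python. The class name ArrayStack and the choice of exceptions are assumptions made for this illustration; top starts at -1 to mean "empty".

```python
class ArrayStack:
    """Fixed-size stack backed by a preallocated list."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._top = -1                      # index of the last pushed element

    def is_full(self):
        return self._top == len(self._data) - 1

    def is_empty(self):
        return self._top == -1

    def push(self, value):
        if self.is_full():                  # Steps 1-2: full stack is an error
            raise OverflowError("stack is full")
        self._top += 1                      # Step 3: move top to the next empty slot
        self._data[self._top] = value       # Step 4: store the element there

    def peek(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        return self._data[self._top]        # read the top without removing it

    def pop(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        value = self._data[self._top]
        self._top -= 1                      # element is not erased, top just moves down
        return value
```

As the notes say for the array implementation, pop() only decrements top; the old value stays in the list until a later push overwrites it.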
A real-world example of a queue is a single-lane one-way road, where the vehicle
that enters first exits first. More real-world examples can be seen as queues at ticket
windows and bus stops.
Queue Representation
As we now understand, in a queue we access both ends for different reasons. The
following diagram tries to explain queue representation as a data structure −
As with stacks, a queue can also be implemented using arrays, linked lists, pointers and
structures. For the sake of simplicity, we shall implement queues using a
one-dimensional array.
Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory. Here we shall try to understand the basic
operations associated with queues −
A few more functions are required to make the above-mentioned queue operations
efficient. These are −
• peek() − Gets the element at the front of the queue without removing it.
In a queue, we always dequeue (or access) the data pointed to by the front pointer, and while
enqueuing (or storing) data in the queue we take the help of the rear pointer.
Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, their operations are
comparatively more difficult to implement than those of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
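• Step 1 − Check if the queue is full.
• Step 2 − If the queue is full, produce an overflow error and exit.
• Step 3 − If the queue is not full, increment the rear pointer to point to the next empty
space.
• Step 4 − Add the data element to the queue location where rear is pointing.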
Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data
where front is pointing and remove the data after access. The following steps are taken
to perform dequeue operation −
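• Step 1 − Check if the queue is empty.
• Step 2 − If the queue is empty, produce an underflow error and exit.
• Step 3 − If the queue is not empty, access the data where front is pointing.
• Step 4 − Increment the front pointer to point to the next available data element.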
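A minimal Python sketch of a queue on a one-dimensional array, following the front and rear pointers described above. The class name ArrayQueue and the exception types are assumptions for illustration. This simple linear version does not reuse slots freed by dequeue; a circular layout would be needed for that.

```python
class ArrayQueue:
    """Fixed-size linear queue backed by a preallocated list."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._front = 0                     # index of the next element to dequeue
        self._rear = -1                     # index of the last enqueued element

    def is_full(self):
        return self._rear == len(self._data) - 1

    def is_empty(self):
        return self._front > self._rear

    def enqueue(self, value):
        if self.is_full():
            raise OverflowError("queue is full")
        self._rear += 1                     # move rear to the next empty space
        self._data[self._rear] = value      # store the element where rear points

    def peek(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        return self._data[self._front]      # front element, not removed

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        value = self._data[self._front]     # access the data where front points
        self._front += 1                    # move front to the next element
        return value
```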
1. Linear Search
Linear search is a very simple search algorithm. In this type of search, a sequential
search is made over all items one by one. Every item is checked and if a match is found
then that particular item is returned, otherwise the search continues till the end of the
data collection.
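The sequential check described above can be sketched in Python as follows. Returning the index of the match, and -1 when nothing matches, is a convention assumed for this illustration; the notes only say that the matching item is returned.

```python
def linear_search(items, target):
    """Check every item in order; return the index of the first match or -1."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1
```

The best case finds the target at the first position after one comparison; the worst case compares against every item before giving up.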
2. Binary search
3. Interpolation Search
4. Hash table