Introduction To Data Structures: Basic Terminologies Elementary Data Organization

This document provides an introduction to data structures, explaining basic terminologies, organization of data, and classifications of data structures such as primitive and non-primitive types. It covers linear structures like arrays and linked lists, as well as non-linear structures like trees and graphs, and details operations on data structures including traversing, searching, inserting, deleting, sorting, and merging. Additionally, it introduces algorithms, their characteristics, control structures, and recursion with examples such as factorial and Fibonacci sequences.

Unit-I

Introduction to data structures


BASIC TERMINOLOGIES; ELEMENTARY DATA ORGANIZATION
• Data are values or sets of values. A data item refers to a single unit of values. Data items
that are divided into sub-items are called group items; those that are not are called
elementary items. For example, an employee's name may be divided into three sub-items
(first name, middle name and last name), but the social security number would normally
be treated as a single item. Collections of data are organised into a hierarchy of fields,
records and files.
• An entity is something that has certain attributes or properties which may be assigned
values. The values may be either numeric or nonnumeric. For example, an employee of a
given organization:
Attributes : Name   Age   Gender   Social Security Number
Values     : Asha   19    F        134-24-5555
• Entities with similar attributes (e.g., all the employees in an organization) form an entity
set. Each attribute of an entity set has a range of values, the set of all possible values that
could be assigned to the particular attribute.
• A field is a single elementary unit of information representing an attribute of an entity, a
record is the collection of field values of a given entity and a file is the collection of records
of the entities in a given entity set.
• Each record in a file may contain many field items, but the value in a certain field may
uniquely determine the record in the file. Such a field K is called a primary key, and the
values k1, k2, k3, ... in such a field are called keys or key values.
• Records may also be classified according to length. A file can have fixed-length records or
variable-length records. In fixed-length records, all the records contain the same data
items with the same amount of space assigned to each data item. In variable-length records,
file records may have different lengths. For example, student records usually have
variable lengths, since different students take different numbers of courses.
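
The field, record, file and primary-key terminology above can be illustrated with a minimal Python sketch. The class name EmployeeRecord, the helper find_by_key and the second record are hypothetical and serve only as an illustration; the social security number is assumed to play the role of the primary key.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """One record: the collection of field values of a single entity."""
    ssn: str      # primary key field (assumed unique per employee)
    name: str
    age: int
    gender: str


# A "file" is modelled here simply as a list of records of one entity set.
employee_file = [
    EmployeeRecord("134-24-5555", "Asha", 19, "F"),
    EmployeeRecord("000-00-0001", "Ravi", 24, "M"),  # made-up record for illustration
]


def find_by_key(file, key):
    """Return the record whose primary key equals key, or None if absent."""
    for record in file:
        if record.ssn == key:
            return record
    return None


print(find_by_key(employee_file, "134-24-5555"))
```

Because the key value determines the record uniquely, the search can stop at the first match.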

DATA STRUCTURE
Data may be organised in many different ways; the logical or mathematical model of a
particular organization of data is called a data structure. The choice of a particular data model
depends on two considerations.
1. It must be rich enough in structure to mirror the actual relationships of the data in the real world.
2. The structure should be simple enough that the data can be processed effectively when necessary.
Classification of data structures
Data structures are generally classified into primitive and non-primitive data structures. Basic
data types such as integer, real, character and Boolean are known as primitive data structures.
These data types consist of values that cannot be divided further, and hence they are called simple
data types.
Based on the structure and arrangement of data, non-primitive data structures are further
classified into linear and non-linear.
A data structure is said to be linear if its elements form a sequence or a linear list. In linear
data structures, the data is arranged in a linear fashion, although the way it is stored in memory
need not be sequential. Arrays, linked lists, stacks and queues are examples of linear data
structures.
A data structure is said to be non-linear if the data are not arranged into sequence. The
insertion and deletion of data is therefore not possible in a linear fashion. Trees and graphs
are examples of non-linear data structures.

Arrays:
The simplest type of data structure is a linear (or one-dimensional) array. A linear array is a list
of a finite number n of similar data elements referenced respectively by a set of n consecutive
numbers, usually 1, 2, 3, ..., n. If we choose the name A for the array, then the elements of A
are denoted by the subscript notation a1, a2, a3, ..., an, or by the parenthesis notation A(1),
A(2), A(3), ..., A(N), or by the bracket notation A[1], A[2], A[3], ..., A[N].
Linked List:
The main difference between arrays and linked lists lies in how memory is used. In an array,
the memory is allocated before the execution of the program; its size is fixed and cannot be
changed. Often only part of the allocated memory is used and the rest remains unused even
though it is allocated. This leads to wastage of memory, an important resource which has to be
handled efficiently. This problem can be overcome by using a linked list.
A linked list is a non-sequential collection of data items. For every data item in the linked list,
there is an associated address that gives the memory location of the next data item in
the list. The data items in a linked list need not be in consecutive memory locations; they may
be anywhere in the memory. The items are reached by following the pointers that link them.
A linked list is shown in the following figure:

Advantages
➢ Linked lists are dynamic data structures: They can grow or shrink during the execution
of a program.
➢ Efficient memory utilization: Memory is not pre-allocated. Memory is allocated
whenever it is required, and it is de-allocated (released) when it is no longer needed.
➢ Insertion and deletion are easier and efficient: Linked lists provide flexibility in
inserting a data item at a specified position and deletion of a data item from a given
position.
➢ Many complex applications can be easily carried out with linked lists.
Disadvantages
➢ More memory: Every item stores a link field in addition to its data, so the more fields
there are, the more memory space is needed.
➢ Access to an arbitrary data item is cumbersome and time consuming, because the list has
to be followed from its beginning.
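
The linked organisation described above can be sketched in Python as follows. This is only an illustrative sketch: the names Node, LinkedList, insert_front, insert_after, delete_after and to_list are assumptions introduced here, and Python object references stand in for the addresses (pointers) mentioned in the text.

```python
class Node:
    """One data item together with the link to the next item."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node   # reference to the next node, or None at the end


class LinkedList:
    def __init__(self):
        self.head = None        # empty list: no nodes allocated yet

    def insert_front(self, data):
        """Allocate a node only when it is needed and link it at the front."""
        self.head = Node(data, self.head)

    def insert_after(self, node, data):
        """Insert a new item after a given node without moving other items."""
        node.next = Node(data, node.next)

    def delete_after(self, node):
        """Unlink the node following the given node, if there is one."""
        if node.next is not None:
            node.next = node.next.next

    def to_list(self):
        """Traverse from the head, following the links, and collect the data."""
        items = []
        current = self.head
        while current is not None:
            items.append(current.data)
            current = current.next
        return items


lst = LinkedList()
for value in (30, 20, 10):
    lst.insert_front(value)       # list is now 10 -> 20 -> 30
lst.insert_after(lst.head, 15)    # 10 -> 15 -> 20 -> 30
lst.delete_after(lst.head.next)   # removes 20: 10 -> 15 -> 30
print(lst.to_list())
```

Insertion and deletion only change links, which is why no other items are moved; reaching an arbitrary item, however, requires walking the links from the head.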

Trees:
Data frequently contain a hierarchical relationship between various elements. The data
structure which reflects this relationship is called a tree.
Stack:
A stack, also called a last-in-first-out (LIFO) system, is a linear list in which insertions and
deletions can take place only at one end, called the top.
Queue:
A queue, also called a first-in-first-out (FIFO) system, is a linear list in which deletions can take
place only at one end of the list, the “front” of the list, and insertions can take place only at
the other end of the list, the “rear” of the list.
Graph:
Data sometimes contain a relationship between pairs of elements which is not necessarily
hierarchical in nature. The data structure which reflects this type of relationship is called a graph.
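
The LIFO and FIFO disciplines of the stack and the queue described above can be sketched in Python. This is a minimal illustration that assumes a Python list as the stack and collections.deque from the standard library as the queue; the variable names are chosen here for illustration.

```python
from collections import deque

# Stack: insertions and deletions happen only at the top (end of the list).
stack = []
for item in ("A", "B", "C"):
    stack.append(item)        # push
popped = stack.pop()          # pop removes the item pushed last

# Queue: insertions at the rear, deletions at the front.
queue = deque()
for item in ("A", "B", "C"):
    queue.append(item)        # insert at the rear
served = queue.popleft()      # delete from the front: the item inserted first

print(popped, served)
```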

DATA STRUCTURE OPERATIONS:


1. Traversing: Accessing each record exactly once so that certain items in the record may be
processed. (This accessing and processing is sometimes called “visiting” the record.)
2. Searching: Finding the location of the record with the given key value, or finding the
locations of all records which satisfy one or more conditions.
3. Inserting: Adding a new record to the structure.
4. Deleting: Removing a record from the structure.
5. Sorting: Arranging the records in some logical order.
6. Merging: Combining the records in two different sorted files into a single sorted file.

Preliminaries
INTRODUCTION TO ALGORITHMS:
An algorithm is a step-by-step procedure which defines a set of instructions to be executed in a
certain order to get the desired output. Algorithms are generally created independent of
underlying languages, i.e., an algorithm can be implemented in more than one programming
language. An algorithm should have the following characteristics:
1. Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases),
and their inputs/outputs should be clear and must lead to only one meaning.
2. Input − An algorithm should have 0 or more well-defined inputs.
3. Output − An algorithm should have 1 or more well-defined outputs, and should match the
desired output.
4. Finiteness − Algorithms must terminate after a finite number of steps.
5. Feasibility − Should be feasible with the available resources.
6. Independent − An algorithm should have step-by-step directions, which should be
independent of any programming code.
ALGORITHMIC NOTATIONS:
An algorithm is a finite step-by-step list of well-defined instructions for solving a particular
problem. The following summarizes the basic format conventions used in the formulation of
algorithms.
⬧ Name of Algorithm: Every algorithm is given an identifying name written in capital
letters.
⬧ Introductory comments: The algorithm name is followed by a brief description of tasks
the algorithm performs and any assumptions that have been made. The description gives
the names and types of the variables used in the algorithm.
⬧ Comments: An algorithm step may terminate with a comment enclosed in round
parentheses intended to help the reader to understand the step better. Comments specify
no action and are included only for clarity.
⬧ Identifying number: Each algorithm is assigned an identifying number as follows:
Algorithm 4.3 refers to the third algorithm in chapter 4.
⬧ Steps, Control, Exit
• An algorithm is made up of a sequence of numbered steps, each beginning with a phrase
enclosed in square brackets which gives an abbreviated description of that step.
Following this phrase is an ordered sequence of statements which describe the actions to
be performed. The steps of the algorithm are executed one after the other, beginning with
Step 1, unless indicated otherwise.
• Control may be transferred to Step n of the algorithm by the statement “Go to Step n.”
• The algorithm is completed when the statement
Exit
is encountered.
⬧ Assignment Statement: The assignment statement is indicated by placing an arrow (←)
between the right-hand side of the statement and the variable receiving the value.
Ex:
MAX←A[I]
which means that the value of the vector element A[I] replaces the contents of variable
MAX. An exchange of values of two variables (accomplished by the sequence of statements
TEMP←A, A←B, B←TEMP) is written as A↔B. Many variables can be set to the same
value by using a multiple assignment. Ex:
I←0, J←0, K←0
could be written as
I←J←K←0
The := notation is sometimes used instead of the arrow. For example,
MAX := DATA[1]
assigns the value in DATA[1] to MAX.
⬧ Input and Output: Data may be input and assigned to variables by means of a Read
statement with the following form,
Read: Variable names
Similarly, messages, placed in quotation marks, and data in variables may be output
by means of a Write or Print statement with the following form,
Write: Messages and / or variable names
⬧ Variable names: A variable is an entity that possesses a value, and its name is chosen to
reflect the meaning of the value it holds (Ex: MAX holds the largest element). A variable
name always begins with a letter followed by characters including letters, numeric digits
and some special characters. Blanks are not permitted within a name, and all letters are
capitalized. Examples of valid variable names are BLACK_BOX and X_SQUARED. The most
useful of the special characters is “_” (called break character), which may be used as a
separator in names made up of several words.
For example:
Algorithm : (Largest element in array ) LARGEST(DATA, N)
A nonempty array DATA with N numerical values is given. This algorithm finds the location
LOC and the value MAX of the largest element of DATA. The variable K is used as a counter.
Step 1. [Initialize] Set K := 1, LOC := 1 and MAX := DATA[1]
Step 2. [Increment Counter] Set K := K + 1
Step 3. [Test Counter] If K > N, then:
Write: LOC, MAX and Exit
Step 4. [Compare and Update] If MAX < DATA[K], then:
Set LOC := K and MAX := DATA[K]
Step 5. [Repeat Loop] Go to Step 2
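
The steps of LARGEST can be sketched in Python as follows. This is an illustrative translation, not part of the original notation: the function name largest is an assumption, and because Python lists are indexed from 0 the returned location is 0-based, whereas the algorithm above counts from 1.

```python
def largest(data):
    """Return (loc, maximum): the 0-based location and value of the largest element."""
    if not data:
        raise ValueError("data must be nonempty")
    loc, maximum = 0, data[0]          # Step 1: initialise with the first element
    for k in range(1, len(data)):      # Steps 2, 3 and 5: advance and test the counter
        if maximum < data[k]:          # Step 4: compare and update
            loc, maximum = k, data[k]
    return loc, maximum


print(largest([3, 9, 2, 9, 5]))
```

Since the comparison is strict, the location of the first occurrence is kept when the largest value appears more than once.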

CONTROL STRUCTURES:
The three types of logic or flow of control are
1. Sequential logic or sequential flow
2. Selection logic or conditional flow
3. Iteration logic or repetitive flow
Sequential logic or sequential flow:
Instructions or modules are executed in sequence. The sequence may be presented
explicitly, by means of numbered steps, or implicitly, by the order in which the modules are
written.

Selection logic or conditional flow:


Selection logic leads to the selection of one out of several alternative modules. The
structures which implement this logic are called conditional structures or If structures. The end
of such a structure is indicated by the statement [End of If structure.]. These conditional
structures fall into three types.

1. Single Alternative:
This structure has the form
If condition, then:
[Module A]
[End of If structure.]
If the condition holds, then Module A, which may consist of one or more statements, is
executed; otherwise Module A is skipped and control transfers to the next step of the
algorithm.

2. Double Alternative:
This structure has the form
If condition, then:
[Module A]
Else:
[Module B]
[End of If structure.]
If the condition holds, then Module A is executed; otherwise Module B is executed.

3. Multiple Alternatives:
This structure has the form
If condition(1), then:
[Module A1]
Else if condition(2), then:
[Module A2]
...
Else if condition(M), then:
[Module AM]
Else:
[Module B]
[End of If structure.]
The logic of this structure allows only one of the modules to be executed. Either the
module which follows the first condition which holds is executed, or the module which
follows the final Else statement is executed.

Iteration logic or repetitive flow:


There are two types of structures involving loops. Each type begins with a Repeat statement
and is followed by a module, called the body of the loop. The end of the structure is indicated by
the statement [End of loop.]

Repeat-for:
The Repeat-for loop uses an index variable to control the loop. The loop has the form
Repeat for K = R to S by T:
[Module]
[End of loop.]
Here R is called the initial value, S the end value or test value and T the increment.

Repeat-while:
The repeat-while uses a condition to control the loop. The loop has the form
Repeat while condition:
[Module]
[End of loop.]

Recursion
INTRODUCTION:
The process in which a function calls itself directly or indirectly is called recursion, and the
corresponding function is called a recursive function. Using a recursive algorithm, certain
problems can be solved easily. Examples of such problems are the factorial function, the
Fibonacci sequence and the Towers of Hanoi (TOH).
FACTORIAL FUNCTION:
The factorial of n is the product of all positive integers from n down to 1 and is denoted by
n!. If n = 0, then n! = 1.
If n > 0, then n! = n x (n – 1)!
For example:
If n = 5 then 5! = 5x4x3x2x1 = 120
If n = 3 then 3! = 3x2x1 = 6
FACTORIAL(FACT, N)
This algorithm calculates N! and returns the value in the variable FACT.
1. If N =0, then: Set FACT := 1, and Return.
2. Call FACTORIAL(FACT, N-1).
3. Set FACT := N * FACT.
4. Return.
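
The recursive definition of the factorial can be sketched in Python. This is an illustrative sketch: the function name factorial and the check for negative arguments are assumptions added here, and the function returns its result directly instead of using an output variable such as FACT.

```python
def factorial(n):
    """Return n! using the recursive definition n! = n x (n - 1)!."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    if n == 0:                       # base case: 0! = 1
        return 1
    return n * factorial(n - 1)      # recursive case


print(factorial(5))
```

Each call reduces n by 1, so the base case n = 0 is eventually reached and the recursion terminates.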
FIBONACCI SEQUENCE:
The Fibonacci series generates each subsequent number by adding the two previous numbers.
The series starts from two numbers, F0 and F1. The initial values of F0 and F1 can be taken as
0, 1 or 1, 1 respectively; the definition below uses 0, 1. If n = 0 or n = 1 then Fn = n
If n > 1, then Fn = Fn-2 + Fn-1
For example:
The first 8 terms, F0 to F7, are 0 1 1 2 3 5 8 13
The first 5 terms, F0 to F4, are 0 1 1 2 3
FIBONACCI(FIB, N)
This algorithm calculates FN and returns the value in the variable FIB.
1. If N =0 or N = 1, then: Set FIB := N, and Return.
2. Call FIBONACCI (FIBA, N-2).
3. Call FIBONACCI (FIBB, N-1).
4. Set FIB := FIBA + FIBB.
5. Return.
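
The FIBONACCI procedure can be sketched in Python as follows. This is an illustrative sketch that assumes the starting values F0 = 0 and F1 = 1; the function name fibonacci and the check for negative arguments are assumptions added here.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, with F0 = 0 and F1 = 1."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    if n == 0 or n == 1:             # base cases: Fn = n
        return n
    fib_a = fibonacci(n - 2)         # corresponds to FIBA
    fib_b = fibonacci(n - 1)         # corresponds to FIBB
    return fib_a + fib_b


print([fibonacci(i) for i in range(8)])
```

This direct translation recomputes the same values many times, so it is suitable only for small n.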
TOWERS OF HANOI:
The Tower of Hanoi is a mathematical puzzle which consists of three towers (pegs) and more
than one ring, as depicted –

These rings are of different sizes and are stacked in ascending order, i.e., the smaller one
sits over the larger one. There are other variations of the puzzle where the number of disks
increases, but the tower count remains the same.
Rules: The objective is to move all the disks to another tower without violating the sequence of
arrangement. A few rules to be followed for the Tower of Hanoi are −
• Only one disk can be moved among the towers at any given time.
• Only the "top" disk can be removed.
• No larger disk can sit over a smaller disk.

Algorithm
To write an algorithm for the Tower of Hanoi, we first need to learn how to solve this problem
with a smaller number of disks, say 1 or 2. We mark the three towers with the names source,
destination and aux (the last only to help in moving the disks). If we have only one disk, then it can
easily be moved from the source peg to the destination peg. If we have 2 disks −
• First, we move the smaller (top) disk to aux peg.
• Then, we move the larger (bottom) disk to destination peg.
• And finally, we move the smaller disk from aux to destination peg.
TOWER(N,BEG,AUX,END)
This algorithm gives a recursive solution to the towers of Hanoi problem for N disks.
1. If N = 1, then:
(a) Write BEG → END.
(b) Return.
[End of If Structure]
2. [Move N – 1 disks from peg BEG to Peg AUX.]
Call TOWER(N – 1, BEG, END, AUX)
3. Write BEG → END
4. [Move N – 1 disks from peg AUX to Peg END]
Call TOWER(N – 1, AUX, BEG, END)
5. Return
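
The TOWER procedure can be sketched in Python as follows. This is an illustrative sketch: instead of writing each move, the function collects the moves in a list, and the optional parameter moves and the peg labels "A", "B" and "C" are assumptions introduced here.

```python
def tower(n, beg, aux, end, moves=None):
    """Return the list of moves (from_peg, to_peg) that transfers n disks from beg to end."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if moves is None:
        moves = []
    if n == 1:
        moves.append((beg, end))             # Step 1: move the single disk
        return moves
    tower(n - 1, beg, end, aux, moves)       # Step 2: move n - 1 disks from beg to aux
    moves.append((beg, end))                 # Step 3: move the largest disk
    tower(n - 1, aux, beg, end, moves)       # Step 4: move n - 1 disks from aux to end
    return moves


for source, destination in tower(2, "A", "B", "C"):
    print(source, "->", destination)
```

For two disks this produces exactly the three moves listed in the text: smaller disk to aux, larger disk to destination, smaller disk to destination.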

Arrays
Data structures are classified as either linear or non-linear. A data structure is said to be
linear if its elements form a sequence. There are two basic ways of representing such linear
structures in memory. One way is to have the linear relationship between the elements
represented by means of sequential memory locations. These linear structures are called
arrays. The other way is to have the linear relationship between the elements represented by
means of pointers or links. These linear structures are called linked lists. Arrays are frequently
used to store relatively permanent collections of data.
The operations one normally performs on any linear structure, whether it be an array or a
linked list, include the following:
a) Traversal: Processing each element in the list.
b) Search: Finding the location of the element with the given value, or the record with a
given key.
c) Insertion: Adding a new element to the list.
d) Deletion: Removing an element from the list.
e) Sorting: Arranging the elements in some type of order.
f) Merging: Combining two lists into a single list.

LINEAR ARRAYS
A linear array is a list of a finite number n of homogeneous data elements (i.e., data elements
of the same type) such that
a) The elements of the array are referenced respectively by an index set consisting of n
consecutive numbers.
b) The elements of the array are stored respectively in successive memory locations.
The number n of elements is called the length or size of the array. If not explicitly stated,
assume the index set consists of the integers 1,2,…..,n . In general, the length or the number
of data elements of the array can be obtained from the index set by the formula
Length = UB – LB + 1
where UB is the largest index, called the upper bound, and LB is the smallest index, called
the lower bound, of the array.
Ex:

The elements of an array A may be denoted by the subscript notation
A1, A2, A3, ..., An
or by the bracket notation A[1], A[2], A[3], ..., A[N].
The number K in A[K] is called a subscript or an index, and A[K] is called a subscripted
variable.

Each programming language has its own rules for declaring arrays. Each such declaration
must give, implicitly or explicitly, three items of information:
1. The name of the array
2. The data type of the array
3. The index set of the array
Ex: Suppose DATA is a 6-element linear array containing real values. The C language declares
such an array as follows (in C the index set is 0, 1, ..., 5):
float DATA[6];

REPRESENTATION OF LINEAR ARRAYS IN MEMORY


Let LA be a linear array in the memory of the computer. Let us use the notation
LOC(LA[K]) = address of the element LA[K] of the array LA

Computer Memory

The elements of LA are stored in successive memory cells. The computer keeps track of the
address of the first element of LA, denoted by Base(LA) and called the base address of LA.
Using this address Base(LA), the computer calculates the address of any element of LA by
the following formula:
LOC(LA[K]) = Base(LA) + w(K – lower bound)
where w is the number of words per memory cell for the array LA.
Ex: Consider the array AUTO, which records the number of automobiles sold each year from
1932 through 1984. Suppose Base(AUTO) = 200 in memory and w = 4 words per memory cell
for AUTO. Then
LOC(AUTO[1932]) = 200, LOC(AUTO[1933]) = 204, LOC(AUTO[1934]) = 208, ...
The address of the array element for the year K = 1965 can be obtained by using the equation
LOC(AUTO[K]) = Base(AUTO) + w(K – lower bound)
LOC(AUTO[1965]) = 200 + 4(1965 – 1932) = 332
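
The address formula and the length formula can be checked with a short Python sketch that uses the values of the AUTO example. The function names loc and length are assumptions introduced here for illustration.

```python
def length(lower_bound, upper_bound):
    """Number of elements: Length = UB - LB + 1."""
    return upper_bound - lower_bound + 1


def loc(base, w, k, lower_bound):
    """Address of element K: LOC(LA[K]) = Base(LA) + w(K - lower bound)."""
    return base + w * (k - lower_bound)


# Values from the AUTO example: indices 1932..1984, Base = 200, w = 4.
print(length(1932, 1984))            # number of elements in AUTO
print(loc(200, 4, 1965, 1932))       # address of AUTO[1965]
```

Only the base address has to be stored; the address of any other element follows from the index by one multiplication and one addition.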

TRAVERSING LINEAR ARRAYS:


Let A be a collection of data elements stored in the memory of the computer. Suppose we
want to print the contents of each element of A or suppose we want to count the number of
elements of A with a given property. This can be accomplished by traversing A, that is, by
accessing and processing (frequently called visiting) each element of A exactly once.
The following algorithm traverses a linear array LA.

Algorithm: (Traversing a Linear Array) Here LA is a linear array with lower bound LB
and upper bound UB. This algorithm traverses LA, applying an operation PROCESS to each
element of LA.
1. [Initialize counter.] Set K := LB.
2. Repeat Steps 3 and 4 while K ≤ UB.
3. [Visit element.] Apply PROCESS to LA[K].
4. [Increase counter.] Set K := K + 1.
[End of Step 2 loop.]
5. Exit.
The operation PROCESS in the traversal algorithm may use certain variables which must be
initialized before PROCESS is applied to any of the elements in the array.
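
The traversal algorithm can be sketched in Python as follows. This is an illustrative sketch: the function name traverse is an assumption, PROCESS is passed in as a function, and the lower bound is 0 because Python lists are indexed from 0.

```python
def traverse(la, process):
    """Apply process to each element of la exactly once, in index order."""
    k = 0                      # Step 1: LB is 0 for a Python list
    ub = len(la) - 1
    while k <= ub:             # Step 2: repeat while K <= UB
        process(la[k])         # Step 3: visit the element
        k += 1                 # Step 4: increase the counter


# The variable used by PROCESS is initialised before the traversal starts.
visited = []
traverse([8, 4, 19, 2], visited.append)
print(visited)
```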

INSERTING AND DELETING ELEMENTS


Let A be a collection of data elements in the memory of the computer. “Inserting” refers to
the operation of adding another element to the collection A, and “deleting” refers to the
operation of removing one of the elements from A.
Inserting an element at the end of a linear array can be easily done provided the memory space
allocated for the array is large enough to accommodate the additional element. Suppose instead
that an element is to be inserted in the middle of the array. Then, on average, half of the elements
must be moved downward to new locations to accommodate the new element and keep the
order of the other elements. Since linear arrays are usually pictured extending downward (Fig.
4.5), the term downward refers to locations with larger subscripts, and the term upward refers
to locations with smaller subscripts. Deleting an element at the end of an array presents no
difficulties, but deleting an element in the middle of the array requires that each
subsequent element be moved one location upward in order to fill up the array.
The following algorithms show insertion into and deletion from a linear array.

Algorithm: (Inserting into a Linear Array) INSERT(LA, N, K, ITEM)
Here LA is a linear array with N elements and K is a positive integer such that K ≤ N. This
algorithm inserts an element ITEM into the Kth position in LA.
1. [Initialize counter.] Set J := N.
2. Repeat Steps 3 and 4 while J ≥ K.
3. [Move Jth element downward.] Set LA[J + 1] := LA[J].
4. [Decrease counter.] Set J := J – 1.
[End of Step 2 loop.]
5. [Insert element.] Set LA[K] := ITEM.
6. [Reset N.] Set N := N + 1.
7. Exit.

Algorithm: (Deleting from a Linear Array) DELETE(LA, N, K, ITEM)
Here LA is a linear array with N elements and K is a positive integer such that K ≤ N. This
algorithm deletes the Kth element from LA and returns it in ITEM.
1. Set ITEM := LA[K].
2. Repeat for J = K to N – 1:
[Move (J + 1)st element upward.] Set LA[J] := LA[J + 1].
[End of loop.]
3. [Reset the number N of elements in LA.] Set N := N – 1.
4. Exit.
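
The INSERT and DELETE algorithms can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the array is a Python list of fixed capacity in which only the first n cells are in use, positions k are 1-based as in the algorithms (so LA[J] corresponds to la[j - 1]), and the functions return the new element count because Python cannot update N in place. The function names and the bounds checks are additions made here.

```python
def insert(la, n, k, item):
    """Insert item at 1-based position k among the first n elements; return the new n."""
    if n >= len(la):
        raise OverflowError("no free cell left in the array")
    if not 1 <= k <= n + 1:              # k = n + 1 appends at the end
        raise IndexError("k out of range")
    j = n
    while j >= k:                        # move elements downward, last one first
        la[j] = la[j - 1]                # LA[J + 1] := LA[J]
        j -= 1
    la[k - 1] = item                     # LA[K] := ITEM
    return n + 1


def delete(la, n, k):
    """Delete the element at 1-based position k; return (item, new n)."""
    if not 1 <= k <= n:
        raise IndexError("k out of range")
    item = la[k - 1]                     # ITEM := LA[K]
    for j in range(k, n):                # J = K to N - 1
        la[j - 1] = la[j]                # LA[J] := LA[J + 1]
    la[n - 1] = None                     # clear the vacated cell
    return item, n - 1


la = [10, 20, 40, 50, None]              # capacity 5, four elements in use
n = insert(la, 4, 3, 30)                 # la becomes [10, 20, 30, 40, 50]
item, n = delete(la, n, 2)               # removes 20
print(la, n, item)
```

Both operations move every element after position k, which is why insertion or deletion in the middle of a large array is expensive.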

Fig. 4.5

MULTIDIMENSIONAL ARRAYS
Two Dimensional Arrays:
A two-dimensional m x n array A is a collection of m · n data elements such that each element is
specified by a pair of integers (such as J, K), called subscripts, with the property that 1 ≤ J ≤ m
and 1 ≤ K ≤ n. The element of A with first subscript J and second subscript K will be denoted
by A_{J,K} or A[J, K].
Two-dimensional arrays are called matrices in mathematics and tables in business
applications; hence two-dimensional arrays are sometimes called matrix arrays.
There is a standard way of drawing a two-dimensional m x n array A, where the elements of A
form a rectangular array with m rows and n columns and where the element A[J, K] appears
in row J and column K. For example, a two-dimensional 3 x 4 array A is represented as:
A[1, 1]  A[1, 2]  A[1, 3]  A[1, 4]
A[2, 1]  A[2, 2]  A[2, 3]  A[2, 4]
A[3, 1]  A[3, 2]  A[3, 3]  A[3, 4]

Suppose A is a two-dimensional m x n array. The first dimension of A contains the index set
1, ….., m with lower bound 1 and upper bound m ; and the second dimension of A contains
the index set 1, ..., n with lower bound 1 and upper bound n. The length of a dimension
is the number of integers in its index set. The pair of lengths m x n is called the size of the
array.
The length of a given dimension can be obtained from the formula:
Length = upper bound – lower bound + 1
Storage representations
Row-major Representation:
A two-dimensional array can be regarded as a one-dimensional array whose elements are
themselves one-dimensional arrays (the rows). As a result, a two-dimensional array can be
viewed as a single column of rows that is mapped sequentially into memory, one row after
another. Such a representation is called a row-major representation.

Row-major Representation of two-dimensional array


We can calculate the address of the element in the mth row and the nth column of a two-
dimensional array using the formula shown below:
addr(a[m, n]) = (total number of rows present before the mth row x size of a row) + (total
number of elements present before the nth element in the mth row x size of an element)
In the above equation:
total number of rows present before the mth row = (m – lb1), where
lb1 is the first-dimension lower bound.
size of a row = total number of elements present in a row x size of an element.
total number of elements present in a row = (ub2 – lb2 + 1), where ub2 and lb2
are the second-dimension upper and lower bounds.
total number of elements present before the nth element in the mth row = (n – lb2).
Now the above equation can be written as:
addr(a[m, n]) = (( m – lb1 ) x ( ub2 – lb2 + 1 ) x size ) + ( ( n – lb2 ) x size )
Here addr is measured from the start of the array; adding the base address of the array gives
the actual memory address.

Column-major Representation:
We can represent a two-dimensional array as one single row of columns and map it
sequentially. Such a representation is called column-major representation.

Column-major Representation of two-dimensional array


We can calculate the address of the element in the mth row and the nth column of a two-
dimensional array using the formula shown below:
addr(a[m, n]) = (total number of columns present before the nth column x column size) +
(total number of elements present before the mth element in the nth column x size of an
element)
In the above equation:
total number of columns present before the nth column = ( n – lb2 ), where lb2 is the
second-dimension lower bound.
column size = total number of elements present in a column x element size
number of elements in a column = ( ub1 – lb1 + 1 ), where ub1 is the first-dimension upper
bound and lb1 is the first-dimension lower bound.
Now the above equation can be written as:
addr(a[m, n]) = (( n – lb2 ) x ( ub1 – lb1 + 1 ) x size ) + ( ( m – lb1 ) x size )
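
The two address formulas can be compared with a short Python sketch. This is an illustrative sketch: the function names addr_row_major and addr_col_major are assumptions introduced here, and both functions return the offset from the start of the array, as the formulas above do.

```python
def addr_row_major(m, n, lb1, lb2, ub2, size):
    """Offset of a[m, n] when the rows are stored one after another."""
    row_length = ub2 - lb2 + 1                       # elements in one row
    return (m - lb1) * row_length * size + (n - lb2) * size


def addr_col_major(m, n, lb1, ub1, lb2, size):
    """Offset of a[m, n] when the columns are stored one after another."""
    col_length = ub1 - lb1 + 1                       # elements in one column
    return (n - lb2) * col_length * size + (m - lb1) * size


# A 3 x 4 array with lower bounds 1 and one unit of storage per element.
print(addr_row_major(2, 3, lb1=1, lb2=1, ub2=4, size=1))
print(addr_col_major(2, 3, lb1=1, ub1=3, lb2=1, size=1))
```

The same element lands at different offsets in the two layouts, but each layout assigns every element of the array a distinct offset within one contiguous block.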

REPRESENTATION OF TWO-DIMENSIONAL ARRAYS IN MEMORY
Let A be a two-dimensional m x n array. The array will be represented in memory by a block
of m · n sequential memory locations. The programming language stores the array A either column
by column, which is called column-major order, or row by row, which is called row-major order.
The following figure shows a two-dimensional 3 x 4 array.

The computer keeps track of Base(A), the address of the first element A[1, 1] of A, and
computes the address LOC(A[J, K]) of A[J, K] using the formula
Column-major order : LOC(A[J, K]) = Base(A) + w[M(K – 1) + (J – 1)]
Row-major order : LOC(A[J, K]) = Base(A) + w[N(J – 1) + (K – 1)]
where M is the number of rows, N is the number of columns and w denotes the number of
words per memory location for the array A.
Ex: Consider the 25 x 4 matrix array SCORE. Suppose Base(SCORE) = 200 and there are w = 4
words per memory cell. Furthermore, suppose the array is stored in row-major order. Then
the address of SCORE[12, 3], the third test of the twelfth student, is as follows:
LOC(SCORE[12, 3]) = 200 + 4[4(12 – 1) + (3 – 1)] = 200 + 4[46] = 384

SPARSE MATRICES
Matrices with a relatively high proportion of zero entries are called sparse matrices.
Here 2/3 of the total elements in the matrix are zeros. Two general types of n-square sparse
matrices occur in various applications. The first type, where all entries above the
main diagonal are zero, or equivalently where nonzero entries can only occur on or below the
main diagonal, is called a (lower) triangular matrix. Similarly, a square matrix is called upper
triangular if all the entries below the main diagonal are zero.
The second type, where nonzero entries can only occur on the diagonal or on elements
immediately above or below the diagonal, is called a tridiagonal matrix.
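
The two definitions can be expressed as simple checks in Python. This is an illustrative sketch: the function names and the two sample matrices are assumptions made up here, and the matrices are represented as lists of rows with 0-based indices.

```python
def is_lower_triangular(a):
    """True if every entry above the main diagonal of the square matrix a is zero."""
    n = len(a)
    return all(a[j][k] == 0 for j in range(n) for k in range(j + 1, n))


def is_tridiagonal(a):
    """True if nonzero entries occur only on the diagonal or immediately beside it."""
    n = len(a)
    return all(a[j][k] == 0
               for j in range(n) for k in range(n)
               if abs(j - k) > 1)


# Made-up sample matrices for illustration.
lower = [[1, 0, 0],
         [2, 3, 0],
         [4, 5, 6]]
tri = [[1, 2, 0, 0],
       [3, 4, 5, 0],
       [0, 6, 7, 8],
       [0, 0, 9, 1]]

print(is_lower_triangular(lower), is_tridiagonal(tri))
```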

Sorting
Sorting refers to the operation of arranging data in some given order, such as increasing or
decreasing order for numerical data, or alphabetical order for character data. Let A be a list of n
elements A1, A2, A3, ..., An in memory. Sorting A refers to the operation of rearranging
the elements of A so that they are in increasing order, i.e., so that
A[1] ≤ A[2] ≤ A[3] ≤ ... ≤ A[n]
For example, suppose A contains
8, 4,19, 2, 7, 13, 5, 16
After sorting, A is: 2, 4, 5, 7, 8, 13, 16, 19

SELECTION SORT
The algorithm gets its name from the fact that in each iteration the smallest element
for a key position is selected from the list of remaining elements and put in the required
position of the array, i.e., we start the search assuming that the current element is the smallest,
update this assumption whenever we find a smaller element, and finally interchange the smallest
element with the current element. The algorithm is not efficient for large arrays. The method of
selection sort relies on a comparison mechanism to achieve its goals.
Algorithm: SELECTION_SORT(A, N)
Given a vector A of N elements, this procedure rearranges the array in ascending order. The
variable SMALL stores the smallest element found so far in the unsorted part of the vector, and
LOC stores its position.
1. [Examine all the elements of the array]
Repeat Steps 2 to 4 for I = 0, 1, ..., N-2
2. [Assume Ith element is the smallest]
SMALL←A[I], LOC←I
3. [Find the smallest element in the rest of the array]
Repeat for J = I + 1, ..., N-1
[Compare and record]
If A[J] < SMALL then
SMALL←A[J], LOC←J
[End of If structure.]
[End of loop.]
4. [Interchange]
A[I]↔A[LOC]
[End of Step 1 loop.]
5. [Finished]
Exit

Suppose the list of numbers is
21, 14, 42, 9
A[0] = 21, A[1] = 14, A[2] = 42, A[3] = 9
Pass 1:
SMALL = A[0] = 21
Compare A[1] with SMALL. Since 14 < 21, SMALL = 14
Compare A[2] with SMALL. 42 is not less than 14
Compare A[3] with SMALL. Since 9 <14, SMALL = 9
Now bring the smallest element to the position 0 i.e., interchange the smallest element and
the element in the position 0 as: 9, 14, 42, 21
Observe that Pass 1 involves N-1 comparisons. During Pass 1 the smallest element is selected
and placed in the position 0. When Pass 1 is completed, A[0] will contain the smallest
element.

Pass 2:
Now SMALL= A[1] = 14
Compare A[2] with SMALL. 42 is not less than 14
Compare A[3] with SMALL. 21 is not less than 14
List is not altered. Observe that Pass 2 involves N-2 comparisons.
Pass 3:
SMALL = A[2] = 42
Compare A[3] with SMALL. Since 21 < 42, SMALL = 21
Interchange the smallest element and the element in position 2. When Pass 3 is completed,
A[2] will contain the next smallest element of the array and the list is sorted,
i.e., 9, 14, 21, 42
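The passes traced above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the name selection_sort and the variable pos are assumptions, and only the position of the smallest element is tracked, since its value can be read back as a[pos].

```python
def selection_sort(a):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n - 1):              # passes for I = 0 .. N-2
        pos = i                         # assume the Ith element is the smallest
        for j in range(i + 1, n):       # scan the remaining elements
            if a[j] < a[pos]:
                pos = j                 # remember where the smaller element is
        a[i], a[pos] = a[pos], a[i]     # one interchange per pass
    return a


print(selection_sort([21, 14, 42, 9]))  # [9, 14, 21, 42]
```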

BUBBLE SORT
This is one of the best-known sorting algorithms because it is very simple to understand and
implement. The algorithm gets its name from the fact that in each iteration
a number moves like a bubble to its appropriate position. However, the algorithm is not
efficient for large arrays. Bubble sort relies heavily on an exchange mechanism
to achieve its goal, so the method is also called “sorting by exchange”.
Bubble sort works as follows:
The algorithm begins by comparing the first element of the array with the next
element. If the first element is larger than the second element then they are swapped or
interchanged. The process is then repeated for the next two elements. After n-1 comparisons
the largest of all the items has moved to the last position of the array. The entire process till now
forms one pass of comparisons. During the next pass the same steps are repeated from the
beginning of the array, however this time the comparisons involve only the first n-1 elements. The
second pass results in the second largest element moving to its position. The process is
repeated again and again until only two elements are left for comparison. The last pass
ensures that the first two elements of the array are placed in the correct order.
Algorithm: ( Bubble Sort ) BUBBLE(DATA, N)
Here DATA is an array with N elements. This algorithm sorts the elements in DATA.
1. Repeat Steps 2 and 3 for K = 1 to N – 1
2. Set PTR := 1 [Initialize pass pointer PTR.]
3. Repeat while PTR ≤ N – K:
a) If DATA [ PTR ] > DATA [ PTR + 1 ], then:
Interchange DATA [ PTR ] and DATA [ PTR + 1 ].
[ End of If structure.]
b) Set PTR := PTR + 1
[ End of inner loop. ]
[ End of Step 1 outer loop ]
4. Exit

Suppose the list of numbers is 33, 44, 22, 11:
A[0] = 33, A[1] = 44, A[2] = 22, A[3] = 11
PASS 1:
Compare A[0] with A[1]. Since 33 < 44, the list is not altered: 33, 44, 22, 11
Compare A[1] with A[2]. Since 44 > 22, Interchange 22 & 44 as: 33, 22, 44, 11
Compare A[2] with A[3]. Since 44 > 11, Interchange 44 & 11 as: 33, 22, 11, 44
Observe that pass 1 involves N-1 comparisons. During pass 1 the largest element is bubbled
up to (N-1)th position. When pass 1 is completed, A[N-1] will contain the largest element.
PASS 2:
Compare A[0] with A[1]. Since 33 > 22, Interchange 33 & 22 as: 22, 33, 11, 44

Compare A[1] with A[2]. Since 33 > 11, Interchange 33 & 11 as: 22, 11, 33, 44
When pass 2 is completed, A[N-2] will contain the second largest element.
PASS 3:
Compare A[0] with A[1]. Since 22 > 11, Interchange 22 & 11 as: 11, 22, 33, 44
After N-1 passes, the list will be sorted in increasing order.
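The passes above can be sketched in Python as follows. This is a minimal illustration under the assumption of 0-based indexing (as in the worked example); the name bubble_sort is illustrative.

```python
def bubble_sort(data):
    """Sort the list data in place in ascending order and return it."""
    n = len(data)
    for k in range(1, n):                   # passes K = 1 .. N-1
        for ptr in range(n - k):            # pass K makes N-K comparisons
            if data[ptr] > data[ptr + 1]:   # neighbours out of order: interchange
                data[ptr], data[ptr + 1] = data[ptr + 1], data[ptr]
    return data


print(bubble_sort([33, 44, 22, 11]))  # [11, 22, 33, 44]
```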

QUICK SORT
Quicksort is a Divide and Conquer algorithm. It first selects a value, which is called the pivot
value, and partitions the given array around the picked pivot. The role of the pivot value is to
assist with splitting the list. The actual position where the pivot value belongs in the final
sorted list is commonly called the split point. Quick sort follows the steps below:
• Make any element as pivot
• Partition the array on the basis of pivot
• Apply quick sort on left partition recursively
• Apply quick sort on right partition recursively
For example, choose any value from the following list to be the pivot. Here 54 will serve as
our first pivot value. 54 will eventually end up in the position currently holding 31. The
partition process will happen next. It will find the split point and at the same time move
the other items to the appropriate side of the list, according to whether they are less than
or greater than the pivot value.

Fig.: The First Pivot Value for a Quick Sort

Partitioning begins by locating two position markers, let’s call them leftmark and
rightmark, at the beginning and end of the remaining items in the list (positions 1 and 8 in Fig.).
The goal of the partition process is to move items that are on the wrong side with respect to
the pivot value while also converging on the split point. The following Fig. shows this process as
we locate the position of 54.


Fig.: Finding the Split Point for 54

We begin by incrementing leftmark until we locate a value that is greater than the pivot
value. We then decrement rightmark until we find a value that is less than the pivot value. At
this point we have discovered two items that are out of place with respect to the eventual split
point. For our example, this occurs at 93 and 20. Now we can exchange these two items and
then repeat the process again.

At the point where rightmark becomes less than leftmark, we stop. The position of rightmark
is now the split point. The pivot value can be exchanged with the contents of the split point
and the pivot value is now in correct place. In addition, all the items to the left of the split
point are less than the pivot value, and all the items to the right of the split point are greater
than the pivot value. The list can now be divided at the split point and the quick sort can be
invoked recursively on the two halves.

Fig.: Completing the Partition Process to find the split point for 54


QUICKSORT(A,FIRST,LAST):
This algorithm sorts the elements A[FIRST] through A[LAST] of an array A.
1. If FIRST<LAST, then:
SPLITPOINT := PARTITION(A,FIRST,LAST)
QUICKSORT(A,FIRST,SPLITPOINT-1)
QUICKSORT(A,SPLITPOINT+1,LAST)
[End of If structure]
2. Return

PARTITION(A,FIRST,LAST):
This procedure moves the pivot A[FIRST] to its split point and returns that position.
1. [Initialize]
PIVOTVALUE = A[FIRST]
LEFTMARK = FIRST+1
RIGHTMARK = LAST
2. Repeat while LEFTMARK <= RIGHTMARK:
Repeat while LEFTMARK <= RIGHTMARK and A[LEFTMARK] <=
PIVOTVALUE:
LEFTMARK = LEFTMARK + 1
[End of while loop]
Repeat while RIGHTMARK >= LEFTMARK and A[RIGHTMARK] >=
PIVOTVALUE:
RIGHTMARK = RIGHTMARK - 1
[End of while loop]
If LEFTMARK < RIGHTMARK:
A[LEFTMARK] ↔ A[RIGHTMARK]
[End of If structure]
[End of while loop]
3. A[FIRST] ↔ A[RIGHTMARK]
4. Return RIGHTMARK
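
The two procedures can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation. The outer loop runs until the markers cross, and items are exchanged only while leftmark is still to the left of rightmark. The sample list is an assumption chosen to agree with the values mentioned in the text (pivot 54, exchange of 93 and 20, 54 ending where 31 was), because the original figure is not reproduced here. The default arguments of quick_sort are also an assumption added for convenience.

```python
def partition(a, first, last):
    """Move the pivot a[first] to its split point and return that index."""
    pivot_value = a[first]
    left_mark = first + 1
    right_mark = last
    while left_mark <= right_mark:
        # advance left_mark past items that belong on the left side
        while left_mark <= right_mark and a[left_mark] <= pivot_value:
            left_mark += 1
        # retreat right_mark past items that belong on the right side
        while right_mark >= left_mark and a[right_mark] >= pivot_value:
            right_mark -= 1
        if left_mark < right_mark:      # both items are out of place: exchange
            a[left_mark], a[right_mark] = a[right_mark], a[left_mark]
    a[first], a[right_mark] = a[right_mark], a[first]
    return right_mark


def quick_sort(a, first=0, last=None):
    """Sort a[first..last] in place and return the list."""
    if last is None:
        last = len(a) - 1
    if first < last:
        split_point = partition(a, first, last)
        quick_sort(a, first, split_point - 1)
        quick_sort(a, split_point + 1, last)
    return a


data = [54, 26, 93, 17, 77, 31, 44, 55, 20]   # assumed sample list
print(partition(list(data), 0, len(data) - 1))  # 5
print(quick_sort(list(data)))  # [17, 20, 26, 31, 44, 54, 55, 77, 93]
```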

INSERTION SORT
Suppose an array A with n elements A[1],A[2],…………A[n] is in memory. The insertion
sort algorithm scans A from A[1] to A[n], inserting each element A[k] into its proper
position in the previously sorted subarray A[1],A[2],…………A[k-1].
Pass 1: A[1] by itself is trivially sorted.
Pass 2: A[2] is inserted either before or after A[1] so that A[1],A[2] are sorted.
Pass 3: A[3] is inserted into its proper place in A[1], A[2], that is, before A[1], between A[1]
and A[2], or after A[2], so that A[1], A[2], A[3] is sorted.
Pass 4: A[4] is inserted into its proper place in A[1], A[2], A[3] so that A[1], A[2], A[3],
A[4] is sorted.
……
Pass n: A[n] is inserted into its proper place in A[1], A[2],….. A[n-1] so that A[1],
A[2],…. A[n] is sorted.


INSERTION(A, N) : This algorithm sorts the array A with N elements.
1. [ Initialize sentinel element]
Set A[0] := –∞
[A[0] is smaller than every element, so the loop in step 4 always terminates]
2. Repeat step 3 to 5 for K = 2, 3, ……, N
3. Set TEMP := A[K] and PTR := K-1
4. Repeat while TEMP < A[PTR]
Set A[PTR+1] := A[PTR]
Set PTR :=PTR-1
[ End of loop]
5. Set A[PTR+1] :=TEMP
[End of step 2 loop]
6. Return
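
The algorithm can be sketched in Python as follows. This is a minimal illustration with two stated assumptions: the list is indexed from 0, and the sentinel A[0] is replaced by an explicit bounds check (ptr >= 0) in the inner loop. The name insertion_sort is illustrative.

```python
def insertion_sort(a):
    """Sort the list a in place in ascending order and return it."""
    for k in range(1, len(a)):
        temp = a[k]                 # element to insert into the sorted prefix a[0..k-1]
        ptr = k - 1
        # shift larger elements one place to the right;
        # ptr >= 0 plays the role of the sentinel
        while ptr >= 0 and temp < a[ptr]:
            a[ptr + 1] = a[ptr]
            ptr -= 1
        a[ptr + 1] = temp           # drop the element into the gap
    return a


print(insertion_sort([8, 4, 19, 2, 7, 13, 5, 16]))  # [2, 4, 5, 7, 8, 13, 16, 19]
```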


MERGE SORT
This sorting method follows the technique of divide and conquer. The technique of merge
sort works as follows. Given a sequence of ‘N’ elements, the idea is to split it into two halves.
Each half is sorted individually and the two sorted halves are then merged to produce a single
sorted sequence of N elements. Since each half is a problem of the same type as the original,
it is sorted recursively by dividing it further into smaller subsets until each subset is small
enough (a single element) to be sorted without splitting. The sorted subsets are then combined
to obtain a single solution to the entire problem.
The process of dividing a problem into sub problems can be clearly understood by the example
shown in the following figure.

Fig.: Flowchart of the divide step. Is l < r? If yes, m = (l + r)/2.


The following diagram shows the complete merge sort process for an example array
{38, 27, 43, 3, 9, 82, 10}.


If we take a closer look at the diagram, we can see that the array is recursively divided
into two halves till the size becomes 1. Once the size becomes 1, the merge processes
come into action and start merging arrays back till the complete array is merged.
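
The divide and merge steps for the example array can be printed with a short Python trace. This is only a sketch: the function name merge_sort_trace is an assumption, it returns a new list instead of sorting in place, and the merge step is delegated to the standard library's heapq.merge to keep the trace short. The left half takes the extra element when the size is odd, which corresponds to m = (l + r)/2 with inclusive positions.

```python
from heapq import merge


def merge_sort_trace(items, depth=0):
    """Return a new sorted list, printing every split and merge."""
    indent = "  " * depth
    if len(items) <= 1:                 # size 1 (or 0): nothing left to split
        return list(items)
    mid = (len(items) + 1) // 2         # left half gets the extra element
    print(f"{indent}split {items[:mid]} | {items[mid:]}")
    left = merge_sort_trace(items[:mid], depth + 1)
    right = merge_sort_trace(items[mid:], depth + 1)
    merged = list(merge(left, right))   # merge two already sorted lists
    print(f"{indent}merge {left} + {right} -> {merged}")
    return merged


result = merge_sort_trace([38, 27, 43, 3, 9, 82, 10])
print(result)  # [3, 9, 10, 27, 38, 43, 82]
```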

MERGE_SORT(A, LOW, HIGH).


Given a vector A, this procedure sorts the elements in the ascending order. The variables
LOW and HIGH are used to identify the positions of the first and the last elements in each
partition.
1.[Divide the sequence of elements into two nearly equal parts]
If LOW < HIGH then
MID← (LOW + HIGH) / 2 [integer division, discarding any remainder]
[ Recursively sort the elements on the left of the division]
Call MERGE_SORT(A, LOW, MID)
[ Recursively sort the elements on the right of the division]
Call MERGE_SORT(A, MID+1, HIGH)
[ Merge the sorted left and right parts into a single sorted array]
Call MERGE(A, LOW, MID, HIGH)
[End of If structure.]
2.[Finished]
Exit
The condition LOW < HIGH in the If statement allows the procedure to be applied until
it is no longer possible to split the problem. The first two recursive procedure calls continue the
process of splitting until each sub problem can be solved independently. The third procedure
call combines the solved sub problems into a single solution.


Merging starts by comparing the first elements of the two subarrays; the smaller
element is placed in the resultant vector. The process then continues with the next
element of the subarray from which that element was taken. The merging algorithm has the following steps.
MERGE(A, LOW, MID, HIGH): Given a vector A whose two partitions are each already sorted,
this procedure merges them into ascending order. C is a vector to store the result. LOW, MID and HIGH
are the variables used to identify the low, mid and high positions of the elements in each
partition. The first partition is from the position LOW to the position MID and the next
partition is from the position MID+1 to the position HIGH. I, J and K are temporary
index variables.
1.[ Initialize ]
I←LOW
J←MID + 1
K←LOW
2.[ Examine the elements of the vector ]
Repeat thru step 3 while I ≤ MID and J≤HIGH
3. [ Compare and store the Ith element in the resultant vector C ]
If A[I] < A[J] then
C[K]←A[I]
K←K + 1
I←I + 1
[ Otherwise store the Jth element in the resultant vector C]
else
C[K]←A[J]
K←K + 1
J←J + 1
[End of If structure.]
[End of Loop.]
4.[ Examine the elements of the vector till I is less than or equal to MID]
Repeat while I ≤ MID
C[K]←A[I]
K←K + 1
I←I + 1
[End of Loop.]
5.[ Examine the elements of the vector till J is less than or equal to HIGH]
Repeat while J ≤ HIGH
C[K]←A[J]
K←K + 1
J←J + 1
[End of Loop.]
6.[ Assign the elements of C into A vector]
Repeat for I = LOW,…., HIGH
A[I]←C[I]
[End of Loop.]
7.[ Finished]
Exit
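
MERGE_SORT and MERGE can be sketched together in Python as follows. This is a minimal illustration under a few assumptions: positions are 0-based, the result vector C is a local list that grows by appending (so no index K is needed), and the comparison uses <= rather than <, an optional refinement that keeps equal elements in their original order.

```python
def merge(a, low, mid, high):
    """Merge the sorted runs a[low..mid] and a[mid+1..high] in place."""
    c = []                              # resultant vector C
    i, j = low, mid + 1
    while i <= mid and j <= high:       # steps 2-3: take the smaller front element
        if a[i] <= a[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(a[j])
            j += 1
    c.extend(a[i:mid + 1])              # step 4: leftovers of the first partition
    c.extend(a[j:high + 1])             # step 5: leftovers of the second partition
    a[low:high + 1] = c                 # step 6: copy C back into A


def merge_sort(a, low, high):
    """Sort a[low..high] in place in ascending order."""
    if low < high:
        mid = (low + high) // 2         # integer division
        merge_sort(a, low, mid)
        merge_sort(a, mid + 1, high)
        merge(a, low, mid, high)


values = [38, 27, 43, 3, 9, 82, 10]
merge_sort(values, 0, len(values) - 1)
print(values)  # [3, 9, 10, 27, 38, 43, 82]
```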
