
Data Structure and Algorithms Lecture Notes

CPE 202
Introduction
These lecture notes cover the key ideas involved in data structures and algorithms.
We shall see how they depend on the design of suitable data structures, and how
some structures and algorithms are more efficient than others for the same task.
We will concentrate on a few basic tasks, such as storing, sorting and searching
data, that underlie much of computer science, but the techniques discussed will be
applicable much more generally.
We will start by studying some key data structures, such as arrays, lists, queues,
stacks and trees, and then move on to explore their use in a range of different
searching and sorting algorithms. This leads on to the consideration of approaches
for more efficient storage of data in hash tables. Finally, we will look at graph
based representations and cover the kinds of algorithms needed to work efficiently
with them. Throughout, we will investigate the computational efficiency of the
algorithms we develop, and gain intuitions about the pros and cons of the various
potential approaches for each task. We will mostly write our code in pseudocode
or Java.

1.1 Algorithms as opposed to programs


An algorithm for a particular task can be defined as "a finite sequence of
instructions, each of which has a clear meaning and can be performed with a finite
amount of effort in a finite length of time". As such, an algorithm must be precise
enough to be understood by human beings. However, in order to be executed by a
computer, we will generally need a program that is written in a rigorous formal
language; and since computers are quite inflexible compared to the human mind,
programs usually need to contain more details than algorithms. Here we shall
ignore most of those programming details and concentrate on the design of
algorithms rather than programs.
The task of implementing the discussed algorithms as computer programs is
important, of course, but these notes will concentrate on the theoretical aspects and
leave the practical programming aspects to be studied elsewhere. Having said that,
we will often find it useful to write down segments of actual programs in order to
clarify and test certain theoretical aspects of algorithms and their data structures. It
is also worth bearing in mind the distinction between different programming
paradigms: Imperative Programming describes computation in terms of
instructions that change the program/data state, whereas Declarative Programming
specifies what the program should accomplish without describing how to do it.
These notes will primarily be concerned with developing algorithms that map
easily onto the imperative programming approach.
Algorithms can obviously be described in plain English, and we will
sometimes do that. However, for computer scientists it is usually easier and clearer
to use something that comes somewhere in between formatted English and
computer program code, but is not runnable because certain details are omitted.
This is called pseudocode, which comes in a variety of forms. Often these notes
will present segments of pseudocode that are very similar to the languages we are
mainly interested in, namely the overlap of C and Java, with the advantage
that they can easily be inserted into runnable programs.
1.2 Fundamental questions about algorithms
Given an algorithm to solve a particular problem, we are naturally led to ask:
1. What is it supposed to do?
2. Does it really do what it is supposed to do?
3. How efficiently does it do it?

The technical terms normally used for these three aspects are:
1. Specification.
2. Verification.
3. Performance analysis.
The details of these three aspects will usually be rather problem dependent.
The specification should formalize the crucial details of the problem that the
algorithm is intended to solve. Sometimes that will be based on a particular
representation of the associated data, and sometimes it will be presented more
abstractly. Typically, it will have to specify how the inputs and outputs of the
algorithm are related, though there is no general requirement that the specification
is complete or unambiguous.
For simple problems, it is often easy to see that a particular algorithm will always
work, i.e. that it satisfies its specification. However, for more complicated
specifications and/or algorithms, the fact that an algorithm satisfies its specification
may not be obvious at all.
In this case, we need to spend some effort verifying whether the algorithm is
indeed correct. In general, testing on a few particular inputs can be enough to show
that the algorithm is incorrect. However, since the number of different potential
inputs for most algorithms is infinite in theory, and huge in practice, more than just
testing on particular cases is needed to be sure that the algorithm satisfies its
specification. We need correctness proofs. Although we will discuss proofs in
these notes, and useful relevant ideas like invariants, we will usually only do so in
a rather informal manner (though, of course, we will attempt to be rigorous).
The reason is that we want to concentrate on the data structures and algorithms.
Formal verification techniques are complex and will normally be left till after the
basic ideas of these notes have been studied.

Finally, the efficiency or performance of an algorithm relates to the resources
required by it, such as how quickly it will run, or how much computer memory it
will use. This will usually depend on the problem instance size, the choice of data
representation, and the details of the algorithm. Indeed, this is what normally
drives the development of new data structures and algorithms.
1.3 Data structures, abstract data types, design patterns
For many problems, the ability to formulate an efficient algorithm depends on
being able to organize the data in an appropriate manner. The term data structure is
used to denote a particular way of organizing data for particular types of operation.
These notes will look at numerous data structures ranging from familiar arrays and
lists to more complex structures such as trees, heaps and graphs, and we will see
how their choice affects the efficiency of the algorithms based upon them.
Often we want to talk about data structures without having to worry about all the
implementation details associated with particular programming languages, or how
the data is stored in computer memory. We can do this by formulating abstract
mathematical models of particular classes of data structures or data types which
have common features. These are called abstract data types, and are defined only
by the operations that may be performed on them. Typically, we specify how they
are built out of more primitive data types (e.g., integers or strings), how to extract
that data from them, and some basic checks to control the flow of processing in
algorithms. The idea that the implementation details are hidden from the user
and protected from outside access is known as encapsulation. We shall see many
examples of abstract data types throughout these notes. At an even higher level of
abstraction are design patterns, which describe the design of algorithms rather than
the design of data structures. These embody and generalize important design
concepts that appear repeatedly in many problem contexts. They provide a general
structure for algorithms, leaving the details to be added as required for particular
problems. These can speed up the development of algorithms by providing
familiar, proven algorithm structures that can be applied straightforwardly to new
problems. We shall see a number of familiar design patterns throughout these notes.

Algorithm Analysis
As the “size” of an algorithm’s input grows (integer, length of array, size of queue,
etc.), we want to know
– How much longer does the algorithm take to run? (time)
– How much more memory does the algorithm need? (space)
Because the growth curves we saw are so different, we often care only about
"which curve we are like" – that is, the overall rate of growth rather than the exact constants.
A separate issue is algorithm correctness – does it produce the right answer for all
inputs?
– Usually more important, naturally

n(n+1)/2 vs. just n²/2: for large n the two differ only in lower-order terms, so both lie on the same quadratic curve.

Big O running times
For a processor capable of one million instructions per second
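To make the contrast concrete, here is a small illustrative Java sketch (not part of the original notes; the class and method names are made up). It computes the sum 1 + 2 + ... + n both with a loop, whose running time grows linearly with n, and with the closed-form formula n(n+1)/2, whose running time is constant regardless of n.

public class GrowthDemo {
    // O(n): the amount of work grows in step with the input size n
    static long sumByLoop(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // O(1): the same small amount of work no matter how large n is
    static long sumByFormula(long n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println(sumByLoop(n));    // 500000500000
        System.out.println(sumByFormula(n)); // 500000500000, computed without any loop
    }
}

Both versions give the same answer; what changes as n grows is only how long the first one takes, and that is exactly the kind of difference Big O notation captures.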
Arrays
An array is an indexed collection of data elements of the same type.
1) Indexed means that the array elements are numbered, starting at index 0 and
ending at index n-1 for an array of n elements.
2) The restriction of the same type is an important one, because arrays are stored
in consecutive memory cells. Every cell must be the same type (and therefore, the
same size).
Declaring Arrays:
An array declaration is similar to the form of a normal declaration (typeName
variableName), but we add on a size:
typeName variableName[size];

This declares an array with the specified size, named variableName, of type
typeName. The array is indexed from 0 to size-1. The size (in brackets) must be an
integer literal or a constant variable.

Examples:
int list[30]; // an array of 30 integers
String name[20]; // an array of 20 strings
double nums[50]; // an array of 50 decimals
int table[5][10]; // a two dimensional array of integers
The last example illustrates a two dimensional array (which we often like to think
about as a table). We usually think of the first size as rows, and the second as
columns, but it really does not matter, as long as you are consistent! So, we could
think of the last declaration as a table with 5 rows and 10 columns, for example.
Initializing Arrays:
With normal variables, we could declare on one line, then initialize on the next:
int x;
x = 0;
Or, we could simply initialize the variable in the declaration statement itself:
int x = 0;
Can we do the same for arrays? Yes, for the built-in types. Simply list the array
values (literals) in set notation { } after the declaration. Here are some examples:

int list[] = {2, 4, 6, 8};


char letters[] = {'a', 'e', 'i', 'o', 'u'};
double numbers[] = {3.45, 2.39, 9.1};
int table[][] = {{2, 5} , {3,1} , {4,9}};
Character arrays are a special case, because we use strings so often.
Examples:
char ar[] = new char[10];
int darrays[][] = {{3, 8}, {5, -3}, {100, 200}};
int num[] = {3, 5, -67, 100, 89, 30};

// print every element of num
for (int i = 0; i < num.length; i++) {
    System.out.println(num[i]);
}
System.out.println(" ");

// print every element of the two-dimensional array darrays
for (int i = 0; i < darrays.length; i++) {
    for (int j = 0; j < darrays[i].length; j++) {
        System.out.println(darrays[i][j]);
    }
}
int list[] = {1, 3, 5, 7, 9}; // size is 5
Note: Using initializers on the declaration, as in the examples above, is probably
not going to be as feasible (or desirable) with very large arrays.
Another common way to initialize an array -- with a for loop:
This example initializes the array numList to {0, 2, 4, 6, 8, 10, 12, 14, 16, 18}.
int numList[10];
for (int i = 0; i < 10; i++){
numList[i] = i * 2;
}
Using Arrays:
Once your arrays are declared, you access the elements in an array with the array
name, and the index number inside brackets [ ]. If an array is declared as:
typeName varName[size], then the element with index n is referred to as
varName[n]. Examples:
int x, list[5]; // declaration
double nums[10]; // declaration

list[3] = 6; // assign value 6 to array item with index 3
System.out.println(nums[2]); // output array item with index 2
list[x] = list[x+1]; // copy the next item into position x (assumes x holds a valid index)
It would not be appropriate, however, to use an array index that is outside the
bounds of the valid array indices:

list[5] = 10; // bad statement, since your valid indices are 0 - 4.

The statement above is syntactically legal, however. It is the programmer's job to
make sure that out-of-bounds indices are not used. Do not count on the compiler to
check it for you -- it will not! (In Java the mistake only shows up at run time, as an
ArrayIndexOutOfBoundsException.)

Copying arrays:
If we have these two arrays, how do we copy the contents of list2 to list1?
int list1[5];
int list2[5] = {3, 5, 7, 9, 11};
With variables, we use the assignment statement, so this would be the natural
tendency -- but it is wrong!

list1 = list2; // this does NOT copy the array contents


We must copy between arrays element by element. A for loop makes this easy,
however:

for (int i = 0; i < 5; i++)
list1[i] = list2[i];
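As an aside (not covered in the notes themselves), the Java standard library also provides ready-made helpers for copying arrays; the short sketch below shows two of them, using the same data as above.

import java.util.Arrays;

public class CopyDemo {
    public static void main(String[] args) {
        int[] list2 = {3, 5, 7, 9, 11};

        // Arrays.copyOf creates a brand-new array and copies the elements into it
        int[] list1 = Arrays.copyOf(list2, list2.length);

        // System.arraycopy copies into an existing array of sufficient size
        int[] list3 = new int[5];
        System.arraycopy(list2, 0, list3, 0, list2.length);

        System.out.println(Arrays.toString(list1)); // [3, 5, 7, 9, 11]
        System.out.println(Arrays.toString(list3)); // [3, 5, 7, 9, 11]
    }
}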

Simple I/O in JAVA


Strings:
Strings are a special case: they can be used much like arrays of characters, where
accessing a single element means accessing one character. In Java, the method
charAt(index) is used to access the characters of a String.
Strings can also be output and input in their entirety, with the standard input and
output objects.
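For example, a brief illustrative snippet:

String word = "queue";
System.out.println(word.length());  // 5, the number of characters
System.out.println(word.charAt(0)); // q, the first character (index 0)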

import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        System.out.print("Enter a number: ");
        int n = scan.nextInt();
        System.out.println("Square of " + n + " is " + n * n);
    }
}

Stacks and Queues
Introduction
There are certain frequent situations in computer science when one wants to restrict insertions and
deletions so that they can take place only at the beginning or at the end of the list, not in the middle.
Two of the data structures that are useful in such situations are stacks and queues.

Stacks
A stack is an ordered list in which there is only one end, used for both insertions and deletions. Elements are
inserted and deleted from the same end, called the top of the stack. A stack is called a Last In First Out
(LIFO) list, since the first element put into the stack will be the last element taken out of the stack.
In particular, the elements are removed from a stack in the reverse order of that in which they were
inserted into the stack.
Basic operations associated with stacks are:
 Push: Insertion of any element is called "Push" operation.
 Pop: Deletion from the stack is called the "Pop" operation.
The most and least accessible elements in a stack are known as the "Top" and "Bottom" of the stack
respectively.
A common example of a stack, which permits access only to its end element, is a pile of trays in a
cafeteria. Trays can be added to the pile only at the top and removed only from the top.

Example

Suppose following 6 digits are pushed, in order, onto an empty stack: 1, 2, 3, 4, 5, 6.


The following figure shows three ways of picturing such a stack. When these elements are popped
from the stack, the order will be: 6, 5, 4, 3, 2, 1.

Fig: Diagrams of Stack
The implication is that the rightmost element is the top element. Regardless of the way a stack is
pictured, its underlying property is that insertions and deletions can occur only at the top of the stack.
This means 5 cannot be deleted before 6, 4 cannot be deleted before 5 and 6 are deleted, and so on.
Consequently, the elements may be popped from the stack only in the reverse order of that in which
they were pushed onto the stack.

Representation of Stacks
Stacks may be represented in the computer in various ways, usually by means of a one-way list or a linear
array.
The location of the top element of the stack is stored in an integer variable TOP.
The condition TOP = 0 or TOP = NULL will indicate that the stack is empty.
When we represent any stack through an array, we have to predefine the size of the stack, and we cannot
enter more elements than that predefined size, say MAX.
Whenever any element is added to the stack the value of TOP is increased by 1. This can be
implemented as :
TOP = TOP+1
and whenever any element is deleted from the stack the value of TOP is decreased by 1, this can be
implemented as :
TOP = TOP-1
The operation of adding (pushing) an item onto a stack and the operation of removing (popping) an
item from a stack may be implemented respectively by the following algorithms, called PUSH and
POP. In executing the procedure PUSH, we must first test whether there is room in the stack for the
new item; if not, then we have the condition known as overflow. Analogously, in executing the
procedure POP, we must first test whether there is an element in the stack to be deleted; if not, then we
have the condition known as underflow.
Algorithm: [This algorithm pushes an item onto a STACK.]
Step 1: START
Step 2: PUSH (STACK, ITEM)
Step 3: [stack already full?]
If TOP = MAX, then Print: Overflow and goto Step 6.
Step 4: TOP = TOP + 1 [Increase TOP by 1]
Step 5: STACK [TOP] = ITEM [Insert ITEM in new TOP position]
Step 6: STOP

Algorithm: [This algorithm deletes the top element of STACK and assigns it to ITEM.]

Step 1: START
Step 2: POP (STACK, ITEM)
Step 3: [stack has an item to be removed?]
If TOP = 0, then Print: Underflow and goto Step 6.
Step 4: ITEM = STACK[TOP] [Save the top element in ITEM]
Step 5: TOP = TOP - 1 [Decrease TOP by 1]
Step 6: STOP
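One possible Java rendering of these two algorithms is the small array-based stack sketched below; the class and method names are illustrative rather than part of the original notes, and indices run from 0 as usual in Java (so an empty stack is marked by top = -1 instead of TOP = 0).

public class ArrayStack {
    private final int[] stack;
    private int top = -1; // -1 means the stack is empty

    public ArrayStack(int max) {
        stack = new int[max]; // MAX, the predefined size of the stack
    }

    // PUSH: test for overflow, then increase top and store the item there
    public void push(int item) {
        if (top == stack.length - 1) {
            throw new IllegalStateException("Overflow");
        }
        stack[++top] = item;
    }

    // POP: test for underflow, then return the top item and decrease top
    public int pop() {
        if (top == -1) {
            throw new IllegalStateException("Underflow");
        }
        return stack[top--];
    }

    public boolean isEmpty() {
        return top == -1;
    }
}

Pushing 1, 2, 3, 4, 5, 6 and then calling pop() repeatedly returns 6, 5, 4, 3, 2, 1, matching the LIFO behaviour described above.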

Example

Consider the following stack of characters, where STACK is allocated N = 8 memory cells:
STACK: A C D F K
Describe the stack as the following operations take place:
a) POP
b) POP
c) PUSH (L)
d) PUSH (P)
e) POP
f) PUSH (R)
g) PUSH (S)
h) POP
Solution: The POP always deletes the top element from the stack, and the PUSH always adds the
elements to the top of the stack.
a) POP : A C D F

b) POP : A C D

c) PUSH (L) : A C D L

d) PUSH (P) : A C D L P

e) POP: A C D L

f) PUSH (R) : A C D L R

g) PUSH (S) :
A C D L R S

h) POP: A C D L R

Student Activity 3.1


1. What are stacks?
2. With an example, explain the representation of stacks.
3. Consider the following stack, where STACK is allocated N = 10 memory cells:
STACK: X P Y 6 M B 8
Describe the stack as the following operations take place:
i) PUSH (T)
j) POP
k) PUSH (S)
l) POP
m) POP
n) PUSH (Q)
o) PUSH (9)

Queues
Queues arise quite naturally in the computer solution of many problems. Perhaps the most common
occurrence of a queue in Computer Applications is for the scheduling of jobs.
A queue is a linear list which has two ends, one for insertion of elements, called the REAR, and the other
for deletion of elements, called the FRONT. Elements are inserted at the rear end and deleted from the front end.
A queue is called a First In First Out (FIFO) list, since the first element in a queue will be the first
element out of the queue. In other words, the order in which the elements enter a queue is the order in
which they leave.
Queues abound in everyday life. For example the people waiting in line at a bank form a queue, where
the first person in the line is the first person to be waited on.

Representation of Queues
Queues may be represented in the computer in various ways, usually by means of one-way lists or linear
arrays. Each queue will be maintained by a linear array QUEUE[ ] and two integer variables FRONT
and REAR, containing the location of the front element of the queue and the location of the rear
element of the queue. The condition FRONT = NULL will indicate that the queue is empty.
Following figure shows the three ways of picturing such a queue for the elements: 1, 2, 3, 4, 5, 6.

Fig: Diagrams of Queue
In executing the insertion operation, we must first test whether there is room in the queue for the new
item; if not, then we have the condition known as overflow. Analogously, in executing the deletion
operation, we must first test whether there is an element in the queue to be deleted; if not, then we have
the condition known as underflow.

Whenever an element is added to the queue, the value of REAR is increased by 1. This can be
implemented as:
REAR = REAR + 1
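A simple array-based queue along these lines might look like the Java sketch below. The names are illustrative, indices are 0-based, and for brevity this version advances FRONT on deletion instead of shifting the remaining elements to the left; either convention gives the same FIFO behaviour.

public class ArrayQueue {
    private final char[] queue;
    private int front = 0; // index of the front element
    private int rear = -1; // index of the rear element; rear < front means the queue is empty

    public ArrayQueue(int max) {
        queue = new char[max];
    }

    // Insertion at the rear: REAR = REAR + 1, then store the item (overflow if full)
    public void insert(char item) {
        if (rear == queue.length - 1) {
            throw new IllegalStateException("Overflow");
        }
        queue[++rear] = item;
    }

    // Deletion from the front: take the front item, then move FRONT forward (underflow if empty)
    public char delete() {
        if (front > rear) {
            throw new IllegalStateException("Underflow");
        }
        return queue[front++];
    }

    public boolean isEmpty() {
        return front > rear;
    }
}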

Example

Consider the following queue, where QUEUE is allocated N = 8 memory cells:


QUEUE: M O R D (FRONT = 1, REAR = 4)

Describe the queue as the following operations take place:
a) F is added to the queue.
b) L, T are added to the queue.
c) Two elements are deleted.
d) X is added to the queue.
e) U is added to the queue.
f) One element is deleted.

Solution:
a) F is added to the queue: M O R D F (FRONT = 1, REAR = 5)
b) L, T are added to the queue: M O R D F L T (FRONT = 1, REAR = 7)
c) Two elements are deleted: R D F L T (FRONT = 1, REAR = 5)
d) X is added to the queue: R D F L T X (FRONT = 1, REAR = 6)
e) U is added to the queue: R D F L T X U (FRONT = 1, REAR = 7)
f) One element is deleted: D F L T X U (FRONT = 1, REAR = 6)

Student Activity 3.2


1. What is queue?
2. Diagramatically represent a Queue with an example.
3. Consider the following queue, where QUEUE is allocated N = 11 memory cells:
QUEUE: 9 10 5 (FRONT = 1, REAR = 3)

Describe the queue as the following operations take place:
g) One element is deleted.
h) 18 is added to the queue.
i) 20 is added to the queue.
j) Two elements are deleted.
k) 12, 4, 23 are added to the queue.
l) Three elements are deleted.

Java ArrayList
The ArrayList class is a resizable array, which can be found in the java.util package.

The difference between a built-in array and an ArrayList in Java is that the size of an array
cannot be modified (if you want to add or remove elements to/from an array, you have to create a
new one), while elements can be added to and removed from an ArrayList whenever you want.
The syntax is also slightly different:

Example
Create an ArrayList object called cars that will store strings:
import java.util.ArrayList; // import the ArrayList class
ArrayList<String> cars = new ArrayList<String>(); // Create an ArrayList object

Add Items
The ArrayList class has many useful methods. For example, to add elements to the ArrayList,
use the add() method:

Example
import java.util.ArrayList;

public class Main {


public static void main(String[] args) {
ArrayList<String> cars = new ArrayList<String>();
cars.add("Volvo");
cars.add("BMW");
cars.add("Ford");
cars.add("Mazda");
System.out.println(cars);
}
}

Access an Item
To access an element in the ArrayList, use the get() method and refer to the index number:

Example
System.out.println(cars.get(0));
This will print the first element of the ArrayList.

Change an Item
To modify an element, use the set() method and refer to the index number:
Example
cars.set(0, "Opel");

Remove an Item
To remove an element, use the remove() method and refer to the index number:

Example
cars.remove(0);

To remove all the elements in the ArrayList, use the clear() method:
Example
cars.clear();

ArrayList Size
To find out how many elements an ArrayList has, use the size() method:
Example
cars.size();
Loop Through an ArrayList

Loop through the elements of an ArrayList with a for loop, and use the size() method to specify
how many times the loop should run:

Example
import java.util.ArrayList;

public class Main {
public static void main(String[] args) {
ArrayList<String> cars = new ArrayList<String>();
cars.add("Volvo");
cars.add("BMW");
cars.add("Ford");
cars.add("Mazda");
for (int i = 0; i < cars.size(); i++) {
System.out.println(cars.get(i));
}
}
}

You can also loop through an ArrayList with the for-each loop:
Example
import java.util.ArrayList;

public class Main {
public static void main(String[] args) {
ArrayList<String> cars = new ArrayList<String>();
cars.add("Volvo");
cars.add("BMW");
cars.add("Ford");
cars.add("Mazda");
for (String i : cars) {
System.out.println(i);
}
}
}

Linked List
A linked list is a linear collection of data elements, called nodes, where the linear order is given
by means of pointers. That is, each node is divided into two parts: the first part contains the
information of the element, and the second part, called the link field or next pointer field, contains
the address of the next node in the list.
The following figure is a schematic diagram of a linked list with 6 nodes. Each node is pictured
with two parts. The left part represents the information part of the node, which may contain an
entire record of data items (e.g., name, address,….). The right part represents the next pointer field
of the node, and there is an arrow drawn from it to the next node in the list. The pointer of the last
node contains a special value, called the null pointer, which is any invalid address.
The linked list also contains a list pointer variable, called START or NAME, which contains the
address of the first node in the list. We need only this address in START to trace through the list.

Example: A hospital ward contains 12 beds, of which 9 are occupied as shown in the following
figure.

Suppose we want an alphabetical listing of the patients. This listing may be given by the pointer
field, called Next in the figure. We use the variable START to point to the first patient. Hence
START contains 5, since the first patient, Abubakar, occupies bed 5. Also, Abubakar's pointer is
equal to 3, since Rahab, the next patient, occupies bed 3; Rahab's pointer is 11, since Ahmad,
the next patient, occupies bed 11; and so on. The entry for the last patient, Zara, contains the null
pointer, denoted by 0.

Representation of Linked Lists in Memory


Let LIST be a linked list. Then LIST will be maintained in memory as follows. First of all, LIST
requires two linear arrays – we will call them here INFO and LINK – such that INFO[K] contains
the information part and the LINK[K] contains next pointer field of a node of LIST. LIST also
requires a variable name – such as START – which contains the location of the beginning of the
list and a next pointer sentinel – denoted by NULL – which indicates the end of the list. We will
represent NULL by 0.
The following example of linked lists indicate that the nodes of a list need not occupy adjacent
elements in the arrays INFO and LINK, and that more than one list may be maintained in the same
linear arrays INFO and LINK. However, each list must have its own pointer variable giving the
location of its first node.

We can obtain the actual list of characters as follows:


START = 9, so INFO[9] = N is the first character.
LINK[9] = 3, so INFO[3] = O is the second character.
LINK[3] = 6, so INFO[6] = blank is the third character.

LINK[6] = 11, so INFO[11] = E is the fourth character.
LINK[11] = 7, so INFO[7] = X is the fifth character.
LINK[7] = 10, so INFO[10] = I is the sixth character.
LINK[10] = 4, so INFO[4] = T is the seventh character.
LINK[4] = 0, the NULL value, so the list has ended.

Example: Find the character strings stored in the following linked lists:

START
5

Solution: We can obtain the actual string as follows:


START = 5, so INFO[5] = Abuja is the first value.
LINK[5] = 2, so INFO[2] = Delhi is the second value.
LINK[2] = 4, so INFO[4] = Lagos is the third value.
LINK[4] = 6, so INFO[6] = Lucknow is the fourth value.
LINK[6] = 9, so INFO[9] = Deoria is the fifth value.
LINK[9] = 10, so INFO[10] = Noida is the sixth value.
LINK[10] = 0, the NULL value, so the list has ended.
Hence the string is Abuja, Delhi, Lagos, Lucknow, Deoria, Noida.
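In Java the INFO and LINK arrays are usually replaced by node objects, each holding its information part together with a reference to the next node; the minimal sketch below (illustrative names, not from the original text) builds a short list and traverses it from the start node until it reaches the null pointer.

public class LinkedListDemo {
    // One node: the information part plus the next-pointer (link) field
    static class Node {
        String info;
        Node link;

        Node(String info, Node link) {
            this.info = info;
            this.link = link;
        }
    }

    public static void main(String[] args) {
        // Build the list Abuja -> Delhi -> Lagos; null plays the role of the null pointer
        Node start = new Node("Abuja", new Node("Delhi", new Node("Lagos", null)));

        // Traverse from START, following the link field until the end of the list
        for (Node current = start; current != null; current = current.link) {
            System.out.println(current.info);
        }
    }
}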

Student Activity 1
1. Define linked list with an example.

2. Find the character strings stored in the following linked lists:

TREES

Introduction
So far we have been studying mainly linear types of data structures such as arrays, lists, stacks and
queues; now we will study a non-linear data structure called a tree. This structure is mainly used to
represent data containing a hierarchical relationship between elements. Familiar examples of such structures
are family trees, the hierarchy of positions in an organization, and an algebraic expression involving operations
for which certain rules of precedence are prescribed. For example, suppose we wish to use a data
structure to represent a person and all of his or her descendants. Assume that the person's name is Alan and
he has 3 children, John, Joe and Johnson. Also suppose that Joe has 3 children, Alec, Marc and Chris, and
Johnson has one child, Peter. We can represent Alan and his descendants quite naturally with the tree
structure shown in Figure 5.1.

Figure 5.1

Basic Terminology about Trees


Definition of Tree: A tree is a finite set of one or more nodes such that there is a specially designated node
called the root, and the remaining nodes are partitioned into n ≥ 0 disjoint sets S1, ..., Sn, where each of these sets
is a tree. S1, ..., Sn are called the subtrees of the root. If we look at Fig 5.1 we see that the root of the tree is
Alan. The tree has three subtrees whose roots are John, Joe and Johnson.

The condition that S1....Sn be disjoint sets prohibits subtrees from ever connecting together. It means a tree
does not contain cycle.

Node (Vertex): A node stands for the item of information plus the branches to other items. Consider the
Tree of Fig 5.1 it has 8 nodes.

Edge: An edge is a line that connects two nodes. The tree of Fig 5.1 has 7 edges.

Degree: The number of edges connected to a given node is called its degree. In Fig 5.1 the degree of node
Alan is 3.

Leaf or Terminal Nodes: Nodes with no successors are called leaf or Terminal nodes. In Fig 5.1, John,
Alec, Marc, Chris and Peter are 'Leaf' nodes; other nodes of Tree are called 'NonLeaf' nodes.

Children: The roots of the subtrees of a node I are called the children of node I. I is the 'parent' of its
children.

Siblings: Children of the same parent are called 'Siblings'. Alec, Marc, Chris are Siblings.

Level: The 'level' of a node is defined by initially letting the root be at level 1. If a node is at level l, then
its children are at level l+1.

Height or Depth : The height or depth of a Tree is defined as the maximum level of any node in the Tree.

Forest: A 'forest' is a set of n ≥ 0 disjoint trees.

Binary Trees
A Binary Tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets. The
first subset contains a single element called the Root of the tree. The other two subsets are themselves
Binary Trees, called the left and right subtrees of the original tree. A left or right subtree can be empty. Fig
5.2 shows a typical Binary Tree. A node of a Binary Tree can have at most two Branches.

Figure 5.2 : A Binary Tree

If A is the root of a Binary Tree and B,C are the roots of its left and right subtrees respectively then A is
said to be the father of B, C and B, C are called Left and Right Sons respectively. If every Non Leaf node
in a Binary Tree has non-empty left and right subtrees, the tree is termed a Strictly Binary Tree. The
Binary Tree of Fig 5.2 is a Strictly Binary Tree. A Strictly Binary Tree with n leaves always contains
2n − 1 nodes.
A Complete Binary Tree is a Strictly Binary Tree of depth d all of whose leaves are at level d. Fig 5.3
shows a complete Binary Tree.

Figure 5.3 : A Complete Binary Tree

Student Activity 1
1. What is a node?
2. What is the difference between leaf and children?
3. What is forest?
4. What is a binary tree?
5. What are the various forms of binary tree?
Theorem 1: The maximum number of nodes on level i of a binary tree is 2^(i-1), for i ≥ 1.

Binary Tree Traversal


The traversal of a Binary Tree is to visit each node in the tree exactly once. A full traversal produces a
linear order for the information in a tree. When traversing a binary tree we want to treat each node and its
subtrees in the same fashion. There are three standard ways of traversing a binary tree. These are called
PREORDER, INORDER and POSTORDER.

Preorder
To traverse a non empty tree in preorder we perform the following three operations:
i. Visit the root
ii. Traverse the left Subtree in Preorder
iii. Traverse the right Subtree in Preorder.

Inorder
i. Traverse the left Subtree in Inorder
ii. Visit the root
iii. Traverse the right Subtree in Inorder.

Postorder
i. Traverse the Left Subtree in Postorder.
ii. Traverse the right Subtree in Postorder.
iii. Visit the root.

Example 1

Consider the following Binary Tree

 The Preorder traversal of the above binary tree is: A B D E C F H I G
 The Inorder traversal of the above binary tree is: D B E A H F I C G
 The Postorder traversal of the above binary tree is: D E B H I F G C A
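The three traversals follow directly from their recursive definitions. The Java sketch below (illustrative class names, not from the original text) stores one character per node, builds the tree of Example 1 as reconstructed from its preorder and inorder sequences, and prints all three orders.

public class TreeTraversal {
    static class Node {
        char value;
        Node left, right;

        Node(char value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Preorder: visit the root, then the left subtree, then the right subtree
    static void preorder(Node n) {
        if (n == null) return;
        System.out.print(n.value);
        preorder(n.left);
        preorder(n.right);
    }

    // Inorder: left subtree, then the root, then the right subtree
    static void inorder(Node n) {
        if (n == null) return;
        inorder(n.left);
        System.out.print(n.value);
        inorder(n.right);
    }

    // Postorder: left subtree, then the right subtree, then the root
    static void postorder(Node n) {
        if (n == null) return;
        postorder(n.left);
        postorder(n.right);
        System.out.print(n.value);
    }

    public static void main(String[] args) {
        // The tree of Example 1: A with subtrees B(D, E) and C(F(H, I), G)
        Node root = new Node('A',
                new Node('B', new Node('D', null, null), new Node('E', null, null)),
                new Node('C',
                        new Node('F', new Node('H', null, null), new Node('I', null, null)),
                        new Node('G', null, null)));
        preorder(root);  System.out.println(); // ABDECFHIG
        inorder(root);   System.out.println(); // DBEAHFICG
        postorder(root); System.out.println(); // DEBHIFGCA
    }
}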

Student Activity 2
1. Why is the traversal of binary tree required?
2. What are the various traversal orders of a tree ?

Binary Search Tree


A binary tree is called a binary search tree if, for every node N, all elements in the left subtree of N are
less than the contents of N, and all elements in the right subtree of N are greater than or equal to the
contents of N.
Consider the following binary tree.

The above tree is a binary search tree; that is, every node exceeds every number in its left subtree and
is less than every number in its right subtree. Suppose the 23 were replaced by 35. Then the above
binary tree would still be a binary search tree. On the other hand, suppose the 23 were replaced by 40.
Then the above binary tree would not be a binary search tree, since the 38 would not be greater than
the 40 in its left subtree.

Searching and inserting in Binary Search Trees


Suppose T is a binary search tree. Suppose an ITEM of information is given. The following algorithm
finds the location of ITEM in the binary search tree T, or inserts ITEM as a new node in its appropriate
place in the tree.
a) Compare ITEM with the root node N of the tree
i. If ITEM < N, proceed to the left child of N.
ii. If ITEM > N, proceed to the right child of N.
b) Repeat Step (a) until one of the following occurs:
i. We meet a node N such that ITEM = N. In this case the search is successful.
ii. We meet an empty subtree, which indicates that the search is unsuccessful, and we insert ITEM
in the place of the empty subtree.
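A direct Java rendering of this search-and-insert procedure might look like the recursive sketch below. The names are illustrative; for simplicity this version stores integer keys and simply ignores an ITEM that is already present, which is one possible policy for duplicates.

public class BinarySearchTree {
    static class Node {
        int key;
        Node left, right;

        Node(int key) { this.key = key; }
    }

    private Node root;

    // Search for item and insert it where the search meets an empty subtree
    public void insert(int item) {
        root = insert(root, item);
    }

    private Node insert(Node n, int item) {
        if (n == null) return new Node(item);                   // empty subtree: insert ITEM here
        if (item < n.key) n.left = insert(n.left, item);         // ITEM < N: proceed to the left child
        else if (item > n.key) n.right = insert(n.right, item);  // ITEM > N: proceed to the right child
        // item == n.key: ITEM is already in the tree, nothing to do
        return n;
    }

    // Returns true if item is found (a successful search)
    public boolean search(int item) {
        Node n = root;
        while (n != null) {
            if (item == n.key) return true;
            n = (item < n.key) ? n.left : n.right;
        }
        return false;
    }

    public static void main(String[] args) {
        BinarySearchTree t = new BinarySearchTree();
        for (int x : new int[]{40, 60, 50, 33, 55, 11}) { // the values of Example 3
            t.insert(x);
        }
        System.out.println(t.search(50)); // true
        System.out.println(t.search(20)); // false
    }
}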

Example 2

Consider the following Binary Search Tree. Suppose ITEM = 20 is given.

Simulating the above algorithm, we obtain the following steps:
1. Compare ITEM = 20 with the root, 38. Since 20<38, proceed to the left child of 38, which is 14.
2. Compare ITEM = 20 with 14. Since 20>14, proceed to the right child of 14, which is 23.
3. Compare ITEM = 20 with 23. Since 20 < 23, proceed to the left child of 23, which is 18.
4. Compare ITEM = 20 with 18. Since 20>18, and 18 does not have a right child, insert 20 as the
right child of 18.
The following figure shows the new tree

Example 3

Suppose the following six numbers are inserted in order into an empty binary search tree:
40, 60, 50, 33, 55, 11
The following figure shows the six stages of the tree.

Student Activity 3
1. Explain the similarity and difference between Binary tree and Binary search tree?
2. Suppose the following list of letters is inserted in order into an empty binary search tree:
J, R, D, G, T, E, M, H, P, A, F, Q
Find the final tree (show all steps) and find the inorder traversal of the resultant tree.

Summary
Tree is one of the most important Non linear Data Structures.

 Tree is used to represent hierarchical relationship between data items.
 A Binary Tree is a tree in which each node has at most two children, i.e. a left child and a right child.
 Left child of any node is the root of left subtree of that node, similarly right child of any node is the root
of right subtree of that node.
 A Binary Search Tree T is a binary tree in which all identifiers in the left subtree of T are less than the
identifier in the root node and all identifiers in the right subtree of T are greater than the identifier in the
root.
 A Binary Search Tree can be traversed in 3 ways:
i) Preorder Traversal in which root is traversed first and then left subtree and then Right
subtree.
ii) Inorder Traversal in which left subtree is traversed first then root and then right subtree.
iii) Postorder Traversal in which left subtree is traversed first then right subtree and then root.

Graphs
Definition and Terminology
A graph G consists of a non-empty set V, called the set of nodes (points, vertices) of the graph, a set E,
which is the set of edges of the graph, and a mapping from the set of edges E to pairs of elements of V.
Any two nodes which are connected by an edge in a graph are called "adjacent nodes".
In a graph G(V,E) an edge which is directed from one node to another is called a "directed edge", while
an edge which has no specific direction is called an "undirected edge". A graph in which every edge is
directed is called a "directed graph or digraph". A graph in which every edge is undirected is called an
"undirected graph".
If some of edges are directed and some are undirected in a graph then the graph is called a "mixed graph".
Let G = (V, E) be a graph and let x ∈ E be a directed edge associated with the ordered pair of nodes (u, v);
then the edge x is said to be "initiating" or "originating" in the node u and "terminating" or "ending" in the node v.
The nodes u and v are also called the "initial" and "terminal" nodes of the edge x. An edge x ∈ E which joins the nodes
u and v, whether it be directed or undirected, is said to be "incident" to the nodes u and v. An edge of a
graph which joins a node to itself is called a "loop".
In some directed as well as undirected graphs we may have certain pairs of nodes joined by more than one
edge. Such edges are called "parallel edges".
Any graph which contains some parallel edges is called a "multigraph".
If there is no more than one edge between a pair of nodes then, such a graph is called "simple graph."
A graph in which weights are assigned to every edge is called a "weighted graph".
In a graph, a node which is not adjacent to any other node is called "isolated node".
A graph containing only isolated nodes is called a "null graph". In a directed graph, for any node v, the
number of edges which have v as their initial node is called the "outdegree" of the node v. The number of edges
which have v as their terminal node is called the "indegree" of v, and the sum of the outdegree and indegree of a node
v is called its total degree.
In the case of undirected graph the total degree of v is equal to the number of edges incident on v. The total
degree of a loop is 2 and that of an isolated node is 0.
Any sequence of edges of a digraph, such that the terminal node of each edge is the initial node of the edge
(if any) appearing next in the sequence, defines a path of the graph. A path is said to traverse through the nodes
appearing in the sequence, originating in the initial node of the first edge and ending at the terminal node of the
last edge in the sequence. The number of edges in the sequence of a path is called the "length" of the path.
A path of a digraph in which the edges are distinct is called a simple path (edge simple). A path in which all
the nodes through which traversing is done are distinct is called an "elementary path (node simple)".
A path which originates and ends in the same node is called a "cycle (circuit)".

Definition and Terminology
A GRAPH G, consists of two sets V and E. V is a finite non-empty set of vertices. E is a set of pairs of
vertices, these pairs are called edges. V(G) and E(G) will represent the sets of vertices and edges of graph
G. In an undirected graph the pair of vertices representing any edge is unordered. Thus, the pair (V1 , V2)
and (V2, V1) represent the same edge.
In a directed graph each edge is represented by a directed pair <V 1, V2>, V1 is the tail and V2 the head of
edge. Thus <V2, V1> and <V1, V2> represent two different edges.

G1 G2
Figure 6.1 : Two Sample Graph

The graph G1 is undirected while G2 is a directed graph.


 V(G1) = {1, 2, 3, 4}; E(G1) = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}
 V(G2) = {1, 2, 3}; E(G2) = {<1, 2>, <2, 1>, <2, 3>}
The length of path is the number of edges on it. A simple path is a path in which all vertices except possibly
the first and the last are distinct.
e.g., Path 1, 2, 4, 3 and 1, 3, 4, 2 are both of length 3 in G1. The first is a simple path while the other is not.
A cycle is a simple path in which the first and last vertices are the same. In an undirected graph G, two
vertices V1 and V2 are said to be connected if there is a path in G from V1 to V2; since the graph is undirected,
there is then also a path from V2 to V1.
An undirected graph is said to be connected if for every pair of distinct vertices Vi, Vj in V(G) there is a
path from Vi to Vj in G.
A tree is connected acyclic graph. A directed graph G is said to be strongly connected if for every pair
of distinct vertices Vi, Vj in V(G) there is a directed path from Vi to Vj and also from Vj to Vi.
The degree of a vertex is the number of edges incident to that vertex. In case G is a directed graph, in-
degree of vertex V is defined to be the number of edges for which V is the head. The outdegree is defined
to be the number of edges for which V is the tail. Directed graphs are also known as digraphs.

Student Activity 1
1. Define a graph.
2. Define its various terminologies.

Representation of Graphs
Two most commonly used representations are :
i) Adjacency matrix
ii) Path matrix

Adjacency Matrix
Let G = (V, E) be a graph with n vertices, n ≥ 1. The adjacency matrix of G is a 2-dimensional n x n array,
say A, with the property that A(i, j) = 1 if the edge (Vi, Vj) is in E(G), and A(i, j) = 0 otherwise. The adjacency
matrix for an undirected graph is symmetric, as the edge (Vi, Vj) is in E(G) if and only if the edge (Vj, Vi) is also in
E(G).
The adjacency matrix for a directed graph need not be symmetric.
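One straightforward way to build an adjacency matrix in Java is sketched below; this is an illustrative fragment, not part of the original notes, with the number of vertices and the edge list passed in as plain arrays (0-based vertex numbers).

public class AdjacencyMatrix {
    // Build the n x n adjacency matrix of a graph from its edge list.
    // edges[k] = {i, j} means there is an edge from vertex i to vertex j.
    static int[][] build(int n, int[][] edges, boolean directed) {
        int[][] a = new int[n][n];
        for (int[] e : edges) {
            a[e[0]][e[1]] = 1;
            if (!directed) {
                a[e[1]][e[0]] = 1; // an undirected graph gives a symmetric matrix
            }
        }
        return a;
    }

    public static void main(String[] args) {
        // Vertices 0..3 standing for v1..v4, with the edges of Example 1 below
        // treated as directed here purely for illustration
        int[][] edges = {{0, 1}, {1, 2}, {3, 0}, {3, 1}, {3, 2}};
        for (int[] row : build(4, edges, true)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}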

Example 1

Let V(G) = {v1, v2, v3, v4}

E(G) = {(v1, v2), (v2, v3), (v4, v1), (v4, v2), (v4, v3)}

Write the adjacency matrix of the above graph.

Solution:

The adjacency matrix of the above graph is

A=

Path Matrix
Let G be a simple directed graph with m nodes (vertices), V1, V2, V3, ..., Vm. The path matrix of G is
the m-square matrix P = (Pij), where Pij = 1 if there is a path from Vi to Vj and Pij = 0 otherwise.
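The notes do not give an algorithm for computing the path matrix, but one standard way to obtain it from the adjacency matrix is Warshall's algorithm, sketched below for illustration.

public class PathMatrix {
    // Warshall's algorithm: starting from the adjacency matrix, P[i][j] is set to 1
    // whenever some intermediate vertex k already links Vi to Vj.
    static int[][] pathMatrix(int[][] adjacency) {
        int n = adjacency.length;
        int[][] p = new int[n][n];
        for (int i = 0; i < n; i++) {
            p[i] = adjacency[i].clone(); // paths of length 1 are just the edges
        }
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    if (p[i][k] == 1 && p[k][j] == 1) {
                        p[i][j] = 1; // a path Vi -> Vk followed by a path Vk -> Vj
                    }
                }
            }
        }
        return p;
    }
}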

Example 2

Find the path matrix of the following graph:

Solution:

The path matrix of the above graph is

P=

Student Activity 2
1. Define adjacency and path matrix representation of a graph with example.
2. Find the adjacency and path matrix of the following Graph:

3. Draw the Directed Graph using the following adjacency matrix:

Summary
 A graph consists of two sets V(G) and E(G), where V(G) is a non-empty set of vertices and E(G) is a
set of edges connecting those vertices.
 Every tree is a graph, but not every graph is a tree.
 A graph in which every edge is directed is called a directed graph or digraph. A graph in which
every edge is undirected is called an undirected graph.

