This document introduces key concepts in algorithms and data structures. It defines an algorithm as a computational method for solving a problem through a sequence of steps. An algorithm must be correct, finite, general, and efficient. Pseudocode is presented as an English-like representation for defining algorithms. The document then discusses data structures as a way to organize data and relationships. Common data structures include sets, arrays, lists, queues, and stacks. Abstract data types provide mathematical models for data types through defined values and operations. The chapter concludes with an introduction to algorithm efficiency and computational complexity analysis.

Uploaded by Andrea Catalan

Chapter 1

Introduction
Algorithm
 An algorithm is a computational method for solving
a problem.
 It is a sequence of steps that take us from the input
to the output.
 An algorithm must be
 Correct
 It should provide a correct solution according to the specifications.
 Finite
 It should terminate.
 General
 It should work for every instance of the problem.
 Efficient
 It should use few resources (such as time or memory).
Pseudocode
 An English-like representation of the code required for an algorithm.
 Most common tool to define algorithms.
History of Programming Concepts
1. Nonstructured, linear programs (known as spaghetti code)
• the logic flow wound through the program
2. Modular programming (structured/ procedural)
• programs were organized in functions, each of which still used
a linear coding technique.
3. Object-oriented programming
• the functions are developed around an object.
• one part of OO concept is encapsulation, in which all
processing for an object is bundled together in a library and
hidden from the user.
Statement Constructs
1. Sequence
• a series of statements that do not alter the execution path within an algorithm.
2. Selection
• evaluates one or more alternatives and branches accordingly.
3. Loop
• iterates a block of code.
Data Structure
 a way of organizing the data considering not only the items stored but also their relationship to each other
 an aggregation of atomic and composite data types into a set with defined relationships.
 a combination of elements, each of which is either a data type or another data structure.
 a group of data elements grouped together under one name
Data Structure (cont…)
 A set of associations or relationships (structure) involving the combined elements
 Atomic data are data that we choose to consider as a single, nondecomposable entity. (e.g., int, floating point, char)
 Composite data can be broken out into subfields that have meaning. (e.g., a telephone number: 0 63 2 2475000)
 Data type is a set of data and the operations that can be performed on the data.
 Structure means a set of rules that hold the data together.
Some Types of Data Structure
 Set
 Array
 List
 Queues
 Stacks
Figure 1-1: Some data structures that support a list.
Abstract Data Type
 is a mathematical model for data types
 class of objects whose logical behavior is defined by a set of values and a set of operations
 a data declaration packaged together with the operations that are meaningful for the data type.
 with an ADT, users are not concerned with how the task is done but rather with what it can do.
 the ADT consists of a set of definitions that allow programmers to use the functions while hiding the implementation (known as abstraction).
Figure 1-2: An ADT model.
Figure 1-3: A pointer to a hidden structure.
Abstract Data Type
List ADT
A list contains elements of the same type arranged in sequential order, and the following operations can be performed on the list:
get() – Return an element from the list at any given position.
insert() – Insert an element at any position of the list.
remove() – Remove the first occurrence of any element from a non-empty list.
removeAt() – Remove the element at a specified location from a non-empty list.
replace() – Replace an element at any position by another element.
size() – Return the number of elements in the list.
isEmpty() – Return true if the list is empty, otherwise return false.
isFull() – Return true if the list is full, otherwise return false.
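The List ADT operations above can be sketched in Python. This is an illustrative sketch, not the textbook's implementation; the `capacity` parameter is an assumption added so that isFull() has something to check.

```python
class ListADT:
    """Minimal sketch of the List ADT operations listed above."""

    def __init__(self, capacity=None):
        self._items = []
        self._capacity = capacity  # None means unbounded (assumption)

    def get(self, pos):
        """Return the element at a given position."""
        return self._items[pos]

    def insert(self, pos, element):
        """Insert an element at any position of the list."""
        if self.is_full():
            raise OverflowError("list is full")
        self._items.insert(pos, element)

    def remove(self, element):
        """Remove the first occurrence of an element."""
        self._items.remove(element)

    def remove_at(self, pos):
        """Remove and return the element at a specified position."""
        return self._items.pop(pos)

    def replace(self, pos, element):
        """Replace the element at a position with another element."""
        self._items[pos] = element

    def size(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0

    def is_full(self):
        return self._capacity is not None and len(self._items) >= self._capacity
```

Note how users of the class call get(), insert(), and so on without ever touching `_items`: that hiding of the representation is the abstraction the ADT definition describes.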
Abstract Data Type
Stack ADT
A stack contains elements of the same type arranged in sequential order. All operations take place at a single end, the top of the stack, and the following operations can be performed:
push() – Insert an element at one end of the stack called the top.
pop() – Remove and return the element at the top of the stack, if it is not empty.
peek() – Return the element at the top of the stack without removing it, if the stack is not empty.
size() – Return the number of elements in the stack.
isEmpty() – Return true if the stack is empty, otherwise return false.
isFull() – Return true if the stack is full, otherwise return false.
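The Stack ADT above can be sketched the same way in Python (again an illustrative sketch; the `capacity` parameter is an assumption added so that isFull() is meaningful):

```python
class StackADT:
    """Minimal sketch of the Stack ADT: all operations act on the top."""

    def __init__(self, capacity=None):
        self._items = []
        self._capacity = capacity  # None means unbounded (assumption)

    def push(self, element):
        """Insert an element at the top of the stack."""
        if self.is_full():
            raise OverflowError("stack is full")
        self._items.append(element)

    def pop(self):
        """Remove and return the top element, if the stack is not empty."""
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        """Return the top element without removing it."""
        if self.is_empty():
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def size(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0

    def is_full(self):
        return self._capacity is not None and len(self._items) >= self._capacity
```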
Introduction
 dynamic data structures - grow and shrink during execution
 Linked lists - insertions and removals made anywhere
 Stacks - insertions and removals made only at top of stack
 Queues - insertions made at the back and removals made from the front
 Binary trees - high-speed searching and sorting of data and efficient elimination of duplicate data items
Algorithm Efficiency
 A loop that does a fixed number of operations n times is O(n)
 worst case: maximum number of operations for inputs of a given size
 average case: average number of operations for inputs of a given size
 best case: fewest operations for inputs of a given size
Computational Complexity
 The same problem can be solved using many different algorithms:
 Factorials can be calculated iteratively or recursively
 Sorting can be done using shellsort, heapsort, quicksort, etc.
 So how do we know which algorithm is best? And why do we need to know?
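As a quick illustration of the first bullet, both factorial styles can be sketched in Python (function names are illustrative):

```python
def factorial_iterative(n):
    """Compute n! with a simple loop."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def factorial_recursive(n):
    """Compute n! by recursion on the definition n! = n * (n-1)!."""
    return 1 if n <= 1 else n * factorial_recursive(n - 1)
```

Both compute the same function; they differ in the resources they use (the recursive version also consumes call-stack space), which is exactly the kind of difference complexity analysis makes precise.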
Why do we need to know?
 If searching for a value amongst 10 values, as with many of the exercises we have encountered while learning computer programming, the efficiency of the program is maybe not as significant as getting the job done.
 However, if we are looking for a value amongst several trillion values, as only one step in a longer algorithm, establishing the most efficient searching algorithm is very significant.
How do we find the most efficient algorithm?
 To compare the efficiency of algorithms, computational complexity can be used.
 Computational complexity is a measure of how much effort is needed to apply an algorithm, or how much it costs.
 An algorithm’s cost can be considered in different ways, but for our purposes time and space are critical, time being the most significant.
Computational Complexity Considerations I
 Computational complexity is both platform/system and language dependent:
 An algorithm will run faster on my PC at home than on the PCs in the lab.
 A precompiled program written in C++ is likely to be much faster than the same program written in Basic.
 Therefore, to compare algorithms, all should be run on the same machine.
Computational Complexity Considerations II
 When comparing algorithm efficiencies, real-time units such as microseconds or nanoseconds need not be used.
 Instead, logical units representing the relationship between ‘n’, the size of a file, and ‘t’, the time taken to process the data, should be used.
Time / Size relationships
 Linear Relationship
 If t = cn, then an increase in the size of the data increases the execution time by the same factor.
 Logarithmic Relationship
 logarithm n. Mathematics. The power to which a base, such as 10, must be raised to produce a given number.
 If t = log₂ n, then doubling the size ‘n’ increases ‘t’ by one time unit.
Asymptotic Complexity
 Functions representing ‘n’ and ‘t’ are normally much more complex, but calculating such a function is only important when considering large bodies of data, large ‘n’.
 Ergo, any terms which don’t significantly affect the outcome of the function can be eliminated, producing a function which approximates the function’s efficiency. This is called asymptotic complexity.
Asymptotic complexity
• Complexity theory is the part of the theory of computation dealing with the resources required during computation to solve a given problem.
• The most common resources are time (how many steps does it take to solve a problem) and space (how much memory does it take to solve a problem). Other resources can also be considered, such as how many parallel processors are needed to solve a problem in parallel.
Example I
 Mowing grass has linear complexity because it takes double the time to mow double the area. However, looking up something in a dictionary has only logarithmic complexity, because for a double-sized dictionary you have to open it only one more time (e.g. exactly in the middle - then the problem is reduced to half).
Example II
 Consider this example:
 f(n) = n² + 100n + log₁₀ n + 1000
 For small values of n, the final term is the most significant.
 However, as n grows, the first term becomes most significant. Hence for large ‘n’ it isn’t worth considering the final term – how about the penultimate term?
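The shifting dominance of the terms can be made concrete with a short Python sketch (the function names are illustrative, not from the text):

```python
import math


def f_terms(n):
    """Return the four terms of f(n) = n^2 + 100n + log10(n) + 1000 separately."""
    return (n ** 2, 100 * n, math.log10(n), 1000)


def dominant_term(n):
    """Index of the largest term: 0 = n^2, 1 = 100n, 2 = log10(n), 3 = 1000."""
    terms = f_terms(n)
    return terms.index(max(terms))
```

For n = 1 the constant 1000 dominates, while by n = 10 000 the n² term is already four orders of magnitude larger than anything else, which is why asymptotic analysis keeps only that term.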
Some Big-O functions that appear in algorithm analysis
Big-O
 Big-O is used to give an asymptotic upper bound for a function, i.e. an approximation of the upper bound of a function which is difficult to formulate.
 Just as there is an upper bound, there is a lower bound (Big-Ω); we’ll come on to that shortly…
 But first, some useful properties of Big-O.
Upper Bound
• The number of steps an algorithm requires to solve a specific problem is denoted as the running time of the algorithm. The notion of a step refers to an underlying machine model. The machine must be able to execute a single step in constant time.
• In general, the running time depends on the size of the problem and on the respective input.
• Example: The running time of a sorting algorithm grows if the length of the input sequence grows. If the input sequence is presorted, possibly fewer steps are sufficient than for an unsorted sequence.
Big-O and Logarithms
 First, let’s state that if the complexity of an algorithm is on the order of a logarithmic function, it is very good!
 Second, let’s state that despite that, there are an infinite number of better functions; however, very few are useful, e.g. O(log log n) or O(1).
 Therefore, it is important to understand big-O when it comes to logarithms.
Lower Bound
• Once an algorithm for solving a specific problem is found, the question arises whether it is possible to design a faster algorithm or not. How can we know unless we have found such an algorithm? In most cases a lower bound for the problem can be given, i.e. a certain number of steps that every algorithm has to execute at least in order to solve the problem. As with the upper bounds, the notion of a step refers to an underlying machine model.
• Example: In order to sort n numbers, every algorithm at least has to take a look at every number. A sequential machine that can take a look at a number in one step therefore needs at least n steps. Thus f(n) = n is a lower bound for sorting, with respect to the sequential machine model.
O, Ω & Θ
 For the function:
 f(n) = 2n² + 3n + 1
 Options for big-O include:
 g(n) = n², g(n) = n³, g(n) = n⁴, etc.
 Options for big-Ω include:
 g(n) = n², g(n) = n, g(n) = √n
 Options for big-Θ include:
 g(n) = n², g(n) = 2n², g(n) = 3n²
 Therefore, while there are still an infinite number of functions to choose from, it is obvious which one should be chosen.
O(n) represents an upper bound, Ω(n) a lower bound, and Θ(n) a tight bound.
Why Complexity Analysis?
 Today’s computers can perform millions of operations per second at relatively low cost, so why complexity analysis?
 With a PC that can perform 1 million operations per second and 1 million items to be processed:
 A quadratic algorithm O(n²) would take 11.6 days.
 A cubic algorithm O(n³) would take 31,709 years.
 An exponential algorithm O(2ⁿ) is not worth thinking about.
Why Complexity Analysis?
 Even with a 1,000-times improvement in processing power (consider Moore’s Law):
 The cubic algorithm would still take over 31 years.
 The quadratic would still take over 16 minutes.
 To make scalable programs, algorithm complexity does need to be analysed.
Typical Complexities

Complexity  | Notation | Description
constant    | O(1)     | Constant number of operations, not depending on the input data size, e.g. n = 1 000 000 → 1-2 operations
logarithmic | O(log n) | Number of operations proportional to log₂(n) where n is the size of the input data, e.g. n = 1 000 000 000 → 30 operations
linear      | O(n)     | Number of operations proportional to the input data size, e.g. n = 10 000 → 5 000 operations
Typical Complexities (2)

Complexity  | Notation            | Description
quadratic   | O(n²)               | Number of operations proportional to the square of the size of the input data, e.g. n = 500 → 250 000 operations
cubic       | O(n³)               | Number of operations proportional to the cube of the size of the input data, e.g. n = 200 → 8 000 000 operations
exponential | O(2ⁿ), O(kⁿ), O(n!) | Exponential number of operations, fast growing, e.g. n = 20 → 1 048 576 operations
Complexity Classes
1 operation per μsec (microsecond), 10 operations to be completed:
 Constant = O(1) = 1 μsec
 Logarithmic = O(lg n) = 3 μsec
 Linear = O(n) = 10 μsec
 O(n lg n) = 33.2 μsec
 Quadratic = O(n²) = 100 μsec
 Cubic = O(n³) = 1 msec
 Exponential = O(2ⁿ) = 10 msec
Complexity Classes
1 operation per μsec (microsecond), 10² operations to be completed:
 Constant = O(1) = 1 μsec
 Logarithmic = O(lg n) = 7 μsec
 Linear = O(n) = 100 μsec
 O(n lg n) = 664 μsec
 Quadratic = O(n²) = 10 msec
 Cubic = O(n³) = 1 sec
 Exponential = O(2ⁿ) = 3.17×10¹⁷ yrs
Complexity Classes
1 operation per μsec (microsecond), 10³ operations to be completed:
 Constant = O(1) = 1 μsec
 Logarithmic = O(lg n) = 10 μsec
 Linear = O(n) = 1 msec
 O(n lg n) = 10 msec
 Quadratic = O(n²) = 1 sec
 Cubic = O(n³) = 16.7 min
 Exponential = O(2ⁿ) = ……
Complexity Classes
1 operation per μsec (microsecond), 10⁴ operations to be completed:
 Constant = O(1) = 1 μsec
 Logarithmic = O(lg n) = 13 μsec
 Linear = O(n) = 10 msec
 O(n lg n) = 133 msec
 Quadratic = O(n²) = 1.7 min
 Cubic = O(n³) = 11.6 days
Complexity Classes
1 operation per μsec (microsecond), 10⁵ operations to be completed:
 Constant = O(1) = 1 μsec
 Logarithmic = O(lg n) = 17 μsec
 Linear = O(n) = 0.1 sec
 O(n lg n) = 1.6 sec
 Quadratic = O(n²) = 2.8 hrs
 Cubic = O(n³) = 31.7 years
Complexity Classes
1 operation per μsec (microsecond), 10⁶ operations to be completed:
 Constant = O(1) = 1 μsec
 Logarithmic = O(lg n) = 20 μsec
 Linear = O(n) = 1 sec
 O(n lg n) = 20 sec
 Quadratic = O(n²) = 11.6 days
 Cubic = O(n³) = 31,709 years
Asymptotic Complexity Example
 Consider this simple code:

for (i = 0, sum = 0; i < n; i++)
    sum += a[i];

 First, 2 variables are initialised.
 The loop executes n times, with 2 assignments each time (one updates sum and one updates i).
 Thus there are 2 + 2n assignments for this code, and so an asymptotic complexity of O(n).
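The 2 + 2n assignment count can be checked with a small Python sketch of the same loop (the function name and counting scheme are illustrative):

```python
def sum_array(a):
    """Mirror of: for (i = 0, sum = 0; i < n; i++) sum += a[i];
    Returns the sum and the number of assignments performed."""
    assignments = 2              # the two initialisations: i = 0 and sum = 0
    total = 0
    for x in a:                  # each pass makes 2 assignments: sum += a[i] and i++
        total += x
        assignments += 2
    return total, assignments
```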
Asymptotic Complexity Example 2
 Consider this code:

for (i = 0; i < n; i++) {
    for (j = 1, sum = a[0]; j <= i; j++)
        sum += a[j];
    cout << "sum for subarray 0 through " << i << " is " << sum << endl;
}

 Before the loop starts there is 1 initialisation (i).
 The outer loop executes n times, each time calling the inner loop and making 3 assignments (sum, i and j).
 The inner loop executes i times for each i ∈ {1, …, n−1}, with 2 assignments in each case (sum and j).
Asymptotic Complexity Example 2 (cont.)
 Therefore there are:
 1 + 3n + n(n−1)
 assignments before the program completes, which is O(n²).
Space Complexity
• Running time is usually the thing we care most about, but it can be just as important to analyse the amount of memory used by a program. If a program takes a lot of time, you can still run it, and just wait longer for the result. However, if a program takes a lot of memory, you may not be able to run it at all, so this is an important parameter to understand.
• We analyse things differently for recursive and iterative programs.
 For an iterative program, it is usually just a matter of looking at the variable declarations and storage allocation calls, e.g., an array of n numbers.
 Analysis of recursive program space is more complicated: the space used at any time is the total space used by all recursive calls active at that time.
• Each recursive call takes a constant amount of space: some space for local variables and function arguments, and also some space for remembering where each call should return to.
Algorithm Efficiency, f(n)
 generally defined as a function of the number of elements being processed and the type of loop being used.

Linear Loops
How many times will the body of the loop be repeated in a linear loop?
Example:
1. i = 1
2. loop (i <= 5)
   1. application code
   2. i = i + 1
3. end loop

Answer: 5, thus f(n) = n
Logarithmic Loops
How many times will the body of the loop be repeated in the following program segments?

Multiply Loop              Divide Loop
1. i = 1                   1. i = 1000
2. loop (i < 1000)         2. loop (i >= 1)
   1. application code        1. application code
   2. i = i x 2               2. i = i / 2
3. end loop                3. end loop

f(n) = log₂ n, where 2 is the multiplier/divisor
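The iteration counts of the two loops above can be verified with a small Python sketch (function names are illustrative). Both loops run about log₂ 1000 ≈ 10 times:

```python
def multiply_loop_iterations(limit, factor=2):
    """Count body executions of: i = 1; loop (i < limit) { ...; i = i x factor }"""
    i, count = 1, 0
    while i < limit:
        count += 1           # application code would run here
        i *= factor
    return count


def divide_loop_iterations(start, divisor=2):
    """Count body executions of: i = start; loop (i >= 1) { ...; i = i / divisor }"""
    i, count = start, 0
    while i >= 1:
        count += 1           # application code would run here
        i /= divisor
    return count
```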


Nested Loops
Iterations = outer loop iterations x inner loop iterations

A. Linear Logarithmic
1. i = 1
2. loop (i <= 10)
   1. j = 1
   2. loop (j <= 10)
      1. application code
      2. j = j x 2
   3. end loop
   4. i = i + 1
3. end loop
.:. f(n) = n log₂ n
Nested Loops …
B. Quadratic
1. i = 1
2. loop (i <= 10)
   1. j = 1
   2. loop (j <= 10)
      1. application code
      2. j = j + 1
   3. end loop
   4. i = i + 1
3. end loop
.:. f(n) = n²
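Both nested-loop patterns can be instrumented in Python to confirm the counts (an illustrative sketch; note the exact count for the linear logarithmic loop is n(⌊log₂ n⌋ + 1), which the slide approximates as n log₂ n):

```python
def linear_log_iterations(n):
    """Body executions of the linear logarithmic nested loop (A)."""
    count, i = 0, 1
    while i <= n:
        j = 1
        while j <= n:
            count += 1       # application code
            j *= 2
        i += 1
    return count


def quadratic_iterations(n):
    """Body executions of the quadratic nested loop (B)."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            count += 1       # application code
    return count
```

For n = 10 the quadratic loop runs 100 times, while the linear logarithmic loop runs only 40 times (10 outer passes, 4 inner doublings each).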
Big-O Notation (“on the order of”)
 the simplification of algorithm efficiency.
Example:
if f(n) = n², we would say its efficiency is O(n²), or “on the order of n squared”.
Deriving big-O notation from f(n)
1. In each term, set the coefficient of the term to 1.
2. Keep the largest term in the function and discard the others.
Note:
Terms are ranked from lowest to highest as shown:
log₂ n , n , n log₂ n , n² , n³ , … , nᵏ , cⁿ , n!
Hence, the seven standard measures of efficiency are:
O(log₂ n) , O(n) , O(n log₂ n) , O(n²) , O(nᵏ) , O(cⁿ) , O(n!)
Figure 1-4: Big-O ranges.
Big-O Analysis Example
The Number Game
 There are several approaches you could take:
 Guess 1; if wrong, guess 2; if wrong, guess 3, etc.
 Alternatively, guess the midpoint, 5. If the number is lower, guess halfway between 1 and 5, maybe 3, etc.
 Which is better?
 It depends on what the number is! But in each option there is a best, worst and average case.
Average Case Complexity
 Best Case;
 Number of steps is smallest
 Worst Case;
 Number of steps is maximum
 Average Case;
 Somewhere in between.
 Could calculate as the sum of the number of steps for each
input divided by the number of inputs. But this assumes each
input has equal probability.
 So we weight calculation with the probability of each input.
Method 1
 Choose 1; if wrong, choose 2; if wrong, choose 3…
 Probability of success on 1st try = 1/n
 Probability of success on 2nd try = 1/n
 Probability of success on nth try = 1/n
 Average:
 (1 + 2 + … + n) / n = (n+1)/2
Method 2
 Picking midpoints:
 Method 2 is actually like searching a binary tree, so we will leave a full calculation until week 6, as right now the maths could get complicated.
 But for n = 10, you should be able to calculate the average case – try it! (When n = 10, I make it 1.9 times as efficient.)
Average Case Complexity
 Calculating average case complexity can be difficult, even if the probabilities are equal, so calculating approximations in the form of big-O, big-Ω and big-Θ can simplify the task.
Exercises
9. Calculate the run-time efficiency of the following program segment:
1 i = 1
2 loop (i <= n)
   1 print (i)
   2 i = i + 1
3 end loop
Ans. O(n)
10. Calculate the run-time efficiency of the following program segment:
1 i = 1
2 loop (i <= n)
   1 j = 1
   2 loop (j <= n)
      1 k = 1
      2 loop (k <= n)
         1 print (i, j, k)
         2 k = k + 1
      3 end loop
      4 j = j + 1
   3 end loop
   4 i = i + 1
3 end loop
Ans. O(n³)
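A Python sketch of exercise 10's triple loop (illustrative) confirms that the print statement executes n × n × n times, so the run-time efficiency is O(n³):

```python
def triple_loop_count(n):
    """Count how many times print(i, j, k) would execute in exercise 10."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                count += 1   # corresponds to print(i, j, k)
    return count
```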
