
DATA STRUCTURE
Introduction
Introduction

Why should you learn algorithms and data structures?

1 – To improve the way you think and break problems down.
- You will become more aware of the code you write.

2 – To clarify and organize your ideas.
- Example: you are asked to find the highest grade among a group of students.

Shahd: 95, Asmaa: 99, Karam: 87, Adham: 85, Mariam: 75, Ismail: 80

◦ Pass over all the students.
◦ Keep track of the student with the highest grade.
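The two steps above can be sketched in Python. The pairing of names and grades is reconstructed from the example table, so treat it as sample data:

```python
# Sample data reconstructed from the table above (an assumption).
grades = {"Shahd": 95, "Asmaa": 99, "Karam": 87,
          "Adham": 85, "Mariam": 75, "Ismail": 80}

best_student = None
best_grade = float("-inf")
for name, grade in grades.items():   # pass over all the students
    if grade > best_grade:           # keep the student with the highest grade
        best_student, best_grade = name, grade

print(best_student, best_grade)  # Asmaa 99
```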
3 – To solve problems effectively and with minimal resources.
◦ This is the most important thing companies look for.
◦ Example: searching in a List.

50, 35, 65, 77, 10, 25, 39

4 – To succeed in job interviews.
◦ Global companies rely on problems about algorithm and data structure concepts in their tests and evaluation methods.

5 – To have fun solving programming problems.
A computer is an electronic machine used for data processing and manipulation.
When a programmer collects such data for processing, they need to store all of it in the computer's main memory.
To understand how a computer works, we need to know:
- How data is represented in the computer.
- How data is accessed.
- How to solve a problem step by step.
For all of these tasks we use data structures.
What is a Data Structure?

 A data structure is a specialized format for organizing, processing, retrieving and storing data.
 In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.
[Figures: examples of data structure types; computer architecture representation; how the CPU executes program instructions.]
A data structure can be defined as a group of data elements which provides an efficient way of storing and organizing data in the computer so that it can be used efficiently.
Examples of data structures are arrays, linked lists, stacks, queues, etc.
Data structures are widely used in almost every area of computer science: operating systems, compiler design, artificial intelligence, graphics, and many more.
Data structures are a central part of many computer science algorithms, as they enable programmers to handle data in an efficient way.
They play a vital role in enhancing the performance of a program, since the main job of most software is to store and retrieve the user's data as fast as possible.
Data Structure
◦ A data structure is a particular way of organizing data in a computer so that it can be used effectively.

◦ For example, we can store a list of items having the same data type using the array data structure.
The representation of a particular data structure in the main memory of a computer is called the storage structure.
The storage structure representation in auxiliary memory is called the file structure.
A data structure is defined as the way of storing and manipulating data in an organized form so that it can be used efficiently.
A data structure mainly specifies the following four things:
1) organization of data 2) accessing method 3) degree of associativity 4) processing alternatives for information
Algorithm + Data Structure = Program
The study of data structures covers the following points:
1) Amount of memory required to store the data
2) Amount of time required to process the data
3) Representation of data in memory
4) Operations performed on the data
Types of Data Structures

Data structures are divided into two types:
1) Primitive
2) Non-primitive
Non-primitive data structures are divided into two types:
1) Linear
2) Non-linear
DATA TYPES
A data type is a particular kind of data item, as defined by the values it can take, the programming language used, or the operations that can be performed on it.
◦ Primitive Data Structures
◦ Primitive data structures are basic structures that are directly operated upon by machine instructions.
◦ Primitive data structures have different representations on different computers.
◦ Integers, floats, characters and pointers are examples of primitive data structures.
◦ These data types are available in most programming languages as built-in types.
Integer: a data type which allows all values without a fractional part; used for whole numbers.
Float: a data type used for storing fractional numbers.
Character: a data type used for character values.
Pointer: a variable that holds the memory address of another variable.
Non-Primitive Data Types
◦ These are more sophisticated data structures.
◦ They are derived from primitive data structures.
◦ Non-primitive data structures emphasize structuring a group of homogeneous or heterogeneous data items.
◦ Examples of non-primitive data types are Array, List, and File.
◦ Non-primitive data types are further divided into linear and non-linear data structures.
Array: a fixed-size sequenced collection of elements of the same data type.
List: an ordered set containing a variable number of elements.
File: a collection of logically related information; it can be viewed as a large list of records consisting of various fields.
Linear Data Structures
 A linear data structure is a storage format in which the data are arranged in contiguous blocks of memory.
 An example is an array of characters, represented as one character after another.
 In a linear data structure, the member elements form a sequence in storage.
 There are two ways to represent a linear data structure in memory:
static memory allocation
dynamic memory allocation
The possible operations on a linear data structure are:
1) Traversing 2) Insertion 3) Deletion 4) Searching 5) Sorting 6) Merging
◦ Examples of linear data structures are the Stack and the Queue.
Stack
◦ A stack is a data structure in which insertion and deletion operations are performed at one end only.
◦ The insertion operation is referred to as PUSH and the deletion operation as POP.
◦ A stack is also called a Last In, First Out (LIFO) data structure.
Queue
◦ A queue is a data structure which permits insertion at one end and deletion at the other end.
◦ The end at which deletion occurs is known as the FRONT end, and the end at which insertion occurs is known as the REAR end.
◦ A queue is also called a First In, First Out (FIFO) data structure.
◦ Non-linear data structures are those in which data items are not arranged in a sequence.
◦ Examples of non-linear data structures are the Tree and the Graph.
TREE
◦ A tree can be defined as a finite set of data items (nodes) in which the items are arranged in branches and sub-branches.
◦ A tree represents the hierarchical relationship between its elements.
◦ A tree consists of nodes connected by edges; nodes are drawn as circles and edges as lines connecting the circles.
Graph
◦ A graph is a collection of nodes (information) and connecting edges (logical relations) between nodes.
◦ A tree can be viewed as a restricted graph.
◦ Graphs have many types: 1) Simple graph 2) Mixed graph 3) Multigraph 4) Directed graph 5) Undirected graph
Difference Between Linear and Non-Linear Data Structures

Linear Data Structure:
◦ Every item is related to its previous and next item.
◦ Data is arranged in a linear sequence.
◦ Data items can be traversed in a single run.
◦ E.g. Array, Stack, Linked List, Queue.
◦ Implementation is easy.

Non-Linear Data Structure:
◦ Every item is attached to many other items.
◦ Data is not arranged in a sequence.
◦ Data cannot be traversed in a single run.
◦ E.g. Tree, Graph.
◦ Implementation is difficult.
Operations on Data Structures
The design of an efficient data structure must take into account the operations to be performed on it.
The most commonly used operations are broadly categorized into the following types:

1. Create: reserves memory for program elements; this can be done with a declaration statement. Creation of a data structure may take place either at compile time or at run time.
2. Destroy: destroys the memory space allocated for a specified data structure.
3. Selection: accessing a particular data item within a data structure.
4. Update: updates or modifies the data in the data structure.
5. Searching: finds the presence of a desired data item in a list of data items; it may also find the locations of all elements that satisfy certain conditions.
6. Sorting: arranging all data items of a data structure in a particular order, for example ascending or descending.
7. Splitting: partitioning a single list into multiple lists.
8. Merging: combining the data items of two different sorted lists into a single sorted list.
9. Traversing: visiting each and every node of a list in a systematic manner.
What are Arrays?
An array is a container which can hold a fixed number of items, and these items should be of the same type.
Most data structures make use of arrays to implement their algorithms.

The following are the important terms to understand the concept of an array:
Element − each item stored in an array is called an element.
Index − each location of an element in an array has a numerical index, which is used to identify the element.

1. An array is a container of elements.
2. Elements have a specific value and data type, like "ABC", TRUE or FALSE, etc.
3. Each element also has its own index, which is used to access the element.

• Elements are stored at contiguous memory locations.
• An index is always less than the total number of array items.
• In terms of syntax, any variable that is declared as an array can store multiple values.
• Almost all languages have the same comprehension of arrays but have different ways of declaring and initializing them.
• However, three parts will always remain common in all the initializations: the array name, the elements, and the data type of the elements.

• Array name: necessary for easy reference to the collection of elements
• Data Type: necessary for type checking and data integrity
• Elements: these are the data values present in an array
How to access a specific array value?
You can access any array item by using its index.

Syntax
arrayName[indexNum]

Example
balance[1]

Here, we have accessed the second value of the array using its index, which is 1. The output will be 200, which is the second value of the balance array.
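A minimal sketch of this example in Python. The values in balance are hypothetical; the source only specifies that index 1 holds 200:

```python
# Hypothetical array contents (the source only fixes balance[1] = 200).
balance = [100, 200, 300, 400]

# Indexing starts at 0, so index 1 is the second element.
print(balance[1])  # 200
```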
◦ Array Representation
◦ Arrays can be declared in various ways in different languages. For illustration, let's take the C array declaration.
Analysis of Algorithms

Input → Algorithm → Output

Algorithm Review

❑ An algorithm is a definite procedure for solving a problem in a finite number of steps.
❑ An algorithm is a well-defined computational procedure that takes some value(s) as input and produces some value(s) as output.
❑ An algorithm is a finite number of computational statements that transform the input into the output.
Running Time

❑ Most algorithms transform input objects into output objects.
❑ The running time of an algorithm typically grows with the input size.
❑ Average-case time is often difficult to determine.
❑ We focus on the worst-case running time:
◼ Easier to analyze
◼ Crucial to applications such as games, finance and robotics

[Figure: best-case, average-case and worst-case running time versus input size.]
Good Algorithms?

❑ Run in less time
❑ Consume less memory

But computational resources (time complexity) are usually more important.
Complexity
❑ In examining algorithm efficiency we must understand the idea of complexity.
❑ Complexity is the consumption of resources.
❑ The most important aspects of complexity are:
◼ Space complexity
◼ Time complexity
Space Complexity
❑ When memory was expensive we focused on making programs as space-efficient as possible, and developed schemes to make memory appear larger than it really was (virtual memory and memory paging schemes).
❑ Space complexity is still important in the field of embedded computing (hand-held computer-based equipment like cell phones, palm devices, etc.).
Experimental Studies
❑ Write a program implementing the algorithm.
❑ Run the program with inputs of varying size and composition, noting the time needed.
❑ Plot the results.

[Figure: measured time (ms) versus input size.]
Time Complexity
❑ Is the algorithm "fast enough" for my needs?
❑ How much longer will the algorithm take if I increase the amount of data it must process?
❑ Given a set of algorithms that accomplish the same thing, which is the right one to choose?
Limitations of Experiments
❑ It is necessary to implement the algorithm, which may be difficult.
❑ Results may not be indicative of the running time on other inputs not included in the experiment.
❑ In order to compare two algorithms, the same hardware and software environments must be used.
Theoretical Analysis
❑ Uses a high-level description of the algorithm instead of an implementation.
❑ Characterizes running time as a function of the input size, n.
❑ Takes into account all possible inputs.
❑ Allows us to evaluate the speed of an algorithm independent of the hardware/software environment.
Pseudocode
❑ High-level description of an algorithm
❑ More structured than English prose
❑ Less detailed than a program
❑ Preferred notation for describing algorithms
❑ Hides program design issues
Pseudocode Details
❑ Control flow
◼ if … then … [else …]
◼ while … do …
◼ repeat … until …
◼ for … do …
◼ Indentation replaces braces
❑ Method declaration
Algorithm method (arg [, arg…])
Input …
Output …
❑ Method call
method (arg [, arg…])
❑ Return value
return expression
❑ Expressions:
← Assignment
= Equality testing
n² Superscripts and other mathematical formatting allowed
The Random Access Machine (RAM) Model
❑ A CPU
❑ A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character
❑ Memory cells are numbered, and accessing any cell in memory takes unit time.
Seven Important Functions
❑ Seven functions that often appear in algorithm analysis:
◼ Constant ≈ 1
◼ Logarithmic ≈ log n
◼ Linear ≈ n
◼ N-Log-N ≈ n log n
◼ Quadratic ≈ n²
◼ Cubic ≈ n³
◼ Exponential ≈ 2ⁿ
❑ In a log-log chart, the slope of the line corresponds to the growth rate.

[Figure: log-log plot of T(n) for the cubic, quadratic and linear functions.]
Primitive Operations
❑ Basic computations performed by an algorithm
❑ Identifiable in pseudocode
❑ Largely independent from the programming language
❑ Exact definition not important (we will see why later)
❑ Assumed to take a constant amount of time in the RAM model
❑ Examples:
◼ Evaluating an expression
◼ Assigning a value to a variable
◼ Indexing into an array
◼ Calling a method
◼ Returning from a method
Algorithm Efficiency
❑ A measure of the amount of resources consumed in solving a problem of size n
◼ time
◼ space
❑ Benchmarking: implement the algorithm,
◼ run with some specific input and measure the time taken
◼ better for comparing performance of processors than for comparing performance of algorithms
❑ Big-Oh (asymptotic analysis)
◼ associates n, the problem size,
◼ with t, the processing time required to solve the problem
Counting Primitive Operations
❑ By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size.
❑ Step 1: 2 ops; step 3: 2 ops; step 4: 2n ops; step 5: 2n ops; step 6: 0 to n ops; step 7: 1 op.
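The find_max pseudocode the step counts refer to did not survive extraction; a plausible Python reconstruction (an assumption, matching the later discussion of find_max) is:

```python
def find_max(data):
    """Return the maximum element of a nonempty sequence.

    A reconstruction of the slide's missing find_max pseudocode:
    initialization, a loop over all n elements, a comparison per
    iteration, and a conditional update executed 0 to n times.
    """
    biggest = data[0]            # initialization
    for val in data:             # loop over all n elements
        if val > biggest:        # comparison each iteration
            biggest = val        # update: executed 0 to n times
    return biggest               # single return

print(find_max([3, 1, 4, 1, 5, 9, 2]))  # 9
```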
Estimating Running Time
❑ Algorithm find_max executes 5n + 5 primitive operations in the worst case and 4n + 5 in the best case. Define:
a = time taken by the fastest primitive operation
b = time taken by the slowest primitive operation
❑ Let T(n) be the worst-case time of find_max. Then
a(4n + 5) ≤ T(n) ≤ b(5n + 5)
❑ Hence, the running time T(n) is bounded by two linear functions.
Growth Rate of Running Time
❑ Changing the hardware/software environment
◼ affects T(n) by a constant factor, but
◼ does not alter the growth rate of T(n).
❑ The linear growth rate of the running time T(n) is an intrinsic property of algorithm find_max.
Performance Classification

f(n) = 1, Constant: run time is fixed and does not depend upon n. Most instructions are executed once, or only a few times, regardless of the amount of information being processed.
f(n) = log n, Logarithmic: when n increases, so does run time, but much more slowly. Common in programs which solve large problems by transforming them into smaller problems.
f(n) = n, Linear: run time varies directly with n. Typically, a small amount of processing is done on each element.
f(n) = n log n: when n doubles, run time slightly more than doubles. Common in programs which break a problem down into smaller sub-problems, solve them independently, then combine the solutions.
f(n) = n², Quadratic: when n doubles, run time increases fourfold. Practical only for small problems; typically the program processes all pairs of input (e.g. in a doubly nested loop).
f(n) = n³, Cubic: when n doubles, run time increases eightfold.
f(n) = 2ⁿ, Exponential: when n doubles, run time squares. This is often the result of a natural, "brute force" solution.
Size does matter [1]

What happens if we double the input size N?

N      log2(N)   5N      N·log2(N)   N²       2^N
8      3         40      24          64       256
16     4         80      64          256      65536
32     5         160     160         1024     ~10^9
64     6         320     384         4096     ~10^19
128    7         640     896         16384    ~10^38
256    8         1280    2048        65536    ~10^76
COMPLEXITY CLASSES

[Figures: the complexity classes plotted as time (steps) versus input size n.]
Array – Linear Search

[Figures: step-by-step trace of a linear search over an array.]
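The search traced in the slides can be sketched in Python, using the example list from the introduction:

```python
def linear_search(arr, target):
    """Return the index of target in arr, or -1 if absent.

    Checks elements one by one, so the worst case is O(n).
    """
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1

print(linear_search([50, 35, 65, 77, 10, 25, 39], 77))  # 3
```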
Array – Binary Search

[Figures: step-by-step trace of a binary search over a sorted array.]
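A Python sketch of the binary search the slides trace. It requires the array to be sorted:

```python
def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1.

    Halves the search range on every step, so it runs in O(log n).
    """
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1        # target is in the upper half
        else:
            high = mid - 1       # target is in the lower half
    return -1

print(binary_search([10, 25, 35, 39, 50, 65, 77], 39))  # 3
```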
Size does matter [2]
❑ Suppose a program has run time O(n!) and the run time for n = 10 is 1 second.

For n = 12, the run time is 2 minutes.
For n = 14, the run time is 6 hours.
For n = 16, the run time is 2 months.
For n = 18, the run time is 50 years.
For n = 20, the run time is 200 centuries.
Comparison of Two Algorithms
Insertion sort takes about n²/4 steps; merge sort takes about 2n·lg n steps.
To sort a million items, insertion sort takes roughly 70 hours, while merge sort takes roughly 40 seconds.
This is on a slow machine, but on a machine 100x as fast it is 40 minutes versus less than 0.5 seconds.
Constant Factors
❑ The growth rate is not affected by
◼ constant factors or
◼ lower-order terms
❑ Examples
◼ 10²n + 10⁵ is a linear function
◼ 10⁵n² + 10⁸n is a quadratic function

[Figure: log-log plot showing these functions growing along the linear and quadratic reference lines.]
Standard Analysis Techniques
❑ Constant time statements
❑ Analyzing loops
❑ Analyzing nested loops
❑ Analyzing sequences of statements
❑ Analyzing conditional statements
Constant Time Statements
❑ Simplest case: O(1) time statements
❑ Assignment statements of simple data types:
int x = y;
❑ Arithmetic operations:
x = 5 * y + 4 - z;
❑ Array referencing:
A[j] = 5;
❑ Array assignment:
∀j, A[j] = 5;
❑ Most conditional tests:
if (x < 12) ...
Best Case
❑ The best case is defined as whichever input of size n is cheapest among all inputs of size n.
❑ "The best case for my algorithm is n = 1 because that is the fastest." WRONG!
Analyzing Loops [1]
❑ Any loop has two parts:
◼ How many iterations are performed?
◼ How many steps per iteration?
int sum = 0, j;
for (j = 0; j < N; j++)
    sum = sum + j;
◼ The loop executes N times (0..N-1)
◼ 4 = O(1) steps per iteration
❑ Total time is N * O(1) = O(N*1) = O(N)
ANALYZING LOOPS – LINEAR LOOPS
❑ Example (have a look at this code segment):

❑ Efficiency is proportional to the number of iterations.
❑ The efficiency time function is:
f(n) = 1 + (n-1) + c*(n-1) + (n-1)
     = (c+2)*(n-1) + 1
     = (c+2)n – (c+2) + 1
❑ Asymptotically, the efficiency is O(n).
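The code segment referred to above did not survive extraction. A plausible reconstruction (an assumption) is a counting loop whose step counts match f(n) = 1 + (n-1) + c·(n-1) + (n-1), here with a body of c = 1 statement:

```python
def demo(n):
    """A counting loop matching the cost function above (with c = 1)."""
    total = 0
    i = 1                # 1 step: loop initialization
    while i < n:         # n-1 successful tests
        total += i       # loop body: c steps per iteration, n-1 iterations
        i += 1           # n-1 increments
    return total

print(demo(5))  # 10  (i.e. 1 + 2 + 3 + 4)
```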
Analyzing Loops [2]
❑ What about this for loop?
int sum = 0, j;
for (j = 0; j < 100; j++)
    sum = sum + j;
❑ The loop executes 100 times
❑ 4 = O(1) steps per iteration
❑ Total time is 100 * O(1) = O(100 * 1) = O(100) = O(1)
Analyzing Nested Loops [1]
❑ Treat just like a single loop and evaluate each level of nesting as needed:
int j, k;
for (j = 0; j < N; j++)
    for (k = N; k > 0; k--)
        sum += k + j;
❑ Start with the outer loop:
◼ How many iterations? N
◼ How much time per iteration? Need to evaluate the inner loop
❑ The inner loop uses O(N) time
❑ Total time is N * O(N) = O(N*N) = O(N²)
Analyzing Nested Loops [2]
❑ What if the number of iterations of one loop depends on the counter of the other?
int j, k;
for (j = 0; j < N; j++)
    for (k = 0; k < j; k++)
        sum += k + j;
❑ Analyze the inner and outer loop together:
❑ The number of iterations of the inner loop is:
❑ 0 + 1 + 2 + ... + (N-1) = O(N²)

❑ When doing Big-O analysis, we sometimes have


to compute a series like: 1 + 2 + 3 + ... + (n-1) + n

❑ i.e. Sum of first n numbers. What is the


complexity of this?

❑ Gauss figured out that the sum of the first n


numbers is always:

45

45
CONDITIONAL STATEMENTS
❑ What about conditional statements such as
if (condition)
    statement1;
else
    statement2;

❑ where statement1 runs in O(n) time and statement2 runs in O(n²) time?
❑ We use "worst case" complexity: among all inputs of size n, what is the maximum running time?
❑ The analysis for the example above is O(n²).
DERIVING A RECURRENCE EQUATION
❑ So far, all the algorithms we have been analyzing have been non-recursive.
❑ Example: a recursive power method.

❑ If N = 1, then the running time T(N) is 2.
❑ However, if N ≥ 2, then the running time T(N) is the cost of each step taken plus the time required to compute power(x, n-1), i.e. T(N) = 2 + T(N-1) for N ≥ 2.
❑ How do we solve this? One way is to use the iteration method.
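The recursive power method itself is missing from the slide; a plausible reconstruction (an assumption) matching the recurrence T(1) = 2, T(N) = 2 + T(N-1) is:

```python
def power(x, n):
    """Compute x**n recursively for n >= 1.

    Base case costs a constant (T(1) = 2); each recursive step adds a
    constant amount of work plus a call on n-1: T(N) = 2 + T(N-1).
    """
    if n == 1:
        return x
    return x * power(x, n - 1)

print(power(2, 5))  # 32
```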
ITERATION METHOD
❑ This is sometimes known as "back substituting".
❑ It involves expanding the recurrence in order to see a pattern.
❑ Solving the formula from the previous example using the iteration method:

❑ Solution: expand and apply to itself:
T(N) = 2 + T(N-1)
     = 2 + 2 + T(N-2)
     = 2 + 2 + 2 + T(N-3)
     = 2 + 2 + 2 + …… + 2 + T(1)
     = 2(N-1) + 2, remembering that T(1) = 2
     = 2N

❑ So T(N) = 2N for the last example, which is O(N).
Big-Oh Notation
❑ Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that
f(n) ≤ c·g(n) for n ≥ n0
❑ Example: 2n + 10 is O(n)
◼ 2n + 10 ≤ cn
◼ (c − 2)n ≥ 10
◼ n ≥ 10/(c − 2)
◼ Pick c = 3 and n0 = 10

[Figure: log-log plot of n, 2n+10 and 3n, with 3n dominating 2n+10 for n ≥ 10.]
Big-Oh Example
❑ Example: the function n² is not O(n)
◼ n² ≤ cn
◼ n ≤ c
◼ The above inequality cannot be satisfied, since c must be a constant

[Figure: log-log plot of n, 10n, 100n and n², with n² eventually exceeding every cn.]
More Big-Oh Examples
◼ 7n − 2
7n − 2 is O(n)
need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n0
this is true for c = 7 and n0 = 1

◼ 3n³ + 20n² + 5
3n³ + 20n² + 5 is O(n³)
need c > 0 and n0 ≥ 1 such that 3n³ + 20n² + 5 ≤ c·n³ for n ≥ n0
this is true for c = 4 and n0 = 21

◼ 3 log n + 5
3 log n + 5 is O(log n)
need c > 0 and n0 ≥ 1 such that 3 log n + 5 ≤ c·log n for n ≥ n0
this is true for c = 8 and n0 = 2
Big-Oh and Growth Rate
❑ The big-Oh notation gives an upper bound on the growth rate of a function.
❑ The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n).
❑ We can use the big-Oh notation to rank functions according to their growth rate.

                     f(n) is O(g(n))    g(n) is O(f(n))
g(n) grows more      Yes                No
f(n) grows more      No                 Yes
Same growth          Yes                Yes
Big-Oh Rules
❑ If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
1. Drop lower-order terms
2. Drop constant factors
❑ Use the smallest possible class of functions
◼ Say "2n is O(n)" instead of "2n is O(n²)"
❑ Use the simplest expression of the class
◼ Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
Asymptotic Algorithm Analysis
❑ The asymptotic analysis of an algorithm determines the running time in big-Oh notation.
❑ To perform the asymptotic analysis:
◼ We find the worst-case number of primitive operations executed as a function of the input size
◼ We express this function with big-Oh notation
❑ Example:
◼ We say that algorithm find_max "runs in O(n) time"
❑ Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations.
Computing Prefix Averages
❑ We further illustrate asymptotic analysis with two algorithms for prefix averages.
❑ The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
A[i] = (X[0] + X[1] + … + X[i])/(i+1)
❑ Computing the array A of prefix averages of another array X has applications to financial analysis.

[Figure: bar chart of an array X and its prefix averages A.]
Prefix Averages (Quadratic)
The following algorithm computes prefix averages in quadratic time by applying the definition.
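The slide's listing did not survive extraction; a sketch of prefixAverage1 consistent with the description:

```python
def prefix_average1(X):
    """Prefix averages by direct definition.

    The inner loop re-sums the first i+1 elements for each i,
    so the total work is 1 + 2 + ... + n, i.e. O(n^2).
    """
    n = len(X)
    A = [0.0] * n
    for i in range(n):
        total = 0
        for j in range(i + 1):   # re-sum X[0..i] from scratch
            total += X[j]
        A[i] = total / (i + 1)
    return A

print(prefix_average1([1, 2, 3]))  # [1.0, 1.5, 2.0]
```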
Arithmetic Progression
❑ The running time of prefixAverage1 is O(1 + 2 + … + n).
❑ The sum of the first n integers is n(n + 1)/2.
◼ There is a simple visual proof of this fact.
❑ Thus, algorithm prefixAverage1 runs in O(n²) time.
Prefix Averages 2 (Looks Better)
The following algorithm uses a built-in Python function to simplify the code.
Algorithm prefixAverage2 still runs in O(n²) time!
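The missing listing likely used Python's built-in sum; a sketch consistent with the description:

```python
def prefix_average2(X):
    """Prefix averages using the built-in sum.

    The code looks linear, but sum(X[0:i+1]) itself takes O(i) time,
    so the algorithm still runs in O(n^2) overall.
    """
    n = len(X)
    return [sum(X[0:i + 1]) / (i + 1) for i in range(n)]

print(prefix_average2([1, 2, 3]))  # [1.0, 1.5, 2.0]
```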
Prefix Averages 3 (Linear Time)
The following algorithm computes prefix averages in linear time by keeping a running sum.
Algorithm prefixAverage3 runs in O(n) time.
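The missing listing can be sketched as follows, keeping a running sum as the slide describes:

```python
def prefix_average3(X):
    """Prefix averages with a running sum: O(1) work per element, O(n) total."""
    n = len(X)
    A = [0.0] * n
    total = 0
    for i in range(n):
        total += X[i]            # extend the running sum by one element
        A[i] = total / (i + 1)
    return A

print(prefix_average3([1, 2, 3]))  # [1.0, 1.5, 2.0]
```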
Array-Based Sequences

Python Sequence Classes
❑ Python has built-in sequence types: list, tuple, and str.
❑ Each of these sequence types supports indexing to access an individual element of a sequence, using a syntax such as A[i].
❑ Each of these types uses an array to represent the sequence.
◼ An array is a set of memory locations that can be addressed using consecutive indices, which, in Python, start with index 0.

[Figure: an array A with cells indexed 0, 1, 2, …, i, …, n.]
Arrays of Characters or Object References
❑ An array can store primitive elements, such as characters, giving us a compact array.
❑ An array can also store references to objects.
Compact Arrays
❑ Primary support for compact arrays is in a module named array.
◼ That module defines a class, also named array, providing compact storage for arrays of primitive data types.
❑ The constructor for the array class requires a type code as a first parameter, which is a character that designates the type of data that will be stored in the array.
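As a quick illustration of the type-code constructor, using Python's standard array module:

```python
from array import array

# Type code 'i' designates a compact array of signed integers.
primes = array('i', [2, 3, 5, 7, 11])
print(primes[2])     # 5
primes.append(13)    # compact arrays still support append
```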
Type Codes in the array Class
❑ Python's array class has the following type codes:

[Table: the type codes supported by Python's array class, e.g. 'i' for a signed int and 'd' for a double-precision float.]
Insertion
❑ In an operation add(i, o), we need to make room for the new element by shifting forward the n − i elements A[i], …, A[n − 1].
❑ In the worst case (i = 0), this takes O(n) time.

[Figure: the elements from index i onward shift one cell to the right, then o is stored at A[i].]
Element Removal
❑ In an operation remove(i), we need to fill the hole left by the removed element by shifting backward the n − i − 1 elements A[i + 1], …, A[n − 1].
❑ In the worst case (i = 0), this takes O(n) time.

[Figure: the elements after index i shift one cell to the left to close the gap.]
Performance
❑ In an array-based implementation of a dynamic list:
◼ The space used by the data structure is O(n)
◼ Indexing the element at i takes O(1) time
◼ add and remove run in O(n) time in the worst case
❑ In an add operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one…
Growable Array-Based Array List
❑ In an add(o) operation (without an index), we could always add at the end.
❑ When the array is full, we replace the array with a larger one.
❑ How large should the new array be?
◼ Incremental strategy: increase the size by a constant c
◼ Doubling strategy: double the size

Algorithm add(o)
    if t = S.length − 1 then
        A ← new array of size …
        for i ← 0 to n − 1 do
            A[i] ← S[i]
        S ← A
    n ← n + 1
    S[n − 1] ← o
Comparison of the Strategies
❑ We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n add(o) operations.
❑ We assume that we start with an empty stack represented by an array of size 1.
❑ We call the amortized time of an add operation the average time taken by an add over the series of operations, i.e., T(n)/n.
Incremental Strategy Analysis
❑ We replace the array k = n/c times.
❑ The total time T(n) of a series of n add operations is proportional to
n + c + 2c + 3c + 4c + … + kc
= n + c(1 + 2 + 3 + … + k)
= n + ck(k + 1)/2
❑ Since c is a constant, T(n) is O(n + k²), i.e., O(n²).
❑ The amortized time of an add operation is O(n).
Doubling Strategy Analysis
❑ We replace the array k = log2 n times.
❑ The total time T(n) of a series of n add operations is proportional to the geometric series
n + 1 + 2 + 4 + 8 + … + 2^k
= n + 2^(k+1) − 1
= 3n − 1
❑ T(n) is O(n).
❑ The amortized time of an add operation is O(1).
Python Implementation
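The slide's listing is missing; below is a minimal sketch of a growable array using the doubling strategy analyzed above. The class name DynamicArray and the use of ctypes for the low-level storage are illustrative assumptions:

```python
import ctypes

class DynamicArray:
    """A growable array that doubles its capacity when full (a sketch)."""

    def __init__(self):
        self._n = 0                        # number of elements stored
        self._capacity = 1                 # size of the underlying array
        self._A = self._make_array(self._capacity)

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError("invalid index")
        return self._A[i]

    def append(self, obj):
        if self._n == self._capacity:      # array is full: double it
            self._resize(2 * self._capacity)
        self._A[self._n] = obj
        self._n += 1

    def _resize(self, c):
        B = self._make_array(c)            # allocate a bigger array
        for k in range(self._n):           # copy the existing elements
            B[k] = self._A[k]
        self._A = B
        self._capacity = c

    def _make_array(self, c):
        # Low-level array of c object references (illustrative choice).
        return (c * ctypes.py_object)()
```

With this class, a series of n append calls runs in O(n) total time, matching the amortized O(1) bound derived above.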
Linked Lists

Singly Linked List
A singly linked list is a concrete data structure consisting of a sequence of nodes, starting from a head pointer.
Each node stores:
◼ an element
◼ a link to the next node

[Figure: a head pointer to nodes A → B → C → D, each node holding an element and a next link.]
The Node Class for List Nodes
public class Node {
    // Instance variables:
    private Object element;
    private Node next;

    /** Creates a node with null references to its element and next node. */
    public Node() {
        this(null, null);
    }

    /** Creates a node with the given element and next node. */
    public Node(Object e, Node n) {
        element = e;
        next = n;
    }

    // Accessor methods:
    public Object getElement() {
        return element;
    }

    public Node getNext() {
        return next;
    }

    // Modifier methods:
    public void setElement(Object newElem) {
        element = newElem;
    }

    public void setNext(Node newNext) {
        next = newNext;
    }
}
Inserting at the Head
1. Allocate a new node
2. Insert the new element
3. Have the new node point to the old head
4. Update head to point to the new node
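The four steps can be sketched in Python (a minimal Node class is assumed, mirroring the Java class above):

```python
class Node:
    """Minimal Python counterpart of the Node class shown earlier."""
    def __init__(self, element, next=None):
        self.element = element
        self.next = next

def insert_at_head(head, element):
    node = Node(element)     # steps 1-2: allocate node, store element
    node.next = head         # step 3: new node points to the old head
    return node              # step 4: caller updates head to the new node

head = None
for x in ["C", "B", "A"]:
    head = insert_at_head(head, x)
# The list is now A -> B -> C.
```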
Removing at the Head
1. Update head to point to the next node in the list
2. Allow the garbage collector to reclaim the former first node
Inserting at the Tail
1. Allocate a new node
2. Insert the new element
3. Have the new node point to null
4. Have the old last node point to the new node
5. Update tail to point to the new node
Removing at the Tail
Removing at the tail of a singly linked list is not efficient!
There is no constant-time way to update the tail to point to the previous node.
Stack as a Linked List
We can implement a stack with a singly linked list.
The top element is stored at the first node of the list.
The space used is O(n) and each operation of the Stack ADT takes O(1) time.

[Figure: a top pointer t to the first of a chain of nodes holding the elements.]
Linked-List Stack in Python
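The slide's listing is missing; a minimal sketch of a linked-list stack consistent with the preceding description (the class name LinkedStack is an assumption):

```python
class LinkedStack:
    """Stack via a singly linked list; the top is the head node. All ops O(1)."""

    class _Node:
        __slots__ = "element", "next"
        def __init__(self, element, next):
            self.element = element
            self.next = next

    def __init__(self):
        self._head = None
        self._size = 0

    def __len__(self):
        return self._size

    def is_empty(self):
        return self._size == 0

    def push(self, e):
        self._head = self._Node(e, self._head)   # insert at the head
        self._size += 1

    def top(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        return self._head.element

    def pop(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        e = self._head.element
        self._head = self._head.next             # remove at the head
        self._size -= 1
        return e
```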
Queue as a Linked List
We can implement a queue with a singly linked list:
◼ The front element is stored at the first node
◼ The rear element is stored at the last node
The space used is O(n) and each operation of the Queue ADT takes O(1) time.

[Figure: a front pointer f to the first node and a rear pointer r to the last node.]
Linked-List Queue in Python
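The slide's listing is missing; a minimal sketch of a linked-list queue consistent with the preceding description (the class name LinkedQueue is an assumption):

```python
class LinkedQueue:
    """Queue via a singly linked list: dequeue at the head (front),
    enqueue at the tail (rear). All operations are O(1)."""

    class _Node:
        __slots__ = "element", "next"
        def __init__(self, element, next=None):
            self.element = element
            self.next = next

    def __init__(self):
        self._head = None
        self._tail = None
        self._size = 0

    def __len__(self):
        return self._size

    def enqueue(self, e):
        node = self._Node(e)
        if self._size == 0:
            self._head = node        # the queue was empty
        else:
            self._tail.next = node   # link after the old rear node
        self._tail = node
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue is empty")
        e = self._head.element
        self._head = self._head.next
        self._size -= 1
        if self._size == 0:          # the queue became empty
            self._tail = None
        return e
```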
