2 Intro
Introduction to
Data Structures
Last modified: 11/6/2023
Basic Concepts
• System life cycle
• Algorithm specification
• Data abstraction
• Performance analysis & measurement
Overview: System Life Cycle
Requirements
Analysis
Bottom-up
Top-down
Design
Data objects: abstract data types
Operations: specification & design of algorithms
Overview: System Life Cycle
(Cont.)
Coding & Refinement
Choose representations for data objects
Write algorithms for each operation on data
objects
Verification
Correctness proofs: select algorithms that have been proved correct
Testing: correctness & efficiency
Error removal: easier in well-documented programs
Evaluative judgments about
programs
Meet the original specification?
Work correctly?
Documented?
Use functions to create logical units?
Code readable? Use comments?
Use storage efficiently?
Running time acceptable?
Data Abstraction
Predefined & user-defined data types
struct Student {
    int id;
    int age;
};

typedef struct {
    int id;
    int age;
} Student;
pseudo code
Example 1: Selection Sort (Cont.)
C code
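A minimal sketch of selection sort in C, consistent with the pass-by-pass trace on these slides. The function name `selection_sort` and the use of `size_t` indices are assumptions made here, not taken from the original slide.

```c
#include <stddef.h>

/* Sort a[0..n-1] into increasing order.
   Pass i finds the minimum of a[i..n-1] and exchanges it with a[i]. */
void selection_sort(int a[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        size_t min = i;                     /* index of smallest seen so far */
        for (size_t j = i + 1; j < n; j++) {
            if (a[j] < a[min])
                min = j;
        }
        int tmp = a[i];                     /* exchange a[i] and a[min] */
        a[i] = a[min];
        a[min] = tmp;
    }
}
```

Calling it on {34, 8, 64, 51, 32, 21} with n = 6 performs passes 0 through 4.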
Example of Selection Sort
A[0] A[1] A[2] A[3] A[4] A[5]
Original 34 8 64 51 32 21
after pass 0 8 34 64 51 32 21
after pass 1 8 21 64 51 32 34
after pass 2 8 21 32 51 64 34
after pass 3 8 21 32 34 64 51
after pass 4 8 21 32 34 51 64
Example of Selection Sort (Cont.)
Detailed (for example, doing pass 3 after pass 2)
A[0] A[1] A[2] A[3] A[4] A[5]
Original 34 8 64 51 32 21
after pass 0 8 34 64 51 32 21
after pass 1 8 21 64 51 32 34
after pass 2 8 21 32 51 64 34
doing pass 3 8 21 32 51 64 34 (minimum of A[3..5] is 34 at A[5]; exchange with A[3])
after pass 3 8 21 32 34 64 51
after pass 4 8 21 32 34 51 64
# of comparisons: (n-1) + (n-2) + … + 1 = n(n-1)/2
Example of Binary Search
Example 2: Binary Search
Example 2: Binary Search (Cont.)
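A sketch of iterative binary search on a sorted array, since the code on these slides was not preserved. The name `binary_search` and the convention of returning -1 when the key is absent are assumptions.

```c
/* Return the index of key in a[0..n-1], which must be sorted in
   increasing order, or -1 if key is not present. */
int binary_search(const int a[], int n, int key)
{
    int left = 0, right = n - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2;   /* avoids overflow of left + right */
        if (a[mid] < key)
            left = mid + 1;                    /* key can only be in the right half */
        else if (a[mid] > key)
            right = mid - 1;                   /* key can only be in the left half */
        else
            return mid;
    }
    return -1;
}
```

Each iteration halves the search range, so at most about log2(n) + 1 iterations are needed.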
Example 3: Selection Problem
Selection problem: select the k-th largest among N
numbers
Solutions
Approach 1
• read N numbers into an array
• sort the array in decreasing order
• return the element in position k
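Approach 1 can be sketched as follows. The array is assumed to be already read into memory, the standard `qsort` stands in for whatever sort the course uses, and the names `kth_largest_sort` and `cmp_desc` are hypothetical.

```c
#include <stdlib.h>

/* Comparison function for qsort: larger values come first. */
static int cmp_desc(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a < b) - (a > b);
}

/* Sort a[0..n-1] in decreasing order (in place) and return the
   k-th largest element. Assumes 1 <= k <= n. */
int kth_largest_sort(int a[], size_t n, size_t k)
{
    qsort(a, n, sizeof a[0], cmp_desc);
    return a[k - 1];        /* position k, counting from 1 */
}
```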
Example 3: Selection Problem (cont.)
Solutions
Approach 2
• read k elements into an array
• sort them in decreasing order
• read each remaining element one by one
• ignore it if it is smaller than the k-th element
• otherwise, insert it in its correct place, bumping the smallest element out of the array
Array of k elements (decreasing order): 109 99 87 75 63 61 54 49 32 25
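Approach 2 can be sketched as below. One simplification is assumed here: instead of reading k elements and sorting them in a separate step, the first k elements are also placed by insertion, which gives the same sorted array. The name `kth_largest_window` and the caller-provided scratch array `top` are hypothetical.

```c
#include <stddef.h>

/* Return the k-th largest of a[0..n-1] while keeping only the k largest
   values seen so far in top[0..k-1], in decreasing order.
   Assumes 1 <= k <= n and that top has room for k ints. */
int kth_largest_window(const int a[], size_t n, size_t k, int top[])
{
    size_t filled = 0;                       /* how many slots of top are in use */
    for (size_t i = 0; i < n; i++) {
        int x = a[i];
        if (filled == k && x <= top[k - 1])
            continue;                        /* not larger than the k-th element: ignore */
        /* Start at the first free slot, or at the last slot when full,
           in which case the current k-th element is bumped out. */
        size_t j = (filled < k) ? filled++ : k - 1;
        while (j > 0 && top[j - 1] < x) {    /* shift smaller elements right */
            top[j] = top[j - 1];
            j--;
        }
        top[j] = x;
    }
    return top[k - 1];
}
```

Compared with Approach 1, this keeps only k elements in sorted order rather than all N.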
Which is better?
More efficient algorithm?
Recursive Algorithms
Direct recursion: functions that call themselves
Indirect recursion: functions that call other functions
that in turn invoke the calling function again
C(n,m) = n!/(m!(n-m)!)
⇒ C(n,m)=C(n-1,m)+C(n-1,m-1)
Boundary condition for recursion
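The recurrence for C(n,m) translates directly into a recursive function once boundary conditions are fixed. The sketch below assumes the usual boundaries C(n,0) = C(n,n) = 1; the name `binom` is hypothetical.

```c
/* Binomial coefficient by the recurrence C(n,m) = C(n-1,m) + C(n-1,m-1).
   Boundary conditions: C(n,0) = C(n,n) = 1, and C(n,m) = 0 when m > n.
   Simple but slow: the same subproblems are recomputed many times. */
unsigned long binom(unsigned n, unsigned m)
{
    if (m > n)
        return 0;
    if (m == 0 || m == n)
        return 1;
    return binom(n - 1, m) + binom(n - 1, m - 1);
}
```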
Recursive Factorial
n! = n × (n-1)! ⇒ fact(n) = n × fact(n-1)
0! = 1
fact(n) = n × fact(n-1)
fact(4) = 4 * fact(3)
= 4 * 3 * fact(2)
= 4 * 3 * 2 * fact(1)
= 4 * 3 * 2 * 1 * fact(0)
= 4 * 3 * 2 * 1 * 1 = 24
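The definition above maps onto C as follows. This is a sketch; the return type `unsigned long` is an assumption and overflows for large n.

```c
/* Recursive factorial: fact(n) = n * fact(n-1), with fact(0) = 1. */
unsigned long fact(unsigned n)
{
    if (n == 0)
        return 1;               /* boundary condition: 0! = 1 */
    return n * fact(n - 1);
}
```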
Recursive Multiplication
a × b = a × (b-1) + a
a × 1 = a
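A sketch of the multiplication recurrence in C. The guard for b = 0 is an addition made here so the recursion always terminates; the definition above only covers b ≥ 1. The name `mult` is hypothetical.

```c
/* Recursive multiplication: a * b = a * (b-1) + a, with a * 1 = a. */
unsigned long mult(unsigned long a, unsigned b)
{
    if (b == 0)
        return 0;               /* extra guard, not part of the definition above */
    if (b == 1)
        return a;               /* boundary condition */
    return mult(a, b - 1) + a;
}
```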
Recursive Summation
sum(1, n) = sum(1, n-1) + n
sum(1, 1) = 1
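The summation recurrence can be sketched the same way. The name `sum_to` is hypothetical, and treating n = 0 as an empty sum is an assumption beyond the definition above.

```c
/* sum(1, n) = sum(1, n-1) + n, with sum(1, 1) = 1. */
unsigned long sum_to(unsigned n)
{
    if (n <= 1)
        return n;               /* boundary: sum(1,1) = 1; n = 0 gives 0 (assumption) */
    return sum_to(n - 1) + n;
}
```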
Recursive binary search
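A sketch of the recursive form, assuming the array is sorted in increasing order and that `left` and `right` are inclusive bounds. The name `rbinary_search` and the -1 return value for a missing key are assumptions.

```c
/* Search a[left..right] (sorted, inclusive bounds) for key.
   Returns its index, or -1 if key is not present. */
int rbinary_search(const int a[], int key, int left, int right)
{
    if (left > right)
        return -1;                              /* boundary: empty range */
    int mid = left + (right - left) / 2;
    if (a[mid] == key)
        return mid;
    if (a[mid] < key)
        return rbinary_search(a, key, mid + 1, right);
    return rbinary_search(a, key, left, mid - 1);
}
```

The initial call for an array of n elements is `rbinary_search(a, key, 0, n - 1)`.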
Recursive Permutations
Permutation of {a, b, c}
Recursion?
a+Perm({b,c})=> {a, b, c} and {a, c, b}
b+Perm({a,c})=> {b, a, c} and {b, c, a}
c+Perm({a,b})=> {c, a, b} and {c, b, a}
Recursive Permutations (cont.)
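One common way to code the idea "pick each element as the head, then permute the rest" is by swapping within the array. The sketch below is an assumed implementation, not necessarily the one on the original slide; the returned count is added here for checking. Within each group the printing order may differ from the listing above.

```c
#include <stdio.h>

static void swap_char(char *x, char *y)
{
    char t = *x;
    *x = *y;
    *y = t;
}

/* Print every permutation of list[i..n] (inclusive bounds), one per line,
   and return how many were printed. list must be a NUL-terminated string.
   The array is restored to its original order on return. */
unsigned long perm(char list[], int i, int n)
{
    if (i >= n) {
        puts(list);                       /* boundary: nothing left to permute */
        return 1;
    }
    unsigned long count = 0;
    for (int j = i; j <= n; j++) {
        swap_char(&list[i], &list[j]);    /* choose list[j] as the head */
        count += perm(list, i + 1, n);    /* permute the remaining elements */
        swap_char(&list[i], &list[j]);    /* undo before the next choice */
    }
    return count;
}
```

For the string "abc", the call `perm(s, 0, 2)` prints 3! = 6 lines.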
Performance Evaluation
Performance analysis: machine independent
Performance measurement: machine dependent
Performance Analysis
Complexity theory
Space complexity: amount of memory
Time complexity: amount of computer time
Space Complexity
S(P) = c + Sp(I)
c: fixed space (instructions, simple variables,
constants)
Sp(I): depends on characteristics of instance I
Characteristics: number (n), values of input and
output associated with I
* if n is the only characteristic, Sp(I) ⇒ Sp(n)
Time Complexity
T(P) = c + Tp(I)
c: compile time (or constant time)
Tp(I): program execution time depends on
characteristics of instance I
Characteristic: number, values of input and
output associated with I
* predict the growth in run time as the instance
characteristics change
Time Complexity (cont’d)
Compile time (C) independent of instance
characteristics
Run (execution) time Tp
Definition
A program step is a syntactically or semantically
meaningful program segment whose execution
time is independent of the instance
characteristics.
Tabular Method
Recursive summing of a list of
numbers
Matrix addition
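The step-count tables for these two examples were not preserved. The sketch below instruments both functions with a global counter so the totals can be checked by running them. The counting convention (one step per loop test, per assignment, and per return) and the names `step_count`, `rsum`, `add`, and `MAX_SIZE` are assumptions.

```c
#define MAX_SIZE 10          /* hypothetical fixed column bound */

long step_count = 0;         /* global step counter */

/* Recursive sum of list[0..n-1]; total steps: 2n + 2. */
float rsum(const float list[], int n)
{
    step_count++;                            /* the if test */
    if (n) {
        step_count++;                        /* return with the recursive call */
        return rsum(list, n - 1) + list[n - 1];
    }
    step_count++;                            /* return 0 */
    return 0;
}

/* c = a + b; total steps: 2*rows*cols + 2*rows + 1. */
void add(int rows, int cols, int a[][MAX_SIZE], int b[][MAX_SIZE],
         int c[][MAX_SIZE])
{
    for (int i = 0; i < rows; i++) {
        step_count++;                        /* i loop test that succeeded */
        for (int j = 0; j < cols; j++) {
            step_count++;                    /* j loop test that succeeded */
            c[i][j] = a[i][j] + b[i][j];
            step_count++;                    /* the assignment */
        }
        step_count++;                        /* final, failing j loop test */
    }
    step_count++;                            /* final, failing i loop test */
}
```

With n = 3 the counter ends at 8 for `rsum`; with rows = 2 and cols = 3 it ends at 17 for `add`.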
Time Complexity (cont’d)
Worst case
Best case
Average case
Time Complexity (cont’d)
Difficult to determine the exact step counts
What a step stands for is inexact
e.g. x := y vs. x := y + z + (x/y) + …
Exact step count is not useful for comparison
Step count doesn’t tell how much time a step
takes
Break-even point
Asymptotic Notation - Big “oh”
Worst Case
f(n) = O(g(n)) iff
∃ positive constants c and n0 such that f(n) ≤ c·g(n) ∀ n ≥ n0
e.g.
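Two standard instances of the definition, each with one valid choice of c and n0 (other choices also work; these particular examples are supplied here and may differ from the ones on the original slide):

```latex
\begin{align*}
3n + 2 &\le 4n && \text{for all } n \ge 2
  &&\Rightarrow\; 3n + 2 = O(n) && (c = 4,\ n_0 = 2) \\
10n^2 + 4n + 2 &\le 11n^2 && \text{for all } n \ge 5
  &&\Rightarrow\; 10n^2 + 4n + 2 = O(n^2) && (c = 11,\ n_0 = 5)
\end{align*}
```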
Rule 2:
Rule 3:
Running Time Calculations
for loops
➔ O (n)
Running Time Calculations
(cont’d)
nested for loops
➔ O (n²)
Running Time Calculations
(cont’d)
consecutive statements
➔ O (n²)
Running Time Calculations
(cont’d)
If/else
➔ O (n)
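The code fragments for these four cases were not preserved. The sketch below gives one hypothetical function per case, each returning how many times its innermost statement runs, so the growth rates can be checked directly. All function names are assumptions.

```c
/* for loop: the body runs n times, so O(n). */
long single_loop(int n)
{
    long k = 0;
    for (int i = 0; i < n; i++)
        k++;
    return k;
}

/* nested for loops: the body runs n * n times, so O(n^2). */
long nested_loops(int n)
{
    long k = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            k++;
    return k;
}

/* consecutive statements: O(n) followed by O(n^2); the larger term
   dominates, so O(n^2) overall. */
long consecutive(int n)
{
    return single_loop(n) + nested_loops(n);
}

/* if/else: cost of the test plus the more expensive branch.
   Here the branches are O(n) and O(1), so the bound is O(n). */
long if_else(int n, int flag)
{
    if (flag)
        return single_loop(n);
    return 1;
}
```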
Running Time Calculations-
Recursive
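The examples for this slide were not preserved. As a hedged illustration, the sketch below counts calls for two recursions defined earlier in these slides: factorial makes one recursive call per level, so its running time is O(n), while the C(n,m) recurrence makes two recursive calls per level and grows much faster. The counters and the `_counted` names are assumptions.

```c
long fact_calls = 0;    /* number of calls made by fact_counted */
long binom_calls = 0;   /* number of calls made by binom_counted */

/* T(n) = T(n-1) + c: exactly n + 1 calls, so O(n). */
unsigned long fact_counted(unsigned n)
{
    fact_calls++;
    if (n == 0)
        return 1;
    return n * fact_counted(n - 1);
}

/* Two recursive calls per level; assumes m <= n.
   The total number of calls is 2*C(n,m) - 1. */
unsigned long binom_counted(unsigned n, unsigned m)
{
    binom_calls++;
    if (m == 0 || m == n)
        return 1;
    return binom_counted(n - 1, m) + binom_counted(n - 1, m - 1);
}
```

Computing 10! takes 11 calls, while computing C(5,2) = 10 already takes 19.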
Typical Growth Rate
Performance Measurement
Timing event
in C's standard library time.h
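clock function: processor time used by the program, in clock ticks (divide by CLOCKS_PER_SEC for seconds)
time function: calendar time as a time_t (use difftime for elapsed seconds)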
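A sketch of both timing methods using only standard `time.h` calls (`clock`, `CLOCKS_PER_SEC`, `time`, `difftime`). The helper names `time_with_clock`, `time_with_time`, and the sample workload `busy_work` are hypothetical. Repeating the workload `reps` times is a common way to get a measurable interval when a single run is shorter than the timer resolution.

```c
#include <time.h>

/* Sample workload: a loop the compiler cannot optimize away. */
void busy_work(int n)
{
    volatile long sink = 0;
    for (int i = 0; i < n; i++)
        sink += i;
}

/* Method 1: clock() measures processor time in ticks. */
double time_with_clock(void (*work)(int), int arg, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        work(arg);
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;   /* seconds */
}

/* Method 2: time() measures calendar time, typically with a
   resolution of one second, so it suits only long runs. */
double time_with_time(void (*work)(int), int arg, int reps)
{
    time_t start = time(NULL);
    for (int r = 0; r < reps; r++)
        work(arg);
    time_t stop = time(NULL);
    return difftime(stop, start);                     /* seconds */
}
```

To get seconds per run, divide the returned duration by `reps`.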