DATA STRUCTURE
Introduction
Introduction
Why you should learn algorithms and data structures:
1 - It improves the way you think: you become more aware of the code you write.
2 - It brings clarity and organization to your ideas.
Example: you are asked to find the highest grade average among a group of students.
5 - The fun of solving programming problems.
A computer is an electronic machine used for data processing and manipulation.
When a programmer collects such data for processing, it must all be stored in the computer's main memory.
To understand how a computer works, we need to know:
Representation of data in the computer.
Accessing of data.
How to solve a problem step by step.
For all of these tasks we use data structures.
What is a Data Structure?
A data structure is a specialized format for organizing, processing, retrieving and storing data.
In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.
Data Structures types example
Computer Architecture Representation
How the CPU Executes Program Instructions
A data structure can be defined as a group of data elements that provides an efficient way of storing and organizing data in the computer so that it can be used efficiently.
Examples of data structures are arrays, linked lists, stacks, queues, etc.
Data structures are widely used in almost every aspect of computer science, e.g. operating systems, compiler design, artificial intelligence, graphics and many more.
Data structures are the main part of many computer science algorithms, as they enable programmers to handle data in an efficient way.
They play a vital role in enhancing the performance of software, as the main function of software is to store and retrieve the user's data as fast as possible.
Data Structure
◦ A data structure is a particular way of organizing data in a computer so that it can be used effectively.
◦ For example, we can store a list of items having the same data type using the array data structure.
The representation of a particular data structure in the main memory of a computer is called the storage structure.
The storage structure representation in auxiliary memory is called the file structure.
A data structure is defined as the way of storing and manipulating data in organized form so that it can be used efficiently.
A data structure mainly specifies the following four things:
1) organization of data  2) accessing method  3) degree of associativity  4) processing alternatives for information
Algorithm + Data Structure = Program
Data Structure study Covers the following points
1) Amount of memory required to store
2) Amount of time required to process
3) Representation of data in memory
4) Operations performs on data
Types of Data Structures
Types Of DS
Linear:
◦ Every item is related to its previous and next item.
◦ Data is arranged in a linear sequence.
◦ Data items can be traversed in a single run.
◦ E.g. Array, Stack, Linked list, Queue
◦ Implementation is easy.
Non-linear:
◦ Every item is attached to many other items.
◦ Data is not arranged in sequence.
◦ Data cannot be traversed in a single run.
◦ E.g. Tree, Graph
◦ Implementation is difficult.
Operation on Data Structures
Design of efficient data structure must take operations to be performed on the DS into account.
The most commonly used operations on DS are broadly categorized into the following types:
1. Create: This operation reserves memory for program elements. It can be done with a declaration statement; creation of a DS may take place either at compile time or at run time.
2. Destroy: This operation destroys the memory space allocated for the specified data structure.
3. Selection: This operation deals with accessing a particular data item within a data structure.
4. Updation: It updates or modifies the data in the data structure.
5. Searching: It finds the presence of a desired data item in the list of data items; it may also find the locations of all elements that satisfy certain conditions.
6. Sorting: This is the process of arranging all data items in a DS in a particular order, for example ascending or descending.
7. Splitting: It is the process of partitioning a single list into multiple lists.
8. Merging: It is the process of combining the data items of two different sorted lists into a single sorted list.
9. Traversing: It is the process of visiting each and every node of a list in a systematic manner.
What are Arrays?
An array is a container which can hold a fixed number of items, and these items should be of the same type.
Most of the data structures make use of arrays to implement their algorithms.
• Following are the important terms to understand the concept of an array:
Element − Each item stored in an array is called an element.
Index − Each location of an element in an array has a numerical index, which is used to identify the element.
1. An array is a container of elements.
2. Elements have a specific value and data type, like "ABC", TRUE or FALSE, etc.
3. Each element also has its own index, which is used to access the element.
• Elements are stored at contiguous memory locations.
• An index is always less than the total number of array items.
• In terms of syntax, any variable that is declared as an array can store multiple values.
• Almost all languages have the same comprehension of arrays but have different ways of declaring and initializing them.
• However, three parts will always remain common in all the initializations, i.e., array name, elements, and the data type of elements.
• Array name: necessary for easy reference to the collection of elements
• Data type: necessary for type checking and data integrity
• Elements: these are the data values present in an array
How to access a specific array value?
You can access any array item by using its index.
Syntax:
arrayName[indexNum]
Example:
balance[1]
Here, we have accessed the second value of the array using its index, which is 1. The output of this will be 200, which is basically the second value of the balance array.
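The example above can be run directly in Python; the slide only fixes the value 200 at index 1, so the other values are assumed for illustration.

```python
# The balance array from the slide's example; Python lists are zero-indexed,
# so index 1 refers to the *second* value. Values other than 200 are assumed.
balance = [100, 200, 300, 400]

print(balance[1])  # 200
```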
◦ Array Representation
◦ Arrays can be declared in various ways in different languages. For illustration, let's take a C array declaration.
Analysis of Algorithms
Algorithm Review
Running Time
❑ Average case time is often difficult to determine.
❑ We focus on the worst case running time.
◼ Easier to analyze
[Chart: running time vs. input size]
Good Algorithms?
Complexity
❑ In examining algorithm efficiency we must understand
the idea of complexity
Space Complexity
❑ When memory was expensive we focused on making programs
as space efficient as possible and developed schemes to make
memory appear larger than it really was (virtual memory and
memory paging schemes)
Experimental Studies
❑ Write a program implementing the algorithm
❑ Plot the results
[Chart: time (ms) vs. input size]
Time Complexity
❑ Is the algorithm “fast enough” for my needs?
Limitations of Experiments
❑ It is necessary to implement the algorithm, which may
be difficult
Pseudocode Details
❑ Control flow
◼ if … then … [else …]
◼ while … do …
◼ repeat … until …
◼ for … do …
◼ Indentation replaces braces
❑ Method declaration
Algorithm method (arg [, arg…])
Input …
Output …
❑ Method call
method (arg [, arg…])
❑ Return value
return expression
❑ Expressions:
← Assignment
= Equality testing
n^2 Superscripts and other mathematical formatting allowed
The Random Access Machine (RAM) Model
❑ A CPU
Growth Rates
◼ N-Log-N: n log n
◼ Quadratic: n^2
◼ Cubic: n^3
◼ Exponential: 2^n
❑ In a log-log chart, the slope of the line corresponds to the growth rate
[Chart: T(n) vs. n on log-log axes]
Primitive Operations
❑ Basic computations performed by an algorithm
❑ Identifiable in pseudocode
❑ Largely independent from the programming language
❑ Exact definition not important (we will see why later)
❑ Assumed to take a constant amount of time in the RAM model
❑ Examples:
◼ Evaluating an expression
◼ Assigning a value to a variable
◼ Indexing into an array
◼ Calling a method
◼ Returning from a method
Algorithm Efficiency
❑ A measure of the amount of resources consumed in solving a
problem of size n
◼ time
◼ space
❑ Benchmarking: implement algorithm,
◼ run with some specific input and measure time taken
◼ better for comparing performance of processors than for comparing
performance of algorithms
❑ Big Oh (asymptotic analysis)
◼ associates n, the problem size,
◼ with t, the processing time required to solve the problem
Counting Primitive Operations
❑ By inspecting the pseudocode, we can determine the maximum number
of primitive operations executed by an algorithm, as a function of the
input size
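As a sketch of such counting, here is the find_max algorithm referred to later in these slides, in Python, with per-line operation notes; the implementation details are illustrative, not the slides' own listing.

```python
def find_max(data):
    """Return the largest element of a non-empty list; scans the list once."""
    biggest = data[0]        # constant number of primitive ops (index + assign)
    for item in data[1:]:    # loop body executes n - 1 times
        if item > biggest:   # 1 comparison per iteration
            biggest = item   # at most 1 assignment per iteration
    return biggest           # 1 operation

# Total: a constant number of primitive operations per element, so O(n).
print(find_max([3, 1, 4, 1, 5, 9, 2]))  # 9
```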
Performance Classification
f(n) Classification
1 Constant: run time is fixed, and does not depend upon n. Most instructions are
executed once, or only a few times, regardless of the amount of information being
processed
log n Logarithmic: when n increases, so does run time, but much slower. Common in
programs which solve large problems by transforming them into smaller problems.
n Linear: run time varies directly with n. Typically, a small amount of processing is
done on each element.
n log n When n doubles, run time slightly more than doubles. Common in programs which
break a problem down into smaller sub-problems, solve them independently, then
combine the solutions.
n^2 Quadratic: when n doubles, runtime increases fourfold. Practical only for small
problems; typically the program processes all pairs of input (e.g. in a double nested
loop).
n^3 Cubic: when n doubles, runtime increases eightfold.
2^n Exponential: when n doubles, the run time squares. This is often the result of a
natural, “brute force” solution.
Size does matter[1]
N     log2(N)   5N      N·log2(N)   N^2       2^N
8     3         40      24          64        256
16    4         80      64          256       65536
32    5         160     160         1024      ~10^9
64    6         320     384         4096      ~10^19
128   7         640     896         16384     ~10^38
256   8         1280    2048        65536     ~10^76
COMPLEXITY CLASSES
[Chart: time (steps) vs. input size (n)]
Array – Linear Search
[Figure sequence: linear-search walkthrough]
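The linear-search figures did not survive conversion; a minimal Python sketch of the technique they illustrate:

```python
def linear_search(arr, target):
    """Scan left to right; O(n) in the worst case (target absent or last)."""
    for i, value in enumerate(arr):
        if value == target:
            return i      # index of the first match
    return -1             # not found

print(linear_search([10, 40, 30, 50], 30))  # 2
```

Note that linear search needs no ordering of the array: it simply inspects every element until it finds the target.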
Array – Binary Search
[Figure sequence: binary-search walkthrough]
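Likewise, a minimal Python sketch of binary search; unlike linear search it requires the array to be sorted, and it halves the remaining range each step, giving O(log n):

```python
def binary_search(arr, target):
    """Search a sorted (ascending) array by repeatedly halving the range."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid           # found
        elif arr[mid] < target:
            low = mid + 1        # discard the left half
        else:
            high = mid - 1       # discard the right half
    return -1                    # not found

print(binary_search([10, 20, 30, 40, 50], 40))  # 3
```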
Size does matter[2]
❑ Suppose a program has run time O(n!) and the run time for n = 10 is 1 second
❑ Examples
◼ 10^2·n + 10^5 is a linear function
◼ 10^5·n^2 + 10^8·n is a quadratic function
[Chart: T(n) vs. n on log-log axes]
Standard Analysis Techniques
❑ Constant time statements
❑ Analyzing Loops
❑ Analyzing Nested Loops
❑ Analyzing Sequence of Statements
❑ Analyzing Conditional Statements
Constant time statements
❑ Simplest case: O(1) time statements
❑ Assignment statements of simple data types
int x = y;
❑ Arithmetic operations:
x = 5 * y + 4 - z;
❑ Array referencing:
x = A[j];
❑ Array assignment:
A[j] = 5;
❑ Most conditional tests:
if (x < 12) ...
Best Case
❑ Best case is defined as which input of size n is
cheapest among all inputs of size n.
❑ “The best case for my algorithm is n=1 because that
is the fastest.” WRONG!
Analyzing Loops[1]
❑ Any loop has two parts:
◼ How many iterations are performed?
◼ How many steps per iteration?
int sum = 0,j;
for (j=0; j < N; j++)
sum = sum +j;
◼ Loop executes N times (0..N-1)
◼ 4 = O(1) steps per iteration
❑ Total time is N * O(1) = O(N*1) = O(N)
ANALYZING LOOPS – LINEAR LOOPS
❑ Example (have a look at this code segment):
[code segment figure]
❑ Asymptotically, efficiency is: O(n)
Analyzing Loops[2]
❑ What about this for loop?
int sum =0, j;
for (j=0; j < 100; j++)
sum = sum +j;
❑ Loop executes 100 times
❑ 4 = O(1) steps per iteration
❑ Total time is 100 * O(1) = O(100 * 1) =
O(100) = O(1)
Analyzing Nested Loops[1]
❑ Treat just like a single loop and evaluate each level of nesting
as needed:
int j,k;
for (j=0; j<N; j++)
for (k=N; k>0; k--)
sum += k+j;
❑ Start with outer loop:
◼ How many iterations? N
◼ How much time per iteration? Need to evaluate inner loop
❑ Inner loop uses O(N) time
❑ Total time is N * O(N) = O(N*N) = O(N2)
Analyzing Nested Loops[2]
❑ What if the number of iterations of one loop depends on the
counter of the other?
int j,k;
for (j=0; j < N; j++)
for (k=0; k < j; k++)
sum += k+j;
❑ Analyze inner and outer loop together:
❑ Number of iterations of the inner loop is:
❑ 0 + 1 + 2 + ... + (N-1) = O(N2)
HOW DID WE GET THIS ANSWER?
❑ By Gauss's formula: 0 + 1 + 2 + … + (N-1) = N(N-1)/2, and the leading N^2/2 term gives O(N^2).
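The iteration count above can be checked by brute force; this quick illustrative script counts how often the inner-loop body of the nested loop runs:

```python
def inner_iterations(N):
    """Count how many times the inner-loop body runs in the nested loop."""
    count = 0
    for j in range(N):
        for k in range(j):   # the inner loop runs j times
            count += 1
    return count

# Matches Gauss's formula 0 + 1 + ... + (N-1) = N(N-1)/2 for any N.
N = 100
assert inner_iterations(N) == N * (N - 1) // 2
```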
Analyzing Recursive Calls
❑ If N ≥ 2, the running time T(N) is the cost of each step taken plus the time
required to compute power(x, n-1) (i.e. T(N) = 2 + T(N-1) for N ≥ 2)
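The power routine the recurrence refers to is not reproduced in the source; a sketch matching T(N) = 2 + T(N-1), with the base case assumed:

```python
def power(x, n):
    """Compute x**n recursively for n >= 0."""
    if n == 0:
        return 1                     # base case: constant time
    return x * power(x, n - 1)       # T(N) = O(1) + T(N-1)  =>  O(N) overall

print(power(2, 10))  # 1024
```

Unrolling the recurrence gives T(N) = 2N + constant, i.e. linear time.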
Big-Oh Notation
❑ Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive
constants c and n0 such that f(n) ≤ c·g(n) for n ≥ n0
[Chart: example functions vs. n]
Analyzing Conditional Statements
What about conditional statements such as
if (condition)
statement1;
else
statement2;
where statement1 runs in O(N) time and statement2 runs in O(N^2) time?
We use "worst case" complexity: among all inputs of size N, what is the maximum running
time?
The analysis for the example above is O(N^2)
Big-Oh Example
❑ Example: the function n^2 is not O(n)
[Chart: n^2 vs. n on log-log axes]
More Big-Oh Examples
◼ 7n-2
7n-2 is O(n)
need c > 0 and n0 ≥ 1 such that 7n-2 ≤ c·n for n ≥ n0
this is true for c = 7 and n0 = 1
◼ 3n^3 + 20n^2 + 5
3n^3 + 20n^2 + 5 is O(n^3)
need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ c·n^3 for n ≥ n0
this is true for c = 4 and n0 = 21
◼ 3 log n + 5
3 log n + 5 is O(log n)
need c > 0 and n0 ≥ 1 such that 3 log n + 5 ≤ c·log n for n ≥ n0
this is true for c = 8 and n0 = 2
Big-Oh and Growth Rate
❑ The big-Oh notation gives an upper bound on the
growth rate of a function
❑ The statement “f(n) is O(g(n))” means that the growth
rate of f(n) is no more than the growth rate of g(n)
❑ We can use the big-Oh notation to rank functions
according to their growth rate
❑ Example:
◼ We say that algorithm find_max “runs in O(n) time”
Prefix Averages (Quadratic)
The following algorithm computes prefix averages in
quadratic time by applying the definition
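The algorithm listing did not survive conversion; a sketch of the definition-based version (the name prefix_average1 mirrors the slides' prefixAverage1):

```python
def prefix_average1(S):
    """Return list A where A[i] is the average of S[0..i].

    The inner loop re-sums the whole prefix every time: 1 + 2 + ... + n
    additions in total, so the running time is O(n^2).
    """
    n = len(S)
    A = [0] * n
    for i in range(n):
        total = 0
        for j in range(i + 1):      # sums i + 1 elements from scratch
            total += S[j]
        A[i] = total / (i + 1)
    return A

print(prefix_average1([1, 3, 5]))  # [1.0, 2.0, 3.0]
```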
Arithmetic Progression
❑ The running time of prefixAverage1 is O(1 + 2 + … + n)
❑ The sum of the first n integers is n(n+1)/2, so prefixAverage1 runs in O(n^2) time
Prefix Averages 2 (Looks Better)
The following algorithm uses an internal Python
function to simplify the code
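The listing is again missing from the source; a sketch using the built-in sum(). Note that it is still O(n^2), since sum(S[0:i+1]) hides a loop over the prefix:

```python
def prefix_average2(S):
    """Shorter than prefix_average1, but asymptotically no faster."""
    n = len(S)
    A = [0] * n
    for i in range(n):
        A[i] = sum(S[0:i + 1]) / (i + 1)   # hidden O(i) loop inside sum()
    return A

print(prefix_average2([1, 3, 5]))  # [1.0, 2.0, 3.0]
```

A truly linear-time version would instead keep a running total across iterations rather than re-summing the prefix.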
Array-Based Sequences
Python Sequence Classes
❑ Python has built-in types, list, tuple, and str.
❑ Each of these sequence types supports indexing to access an individual
element of a sequence, using a syntax such as A[i]
❑ Each of these types uses an array to represent the sequence.
◼ An array is a set of memory locations that can be addressed using consecutive
indices, which, in Python, start with index 0.
[Figure: array A, cells indexed 0, 1, 2, …, i, …, n]
Arrays of Characters or Object References
❑ An array can store primitive elements, such as characters, giving
us a compact array.
Compact Arrays
❑ Primary support for compact arrays is in a module named
array.
◼ That module defines a class, also named array, providing compact
storage for arrays of primitive data types.
❑ The constructor for the array class requires a type code as a
first parameter, which is a character that designates the type of
data that will be stored in the array.
Type Codes in the array Class
❑ Python’s array class has the following type codes:
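The type-code table did not survive conversion. Among the codes Python's array class accepts are 'b' (signed char), 'i' (signed int), 'f' (float) and 'd' (double); for example:

```python
from array import array

# 'i' designates signed ints; elements are stored compactly, not as objects.
counts = array('i', [2, 4, 6])
counts.append(8)

print(counts[1])        # 4
print(counts.typecode)  # i
```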
Insertion
❑ In an operation add(i, o), we need to make room for the new element by
shifting forward the n - i elements A[i], …, A[n - 1]
❑ In the worst case (i = 0), this takes O(n) time
[Figures: A before, during and after shifting A[i..n-1] forward to make room for o at index i]
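The shifting can be sketched on a plain Python list standing in for the array A; the helper name add and the spare-capacity convention are assumptions:

```python
def add(A, n, i, o):
    """Insert o at index i in A[0..n-1], shifting A[i..n-1] forward one slot.

    Assumes A has spare capacity (len(A) > n).
    Worst case i = 0: n shifts, so O(n) time.
    """
    for k in range(n, i, -1):   # walk backward so no value is overwritten
        A[k] = A[k - 1]
    A[i] = o
    return n + 1                # new element count

A = [10, 20, 30, None]          # n = 3 used cells, one spare
n = add(A, 3, 0, 5)             # worst case: insert at the front
print(A)  # [5, 10, 20, 30]
```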
Element Removal
❑ In an operation remove(i), we need to fill the hole left by the removed
element by shifting backward the n - i - 1 elements A[i + 1], …, A[n - 1]
❑ In the worst case (i = 0), this takes O(n) time
[Figures: A before, during and after shifting A[i+1..n-1] backward to fill the hole at index i]
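Symmetrically, a sketch of the backward shift; the helper name remove is an assumption:

```python
def remove(A, n, i):
    """Remove A[i] from A[0..n-1], shifting A[i+1..n-1] back one slot.

    Worst case i = 0: n - 1 shifts, so O(n) time.
    """
    for k in range(i, n - 1):
        A[k] = A[k + 1]
    A[n - 1] = None             # clear the vacated cell
    return n - 1                # new element count

A = [5, 10, 20, 30]
n = remove(A, 4, 0)             # worst case: remove the front element
print(A)  # [10, 20, 30, None]
```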
Performance
❑ In an array based implementation of a dynamic list:
◼ The space used by the data structure is O(n)
◼ Indexing the element at i takes O(1) time
◼ add and remove run in O(n) time in worst case
❑ In an add operation, when the array is full, instead of throwing
an exception, we can replace the array with a larger one…
Growable Array-based Array List
❑ In an add(o) operation (without an index), we could always add at the end
❑ When the array is full, we replace the array with a larger one
❑ How large should the new array be?
◼ Incremental strategy: increase the size by a constant c
◼ Doubling strategy: double the size

Algorithm add(o)
  if t = S.length - 1 then
    A ← new array of size …
    for i ← 0 to n-1 do
      A[i] ← S[i]
    S ← A
  n ← n + 1
  S[n-1] ← o
Comparison of the Strategies
❑ We compare the incremental strategy and the doubling
strategy by analyzing the total time T(n) needed to perform
a series of n add(o) operations
❑ We assume that we start with an empty stack represented
by an array of size 1
❑ We call amortized time of an add operation the average
time taken by an add over the series of operations, i.e.,
T(n)/n
Incremental Strategy Analysis
❑ We replace the array k = n/c times
❑ The total time T(n) of a series of n add operations is
proportional to
n + c + 2c + 3c + 4c + … + kc =
n + c(1 + 2 + 3 + … + k) =
n + ck(k + 1)/2
❑ Since c is a constant, T(n) is O(n + k2), i.e., O(n2)
❑ The amortized time of an add operation is O(n)
Doubling Strategy Analysis
❑ We replace the array k = log2 n times
❑ The total time T(n) of a series of n add operations is proportional to the geometric series
n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) - 1 = 3n - 1
❑ T(n) is O(n)
❑ The amortized time of an add operation is O(1)
Python Implementation
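The implementation listing did not survive conversion; a minimal doubling-strategy sketch, using ctypes for a raw fixed-size array. The class and method names are illustrative:

```python
import ctypes

class DynamicArray:
    """Growable array using the doubling strategy (names are illustrative)."""

    def __init__(self):
        self._n = 0                       # number of elements stored
        self._capacity = 1                # start with an array of size 1
        self._A = self._make_array(1)

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError('invalid index')
        return self._A[i]

    def append(self, obj):
        if self._n == self._capacity:     # array full: double the capacity
            self._resize(2 * self._capacity)
        self._A[self._n] = obj
        self._n += 1

    def _resize(self, c):                 # O(n) copy, amortized O(1) per append
        B = self._make_array(c)
        for k in range(self._n):
            B[k] = self._A[k]
        self._A = B
        self._capacity = c

    def _make_array(self, c):
        return (c * ctypes.py_object)()   # raw array of c object references

arr = DynamicArray()
for x in range(5):
    arr.append(x)
print(len(arr), arr[4])  # 5 4
```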
Linked Lists
Singly Linked List
A singly linked list is a concrete data structure consisting of a sequence of nodes, starting from a head pointer.
Each node stores:
◼ element
◼ link to the next node
[Figure: head → A → B → C → D, each node holding an element and a next pointer]
The Node Class for List Nodes
public class Node {
  // Instance variables:
  private Object element;
  private Node next;

  /** Creates a node with null references to its element and next node. */
  public Node() {
    this(null, null);
  }

  /** Creates a node with the given element and next node. */
  public Node(Object e, Node n) {
    element = e;
    next = n;
  }

  // Accessor methods:
  public Object getElement() {
    return element;
  }

  public Node getNext() {
    return next;
  }

  // Modifier methods:
  public void setElement(Object newElem) {
    element = newElem;
  }

  public void setNext(Node newNext) {
    next = newNext;
  }
}
Inserting at the Head
1. Allocate a new
node
2. Insert new element
3. Have new node
point to old head
4. Update head to
point to new node
Removing at the Head
1. Update head to
point to next node
in the list
2. Allow garbage
collector to reclaim
the former first
node
Inserting at the Tail
1. Allocate a new
node
2. Insert new element
3. Have new node
point to null
4. Have old last node
point to new node
5. Update tail to point
to new node
Removing at the Tail
Removing at the tail
of a singly linked list
is not efficient!
There is no
constant-time way
to update the tail to
point to the previous
node
Stack as a Linked List
We can implement a stack with a singly linked list.
The top element is stored at the first node of the list.
The space used is O(n) and each operation of the Stack ADT takes O(1) time.
[Figure: top pointer t → chain of nodes holding the elements]
Linked-List Stack in Python
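The listing did not survive conversion; a minimal sketch of a linked-list stack (class and method names are illustrative). push and pop work at the head, so both are O(1):

```python
class LinkedStack:
    """Stack backed by a singly linked list; the top element is the head."""

    class _Node:
        __slots__ = '_element', '_next'
        def __init__(self, element, next):
            self._element = element
            self._next = next

    def __init__(self):
        self._head = None
        self._size = 0

    def __len__(self):
        return self._size

    def push(self, e):
        # Insert at the head: O(1)
        self._head = self._Node(e, self._head)
        self._size += 1

    def pop(self):
        # Remove at the head: O(1)
        if self._head is None:
            raise IndexError('stack is empty')
        e = self._head._element
        self._head = self._head._next
        self._size -= 1
        return e

s = LinkedStack()
s.push('A'); s.push('B')
print(s.pop())  # B  (last in, first out)
```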
Queue as a Linked List
We can implement a queue with a singly linked list
◼ The front element is stored at the first node
◼ The rear element is stored at the last node
The space used is O(n) and each operation of the Queue ADT takes O(1) time
[Figure: front pointer f → first node, rear pointer r → last node]
Linked-List Queue in Python
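The listing did not survive conversion; a minimal sketch of a linked-list queue with head (front, f) and tail (rear, r) pointers. Names are illustrative; both operations are O(1):

```python
class LinkedQueue:
    """Queue backed by a singly linked list: dequeue at head, enqueue at tail."""

    class _Node:
        __slots__ = '_element', '_next'
        def __init__(self, element, next):
            self._element = element
            self._next = next

    def __init__(self):
        self._head = None     # front (f): where we dequeue
        self._tail = None     # rear (r): where we enqueue
        self._size = 0

    def __len__(self):
        return self._size

    def enqueue(self, e):
        # Insert at the tail: O(1)
        node = self._Node(e, None)
        if self._tail is None:            # empty queue: node is also the head
            self._head = node
        else:
            self._tail._next = node
        self._tail = node
        self._size += 1

    def dequeue(self):
        # Remove at the head: O(1)
        if self._head is None:
            raise IndexError('queue is empty')
        e = self._head._element
        self._head = self._head._next
        self._size -= 1
        if self._head is None:            # queue became empty
            self._tail = None
        return e

q = LinkedQueue()
q.enqueue('A'); q.enqueue('B')
print(q.dequeue())  # A  (first in, first out)
```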