Csc211 Lecture 03
Csc211 Lecture 03
Data Structures
Lecture 3
Dr. Iftikhar Azim Niaz
[email protected]
1
Last Lecture Summary I
System Development Life Cycle Phases
Ongoing Activities
Project Management, Feasibility, Documentation
Planning
Review, approve and prioritize project requests
Analysis
Preliminary Investigation, Detailed analysis
Design
Acquire Hardware and software, Develop details
Implementation
Develop programs, install and test new system
Operation, Support and Security
Maintenance Activities, System performance and security
2
Last Lecture Summary II
Program Development Life Cycle
Analyze requirements
Review requirements, develop IPO charts
Design solution
Design solution algorithm, Structured and OOP
Flowchart and Pseudo code
Validate design
Inspection and Desk check
Implement design
Program development tool, writing code
Test solution
Testing and Debugging
Document solution
3
What is a Programming
Language?
A programming language:
is a language
has strict grammar rules, symbols, and
special words
is used to construct a computer
program.
4
Programming Languages
First-Generation Languages: Machine
Languages (1s and 0s)
Second-Generation Languages: Assembly
Languages – are a little easier for humans to
program in
Third-Generation Languages: Programming
Comes of Age
Fourth-Generation Languages: Getting Away
from Procedure
Object-Oriented Programming: A Revolution in
the Making?
5
First Generation Computer
Programming Languages
Machine languages
1’s and 0’s represent instructions and procedures
(the Instruction Set)
Machine-dependent code (machine code)
Programmers have to know the structure of the
machine (architecture), addresses of memory
registers, etc.
Different for each hardware manufacturer (e.g., Intel differs
from Motorola)
6
Early Programming
Languages…
Memory locations were referenced by their address
This made the process of giving a computer instructions
to do things very cumbersome
E.g., storing the values 45 and 12 in memory locations
1652 and 2548 required instructions such as:
Get storage for two integers
Put 45 in location 1652
Put 12 in location 2548
Adding two numbers was even more cumbersome!
Add the contents of location 1652 to the
contents of location 2548 and store the
result into location 3000
7
Programming Language
Evolution
Early computers were programmed in machine
language
Machine language is also called binary code
To calculate wages = rate * hours in machine
language, the following sequence of instructions
might be needed:
address content meaning
000001 100100 load address content meaning
000010 010001 address 010001 000010 2
000011 100110 multiply to 010010 000110 6
000100 010010 address 010011 001100 12
000101 100010 store to
000110 010011 address
8
Second Generation
Assembly language
Still “low-level” (i.e., machine architecture
dependent)
But uses mnemonic command names
ADD(3,2), COMPARE(A,B), etc.
An “assembler” translates program source code into
the appropriate machine language (binary code)
adds overhead
9
First Generation vs. Second
Generation
Machine language Assembly
01001100 mov bx, offset value
11101001 mov ax, [bx]
10101010 add ax, 5
10001100 sub ax, 2
00001100 inc ax
10
Assembly Language
An instruction in assembly language is an easy-
to-remember form called a mnemonic
An assembler is a program that translates a
program in assembly language into machine
language
E.g., compare instructions in Assembly and
Machine languages:
11
Assembly Program Example
Using assembly language instructions, the equation
12
Third Generation
Largely Procedural languages (Imperative)
Not CPU dependent
High level
Command names are not about the machine any more
A=5
B=A+6
PRINT B
Command vocabulary and syntax rules
Compilers translate program source code into object
code (machine language binary code)
Interpreters execute one line at a time more overhead
13
Cf. 2nd & 3rd Generation
Assembly High level
mov bx, offset value x=2;
mov ax, [bx] if (x <= y)
add ax, 5 x = x + 1;
sub ax, 2 else
inc ax x = x – 1;
14
Third-Generation Languages
(3GL): Programming Comes of
Age
Still procedural but high level languages
Compilers and Interpreters
Spaghetti Code and the Great Software Crisis led to
Software Engineering
Structured Programming Languages (late 1960s)
Modular Programming Languages (1970s)
15
Higher Level Languages…
In high-level languages, symbolic names replace actual
memory addresses
I.e., we give mnemonic names to memory addresses in addition to
instructions!
The symbolic names for the memory locations where values are
stored are called variables
A variable is a name given by the programmer to refer to a
computer memory storage location
The word variable (just like variables in mathematics) is used
because the value stored in the variable (or memory location)
can change or vary
For each variable name that the programmer uses, the computer
keeps track of the actual memory address corresponding to that
variable
If memory addresses are analogous to the street addresses of
17
Programming Languages
Structural Programming Languages
Java, C, C++, Visual Basic, Pascal, Turbo
Pascal, Modula-2
Non-Structural Languages
Fortran
Scripting Languages
Bourne Shell, C Shell, Perl
Functional Programming Languages
Prolog, LISP, Hope
Query Languages
MSQL Lite, SQL, DBQ/AG
18
Fourth-Generation
Languages (4GL):
Getting Away with Procedures
Terminology
ral
Report Generators
edu
Query Languages
proc
no n
Natural Languages
19
Three Fundamental C Program
Stages
myprog.c myprog.obj myprog.exe
written
written in
in
written
written in
in written
written in
in machine
machine
CC machine
machine language
language
language
language
other
other code
code from
from
libraries,
libraries, etc.
etc.
(also
(also inin machine
machine
language)
language)
20
Processing a
program
To execute a program written in a high-
level language such C
Use an editor to create a source program (code) in
C
A source program (code) is a program written in a
high-level language
Use the compiler to check that the program obeys
the rules and translates the program in to an
equivalent machine language object program
An object program is the machine language (binary code)
version of the high-level language program that the
CPU can understand and execute 21
Processing a program
(graphic) CPU schedules (and
executes) programs in
main memory
other object
programs Loads executable
program into main
memory
22
Problem Solving Process (Programming
Life Cycle)
Step 1 - Analyze the problem
Outline the problem and its solution requirements
Design steps (algorithm) to solve the problem
Step 2 - Implement the algorithm
Implement the algorithm in a programming
language
Verify that the algorithm works
Step 3 - Maintenance
Maintenance requires using and modifying the
program if the problem domain changes
~75 % of
costs!
23
Analyze the Problem
In order to minimize maintenance costs, you
have to thoroughly understand what the
problem is about
Understand the problem requirements
Does the program require interaction with the user?
Does the program manipulate data?
Is there any output of the program?
If the problem is complex, divide the problem
into sub-problems.
Analyze each sub-problem as above
24
Problem Analysis: Coding-
Execution Cycle
25
Software Development
Activities
Editing
Compiling
Linking with precompiled files
Object files & Library modules
Loading and executing
Viewing the behavior of the program
26
Software Development
Cycle
Source Program
Compile
Library routines
Edit Link
Other object files
Think Load
Execute
27
Coding-Execution Cycle
Remove compilation and semantic (logic) errors
(debugging)
Link object code with system resources
(predefined subroutines like finding A-1)
Compiler only guarantees that the program
follows the rules of the language, but it does not
guarantee that it will run correctly !
So, test the results with sample data to see if the
program is calculating what you intend to
calculate before testing it on real data
If results are inconsistent, go back to debugging
28
IDEs
Integrated Development Environments or IDEs
Supports the entire software development cycle
E.g., MS Visual C++, Dev-C++, Code Warrior
Provides all the capabilities for developing software
Editor
Compiler
Linker
Loader
Debugger
Viewer
29
Maintenance Phase
Maintenance includes
Ongoing correction of newly discovered bugs
Revisions to meet changing user needs
Addition of new features
Maintenance is usually the longest phase, and
may be the primary source of revenue
Good documentation is vital for effective
maintenance
Also ~75 % of
costs!
30
Maintenance Phase
USE and MODIFY the program to
meet changing requirements
Correct errors that show up in using
it
Maintenance begins when your
program is put into use and
accounts for the majority of effort
on most programs
Accounts for more than 50% of
computer budgets and programmer
costs
31
Implementation methods
Compilation
Source language is translated into machine-executable
program changes
Disadvantage: Orders of magnitude slower than separate
Loader puts
Disk executable in memory.
..
..
Interpreted ..
Primary
Memory
Phase 1 Load code Load the source
..
..
..
Primary
Interpreter Memory Interpreter reads
Phase 2 source code line by line
and
translates it into
machine language on the
.. fly
..
.. 33
Procedural Programming…
Structured design – dividing a problem into smaller subproblems
The process of implementing a structured design is called structured
programming
Structured programming:
Each sub-problem is addressed by using three main control
overall problem
Procedural programming is combining structured programming with
modular programming
34
Modular, Structured
Programming
For example, solving Ax= b
1st step: entering A, b
2nd step: finding B=A-1
3rd step: multiple Bb
Each of the steps can be programmed as a
subroutine
The finding of the inverse in Step 2 may have already
been written and provided in the library
35
Sample Problem
A programmer needs an algorithm to
determine an employee’s weekly wages. How
would the calculations be done by hand???
In one week, an employee works 52 hours at
the hourly pay rate of $24.75. Assume a 40.0
hour normal work week and an overtime pay
rate factor of 1.5. What are the employee’s
wages?
Please come up with a formulaic AND a
procedural expression for your algorithm
36
One Employee’s Wages
In one week, an employee works 52
hours at the hourly pay rate of
$24.75. Assume a 40.0 hour normal
work week and an overtime pay
rate factor of 1.5
What are the employee’s wages?
40 x $ 24.75 = $ 990.00
12 x 1.5 x $ 24.75 = $ ___________
445.50
$ 1435.50
37
Weekly Wages, in General
If hours are more than 40.0, then
wages = (40.0 * payRate) + (hours - 40.0) *
1.5 *payRate
RECALL EXAMPLE
( 40 x $ 24.75 ) + ( 12 x 1.5 x $ 24.75 ) =
otherwise,
$1435.50
wages = hours * payRate
38
Algorithm to Determine an Employee’s
Weekly Wages
39
Implementation Phase:
Program
Translating your algorithm into a
programming language is called
CODING
Different languages use different
tools & methodologies to do the
translation
With C++, you use:
Documentation -- your written comments
Compiler -- translates your program into
machine language
Main Procedure -- may be called sub-algorithms
40
Implementation Phase:
Test
TESTING your program means
running (executing) your program
on the computer, to see if it
produces correct results
If it does not, then you must find
out what is wrong with your
program or algorithm and fix it
This is called debugging
41
What Can a Program Do?
A program can only instruct a computer to:
Read Input
Calculate
Store data
Write Output
Work in a sequential progression (Sequence)
Compare and branch (Selection)
Iterate or Loop (Repetition)
42
Programming in C
A C program is a collection of one or more
functions (or procedures)
Functions (in Computer Science) are very much like
functions in mathematics
A function is like a black box
There must be a function called main( ) in every
executable C program
Execution always begins with the first statement in the
function main( )
Any other functions in your program are sub-programs
and are not executed until they are called (either from
main() or from functions called by main())
43
Every C function has 2
parts
Header
int main ( ) {
Body
return 0;
}
44
Block (Compound
Statement)
a block is a sequence of zero or more statements
enclosed by a pair of curly braces { }
SYNTAX
{
Statement (optional)
.
.
.
}
45
A Multiplying
Function
46
Function Concept in
Math
Function definition
f ( x ) = 5 x - 3
Parameter of function
Name of function
47
The Function Header
In mathematics, the function header for our
function, f(x) = 5x – 3 is: x y = f(x)
y = f(x);
// Semi-colon added for C++ style 1
If you’re given JUST the function header,
how can you find out what the function 2
does?
Start plugging in values for the 3
argument/parameter x and see what
values for y result: 4
Make a table
Black Box Testing is when you don’t have 5
access to the function definition!
48
What is a header? Function
Signature
int main ( );
49
The Function Definition
The mathematical definition of the function y
= f(x) is f(x) = 5x – 3
The C programmatic definition would be:
int f(x)
{
int y = 5*x – 3;
return y;
}
Function definition include function
declaration and function body
50
The Structure of a main()
function
Structure of main()
51
The main() Function
Overall structure of a C program contains one
function named main()
Often called the driver function
All other functions are invoked from main()
Function header line: the first line of a function,
which contains
The type of data returned by the function (if any)
The name of the function
The type of data that must be passed into the function
when it is invoked (if any)
Arguments: the data passed into a function
Function body: the statements inside a function
(enclosed in braces) 52
Back to main()
Each statement inside the function must be
terminated with a semicolon
return: a keyword causing the appropriate
value to be returned from the function
return 0 in the main() function causes the
program to end
53
The main() function directs
all other functions
54
OUTLINE of a Program with Several
Functions:
When you are assigned a research paper, what’s the first thing you
do?
You brainstorm and come up with an outline of the paper
Exactly the same way, this is an outline of a C program
The modular approach, utilizing subroutines or functions, allows you
to outline your program in advance, before you sit there and code
all the detailed implementation
main() function
square() function
cube() function
55
A Program with Three
Functions
#include <iostream>
int main( )
{
cout << “The square of 3 is “
<< Square(3) << endl; // function call
cout << “The cube of 3 is “
<< Cube(3) << endl; // function call
return 0;
}
56
The Rest of the Program
int Square( int n ) {
return n * n;
}
int Cube( int n ) {
return n * n * n;
}
57
What is a Good Solution?
A solution is good if:
The total cost it incurs over all phases of its life
cycle is minimal
The cost of a solution includes:
Computer resources that the program consumes
Difficulties encountered by users
Consequences of a program that does not behave
correctly
Programs must be well structured and
documented
Efficiency is one aspect of a solution’s cost
58
Achieving a Modular Design
Abstraction
Separates the purpose of a module from its implementation
Specifications for each module are written before implementation
Functional abstraction
Separates the purpose of a function from its implementation
Data abstraction
Focuses of the operations of data, not on the implementation of the
operations
Abstract data type (ADT)
A collection of data and operations on the data
An ADT’s operations can be used without knowing how the
operations are implemented, if the operations’ specifications are
known
Data structure
A construct that can be defined within a programming language to
store a collection of data 59
Data
Variable
the name given to a collection of memory cells, designed
to store a particular data item
the value stored in that variable may change or vary as
the program executes
Constant
a data item with a name and a value that remain the
same during the execution of the program (e.g. 8, 10)
Literal
a constant whose name is the written representation of
its value (e.g. “8”, “10”)
60
Data type
Integers
whole numbers
Real
Numbers with a decimal point
Two choices in VB: single and double
Character
Boolean
True or False
61
Data Structures (also called data
aggregates)
Record
One set of related data
A time card; an invoice
File
A set of related records
Array
A collection of data that have same data type
String
An array of characters
Hello Daniel Chen
62
Problems
Problem: a task to be performed.
Best thought of as inputs and matching outputs.
Problem definition should include constraints on the
resources that may be consumed by any
acceptable solution.
But NO constraints on HOW the problem is solved
63
Resource (Space)
ALGORITHM
Resource (Time) 64
Examples – Factors to
Consider
Airline Reservations Trans-Fryslan Airlines
Attempt 1:
Declare as many variables as are the available seat
and assign a value to each variable.
Attempt 2:
Declare an array of length equal to the Maximum
seats available and then specify whether the
seat is occupied or unoccupied.
66
The Need for Data Structures
Goal: to organize data
Criteria: to facilitate efficient
storage of data
retrieval of data
manipulation of data
Complex computing tasks are unlike our
everyday experience
More complex applications demand more
calculations, searching and sorting
More powerful computers?
more complex applications
Data structures organize data
more efficient programs. 67
Organizing Data
The choice of data structure and algorithm can
make the difference between a program running
in a few seconds or many days.
The selection of the correct data structure is the
difference between success and failure.
Any organization and collection of records should
be
Searchable
Processable, and
modifiable in any order
If you are willing to pay enough in time delay
Example: Simple unordered array of records.
68
Efficiency
The cost of a solution is the amount of resources that
the solution consumes.
A solution is said to be efficient if it solves the problem
within its resource constraints.
The resources are:
Space
Time
Alternate definition: Better than known alternatives
(“relatively efficient”).
Space and time are typical constraints for programs.
This does not mean always strive for the most efficient
program. If the program operates well within resource
constraints, there is no benefit to making it faster or
smaller 69
Selecting a Data Structure
Select a data structure as follows:
70
Some Questions to Ask
Are all data inserted into the data structure at
the beginning, or are insertions interspersed
with other operations?
Can data be deleted?
If data can be deleted, a more complex
representation is typically required
Are all data processed in some well-defined
order, or is random access allowed/required?
These questions often help to narrow the
possibilities
71
Data Structure Philosophy
Each data structure has costs and benefits.
Rarely is one data structure better than another
in all situations.
A data structure requires:
space for each data item it stores,
time to perform each basic operation,
programming effort,
debugging effort,
maintenance effort.
72
Data Structure Philosophy (cont)
Each problem has constraints on available
space and time.
Only after a careful analysis of problem
characteristics can we know the best data
structure for the task.
Bank example:
Start account: a few minutes
Transactions: a few seconds
Close account: overnight
73
Abstract Data Types
Abstract Data Type (ADT): a definition for a
data type solely in terms of a set of values and
a set of operations on that data type.
Each ADT operation is defined by its inputs
and outputs.
Encapsulation: Hide implementation details.
74
Abstract Data Types (Cont.)
Def. a collection of related data
items
together with
an associated
e.g. whole set ofand
numbers (integers) operations
arithmetic operators
for addition, subtraction, multiplication and division.
75
Abstract Data Types (Cont.)
Def.Consists of
storage structures (aka data structures)
to store the data items
and
algorithms for the basic operations.
77
ADT and Data Structure
Data Type
ADT:
Data Items:
Type
Logical Form
Operations
78
Simple Data Types
Memory:
2-state devices « bits 0 and 1
Therefore:
simple
the mostdata types
basic (values
form aresequences
of data: atomic — can't be
of bits
subdivided) are ADTs.
Implementations have:
» Storage structures: memory locations
» Algorithms: system hardware/software to do basic
operations.
79
Boolean data
Data values: {false, true}
80
Character Data
Store numeric codes (ASCII, EBCDIC, Unicode)
1 byte for ASCII and EBCDIC,
2 bytes for Unicode
ASCII/EBCDIC
Unicode
,
Basic operation: comparison to determine if Equal, Less than Greater
than, etc. use their numeric codes (i.e. use ordinal value)
81
Integer Data
Nonegative (unsigned) integer:
Store its base-two representation in a fixed number w
of bits
(e.g., w = 16 or w = 32)
00000000010110002
88 =
Signed integer:
Store in a fixed number w of bits using one of the following
representations:
82
Sign-magnitude
representation
Save one bit (usually most significant) for
sign
(0 = +, 1 = – )
83
Two's complement
representation
Same as
For nonnegative n: sign mag.
Use ordinary base-two representation with leading (sign) bit 0
84
Summary
Generation of Programming Languages
Machine Language
Assembly Language
High Level Languages
Processing a Computer Program
Stages of compilation, Interpreter
Procedural and modular programming
Structure of a C Program
Data and Data structure
Abstract Data Type (ADT)
85