M116C 1 M116C 1 Lect02-Performance
Performance
Time vs Throughput
Vehicle     Time to San Diego*   Speed     Passengers   Throughput (pmph)
Ferrari     0.75 hours           160 mph   2            320
Greyhound   2 hours              65 mph    60           3900
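As a quick check of the throughput column, here is a minimal Python sketch that assumes throughput in passenger-miles per hour (pmph) is simply speed times passengers. The function name `throughput_pmph` is made up for illustration, and the Ferrari's passenger count of 2 is inferred from 320 pmph / 160 mph rather than read directly from the slide.

```python
def throughput_pmph(speed_mph, passengers):
    """Passenger-miles per hour: how many passengers move how far each hour."""
    return speed_mph * passengers


# (speed in mph, passengers) taken from the slide's table.
vehicles = {
    "Ferrari": (160, 2),     # 2 passengers is inferred: 320 pmph / 160 mph
    "Greyhound": (65, 60),
}

for name, (speed, passengers) in vehicles.items():
    print(name, throughput_pmph(speed, passengers), "pmph")
```

The Ferrari wins on time for one trip, while the Greyhound wins on throughput, which is the contrast the slide is drawing.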
Time vs Throughput
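% time program
... program's results ...
90.7u 12.9s 2:39 65%
%
90.7u = user CPU time (seconds), 12.9s = kernel (system) CPU time (seconds)
2:39 = wallclock (elapsed) time
65% = (user + kernel) / wallclock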
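The 65% figure in the `time` output can be reproduced from the other three fields. The sketch below assumes the usual csh-style meaning of the fields (user seconds, kernel seconds, elapsed minutes:seconds); `cpu_utilization` is a hypothetical helper name.

```python
def cpu_utilization(user_s, kernel_s, wallclock_s):
    """Fraction of elapsed (wallclock) time the program spent on the CPU."""
    return (user_s + kernel_s) / wallclock_s


# Values from the slide: 90.7u 12.9s 2:39
wallclock_s = 2 * 60 + 39          # 2:39 is 159 seconds
util = cpu_utilization(90.7, 12.9, wallclock_s)
print(f"CPU time: {90.7 + 12.9:.1f} s, utilization: {util:.0%}")
```

The remaining wallclock time was spent waiting (I/O, other processes), which is why CPU time and elapsed time have to be reported separately.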
Performance
Cycles
# cycles depends on:
  the architecture (i.e., how many cycles a given instruction type will take)
  the instruction makeup of the program being evaluated
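To make the two dependencies concrete, the following sketch totals cycles from a per-type cycle cost (the architecture) and a per-type instruction count (the program). All numbers, the instruction classes, and the name `total_cycles` are hypothetical and chosen only for illustration.

```python
def total_cycles(instruction_counts, cycles_per_type):
    """Sum over instruction types of (how many executed) x (cycles each takes)."""
    return sum(count * cycles_per_type[kind]
               for kind, count in instruction_counts.items())


# Hypothetical architecture: cycles needed by each instruction type.
cycles_per_type = {"alu": 1, "load": 2, "branch": 3}

# Hypothetical program: how many instructions of each type it executes.
program_mix = {"alu": 500, "load": 300, "branch": 200}

print(total_cycles(program_mix, cycles_per_type), "cycles")
```

Changing either dictionary changes the total: a different machine alters `cycles_per_type`, a different program alters `program_mix`.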
Definitions
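One of P&H's big pictures:
\[ \text{CPU Execution Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} \]
\[ \text{seconds} = \text{instructions} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{seconds}}{\text{cycle}} \]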
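A small worked example of the CPU execution time equation may help. The inputs below (2 billion instructions, a CPI of 1.5, a 0.5 ns cycle time, i.e. a 2 GHz clock) are hypothetical values, not figures from the lecture, and `cpu_execution_time` is an illustrative name.

```python
def cpu_execution_time(instruction_count, cpi, clock_cycle_time_s):
    """instructions x cycles/instruction x seconds/cycle = seconds."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical program and machine.
instruction_count = 2_000_000_000   # instructions executed
cpi = 1.5                           # average cycles per instruction
clock_cycle_time_s = 0.5e-9         # 0.5 ns per cycle (2 GHz clock)

seconds = cpu_execution_time(instruction_count, cpi, clock_cycle_time_s)
print(f"{seconds:.2f} s")
```

Note how the units cancel exactly as in the definition: instructions and cycles drop out, leaving seconds.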
\[ \text{CPU Execution Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} \]
Programmer
Compiler Writer
ISA Architect
Machine Architect
Hardware Designer
Materials Scientist
Physicist
Silicon Engineer
\[ \text{CPU Execution Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} \]
Same machine, different programs
Same program, different machines, but same ISA
Same program, different ISAs
Comparing Performance
MFLOPS?
             Computer A   Computer B
Program 1    1            10
Program 2    1000         100
Total Time   1001         110
Which is faster?
\[ \frac{\text{Performance}_B}{\text{Performance}_A} = \frac{\text{Execution Time}_A}{\text{Execution Time}_B} = \frac{1001}{110} = 9.1 \]
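The ratio above can be checked with a few lines of Python. This sketch assumes performance is the reciprocal of execution time, as the equation implies; `relative_performance` is a hypothetical helper name, and the inputs are the two total times from the slide.

```python
def relative_performance(time_a, time_b):
    """Performance_B / Performance_A, which equals time_a / time_b."""
    return time_a / time_b


total_time_a = 1001   # total time on computer A
total_time_b = 110    # total time on computer B

ratio = relative_performance(total_time_a, total_time_b)
print(f"B is {ratio:.1f} times faster than A on total time")
```

The answer depends on the workload: on total time B is faster, yet a comparison restricted to a single program could point the other way.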
Benchmarks
Full application:
SPEC (int and float)
Amdahl's Law
Suppose: an enhancement speeds up part A of a program by a factor of p and leaves part B unchanged
then:
improved time = (time on part A)/p + time on part B
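The formula is easy to explore numerically. In the sketch below, the split of 80 s in part A and 20 s in part B is a hypothetical example, and `improved_time` is an illustrative name; p is the factor by which part A is sped up.

```python
def improved_time(time_part_a, time_part_b, p):
    """Amdahl's Law: only part A benefits from the speedup factor p."""
    return time_part_a / p + time_part_b


# Hypothetical program: 80 s in part A, 20 s in part B.
time_a, time_b = 80.0, 20.0
original = time_a + time_b

for p in (2, 4, 10, 1000):
    new = improved_time(time_a, time_b, p)
    print(f"p = {p:>4}: improved time = {new:6.2f} s, "
          f"overall speedup = {original / new:.2f}x")
```

However large p becomes, the improved time never drops below the 20 s spent in part B, so the overall speedup in this example is capped at 5x.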
Improving Latency
Improving Bandwidth
Key Points