Advanced Computer Architecture Fundamentals of Computer Design

This document provides an outline for a lecture on advanced computer architecture. It discusses computer science being at a crossroads, the differences between computer architecture and instruction set architecture, trends in technologies like processors, memory, disks, and networks over the past 20 years. It shows how bandwidth has increased much more than latency. The document also covers trends in power for integrated circuits, dependability, measuring and reporting performance, and quantitative principles of computer design.

Advanced Computer Architecture

Fundamentals of Computer Design

Myung Hoon Sunwoo
School of Electrical and Computer Engineering
Ajou University

Outline

Computer Science at a Crossroads
Computer Architecture v. Instruction Set Arch.
Trends in Technology
Trends in Power in Integrated Circuits
Dependability
Measuring, Reporting, and Summarizing Performance
Quantitative Principles of Computer Design

Ajou Univ.

Multimedia
Communications

SOC Lab.

Crossroads: Uniprocessor Performance

[Figure: uniprocessor performance relative to the VAX-11/780 (logarithmic scale, 1 to 10000), 1978 to 2006. From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, October 2006.]

VAX: 25%/year, 1978 to 1986
RISC + x86: 52%/year, 1986 to 2002
RISC + x86: ??%/year, 2002 to present
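The annual rates quoted for this chart compound year over year. As a rough sketch, assuming each rate held constant across its whole period (the function name is hypothetical and not part of the slides), the cumulative gains can be estimated like this:

```python
def compound_growth(rate_per_year, years):
    """Cumulative improvement factor for a constant annual growth rate."""
    return (1.0 + rate_per_year) ** years

# Periods and rates as quoted on the slide.
vax_era = compound_growth(0.25, 1986 - 1978)
risc_era = compound_growth(0.52, 2002 - 1986)

print(f"1978-1986 at 25%/year: about {vax_era:.1f}x")
print(f"1986-2002 at 52%/year: about {risc_era:.0f}x")
print(f"combined: about {vax_era * risc_era:.0f}x")
```

The combined factor lands in the thousands, which is in line with the 1 to 10000 scale of the chart.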


Instruction Set Architecture: Critical Interface

software
instruction set
hardware

Properties of a good abstraction
  Lasts through many generations (portability)
  Used in many different ways (generality)
  Provides convenient functionality to higher levels
  Permits an efficient implementation at lower levels

Example: MIPS

[Figure: register file r0, r1, ..., r31, plus PC, lo, hi]

Programmable storage
  2^32 x bytes
  31 x 32-bit GPRs (R0=0)
  32 x 32-bit FP regs (paired DP)
  HI, LO, PC

Data types? Format? Addressing Modes? Operations?

Arithmetic logical
  Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU,
  AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI
  SLL, SRL, SRA, SLLV, SRLV, SRAV

Memory Access
  LB, LBU, LH, LHU, LW, LWL, LWR
  SB, SH, SW, SWL, SWR

Control
  J, JAL, JR, JALR
  BEq, BNE, BLEZ, BGTZ, BLTZ, BGEZ, BLTZAL, BGEZAL

32-bit instructions on word boundary

Instruction Set Architecture

... the attributes of a [computing] system as seen by the programmer,
i.e. the conceptual structure and functional behavior, as distinct from
the organization of the data flows and controls, the logic design, and
the physical implementation.

Amdahl, Blaauw, and Brooks, 1964

SOFTWARE
-- Organization of Programmable Storage
-- Data Types & Data Structures: Encodings & Representations
-- Instruction Formats
-- Instruction (or Operation Code) Set
-- Modes of Addressing and Accessing Data Items and Instructions
-- Exceptional Conditions

ISA vs. Computer Architecture

Old definition of computer architecture = instruction set design
  Other aspects of computer design called implementation
  Insinuates implementation is uninteresting or less challenging

Our view is computer architecture >> ISA
  Architect's job much more than instruction set design; technical hurdles today more challenging than those in instruction set design
  Since instruction set design not where action is, some conclude computer architecture (using old definition) is not where action is
  We disagree on conclusion
  Agree that ISA not where action is (ISA in CA:AQA 4/e appendix)

Comp. Arch. is an Integrated Approach

What really matters is the functioning of the complete system
  hardware, runtime system, compiler, operating system, and application
  In networking, this is called the End to End argument

Computer architecture is not just about transistors, individual instructions, or particular implementations
  E.g., Original RISC projects replaced complex instructions with a compiler + simple instructions


Moore's Law: 2X transistors / year

Cramming More Components onto Integrated Circuits
  Gordon Moore, Electronics, 1965

# of transistors / cost-effective integrated circuit doubles every N months (12 ≤ N ≤ 24)
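To get a feel for how much the choice of N matters, here is a small sketch, assuming a clean doubling every N months (the function name and the 10-year horizon are illustrative choices, not from the slides):

```python
def transistor_growth(years, doubling_months):
    """Growth factor after `years` if the count doubles every `doubling_months` months."""
    return 2.0 ** (12.0 * years / doubling_months)

# Compare the two ends of the 12 <= N <= 24 range, plus the midpoint.
for n in (12, 18, 24):
    print(f"N = {n} months: about {transistor_growth(10, n):,.0f}x in 10 years")
```

Under these assumptions, a decade gives roughly a thousandfold increase at N = 12 but only about 32X at N = 24.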

Tracking Technology Performance Trends

Drill down into 4 technologies:
  Disks
  Memory
  Network
  Processors

Compare ~1980 Archaic (Nostalgic) vs. ~2000 Modern (Newfangled)
  Performance Milestones in each technology

Compare for Bandwidth vs. Latency improvements in performance over time
  Bandwidth: number of events per unit time
    E.g., M bits / second over network, M bytes / second from disk
  Latency: elapsed time for a single event
    E.g., one-way network delay in microseconds, average disk access time in milliseconds

Disks: Archaic (Nostalgic) v. Modern (Newfangled)

                CDC Wren I, 1983         Seagate 373453, 2003         Improvement
RPM             3600                     15000                        (4X)
Capacity        0.03 GBytes              73.4 GBytes                  (2500X)
Tracks/Inch     800                      64000                        (80X)
Bits/Inch       9550                     533,000                      (60X)
Platters        Three 5.25-inch          Four 2.5-inch
                                         (in 3.5-inch form factor)
Bandwidth       0.6 MBytes/sec           86 MBytes/sec                (140X)
Latency         48.3 ms                  5.7 ms                       (8X)
Cache           none                     8 MBytes

Latency Lags Bandwidth (for last ~20 years)

[Figure: Performance Milestones. Relative BW Improvement (1 to 10000) plotted against Relative Latency Improvement (1 to 100), with a reference line where Latency improvement = Bandwidth improvement. Curve shown: Disk. Latency = simple operation w/o contention; BW = best-case.]

Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)

Memory: Archaic (Nostalgic) v. Modern (Newfangled)

                1980 DRAM                    2000 Double Data Rate Synchr.    Improvement
                (asynchronous)               (clocked) DRAM
Capacity        0.06 Mbits/chip              256.00 Mbits/chip                (4000X)
Transistors     64,000 xtors, 35 mm^2        256,000,000 xtors, 204 mm^2
Data bus        16-bit per module,           64-bit per DIMM,                 (4X)
                16 pins/chip                 66 pins/chip
Bandwidth       13 Mbytes/sec                1600 Mbytes/sec                  (120X)
Latency         225 ns                       52 ns                            (4X)
Block transfer  (no block transfer)          Block transfers (page mode)

Latency Lags Bandwidth (for last ~20 years)

[Figure: Performance Milestones. Relative BW Improvement (1 to 10000) plotted against Relative Latency Improvement (1 to 100), with a reference line where Latency improvement = Bandwidth improvement. Curves shown: Memory, Disk. Latency = simple operation w/o contention; BW = best-case.]

Memory Module: 16bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x)
Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)

LANs: Archaic (Nostalgic) v. Modern (Newfangled)

                    Ethernet 802.3       Ethernet 802.3ae           Improvement
Year of Standard    1978                 2003
Link speed          10 Mbits/s           10,000 Mbits/s             (1000X)
Latency             3000 µsec            190 µsec                   (15X)
Media               Shared media         Switched media
Cable               Coaxial cable        Category 5 copper wire

Coaxial Cable: copper core, insulator, braided outer conductor, plastic covering
Twisted Pair: copper, 1mm thick, twisted to avoid antenna effect
"Cat 5" is 4 twisted pairs in bundle

Latency Lags Bandwidth (for last ~20 years)

[Figure: Performance Milestones. Relative BW Improvement (1 to 10000) plotted against Relative Latency Improvement (1 to 100), with a reference line where Latency improvement = Bandwidth improvement. Curves shown: Network, Memory, Disk. Latency = simple operation w/o contention; BW = best-case.]

Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x, 1000x)
Memory Module: 16bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x)
Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)

CPUs: Archaic (Nostalgic) v. Modern (Newfangled)

                1982 Intel 80286             2001 Intel Pentium 4            Improvement
Clock rate      12.5 MHz                     1500 MHz                        (120X)
Peak rate       2 MIPS (peak)                4500 MIPS (peak)                (2250X)
Latency         320 ns                       15 ns                           (20X)
Transistors     134,000 xtors, 47 mm^2       42,000,000 xtors, 217 mm^2
Data bus        16-bit data bus, 68 pins     64-bit data bus, 423 pins

80286: Microcode interpreter, separate FPU chip, (no caches)
Pentium 4: 3-way superscalar, Dynamic translate to RISC, Superpipelined (22 stage), Out-of-Order execution, On-chip 8KB Data caches, 96KB Instr. Trace cache, 256KB L2 cache

Latency Lags Bandwidth (for last ~20 years)

[Figure: Performance Milestones. Relative BW Improvement (1 to 10000) plotted against Relative Latency Improvement (1 to 100), with a reference line where Latency improvement = Bandwidth improvement. Curves shown: Processor, Network, Memory, Disk. Annotation: CPU high, Memory low (Memory Wall).]

Processor: 286, 386, 486, Pentium, Pentium Pro, Pentium 4 (21x, 2250x)
Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x, 1000x)
Memory Module: 16bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x)
Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)
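The (latency, bandwidth) improvement pairs in the milestone legend can be approximately recomputed from the archaic and modern figures given on the preceding slides. The sketch below assumes the network latencies are in microseconds; the dictionary layout and function name are hypothetical.

```python
# (old_latency, new_latency, old_bandwidth, new_bandwidth), values from the slides.
MILESTONES = {
    "Disk":      (48.3, 5.7, 0.6, 86.0),           # ms, MBytes/sec
    "Memory":    (225.0, 52.0, 13.0, 1600.0),      # ns, Mbytes/sec
    "Network":   (3000.0, 190.0, 10.0, 10000.0),   # microseconds (assumed), Mbits/s
    "Processor": (320.0, 15.0, 2.0, 4500.0),       # ns, MIPS (peak)
}

def improvements(old_lat, new_lat, old_bw, new_bw):
    """Return (latency improvement, bandwidth improvement) as multiplicative factors."""
    return old_lat / new_lat, new_bw / old_bw

for name, values in MILESTONES.items():
    lat_x, bw_x = improvements(*values)
    print(f"{name:9s} latency {lat_x:6.1f}x   bandwidth {bw_x:7.1f}x")
```

In every row the bandwidth factor is far larger than the latency factor, which is the point of the chart.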


Define and quantify power (1 / 2)

For CMOS chips, traditional dominant energy consumption has been in switching transistors, called dynamic power

$$\text{Power}_{\text{dynamic}} = \frac{1}{2} \times \text{CapacitiveLoad} \times \text{Voltage}^2 \times \text{FrequencySwitched}$$

For mobile devices, energy better metric

$$\text{Energy}_{\text{dynamic}} = \text{CapacitiveLoad} \times \text{Voltage}^2$$

For a fixed task, slowing clock rate (frequency switched) reduces power, but not energy
Capacitive load a function of number of transistors connected to output and technology, which determines capacitance of wires and transistors
Dropping voltage helps both, so went from 5V to 1V
To save energy & dynamic power, most CPUs now turn off clock of inactive modules (e.g. Fl. Pt. Unit)

Example of quantifying power

Suppose 15% reduction in voltage results in a 15% reduction in frequency. What is impact on dynamic power?

$$\begin{aligned}
\text{Power}_{\text{dynamic}} &= \frac{1}{2} \times \text{CapacitiveLoad} \times \text{Voltage}^2 \times \text{FrequencySwitched} \\
&= \frac{1}{2} \times \text{CapacitiveLoad} \times (0.85 \times \text{Voltage})^2 \times (0.85 \times \text{FrequencySwitched}) \\
&= (0.85)^3 \times \text{OldPower}_{\text{dynamic}} \\
&\approx 0.6 \times \text{OldPower}_{\text{dynamic}}
\end{aligned}$$
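The same calculation can be checked numerically. This is a minimal sketch with normalized (unit) values for load, voltage, and frequency; the function names are hypothetical.

```python
def dynamic_power(capacitive_load, voltage, frequency_switched):
    """Dynamic power: 1/2 x CapacitiveLoad x Voltage^2 x FrequencySwitched."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency_switched

def dynamic_energy(capacitive_load, voltage):
    """Dynamic energy: CapacitiveLoad x Voltage^2 (independent of frequency)."""
    return capacitive_load * voltage ** 2

# Normalized baseline, then a 15% cut in both voltage and frequency.
old_power = dynamic_power(1.0, 1.0, 1.0)
new_power = dynamic_power(1.0, 0.85, 0.85)
power_ratio = new_power / old_power

print(f"new power / old power = {power_ratio:.3f}")
print(f"new energy / old energy = {dynamic_energy(1.0, 0.85) / dynamic_energy(1.0, 1.0):.4f}")
```

Because energy has no frequency term, lowering only the clock rate leaves the energy ratio at 1; only the voltage drop reduces it.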

Define and quantify power (2 / 2)

Because leakage current flows even when a transistor is off, now static power important too

$$\text{Power}_{\text{static}} = \text{Current}_{\text{static}} \times \text{Voltage}$$

Leakage current increases in processors with smaller transistor sizes
Increasing the number of transistors increases power even if they are turned off
In 2006, goal for leakage is 25% of total power consumption; high performance designs at 40%
Very low power systems even gate voltage to inactive modules to control loss due to leakage

Define and quantify dependability (1/3)

How to decide when a system is operating properly?
Infrastructure providers now offer Service Level Agreements (SLA) to guarantee that their networking or power service would be dependable
Systems alternate between 2 states of service with respect to an SLA:
  1. Service accomplishment, where the service is delivered as specified in SLA
  2. Service interruption, where the delivered service is different from the SLA
Failure = transition from state 1 to state 2
Restoration = transition from state 2 to state 1

Define and quantify dependability (2/3)

Module reliability = measure of continuous service accomplishment (or time to failure).
2 metrics
  1. Mean Time To Failure (MTTF) measures Reliability
  2. Failures In Time (FIT) = 1/MTTF, the rate of failures
     Traditionally reported as failures per billion hours of operation
Mean Time To Repair (MTTR) measures Service Interruption
  Mean Time Between Failures (MTBF) = MTTF + MTTR
Module availability measures service as alternate between the 2 states of accomplishment and interruption (number between 0 and 1, e.g. 0.9)
Module availability = MTTF / (MTTF + MTTR)
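A short sketch of the availability and MTBF formulas follows. The MTTF and MTTR inputs are made-up illustrative values, and the function names are hypothetical.

```python
def mtbf(mttf_hours, mttr_hours):
    """Mean Time Between Failures = MTTF + MTTR."""
    return mttf_hours + mttr_hours

def availability(mttf_hours, mttr_hours):
    """Module availability = MTTF / (MTTF + MTTR), a number between 0 and 1."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)

# Illustrative assumption: 1,000,000-hour MTTF and a 24-hour repair time.
example = availability(1_000_000, 24)
print(f"availability = {example:.6f}")
```

With a long MTTF and a short MTTR the result sits very close to 1, so small changes in repair time are easiest to see when the value is expressed as downtime per year.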

Example calculating reliability

If modules have exponentially distributed lifetimes (age of module does not affect probability of failure), overall failure rate is the sum of failure rates of the modules
Calculate FIT and MTTF for 10 disks (1M hour MTTF per disk), 1 disk controller (0.5M hour MTTF), and 1 power supply (0.2M hour MTTF):

$$\begin{aligned}
\text{FailureRate} &= 10 \times \frac{1}{1{,}000{,}000} + \frac{1}{500{,}000} + \frac{1}{200{,}000} \\
&= \frac{10 + 2 + 5}{1{,}000{,}000} \\
&= \frac{17}{1{,}000{,}000} \\
&= 17{,}000 \text{ FIT} \\
\text{MTTF} &= \frac{1{,}000{,}000{,}000}{17{,}000} \approx 59{,}000 \text{ hours}
\end{aligned}$$
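The worked example can be reproduced in a few lines. This sketch assumes, as the slide does, that failure rates of independent modules simply add; the names are hypothetical.

```python
HOURS_PER_BILLION = 1_000_000_000  # FIT is reported per billion hours

def system_failure_rate(mttfs_hours):
    """Sum of per-module failure rates (failures per hour)."""
    return sum(1.0 / mttf for mttf in mttfs_hours)

# 10 disks, 1 disk controller, 1 power supply (MTTF in hours, from the slide).
components = [1_000_000] * 10 + [500_000, 200_000]

rate = system_failure_rate(components)
fit = rate * HOURS_PER_BILLION
system_mttf = 1.0 / rate

print(f"FIT  = {fit:,.0f}")
print(f"MTTF = {system_mttf:,.0f} hours")
```

The power supply alone contributes 5 of the 17 failures per million hours, so it is the single module with the largest share.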


Definition: Performance

Performance is in units of things per sec
  bigger is better
If we are primarily concerned with response time

$$\text{performance}(x) = \frac{1}{\text{execution\_time}(x)}$$

"X is n times faster than Y" means

$$n = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{Execution\_time}(Y)}{\text{Execution\_time}(X)}$$

Performance: What to measure

Usually rely on benchmarks vs. real workloads
To increase predictability, collections of benchmark applications, called benchmark suites, are popular
SPECCPU: popular desktop benchmark suite
  CPU only, split between integer and floating point programs
  SPECint2000 has 12 integer, SPECfp2000 has 14 floating-point pgms
  SPECCPU2006 to be announced Spring 2006
  SPECSFS (NFS file server) and SPECWeb (WebServer) added as server benchmarks
Transaction Processing Performance Council (TPC) measures server performance and cost-performance for databases
  TPC-C Complex query for Online Transaction Processing
  TPC-H models ad hoc decision support
  TPC-W a transactional web benchmark
  TPC-App application server and web services benchmark

How Summarize Suite Performance (1/5)

Arithmetic average of execution time of all pgms?
  But they vary by 4X in speed, so some would be more important than others in arithmetic average
Could add a weight per program, but how pick weight?
  Different companies want different weights for their products
SPECRatio: Normalize execution times to reference computer, yielding a ratio proportional to performance:

$$\text{SPECRatio} = \frac{\text{time on reference computer}}{\text{time on computer being rated}}$$

How Summarize Suite Performance (2/5)

If program SPECRatio on Computer A is 1.25 times bigger than Computer B, then

$$1.25 = \frac{\text{SPECRatio}_A}{\text{SPECRatio}_B}
= \frac{\text{ExecutionTime}_{\text{reference}} / \text{ExecutionTime}_A}{\text{ExecutionTime}_{\text{reference}} / \text{ExecutionTime}_B}
= \frac{\text{ExecutionTime}_B}{\text{ExecutionTime}_A}
= \frac{\text{Performance}_A}{\text{Performance}_B}$$

Note that when comparing 2 computers as a ratio, execution times on the reference computer drop out, so choice of reference computer is irrelevant

How Summarize Suite Performance (3/5)

Since ratios, proper mean is geometric mean
(SPECRatio unitless, so arithmetic mean meaningless)

$$\text{GeometricMean} = \sqrt[n]{\prod_{i=1}^{n} \text{SPECRatio}_i}$$

1. Geometric mean of the ratios is the same as the ratio of the geometric means
2. Ratio of geometric means = Geometric mean of performance ratios
   => choice of reference computer is irrelevant!
These two points make geometric mean of ratios attractive to summarize performance
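Both properties can be checked numerically. In the sketch below, every execution time is invented for illustration (two candidate reference machines, two rated machines, three programs), and the helper names are hypothetical.

```python
import math

def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def spec_ratios(reference_times, rated_times):
    """SPECRatio per program: time on reference / time on rated computer."""
    return [ref / t for ref, t in zip(reference_times, rated_times)]

# Hypothetical execution times in seconds for three programs.
ref_1 = [1000.0, 2000.0, 1500.0]
ref_2 = [800.0, 2600.0, 900.0]
time_a = [10.0, 40.0, 25.0]
time_b = [20.0, 30.0, 50.0]

def gm_ratio_a_over_b(reference_times):
    """Ratio of the geometric means of A's and B's SPECRatios."""
    gm_a = geometric_mean(spec_ratios(reference_times, time_a))
    gm_b = geometric_mean(spec_ratios(reference_times, time_b))
    return gm_a / gm_b

with_ref_1 = gm_ratio_a_over_b(ref_1)
with_ref_2 = gm_ratio_a_over_b(ref_2)
# Geometric mean of the per-program performance ratios (time_B / time_A).
direct = geometric_mean([b / a for a, b in zip(time_a, time_b)])

print(with_ref_1, with_ref_2, direct)
```

All three printed values agree up to floating-point rounding, whichever reference machine is used.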

How Summarize Suite Performance (4/5)

Does a single mean well summarize performance of programs in benchmark suite?
Can decide if mean a good predictor by characterizing variability of distribution using standard deviation
Like geometric mean, geometric standard deviation is multiplicative rather than arithmetic
Can simply take the logarithm of SPECRatios, compute the standard mean and standard deviation, and then take the exponent to convert back:

$$\text{GeometricMean} = \exp\left(\frac{1}{n} \sum_{i=1}^{n} \ln(\text{SPECRatio}_i)\right)$$

$$\text{GeometricStDev} = \exp\left(\text{StDev}\left(\ln(\text{SPECRatio}_i)\right)\right)$$
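The log-then-exponentiate recipe translates directly into code. This sketch uses the sample standard deviation from Python's `statistics` module (an assumption about which StDev variant is intended), and the SPECRatio values are invented so that the answer is easy to check by hand.

```python
import math
import statistics

def geometric_mean(ratios):
    """exp of the arithmetic mean of the logarithms."""
    logs = [math.log(r) for r in ratios]
    return math.exp(sum(logs) / len(logs))

def geometric_stdev(ratios):
    """exp of the (sample) standard deviation of the logarithms."""
    logs = [math.log(r) for r in ratios]
    return math.exp(statistics.stdev(logs))

# Hypothetical SPECRatios, each a factor of 2 apart.
ratios = [1000.0, 2000.0, 4000.0]
gm = geometric_mean(ratios)
gsd = geometric_stdev(ratios)

print(f"geometric mean  = {gm:.1f}")
print(f"geometric stdev = {gsd:.2f}")
```

For these inputs the geometric mean is the middle value and the geometric standard deviation is the factor of 2 separating neighbours, which shows why it is read as a multiplier rather than an offset.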

How Summarize Suite Performance (5/5)

Standard deviation is more informative if know distribution has a standard form
  bell-shaped normal distribution, whose data are symmetric around mean
  lognormal distribution, where logarithms of data--not data itself--are normally distributed (symmetric) on a logarithmic scale
For a lognormal distribution, we expect that
  68% of samples fall in range [mean / gstdev, mean × gstdev]
  95% of samples fall in range [mean / gstdev^2, mean × gstdev^2]
Note: Excel provides functions EXP(), LN(), and STDEV() that make calculating geometric mean and multiplicative standard deviation easy
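As a quick illustration of the two ranges, the sketch below plugs in the geometric mean and geometric standard deviation quoted for the Itanium 2 example in these slides (2712 and 1.98). The function name is hypothetical.

```python
def lognormal_ranges(gmean, gstdev):
    """Return the expected 68% and 95% ranges for a lognormal distribution."""
    one_sigma = (gmean / gstdev, gmean * gstdev)
    two_sigma = (gmean / gstdev ** 2, gmean * gstdev ** 2)
    return one_sigma, two_sigma

one_sigma, two_sigma = lognormal_ranges(2712.0, 1.98)
print(f"68% range: {one_sigma[0]:.0f} to {one_sigma[1]:.0f}")
print(f"95% range: {two_sigma[0]:.0f} to {two_sigma[1]:.0f}")
```

The 68% bounds come out close to the 1372 and 5362 marked on the Itanium 2 chart; the small difference is presumably due to the standard deviation being rounded to 1.98.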

Example Standard Deviation (1/2)

GM and multiplicative StDev of SPECfp2000 for Itanium 2

[Figure: bar chart of SPECfpRatio (0 to 14000) for wupwise, swim, mgrid, applu, mesa, galgel, art, equake, facerec, ammp, lucas, fma3d, sixtrack, apsi, with horizontal lines at 1372, 2712, and 5362.]

GM = 2712
GSTDEV = 1.98

Example Standard Deviation (2/2)

GM and multiplicative StDev of SPECfp2000 for AMD Athlon

[Figure: bar chart of SPECfpRatio (0 to 14000) for wupwise, swim, mgrid, applu, mesa, galgel, art, equake, facerec, ammp, lucas, fma3d, sixtrack, apsi, with horizontal lines at 1494, 2086, and 2911.]

GM = 2086
GSTDEV = 1.40


1) Taking Advantage of Parallelism

Increasing throughput of server computer via multiple processors or multiple disks
Detailed HW design
  Carry lookahead adders use parallelism to speed up computing sums from linear to logarithmic in number of bits per operand
  Multiple memory banks searched in parallel in set-associative caches
Pipelining: overlap instruction execution to reduce the total time to complete an instruction sequence.
  Not every instruction depends on immediate predecessor => executing instructions completely/partially in parallel possible
  Classic 5-stage pipeline:
    1) Instruction Fetch (Ifetch),
    2) Register Read (Reg),
    3) Execute (ALU),
    4) Data Memory Access (Dmem),
    5) Register Write (Reg)
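To see how overlapping instructions reduces total time, here is a back-of-the-envelope sketch. It assumes an ideal pipeline with no hazards or stalls, one clock cycle per stage, and the same cycle time with and without pipelining; the function names are hypothetical.

```python
def pipelined_cycles(num_instructions, num_stages=5):
    """Ideal pipeline: fill the pipe once, then complete one instruction per cycle."""
    return num_stages + num_instructions - 1

def unpipelined_cycles(num_instructions, num_stages=5):
    """No overlap: each instruction occupies all stages before the next one starts."""
    return num_stages * num_instructions

n = 100
speedup = unpipelined_cycles(n) / pipelined_cycles(n)
print(f"{n} instructions: {unpipelined_cycles(n)} cycles unpipelined, "
      f"{pipelined_cycles(n)} cycles pipelined, speedup {speedup:.2f}x")
```

Under these idealized assumptions the speedup approaches the number of stages as the instruction count grows, but never reaches it.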

Pipelined Instruction Execution

[Figure: pipeline diagram. Horizontal axis: Time (clock cycles), Cycle 1 through Cycle 7. Vertical axis: Instr. Order. Each instruction passes through the stages Ifetch, Reg, ALU, DMem, Reg.]

Limits to pipelining

Hazards prevent next instruction from executing during its designated clock cycle
  Structural hazards: attempt to use the same hardware to do two different things at once
  Data hazards: Instruction depends on result of prior instruction still in the pipeline
  Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).

[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Each instruction passes through the stages Ifetch, Reg, ALU, DMem, Reg.]

2) The Principle of Locality

The Principle of Locality:
  Programs access a relatively small portion of the address space at any instant of time.
Two Different Types of Locality:
  Temporal Locality (Locality in Time): If an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
  Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
Last 30 years, HW relied on locality for memory perf.

Levels of the Memory Hierarchy

Upper levels are faster; lower levels are larger.

Level              Capacity            Access Time                  Cost
CPU Registers      100s Bytes          300-500 ps (0.3-0.5 ns)
L1 and L2 Cache    10s-100s K Bytes    ~1 ns - ~10 ns               $1000s / GByte
Main Memory        G Bytes             80ns - 200ns                 ~$100 / GByte
Disk               10s T Bytes         10 ms (10,000,000 ns)        ~$1 / GByte
Tape               infinite            sec-min                      ~$1 / GByte

Between levels           Xfer Unit          Staging              Size
Registers / L1 Cache     Instr. Operands    prog./compiler       1-8 bytes
L1 Cache / L2 Cache      Blocks             cache cntl           32-64 bytes
L2 Cache / Memory        Blocks             cache cntl           64-128 bytes
Memory / Disk            Pages              OS                   4K-8K bytes
Disk / Tape              Files              user/operator        Mbytes

3) Focus on the Common Case

Common sense guides computer design
  Since it's engineering, common sense is valuable
In making a design trade-off, favor the frequent case over the infrequent case
  E.g., Instruction fetch and decode unit used more frequently than multiplier, so optimize it 1st
  E.g., If database server has 50 disks / processor, storage dependability dominates system dependability, so optimize it 1st
Frequent case is often simpler and can be done faster than the infrequent case
  E.g., overflow is rare when adding 2 numbers, so improve performance by optimizing more common case of no overflow
  May slow down overflow, but overall performance improved by optimizing for the normal case
What is frequent case and how much performance improved by making case faster => Amdahl's Law

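4) Amdahl's Law

$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$

$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$

Best you could ever hope to do:

$$\text{Speedup}_{\text{maximum}} = \frac{1}{1 - \text{Fraction}_{\text{enhanced}}}$$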

Amdahl's Law example

New CPU 10X faster
I/O bound server, so 60% time waiting for I/O

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \frac{1}{0.64} = 1.56$$

Apparently, it's human nature to be attracted by 10X faster, vs. keeping in perspective it's just 1.6X faster
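The example can be reproduced with a small helper. This is a minimal sketch of the formula as stated on the slides; the function names are hypothetical.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only `fraction_enhanced` of the time benefits."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

def amdahl_limit(fraction_enhanced):
    """Upper bound on speedup as the enhancement becomes infinitely fast."""
    return 1.0 / (1.0 - fraction_enhanced)

# 60% of the time is I/O wait, so only 40% benefits from the 10X faster CPU.
overall = amdahl_speedup(0.4, 10.0)
limit = amdahl_limit(0.4)

print(f"overall speedup with 10X CPU: {overall:.2f}")
print(f"best possible speedup:        {limit:.2f}")
```

Even an infinitely fast CPU could not push this server past about 1.67X, because the I/O wait is untouched.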

5) Processor performance equation

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

Instructions / Program = inst count; Cycles / Instruction = CPI; Seconds / Cycle = cycle time

                Inst Count    CPI    Clock Rate
Program             X
Compiler            X         (X)
Inst. Set.          X          X
Organization                   X         X
Technology                               X
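A small sketch of the equation in use follows. The instruction count, CPI, and clock rate are made-up illustrative numbers, and the function name is hypothetical.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instructions x cycles per instruction x seconds per cycle."""
    return instruction_count * cpi / clock_rate_hz

# Illustrative assumption: 1 billion instructions, CPI of 2, 1 GHz clock.
baseline = cpu_time(1_000_000_000, 2.0, 1_000_000_000)

# Halving CPI or doubling the clock rate has the same effect on CPU time.
lower_cpi = cpu_time(1_000_000_000, 1.0, 1_000_000_000)
faster_clock = cpu_time(1_000_000_000, 2.0, 2_000_000_000)

print(f"baseline:     {baseline:.2f} s")
print(f"half the CPI: {lower_cpi:.2f} s")
print(f"2x clock:     {faster_clock:.2f} s")
```

Because the three factors multiply, an improvement in one only helps if it does not cost more in another, which is why the table lists what influences each term.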

What's a Clock Cycle?

[Figure: combinational logic feeding a latch or register]

Old days: 10 levels of gates
Today: determined by numerous time-of-flight issues + gate delays
  clock propagation, wire lengths, drivers
