Advanced Computer Architecture: 563 L02.1 Fall 2011
Computer Architecture >> instruction sets. Computer architecture skill sets are different:
- Quantitative approach to design (5 quantitative principles of design)
- Solid interfaces that really work
- Technology tracking and anticipation
Review (continued)

Other fields often borrow ideas from architecture.

Quantitative Principles of Design:
1. Take Advantage of Parallelism
2. Principle of Locality
3. Focus on the Common Case
4. Amdahl's Law
5. The Processor Performance Equation

Define, quantify, and summarize relative performance; define and quantify relative cost, dependability, and power.

Culture of anticipating and exploiting advances in technology; culture of well-defined interfaces that are carefully implemented and thoroughly checked.
Amdahl's Law

A program's execution time on a uniprocessor is t, and x% of the execution has to be sequential. If we deploy n processors (each the same as the original uniprocessor), what is the best execution time on this multiprocessor system?

Best case: the sequential fraction still takes t * (x/100), while the rest is perfectly parallelized, so t_best = t * (x/100) + t * (1 - x/100) / n.
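The question above can be sketched in a few lines; the t = 100 s, x = 10%, n = 10 numbers are invented for illustration:

```python
def best_time(t, x_pct, n):
    """Best execution time on n processors when x% of the
    uniprocessor run must remain sequential (Amdahl's Law)."""
    seq = t * x_pct / 100.0        # fraction that cannot be sped up
    par = t * (1 - x_pct / 100.0)  # fraction perfectly divided over n
    return seq + par / n

# t = 100 s, 10% sequential, 10 processors:
# 10 + 90/10 = 19 s, a 5.26x speedup rather than 10x
print(best_time(100, 10, 10))  # 19.0
```

Note how the sequential 10% caps the speedup at 10x no matter how large n grows.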
Metrics beyond raw speed:
- Cost
- Execution time
- Reliability: resiliency to electrical noise and part failure; robustness to bad software and operator error
- Maintainability
- Compatibility
Cost of Processor

- Design cost (non-recurring engineering costs, NRE): dominated by engineer-years (~$200K per engineer-year)
- Cost of die: die area, die yield (maturity of manufacturing process, redundancy features), cost/size of wafers; die cost ~= f(die area^4) with no redundancy
- Cost of packaging: number of pins (signal + power/ground pins), power dissipation
- Cost of testing: built-in test features? logical complexity of design, choice of circuits (minimum clock rates, leakage currents, I/O drivers)

System-level costs: power supply and cooling, support chipset, off-chip SRAM/DRAM/ROM, off-chip peripherals.
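A tiny illustration of the slide's rule of thumb that die cost grows roughly as the fourth power of die area (with no redundancy); the function and the 2x area ratio are hypothetical, just to show the sensitivity:

```python
def relative_die_cost(area_ratio):
    """Relative cost of a die that is area_ratio times larger,
    under the slide's rough cost ~ area^4 rule of thumb."""
    return area_ratio ** 4

# Doubling die area => roughly 16x the cost under this rule
print(relative_die_cost(2.0))  # 16.0
```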
What is Performance?
- Latency (response time)
- Bandwidth (or throughput)

Definition: Performance

performance(X) = 1 / execution_time(X)

"X is n times faster than Y" means n = performance(X) / performance(Y) = execution_time(Y) / execution_time(X).
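A minimal sketch of the definition; the 10 s and 15 s execution times are made up:

```python
def performance(exec_time):
    """performance(X) = 1 / execution_time(X)"""
    return 1.0 / exec_time

time_x, time_y = 10.0, 15.0     # seconds (illustrative)
n = time_y / time_x             # = performance(X) / performance(Y)
print(n)  # 1.5 -> "X is 1.5 times faster than Y"
```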
Performance Guarantees
Types of Benchmark

- Synthetic benchmarks: fake programs invented to try to match the profile and behavior of real applications, e.g., Dhrystone, Whetstone
- Toy programs: 100-line programs from beginning programming assignments, e.g., N-queens, quicksort, Towers of Hanoi
- Kernels: small, key pieces of real applications, e.g., matrix multiply, FFT, sorting, Livermore Loops, Linpack
- Simplified applications: extract the main computational skeleton of a real application to simplify porting, e.g., NAS parallel benchmarks, TPC
- Real applications: things people actually use their computers for, e.g., car crash simulations, relational databases, Photoshop, Quake
Usually rely on benchmarks vs. real workloads. To increase predictability, collections of benchmark applications (benchmark suites) are popular.

SPEC CPU: popular desktop benchmark suite
- CPU only, split between integer and floating-point programs
- SPECint2000 has 12 integer programs, SPECfp2000 has 14 floating-point programs
- SPEC CPU2006 to be announced Spring 2006
- SPECSFS (NFS file server) and SPECWeb (web server) added as server benchmarks

Transaction Processing Council measures server performance and cost-performance for databases:
- TPC-C: complex queries for online transaction processing
- TPC-H: models ad hoc decision support
- TPC-W: a transactional web benchmark
- TPC-App: application server and web services benchmark
Summarizing Performance

            Rate (Task 1)   Rate (Task 2)
System A        10              20
System B        20              10

- Average throughput (arithmetic mean of rates): A = 15, B = 15 -- a tie.
- Throughput relative to B: A = (10/20 + 20/10)/2 = 1.25, B = 1.0 -- A looks 25% faster.
- Throughput relative to A: A = 1.0, B = (20/10 + 10/20)/2 = 1.25 -- B looks 25% faster.

The summary you get depends on which machine you normalize to.
Summarizing Performance over a Set of Benchmark Programs

Arithmetic mean of execution times t_i (in seconds): (1/n) * sum_i t_i

Harmonic mean of execution rates r_i: n / [sum_i (1/r_i)]

Both are equivalent to a workload where each program is run the same number of times. Weighting factors can be added to model other workload distributions.
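The two means above can be sketched directly; the per-program times are invented. The harmonic mean of the rates is exactly the reciprocal of the arithmetic mean of the times, which is the "same equivalent workload" claim:

```python
def arithmetic_mean(times):
    """Arithmetic mean of execution times t_i (seconds)."""
    return sum(times) / len(times)

def harmonic_mean(rates):
    """Harmonic mean of execution rates r_i: n / sum(1/r_i)."""
    return len(rates) / sum(1.0 / r for r in rates)

times = [2.0, 4.0, 8.0]           # illustrative per-program times
rates = [1.0 / t for t in times]  # corresponding rates

print(arithmetic_mean(times))     # mean time in seconds
print(harmonic_mean(rates))       # mean rate = 1 / (mean time)
```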
Normalized execution time (as in SPECratio): ratio = t_Ref / t_A, the reference machine's execution time divided by machine A's time.
Twelve ways to fool the masses (benchmarking abuses):
- Only report the pieces of the workload that work well on your design; ignore the others
- Use unrealistic data set sizes for the application (too big or too small)
- Report throughput numbers for a latency benchmark
- Report latency numbers for a throughput benchmark
- Report performance on a kernel and claim it represents an entire application
- Use 16-bit fixed-point arithmetic (because it's fastest on your system) even though the application requires 64-bit floating-point arithmetic
- Use a less efficient algorithm on the competing machine
- Report speedup for an inefficient algorithm (bubblesort)
- Compare hand-optimized assembly code with unoptimized C code
- Compare your design using next year's technology against a competitor's year-old design (1% performance improvement per week)
- Ignore the relative cost of the systems being compared
- Report averages and not individual results
- Report speedup over an unspecified base system, not absolute times
- Report efficiency, not absolute times
- Report MFLOPS, not absolute times (use an inefficient algorithm)

[David Bailey, "Twelve ways to fool the masses when giving performance results for parallel supercomputers"]
Variance in performance for parallel architectures is going to be much worse than for serial processors
SPECcpu means only really work across very similar machine configurations
Packaging costs: power has to be brought in and distributed around the chip -- hundreds of pins and multiple interconnect layers just for power.

Why power matters:
- Power supply rail design
- Chip and system cooling costs
- Noise immunity and system reliability
- Battery life (in portable systems)
- Environmental concerns: office equipment accounted for 5% of total US commercial energy usage in 1993; Energy Star compliant systems
Peak power:
- determines power/ground wiring design
- sets packaging limits
- impacts signal noise margin and reliability analysis

Energy: Joules = Watts * seconds. A lower energy number means less power is needed to perform the same computation at the same frequency.
[Figure: power vs. time curves for systems A and B -- integrate each power curve to get energy; Peak A is higher than Peak B.]

System A has higher peak power, but lower total energy. System B has lower peak power, but higher total energy.
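The figure's point can be reproduced numerically; the sampled power curves below are invented (A: 50 W peak for 2 s, B: 30 W peak for 4 s), integrated with the trapezoidal rule:

```python
def energy(power_samples, dt):
    """Joules = integral of watts over time (trapezoidal rule)."""
    total = 0.0
    for p0, p1 in zip(power_samples, power_samples[1:]):
        total += 0.5 * (p0 + p1) * dt
    return total

power_a = [50.0, 50.0, 50.0]              # 2 s at a 50 W peak
power_b = [30.0, 30.0, 30.0, 30.0, 30.0]  # 4 s at a 30 W peak

print(energy(power_a, 1.0))  # 100.0 J: higher peak, lower energy
print(energy(power_b, 1.0))  # 120.0 J: lower peak, higher energy
```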
A "typical" RISC ISA:
- 32-bit fixed-format instructions (3 formats)
- 32 32-bit GPRs (R0 contains zero)
- 3-address, reg-reg arithmetic instructions
- Single addressing mode for load/store: base + displacement (no indirection)

See: SPARC, MIPS, HP PA-RISC, DEC Alpha, IBM PowerPC, CDC 6600, CDC 7600, Cray-1, Cray-2, Cray-3
Example: MIPS instruction formats

Register-Register:
  31-26: Op | 25-21: Rs1 | 20-16: Rs2 | 15-11: Rd | 10-6: (unused) | 5-0: Opx

Register-Immediate:
  31-26: Op | 25-21: Rs1 | 20-16: Rd | 15-0: immediate

Branch:
  31-26: Op | 25-21: Rs1 | 20-16: Rs2 | 15-0: immediate

Jump / Call:
  31-26: Op | 25-0: target
[Figure: 5-stage MIPS pipeline datapath -- Instruction Fetch, Instruction Decode/Register Fetch, Execute, Memory Access, Write Back -- with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB. Register transfers annotated on the figure: IF: IR <= mem[PC]; PC <= PC + 4. MEM: WB <= rslt. WB: Reg[IRrd] <= WB.]
Multicycle state machine (fill in the blanks):

Ifetch -> opFetch-DCD ->
  br:  if bop(A,B) then PC <= __
  jmp: PC <= __
  RR:  r <= A op_IRop B;   WB <= r;       Reg[IRrd] <= WB
  RI:  r <= A op_IRop __;  WB <= r;       Reg[IRrd] <= WB
  LD:  r <= A __ __;       WB <= Mem[r];  Reg[IRrd] <= WB
The same state machine, completed:

Ifetch -> opFetch-DCD ->
  br:  if bop(A,b) then PC <= PC + IRim
  jmp: PC <= IRjaddr
  RR:  r <= A op_IRop B;     WB <= r;       Reg[IRrd] <= WB
  RI:  r <= A op_IRop IRim;  WB <= r;       Reg[IRrd] <= WB
  LD:  r <= A + IRim;        WB <= Mem[r];  Reg[IRrd] <= WB
[Figure: the same 5-stage pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB), repeated without the register-transfer annotations.]
Visualizing Pipelining
Figure A.2, Page A-8
[Figure A.2: four instructions, in program order, overlapped in the pipeline -- each proceeds Ifetch, Reg, ALU, DMem, Reg, advancing one stage per clock cycle.]
Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle.
- Structural hazards: HW cannot support this combination of instructions (a single person to fold and put clothes away)
- Data hazards: an instruction depends on the result of a prior instruction still in the pipeline (missing sock)
- Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)
[Figure: a single memory port shared by instruction fetch and data access -- one instruction's DMem access collides with a later instruction's Ifetch in the same cycle, a structural hazard.]
[Figure: the structural hazard resolved by stalling -- a bubble is inserted so that only one instruction accesses memory in each cycle.]
Example: Machine A has dual-ported memory (Harvard architecture). Machine B has single-ported memory, but its pipelined implementation has a 1.05 times faster clock rate. Ideal CPI = 1 for both, and loads are 40% of instructions executed. On B each load stalls one cycle, so CPI_B = 1 + 0.4 = 1.4, and speedup = (1.4 / 1.05) / 1 = 1.33: Machine A is 1.33 times faster.
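The claim can be checked with the CPI x cycle-time product (clock periods normalized to Machine A):

```python
# Dual-ported Machine A vs single-ported Machine B with a 1.05x faster
# clock; on B, loads (40% of instructions) stall one cycle each.
ideal_cpi = 1.0
load_frac = 0.40

cpi_a = ideal_cpi                  # no structural hazard on A
cpi_b = ideal_cpi + load_frac * 1  # one-cycle stall per load on B
cycle_a = 1.0                      # normalized clock period
cycle_b = 1.0 / 1.05               # B's clock is 1.05x faster

speedup = (cpi_b * cycle_b) / (cpi_a * cycle_a)
print(round(speedup, 2))  # 1.33 -> Machine A is 1.33 times faster
```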
Data Hazard on R1
Figure A.6, Page A-17
[Figure A.6: an instruction writes r1, followed by dependent instructions (..., or r8,r1,r9; xor r10,r1,r11) -- arrows show each read of r1 relative to the register write in WB.]

Among these 4 arrows, which are hazards and which are not?
Read After Write (RAW): Instr_J tries to read an operand before Instr_I writes it.
  I: add r1,r2,r3
  J: sub r4,r1,r3
Caused by a "dependence" (in compiler nomenclature). This hazard results from an actual need for communication.
Write After Read (WAR): Instr_J writes an operand before Instr_I reads it.
Called an "anti-dependence" by compiler writers; it results from reuse of the name r1. Can't happen in the MIPS 5-stage pipeline because:
- all instructions take 5 stages,
- reads are always in stage 2, and
- writes are always in stage 5
Write After Write (WAW): Instr_J writes an operand before Instr_I writes it.
  I: sub r1,r4,r3
  J: add r1,r2,r3
  K: mul r6,r1,r7
Called an "output dependence" by compiler writers; this also results from reuse of the name r1. Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages and writes are always in stage 5, so writes occur in program order.
[Figure: forwarding (bypassing) -- results are fed from the EX/MEM and MEM/WB pipeline registers back to the ALU inputs of the dependent instructions (..., xor r10,r1,r11), avoiding the stalls.]
[Figure: the hardware change for forwarding -- muxes added at the ALU inputs select among the register file outputs and the EX/MEM and MEM/WR pipeline registers.]
[Figure: forwarding also covers memory operands -- the dependent sequence here (ending xor r10,r9,r11) gets its values in time through the bypass paths.]
[Figure: a load followed by a dependent instruction -- the loaded value is not available until after MEM, so forwarding alone cannot supply it to the instruction immediately after the load.]
[Figures: the load-use hazard forces a one-cycle bubble despite forwarding; compiler scheduling reorders the code (LW LW LW ADD LW SW SUB SW) so that loads are not immediately followed by the instructions that use them.]
If CPI = 1, 30% of instructions are branches, and each stalls 3 cycles, the new CPI = 1 + 0.3 * 3 = 1.9! Two-part solution:
- determine branch taken or not sooner, AND
- compute the taken-branch address earlier

MIPS solution:
- move the zero test to the ID/RF stage (use xor)
- add an adder to calculate the new PC in the ID/RF stage
- 1 clock cycle branch penalty instead of 3
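The CPI arithmetic above, written out (base CPI and branch statistics come from the slide):

```python
# Base CPI 1, 30% branches: a 3-cycle stall gives CPI 1.9;
# resolving branches in ID cuts the penalty to 1 cycle.
base_cpi, branch_frac = 1.0, 0.30

print(round(base_cpi + branch_frac * 3, 2))  # 1.9 with a 3-cycle stall
print(round(base_cpi + branch_frac * 1, 2))  # 1.3 with a 1-cycle penalty
```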
[Figure: revised pipelined datapath -- the Zero? test and a second adder for the branch target are moved into the ID/RF stage, so branches resolve one cycle after fetch.]
Predict not taken: execute successor instructions in sequence, and squash the instructions in the pipeline if the branch is actually taken (taking advantage of late pipeline state update).
- 47% of MIPS branches are not taken on average; PC+4 is already calculated, so use it to fetch the next instruction
- 53% of MIPS branches are taken on average, but the branch target address hasn't been calculated yet in MIPS
  - MIPS still incurs a 1-cycle branch penalty
  - other machines: branch target known before outcome
Delayed branch: define the branch to take place AFTER a following instruction:
  branch instruction
  sequential successor_1
  sequential successor_2
  ...
  sequential successor_n
  branch target if taken

A 1-slot delay allows a proper decision and branch target address in the 5-stage pipeline; MIPS uses this.
A is the best choice: it fills the delay slot and reduces instruction count (IC). In B, the sub instruction may need to be copied, increasing IC.
Delayed Branch
- Compilers fill about 60% of branch delay slots
- About 80% of instructions executed in branch delay slots are useful in computation
- So about 50% (60% x 80%) of slots are usefully filled

Downside: as processors go to deeper pipelines and multiple issue, the branch delay grows and needs more than one delay slot. Delayed branching has lost popularity compared to more expensive but more flexible dynamic approaches; growth in available transistors has made dynamic approaches relatively cheaper.
Assume 4% unconditional branches, 6% conditional branches untaken, 10% conditional branches taken. Branch penalties (cycles) and resulting CPI:

Scheduling scheme     U   C-UT   C-T   CPI
Stall pipeline        2    3      3    1.56
Predict taken         2    3      2    1.46
Predict not taken     2    0      3    1.38
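The CPI column follows directly from the branch frequencies and penalties; recomputing it:

```python
# Branch frequencies and per-class penalties from the slide's table.
freqs = {"U": 0.04, "C-UT": 0.06, "C-T": 0.10}
penalties = {
    "Stall pipeline":    {"U": 2, "C-UT": 3, "C-T": 3},
    "Predict taken":     {"U": 2, "C-UT": 3, "C-T": 2},
    "Predict not taken": {"U": 2, "C-UT": 0, "C-T": 3},
}

for scheme, pen in penalties.items():
    cpi = 1.0 + sum(freqs[c] * pen[c] for c in freqs)
    print(f"{scheme}: CPI = {cpi:.2f}")
# Stall pipeline: 1.56, Predict taken: 1.46, Predict not taken: 1.38
```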
Example: a sound card interrupts when it needs more audio output samples (an audible click happens if it is left waiting).

Problem: the exception or interrupt must appear precisely between two instructions (I_i and I_i+1):
- the effect of all instructions up to and including I_i is totally complete
- no effect of any instruction after I_i can have taken place

The interrupt (exception) handler either aborts the program or restarts at instruction I_i+1.
Key observation: architected state only changes in the memory-access and register-write stages.
Summary:
- Difficult to compare widely differing machines on a benchmark suite
- Control via state machines and microprogramming
- Pipelining: just overlap tasks; easy if tasks are independent
- Speedup <= pipeline depth; if ideal CPI is 1, then:

  Speedup = (Pipeline depth / (1 + Pipeline stall CPI)) * (Cycle Time_unpipelined / Cycle Time_pipelined)

- Hazards limit performance:
  - Structural: need more HW resources
  - Data (RAW, WAR, WAW): need forwarding, compiler scheduling
  - Control: delayed branch, prediction
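The speedup equation can be sketched with illustrative numbers (a 5-stage pipeline, 0.4 stall cycles per instruction, and equal unpipelined/pipelined cycle times are assumptions for the example):

```python
def pipeline_speedup(depth, stall_cpi, t_unpiped, t_piped):
    """Speedup = (depth / (1 + stall CPI)) * (T_unpipelined / T_pipelined)."""
    return (depth / (1.0 + stall_cpi)) * (t_unpiped / t_piped)

# 5 stages, 0.4 stall cycles per instruction, equal cycle times:
print(round(pipeline_speedup(5, 0.4, 1.0, 1.0), 2))  # 3.57, not the ideal 5
```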
Announcements: please check the Sakai page. Move class from 10/17 to 10/16?