Large and Fast: Exploiting Memory Hierarchy

This document describes the memory hierarchy and the caching techniques used to improve memory performance. Memory is organized into a hierarchy in which smaller, faster levels close to the CPU cache data from larger, slower levels farther away. Caches exploit temporal and spatial locality by keeping recently accessed data, and data near it, close to the processor, and they use tags and valid bits to track which block is stored in each location. Write policies such as write-through and write-back balance consistency against performance, and block size, replacement policy, and write-miss handling all affect cache performance.


Chapter 5
Large and Fast: Exploiting Memory Hierarchy
5.1 Introduction
Programmers want unlimited amounts of memory with low latency
Fast memory technology is more expensive per bit than slower memory
Solution: organize the memory system into a hierarchy
Entire addressable memory space available in the largest, slowest memory
Incrementally smaller and faster memories, each containing a subset of the memory below it, proceed in steps up toward the processor
Temporal and spatial locality ensure that nearly all references can be found in the smaller memories
Gives the illusion of a large, fast memory being presented to the processor

Memory Technology
Static RAM (SRAM)
0.5ns – 2.5ns, $2000 – $5000 per GB
Dynamic RAM (DRAM)
50ns – 70ns, $20 – $75 per GB
Magnetic disk
5ms – 20ms, $0.20 – $2 per GB
Ideal memory
Access time of SRAM
Capacity and cost/GB of disk

Principle of Locality
Programs access a small proportion of
their address space at any time
Temporal locality
Items accessed recently are likely to be
accessed again soon
e.g., instructions in a loop, induction variables
Spatial locality
Items near those accessed recently are likely
to be accessed soon
e.g., sequential instruction access, array data (see the sketch below)
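A short C sketch of both kinds of locality (illustrative only): the accumulator is touched on every iteration (temporal), while the array is walked sequentially, so consecutive elements fall in the same cache block (spatial).

```c
#include <stdio.h>

int main(void) {
    int a[1024];
    int sum = 0;                    /* reused every iteration: temporal locality */

    for (int i = 0; i < 1024; i++)  /* sequential accesses: spatial locality */
        a[i] = i;

    for (int i = 0; i < 1024; i++)
        sum += a[i];                /* a[i] and a[i+1] share cache blocks */

    printf("sum = %d\n", sum);
    return 0;
}
```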
Taking Advantage of Locality
Memory hierarchy
Store everything on disk
Copy recently accessed (and nearby)
items from disk to smaller DRAM memory
Main memory
Copy more recently accessed (and
nearby) items from DRAM to smaller
SRAM memory
Cache memory attached to CPU

Memory Hierarchy Levels
Block (aka line): unit of copying
May be multiple words
If accessed data is present in
upper level
Hit: access satisfied by upper level
Hit ratio: hits/accesses
If accessed data is absent
Miss: block copied from lower level
Time taken: miss penalty
Miss ratio: misses/accesses = 1 − hit ratio
Then accessed data supplied from
upper level

5.2 The Basics of Caches
Cache Memory
Cache memory
The level of the memory hierarchy closest to
the CPU
Given accesses X1, …, Xn−1, Xn

How do we know if
the data is present?
Where do we look?

Direct Mapped Cache
Location determined by address
Direct mapped: only one choice
(Block address) modulo (#Blocks in cache)

#Blocks is a
power of 2
Use low-order
address bits
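Because #Blocks is a power of 2, the modulo reduces to masking the low-order bits of the block address. A minimal sketch (constants chosen to match the 8-block example that follows):

```c
#include <stdio.h>
#include <stdint.h>

#define NUM_BLOCKS 8u   /* must be a power of 2 */

/* Direct mapped: each block address has exactly one possible cache index. */
static inline uint32_t cache_index(uint32_t block_addr) {
    /* block_addr % NUM_BLOCKS, computed by masking the low-order bits */
    return block_addr & (NUM_BLOCKS - 1u);
}

int main(void) {
    printf("%u\n", cache_index(22));  /* 22 % 8 = 6, i.e. binary index 110 */
    return 0;
}
```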

Tags and Valid Bits
How do we know which particular block is
stored in a cache location?
Store block address as well as the data
Actually, only need the high-order bits
Called the tag
What if there is no data in a location?
Valid bit: 1 = present, 0 = not present
Initially 0
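In hardware the tag and valid bit are extra bits stored alongside each data block; a software model of one cache entry might look like this (field names are ours):

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;  /* 1 = entry holds real data; all entries start at 0 */
    uint32_t tag;    /* high-order address bits of the block stored here */
    uint32_t data;   /* the cached word (one word per block in this example) */
} cache_entry;
```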

Cache Example
8 blocks, 1 word/block, direct mapped
Initial state

Index V Tag Data


000 N
001 N
010 N
011 N
100 N
101 N
110 N
111 N

Cache Example
Word addr Binary addr Hit/miss Cache block
22 10 110 Miss 110

Index V Tag Data


000 N
001 N
010 N
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N

Cache Example
Word addr Binary addr Hit/miss Cache block
26 11 010 Miss 010

Index V Tag Data


000 N
001 N
010 Y 11 Mem[11010]
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N

Cache Example
Word addr Binary addr Hit/miss Cache block
22 10 110 Hit 110
26 11 010 Hit 010

Index V Tag Data


000 N
001 N
010 Y 11 Mem[11010]
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N

Cache Example
Word addr Binary addr Hit/miss Cache block
16 10 000 Miss 000
3 00 011 Miss 011
16 10 000 Hit 000

Index V Tag Data


000 Y 10 Mem[10000]
001 N
010 Y 11 Mem[11010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N

Cache Example
Word addr Binary addr Hit/miss Cache block
18 10 010 Miss 010

Index V Tag Data


000 Y 10 Mem[10000]
001 N
010 Y 10 Mem[10010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N
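The whole trace above (22, 26, 22, 26, 16, 3, 16, 18) can be replayed in a few lines of C. This illustrative simulator reproduces every hit and miss in the tables, including the replacement of tag 11 by tag 10 at index 010:

```c
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_BLOCKS 8   /* 8 blocks, 1 word/block, direct mapped */

typedef struct { bool valid; uint32_t tag; uint32_t addr; } line;

int main(void) {
    line cache[NUM_BLOCKS] = {0};              /* all valid bits start at 0 */
    uint32_t trace[] = {22, 26, 22, 26, 16, 3, 16, 18};
    for (int i = 0; i < 8; i++) {
        uint32_t a = trace[i];
        uint32_t index = a % NUM_BLOCKS;       /* low-order 3 bits */
        uint32_t tag = a / NUM_BLOCKS;         /* remaining high-order bits */
        bool hit = cache[index].valid && cache[index].tag == tag;
        printf("addr %2u -> index %u: %s\n", a, index, hit ? "hit" : "miss");
        if (!hit) {                            /* miss: fetch block, evict old one */
            cache[index].valid = true;
            cache[index].tag = tag;
            cache[index].addr = a;             /* stands in for Mem[a] */
        }
    }
    return 0;
}
```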

Address Subdivision

[Figure: a direct-mapped cache, showing the address divided into tag, index, and byte-offset fields]
Example: Larger Block Size
Consider a cache with 64 blocks and a
block size of 16 bytes. What block number
does byte address 1200 map to?

Block address = 1200/16 = 75


Block number = 75 modulo 64 = 11
1200 = (4B0)16 = 0100 1011 0000 in binary
Address fields: Tag = bits 31–10 (22 bits), Index = bits 9–4 (6 bits), Offset = bits 3–0 (4 bits)
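The same answer falls out of shifts and masks. With 4 offset bits (16-byte blocks) and 6 index bits (64 blocks), byte address 1200 lands in block 11:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t addr = 1200;
    uint32_t offset = addr & 0xF;          /* bits 3-0:  1200 % 16 = 0  */
    uint32_t index  = (addr >> 4) & 0x3F;  /* bits 9-4:  75 % 64  = 11  */
    uint32_t tag    = addr >> 10;          /* bits 31-10: 1200/1024 = 1 */
    printf("tag=%u index=%u offset=%u\n", tag, index, offset);
    return 0;
}
```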

Block Size Considerations
Larger blocks should reduce miss rate
Due to spatial locality
But in a fixed-sized cache
Larger blocks ⇒ fewer of them
More competition ⇒ increased miss rate
Larger blocks ⇒ pollution
Larger miss penalty
Can override benefit of reduced miss rate
Early restart and critical-word-first can help

Cache Misses
On cache hit, CPU proceeds normally
On cache miss
Stall the CPU pipeline
Fetch block from next level of hierarchy
Instruction cache miss
Restart instruction fetch
Data cache miss
Complete data access

Write-Through
On data-write hit, could just update the block in
cache
But then cache and memory would be inconsistent
Write through: also update memory
But makes writes take longer
e.g., if base CPI = 1, 10% of instructions are stores,
write to memory takes 100 cycles
Effective CPI = 1 + 0.1 × 100 = 11
Solution: write buffer
Holds data waiting to be written to memory
CPU continues immediately
Only stalls on write if write buffer is already full
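The arithmetic above in executable form (parameters taken from the slide; with a write buffer that rarely fills, the store cost mostly disappears and CPI returns to near the base value):

```c
#include <stdio.h>

int main(void) {
    double base_cpi   = 1.0;
    double store_freq = 0.10;   /* 10% of instructions are stores */
    double write_time = 100.0;  /* cycles for each write to memory */

    /* Without a write buffer, every store stalls for the full write. */
    printf("effective CPI = %.0f\n", base_cpi + store_freq * write_time);  /* 11 */
    return 0;
}
```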

Write-Back
Alternative: On data-write hit, just update
the block in cache
Keep track of whether each block is dirty
When a dirty block is replaced
Write it back to memory
Can use a write buffer to allow replacing block
to be read first
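A sketch of write-back behavior with a toy one-word-per-block memory (structure and names are ours, not from the slides): a write hit only dirties the cache entry, and the write to memory happens when a dirty block is evicted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { bool valid, dirty; uint32_t tag, data; } wb_entry;

#define NUM_BLOCKS 8u
static uint32_t memory[64];            /* toy backing store, one word per block */

/* On a write hit, just update the cache and mark the entry dirty. */
void write_hit(wb_entry *e, uint32_t value) {
    e->data  = value;
    e->dirty = true;                   /* memory is now stale until eviction */
}

/* On a miss, write the victim back only if it is dirty, then fetch. */
void replace_block(wb_entry *e, uint32_t index, uint32_t new_tag) {
    if (e->valid && e->dirty)
        memory[e->tag * NUM_BLOCKS + index] = e->data;   /* write-back */
    e->data  = memory[new_tag * NUM_BLOCKS + index];     /* fetch new block */
    e->tag   = new_tag;
    e->valid = true;
    e->dirty = false;                  /* fresh copy matches memory */
}

int main(void) {
    wb_entry e = {0};
    replace_block(&e, 2, 3);           /* clean miss: no write-back needed */
    write_hit(&e, 42);                 /* dirty the block in cache only */
    replace_block(&e, 2, 1);           /* dirty victim written back first */
    printf("memory[26] = %u\n", memory[26]);   /* 3*8+2 = 26 -> prints 42 */
    return 0;
}
```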

Write Allocation
What should happen on a write miss?
Alternatives for write-through
Allocate on miss: fetch the block
Write around: don't fetch the block
Since programs often write a whole block before
reading it (e.g., initialization)
For write-back
Usually fetch the block
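The write-through alternatives as a branch, with stub helpers standing in for the real cache and memory machinery (all names hypothetical):

```c
#include <stdio.h>
#include <stdint.h>

/* Stubs for illustration only. */
static void fetch_block(uint32_t addr)              { printf("fetch block of %u\n", addr); }
static void cache_write(uint32_t addr, uint32_t v)  { printf("cache[%u] = %u\n", addr, v); }
static void memory_write(uint32_t addr, uint32_t v) { printf("mem[%u] = %u\n", addr, v); }

/* Write miss in a write-through cache. */
void handle_write_miss(int write_allocate, uint32_t addr, uint32_t val) {
    if (write_allocate) {            /* allocate on miss: fetch, then write */
        fetch_block(addr);
        cache_write(addr, val);
    }                                /* write around: skip the fetch entirely */
    memory_write(addr, val);         /* write-through always updates memory */
}

int main(void) {
    handle_write_miss(1, 100, 7);    /* allocate on miss */
    handle_write_miss(0, 200, 9);    /* write around */
    return 0;
}
```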

Example: Intrinsity FastMATH
Embedded MIPS processor
12-stage pipeline
Instruction and data access on each cycle
Split cache: separate I-cache and D-cache
Each 16KB: 256 blocks × 16 words/block
D-cache: write-through or write-back
SPEC2000 miss rates
I-cache: 0.4%
D-cache: 11.4%
Weighted average: 3.2%
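The slides do not give the instruction/data mix behind the 3.2% figure, but it is consistent with data references making up roughly a quarter of all memory accesses (about one load or store per three instruction fetches): 0.75 × 0.4% + 0.25 × 11.4% ≈ 3.2%.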

Example: Intrinsity FastMATH

[Figure: the Intrinsity FastMATH cache organization]
