Chapter 2 (PPT)
Memory
Key Characteristics of Computer Memory Systems
• Memory can be volatile or nonvolatile, and erasable or nonerasable
• Capacity
• Cost
Performance
Three parameters determine memory performance:
• Access time (latency)
• Memory cycle time
• Transfer rate
Memory Hierarchy
As we move from the top of the pyramid to
the bottom, the following occur:
a) Decreasing cost per bit
b) Increasing capacity
c) Increasing access time
d) Decreasing frequency of access of the memory
by the processor
Semiconductor Main Memory
Semiconductor Memory Types
Memory Cell Operation
Organization of bit cells in a memory chip
Organization of a 2M × 32 memory module using 512K × 8
static memory chips
Internal organization of a 32M × 8 dynamic memory chip
Typical 16 Mb DRAM (4M × 4)
Cache memories
Read hit:
- The data is obtained from the cache.
Write hit:
- Cache is a replica of the contents of the main memory.
- Contents of the cache and the main memory may be
updated simultaneously. This is the write-through
protocol.
- Alternatively, update only the contents of the cache and mark the block
as updated by setting a bit known as the dirty bit or modified bit.
- The contents of the main memory are then updated when this
block is replaced. This is the write-back or copy-back protocol.
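To make the two write-hit policies concrete, here is a minimal Python sketch; the CacheLine class and write_hit function are illustrative assumptions, not part of the slides:

    class CacheLine:
        def __init__(self, tag, data):
            self.tag = tag
            self.data = data            # copy of one main-memory block
            self.dirty = False          # the dirty/modified bit used by write-back

    def write_hit(line, offset, value, main_memory, block_no, write_through=True):
        line.data[offset] = value       # update the cached copy
        if write_through:
            # Write-through: main memory is updated at the same time.
            main_memory[block_no][offset] = value
        else:
            # Write-back: only mark the line dirty; main memory is
            # updated later, when this block is replaced.
            line.dirty = True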
Cache Memory-Read/Write
• If the data is not present in the cache, then a Read miss or
Write miss occurs.
Read miss:
- Block of words containing this requested word is transferred
from the memory.
- After the block is transferred, the desired word is forwarded
to the processor.
- The desired word may also be forwarded to the processor as
soon as it arrives, without waiting for the entire block to
be transferred. This is called load-through or early restart.
Write miss:
- If the write-through protocol is used, then the contents of the main
memory are updated directly.
- If write-back protocol is used, the block containing the
addressed word is first brought into the cache. The desired
word is overwritten with new information.
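A sketch of the read-miss handling described above, contrasting waiting for the whole block with load-through/early restart; the function and data structures are hypothetical, not from the slides:

    def read_miss(cache, main_memory, block_no, offset, line_no, load_through=True):
        block = main_memory[block_no]
        if load_through:
            # Load-through / early restart: forward the requested word
            # to the processor as soon as it arrives ...
            word = block[offset]
            cache[line_no] = list(block)    # ... then finish filling the cache line
        else:
            # Otherwise, transfer the entire block first, then forward the word.
            cache[line_no] = list(block)
            word = cache[line_no][offset]
        return word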
Locality of reference
• Analysis of programs indicates that many instructions in
localized areas of a program are executed repeatedly
during some period of time, while the others are accessed
relatively less frequently.
—These instructions may be the ones in a loop,
nested loop or few procedures calling each
other repeatedly.
—This is called “locality of reference”.
• Temporal locality of reference:
— Recently executed instruction is likely to be executed
again very soon.
• Spatial locality of reference:
— Instructions with addresses close to a recently executed
instruction are likely to be executed soon.
Locality of reference (contd..)
• Cache memory is based on the concept of locality
of reference.
— If active segments of a program are placed in a fast
cache memory, then the execution time can be reduced.
• Temporal locality of reference:
— Whenever an instruction or data is needed for the first
time, it should be brought into a cache. It will hopefully
be used again repeatedly.
• Spatial locality of reference:
— Instead of fetching just one item from the main memory
to the cache at a time, several items that have addresses
adjacent to the item being fetched may be useful.
— The term “block” refers to a set of contiguous address
locations of some size.
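As a small illustration (hypothetical code, not from the slides), the loop below shows both kinds of locality: total and i are reused on every iteration (temporal locality), while data[i] walks through adjacent address locations (spatial locality), which is why fetching a whole block into the cache pays off:

    data = list(range(1024))
    total = 0
    for i in range(len(data)):   # i and total: temporal locality
        total += data[i]         # data[i], data[i+1], ...: spatial locality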
Hit Rate & Miss Penalty
• A successful access to data in a cache is called a hit.
• The number of hits stated as a fraction of all attempted
accesses is called the hit rate.
• The number of misses stated as a fraction of all attempted
accesses is called the miss rate.
• High hit rates well over 0.9 are essential for high-
performance computers.
• Performance is adversely affected by the actions that need
to be taken when a miss occurs.
• The extra time needed to bring the desired information into
the cache is called miss penalty.
• Average access time experienced by the processor:
tavg = hC + (1 − h)M
where
h = hit rate, (1 − h) = miss rate
C = time to access data in the cache
M = miss penalty
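A small Python rendering of the formula; the numbers used are illustrative assumptions only:

    def avg_access_time(h, C, M):
        # tavg = h*C + (1 - h)*M
        return h * C + (1 - h) * M

    # Assumed values: 95% hit rate, cache access of 1 time unit,
    # miss penalty of 20 time units.
    print(round(avg_access_time(0.95, 1, 20), 2))   # 0.95*1 + 0.05*20 = 1.95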
Example : Suppose that the processor has access to two levels of memory. Level 1
contains 1000 words and has an access time of 0.01 µs; level 2 contains 100,000
words and has an access time of 0.1 µs. Assume 95% of the memory accesses are
found in Level 1. Calculate the average time to access a word in memory.
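A worked solution, assuming (as in Stallings' treatment of this example) that on a miss the word is first transferred into level 1 and then accessed there, so a miss costs T1 + T2:

    T1, T2, h = 0.01, 0.1, 0.95            # access times in microseconds, hit ratio
    t_avg = h * T1 + (1 - h) * (T1 + T2)
    print(round(t_avg, 3))                 # 0.95*0.01 + 0.05*0.11 = 0.015 microseconds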
Direct mapping: i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in the cache
Address fields: Tag | Line | Word (example: 10 | 101 | 10)
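Decoding the example address above (tag 10, line 101, word 10) in Python, assuming a 2-bit tag, 3-bit line field, and 2-bit word field, so m = 8 cache lines:

    addr = 0b1010110                 # the example address 10 101 10
    word = addr & 0b11               # low 2 bits  -> word 2 within the block
    i    = (addr >> 2) & 0b111       # next 3 bits -> cache line i = 5 (binary 101)
    tag  = addr >> 5                 # high 2 bits -> tag = 2 (binary 10)
    j    = addr >> 2                 # memory block number j = 21 (binary 10101)
    m    = 8                         # number of lines in the cache
    assert i == j % m                # i = j modulo m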
Direct-Mapping Cache Organization
Direct mapping
•Block j of the main memory maps to block (j modulo 128) of the cache: block 0 maps to cache block 0, block 129 maps to cache block 1, and so on.
•More than one memory block is mapped onto the same position in the cache.
•This may lead to contention for cache blocks even if the cache is not full.
•The contention is resolved by allowing the new block to replace the old block, leading to a trivial replacement algorithm.
•The memory address is divided into three fields: Tag (5 bits) | Block (7 bits) | Word (4 bits).
- The low-order 4 bits determine one of the 16 words in a block.
- When a new block is brought into the cache, the next 7 bits determine which cache block this new block is placed in.
- The high-order 5 bits determine which of the 32 possible memory blocks is currently present in that cache block. These are the tag bits.
•Simple to implement but not very flexible.
(Figure: direct-mapped cache of 128 blocks; main memory of 4096 blocks, 16 words each.)
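A minimal lookup sketch for the direct-mapped configuration on this slide (128 cache blocks, 16 words per block, 5-bit tag); the data structures are assumptions for illustration:

    def split_address(addr):
        word  = addr & 0xF            # low 4 bits: word within the block
        block = (addr >> 4) & 0x7F    # next 7 bits: cache block index (0..127)
        tag   = addr >> 11            # high 5 bits: tag
        return tag, block, word

    def direct_mapped_read(cache, addr):
        tag, block, word = split_address(addr)
        line = cache[block]           # exactly one possible location
        if line is not None and line["tag"] == tag:
            return line["data"][word] # read hit
        return None                   # read miss: fetch the block from main memory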
Mapping of main memory blocks (Bj) to cache lines (Li) for associative mapping
• Any of the memory blocks B0–B31 may be placed in any of the cache lines L0–L7, so the tag of a line may take any value from 00000 to 11111.
• The penalty with associative mapping is the cost of comparing a tag with every line in the cache.
— To do this efficiently the cache must be small.
• Address length = (s + w) bits = 5 + 2 = 7 bits
• Number of addressable units = 2^(s+w) words = 2^7 = 128 words
• Block size = line size = 2^w words = 2^2 = 4 words
• Size of tag = s bits = 5 bits
Address fields: Tag (5 bits) | Word (2 bits), example: 11001 | 10
Fully Associative Cache
Associative mapping
•A main memory block can be placed into any cache position.
•The memory address is divided into two fields: Tag (12 bits) | Word (4 bits).
- The low-order 4 bits identify the word within a block.
- The high-order 12 bits (the tag bits) identify a memory block when it is resident in the cache.
•Flexible, and uses cache space efficiently.
•Replacement algorithms can be used to replace an existing block in the cache when the cache is full.
•Cost is higher than a direct-mapped cache because of the need to search all 128 tag patterns to determine whether a given block is in the cache.
(Figure: associative-mapped cache of 128 blocks; main memory of 4096 blocks, 16 words each.)
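For contrast with direct mapping, a lookup sketch for this associative organization (12-bit tag, 4-bit word, 128 lines); in hardware the tag comparisons happen in parallel, so the loop below is only an illustration:

    def associative_read(cache_lines, addr):
        tag, word = addr >> 4, addr & 0xF        # 12-bit tag, 4-bit word offset
        for line in cache_lines:                 # compare against all 128 tags
            if line is not None and line["tag"] == tag:
                return line["data"][word]        # hit
        return None                              # miss: any line may receive the block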
K-Way Set-Associative Cache Organization
2-Way Set Associative
(Figure: cache divided into sets S0, S1, ..., each holding two blocks.)
Number of main memory locations = number of blocks × number of words per block = 2^12 × 2^7 = 2^19
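A 2-way set-associative lookup sketch; the set count and field widths below are assumptions chosen only to illustrate that the set index selects one set and just k = 2 tags are compared:

    K, NUM_SETS, WORD_BITS, SET_BITS = 2, 64, 4, 6    # assumed sizes

    def set_associative_read(sets, addr):
        word      = addr & ((1 << WORD_BITS) - 1)
        set_index = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
        tag       = addr >> (WORD_BITS + SET_BITS)
        for line in sets[set_index][:K]:              # only the k lines of one set
            if line is not None and line["tag"] == tag:
                return line["data"][word]             # hit
        return None                                   # miss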
There are four types of I/O commands that an I/O module may
receive when it is addressed by a processor:
1) Control
- used to activate a peripheral and tell it what to do
2) Test
- used to test various status conditions associated with an I/O
module and its peripherals
3) Read
- causes the I/O module to obtain an item of data from the
peripheral and place it in an internal buffer
4) Write
- causes the I/O module to take an item of data from the data bus
and subsequently transmit that data item to the peripheral
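A hypothetical dispatch of these four commands to an I/O module; the IOModule method names used here are assumptions for illustration only:

    from enum import Enum

    class Command(Enum):
        CONTROL = 1    # activate a peripheral and tell it what to do
        TEST    = 2    # check status conditions of the module and its peripherals
        READ    = 3    # peripheral -> module's internal buffer
        WRITE   = 4    # data bus   -> peripheral

    def issue(io_module, command, data=None):
        if command is Command.CONTROL:
            io_module.control(data)
        elif command is Command.TEST:
            return io_module.status()
        elif command is Command.READ:
            return io_module.read_into_buffer()
        elif command is Command.WRITE:
            io_module.write_to_peripheral(data)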
3 Techniques for Input of a Block of Data
I/O Instructions
Interrupt-Driven I/O
Simple Interrupt Processing
Direct Memory Access (DMA)