Unit-4 Memory Notes
Memory Hierarchy
Memory Hierarchy is an organization of memory that minimizes the average
access time. The Memory Hierarchy was developed based on a program behavior known as
locality of reference. The figure below demonstrates the different levels of the memory
hierarchy.
The memory unit that communicates directly with the CPU is called the main memory.
The principal technology used for the main memory is based on semiconductor integrated
circuits. Integrated circuit RAM chips are available in two possible operating modes, static
and dynamic.
• Static RAM: The static RAM consists essentially of internal flip-flops that store the
binary information. The stored information remains valid as long as power is applied
to the unit.
• Dynamic RAM: The dynamic RAM stores the binary information in the form of
electric charges that are applied to capacitors. The capacitors are provided inside the
chip by MOS transistors. The stored charge on the capacitors tends to discharge with
time and the capacitors must be periodically recharged by refreshing the dynamic
memory.
Refreshing is done by cycling through the words every few milliseconds to restore the
decaying charge.
The ROM portion of main memory is needed for storing an initial program called a
bootstrap loader. The bootstrap loader is a program whose function is to start the computer
software operating when power is turned on.
Since RAM is volatile, its contents are destroyed when power is turned off. The contents of
ROM remain unchanged after power is turned off and on again.
RAM and ROM Chips:
The block diagram of a RAM chip is shown in Fig.
• The capacity of the memory is 128 words of eight bits (one byte) per word. This
requires a 7-bit address and an 8-bit bidirectional data bus.
• The read and write inputs specify the memory operation and the two chips select
(CS) control inputs are for enabling the chip only when it is selected by the
microprocessor.
• The availability of more than one control input to select the chip facilitates the
decoding of the address lines when multiple chips are used in the microcomputer.
• The read and write inputs are sometimes combined into one line labeled R/W. When
the chip is selected, the two binary states in this line specify the two operations of
read and write.
• The function table listed in Fig. specifies the operation of the RAM chip.
• The unit is in operation only when CS1 = 1 and CS2 = 0. The bar on top of the second
select variable indicates that this input is enabled when it is equal to 0.
• If the chip select inputs are not enabled, or if they are enabled but the read or write
inputs are not enabled, the memory is inhibited and its data bus is in a high-
impedance state.
• When CS1 = 1 and CS2 = 0, the memory can be placed in a write or read mode.
• When the WR input is enabled, the memory stores a byte from the data bus into a
location specified by the address input lines.
• When the RD input is enabled, the content of the selected byte is placed into the
data bus. The RD and WR signals control the memory operation as well as the bus
buffers associated with the bidirectional data bus .
However, since a ROM can only read, the data bus can only be in an output mode. For the
same-size chip, it is possible to have more bits of ROM than of RAM, because the internal
binary cells in ROM occupy less space than in RAM. For this reason, the diagram specifies a
512-byte ROM, while the RAM has only 128 bytes.
The nine address lines in the ROM chip specify any one of the 512 bytes stored in it. The
two chip select inputs must be CS1 = 1 and CS2 = 0 for the unit to operate. Otherwise, the
data bus is in a high-impedance state. There is no need for a read or write control because
the unit can only read. Thus when the chip is enabled by the two select inputs, the byte
selected by the address lines appears on the data bus.
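The chip-select behavior described by the function table can be sketched as a small Python function. This is a hypothetical model for illustration, not part of any real chip interface:

```python
def ram_chip_op(cs1, cs2_bar, rd, wr):
    """Model the RAM chip's function table: the chip is selected only
    when CS1 = 1 and CS2 = 0 (CS2 is active-low); otherwise the data
    bus floats in a high-impedance state."""
    if not (cs1 == 1 and cs2_bar == 0):
        return "high-impedance"   # chip not selected
    if wr == 1:
        return "write"            # store a byte from the data bus
    if rd == 1:
        return "read"             # place the selected byte on the data bus
    return "high-impedance"       # selected, but neither RD nor WR enabled

print(ram_chip_op(1, 0, 1, 0))   # read
print(ram_chip_op(0, 0, 1, 0))   # high-impedance
```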
Memory Address Map: The interconnection between memory and processor is established
from knowledge of the size of memory needed and the type of RAM and ROM chips
available. The addressing of memory can be established by means of a table that specifies
the memory address assigned to each chip. The table, called a memory address map, is a
pictorial representation of assigned address space for each chip in the system.
To demonstrate with a particular example, assume that a computer system needs 512
bytes of RAM and 512 bytes of ROM.
Question:
a. How many 128 x 8 RAM chips are needed to provide a memory capacity of 2048 bytes?
b. How many lines of the address bus must be used to access 2048 bytes of memory? How
many of these lines will be common to all chips?
c. How many lines must be decoded for chip select? Specify the size of the decoders.
Solution:
a. 2048 / 128 = 16 chips.
b. 2048 = 2^11, so 11 address lines are needed; the 7 low-order lines (which select one of
the 128 words within a chip) are common to all chips.
c. The remaining 11 - 7 = 4 high-order lines must be decoded for chip select, requiring
one 4 x 16 decoder.
Cache Memory
The data or contents of the main memory that are used frequently by CPU are stored in the cache
memory so that the processor can easily access that data in a shorter time. Whenever the CPU needs
to access memory, it first checks the cache memory. If the data is not found in cache memory, then
the CPU moves into the main memory.
Cache memory is placed between the CPU and the main memory. The block diagram for a cache
memory can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the speed of CPU
components.
The basic operation of a cache memory is as follows:
o When the CPU needs to access memory, the cache is examined. If the word is found in the
cache, it is read from the fast memory.
o If the word addressed by the CPU is not found in the cache, the main memory is accessed to
read the word.
o A block of words containing the one just accessed is then transferred from main memory
to cache memory. The block size may vary from one word (the one just accessed) to
about 16 words adjacent to the one just accessed.
o The performance of the cache memory is frequently measured in terms of a quantity
called hit ratio.
o When the CPU refers to memory and finds the word in cache, it is said to produce a hit.
o If the word is not found in the cache, it is in main memory and it counts as a miss.
o The ratio of the number of hits divided by the total CPU references to memory (hits plus
misses) is the hit ratio.
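The hit-ratio definition can be sketched in Python, together with a commonly used simplified model of the resulting average access time (the model and the sample figures are illustrative, not taken from the text):

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses): the fraction of all CPU
    memory references that are satisfied by the cache."""
    return hits / (hits + misses)

def avg_access_time(h, tc, tm):
    """A simplified model: a hit costs the cache access time tc,
    a miss costs the main-memory access time tm."""
    return h * tc + (1 - h) * tm

# e.g. 900 hits out of 1000 references gives a hit ratio of 0.9
print(hit_ratio(900, 100))            # 0.9
print(avg_access_time(0.9, 160, 960)) # ~240 ns for tc = 160 ns, tm = 960 ns
```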
Three types of mapping procedures are:
1. Associative mapping
2. Direct mapping
3. Set-associative mapping
The main memory can store 32K words of 12 bits each. The cache is capable of storing 512
of these words at any given time. The CPU communicates with both memories. It first sends
a 15-bit address to cache. If there is a hit, the CPU accepts the 12-bit data from cache. If
there is a miss, the CPU reads the word from main memory and the word is then
transferred to cache.
Associative Mapping:
The fastest and most flexible cache organization uses an associative memory. The
associative memory stores both the address and content (data) of the memory word. This
permits any location in cache to store any word from main memory. The address value of 15
bits is shown as a five-digit octal number and its corresponding 12 -bit word is shown as a
four-digit octal number.
A CPU address of 15 bits is placed in the argument register and the associative memory is
searched for a matching address. If the address is found, the corresponding 12-bit data is
read and sent to the CPU. If no match occurs, the main memory is accessed for the word.
The address-data pair is then transferred to the associative cache memory. If the cache is
full, an address-data pair must be displaced to make room for a pair that is needed and not
presently in the cache. The decision as to what pair is replaced is determined from the
replacement algorithm that the designer chooses for the cache.
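The associative lookup and displacement described above can be sketched as a minimal Python class. FIFO replacement is assumed here purely for illustration; as the text notes, the replacement algorithm is the designer's choice:

```python
class AssociativeCache:
    """Sketch of an associative-mapped cache: any main-memory word may
    occupy any cache line, so lookup searches the stored
    (address, data) pairs rather than indexing by address."""

    def __init__(self, size):
        self.size = size
        self.lines = {}   # address -> data; dict preserves insertion order

    def access(self, address, memory):
        if address in self.lines:              # hit: match on stored address
            return self.lines[address], True
        data = memory[address]                 # miss: fetch from main memory
        if len(self.lines) >= self.size:       # full: displace the oldest pair
            self.lines.pop(next(iter(self.lines)))
        self.lines[address] = data             # transfer address-data pair in
        return data, False

mem = {addr: addr * 10 for addr in range(8)}   # toy main memory
cache = AssociativeCache(size=2)
print(cache.access(0, mem))   # (0, False): miss
print(cache.access(0, mem))   # (0, True): hit
```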
Direct Mapping:
The CPU address of 15 bits is divided into two fields. The nine least significant bits constitute
the index field and the remaining six bits form the tag field.
In the general case, there are 2k words in cache memory and 2n words in main memory. The
n-bit memory address is divided into two fields: k bits for the index field and n - k bits for
the tag field. The direct mapping cache organization uses the n-bit address to access the
main memory and the k-bit index to access the cache.
Each word in cache consists of the data word and its associated tag. When a new word is
first brought into the cache, the tag bits are stored alongside the data bits. When the CPU
generates a memory request, the index field is used for the address to access the cache. The
tag field of the CPU address is compared with the tag in the word read from the cache. If
the two tags match, there is a hit and the desired data word is in cache. If there is no match,
there is a miss and the required word is read from main memory. It is then stored in the
cache together with the new tag, replacing the previous value. The disadvantage of direct
mapping is that the hit ratio can drop considerably if two or more words whose addresses
have the same index but different tags are accessed repeatedly.
Consider the numerical example shown in Fig. 12-13. The word at address zero is presently
stored in the cache (index = 000, tag = 00, data = 1220). Suppose that the CPU now wants to
access the word at address 02000. The index address is 000, so it is used to access the
cache. The two tags are then compared. The cache tag is 00 but the address tag is 02, which
does not produce a match. Therefore, the main memory is accessed and the data word 5670
is transferred to the CPU. The cache word at index address 000 is then replaced with a tag of
02 and data of 5670.
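The index/tag split used in the example above can be sketched in Python, with the bit widths from the 512-word cache (the function name is illustrative):

```python
INDEX_BITS = 9   # 2**9 = 512-word cache
TAG_BITS = 6     # remaining bits of the 15-bit address

def split_address(addr):
    """Split a 15-bit CPU address into (tag, index) for direct mapping."""
    index = addr & ((1 << INDEX_BITS) - 1)   # nine least significant bits
    tag = addr >> INDEX_BITS                 # remaining six bits
    return tag, index

# The example from the text: CPU address 02000 (octal)
tag, index = split_address(0o02000)
print(oct(tag), oct(index))   # tag 02, index 000
```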
The same organization, but using a block size of 8 words, is shown in Fig.
The index field is now divided into two parts: the block field and the word field. In a 512-
word cache there are 64 blocks of 8 words each, since 64 x 8 = 512. The block number is
specified with a 6-bit field and the word within the block is specified with a 3- bit field. The
tag field stored within the cache is common to all eight words of the same block. Every time
a miss occurs, an entire block of eight words must be transferred from main memory to
cache memory. Although this takes extra time, the hit ratio will most likely improve with a
larger block size because of the sequential nature of computer programs.
Set-Associative Mapping:
The disadvantage of direct mapping is that two words with the same index in their address
but with different tag values cannot reside in cache memory at the same time. A third type
of cache organization, called set-associative mapping, is an improvement over the direct
mapping organization in that each word of cache can store two or more words of memory
under the same index address. Each data word is stored together with its tag, and the
number of tag-data items in one word of cache is said to form a set.
Each index address refers to two data words and their associated tags. Each tag requires six
bits and each data word has 12 bits, so the word length is 2(6 + 12) = 36 bits. An index
address of nine bits can accommodate 512 words. Thus the size of cache memory is 512 x
36. It can accommodate 1024 words of main memory since each word of cache contains
two data words. In general, a set-associative cache of set size k will accommodate k words
of main memory in each word of cache.
The words stored at addresses 01000 and 02000 of main memory are stored in cache
memory at index address 000. Similarly, the words at addresses 02777 and 00777 are
stored in cache at index address 777. When the CPU generates a memory request, the index
value of the address is used to access the cache. The tag field of the CPU address is
then compared with both tags in the cache to determine if a match occurs. The comparison
logic is done by an associative search of the tags in the set similar to an associative memory
search: thus the name "set-associative." The hit ratio will improve as the set size increases
because more words with the same index but different tags can reside in cache. However,
an increase in the set size increases the number of bits in words of cache and requires more
complex comparison logic.
When a miss occurs in a set-associative cache and the set is full, it is necessary to replace
one of the tag-data items with a new value. The most common replacement algorithms
used are: random replacement, first-in, first-out (FIFO), and least recently used (LRU).
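The set-associative lookup can be sketched as a small Python class, using LRU replacement (one of the three policies named above). This is an illustrative model; the set-indexing arithmetic here uses simple modulo addressing rather than the octal figures of the example:

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Sketch of a k-way set-associative cache with LRU replacement."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]  # per-set tag -> data

    def access(self, addr, memory):
        index = addr % self.num_sets
        tag = addr // self.num_sets
        s = self.sets[index]
        if tag in s:                  # hit: compare against all tags in the set
            s.move_to_end(tag)        # refresh LRU order
            return s[tag], True
        data = memory[addr]           # miss: read from main memory
        if len(s) >= self.ways:       # set full: evict least recently used
            s.popitem(last=False)
        s[tag] = data
        return data, False
```

Unlike direct mapping, two addresses sharing the same index can now reside in cache at the same time, as long as the set is not full.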
When the CPU finds a word in cache during a read operation, the main memory is not
involved in the transfer. However, if the operation is a write, there are two ways that the
system can proceed.
Write-Through: The simplest and most commonly used procedure is to update main
memory with every memory write operation, with cache memory being updated in parallel
if it contains the word at the specified address. This is called the write-through method. This
method has the advantage that main memory always contains the same data as the cache.
Write-Back: The second procedure is called the write-back method. In this method only the
cache location is updated during a write operation. The location is then marked by a flag so
that later when the word is removed from the cache it is copied into main memory. The
reason for the write-back method is that during the time a word resides in the cache, it may
be updated several times; however, as long as the word remains in the cache, it does not
matter whether the copy in main memory is out of date, since requests from the word are
filled from the cache. It is only when the word is displaced from the cache that an accurate
copy need be rewritten into main memory.
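The two write policies can be sketched in Python. The flag that marks an updated cache word is modeled here as a "dirty" bit; the function names are illustrative:

```python
def write(cache, memory, addr, value, policy="write-through"):
    """Sketch of the two write policies. Cache entries are
    (value, dirty) pairs; 'dirty' marks words that must be copied
    back to main memory when displaced (write-back)."""
    if policy == "write-through":
        memory[addr] = value               # main memory updated on every write
        if addr in cache:
            cache[addr] = (value, False)   # cache updated in parallel
    else:  # write-back
        cache[addr] = (value, True)        # only the cache location is updated

def displace(cache, memory, addr):
    """On displacement, an accurate copy of a dirty word is
    rewritten into main memory."""
    value, dirty = cache.pop(addr)
    if dirty:
        memory[addr] = value
```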
Virtual Memory:
In a memory hierarchy system, programs and data are first stored in auxiliary memory.
Portions of a program or data are brought into main memory as they are needed by the
CPU.
Virtual memory is a concept used in some large computer systems that permits the user to
construct programs as though a large memory space were available, equal to the totality of
auxiliary memory.
Each address that is referenced by the CPU goes through an address mapping from the so-
called virtual address to a physical address in main memory. Virtual memory is used to give
programmers the illusion that they have a very large memory at their disposal, even though
the computer actually has a relatively small main memory. A virtual memory system
provides a mechanism for translating program- generated addresses into correct main
memory locations.
This is done dynamically, while programs are being executed in the CPU. The translation or
mapping is handled automatically by the hardware by means of a mapping table.
Address Space and Memory Space
An address used by a programmer will be called a virtual address, and the set of such
addresses the address space. An address in main memory is called a location or physical
address. The set of such locations is called the memory space. In most computers the
address and memory spaces are identical. The address space is allowed to be larger than
the memory space in computers with virtual memory.
Consider a computer with a main-memory capacity of 32K words (K = 1024). Fifteen bits are
needed to specify a physical address in memory since 32K = 2^15. Suppose that the computer
has available auxiliary memory for storing 2^20 = 1024K words. Thus the auxiliary memory has
a capacity for storing information equivalent to the capacity of 32 main memories. Denoting
the address space by N and the memory space by M, we then have for this example N =
1024K and M = 32K.
In a multiprogram computer system, programs and data are transferred to and from auxiliary
memory and main memory based on demands imposed by the CPU. Suppose that program 1
is currently being executed in the CPU. Program 1 and a portion of its associated data are
moved from auxiliary memory into main memory as shown in Fig. 12-16. Portions of
programs and data need not be in contiguous locations in memory since information is being
moved in and out, and empty spaces may be available in scattered locations in memory. In our
example, the address field of an instruction code will consist of 20 bits but physical memory
addresses must be specified with only 15 bits. Thus the CPU will reference instructions and data
with a 20-bit address, but the information at this address must be taken from physical
memory because access to auxiliary storage for individual words will be prohibitively long.
A table is then needed, as shown in Fig., to map a virtual address of 20 bits to a physical
address of 15 bits. The mapping is a dynamic operation, which means that every address is
translated immediately as a word is referenced by the CPU.
Address Mapping Using Pages:
The table implementation of the address mapping is simplified if the information in the
address space and the memory space are each divided into groups of fixed size. The physical
memory is broken down into groups of equal size called blocks, which may range from 64 to
4096 words each. The term page refers to groups of address space of the same size.
For example, if a page or block consists of 1K words, then, using the previous example,
address space is divided into 1024 pages and main memory is divided into 32 blocks.
Although both a page and a block are split into groups of 1K words, a page refers to the
organization of address space, while a block refers to the organization of memory space.
The programs are also considered to be split into pages. Portions of programs are moved
from auxiliary memory to main memory in records equal to the size of a page. The term
"page frame" is sometimes used to denote a block.
Consider a computer with an address space of 8K and a memory space of 4K. If we split each
into groups of 1K words we obtain eight pages and four blocks. At any given time, up to four
pages of address space may reside in main memory in any one of the four blocks.
The mapping from address space to memory space is facilitated if each virtual address is
considered to be represented by two numbers: a page number address and a line within the
page. In a computer with 2^p words per page, p bits are used to specify a line address and the
remaining high-order bits of the virtual address specify the page number. In the example of
Fig. 12-18, a virtual address has 13 bits. Since each page consists of 2^10 = 1024 words, the
high-order three bits of a virtual address will specify one of the eight pages and the low-
order 10 bits give the line address within the page.
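The page-number/line split for the 13-bit example can be sketched with a few bit operations (the function name is illustrative):

```python
PAGE_BITS = 10   # 2**10 = 1024 words per page

def split_virtual(addr):
    """Split a 13-bit virtual address into (page number, line address)."""
    page = addr >> PAGE_BITS                 # high-order three bits
    line = addr & ((1 << PAGE_BITS) - 1)     # low-order 10 bits
    return page, line

print(split_virtual(0b101_0000000011))   # page 5, line 3
```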
Note that the line address in address space and memory space is the same; the only
mapping required is from a page number to a block number.
The memory-page table consists of eight words, one for each page. The address in the page
table denotes the page number and the content of the word gives the block number where
that page is stored in main memory. The table shows that pages 1, 2, 5, and 6 are now
available in main memory in blocks 3, 0, 1, and 2, respectively.
A presence bit in each location indicates whether the page has been transferred from
auxiliary memory into main memory. A 0 in the presence bit indicates that this page is not
available in main memory. The CPU references a word in memory with a virtual address of
13 bits. The three high-order bits of the virtual address specify a page number and also an
address for the memory-page table.
The content of the word in the memory page table at the page number address is read out
into the memory table buffer register. If the presence bit is a 1, the block number thus read
is transferred to the two high-order bits of the main memory address register. The line
number from the virtual address is transferred into the 10 low-order bits of the memory
address register.
A read signal to main memory transfers the content of the word to the main memory
buffer register ready to be used by the CPU. If the presence bit in the word read from the
page table is 0, it signifies that the content of the word referenced by the virtual address
does not reside in main memory. A call to the operating system is then generated to fetch
the required page from auxiliary memory and place it into main memory before resuming
computation.
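The page-table lookup described above can be sketched in Python, using the eight-page, four-block example (pages 1, 2, 5, and 6 resident in blocks 3, 0, 1, and 2). The table contents mirror the figure; the function and entry layout are illustrative:

```python
# Each page-table entry is (presence bit, block number).
page_table = {0: (0, None), 1: (1, 3), 2: (1, 0), 3: (0, None),
              4: (0, None), 5: (1, 1), 6: (1, 2), 7: (0, None)}

PAGE_BITS = 10   # 1024 words per page/block

def translate(vaddr):
    """Map a 13-bit virtual address to a 12-bit physical address
    (2-bit block number + 10-bit line), or signal a page fault."""
    page = vaddr >> PAGE_BITS
    line = vaddr & ((1 << PAGE_BITS) - 1)
    present, block = page_table[page]
    if not present:   # presence bit 0: call the operating system
        raise LookupError(f"page fault: page {page} not in main memory")
    return (block << PAGE_BITS) | line

print(oct(translate((1 << 10) | 5)))   # page 1 -> block 3
```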
Question: An address space is specified by 24 bits and the corresponding memory space by
16 bits.
a. How many words are there in the address space?
b. How many words are there in the memory space?
c. If a page consists of 2K words, how many pages and blocks are there in the
system?
Solution:
a. The address space contains 2^24 = 16M words.
b. The memory space contains 2^16 = 64K words.
c. A page of 2K words is 2^11 words, so there are 2^24 / 2^11 = 2^13 = 8192 pages and
2^16 / 2^11 = 2^5 = 32 blocks.
Efficiency (η):

η = 1 / (1 + γ(1 - h))

where γ = tm / tc is the ratio of main memory access time to cache memory access time.

Example: tc = 160 ns, tm = 960 ns, h = 0.90.
γ = 960 / 160 = 6, so η = 1 / (1 + 6(1 - 0.90)) = 1 / 1.6 = 0.625.
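The efficiency formula can be checked with a short Python function (illustrative, using the sample figures tc = 160 ns, tm = 960 ns, h = 0.90):

```python
def efficiency(tc, tm, h):
    """Access efficiency eta = 1 / (1 + gamma * (1 - h)),
    where gamma = tm / tc."""
    gamma = tm / tc
    return 1 / (1 + gamma * (1 - h))

print(efficiency(160, 960, 0.90))   # ~0.625
```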
Locality of reference:
Temporal Locality: Recently referenced instructions or data are likely to be referenced
again in the near future.
Spatial Locality: This refers to the tendency for a process to access items whose addresses
are near one another.
The disk is divided into tracks. Each track is further divided into sectors. The point to be
noted here is that outer tracks are bigger in size than the inner tracks but they contain the
same number of sectors and have equal storage capacity. This is because the storage
density is high in sectors of the inner tracks, whereas the bits are sparsely
arranged in sectors of the outer tracks. Some space of every sector is used for formatting.
So, the actual capacity of a sector is less than the given capacity.
Read-Write(R-W) head moves over the rotating hard disk. It is this Read-Write head that
performs all the read and write operations on the disk and hence, position of the R- W head
is a major concern. To perform a read or write operation on a memory location, we need to
place the R-W head over that position. Some important terms must be noted here:
1. Seek time – The time taken by the R-W head to reach the desired track from its
current position.
2. Rotational latency – Time taken by the sector to come under the R-W head.
3. Data transfer time – Time taken to transfer the required amount of data. It depends
upon the rotational speed.
4. Controller time – The processing time taken by the controller.
5. Average access time – Seek time + average rotational latency + data transfer time +
controller time.
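The access-time components listed above can be combined in a short Python sketch. The disk figures used here (9 ms seek, 7200 rpm, 1 ms transfer, 0.5 ms controller) are hypothetical, chosen only to make the arithmetic concrete:

```python
def avg_access_time_ms(seek_ms, rpm, transfer_ms, controller_ms):
    """Average access time = seek time + average rotational latency
    + data transfer time + controller time.  Average rotational
    latency is taken as half a revolution."""
    avg_rot_latency = (60_000 / rpm) / 2   # ms per revolution, halved
    return seek_ms + avg_rot_latency + transfer_ms + controller_ms

print(avg_access_time_ms(9, 7200, 1, 0.5))   # ~14.67 ms
```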
2D and 2½D Memory Organization
The memory organization shown for RAMs and ROMs above suffers from a problem of
scale: it works fine when the number of words in the memory is relatively small but quickly
mushrooms as the memory is scaled up or increased in size. This happens because the
number of word select wires is an exponential function of the size of the address. Suppose
that the MAR is 10 bits wide, which means there are 1024 words in the memory. The
decoder will need to output 1024 separate lines. While this is not necessarily terrible,
increasing the MAR to 15 bits means there will be 32,768 wires, and 20 bits would be over a
million.
Fig. 1 shows a 16-word memory of 5-bit words using the conventional organization:
Notice that the decoder gets quite complicated because the number of lines coming out of
it is an exponential function of the number of wires coming in. Imagine a 32-bit address!
There would be 4 billion wires coming out.
One way to tackle the exponential explosion of growth in the decoder and word select
wires is to organize memory cells into a two-dimension grid of words instead of a one-
dimensional arrangement. Then the MAR is broken into two halves, which are fed
separately into smaller decoders. One decoder addresses the rows of the grid while the
other decoder addresses the columns. Fig. 2 shows a 2.5D memory of 16 words, each word
having 5 bits:
Each memory cell has an AND gate that represents the intersection of a vertical wire from
one decoder and a horizontal wire from the other. The output of this AND gate is the line
select wire.
In the above example, the total number of word select lines goes down from 16 to 8. (There
are four wires coming from each of two decoders.) If the MAR had 10 bits, there would be
1024 word select wires in the traditional organization, but only 64 in the 2.5D organization
because each half of the MAR contributes 5 address bits, and 2^5 = 32.
Just to continue the illustration, if the MAR had 32 bits, the traditional organization would
require about 4 billion word select wires. If 2.5D organization were used, the MAR would
split into two 16-bit sections, each creating 2^16, or 65,536, wires. Since there are two of
them, this creates 131,072 wires. The ratio of these two sizes is 0.000030518 or about
0.003%. Three thousandths of one percent! What a savings!
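The wire-count comparison above can be reproduced with a few lines of Python (an even-width MAR is assumed so the address splits into equal halves):

```python
def select_wires(mar_bits):
    """Compare word-select wire counts: the 1D organization needs
    2**n wires, while the 2.5D grid needs only 2**(n/2) wires from
    each of the two decoders (n assumed even)."""
    one_d = 2 ** mar_bits
    half = mar_bits // 2
    two_half_d = 2 * (2 ** half)   # two decoders, one per half of the MAR
    return one_d, two_half_d

print(select_wires(10))   # (1024, 64)
print(select_wires(32))   # (4294967296, 131072)
```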
The usual terminology for a 2.5D memory is 2½D memory, but this is hard to write. Nobody
is sure why it is called a two and a half dimensional thing, unless it is perhaps because an
ordinary memory is obviously two dimensional and this one is not quite three dimensional.
In a real circuit, the wires are cleverly laid out so that they go around, not through, flip-
flops, unlike our schematic diagram.
2.5D memory organization is almost always used on real memory chips today because the
savings in wiring and gates is so dramatic. Real computers use a combination of banks of
memory units, and each memory unit uses 2.5D organization.
Auxiliary Memory
Auxiliary memory is the lowest-cost, highest-capacity, and slowest-access storage in a
computer system. It is where programs and data are kept for long-term storage or when not in
immediate use. The most common examples of auxiliary memories are magnetic tapes and magnetic
disks.
Magnetic Disks
A magnetic disk is a type of memory constructed using a circular plate of metal or plastic coated with
magnetized materials. Usually, both sides of the disks are used to carry out read/write operations.
However, several disks may be stacked on one spindle with read/write head available on each
surface.
The following image shows the structural representation for a magnetic disk.
o The memory bits are stored in the magnetized surface in spots along the concentric circles
called tracks.
o The concentric circles (tracks) are commonly divided into sections called sectors.
Magnetic Tape
Magnetic tape is a storage medium that allows data archiving, collection, and backup for different
kinds of data. The magnetic tape is constructed using a plastic strip coated with a magnetic recording
medium.
The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or nine bits
are recorded simultaneously to form a character together with a parity bit.
Magnetic tape units can be halted, started to move forward or in reverse, or can be rewound.
However, they cannot be started or stopped fast enough between individual characters. For this
reason, information is recorded in blocks referred to as records.