Ca 5
MEMORY ORGANIZATION
Contents
Memory organization:
• Memory hierarchy
• Main memory
• Auxiliary memory
• Associative memory
• Cache memory
• Virtual memory
Memory Hierarchy
• At the bottom of the hierarchy are the relatively slow magnetic tapes used to store
removable files. Next are the magnetic disks used as backup storage. The main memory
occupies a central position by being able to communicate directly with the CPU and with
auxiliary memory devices through an I/O processor.
• When programs not residing in main memory are needed by the CPU, they are brought
in from auxiliary memory. Programs not currently needed in main memory are
transferred into auxiliary memory to provide space for currently used programs and data.
• A special very-high-speed memory called a cache is sometimes used to increase the speed
of processing by making current programs and data available to the CPU at a rapid rate.
• The cache memory is employed in computer systems to compensate for the speed
differential between main memory access time and processor logic. CPU logic is usually
faster than main memory access time, with the result that processing speed is limited
primarily by the speed of main memory.
Associative Memory
• The key register provides a mask for choosing a particular field or key in the argument
word. The entire argument is compared with each memory word if the key register
contains all 1's. Otherwise, only those bits in the argument that have 1's in their
corresponding position of the key register are compared.
• As a numerical example, suppose that the argument register A and the key register K
have the bit configuration shown below. Only the three leftmost bits of A are
compared with memory words because K has 1's in these positions.
• Word 2 matches the unmasked argument field because the three leftmost bits of the
argument and the word are equal.
• The relation between the memory array and external registers in an associative
memory is shown in Fig.
• The cells in the array are marked by the letter C with two subscripts. The first
subscript gives the word number and the second specifies the bit position in the word.
• Thus cell C_ij is the cell for bit j in word i. A bit A_j in the argument register is compared
with all the bits in column j of the array provided that K_j = 1. This is done for all columns
j = 1, 2, . . . , n.
• If a match occurs between all the unmasked bits of the argument and the bits in word i,
the corresponding bit M_i in the match register is set to 1. If one or more unmasked bits of
the argument and the word do not match, M_i is cleared to 0.
Match Logic
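• A word matches the argument only if every pair of compared bits is equal. The match
signal for a word therefore constitutes the AND operation of all pairs of matched bits
in that word.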
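The masked comparison described above can be sketched in a few lines of Python. This is only an illustration, not the hardware match logic itself: the function name `match_register` and the 9-bit values for A, K, and the two stored words are assumptions chosen so that K has 1's in its three leftmost positions, as in the example in the text.

```python
def match_register(argument, key, words):
    """Return one match bit per word: 1 if every unmasked bit agrees."""
    matches = []
    for word in words:
        # XOR marks differing bits; AND with the key keeps only compared bits.
        differing = (argument ^ word) & key
        matches.append(1 if differing == 0 else 0)
    return matches


# Illustrative 9-bit values (assumed for this sketch).
A = 0b101111100          # argument register
K = 0b111000000          # key register: compare the three leftmost bits only
words = [0b100111100,    # word 1: leftmost bits 100, no match
         0b101000001]    # word 2: leftmost bits 101, match

print(match_register(A, K, words))  # [0, 1]
```

A key of all 1's compares the entire argument with each word, and a key of all 0's masks every bit, so every word matches.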
Cache Memory
• If the active portions of the program and data are placed in a fast small memory, the
average memory access time can be reduced, thus reducing the total execution time of the
program. Such a fast small memory is referred to as a cache memory. It is placed between
the CPU and main memory as illustrated in Fig.
• The cache memory access time is less than the access time of main memory by a factor of
5 to 10. The cache is the fastest component in the memory hierarchy and approaches the
speed of CPU components.
• The fundamental idea of cache organization is that by keeping the most frequently
accessed instructions and data in the fast cache memory, the average memory access time
will approach the access time of the cache.
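A rough way to see why the average access time approaches the cache access time is to weight the two access times by how often a reference is found in the cache. The sketch below assumes a simple model in which a miss costs only the main memory access time. The name `hit_ratio` and the times of 100 ns and 1000 ns (a factor of 10, within the range quoted above) are illustrative assumptions, not figures from the slides.

```python
def average_access_time(hit_ratio, cache_time, main_time):
    """Weighted average of cache and main memory access times.

    Assumed model: a fraction hit_ratio of references is served by the
    cache, the rest by main memory.
    """
    return hit_ratio * cache_time + (1.0 - hit_ratio) * main_time


CACHE_NS = 100.0    # assumed cache access time
MAIN_NS = 1000.0    # assumed main memory access time

for h in (0.5, 0.9, 0.99):
    print(f"hit ratio {h}: {average_access_time(h, CACHE_NS, MAIN_NS):.1f} ns")
```

As the hit ratio moves toward 1, the result moves toward the cache access time.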
Figure: Addressing relationships between main and cache memories.
• In the general case, there are 2^k words in cache memory and 2^n
words in main memory. The n-bit memory address is divided into
two fields: k bits for the index field and n - k bits for the tag field.
The direct mapping cache organization uses the n-bit address to
access the main memory and the k-bit index to access the cache.
• Each word in cache consists of the data word and its associated tag.
When a new word is first brought into the cache, the tag bits are
stored alongside the data bits. When the CPU generates a memory
request, the index field is used for the address to access the cache.
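The tag and index split can be sketched as a small direct-mapped cache model. The class name `DirectMappedCache`, the sizes n = 15 and k = 9, and the hit/miss handling (compare the stored tag, fetch from main memory on a mismatch) are assumptions added for illustration. The text above describes only the address fields and the use of the index to access the cache.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: 2**k_bits words, each holding (tag, data)."""

    def __init__(self, n_bits, k_bits, main_memory):
        assert len(main_memory) == 1 << n_bits
        self.k_bits = k_bits
        self.index_mask = (1 << k_bits) - 1
        self.main_memory = main_memory
        self.lines = [None] * (1 << k_bits)   # None marks an empty word

    def split(self, address):
        """Split an n-bit address into (tag, index)."""
        return address >> self.k_bits, address & self.index_mask

    def read(self, address):
        """Return (data, hit). On a miss the word is loaded with its tag."""
        tag, index = self.split(address)
        line = self.lines[index]
        if line is not None and line[0] == tag:
            return line[1], True
        data = self.main_memory[address]
        self.lines[index] = (tag, data)
        return data, False


N_BITS, K_BITS = 15, 9                     # assumed sizes for the sketch
memory = list(range(1 << N_BITS))          # each word holds its own address
cache = DirectMappedCache(N_BITS, K_BITS, memory)

print(cache.split(0o02777))   # (2, 511): tag 02, index 777 in octal
print(cache.read(0o02777))    # (1535, False): first access misses
print(cache.read(0o02777))    # (1535, True): second access hits
```

Two addresses with the same index but different tags, such as octal 01777 and 02777 here, compete for the same cache word.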
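write-through:
• The first procedure is called the write-through method. In this
method main memory is updated with every memory write operation.
This has the advantage that main memory always contains the same
data as the cache.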
write-back:
• The second procedure is called the write-back method. In this
method only the cache location is updated during a write operation.
The location is then marked by a flag so that later when the word is
removed from the cache it is copied into main memory.
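The difference between the write policies can be sketched as follows. The names `CacheLine`, `write_through`, `write_back`, and `evict` are hypothetical, and write-through is modeled in the usual way, with main memory updated on every write. The `dirty` attribute plays the role of the flag mentioned above.

```python
class CacheLine:
    """One cache word plus the flag used by the write-back method."""

    def __init__(self, data):
        self.data = data
        self.dirty = False


def write_through(line, address, value, main_memory):
    """Update the cache word and main memory on every write."""
    line.data = value
    main_memory[address] = value


def write_back(line, value):
    """Update only the cache word and mark it for a later copy."""
    line.data = value
    line.dirty = True


def evict(line, address, main_memory):
    """On removal from the cache, copy a flagged word to main memory."""
    if line.dirty:
        main_memory[address] = line.data
        line.dirty = False


main_memory = {0x10: 1, 0x20: 2}

a = CacheLine(main_memory[0x10])
write_through(a, 0x10, 99, main_memory)
print(main_memory[0x10])   # 99: memory already matches the cache

b = CacheLine(main_memory[0x20])
write_back(b, 77)
print(main_memory[0x20])   # 2: memory is stale until eviction
evict(b, 0x20, main_memory)
print(main_memory[0x20])   # 77
```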
Virtual Memory
• Virtual memory is a concept used in some large computer systems
that permits the user to construct programs as though a large memory
space were available, equal to the totality of auxiliary memory. Each
address that is referenced by the CPU goes through an address
mapping from the so-called virtual address to a physical address in
main memory.
• Virtual memory is used to give programmers the illusion that they
have a very large memory at their disposal, even though the
computer actually has a relatively small main memory. A virtual
memory system provides a mechanism for translating program-
generated addresses into correct main memory locations.
Address Space And Memory Space
• An address used by a programmer will be called a virtual address, and
the set of such addresses the address space. An address in main
memory is called a location or physical address. The set of such
locations is called the memory space.
• Thus the address space is the set of addresses generated by programs
as they reference instructions and data; the memory space consists of
the actual main memory locations directly addressable for processing.
• As an illustration, consider a computer with a main-memory capacity
of 32K words (K = 1024). Fifteen bits are needed to specify a physical
address in memory since 32K = 2^15.
• Suppose that the computer has available auxiliary memory for storing
2^20 = 1024K words. Denoting the address space by N and the
memory space by M, we then have for this example N = 1024K and M
= 32K.
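The address widths in this example follow directly from the sizes of the two spaces. A minimal check, where the helper name `address_bits` is an assumption:

```python
K = 1024


def address_bits(words):
    """Number of address bits for a memory of `words` words (power of two)."""
    bits = words.bit_length() - 1
    if 1 << bits != words:
        raise ValueError("size must be a power of two")
    return bits


M = 32 * K      # memory space
N = 1024 * K    # address space

print(address_bits(M))   # 15
print(address_bits(N))   # 20
```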
Figure: Relation between address and memory space in a virtual memory system.
• In a virtual memory system, programmers are told that they have the
total address space at their disposal. Moreover, the address field of the
instruction code has a sufficient number of bits to specify all virtual
addresses.
• In our example, the address field of an instruction code will consist of
20 bits, but physical memory addresses must be specified with only 15
bits. Thus the CPU will reference instructions and data with a 20-bit
address.
• A table is needed to map a virtual address of 20 bits to a physical
address of 15 bits. The mapping is a dynamic operation, which means
that every address is translated immediately as a word is referenced
by the CPU. The mapping table may be stored in a separate memory,
as shown in Fig., or in main memory.
Figure: Address space and memory space split into groups of 1K words.
• The organization of the memory mapping table in a paged system is shown in
Fig. The memory-page table consists of eight words, one for each page. The
address in the page table denotes the page number and the content of the word
gives the block number where that page is stored in main memory. The table
shows that pages 1, 2, 5 and 6 are now available in main memory in blocks 3, 0,
1, and 2, respectively.
• A presence bit in each location indicates whether the page has been transferred
from auxiliary memory into main memory. A 0 in the presence bit indicates that
this page is not available in main memory. The CPU references a word in
memory with a virtual address of 13 bits. The three high-order bits of the virtual
address specify a page number and also an address for the memory-page table.
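The table lookup described above can be sketched in Python. The page-to-block entries are the ones listed in the text. The 10-bit offset within a page follows from the 1K-word groups and the 13-bit virtual address. The names `translate`, `PageFault`, and `OFFSET_BITS` are assumptions made for this sketch.

```python
PAGE_BITS = 3       # three high-order bits select the page
OFFSET_BITS = 10    # 1K words per page

# One entry per page: (presence bit, block number).
PAGE_TABLE = [
    (0, None),  # page 0
    (1, 3),     # page 1 -> block 3
    (1, 0),     # page 2 -> block 0
    (0, None),  # page 3
    (0, None),  # page 4
    (1, 1),     # page 5 -> block 1
    (1, 2),     # page 6 -> block 2
    (0, None),  # page 7
]


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


def translate(virtual_address):
    """Map a 13-bit virtual address to a physical address."""
    if not 0 <= virtual_address < 1 << (PAGE_BITS + OFFSET_BITS):
        raise ValueError("virtual address out of range")
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    present, block = PAGE_TABLE[page]
    if not present:
        raise PageFault(f"page {page} is not in main memory")
    return (block << OFFSET_BITS) | offset


print(translate((1 << OFFSET_BITS) | 5))   # 3077: block 3, offset 5
```

A reference to a page whose presence bit is 0 (page 0, for instance) raises `PageFault` in this sketch. In a real system the page would then be brought in from auxiliary memory.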
Virtual Memory
• Characteristics
- Standard von Neumann machine
- Instructions and data are stored in memory
- One operation at a time
- Parallel processing is achieved by means of multiple functional units or by
pipelining.
• Limitations
- Von Neumann bottleneck
- Maximum speed of the system is limited by the Memory Bandwidth
- Limitation on Memory Bandwidth
- Memory is shared by CPU and I/O
Parallel Processing
MISD Computer System
Characteristics
- Only one copy of the program exists
- A single controller executes one instruction at a time
MIMD Computer System
Characteristics
Speedup
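Sk: Speedup of a k-stage pipeline over a non-pipelined system
Sk = (n * tn) / ((k + n - 1) * tp)
where n = number of tasks, tn = time per task without pipelining,
k = number of stages, tp = clock period of one stage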
Pipeline Speedup
Example
- 4-stage pipeline
- Suboperation in each stage: tp = 20 ns
- 100 tasks to be executed
- 1 task in non-pipelined system: tn = 4 * 20 = 80 ns
Pipelined System
(k + n - 1) * tp = (4 + 99) * 20 = 2060 ns
Non-Pipelined System
n * k * tp = 100 * 80 = 8000 ns
Speedup
Sk = 8000 / 2060 = 3.88
A 4-stage pipeline is basically equivalent to a system
with 4 identical functional units
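The arithmetic of this example can be reproduced with a short script. The function names are assumptions, and the values of k, n, and tp are the ones from the example, with tn = k * tp as used there.

```python
def pipelined_time(k, n, tp):
    """Time for n tasks on a k-stage pipeline with clock period tp."""
    return (k + n - 1) * tp


def non_pipelined_time(n, tn):
    """Time for n tasks when each task takes tn without pipelining."""
    return n * tn


def speedup(k, n, tp, tn):
    return non_pipelined_time(n, tn) / pipelined_time(k, n, tp)


k, n, tp = 4, 100, 20      # stages, tasks, clock period in ns
tn = k * tp                # 80 ns per task without pipelining

print(pipelined_time(k, n, tp))           # 2060
print(non_pipelined_time(n, tn))          # 8000
print(round(speedup(k, n, tp, tn), 2))    # 3.88
```

Under the assumption tn = k * tp, the speedup n * k / (k + n - 1) approaches k as the number of tasks grows. This is why the 4-stage pipeline is compared with 4 identical functional units.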
General Pipeline