
UNIT IV

MEMORY AND I/O ORGANIZATION
CONTENTS
• MEMORY HIERARCHY
• MEMORY CHIP ORGANIZATION
• CACHE MEMORY
• VIRTUAL MEMORY
• PARALLEL BUS ARCHITECTURES
• INTERNAL COMMUNICATION METHODOLOGIES
• SERIAL BUS ARCHITECTURES
• MASS STORAGE
• INPUT AND OUTPUT DEVICES
MEMORY HIERARCHY

[Figure: memory hierarchy pyramid; cost per bit increases toward the top, capacity increases toward the bottom]
MEMORY HIERARCHY
• The memory unit is an essential component in any digital computer, since it is needed for storing programs and data.
• Computer memory should be fast, large, and inexpensive.
• The memory hierarchy system consists of all storage devices employed in a computer system, from the slow but high-capacity auxiliary memory, to a relatively faster main memory, to an even smaller and faster cache memory.
MAIN MEMORY
• The memory unit that communicates directly with the CPU is called the main memory.
• The main memory occupies a central position in the hierarchy because it can communicate directly with the CPU.
• Main memory refers to physical memory that is internal to the computer.
STORAGE DEVICES
SECONDARY MEMORY
• Devices that provide backup storage are called secondary memory.
• Secondary memory devices communicate with the CPU through an I/O processor.
• Secondary memory access time is usually about 1000 times that of main memory.
SECONDARY MEMORY DEVICES
CACHE MEMORY
• Cache is a special, very-high-speed memory placed between the CPU and main memory.
CACHE MEMORY
• It is used to increase the speed of processing by making current programs and data available to the CPU at a rapid rate.
• The cache stores segments of programs currently being executed in the CPU and temporary data frequently needed in the present calculations.
• The typical access-time ratio between cache and main memory is about 1 to 7 (for example, roughly 1 ns for cache versus 7 ns for main memory).
Cache location
Two Levels
• Memory hierarchy with two, three and four
levels
THREE LEVELS
MEMORY TECHNOLOGY
• There are four primary technologies used in memory hierarchies today:
1. DRAM (dynamic random access memory), used for main memory
2. SRAM (static random access memory), used for caches
3. Flash memory
4. Magnetic disk
• The fourth technology, magnetic disk, is used to implement the largest and slowest level of the hierarchy in servers.
DRAM Technology: Dynamic
Random Access Memory
• Dynamic: must be refreshed periodically
• Volatile: loses data when power is removed
• DRAM is less costly per bit than SRAM,
although it is substantially slower.
• It provides higher density levels, so more data can be stored using DRAM.
Internal organization of DRAM
MEMORY CHIP ORGANIZATION
RAM CHIP
CACHE MEMORY
CACHE MAPPING FUNCTIONS
• Mapping functions are used to
determine/specify where memory blocks are
placed in cache.
• Determine how memory blocks are mapped
into cache lines.
1) Direct mapping
2) Associative Mapping
3) Set Associative Mapping
How is an address used to find the corresponding block in the cache?
• The address is divided into three fields:

TAG (20 bits) | INDEX (10 bits) | BLOCK OFFSET (2 bits)

• Tag field: compared with the tag stored in the selected cache block.
• Cache index field: used to select a particular word/block within the cache.
• The tag field of the address and the tag in the selected cache block are compared to determine whether the entry in the cache corresponds to the requested address.
• If they are equal and the valid bit is set, the request hits in the cache and the word is supplied to the processor.
• Otherwise, a cache miss occurs.
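As a sketch, the three fields can be extracted with shifts and masks. The 20/10/2 bit widths and the 32-bit address follow the example above; real caches derive these widths from cache size and block size, and the function name here is illustrative:

```python
TAG_BITS, INDEX_BITS, OFFSET_BITS = 20, 10, 2

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)                  # lowest 2 bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)   # next 10 bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                  # top 20 bits
    return tag, index, offset

tag, index, offset = split_address(0x12345678)
```

The index selects one cache line, and the stored tag for that line is then compared against `tag`.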
1. Direct Mapped Cache
• A cache structure in which each memory location is mapped to exactly one location in the cache.
• (i.e.) A memory block can go in exactly one location in the cache.
• For example, if a cache has 128 blocks:
MM blocks 0, 128, 256, … → cache block 0
MM blocks 1, 129, 257, … → cache block 1
MM blocks 2, 130, 258, … → cache block 2
Direct Mapping
• There is only one cache block where memory block 12 can be found.
• The position of a memory block is given by
(Block number) modulo (No. of blocks in the cache)
• Here, with 8 cache blocks, block 12 maps to 12 modulo 8 = 4.
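The placement rule is a one-liner (the helper name is illustrative):

```python
def direct_mapped_position(block_number, cache_blocks):
    # Position = (block number) modulo (number of blocks in the cache)
    return block_number % cache_blocks

# Block 12 in an 8-block cache lands in cache block 4
print(direct_mapped_position(12, 8))   # -> 4
```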
2. Fully Associative Mapping
• A block can be placed in any location in the
cache.
• A block in MM may be associated with any entry
in the cache.
• To find a given block all the entries in the cache
must be searched because a block can be placed
in any one.
• The tag bits of the address received from the processor are compared with the tag bits of every block in the cache to see if the required block is present.
Fully associative mapping
• Here the memory block with block address 12 can appear in any of the 8 cache blocks.
3. Set Associative Mapping
• Combination of Direct mapped & Fully associative
map.
• A block is directly mapped into a set, and then
all blocks in the set are searched for a match.
• There are a fixed number of locations (at least 2) where each block can be placed.
• A set-associative cache with n locations per set is called an "n-way set-associative cache".
• It consists of a number of sets, each consisting of n blocks.
• Each block in the main memory maps to a
unique set in the cache, given by the index
field and a block can be placed in any element
of that set.
• The set containing the memory block
is given by
(Block number) Modulo (No.of sets in the cache)

• Here, in a 2-way set-associative cache with 4 sets, memory block 12 must be in set 12 modulo 4 = 0.
CACHE HIT AND CACHE MISS
Calculating Average Memory Access
Time
• Find the AMAT for a processor with a 1 ns clock cycle time, a miss
penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and
a cache access time (including hit detection) of 1 clock cycle. Assume that
the read and write miss penalties are the same and ignore other write
stalls.
• Soln:
The average memory access time per instruction is

AMAT = Time for a hit + Miss rate × Miss penalty
     = 1 + (0.05 × 20)
     = 2 clock cycles, or 2 ns
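The calculation above can be checked with a small helper (a sketch; the names are illustrative):

```python
def amat(hit_time, miss_rate, miss_penalty):
    # AMAT = time for a hit + miss rate * miss penalty
    return hit_time + miss_rate * miss_penalty

# 1-cycle hit, 0.05 misses/access, 20-cycle penalty -> 2 cycles (2 ns)
print(amat(1, 0.05, 20))
```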
Measuring Cache Performance
• Calculating cache misses in all three mapping techniques.
Problem

Sequence of block addresses: 0, 8, 0, 6, 8
Cache of 4 blocks (2 sets of 2 in the set-associative case).
1. Direct-mapped method
Step 1: Determine the cache block:
(Block number) modulo (No. of blocks in the cache)
• Step 2: Fill in the cache contents after each reference.

• Colored entries are new entries.
• The first reference to a block is always a new entry (miss).
• In the 2nd row, block 0 is replaced with the new entry, block 8.

No. of cache misses = 5
2. Set-associative method
• Has 2 sets, with indices 0 and 1 (2 sets with 2 elements per set).
Step 1: Determine the cache set:
(Block number) modulo (No. of sets in the cache)

*Note: On a miss, replace the LRU (least recently used) block.

Step 2: Fill in the cache contents after each reference.

• Here, when block 6 is referenced it replaces block 8, since block 8 was less recently used than block 0.
• When block 8 is referenced again, it replaces block 0.

No. of cache misses = 4
3. Fully associative method
• Any memory block can go in any cache block, so fill in the cache contents directly.

No. of cache misses = 3

3 misses is the best we can do, because three unique block addresses are accessed, and the first reference to a block is always a miss.
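All three results (5, 4, and 3 misses) can be reproduced with one small LRU-based simulator, treating direct-mapped as 1-way and fully associative as n-way. This is a sketch assuming the 4-block cache used above:

```python
def count_misses(refs, num_blocks, ways):
    # ways=1 -> direct mapped; ways=num_blocks -> fully associative.
    # Each set is an LRU-ordered list, most recently used last.
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]
    misses = 0
    for block in refs:
        s = sets[block % num_sets]
        if block in s:
            s.remove(block)      # hit: move to most-recent position
        else:
            misses += 1
            if len(s) == ways:   # set full: evict least recently used
                s.pop(0)
        s.append(block)
    return misses

refs = [0, 8, 0, 6, 8]
print(count_misses(refs, 4, 1))  # direct mapped     -> 5
print(count_misses(refs, 4, 2))  # 2-way set assoc.  -> 4
print(count_misses(refs, 4, 4))  # fully associative -> 3
```

Note that direct-mapped needs no replacement policy (each set holds one block), so the LRU machinery only matters for the associative cases.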
Summary of the different mappings and their relative performance

CACHE TYPE          HIT RATIO                          SEARCH SPEED
Direct mapped       Good                               Best
Fully associative   Best                               Moderate
Set associative     Very good, better as N increases   Good, worse as N increases
Example: Cache Performance
• Assume the miss rate of an instruction cache is 2% and the
miss rate of the data cache is 4%. If a processor has a CPI of 2
without any memory stalls and the miss penalty is 100 cycles
for all misses, determine how much faster a processor would
run with a perfect cache that never missed. Assume the
frequency of all loads and stores is 36%.
Soln:
The number of memory miss cycles for instructions, in terms of the instruction count (I), is

Instruction miss cycles = I × 2% × 100 = 2.00 × I

As the frequency of all loads and stores is 36%, we can find the number of memory miss cycles for data references:

Data miss cycles = I × 36% × 4% × 100 = 1.44 × I

The total number of memory-stall cycles is

2.00 × I + 1.44 × I = 3.44 × I

• Accordingly, the total CPI including memory stalls is

2 + 3.44 = 5.44
• Since there is no change in instruction count or clock rate, the ratio of the CPU execution times is 5.44 / 2 = 2.72, so the processor with the perfect cache is 2.72 times faster.
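The arithmetic of this example, as a quick check (all values come from the problem statement; the variable names are illustrative):

```python
I_CACHE_MISS_RATE = 0.02   # instruction-cache miss rate
D_CACHE_MISS_RATE = 0.04   # data-cache miss rate
LOAD_STORE_FREQ = 0.36     # fraction of instructions that are loads/stores
MISS_PENALTY = 100         # cycles per miss
BASE_CPI = 2.0             # CPI with no memory stalls

instr_miss = I_CACHE_MISS_RATE * MISS_PENALTY                    # 2.00 / instr
data_miss = LOAD_STORE_FREQ * D_CACHE_MISS_RATE * MISS_PENALTY   # 1.44 / instr
cpi_with_stalls = BASE_CPI + instr_miss + data_miss              # 5.44
speedup = cpi_with_stalls / BASE_CPI                             # 2.72
print(speedup)
```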
VIRTUAL MEMORY
VIRTUAL MEMORY
• Physical memory is not large enough to fit some larger programs completely; programs can be larger than physical memory.
• Virtual memory is a technique that allows execution of processes that are not completely in main memory (MM).
• Part of the program is stored on secondary storage devices such as magnetic disks. The MM holds only the part of the program that is currently executing, and thus the MM can act as a CACHE for secondary storage.
• The OS moves the programs and data
between the MM and the Secondary storage.
They are brought into MM from SS when
needed.
Logical address: the address generated by the CPU at compile time, also called the virtual address.

Physical address: an address in the MM, i.e. the actual MM address, loaded into the PC at load time.
Memory Management Unit (MMU)
• Converts a logical (virtual) address into the actual physical address.

The binary addresses that the processor issues for data/instructions are called virtual or logical addresses. These are translated into physical addresses.
• If a virtual address refers to the data that is
currently in the Physical memory, then it is
accessed immediately from the MM.
• On the other hand, if the referenced address is not in main memory, its contents must be brought into a suitable location in main memory before they can be used.
Address Translation / Address
Mapping (VA to PA)
• The processor produces a virtual address, which is translated by a combination of hardware and software to a physical address, which in turn can be used to access main memory.
Address Translation
• The process by which a VA is mapped to an address used to access memory.
• Both VM and PM are broken into PAGES, so that a virtual page is mapped to a physical page.
• No. of bits in page offset → page size
• No. of bits in page number → no. of pages
• In the diagram above, the number of offset bits is 12:
– Page size = 2^12 = 4 KB
– No. of physical pages = 2^18
– No. of virtual pages = 2^20
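With a 12-bit offset, splitting a virtual address into page number and page offset is a shift and a mask. This sketch assumes 32-bit virtual addresses, matching the 2^20-page example above:

```python
OFFSET_BITS = 12                   # page size = 2**12 bytes = 4 KB

def split_virtual_address(va):
    page_number = va >> OFFSET_BITS               # upper 20 bits
    page_offset = va & ((1 << OFFSET_BITS) - 1)   # lower 12 bits
    return page_number, page_offset

# e.g. VA 0x12345678 -> page number 0x12345, offset 0x678
pn, off = split_virtual_address(0x12345678)
```

Translation replaces only the page number with a physical page number; the offset is carried over unchanged.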
Page Table
• The table containing the virtual to physical
address translations in a virtual memory
system.
• The table, which is stored in memory, is
typically indexed by the virtual page number.
• Each entry in the table contains the physical
page number for that virtual page if the page
is currently in memory.
• Each program has its own page table, which maps
the virtual address space of that program to main
memory.
• Page table register : To indicate the location of the
page table in memory, the hardware includes a
register that points to the start of the page table.

Contents of Virtual Page Corresponding


Pg.Table reg Number index entry in
Page Table
Page Fault (Page not in MM)
• A page fault occurs when a program generates an address request for a page that is not available in main memory.
• If the valid bit for a virtual page is off, a page fault occurs.
• The page must be brought from disk into MM.
• When a page fault occurs, if all the pages in main memory are in use, the operating system must choose a page to replace (page replacement), e.g. using LRU.
• Reference bit: set whenever a page is accessed.
Page Fault
Translation Look aside Buffer (TLB)
• A cache for the page table.
• TLB: a special cache that keeps track of recently used address translations/mappings.
• The TLB contains a subset of the translations in the page table.
• A copy of a small portion of the page table is kept in a cache called the TLB.
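A toy sketch of the TLB idea, modeled as a dictionary holding a subset of page-table entries. The page-table contents here are made-up illustrative data, and a real TLB is a small fixed-size hardware structure with a replacement policy, which this sketch omits:

```python
page_table = {0: 7, 1: 3, 2: 9}   # virtual page -> physical page (toy data)
tlb = {}                          # cache holding a subset of translations

def translate(virtual_page):
    if virtual_page in tlb:                      # TLB hit: no table access
        return tlb[virtual_page]
    physical_page = page_table[virtual_page]     # TLB miss: walk page table
    tlb[virtual_page] = physical_page            # cache the translation
    return physical_page
```

The first lookup of a page misses and consults the page table; repeated lookups of the same page hit in the TLB.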
TLB
BUS ARCHITECTURES
Address bus, data bus, control bus
• Sequence of block addresses:

0, 6, 8, 9, 0, 6, 6, 7, 0, 9, 6

• Cache blocks: 4
• Find the number of cache misses with each technique.
