Hardware Support For Virtual Memory: CS2100 - Computer Organization

This document discusses hardware support for virtual memory through address translation. It explains that virtual memory allows programs to run in their own virtual address space through the use of page tables that map virtual addresses to physical addresses. Translation lookaside buffers (TLBs) cache these mappings to enable fast address translation and avoid costly page table lookups on every memory access. TLB misses may result in a page fault if the requested page is not in main memory, which requires servicing from the operating system.

Uploaded by

amanda

CS2100 Computer Organization

Hardware support for virtual memory

Review: The Memory Hierarchy

Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology

[Figure: the memory hierarchy, with the unit of transfer between adjacent levels]
Processor
    4-8 bytes (word)
L1$
    8-32 bytes (block)
L2$
    1 to 4 blocks
Main Memory
    1,024+ bytes (disk sector = page)
Secondary Memory

Distance from the processor, measured in access time, increases down the hierarchy, as does the (relative) size of the memory at each level

Inclusive: what is in L1$ is a subset of what is in L2$, which is a subset of what is in MM, which is a subset of what is in SM

Virtual Memory

Use main memory as a cache for secondary memory

Allows efficient and safe sharing of memory among multiple programs

Provides the ability to easily run programs larger than the size of physical memory

Simplifies loading a program for execution by providing for code relocation (i.e., the code can be loaded anywhere in main memory)

What makes it work? Again, the Principle of Locality

A program is likely to access a relatively small portion of its address space during any period of time

Each program is compiled into its own address space: a virtual address space

During run-time each virtual address must be translated to a physical address (an address in main memory)

A physically addressed machine

Using physical addressing

All programs share one address space: the physical address space

Machine language programs must be aware of the machine organization

No way to prevent a program from accessing any machine resource

The solution: virtual addressing

User programs run in a standardized virtual address space

Address translation hardware, managed by the operating system (OS), maps virtual addresses to physical memory

Hardware supports modern OS features: protection, translation, sharing

Two Programs Sharing Physical Memory

A program's address space is divided into pages (all one fixed size) or segments (variable sizes)

The starting location of each page (either in main memory or in secondary memory) is contained in the program's page table

[Figure: the virtual address spaces of Program 1 and Program 2, each mapped into main memory]

Address Translation

A virtual address is translated to a physical address by a combination of hardware and software

[Figure: address translation]
Virtual Address (VA):  bits 31-12 virtual page number | bits 11-0 page offset
        Translation (virtual page number -> physical page number; the page offset is unchanged)
Physical Address (PA): bits 29-12 physical page number | bits 11-0 page offset

So each memory request first requires an address translation from the virtual space to the physical space

A virtual memory miss (i.e., when the page is not in physical memory) is called a page fault
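As a rough illustration of splitting a virtual address into a virtual page number and a page offset, the Python sketch below assumes 4 KB pages (a 12-bit offset), matching the bit fields in the figure. The function names and the example mapping from VPN 0x403 to PPN 0x01F are hypothetical.

```python
PAGE_OFFSET_BITS = 12                      # assumed: 4 KB pages, offset in bits 11..0
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

def split_virtual_address(va):
    """Return (virtual page number, page offset) for a 32-bit VA."""
    vpn = va >> PAGE_OFFSET_BITS           # bits 31..12
    offset = va & (PAGE_SIZE - 1)          # bits 11..0
    return vpn, offset

def make_physical_address(ppn, offset):
    """Concatenate a physical page number with the unchanged page offset."""
    return (ppn << PAGE_OFFSET_BITS) | offset

vpn, offset = split_virtual_address(0x00403ABC)
# Hypothetical mapping: suppose translation maps VPN 0x403 to PPN 0x01F.
pa = make_physical_address(0x01F, offset)
```

Only the page number changes during translation; the low 12 bits are copied straight through.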

MIPS R4000: Address Space Model

Address translation on MIPS R4400

MIPS R4000: Who's Running on the CPU?

Address Translation Mechanisms


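[Figure: the virtual page # indexes the page table (in main memory). Each entry holds a valid bit V and a physical page base address. The physical page # found there is combined with the offset to form the physical address. Entries with V = 1 point to pages in main memory; entries with V = 0 refer to pages on disk storage.]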
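A minimal sketch of a page table lookup with a valid bit, assuming 4 KB pages and a flat table indexed by virtual page number. The table contents, the `PageFault` exception, and the "disk:17" location string are hypothetical and only illustrate the idea.

```python
PAGE_OFFSET_BITS = 12  # assumed: 4 KB pages

class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""

# Hypothetical page table, indexed by virtual page number.
# Each entry is (valid bit, physical page number or disk location).
page_table = [
    (1, 0x2A),        # VPN 0: resident in physical page 0x2A
    (0, "disk:17"),   # VPN 1: not resident, lives on disk
    (1, 0x05),        # VPN 2: resident in physical page 0x05
]

def translate(va):
    """Translate a virtual address, or raise PageFault if V = 0."""
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, location = page_table[vpn]
    if not valid:
        raise PageFault(f"VPN {vpn} is at {location}")
    return (location << PAGE_OFFSET_BITS) | offset
```

In a real system the fault would transfer control to the OS, which loads the page and retries the access; here it is only signalled.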

Page tables may not fit in memory!
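A back-of-the-envelope calculation suggests why. The sketch below assumes a flat (single-level) page table with one entry per virtual page; the entry sizes (4 and 8 bytes) and the 48-bit case are assumptions for illustration, not figures from the slides.

```python
def flat_page_table_bytes(va_bits, page_offset_bits, entry_bytes):
    """Size of a single-level page table with one entry per virtual page."""
    num_entries = 1 << (va_bits - page_offset_bits)
    return num_entries * entry_bytes

# 32-bit VA, 4 KB pages, assumed 4-byte entries: 2**20 entries * 4 B = 4 MiB per program
size_32 = flat_page_table_bytes(32, 12, 4)

# 48-bit VA, 4 KB pages, assumed 8-byte entries: 2**36 entries * 8 B = 512 GiB
size_48 = flat_page_table_bytes(48, 12, 8)
```

With one such table per running program, even the 32-bit case adds up quickly.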


Virtual Addressing with a Cache

Thus it takes an extra memory access to translate a VA to a PA

[Figure: the CPU issues a VA; translation produces a PA, which accesses the cache. A hit returns data to the CPU; a miss goes to main memory.]

This makes memory (cache) accesses very expensive (if every access was really two accesses)

The hardware fix is to use a Translation Lookaside Buffer (TLB): a small cache that keeps track of recently used address mappings to avoid having to do a page table lookup

Making Address Translation Fast


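[Figure: the virtual page # is first compared against the tags in the TLB. Each TLB entry holds a valid bit V, a tag, and a physical page base address. On a TLB miss, the page table (in physical memory) is consulted; its entries hold a valid bit V and a physical page base address, and point to pages in main memory or on disk storage.]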

Translation Lookaside Buffers (TLBs)

Just like any other cache, the TLB can be organized as fully associative, set associative, or direct mapped

TLB entry fields: V | Virtual Page # | Physical Page # | Dirty | Ref | Access
(Access: who is permitted to do what on this page, i.e., access rights)

TLB access time is typically smaller than cache access time (because TLBs are much smaller than caches)

TLBs are typically not more than 128 to 256 entries even on high-end machines
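A small sketch of a fully associative TLB, assuming true LRU replacement (real hardware usually only approximates LRU). The class names, the 4-entry default, and the string encoding of access rights are hypothetical; an entry's presence in the dictionary stands in for the valid bit.

```python
from collections import OrderedDict
from dataclasses import dataclass

@dataclass
class TLBEntry:
    ppn: int             # physical page number
    dirty: bool = False  # page has been written
    ref: bool = False    # page has been referenced
    access: str = "rw"   # hypothetical encoding of access rights

class TLB:
    """Fully associative TLB with LRU replacement."""

    def __init__(self, num_entries=4):
        self.num_entries = num_entries
        self.entries = OrderedDict()   # VPN (the tag) -> TLBEntry

    def lookup(self, vpn):
        entry = self.entries.get(vpn)
        if entry is None:
            return None                # TLB miss
        self.entries.move_to_end(vpn)  # mark as most recently used
        entry.ref = True
        return entry

    def insert(self, vpn, entry):
        if vpn not in self.entries and len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)   # evict the least recently used entry
        self.entries[vpn] = entry
        self.entries.move_to_end(vpn)
```

A set associative or direct mapped TLB would instead use some low-order bits of the virtual page number to select a set and compare only the tags in that set.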

A TLB in the Memory Hierarchy


[Figure: the CPU issues a VA to the TLB lookup. On a TLB hit the PA goes to the cache; on a TLB miss the translation is performed first. A cache hit returns data to the CPU; a cache miss goes to main memory.]

A TLB miss: is it a page fault or merely a TLB miss?

If the page is loaded into main memory, then the TLB miss can be handled (in hardware or software) by loading the translation information from the page table into the TLB
- Takes 10s of cycles to find and load the translation info into the TLB

If the page is not in main memory, then it's a true page fault
- Takes 1,000,000s of cycles to service a page fault

TLB misses are much more frequent than true page faults
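The distinction between a plain TLB miss and a true page fault can be sketched as follows. The cycle counts are assumed placeholders for "10s" and "1,000,000s" of cycles, the dictionaries are hypothetical stand-ins for the hardware and OS structures, and page replacement is not modelled.

```python
TLB_REFILL_CYCLES = 30          # assumed value for "10s of cycles"
PAGE_FAULT_CYCLES = 1_000_000   # assumed value for "1,000,000s of cycles"

def access(vpn, tlb, page_table, free_frames):
    """Return (event, extra cycles) for one reference to virtual page vpn.

    tlb and page_table map VPN -> PPN; page_table holds only resident pages.
    free_frames is a list of unused physical page numbers.
    """
    if vpn in tlb:
        return "TLB hit", 0
    if vpn in page_table:
        # Merely a TLB miss: the page is resident, so refill the TLB.
        tlb[vpn] = page_table[vpn]
        return "TLB miss", TLB_REFILL_CYCLES
    # True page fault: the OS loads the page into a free frame,
    # updates the page table, and the TLB is then refilled.
    page_table[vpn] = free_frames.pop()
    tlb[vpn] = page_table[vpn]
    return "page fault", PAGE_FAULT_CYCLES
```

For example, starting with an empty TLB and page 7 resident, a first reference to page 7 is a TLB miss and a second one is a TLB hit, while a first reference to a non-resident page is a page fault.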

Some Virtual Memory Design Parameters


                        Paged VM                     TLBs
Total size              16,000 to 250,000 words      16 to 512 entries
Total size (KB)         250,000 to 1,000,000,000     0.25 to 16
Block size (B)          4000 to 64,000               4 to 32
Miss penalty (clocks)   10,000,000 to 100,000,000    10 to 1000
Miss rates              0.00001% to 0.0001%          0.01% to 2%

Two Machines' Cache Parameters

TLB organization

Intel P4
- 1 TLB for instructions and 1 TLB for data
- Both 4-way set associative
- Both use ~LRU replacement
- Both have 128 entries
- TLB misses handled in hardware

AMD Opteron
- 2 TLBs for instructions and 2 TLBs for data
- Both L1 TLBs fully associative with ~LRU replacement
- Both L2 TLBs are 4-way set associative with round-robin LRU
- Both L1 TLBs have 40 entries
- Both L2 TLBs have 512 entries
- TLB misses handled in hardware

TLB Event Combinations


TLB    Page Table   Cache      Possible? Under what circumstances?
Hit    Hit          Hit        Yes: what we want!
Hit    Hit          Miss       Yes: although the page table is not checked if the TLB hits
Miss   Hit          Hit        Yes: TLB miss, PA in page table
Miss   Hit          Miss       Yes: TLB miss, PA in page table, but data not in cache
Miss   Miss         Miss       Yes: page fault
Hit    Miss         Miss/Hit   Impossible: TLB translation not possible if page is not present in memory
Miss   Miss         Hit        Impossible: data not allowed in cache if page is not in memory
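The table can also be read as two rules: a page that is not in memory can have neither a TLB entry nor cached data. The sketch below encodes those rules and enumerates all eight combinations; the function name and the wording of the returned reasons are illustrative only.

```python
from itertools import product

def classify(tlb_hit, page_table_hit, cache_hit):
    """Return (possible, reason) for one TLB / page table / cache outcome."""
    if not page_table_hit:
        if tlb_hit:
            return False, "TLB translation not possible if page is not in memory"
        if cache_hit:
            return False, "data not allowed in cache if page is not in memory"
        return True, "page fault"
    if tlb_hit:
        # The page table is not actually checked when the TLB hits.
        reason = "what we want" if cache_hit else "TLB hit, data not in cache"
        return True, reason
    reason = "TLB miss, PA in page table"
    if not cache_hit:
        reason += ", but data not in cache"
    return True, reason

# Collect every combination that can actually occur.
possible = [combo for combo in product([True, False], repeat=3)
            if classify(*combo)[0]]
```

Five of the eight combinations are possible; the table's "Miss/Hit" row covers two of the three impossible ones.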

The real thing: TLB translation on MIPS R4000


Reducing Translation Time

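Can overlap the cache access with the TLB access

Works when the high order bits of the VA are used to access the TLB while the low order bits are used as the index into the cache

[Figure: a 2-way associative cache. The VA is split into VA tag, index, and block offset. The index selects a tag/data pair in each way while the TLB translates the VA tag into a PA tag; the PA tag is compared (=) with both stored tags. A TLB hit together with a tag match gives a cache hit and the desired word.]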
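One common way to state the condition, added here as standard reasoning rather than something the slides spell out: the overlap is straightforward when the cache index and block offset fit entirely within the page offset, i.e., when each way of the cache is no larger than a page. The function name and the example sizes are hypothetical.

```python
def can_overlap_tlb_and_cache(cache_bytes, associativity, page_bytes):
    """True if the cache index and block offset come entirely from the page offset."""
    bytes_per_way = cache_bytes // associativity
    return bytes_per_way <= page_bytes

# 8 KB, 2-way cache with 4 KB pages: 4 KB per way, so the index bits are untranslated.
small_ok = can_overlap_tlb_and_cache(8 * 1024, 2, 4 * 1024)

# 32 KB, 2-way cache with 4 KB pages: 16 KB per way, so the index needs translated bits.
large_ok = can_overlap_tlb_and_cache(32 * 1024, 2, 4 * 1024)
```

Raising the associativity or the page size is one way to keep a larger cache within this limit.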

Why Not a Virtually Addressed Cache?

A virtually addressed cache would only require address translation on cache misses

[Figure: the CPU sends the VA directly to the cache; a hit returns data. Only on a miss is the VA translated to a PA to access main memory.]

But

Two different virtual addresses can map to the same physical address (when processes are sharing data), i.e., two different cache entries hold data for the same physical address: synonyms
- Must update all cache entries with the same physical address or the memory becomes inconsistent
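A toy demonstration of the synonym problem, working at page granularity for brevity. The mapping, the page contents, and the dictionary-based "cache" are all hypothetical; the point is only that a write through one virtual name leaves the other cached copy stale.

```python
# Hypothetical setup: two virtual pages map to the same physical page.
page_table = {0x10: 0x7, 0x20: 0x7}
memory = {0x7: "old"}              # physical page -> contents
virtual_cache = {}                 # cache indexed and tagged by virtual page

def read(vpn):
    if vpn not in virtual_cache:                   # translate only on a cache miss
        virtual_cache[vpn] = memory[page_table[vpn]]
    return virtual_cache[vpn]

def write(vpn, value):
    virtual_cache[vpn] = value                     # updates just this one cache entry

read(0x10)
read(0x20)                 # both synonyms are now cached separately
write(0x10, "new")
stale = read(0x20)         # the copy cached under the other virtual page is unchanged
```

A physically tagged cache avoids this because both virtual addresses resolve to the same tag, so there is only one cached copy.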

The Hardware/Software Boundary

What parts of the virtual to physical address translation are done by or assisted by the hardware?

Translation Lookaside Buffer (TLB) that caches the recent translations
- TLB access time is part of the cache hit time
- May allot an extra stage in the pipeline for TLB access

Page table storage, fault detection and updating
- Page faults result in interrupts (precise) that are then handled by the OS
- Hardware must support (i.e., update appropriately) Dirty and Reference bits (e.g., ~LRU) in the Page Tables

Disk placement
- Bootstrap (e.g., out of disk sector 0) so the system can service a limited number of page faults before the OS is even loaded

Summary

The Principle of Locality:

A program is likely to access a relatively small portion of the address space at any instant of time.
- Temporal Locality: Locality in Time
- Spatial Locality: Locality in Space

Caches, TLBs, and Virtual Memory are all understood by examining how they deal with the four questions
1. Where can a block be placed?
2. How is a block found?
3. What block is replaced on a miss?
4. How are writes handled?

Page tables map virtual addresses to physical addresses

TLBs are important for fast translation