
VIRTUAL MEMORY

 Virtual memory is a technique that allows the execution of processes that are not completely in memory. One major advantage of this scheme is that programs can be larger than physical memory.
 Further, virtual memory abstracts main memory into an extremely large, uniform array of storage, separating logical memory as viewed by the user from physical memory. This technique frees programmers from the concerns of memory-storage limitations.
 Virtual memory also allows processes to share files easily and to implement shared memory. In addition, it provides an efficient mechanism for process creation.
 The memory-management algorithms discussed earlier avoid memory fragmentation by breaking a process's memory requirements down into smaller pieces (pages) and storing the pages non-contiguously in memory. However, the entire process still had to be stored in memory somewhere.
 The requirement that instructions must be in physical memory to be executed seems both necessary and reasonable; but it is also unfortunate, since it limits the size of a program to the size of physical memory.
 In practice, most real processes do not need all their pages, or at least not all at once, for several reasons:
ꙮ Error-handling code is not needed unless that specific error occurs, and some errors are quite rare.
ꙮ Arrays are often over-sized for worst-case scenarios, and only a small fraction of each array is actually used in practice.
ꙮ Certain features of certain programs are rarely used, such as the routine to balance the federal budget.
The ability to execute a program that is only partially in memory would confer many benefits:
 A program would no longer be constrained by the amount of physical memory that is available. Users would be able to write programs for an extremely large virtual address space, simplifying the programming task.
 Because each user program could take less physical memory, more programs could be run at the same time, with a corresponding increase in CPU utilization and throughput but with no increase in response time or turnaround time.
 Less I/O would be needed to load or swap user programs into memory, so each user program would run faster.
Thus, running a program that is not entirely in memory would benefit both the system and the user.
Virtual memory involves the separation of logical memory as perceived by users from physical memory. This separation allows an extremely large virtual memory to be provided for programmers when only a smaller physical memory is available.
Virtual memory makes the task of programming much easier, because the programmer no longer needs to worry about the amount of physical memory available.
The virtual address space of a process refers to the logical (or
virtual) view of how a process is stored in memory.
 The actual physical layout is controlled by the
process's page table.
 The large blank space (or hole) between the
heap and the stack is part of the virtual address
space but will require actual physical pages
only if the heap or stack grows.
 Virtual address spaces that include holes are
known as sparse address spaces.
 Using a sparse address space is beneficial
because the holes can be filled as the stack or
heap segments grow or if we wish to
dynamically link libraries (or possibly other
shared objects) during program execution.
Virtual memory also allows the sharing of files and memory by multiple
processes, with several benefits:
• System libraries can be shared by mapping them into the virtual address
space of more than one process.
• Processes can also share virtual memory by mapping the same block of
memory to more than one process.
• Process pages can be shared during a fork( ) system call, eliminating the
need to copy all of the pages of the original (parent) process.
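A minimal sketch (not from the slides, assuming a POSIX-like system that supports MAP_ANONYMOUS) of the second bullet: two related processes sharing one block of memory through a shared anonymous mapping.

    /* Parent and child share one page mapped with MAP_SHARED. */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (shared == MAP_FAILED) return 1;
        *shared = 0;
        if (fork() == 0) {      /* child writes to the shared page */
            *shared = 42;
            _exit(0);
        }
        wait(NULL);
        printf("parent reads %d\n", *shared);   /* prints 42 */
        munmap(shared, sizeof(int));
        return 0;
    }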
Demand Paging
• The basic idea behind demand paging is that when a process is swapped in, its pages are not swapped in all at once. Rather, they are swapped in only when the process needs them (on demand). This is termed a lazy swapper, although a pager is a more accurate term.
• Pages that are never accessed are thus never loaded into physical memory. A demand-paging system is similar to a paging system with swapping, where processes reside in secondary memory.
 A lazy swapper never swaps a page into memory unless that page will be needed.
 A swapper manipulates entire processes, whereas a pager is concerned with the individual pages of a process.
 The basic idea behind paging is that when a process is
swapped in, the pager only loads into memory those
pages that it expects the process to need (right away.)
 Pages that are not loaded into memory are marked as
invalid in the page table, using the invalid bit.
 The rest of the page table entry may either be blank or
contain information about where to find the swapped-out
page on the hard drive.
 If the process only ever accesses pages that are loaded in memory (memory-resident pages), then the process runs exactly as if all the pages were loaded into memory.
Page table when some pages are not in main memory.
Access to a page marked invalid causes a page fault. The procedure for handling this page fault is straightforward (a small code sketch follows the list):
1. The memory address requested is first checked, to make sure it was
a valid memory request.
2. If the reference was invalid, the process is terminated. Otherwise,
the page must be paged in.
3. A free frame is located, possibly from a free-frame list.
4. A disk operation is scheduled to bring in the necessary page from
disk. (This will usually block the process on an I/O wait, allowing
some other process to use the CPU in the meantime.)
5. When the I/O operation is complete, the process's page table is
updated with the new frame number, and the invalid bit is changed
to indicate that this is now a valid page reference.
6. The instruction that caused the page fault must now be restarted from the beginning (as soon as this process gets another turn on the CPU).
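The following is a small user-space model of that flow (not from the slides; a real handler lives in the kernel, and every structure and helper name here is hypothetical). Replacement when no frame is free is deferred to the page-replacement section.

    #include <stdio.h>
    #include <stdlib.h>

    #define NPAGES  8
    #define NFRAMES 4

    typedef struct { int frame; int valid; } pte_t;

    pte_t page_table[NPAGES];
    int   frames[NFRAMES];      /* frames[f] = page held, or -1 if free */
    int   disk[NPAGES];         /* simulated backing store */

    int handle_fault(int page) {          /* steps 3-5 of the list */
        for (int f = 0; f < NFRAMES; f++) {
            if (frames[f] == -1) {        /* free frame found */
                frames[f] = page;         /* "disk read" into the frame */
                page_table[page].frame = f;
                page_table[page].valid = 1;   /* mark the entry valid */
                return f;
            }
        }
        return -1;                        /* no free frame: replacement needed */
    }

    int access(int page) {
        if (page < 0 || page >= NPAGES) { /* steps 1-2: invalid -> terminate */
            fprintf(stderr, "invalid reference\n");
            exit(1);
        }
        if (!page_table[page].valid && handle_fault(page) < 0) {
            fprintf(stderr, "out of frames\n");
            exit(1);
        }
        return disk[page];                /* step 6: restart the access */
    }

    int main(void) {
        for (int f = 0; f < NFRAMES; f++) frames[f] = -1;
        for (int p = 0; p < NPAGES; p++) disk[p] = p * 100;
        printf("page 3 -> %d (after one fault)\n", access(3));
        printf("page 3 -> %d (now a hit)\n", access(3));
        return 0;
    }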
• In the extreme case, we can start executing a process with no
pages in memory. NO pages are swapped in for a process until
they are requested by page faults. This is known as pure
demand paging.
• In theory each instruction could generate multiple page faults. In
practice this is very rare, due to locality of reference, which
results in reasonable performance from demand paging.
The hardware to support demand paging is the same as the
hardware for paging and swapping:
1. Page table: This table has the ability to mark an entry invalid
through a valid–invalid bit or a special value of protection bits.
2. Secondary memory: This memory holds those pages that are not
present in main memory. The secondary memory is usually a high-
speed disk. It is known as the swap device, and the section of disk
used for this purpose is known as swap space.
 A crucial requirement for demand paging is the ability to restart any instruction from scratch once the desired page has been made available in memory (after a page fault).
 For most simple instructions this is not a major difficulty.
However there are some architectures that allow a single
instruction to modify a fairly large block of data, (which
may span a page boundary), and if some of the data gets
modified before the page fault occurs, this could cause
problems.
 Paging is added between the CPU and the memory in a
computer system and should be entirely transparent to the
user process.
 People often assume that paging can be added to any
system. Although this assumption is true for a non-demand-
paging environment, where a page fault represents a fatal
error, it is not true where a page fault means only that an
additional page must be brought into memory and the
process restarted.
 As long as we have no page faults, the effective access time
is equal to the memory access time.
 Let p be the probability of a page fault (0 ≤ p ≤ 1). We would expect p to be close to zero—that is, we would expect to have only a few page faults. The effective access time is then

effective access time = (1 − p) × ma + p × page-fault time,

where ma is the memory-access time.
A page fault causes the following sequence to occur:
1. Trap to the operating system.
2. Save the user registers and process state.
3. Determine that the interrupt was a page fault.
4. Check that the page reference was legal and determine the location of the page
on the disk.
5. Issue a read from the disk to a free frame:
a. Wait in a queue for this device until the read request is serviced.
b. Wait for the device seek and/or latency time.
c. Begin the transfer of the page to a free frame.
6. While waiting, allocate the CPU to some other user (CPU scheduling, optional).
7. Receive an interrupt from the disk I/O subsystem (I/O completed).
8. Save the registers and process state for the other user (if step 6 is executed).
9. Determine that the interrupt was from the disk.
10. Correct the page table and other tables to show that the desired page is now in
memory.
11. Wait for the CPU to be allocated to this process again.
12. Restore the user registers, process state, and new page table, and then resume the interrupted instruction.
 There are many steps that occur when servicing a page fault, and some of the steps are optional or variable. But suppose that a normal memory access requires 200 nanoseconds, and that servicing a page fault takes 8 milliseconds (8,000,000 nanoseconds, or 40,000 times a normal memory access). With a page-fault rate of p (on a scale from 0 to 1), the effective access time is now

(1 − p) × 200 + p × 8,000,000 = 200 + 7,999,800 × p

 which clearly depends heavily on p! Even if only one access in 1,000 causes a page fault, the effective access time rises from 200 nanoseconds to 8.2 microseconds, a slowdown by a factor of 40. To keep the slowdown below 10%, the page-fault rate must be less than 0.0000025, or one page fault in every 399,990 accesses.
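A quick numeric check of the formula (a sketch using the figures above):

    #include <stdio.h>

    int main(void) {
        double ma = 200.0, fault = 8e6;           /* nanoseconds */
        double rates[] = {0.001, 0.0000025};      /* 1 in 1,000 and 1 in 400,000 */
        for (int i = 0; i < 2; i++) {
            double p = rates[i];
            /* effective access time = (1 - p) * ma + p * page-fault time */
            printf("p = %.7f -> EAT = %.1f ns\n", p, (1 - p) * ma + p * fault);
        }
        return 0;   /* prints about 8199.8 ns and 220.0 ns */
    }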
• A subtlety is that swap space is faster to access than the regular file system, because it does not have to go through the whole directory structure. For this reason some systems will transfer an entire process from the file system to swap space before starting up the process, so that future paging all occurs from the (relatively) faster swap space.
• Some systems use demand paging directly from the file system for binary code (which never changes and hence never has to be written back out on a page-out operation), and reserve the swap space for data segments that must be stored. This approach is used by both Solaris and BSD Unix.
Copy-on-Write
 Copy-on-write is a technique that works by allowing the parent and child processes initially to share the same pages. These shared pages are marked as copy-on-write pages, meaning that if either process writes to a shared page, a copy of the shared page is created.
 The idea behind a copy-on-write fork is that the pages for a parent process do not have to be actually copied for the child until one or the other of the processes changes the page.
 They can be simply shared between the two processes in the meantime, with a bit set indicating that the page needs to be copied if it ever gets written to.
 This is a reasonable approach, since the child process usually issues an exec( ) system call immediately after the fork.
Before process 1 modifies page C.
After process 1 modifies page C.
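A minimal sketch (not from the slides, assuming a POSIX system) of the fork-then-exec pattern that makes copy-on-write pay off: because the child execs immediately, almost none of the parent's pages ever need to be copied.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();          /* child shares the parent's pages, marked COW */
        if (pid < 0) { perror("fork"); exit(1); }
        if (pid == 0) {
            /* Child execs right away, so the shared pages are replaced
             * wholesale and no COW copies are ever made for them. */
            execlp("ls", "ls", "-l", (char *)NULL);
            perror("execlp");        /* only reached if exec fails */
            _exit(1);
        }
        wait(NULL);                  /* parent waits for the child */
        return 0;
    }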
 Pages used to satisfy copy-on-write duplications are typically allocated using zero-fill-on-demand, meaning that their previous contents are zeroed out before the copy proceeds.
 Some systems provide an alternative to the fork( ) system call called a virtual memory fork, vfork( ). In this case the parent is suspended, and the child uses the parent's memory pages. This is very fast for process creation, but requires that the child not modify any of the shared memory pages before performing the exec( ) system call.
Page Replacement
 In order to make the most use of virtual memory, we load several processes into memory at the same time. Since we only load the pages that are actually needed by each process at any given time, there is room to load many more processes than if we had to load in each entire process.
 However, memory is also needed for other purposes (such as I/O buffering), and what happens if some process suddenly decides it needs more pages and there aren't any free frames available?
There are several possible solutions to consider:
1. Adjust the memory used by I/O buffering, etc., to free up some
frames for user processes. The decision of how to allocate
memory for I/O versus user processes is a complex one, yielding
different policies on different systems. (Some allocate a fixed
amount for I/O, and others let the I/O system contend for
memory along with everything else.)
2. Put the process requesting more pages into a wait queue until
some free frames become available.
3. Swap some process out of memory completely, freeing up its
page frames.
4. Find some page in memory that isn't being used right now, and swap just that page out to disk, freeing up a frame that can be allocated to the process requesting it. This is known as page replacement, and it is the most common solution. There are many different algorithms for page replacement.
Need for page replacement.
The previously discussed page-fault processing assumed that there
would be free frames available on the free-frame list. Now the page-
fault handling must be modified to free up a frame if necessary, as
follows:
1. Find the location of the desired page on the disk, either in
swap space or in the file system.
2. Find a free frame:
a) If there is a free frame, use it.
b) If there is no free frame, use a page-replacement algorithm to
select an existing frame to be replaced, known as the victim
frame.
c) Write the victim frame to disk. Change all related page tables
to indicate that this page is no longer in memory.
3. Read in the desired page and store it in the frame. Adjust all
related page and frame tables to indicate the change.
4. Restart the process that was waiting for this page.
Page replacement.
• Notice that, in step 2c, if no frames are free, two page transfers (one out and one in) are required. This situation effectively doubles the page-fault service time and increases the effective access time accordingly.
• This can be alleviated somewhat by assigning a modify bit, or dirty bit
to each page.
• When this scheme is used, each page or frame has a modify bit
associated with it in the hardware. The modify bit for a page is set by
the hardware whenever any byte in the page is written into, indicating
that the page has been modified.
• When we select a page for replacement, we examine its modify bit. If
the bit is set, we know that the page has been modified since it was read
in from the disk. In this case, we must write the page to the disk.
• If the modify bit is not set, however, the page has not been modified
since it was read into memory. In this case, we need not write the
memory page to the disk: it is already there.
• This scheme can significantly reduce the time required to service a page fault, since it reduces I/O time by one-half if the page has not been modified.
• There are two major requirements to implement a successful demand-paging system: a frame-allocation algorithm and a page-replacement algorithm. The former centers on how many frames are allocated to each process (and to other needs), and the latter deals with how to select a page for replacement when there are no free frames available.
• The overall goal in selecting and tuning these algorithms is to generate the fewest overall page faults. Because disk access is so slow relative to memory access, even slight improvements to these algorithms can yield large improvements in overall system performance.
• These algorithms are evaluated by running them on a particular string of memory references and computing the number of page faults. The string of memory references is called a reference string.
A reference string can be generated in one of three common ways:
1. Randomly generated, either evenly distributed or with some distribution
curve based on observed system behavior. This is the fastest and easiest
approach, but may not reflect real performance well, as it ignores locality of
reference.
2. Specifically designed sequences. These are useful for illustrating the
properties of comparative algorithms in published papers and textbooks.
3. Recorded memory references from a live system. This may be the best
approach, but the amount of data collected can be enormous, on the order of a
million addresses per second. The volume of collected data can be reduced by
making two important observations:
a) Only the page number that was accessed is relevant. The offset within that page
does not affect paging operations.
b) Successive accesses within the same page can be treated as a single page request,
because all requests after the first are guaranteed to be page hits. (Since there are
no intervening requests for other pages that could remove this page from the page
table.)
c) So, for example, if pages were of size 100 bytes, then the sequence of address requests (0100, 0432, 0101, 0612, 0634, 0688, 0132, 0038, 0420) would reduce to the page request string (1, 4, 1, 6, 1, 0, 4).
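A minimal sketch (not from the slides) of that reduction: keep only the page number of each address and collapse consecutive references to the same page, which are guaranteed hits. The 100-byte page size matches the example above.

    #include <stdio.h>

    int main(void) {
        int trace[] = {100, 432, 101, 612, 634, 688, 132, 38, 420};
        int n = sizeof trace / sizeof trace[0];
        int page_size = 100;
        int prev = -1;
        for (int i = 0; i < n; i++) {
            int page = trace[i] / page_size;   /* drop the offset */
            if (page != prev)                  /* collapse repeat references */
                printf("%d ", page);
            prev = page;
        }
        printf("\n");                          /* prints: 1 4 1 6 1 0 4 */
        return 0;
    }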
As the number of available frames increases, the number of page faults should decrease. (Figure: a graph of page faults versus the number of frames.)
FIFO Page Replacement
 The simplest page-replacement algorithm is a first-in,
first-out (FIFO) algorithm.
 A FIFO replacement algorithm associates with each
page the time when that page was brought into memory.
When a page must be replaced, the oldest page is
chosen.
 A FIFO queue can be created to hold all pages in
memory. We replace the page at the head of the queue.
When a page is brought into memory, we insert it at the
tail of the queue.
For example, let's use the following reference string for a memory with three frames:
7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Initially, our three frames are empty. The first three references (7, 0, 1) cause page faults and are brought into these empty frames.
In this example, the 20 page requests result in 15 page faults.
• Although FIFO is simple and easy, it is not always optimal, or even efficient.
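A minimal sketch (not from the slides) of FIFO replacement over this reference string, counting the page faults:

    #include <stdio.h>

    #define FRAMES 3

    int main(void) {
        int ref[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof ref / sizeof ref[0];
        int frame[FRAMES] = {-1, -1, -1};   /* -1 marks an empty frame */
        int next = 0;                       /* index of the oldest frame */
        int faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < FRAMES; j++)
                if (frame[j] == ref[i]) { hit = 1; break; }
            if (!hit) {
                frame[next] = ref[i];       /* evict the oldest page */
                next = (next + 1) % FRAMES;
                faults++;
            }
        }
        printf("FIFO page faults: %d\n", faults);  /* 15 for this string */
        return 0;
    }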
• An interesting effect that can occur with FIFO is Belady's anomaly, in which increasing the number of frames available can actually increase the number of page faults that occur!
• Consider the following reference string:
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Notice that the number of faults for four frames (ten) is greater than the number of faults for three frames (nine)!
Optimal Page Replacement
 The discovery of Belady's anomaly led to the search for an optimal page-replacement algorithm, which is simply the one that yields the lowest of all possible page-fault rates and does not suffer from Belady's anomaly.
 Such an algorithm does exist, and it is called OPT or MIN. This algorithm is simply "Replace the page that will not be used for the longest time in the future."
 Use of this page-replacement algorithm guarantees the lowest possible page-fault rate for a fixed number of frames.
Let's use the same reference string for a memory with three frames.
7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
⁂ By applying OPT, the minimum number of possible page faults is 9.
⁂ Since 6 of the page faults are unavoidable (the first reference to each new page), FIFO incurs three times as many extra page faults (9) as the optimal algorithm (3).
⁂ Unfortunately, OPT cannot be implemented in practice, because it requires foretelling the future, but it makes a nice benchmark for the comparison and evaluation of real proposed new algorithms.
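A minimal sketch (not from the slides) of OPT: on a fault with full frames, evict the page whose next use lies farthest in the future (or never occurs). It is runnable only offline, since it needs the whole reference string in advance.

    #include <stdio.h>

    #define FRAMES 3

    int main(void) {
        int ref[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof ref / sizeof ref[0];
        int frame[FRAMES] = {-1, -1, -1};
        int faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0, empty = -1;
            for (int j = 0; j < FRAMES; j++) {
                if (frame[j] == ref[i]) { hit = 1; break; }
                if (frame[j] == -1) empty = j;
            }
            if (hit) continue;
            faults++;
            if (empty >= 0) { frame[empty] = ref[i]; continue; }
            int victim = 0, farthest = -1;
            for (int j = 0; j < FRAMES; j++) {     /* next use of frame[j] */
                int next = n;                      /* n means "never again" */
                for (int k = i + 1; k < n; k++)
                    if (ref[k] == frame[j]) { next = k; break; }
                if (next > farthest) { farthest = next; victim = j; }
            }
            frame[victim] = ref[i];                /* evict the farthest use */
        }
        printf("OPT page faults: %d\n", faults);   /* 9 for this string */
        return 0;
    }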
LRU Page Replacement
 The prediction behind LRU, the Least Recently Used algorithm, is that the page that has not been used in the longest time is the one that will not be used again in the near future.
 The distinction between FIFO and LRU: the former looks at the oldest load time, and the latter looks at the oldest use time.
 LRU is analogous to OPT, except looking backwards in time instead of forwards. OPT has the interesting property that for any reference string S and its reverse R, OPT will generate the same number of page faults for S and for R. It turns out that LRU has this same property.
Let's use the same reference string for a memory with three frames.
7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
⁂ LRU yields 12 page faults for our sample string, as compared to 15 for FIFO and 9 for OPT.
⁂ The LRU policy is often used as a page-replacement algorithm and is considered to be good. The major problem is how to implement LRU replacement.
⁂ An LRU page-replacement algorithm may require substantial hardware assistance. The problem is to determine an order for the frames defined by the time of last use.
There are two simple approaches commonly used:
1. Counters: Every memory access increments a counter, and the current value of this counter is stored in the page-table entry for that page. Finding the LRU page then involves simply searching the table for the page with the smallest counter value. Note that overflow of the counter must be considered.
2. Stack: Another approach is to use a stack, and whenever a page is accessed, pull that page from the middle of the stack and place it on the top. The LRU page will always be at the bottom of the stack. Because this requires removing entries from the middle of the stack, a doubly linked list is the recommended data structure.
• Note that both implementations of LRU require hardware support, either for incrementing the counter or for managing the stack, as these operations must be performed for every memory access.
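A minimal sketch (not from the slides) of the counter scheme above: each frame records the "time" of its last use, and the victim is the frame with the smallest counter.

    #include <stdio.h>

    #define FRAMES 3

    int main(void) {
        int ref[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof ref / sizeof ref[0];
        int frame[FRAMES], last_use[FRAMES];
        int faults = 0;

        for (int j = 0; j < FRAMES; j++) { frame[j] = -1; last_use[j] = -1; }

        for (int t = 0; t < n; t++) {
            int hit = -1;
            for (int j = 0; j < FRAMES; j++)
                if (frame[j] == ref[t]) { hit = j; break; }
            if (hit >= 0) {
                last_use[hit] = t;              /* refresh the use time */
            } else {
                int victim = 0;                 /* frame with oldest use time */
                for (int j = 1; j < FRAMES; j++)
                    if (last_use[j] < last_use[victim]) victim = j;
                frame[victim] = ref[t];
                last_use[victim] = t;
                faults++;
            }
        }
        printf("LRU page faults: %d\n", faults); /* 12 for this string */
        return 0;
    }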
 Neither LRU nor OPT exhibits Belady's anomaly. Both belong to a class of page-replacement algorithms called stack algorithms, which can never exhibit Belady's anomaly.
 A stack algorithm is one in which the pages kept in memory for a frame set of size N will always be a subset of the pages kept for a frame set of size N + 1.
 In the case of LRU (and particularly the stack implementation thereof), the top N pages of the stack will be the same for all frame set sizes of N or anything larger.
LRU-Approximation Page Replacement
⁂ Unfortunately, a full implementation of LRU requires hardware support, and few systems provide the full hardware support necessary.
⁂ In particular, many systems provide a reference bit for every entry in a page table, which is set anytime that page is accessed. Initially all bits are set to zero, and they can also all be cleared at any time. One bit of precision is enough to distinguish pages that have been accessed since the last clear from those that have not, but it does not provide any finer grain of detail.
Additional-Reference-Bits Algorithm
 Finer grain is possible by storing the most recent 8 reference bits for each page
in an 8-bit byte in the page table entry, which is interpreted as an unsigned int.
 At periodic intervals (clock interrupts), the OS takes over, and right-shifts each
of the reference bytes by one bit.
 The high-order (leftmost) bit is then filled in with the current value of the
reference bit, and the reference bits are cleared.
 At any given time, the page with the smallest value for the reference byte is the LRU page.
Obviously, the specific number of bits used and the frequency with which the reference byte is updated are adjustable, and they are tuned to give the fastest performance on a given hardware platform.
E.g., the page with reference bits 11000100 has been used more recently than the page with reference bits 01110111, since 11000100 is the larger value when the bytes are interpreted as unsigned integers.
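A minimal sketch (not from the slides) of the additional-reference-bits update: at each timer interrupt, shift the history byte right and place the hardware reference bit in the high-order position. The reference histories below are hypothetical.

    #include <stdint.h>
    #include <stdio.h>

    void tick(uint8_t *history, int ref_bit) {
        /* shift right; newest bit enters at the top */
        *history = (uint8_t)((*history >> 1) | (ref_bit ? 0x80 : 0x00));
        /* a real OS would also clear the hardware reference bit here */
    }

    int main(void) {
        uint8_t a = 0, b = 0;
        int refs_a[] = {1,1,0,0,0,1,0,0};   /* hypothetical per-interval use */
        int refs_b[] = {0,1,1,1,0,1,1,1};
        for (int i = 0; i < 8; i++) { tick(&a, refs_a[i]); tick(&b, refs_b[i]); }
        /* smaller value = less recently used = better victim */
        printf("a=0x%02X b=0x%02X -> evict %s\n", a, b, a < b ? "a" : "b");
        return 0;
    }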
Second-Chance Algorithm
The second chance algorithm is essentially a FIFO, except the reference
bit is used to give pages a second chance at staying in the page table.
When a page must be replaced, the page table is scanned in a FIFO
(circular queue) manner.
If a page is found with its reference bit not set, then that page is selected
as the next victim.
If, however, the next page in the FIFO does have its reference bit set,
then it is given a second chance:
 The reference bit is cleared, and the FIFO search continues.
 If some other page is found that did not have its reference bit set, then
that page will be selected as the victim, and this page ( the one being
given the second chance ) will be allowed to stay in the page table.
 If, however, there are no other pages that do not have their reference bit set, then this page will be selected as the victim when the FIFO search circles back around to this page on the second pass.
 If all reference bits in the table are set, then second chance degrades to FIFO, but it also requires a complete search of the table for every page replacement.
 As long as there are some pages whose reference bits are not set, any page referenced frequently enough gets to stay in the page table indefinitely.
 This algorithm is also known as the clock algorithm, from the hands of the clock moving around the circular queue.
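A minimal sketch (not from the slides) of second-chance replacement: the hand scans circularly, and a set reference bit buys a page one more pass. One assumption here is that the faulting access sets the reference bit of the newly loaded page.

    #include <stdio.h>

    #define FRAMES 3

    int frame[FRAMES];      /* page held by each frame, -1 if empty */
    int ref_bit[FRAMES];    /* reference bit per frame */
    int hand = 0;           /* the clock hand */

    int choose_victim(void) {
        for (;;) {
            if (ref_bit[hand] == 0) {           /* no second chance left */
                int victim = hand;
                hand = (hand + 1) % FRAMES;
                return victim;
            }
            ref_bit[hand] = 0;                  /* give a second chance */
            hand = (hand + 1) % FRAMES;
        }
    }

    int main(void) {
        for (int j = 0; j < FRAMES; j++) { frame[j] = -1; ref_bit[j] = 0; }
        int ref[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof ref / sizeof ref[0], faults = 0;
        for (int i = 0; i < n; i++) {
            int hit = -1;
            for (int j = 0; j < FRAMES; j++)
                if (frame[j] == ref[i]) { hit = j; break; }
            if (hit >= 0) { ref_bit[hit] = 1; continue; }
            int v = choose_victim();
            frame[v] = ref[i];
            ref_bit[v] = 1;     /* assumed: loading access references the page */
            faults++;
        }
        printf("Second-chance page faults: %d\n", faults);
        return 0;
    }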
Enhanced Second-Chance Algorithm
The enhanced second-chance algorithm looks at the reference bit and the modify bit (dirty bit) as an ordered pair, and classifies pages into one of four classes:
1. (0, 0) neither recently used nor modified - the best page to replace.
2. (0, 1) not recently used but modified - the page must be written out before replacement.
3. (1, 0) recently used but clean - probably will be used again soon.
4. (1, 1) recently used and modified - probably will be used again soon, and the page must be written out first.
⁂ This algorithm searches the page table in a circular fashion (in as many as four passes), looking for the first page it can find in the lowest-numbered class. I.e., it first makes a pass looking for a (0, 0), and then if it can't find one, it makes another pass looking for a (0, 1), etc.
⁂ The main difference between this algorithm and the previous one is the preference for replacing clean pages if possible.
Counting-Based Page Replacement
There are several algorithms based on counting the number of
references that have been made to a given page, such as:
⁂ Least Frequently Used, LFU: Replace the page with the lowest
reference count. A problem can occur if a page is used frequently
initially and then not used any more, as the reference count remains
high. A solution to this problem is to right-shift the counters
periodically, yielding a time-decaying average reference count.
⁂ Most Frequently Used, MFU: Replace the page with the highest
reference count. The logic behind this idea is that pages that have
already been referenced a lot have been in the system a long time,
and we are probably done with them, whereas pages referenced only
a few times have only recently been loaded, and we still need them.
⁂ In general, counting-based algorithms are not commonly used, as their implementation is expensive and they do not approximate OPT well.
Page-Buffering Algorithms
⁂ There are a number of page-buffering algorithms that can be used in conjunction with the aforementioned algorithms, to improve overall performance and sometimes make up for inherent weaknesses in the hardware and/or the underlying page-replacement algorithms:
 Maintain a certain minimum number of free frames at all times. When a page-
fault occurs, go ahead and allocate one of the free frames from the free list
first, to get the requesting process up and running again as quickly as
possible, and then select a victim page to write to disk and free up a frame as a
second step.
 Keep a list of modified pages, and when the I/O system is otherwise idle, have
it write these pages out to disk, and then clear the modify bits, thereby
increasing the chance of finding a "clean" page for the next potential victim.
 Keep a pool of free frames, but remember which page was in each frame before it was freed. Since the data in the page is not actually cleared out when the page is freed, it can be made an active page again without having to load in any new data from disk. This is useful when an algorithm mistakenly replaces a page that is in fact needed again soon.
Allocation of Frames
 There are two important tasks in virtual memory management: a page-replacement strategy and a frame-allocation strategy.
 Minimum Number of Frames
 Allocation Algorithms
 Global versus Local Allocation
 Non-Uniform Memory Access
Minimum Number of Frames
⁂ The absolute minimum number of frames that a process must be
allocated is dependent on system architecture, and corresponds to the
worst-case scenario of the number of pages that could be touched by a
single (machine) instruction.
⁂ If an instruction (and its operands) spans a page boundary, then multiple
pages could be needed just for the instruction fetch.
⁂ Memory references in an instruction touch more pages, and if those
memory locations can span page boundaries, then multiple pages could
be needed for operand access also.
⁂ The worst case involves indirect addressing, particularly where multiple levels of indirect addressing are allowed. For this reason, architectures place a limit on the number of levels of indirection allowed in an instruction, which is enforced with a counter initialized to the limit and decremented with every level of indirection in an instruction; if the counter reaches zero, an "excessive indirection" trap occurs.
Allocation Algorithms
⁂ Equal Allocation - If there are m frames available and n processes
to share them, each process gets m / n frames, and the leftovers are
kept in a free-frame buffer pool.
⁂ Proportional Allocation - Allocate the frames proportionally to the size of the process, relative to the total size of all processes. So if the size of process i is S_i, and S is the sum of all S_i, then the allocation for process P_i is a_i = m × S_i / S.
⁂ Variations on proportional allocation could consider the priority of each process rather than just its size.
⁂ Obviously all allocations fluctuate over time as the number of available free frames, m, fluctuates, and all are also subject to the constraints of minimum allocation. (If the minimum allocations cannot be met, then processes must either be swapped out or not allowed to start until more free frames become available.)
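A small worked example of proportional allocation (the frame count and process sizes below are assumed for illustration): 62 free frames shared by two processes of 10 and 127 pages.

    #include <stdio.h>

    int main(void) {
        int m = 62;                       /* available frames */
        int s[] = {10, 127};              /* process sizes, in pages */
        int S = s[0] + s[1];              /* total size of all processes */
        for (int i = 0; i < 2; i++)
            printf("P%d gets %d frames\n", i, m * s[i] / S);
        /* prints 4 and 57; the leftover frame goes to the free pool */
        return 0;
    }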
Global versus Local Allocation
⁂ One big question is whether frame allocation (page
replacement) occurs on a local or global level.
⁂ With local replacement, the number of pages allocated to a
process is fixed, and page replacement occurs only amongst the
pages allocated to this process.
⁂ With global replacement, any page may be a potential victim,
whether it currently belongs to the process seeking a free frame
or not.
⁂ Local page replacement allows processes to better control their
own page fault rates, and leads to more consistent performance
of a given process over different system load levels.
⁂ Global page replacement is overall more efficient, and is the
more commonly used approach.
Non-Uniform Memory Access
⁂ The above arguments all assume that all memory is equivalent, or at least
has equivalent access times.
⁂ This may not be the case in multiple-processor systems, especially where
each CPU is physically located on a separate circuit board which also holds
some portion of the overall system memory.
⁂ In these latter systems, CPUs can access memory that is physically located
on the same board much faster than the memory on the other boards.
⁂ The basic solution is akin to processor affinity: at the same time that we try to schedule processes on the same CPU to minimize cache misses, we also try to allocate memory for those processes on the same boards, to minimize access times.
⁂ The presence of threads complicates the picture, especially when the
threads get loaded onto different processors.
⁂ Solaris uses an lgroup as a solution, in a hierarchical fashion based on
relative latency. For example, all processors and RAM on a single board
would probably be in the same lgroup. Memory assignments are made
within the same lgroup if possible, or to the next nearest lgroup otherwise.
Thrashing
 If a process cannot maintain its minimum required number of frames, then it must be swapped out, freeing up frames for other processes. This is the job of the intermediate level of CPU scheduling (the medium-term scheduler).
 But what about a process that can keep its minimum, but
cannot keep all of the frames that it is currently using on
a regular basis? In this case it is forced to page out pages
that it will need again in the very near future, leading to
large numbers of page faults.
 A process that is spending more time paging than
executing is said to be thrashing.
Cause of Thrashing
⁂ Early process scheduling schemes would control the level of
multiprogramming allowed based on CPU utilization, adding in more
processes when CPU utilization was low.
⁂ The problem is that when memory filled up and processes started spending lots of time waiting for their pages to page in, CPU utilization would drop, causing the scheduler to add in even more processes and exacerbating the problem! Eventually the system would essentially grind to a halt.
⁂ Local page-replacement policies can prevent one thrashing process from taking pages away from other processes, but a thrashing process still tends to clog up the I/O queue, thereby slowing down any other process that needs to do even a little bit of paging (or any other I/O, for that matter).
 To prevent thrashing we must
provide processes with as many
frames as they really need "right
now", but how do we know what
that is?
 The locality model notes that
processes typically access
memory references in a given
locality, making lots of references
to the same general area of
memory before moving
periodically to a new locality.
 If we could just keep as many frames as are involved in the current locality, then page faulting would occur primarily on switches from one locality to another.
Working-Set Model
⁂ The working-set model is based on the concept of locality, and defines a working-set window of length Δ (delta). Whatever pages are included in the most recent Δ page references are said to be in the process's working-set window, and they comprise its current working set.
 The selection of Δ is critical to the success of the working-set model - if it is too small, then it does not encompass all of the pages of the current locality, and if it is too large, then it encompasses pages that are no longer being frequently accessed.
 The total demand, D, is the sum of the sizes of the working sets for all processes. If D exceeds the total number of available frames, then at least one process is thrashing, because there are not enough frames available to satisfy its minimum working set. If D is significantly less than the currently available frames, then additional processes can be launched.
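A minimal sketch (not from the slides) of the definition: the working-set size at time t is the number of distinct pages among the most recent Δ references. The reference string below is hypothetical.

    #include <stdio.h>

    #define DELTA 10

    int main(void) {
        int ref[] = {1,2,1,5,7,7,7,7,5,1,6,2,3,4,1,2,3,4,4,4,3,4,3,4,4,4};
        int n = sizeof ref / sizeof ref[0];
        for (int t = DELTA - 1; t < n; t += 5) {   /* sample a few instants */
            int seen[16] = {0}, wss = 0;
            for (int k = t - DELTA + 1; k <= t; k++)
                if (!seen[ref[k]]) { seen[ref[k]] = 1; wss++; }
            printf("t=%2d  WSS=%d\n", t, wss);     /* distinct pages in window */
        }
        return 0;
    }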
⁂ The hard part of the working-set model is keeping track of what
pages are in the current working set, since every reference adds one
to the set and removes one older page. An approximation can be
made using reference bits and a timer that goes off after a set interval
of memory references:
 For example, suppose that we set the timer to go off after every
5000 references (by any process), and we can store two
additional historical reference bits in addition to the current
reference bit.
 Every time the timer goes off, the current reference bit is copied
to one of the two historical bits, and then cleared.
 If any of the three bits is set, then that page was referenced within the last 15,000 references and is considered to be in that process's working set.
 Finer resolution can be achieved with more historical bits and a more frequent timer, at the expense of greater overhead.
Page-Fault Frequency
⁂ A more direct approach is to recognize that what we really want to control
is the page-fault rate, and to allocate frames based on this directly
measurable value. If the page-fault rate exceeds a certain upper bound then
that process needs more frames, and if it is below a given lower bound,
then it can afford to give up some of its frames to other processes.