Computer Architecture Unit 2
Memory Hierarchy Design
❖ The design of the memory hierarchy is divided into two types: primary
(Internal) memory and secondary (External) memory.
The memory hierarchy in computers mainly includes the following:
1)Registers:
• A register is usually static RAM (SRAM) inside the processor that holds a
data word, typically 64 or 128 bits.
• The program counter is the most important register and is found in all
processors.
• Most processors also use a status word register and an accumulator.
• The status word register is used for decision making, and the accumulator
is used to store data such as the results of arithmetic operations.
2)Cache Memory:
• Cache memory is usually located inside the processor, although it may
occasionally be a separate IC (Integrated Circuit); it is organized into levels.
• The cache holds chunks of frequently used data from main memory.
• A single-core processor typically has two cache levels, rarely more. Current
multi-core processors usually have three levels: two private levels per core
and one level shared among all cores.
3)Main Memory:
• Main memory is the memory unit that communicates directly with the
CPU. It is the main storage unit of the computer.
• It is a fast and relatively large memory used for storing data throughout
the operation of the computer. It is made up of RAM as well as ROM.
• Main memory (RAM) makes a multi-processing environment possible.
4)Magnetic Disks:
3)Access Time:
▪ The access time is the time interval between a request to read or write and
the moment the data becomes available.
▪ As we move from top to bottom in the memory hierarchy, the access time
increases.
4)Cost per bit:
▪ As we move from top to bottom in the memory hierarchy, the cost per bit
decreases; internal memory is expensive compared with external memory.
2)Coherence Property:
▪ Maintains consistency of data across multiple cache levels and
processors in a multi-core or distributed system.
▪ Ensures that changes made to a data item in one cache are propagated
to other caches holding the same data.
▪ Commonly managed through protocols like MESI (Modified, Exclusive,
Shared, Invalid) or MOESI (Modified, Owner, Exclusive, Shared, Invalid).
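As a loose illustration of how such a protocol keeps caches consistent, here is a deliberately simplified, assumed Python sketch of MESI transitions for a single cache line. Real protocols distinguish more bus events, and an I→E transition is possible when no other cache holds the line; that case is folded into S here.

```python
# Highly simplified MESI sketch (assumed, illustrative): how one cache line's
# state changes in response to local accesses and snooped bus events.
MESI_TRANSITIONS = {
    # (current state, event): next state
    ("I", "local_read"):   "S",   # line fetched; assume another cache may share it
    ("I", "local_write"):  "M",   # read for ownership, then modify
    ("S", "local_write"):  "M",   # upgrade: other shared copies are invalidated
    ("E", "local_write"):  "M",   # silent upgrade, no bus traffic needed
    ("M", "remote_read"):  "S",   # write back / supply data, keep a shared copy
    ("E", "remote_read"):  "S",
    ("M", "remote_write"): "I",   # another core wants to modify: invalidate
    ("E", "remote_write"): "I",
    ("S", "remote_write"): "I",
}

def next_state(state, event):
    """Return the next MESI state; unlisted (state, event) pairs leave it unchanged."""
    return MESI_TRANSITIONS.get((state, event), state)

print(next_state("S", "local_write"))    # -> 'M'
print(next_state("M", "remote_read"))    # -> 'S'
```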
3)Locality Property:
Locality refers to the tendency of a processor to access the same memory
locations repeatedly over a short period of time. It is the property that
makes cache memory effective.
There are two types of locality properties:
i)Temporal Locality:
▪ This property is related to time.
▪ It is the tendency of a program to access the same memory location
multiple times within a short period of time.
▪ If a word is referenced now, the same word is likely to be referenced again
in the near future.
▪ LRU replacement exploits temporal locality.
ii)Spatial Locality:
▪ This property is related to space (nearby addresses).
▪ It is the tendency of a program to access memory locations that are close
to recently accessed locations.
▪ If a word is referenced now, nearby words (such as the rest of the same
block) are likely to be referenced soon. A short sketch of both patterns is
given below.
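To make the two patterns concrete, here is a minimal, illustrative Python sketch. The array `data` and the loop shapes are assumed purely for illustration; actual locality effects depend on the hardware and the language runtime.

```python
# Minimal sketch of the two locality patterns (illustrative only).
data = list(range(1024))

# Temporal locality: the same location ('total') is touched on every iteration.
total = 0
for x in data:
    total += x                    # 'total' is re-read and re-written repeatedly

# Spatial locality: neighbouring locations are touched one after another, so a
# cache block fetched for data[i] also serves data[i+1], data[i+2], ...
block_sums = []
for i in range(0, len(data), 4):
    block_sums.append(data[i] + data[i + 1] + data[i + 2] + data[i + 3])
```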
Hit Latency: Hit latency is the time it takes to access a memory location that
is present in the cache.
Cache Miss: A cache miss occurs when the required data is not found in the
cache, forcing the CPU to retrieve it from the slower main memory.
CACHE MAPPING
Primary Terminologies
Some primary terminologies related to cache mapping are listed below:
• Main Memory Blocks: The main memory is divided into equal-sized
partitions called the main memory blocks.
• Cache Line: The cache is divided into equal partitions called the cache
lines.
• Block Size: The number of bytes or words in one block is called the
block size.
• Tag Bits: Tag bits are the identification bits that are used to identify
which block of main memory is present in the cache line.
• Number of Cache Lines: The number of cache lines is determined by
the ratio of cache size divided by the block or line size.
• Number of Cache Sets: The number of cache sets is determined by the
number of cache lines divided by the associativity of the cache.
Definition
Cache mapping is a technique that is used to bring the main memory content
to the cache, or to identify the cache block in which the required content is
present. In other words, it determines how a needed block (of words) of
main memory is placed into the cache lines and how, through the address,
the CPU identifies that data in the cache.
In direct mapping, the physical address is divided into three parts: tag bits,
cache line number and byte offset. The bits in the cache line number
represent the cache line in which the content is present, whereas the tag
bits are the identification bits that represent which block of main memory is
present in the cache. The bits in the byte offset decide in which byte of the
identified block the required content is present.
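The address split can be sketched in a few lines of Python; the block size, number of lines and sample address below are assumed for illustration only, and both sizes are taken to be powers of two.

```python
def split_direct_mapped(address, block_size, num_lines):
    """Split a physical address into (tag, line, offset) for a direct-mapped cache."""
    offset_bits = block_size.bit_length() - 1            # log2(block_size)
    line_bits = num_lines.bit_length() - 1                # log2(num_lines)
    offset = address & (block_size - 1)                   # byte within the block
    line = (address >> offset_bits) & (num_lines - 1)     # cache line number
    tag = address >> (offset_bits + line_bits)            # remaining high-order bits
    return tag, line, offset

# Assumed example: 64-byte blocks, 128 cache lines
print(split_direct_mapped(0x12A7C, 64, 128))              # -> (9, 41, 60)
```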
Advantages
▪ Fast: Direct mapping is fast because it only requires checking the single
line to which the block maps, rather than the entire cache.
▪ Simple: Direct mapping is simple to implement and check for cache hits
or misses.
Disadvantages
▪ Conflict misses: Direct mapping can lead to conflict misses, where a
cache block is replaced even if other blocks are empty.
In fully associative mapping, the address is divided into two parts: tag bits
and byte offset. The tag bits identify which memory block is present, and the
bits in the byte offset field decide in which byte of the block the required
content is present.
Advantages
▪ No Conflict Misses: Since a block can be placed anywhere in the cache,
there is no risk of a conflict miss occurring where multiple blocks map
to the same cache location.
▪ High hit rate: Due to the ability to place any block anywhere, the cache
is more likely to find the requested data, leading to a higher hit rate.
Disadvantages
▪ Increased comparison time: To find a specific block, the cache needs to
compare the tag of the searched block with every single cache line,
which can significantly increase search time.
▪ High implementation complexity: Implementing a fully associative
cache requires more complex hardware design compared to other
mapping techniques.
▪ Larger circuit area: Due to the need for more extensive comparison
logic, a fully associative cache may require a larger circuit area
compared to other options.
In set associative mapping, the cache lines are divided into sets. The address
is divided into three parts: tag bits, set number and byte offset. The bits in
the set number decide in which set of the cache the required block may be
present, and the tag bits identify which block of main memory is present.
The bits in the byte offset field give the byte of the block in which the
content is present.
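A small hedged sketch of how a set-associative lookup proceeds: the address is split into tag, set number and offset, and only the lines of the selected set are compared. The cache structure, sizes and sample address below are assumptions made purely for illustration.

```python
def lookup(cache, address, block_size, num_sets):
    """Look up an address in a set-associative cache.

    cache[set_index] is a list with one stored tag per way (None if empty).
    block_size and num_sets are assumed to be powers of two.
    """
    offset_bits = block_size.bit_length() - 1
    set_bits = num_sets.bit_length() - 1
    set_index = (address >> offset_bits) & (num_sets - 1)
    tag = address >> (offset_bits + set_bits)
    # Only the lines of one set are searched, not the entire cache.
    for way, stored_tag in enumerate(cache[set_index]):
        if stored_tag == tag:
            return ("hit", set_index, way)
    return ("miss", set_index, None)

# Assumed example: 4 sets, 2-way; a block with tag 0x9 resides in set 1, way 0.
cache = [[None, None] for _ in range(4)]
cache[1][0] = 0x9
address = (0x9 << 8) | (1 << 6) | 0x10    # tag 0x9, set 1, offset 0x10
print(lookup(cache, address, 64, 4))      # -> ('hit', 1, 0)
```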
Advantages
▪ Higher hit rate: Set associative mapping can have a higher hit rate for
the same cache size.
▪ Fewer conflict misses: Set associative mapping can have fewer conflict
misses.
▪ Reduced comparison time: Set associative mapping can have a reduced
comparison time compared to fully associative mapping.
▪ More flexible than direct mapping: Set associative mapping is more
flexible than direct mapping, but less complex than fully associative
mapping.
Disadvantages
▪ Expensive: Set associative mapping can be expensive due to the cost of
associative-comparison hardware.
▪ Conflict misses: Set associative mapping can still have conflict misses.
CACHE MISSES
A cache miss occurs when the required data is not available in the cache
memory. When the CPU detects a miss, it services the miss by fetching the
requested data from main memory.
➢ Use cache blocking: Divide data into smaller blocks so that each
block fits in the cache (see the sketch after this list).
➢ Optimize data structures: Optimize data structures to improve
spatial and temporal locality.
➢ Use different cache levels: Use different cache levels or
partitions to reduce cache contention.
➢ Adjust replacement policy: Adjust the cache replacement
policy to suit your workload.
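As a concrete illustration of cache blocking, here is a hedged Python sketch of a blocked (tiled) matrix transpose; the matrix, the tile size `block` and the function name are assumed purely for illustration, and the real benefit shows up in languages with flat array layouts.

```python
def transpose_blocked(a, n, block=64):
    """Transpose an n x n matrix one block x block tile at a time.

    Working on small tiles keeps both the source rows and the destination
    columns of the current tile resident in the cache.
    """
    out = [[0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for i in range(ii, min(ii + block, n)):
                for j in range(jj, min(jj + block, n)):
                    out[j][i] = a[i][j]
    return out

# Assumed example: transpose a 256 x 256 matrix using 64 x 64 tiles.
n = 256
a = [[i * n + j for j in range(n)] for i in range(n)]
t = transpose_blocked(a, n)
print(t[3][0] == a[0][3])   # -> True
```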
Virtual memory is commonly implemented using two techniques:
• Paging
• Segmentation
Paging
Paging divides memory into small fixed-size blocks called pages. When the
computer runs out of RAM, pages that aren’t currently in use are moved to
the hard drive, into an area called a swap file. The swap file acts as an
extension of RAM. When a page is needed again, it is swapped back into
RAM, a process known as page swapping. This ensures that the operating
system (OS) and applications have enough memory to run.
Demand Paging: The process of loading a page into memory on demand
(whenever a page fault occurs) is known as demand paging. The process
includes the following steps:
• If the CPU tries to refer to a page that is currently not available in the
main memory, it generates an interrupt indicating a memory access
fault (a page fault trap).
• The OS puts the interrupted process in a blocked state.
• The OS searches for the required page in secondary storage (the backing
store that holds the logical address space, LAS).
• The required page is brought from secondary storage into physical
memory. If no free frame is available, a page replacement algorithm is
used to decide which page in physical memory to replace.
• The page table is updated accordingly.
• A signal is sent to the CPU to continue program execution, and the OS
places the process back into the ready state.
Hence whenever a page fault occurs these steps are followed by the
operating system and the required page is brought into memory.
Page Fault Service Time: The time taken to service the page fault is called
page fault service time. The page fault service time includes the time taken
to perform all the above six steps.
Let the main memory access time be m, the page fault service time be s
(expressed in the same time unit as m, e.g. after converting milliseconds to
nanoseconds), and the page fault rate be p. Then:
Effective memory access time (EMAT) = p × s + (1 − p) × m
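A quick worked example using the formula; the figures below are assumed purely for illustration.

```python
# Effective memory access time for assumed figures:
m = 100           # main memory access time: 100 ns
s = 8_000_000     # page fault service time: 8 ms, converted to ns
p = 1e-6          # page fault rate: one fault per million accesses

emat = p * s + (1 - p) * m
print(f"EMAT = {emat:.2f} ns")    # -> EMAT = 108.00 ns (8 ns fault cost + ~100 ns)
```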
Segmentation
Segmentation divides virtual memory into segments of different sizes.
Segments that aren’t currently needed can be moved to the hard drive. The
system uses a segment table to keep track of each segment’s status,
including whether it’s in memory, if it’s been modified, and its physical
address. Segments are mapped into a process’s address space only when
needed.
Virtual Memory vs Physical Memory
▪ Definition:
  - Virtual memory: an abstraction that extends the available memory by
    using disk storage.
  - Physical memory: the actual hardware (RAM) that stores data and
    instructions currently being used by the CPU.
▪ Data Access:
  - Virtual memory: indirect (via paging and swapping).
  - Physical memory: direct (the CPU can access data directly).
What is Swapping?
Swapping a process out means removing all of its pages from memory, or
marking them so that they will be removed by the normal page replacement
process. Suspending a process ensures that it is not runnable while it is
swapped out. At some later time, the system swaps the process back from
secondary storage to main memory. When a process is busy swapping pages
in and out, the situation is called thrashing.
What is Thrashing?
At any given time, only a few pages of any process are in main memory, and
therefore more processes can be maintained in memory. Furthermore, time
is saved because unused pages are not swapped in and out of memory. In the
steady state, practically all of main memory will be occupied with process
pages, so that the processor and OS have direct access to as many processes
as possible. Thus, when the OS brings one page in, it must throw another
out. If it throws out a page just before it is used, it will have to fetch that
page again almost immediately. Too much of this leads to a condition called
thrashing: the system spends most of its time swapping pages rather than
executing instructions.
How non-contiguous memory allocation works:
1)A process can request multiple memory blocks from different locations
in the memory.
2)The available free memory space is scattered across the memory.
3)This technique reduces memory wastage caused by internal and
external fragmentation.
Types of Non-Contiguous Memory Allocation
1)Paging: Divides memory into fixed-sized blocks called "pages" and
allocates memory to processes by assigning them different combinations
of these pages, allowing for non-contiguous allocation.
2)Segmentation: Splits memory into variable-sized logical units called
"segments" where each segment can represent a different part of a
program (like code, data, stack).
3)Multilevel paging: A hierarchical approach to paging where the page
table is organized into multiple levels, reducing the memory required to
store the table itself.
4)Inverted paging: Instead of each process having its own page table, a
single page table is maintained for the entire system, where entries point
to the processes that are using that page.
5)Segmented Paging: Segmented paging essentially refers to a
combination of segmentation and paging, where memory is divided into
segments which are further divided into pages.
FIFO
FIFO in memory replacement stands for "First In, First Out." When a page
needs to be replaced in memory, the system always chooses the page that
has been in memory the longest (the oldest page), essentially treating
memory like a queue in which the first item added is the first to be removed.
It is considered a simple and straightforward page replacement algorithm.
Working: When a new page needs to be loaded into memory and there is no
space, the page that has been in memory the longest (the one at the front of
the queue) is removed to make room for the new page.
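A small hedged sketch of FIFO replacement written as a page-fault counter; the reference string and frame count are assumed example inputs (the same string is reused later for the optimal policy).

```python
from collections import deque

def fifo_page_faults(reference_string, num_frames):
    """Count page faults under FIFO replacement (illustrative sketch)."""
    frames = deque()                # oldest page sits at the left end
    resident = set()
    faults = 0
    for page in reference_string:
        if page in resident:
            continue                # hit: FIFO order is NOT updated on access
        faults += 1
        if len(frames) == num_frames:
            victim = frames.popleft()       # evict the page loaded earliest
            resident.remove(victim)
        frames.append(page)
        resident.add(page)
    return faults

# Assumed example: 12-reference string, 3 frames
print(fifo_page_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))   # -> 9 faults
```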
Advantages of FIFO:
Easy to implement: Due to its simple queue-based structure, FIFO is
relatively easy to implement in software.
Disadvantages of FIFO:
i)Can be inefficient: In many scenarios, FIFO might not be the optimal choice
as it doesn't consider how recently a page was accessed, leading to
potentially replacing frequently used pages.
ii)Belady's Anomaly: In some cases, increasing the number of available
memory frames can paradoxically lead to more page faults with FIFO.
MRU
MRU stands for "Most Recently Used" and refers to a memory replacement
policy in which the item that was accessed most recently is the one chosen
to be removed from the cache when space is needed for a new item. It is
essentially the opposite of the "Least Recently Used" (LRU) policy.
Functionality: When a cache becomes full and needs to make space for a
new item, the MRU policy identifies the data that was accessed most recently
and removes it from the cache.
Assumption: The MRU policy assumes that the item accessed most recently
is the least likely to be needed again soon, making it a suitable strategy
when data access patterns show weak temporal locality.
When might MRU be useful?
Cyclic or sequential access patterns: when a program repeatedly scans
through data that is larger than the cache (for example, looping over a large
file), the item just accessed will not be needed again for a long time, so
evicting it first is effective.
Limitations of MRU: Not suitable for all scenarios; if access patterns exhibit
strong temporal locality, where recently accessed items are likely to be
accessed again soon, MRU performs poorly and LRU is a better choice.
LRU
"LRU" in memory management stands for "Least Recently Used," a cache
replacement policy in which the block that has been least recently accessed
is replaced when the cache is full. This policy is based on the principle of
temporal locality, aiming to optimize cache performance by keeping
frequently used data in the cache.
Working:
1)LRU maintains a list of items in the cache, with the most recently used
items at the front.
2)When the cache is full, LRU removes the least recently used item to make
room for a new item.
3)The new item is added to the front of the list.
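A minimal sketch of this list-based mechanism in Python, using `collections.OrderedDict` to keep the most recently used keys at the end; the class name, capacity and sample keys are assumed for illustration.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache sketch: most recently used keys sit at the end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used key

# Assumed usage: capacity 2
cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")             # touching "a" makes "b" the least recently used
cache.put("c", 3)          # evicts "b"
print(list(cache.items))   # -> ['a', 'c']
```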
Advantages:
▪ Improved performance: LRU keeps frequently used data in fast
memory, which improves system performance
▪ Optimized memory usage: LRU keeps frequently used data in memory
and discards less frequently used data, which helps optimize memory
usage
▪ Simple implementation: LRU is easy to understand and implement, and
doesn't require complex data structures or algorithms
▪ Predictable eviction decisions: LRU focuses on recency of access,
making eviction decisions consistent
▪ Works well in many applications: LRU is used in many systems,
including web browsers and databases
Disadvantages
▪ It requires an additional data structure to be implemented.
▪ It needs significant hardware assistance.
▪ In LRU, error detection is difficult compared with other algorithms.
▪ It has limited applicability.
▪ LRU can be costly to operate.
Optimal
The "optimal memory replacement policy," also called the "Optimal Page
Replacement Algorithm" or the "MIN algorithm," is a theoretical strategy in
memory management that replaces the page which will not be accessed for
the longest time in the future, i.e. it selects the page with the furthest next
use based on perfect knowledge of future memory access patterns. This
policy is not practical because it requires knowing the future access
sequence, so it serves as a benchmark for evaluating other replacement
algorithms.
Functionality:
When a page needs to be replaced, the optimal policy selects the page that
will be accessed the furthest into the future, minimizing page faults.
Theoretical nature:
This policy is considered theoretical because it requires knowledge of future
memory accesses, which is not available to a real operating system at run
time.
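A hedged sketch of this selection rule as a page-fault counter, usable offline when the full reference string is known in advance; the reference string and frame count are the same assumed example used for FIFO above.

```python
def optimal_page_faults(reference_string, num_frames):
    """Count page faults under the theoretical optimal (MIN) policy."""
    frames = []
    faults = 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue                          # hit
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)               # a free frame is still available
            continue
        future = reference_string[i + 1:]
        # Evict the resident page whose next use lies furthest in the future
        # (pages that are never used again are the best victims of all).
        victim = max(frames,
                     key=lambda p: future.index(p) if p in future else float("inf"))
        frames[frames.index(victim)] = page
    return faults

# Same assumed reference string as the FIFO example, 3 frames
print(optimal_page_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))   # -> 7 faults
```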
Advantages:
▪ Minimizes page faults: Due to its ability to identify the page that will be
accessed furthest in the future, the optimal policy generates the least
number of page faults, leading to optimal performance in theory.