
Bus, Cache and Shared Memory

Backplane bus systems

The effective bandwidth available to each processor is inversely proportional to the number of processors contending for the bus.
So, most bus-based commercial multiprocessors are small in size.
Backplane bus specification

A backplane bus interconnects processors, data storage and peripheral devices in a tightly coupled hardware configuration.
The system bus must be designed to allow communication between devices on the bus without disturbing the internal activities of all the devices.
Timing protocols must be established to arbitrate among multiple requests.
Signal lines on the backplane are often functionally grouped into four buses.
Data Transfer Bus (DTB)

Data, address and control lines form the DTB in a VME (Versa Module Europa) bus.
The address lines are used to broadcast the data and device address.
The number of address lines is proportional to the logarithm of the size of the address space.
Address modifier lines can be used to define special addressing modes.
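
As a quick illustration of that logarithmic relationship, here is a minimal Python sketch (the helper name is hypothetical):

import math

def address_lines_needed(address_space_size):
    # n address lines can address 2**n locations,
    # so n = ceil(log2(size of the address space)).
    return math.ceil(math.log2(address_space_size))

print(address_lines_needed(2**32))  # 32 lines for a 4 GiB space
print(address_lines_needed(2**33))  # 33 -- doubling the space adds one line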
Bus arbitration and control

The process of assigning control of the DTB to a requester is called arbitration.
The requester is called a master, and the receiving end is called a slave.
Interrupt lines are used to handle interrupts, which are prioritized.
The backplane is made of signal lines and connectors.
Functional modules

A functional module is a collection of electronic circuitry that resides on one functional board and works to achieve special bus control functions.
An arbiter is a functional module that accepts bus requests from the requester module and grants control of the DTB to one requester at a time.
A bus timer measures the time each data transfer takes on the DTB and terminates the DTB cycle if a transfer takes too long.
Addressing and Timing Protocols

There are two types of printed circuit boards connected to a bus: active and passive.

Active boards
Processors can act as bus masters or as slaves at different times.
Masters initiate a bus cycle.
Only one master can control the bus at a time.

Passive boards
Act only as slaves.
Respond to requests by a master.
One or more slaves can respond to the master's request at a time.
Bus addressing

The backplane bus is driven by a digital clock with a fixed cycle time called the bus cycle.
The backplane is designed to have a limited physical size, which ensures that information is not skewed with respect to the strobe signals.
To speed up operations, cycles on parallel lines in different buses may overlap in time.
To optimize performance, the bus should be designed to minimize the time required for request handling, arbitration, addressing and interrupts.
Broadcall and Broadcast

Most bus transactions involve only one master and one slave.
Broadcall is a read operation involving multiple slaves placing their data on the bus lines. It is used to detect multiple interrupt sources.
Broadcast is a write operation involving multiple slaves. It is essential in implementing multicache coherence.
Synchronous and Asynchronous Timing

A synchronous bus is simple to control, requires less control circuitry, and thus costs less.
It is suitable for connecting devices of relatively the same speed.
Otherwise, the slowest device will slow down the entire bus operation.
Asynchronous Timing

The advantage of using an asynchronous bus lies in the freedom of using variable-length clock signals for different-speed devices.
It does not impose any response-time restrictions on the source and destination.
It allows fast and slow devices to be connected on the same bus, and it is less prone to noise.
Arbitration, Transaction and Interrupt

The process of selecting the next bus master is called arbitration.
The duration of a master's control of the bus is called bus tenure.
The arbitration process is designed to restrict tenure of the bus to one master at a time.
Central arbitration

Potential masters are daisy-chained in a cascade.
A special signal line is used to propagate a bus-grant signal level from the first master to the last.
Each potential master can send a bus request.
A fixed priority is set in the daisy chain from left to right.
Advantages

Simplicity.
Additional devices can be added anywhere in the daisy chain.

Disadvantages

The fixed priority sequence violates the fairness practice.
It is slow in propagating the bus-grant signal along the daisy chain.
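
A minimal Python sketch of daisy-chained grant propagation, assuming index 0 sits closest to the central arbiter (all names here are illustrative):

def daisy_chain_arbitrate(requests):
    # requests: list of booleans; index 0 has the highest fixed priority.
    # Returns the index of the winning master, or None if nobody requested.
    grant = True  # the central arbiter asserts bus-grant to the first device
    for i, wants_bus in enumerate(requests):
        if grant and wants_bus:
            return i          # device keeps the grant and becomes bus master
        # otherwise the device passes the grant to the next one in the chain
    return None

# Devices 2 and 3 request the bus; device 2 wins (closer to the arbiter).
print(daisy_chain_arbitrate([False, False, True, True]))  # 2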
Independent requests and grants

Multiple bus-request and bus-grant signal lines can be independently provided.
No daisy chaining is used in this scheme.
The arbitration among potential masters is still carried out by a central arbiter.
Priority-based bus allocation can be implemented.
Multiprocessor systems usually use this scheme.
Advantages

Flexibility.
Faster arbitration time.

Disadvantages

A large number of arbitration lines is used.
Distributed arbitration

Each potential master is equipped with its own arbiter and a unique arbitration number.
The arbitration number is used to resolve the arbitration competition: when two or more devices compete for the bus, the winner is the one whose arbitration number is the largest.
All potential masters can send their arbitration numbers to the shared bus request/grant (SBRG) lines.
Each arbiter compares the resulting number on the SBRG lines with its own arbitration number.
If the SBRG number is greater, the requester is dismissed.
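
The following Python sketch (names are illustrative, not from the chapter) shows how repeated comparison against the wired-OR of all arbitration numbers leaves only the largest competitor:

def distributed_arbitrate(arbitration_numbers):
    # Every competing master drives its number onto the shared SBRG lines;
    # the lines carry the bitwise OR of all numbers. Any arbiter that sees
    # a larger value than its own withdraws, so the largest number wins.
    competing = set(arbitration_numbers)
    width = max(competing).bit_length()
    for bit in reversed(range(width)):      # resolve from the MSB down
        line = any(n & (1 << bit) for n in competing)  # wired-OR of this bit
        if line:
            # arbiters with a 0 in this position see a greater SBRG value
            competing = {n for n in competing if n & (1 << bit)}
    return competing.pop()  # the unique largest arbitration number

print(distributed_arbitrate([5, 9, 12, 3]))  # 12 wins the bus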
Transaction modes

Address-only transfer
Consists of an address transfer followed by no data.

Compelled data transfer
Consists of an address transfer followed by a block of one or more data transfers to one or more contiguous addresses.

Packet data transfer
Consists of an address transfer followed by a fixed-length block of data transfers from a set of contiguous addresses.

Connected transaction
Used to carry out a master's request and a slave's response in a single bus transaction.

Split transaction
Splits the request and response into separate bus transactions.
Uses the bus resources in a more efficient way.
Complete split transactions may require two or more connected bus transactions.
The IEEE Futurebus+ standards

Typical IEEE standards start with a company building a device and then submitting it to the IEEE for the standardization effort. In the case of Futurebus+ this was reversed: the whole system was being designed during the standardization effort (Tektronix).

Architecture-, processor- and technology-independent
Fully asynchronous timing protocol
Optional source-synchronized protocol for high speed
Fully distributed parallel arbitration protocols
High reliability
Use of multilevel mechanisms for locking of modules
Circuit-switched and split transaction protocols
Support of real-time mission-critical computations with multiple priority levels
Support of 32-bit or 64-bit addressing
Direct support of snoopy-cache-based multiprocessors
Compatible message-passing protocols

The Futurebus+ technology is currently used as an internal backplane technology for systems such as routers.
Cache Addressing Models

Physical Address Caches

A physical address cache is indexed and tagged with physical addresses, so a cache lookup occurs only after the address has been translated.
When the amount of unwritten data in the cache reaches a certain level, the controller periodically writes the cached data back to the backing store; this write process is called "flushing."

Advantages
No need to perform cache flushing
Fewer cache bugs in OS kernels

Disadvantages
Slowdown in accessing the cache until the MMU/TLB finishes translating the address
Virtual Address Caches

A virtual address cache is indexed or tagged with virtual addresses, so the cache lookup can begin before address translation completes.

Advantages
Cache search becomes easy

Disadvantages
Virtual addressing gives rise to aliasing: distinct virtual addresses may refer to the same physical memory location.
Removal of aliasing requires too much flushing.
Too much flushing increases cache misses.
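
A small sketch of the aliasing problem in a virtually indexed cache; the page, line and set sizes below are assumptions for illustration only:

PAGE = 4096            # page size in bytes
CACHE_SETS = 256       # virtually indexed cache
LINE = 64              # line size in bytes

def cache_index(vaddr):
    # Set index of a virtually indexed cache: bits above the line offset.
    return (vaddr // LINE) % CACHE_SETS

va1 = 0x0000_2000      # virtual page A
va2 = 0x0001_3000      # virtual page B, aliased to the same physical frame

# The same byte of the shared frame, reached through both aliases:
print(cache_index(va1 + 8))  # 128
print(cache_index(va2 + 8))  # 192 -- different cache sets, so the cache
                             # may hold two inconsistent copies of one byte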
Direct Mapping Cache and Associative Cache
Mapping

Placement policy (direct mapping):
B_j -> B_i if i = j mod m, for cache block frames i = 0, 1, ..., m-1 and memory blocks j = 0, 1, ..., n-1.

K-way set associativity:
The m cache block frames are divided into v = m/k sets of k frames each.
The mapping is B_j -> set i if i = j mod v, where i is the set number.
Cache Performance Issues
Shared memory organizations

Memory Interleaving

Main memory is built with multiple modules.
Memory modules are connected by a system bus.
Parallel access of multiple words can be done simultaneously or in a pipelined fashion.
These memory words are assigned linear addresses.

Two address formats for memory interleaving:
Low-order interleaving
High-order interleaving
Low-order interleaving

It spreads contiguous memory locations across the m modules horizontally.
The low-order a bits of the memory address are used to identify the memory module.
The high-order b bits are the word address within each module.
Note that the same word address is applied to all memory modules.
A module address decoder is used to distribute module addresses.
The Address Space for Low-Order Interleaving

 When a memory is N-way interleaved, we always find that N = 2^K. This is due to the structure of the memory address.
 For K = 1, we have 2-way interleaving.
 For K = 2, we have 4-way interleaving.
 For K = 3, we have 8-way interleaving.
 For K = 4, we have 16-way interleaving.
 For each scheme, the K lower bits of the address select the module.
 For example, in a 64-word memory each address is a 6-bit unsigned binary number.
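
A short sketch of the address split for this 6-bit example, assuming 4-way (K = 2) interleaving:

def split_low_order(addr, k=2):
    # The k low-order bits pick the module; the remaining high-order
    # bits are the word address within that module.
    module = addr & ((1 << k) - 1)
    word = addr >> k
    return module, word

for addr in range(8):                # first eight of the 64 addresses
    print(addr, split_low_order(addr))
# Consecutive addresses 0, 1, 2, 3 fall in modules 0, 1, 2, 3: a block of
# contiguous words can be fetched from all four modules in parallel.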
• A main memory is formed with m = 2^a memory modules
• Each contains w = 2^b words of memory cells

a and b bits in low-order interleaving: the low-order a bits are used to identify the memory module, and the high-order b bits are the word address (displacement) within each module.

a and b bits in high-order interleaving: the high-order a bits serve as the module address, and the low-order b bits are the word address in each module.

High-order interleaving cannot support block access of contiguous locations.
Low-order interleaving supports block access in a pipelined fashion.
High-order interleaving

It uses the high-order a bits as the module address and the low-order b bits as the word address within each module.
Contiguous memory locations are thus assigned to the same memory module.
In each memory cycle, only one word is accessed from each module.
Thus high-order interleaving cannot support block access of contiguous locations.
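
By contrast with the low-order split shown earlier, a sketch of the high-order split (sizes assumed) shows why contiguous addresses serialize:

def split_high_order(addr, b=4):
    # The high-order bits above the b-bit word field select the module.
    module = addr >> b                 # high-order a bits
    word = addr & ((1 << b) - 1)       # low-order b bits
    return module, word

for addr in range(4):
    print(addr, split_high_order(addr))
# Addresses 0..3 all land in module 0: a contiguous block hits a single
# module, so its accesses serialize instead of overlapping.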
Pipelined Memory Access

Block access of contiguous locations in a low-order interleaved memory can be overlapped in a pipelined fashion.
τ = θ/m, where θ is the major cycle (the total time to complete one module access), m is the degree of interleaving, and τ is the minor cycle: once the pipeline is full, one word is delivered every τ.
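
A quick worked example of the formula (the timing values are assumptions, not from the text):

theta = 100e-9            # major cycle: one full module access, 100 ns
m = 8                     # degree of interleaving
tau = theta / m           # minor cycle: one word emerges every 12.5 ns

n = 16                    # words in a contiguous block
pipelined = theta + (n - 1) * tau    # fill the pipeline, then stream
serial = n * theta                   # one module, no overlap
print(tau, pipelined, serial)        # 1.25e-08  2.875e-07  1.6e-06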
Memory Bandwidth

The memory bandwidth B of an m-way interleaved memory is upper-bounded by m and lower-bounded by 1.
Hellerman's estimate is B = m^0.56 ≈ sqrt(m), where m is the number of interleaved memory modules.
If 16 memory modules are used, then the effective memory bandwidth is approximately four times that of a single module.
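
A one-line check of that claim for 16 modules:

m = 16
print(m ** 0.56)   # ~4.72
print(m ** 0.5)    # 4.0 -- roughly four times a single module's bandwidth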
Fault tolerance

High- and low-order interleaving can be combined to yield many different interleaved memory organizations.
In high-order interleaving, sequential addresses are assigned within each memory module.
This makes it easier to isolate faulty memory modules in a memory bank of m memory modules.
When one module failure is detected, the remaining modules can still be used by opening a window in the address space.
Memory allocation schemes

The portion of the OS kernel which handles the allocation and deallocation of main memory to executing processes is called the memory manager.
The memory manager monitors the amount of available main memory and decides on the actions to take.
Allocation Policies

Memory swapping is the process of moving blocks of information between the levels of a memory hierarchy.
Swapping can be done in two ways:

Nonpreemptive
The incoming block can be placed only in a free region of the main memory.

Preemptive
The incoming block can be placed in a region presently occupied by another process.
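
A minimal sketch contrasting the two policies; the first-fit placement and every name below are assumptions for illustration, not the chapter's algorithm:

def allocate(free_regions, resident, size, preemptive=False):
    # Nonpreemptive path: use the first free region large enough (first fit).
    for region in free_regions:
        if region["size"] >= size:
            return region
    if not preemptive:
        return None          # nonpreemptive: the incoming block must wait
    # Preemptive path: swap out a resident block big enough to make room.
    for victim in resident:
        if victim["size"] >= size:
            resident.remove(victim)   # victim is swapped out of main memory
            return victim
    return None

free = [{"base": 0x1000, "size": 4096}]
held = [{"base": 0x9000, "size": 16384}]
print(allocate(free, held, 8192))                   # None -- no free region fits
print(allocate(free, held, 8192, preemptive=True))  # evicts the resident block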