MODULE 4
Accessing I/O Devices
Accessing I/O devices
[Figure: processor and memory connected to I/O devices by a single bus]
•Multiple I/O devices may be connected to the processor and the memory via a bus.
•The bus consists of three sets of lines that carry address, data, and control signals.
•Each I/O device is assigned a unique address.
•To access an I/O device, the processor places the device's address on the address lines.
•The device recognizes its address and responds to the control signals.
Accessing I/O devices (contd..)
Memory mapped I/O & I/O mapped I/O
I/O devices and the memory may share the same address
space: this is memory-mapped I/O.
Any machine instruction that can access memory can then be used
to transfer data to or from an I/O device.
In I/O-mapped (isolated) I/O, the devices use a separate address
space and are accessed with special I/O instructions.
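A minimal Python sketch of the memory-mapped idea above: device registers live at ordinary addresses, so a single load/store path serves both memory and the device. The addresses and the register layout here are illustrative assumptions, not a real machine.

```python
# Memory-mapped I/O sketch: one address space for memory and devices.
# DEVICE_DATA_ADDR is a hypothetical device-register address.

RAM_SIZE = 0x100
DEVICE_DATA_ADDR = 0x1F0          # illustrative I/O register address
ram = [0] * RAM_SIZE
device_data = {"value": 0}        # toy device data register

def store(addr, value):
    """The same 'store' operation reaches memory or the device."""
    if addr == DEVICE_DATA_ADDR:
        device_data["value"] = value   # write lands in the device register
    else:
        ram[addr] = value              # ordinary memory write

def load(addr):
    if addr == DEVICE_DATA_ADDR:
        return device_data["value"]
    return ram[addr]

store(0x10, 42)                   # memory access
store(DEVICE_DATA_ADDR, 7)        # same instruction form, reaches the device
```

The point is that no special I/O instructions are needed: the address alone selects memory or the device.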
Accessing I/O devices (contd..)
[Figure: input device connected to the bus through an interface circuit, showing the address, data, and control lines]
•An I/O device is connected to the bus through an I/O interface circuit, which contains:
- an address decoder, a control circuit, and data and status registers.
•The address decoder decodes the address placed on the address lines, enabling the
device to recognize its address.
•The data register holds the data being transferred to or from the processor.
•The status register holds information necessary for the operation of the I/O device.
•The data and status registers are connected to the data lines and have unique addresses.
•The I/O interface circuit coordinates I/O transfers.
Accessing I/O devices (contd..)
The rate of transfer to and from I/O devices is much slower than
the speed of the processor. This creates the need for
mechanisms to synchronize data transfers between them.
Program-controlled I/O:
The processor repeatedly monitors a status flag to achieve the
necessary synchronization.
The processor is said to poll the I/O device.
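The polling loop above can be sketched in Python. The device model (a status flag that becomes true after a fixed number of reads) and the register names are illustrative assumptions.

```python
# Program-controlled I/O sketch: the processor busy-waits on a status
# flag until the device is ready, then reads the data register.

class SimDevice:
    """Toy device: reports not-ready for the first `delay` status reads."""
    def __init__(self, char, delay):
        self._char = char
        self._delay = delay

    @property
    def status_ready(self):
        if self._delay > 0:
            self._delay -= 1      # device still busy
            return False
        return True

    def read_data(self):
        return self._char         # reading the data register

def poll_read(dev):
    polls = 0
    while not dev.status_ready:   # busy-wait on the status flag
        polls += 1
    return dev.read_data(), polls

char, polls = poll_read(SimDevice("A", delay=3))
```

The wasted `polls` iterations are exactly the processor time that interrupts and DMA are introduced to reclaim.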
[Figure: transfer of control through an interrupt — the interrupt occurs between instructions j and j+1; vector location 4010 holds 96500, the starting address of ISR3]
Interrupt Nesting
Before the processor starts executing the interrupt service
routine for a device, it disables further interrupts from that device.
In general, the same arrangement is used when multiple devices can
send interrupt requests to the processor:
during the execution of the interrupt service routine of one
device, the processor does not accept interrupt requests
from any other device.
Since interrupt service routines are usually short, the
delay that this causes is generally acceptable.
However, for certain devices this delay may not be acceptable.
Example: a real-time clock.
Which devices can be allowed to interrupt a processor
when it is executing an interrupt service routine of another
device?
Interrupts (contd..)
I/O devices are organized in a priority structure:
an interrupt request from a high-priority device is accepted
while the processor is executing the interrupt service routine
of a low-priority device.
The processor is assigned a priority level that can be changed
under program control.
The priority level of the processor is the priority of the program
that is currently being executed.
When the processor starts executing the interrupt service
routine of a device, its priority is raised to that of the device.
If the device sending an interrupt request has a higher priority
than the processor, the processor accepts the interrupt
request.
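The acceptance rule and the priority-raising step above can be sketched directly. The numeric priority values are illustrative; only the comparison rule comes from the text.

```python
# Priority-structure sketch: a request is accepted only if the device's
# priority exceeds the processor's current priority, and on entering the
# ISR the processor's priority is raised to that of the device.

def accept_interrupt(processor_priority, device_priority):
    return device_priority > processor_priority

def enter_isr(processor_priority, device_priority):
    """Processor priority while executing the device's ISR."""
    return max(processor_priority, device_priority)

# Processor at priority 2 (running a priority-2 device's ISR):
# a priority-4 device may interrupt it; another priority-2 device may not.
```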
Interrupts (contd..)
Processor’s priority is encoded in a few bits of the processor
status register.
◦ Priority can be changed by instructions that write into the
processor status register.
◦ Usually, these are privileged instructions, or instructions that
can be executed only in the supervisor mode.
◦ Privileged instructions cannot be executed in the user mode.
◦ Prevents a user program from accidentally or intentionally
changing the priority of the processor.
If there is an attempt to execute a privileged instruction in the
user mode, it causes a special type of interrupt called as privilege
exception.
Interrupt Priority
[Figure: priority interrupt scheme — each device has its own interrupt-request and interrupt-acknowledge lines (INTR1…INTRp, INTA1…INTAp), handled by a priority arbitration circuit in the processor]
•When I/O devices are organized in a daisy-chain fashion, the devices share
an interrupt-request line, and the interrupt acknowledge propagates through
the devices.
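A small sketch of the daisy-chain priority rule: the acknowledge travels down the chain and is absorbed by the first requesting device, so position in the chain fixes priority. The list-based model is an illustrative abstraction.

```python
# Daisy-chain sketch: devices share one INTR line; INTA propagates from
# the processor through the chain and stops at the first device (lowest
# index = electrically closest = highest priority) with a pending request.

def daisy_chain_grant(requests):
    """requests[i] is True if device i asserted INTR.
    Returns the index of the device that captures INTA, or None."""
    for i, requesting in enumerate(requests):
        if requesting:
            return i        # device absorbs INTA; it is not passed on
    return None             # no requester: INTA falls off the chain

winner = daisy_chain_grant([False, True, True])  # devices 1 and 2 request
```

Here device 1 wins even though device 2 also requested, purely because it sits earlier in the chain.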
[Figure: daisy-chain scheme — devices share a common INTR line and the INTA signal propagates serially through the devices; in the combined scheme, devices are organized into groups, each group with its own interrupt-request line, ranked by a priority arbitration circuit]
Overhead of program-controlled and interrupt-driven I/O:
(i) Instructions are needed for incrementing the memory
address and keeping track of the word count.
(ii) The PC and other state information must be saved and restored.
Direct Memory Access
A special control unit may be provided to transfer a
block of data directly between an I/O device and the
main memory, without continuous intervention by the
processor.
[Figure: DMA controller registers — a starting-address register, a word-count register, and a status/control register with R/W, Done, IE (interrupt enable), and IRQ bits]
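The registers above drive a block transfer like the following sketch. The function name, the word-at-a-time loop, and the status-dictionary layout are illustrative assumptions about how a controller uses its starting-address and word-count registers.

```python
# DMA sketch: the controller copies `word_count` words into memory
# starting at `start_addr`, incrementing the address itself, with no
# processor involvement; when finished it sets Done and, if IE is set,
# raises IRQ to interrupt the processor.

def dma_transfer(memory, device_words, start_addr, word_count, ie=True):
    status = {"done": False, "irq": False}
    addr = start_addr
    for w in device_words[:word_count]:
        memory[addr] = w        # controller increments the address itself
        addr += 1
    status["done"] = True       # Done bit: transfer complete
    status["irq"] = ie          # interrupt the processor if IE is set
    return status

mem = [0] * 8
st = dma_transfer(mem, [11, 22, 33], start_addr=2, word_count=3)
```

The processor only sets up the registers and later services the completion interrupt; the per-word bookkeeping listed under "Overhead" above disappears from its workload.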
Direct Memory Access
[Figure: system bus connecting the processor, main memory, and two DMA controllers; BR (Bus Request) and BBSY (Bus Busy) are shared lines, and the bus grant is daisy-chained through BG1 and BG2]
Centralized Bus Arbitration (cont.,)
• Bus arbiter may be the processor or a separate unit
connected to the bus.
• Normally, the processor is the bus master, unless it grants
bus mastership to one of the DMA controllers.
• DMA controller requests the control of the bus by asserting
the Bus Request (BR) line.
• In response, the processor activates the Bus-Grant1 (BG1)
line, indicating that the controller may use the bus when it
is free.
• BG1 signal is connected to all DMA controllers in a daisy
chain fashion.
• When the BBSY signal is 0, the bus is busy. When
BBSY becomes 1 (bus free), the DMA controller that asserted BR
can acquire control of the bus.
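The grant rule above can be sketched as follows. The function name and the list/flag encoding are illustrative; the logic (shared BR, daisy-chained grant, BBSY gating, with BBSY = 1 meaning free as in the active-low convention above) follows the bullets.

```python
# Centralized arbitration sketch: BR is a shared request line, the grant
# is daisy-chained through the DMA controllers, and a controller takes
# over only when BBSY shows the bus is free (1 = free here).

def bus_grant(requests, bbsy):
    """requests[i]: DMA controller i asserted BR. Returns the index of
    the controller that becomes bus master, or None (processor keeps
    the bus)."""
    if bbsy == 0:            # bus busy: takeover is deferred
        return None
    for i, br in enumerate(requests):
        if br:
            return i         # grant stops propagating at first requester
    return None
```

As with the interrupt daisy chain, chain position fixes priority among the controllers.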
Centralized arbitration (contd..)
[Figure: arbitration timing — DMA controller 2 asserts BR; the processor asserts BG1, which propagates to BG2; when BBSY shows the bus is free, bus mastership passes from the processor to DMA controller 2 and then back to the processor]
[Figure: distributed arbitration — worked example]
Device A drives its ID 0101 and device B drives 0110 on the
open-collector arbitration lines; the wired-OR pattern that
appears on the lines is 0111.
Distributed arbitration
Arbitration process:
•Each device compares the pattern that appears on the arbitration lines with
its own ID, starting with the MSB.
•If it detects a difference (a 1 on a line where its own bit is 0), it transmits
0s on the arbitration lines for that and all lower bit positions.
•The device whose ID remains on the lines — the highest ID — wins the arbitration.
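The slide's worked example (A = 0101, B = 0110) can be checked with a short sketch: the initial wired-OR of the driven IDs is 0111, and after the compare-and-withdraw process the highest ID remains. The function names are illustrative.

```python
# Distributed arbitration sketch: open-collector lines OR together the
# bits all devices drive; the compare-and-withdraw rule leaves the
# device with the numerically highest ID driving the lines.

def wired_or(ids):
    """Initial pattern on the lines when every device drives its ID."""
    pattern = 0
    for i in ids:
        pattern |= i
    return pattern

def arbitration_winner(ids):
    """After withdrawal, only the highest ID is still on the lines."""
    return max(ids)

A, B = 0b0101, 0b0110
line_pattern = wired_or([A, B])        # 0b0111, as in the figure
winner = arbitration_winner([A, B])    # B wins: first differing bit is a
                                       # 1 in B where A has 0
```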
• When the cache is full and a new block of words needs to be brought in
from the main memory, some block of words in the cache must be replaced.
The choice is determined by a "replacement algorithm".
Cache hit
• Existence of a cache is transparent to the processor. The
processor issues Read and Write requests in the same
manner.
• Read hit:
▪ The data is obtained from the cache.
• Write hit:
▪ Cache has a replica of the contents of the main memory.
▪ Contents of the cache and the main memory may be updated
simultaneously. This is the write-through protocol.
▪ Update the contents of the cache, and mark it as updated by
setting a bit known as the dirty bit or modified bit. The
contents of the main memory are updated when this block is
replaced. This is write-back or copy-back protocol.
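The two write-hit policies above can be contrasted with a toy sketch. The dict-based cache, the `dirty` flag name, and the eviction helper are illustrative assumptions.

```python
# Write-hit policy sketch: write-through updates memory immediately;
# write-back sets the dirty bit and defers the memory update to eviction.

memory = {0: 10, 1: 5}
cache = {0: {"data": 10, "dirty": False},
         1: {"data": 5, "dirty": False}}

def write_hit(block, value, policy):
    cache[block]["data"] = value
    if policy == "write-through":
        memory[block] = value            # memory updated immediately
    else:                                # write-back / copy-back
        cache[block]["dirty"] = True     # memory updated only on eviction

def evict(block):
    """On replacement, a dirty block is copied back to memory."""
    entry = cache.pop(block)
    if entry["dirty"]:
        memory[block] = entry["data"]

write_hit(0, 99, "write-back")
stale = memory[0]        # still 10: memory not yet updated
evict(0)                 # dirty block copied back; memory[0] becomes 99

write_hit(1, 77, "write-through")   # memory[1] becomes 77 at once
```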
Cache miss
• If the data is not present in the cache, then a Read miss or Write
miss occurs.
• Read miss:
▪ Block of words containing this requested word is transferred from the
memory.
▪ After the block is transferred, the desired word is forwarded to the
processor.
▪ The desired word may also be forwarded to the processor as soon as it is
transferred without waiting for the entire block to be transferred. This is
called load-through or early-restart.
• Write-miss:
▪ If the write-through protocol is used, then the contents of the main memory
are updated directly.
▪ If write-back protocol is used, the block containing the
addressed word is first brought into the cache. The desired word
is overwritten with new information.
Cache Coherence Problem
• Data transfers between the main memory and the disk occur
directly (e.g., under DMA), bypassing the cache.
• When such a transfer changes a main-memory block, a copy of that
block in the cache becomes stale; keeping the cache and the main
memory consistent is the cache coherence problem.
Direct mapping
[Figure: direct-mapped cache with 128 blocks; main memory blocks 0 through 4095]
•Block j of the main memory maps to block (j modulo 128) of the cache:
block 0 maps to 0, block 129 maps to 1.
•More than one memory block is mapped onto the same
position in the cache (blocks 0, 128, 256, … map to cache block 0).
•This may lead to contention for cache blocks even if the
cache is not full.
•The contention is resolved by allowing the new block to
replace the old block, leading to a trivial replacement
algorithm.
•The memory address is divided into three fields (Tag: 5 bits,
Block: 7 bits, Word: 4 bits):
- The low-order 4 bits determine one of the 16
words in a block.
- When a new block is brought into the cache,
the next 7 bits determine which cache block
this new block is placed in.
- The high-order 5 bits are the tag bits; they determine which
of the 32 memory blocks that map to this cache block is
currently present in it.
•Simple to implement, but not very flexible.
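The 5/7/4 field split above can be verified with a few lines of Python (16-bit addresses, 16-word blocks, 128 cache blocks; the function name is illustrative).

```python
# Direct-mapped address split: Tag (5 bits) | Block (7 bits) | Word (4 bits).

def split_direct(addr):
    word = addr & 0xF               # low 4 bits: word within the block
    block = (addr >> 4) & 0x7F      # next 7 bits: cache block index
    tag = (addr >> 11) & 0x1F       # high 5 bits: tag
    return tag, block, word

# Memory block 129 (first word at address 129 * 16) maps to
# cache block 129 % 128 = 1, with tag 129 // 128 = 1:
tag, block, word = split_direct(129 * 16)
```

Note that the block field is exactly the memory block number modulo 128, and the tag is the quotient, matching the mapping rule above.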
Associative mapping
[Figure: associative cache with 128 blocks; main memory blocks 0 through 4095]
•A main memory block can be placed into any cache position.
•The memory address is divided into two fields (Tag: 12 bits,
Word: 4 bits):
- The low-order 4 bits identify the word within a block.
- The high-order 12 bits (tag bits) identify a memory
block when it is resident in the cache.
•Flexible, and uses cache space efficiently.
•Replacement algorithms can be used to replace an
existing block in the cache when the cache is full.
•Cost is higher than for a direct-mapped cache because of
the need to search all 128 tag patterns to determine
whether a given block is in the cache (associative search).
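The 12/4 split and the associative search above can be sketched as follows; the set of resident tags stands in for the 128 stored tag patterns, and the function names are illustrative.

```python
# Associative address split: Tag (12 bits) | Word (4 bits).
# The tag is simply the memory block number.

def split_assoc(addr):
    word = addr & 0xF
    tag = addr >> 4                 # tag == memory block number
    return tag, word

def lookup(cache_tags, addr):
    """Associative search: the address tag is compared against every
    stored tag; a match anywhere in the cache is a hit."""
    tag, _ = split_assoc(addr)
    return tag in cache_tags

hit = lookup({5, 129, 4095}, 129 * 16 + 3)   # word 3 of memory block 129
```

In hardware this "compare against every stored tag" is done in parallel by comparators on all 128 entries, which is where the extra cost comes from.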
Set-Associative mapping
[Figure: 2-way set-associative cache — 128 blocks organized as 64 sets of two blocks each; main memory blocks 0 through 4095]
• Blocks of the cache are grouped into sets.
• The mapping function allows a block of the main
memory to reside in any block of a specific set.
• Divide the cache into 64 sets, with two blocks per set.
• Memory blocks 0, 64, 128, etc. map to cache set 0,
and each can occupy either of the two positions in the set.
• The memory address is divided into three fields (Tag: 6 bits,
Set: 6 bits, Word: 4 bits):
- The 6-bit set field determines the set number.
- The high-order 6 bits are compared to the
tag fields of the two blocks in the set.
• Set-associative mapping is a combination of direct and
associative mapping.
• The number of blocks per set is a design parameter.
- One extreme is to have all the blocks in one
set, requiring no set bits (fully associative
mapping).
- The other extreme is to have one block per set,
which is the same as direct mapping.
• A cache with k blocks per set is called a k-way set-associative cache.
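The 6/6/4 split for the 2-way cache above can be checked the same way (64 sets, 16-word blocks; the function name is illustrative).

```python
# 2-way set-associative address split: Tag (6) | Set (6) | Word (4).

def split_set_assoc(addr):
    word = addr & 0xF
    set_no = (addr >> 4) & 0x3F     # memory block number mod 64
    tag = (addr >> 10) & 0x3F       # memory block number // 64
    return tag, set_no, word

# Memory blocks 0, 64, and 128 all map to set 0, distinguished only
# by their tags (0, 1, 2); each may occupy either block of the set:
mappings = [split_set_assoc(b * 16) for b in (0, 64, 128)]
```

Moving bits between the tag and set fields is exactly the design knob mentioned above: all 10 non-word bits as tag gives fully associative mapping, all as index gives direct mapping.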