Direct Memory Access
Programmed I/O:
When the processor is executing a program and encounters an instruction relating to
I/O, it executes that instruction by issuing a command to the appropriate I/O module.
The I/O module performs the requested action and then sets the appropriate bits in
the I/O status register but takes no further action to alert the processor. In particular,
it does not interrupt the processor.
The processor periodically checks the status of the I/O module until it finds that the
operation is complete.
With programmed I/O, the processor may have to wait a long time for the I/O module of
concern to be ready for either reception or transmission of more data. While waiting,
the processor must repeatedly check the status of the I/O module. As a result,
the performance of the entire system is severely degraded.
Interrupt-driven I/O:
An alternative to Programmed I/O is for the processor to issue an I/O command to a
module and then go on to do some other useful work.
The I/O module will then interrupt the processor to request service when it is ready
to exchange data with the processor. The processor then executes the data transfer,
as before, and then resumes its former processing.
Interrupt-driven I/O, though more efficient than simple programmed I/O, still requires
the active intervention of the processor to transfer data between memory and an I/O
module, and any data transfer must traverse a path through the processor. Thus, both
of these forms of I/O suffer from two inherent drawbacks:
1. The I/O transfer rate is limited by the speed with which the processor can test and service
a device.
2. The processor is tied up in managing an I/O transfer; a number of instructions must be
executed for each I/O transfer.
CS8493-OPERATING SYSTEMS
ROHINI COLLEGE OF ENGINEERING & TECHNOLOGY
Definition
When large volumes of data are to be moved, a more efficient technique is required:
direct memory access (DMA).
Direct memory access (DMA) transfers a block of data between memory and the peripheral
devices of the system without the participation of the processor.
The DMA function can be performed by a separate module on the system bus or it can
be incorporated into an I/O module.
Working
When the processor wishes to read or write a block of data, it issues a command to the DMA
module, sending it the following information:
• Whether a read or write is requested
• The address of the I/O device involved
• The starting location in memory to read data from or write data to
• The number of words to be read or written
The processor then continues with other work. It has delegated this I/O operation to
the DMA module, and that module will take care of it.
The DMA module transfers the entire block of data, one word at a time, directly to or
from memory without going through the processor.
When the transfer is complete, the DMA module sends an interrupt signal to the
processor. Thus, the processor is involved only at the beginning and end of the
transfer.
The DMA module needs to take control of the bus to transfer data to and from
memory. Because of this competition for bus usage, there may be times when the
processor needs the bus and must wait for the DMA module.
Note that this is not an interrupt; the processor does not save a context and do
something else.
Rather, the processor pauses for one bus cycle (the time it takes to transfer one word
across the bus).
The overall effect is to cause the processor to execute more slowly during a DMA
transfer when processor access to the bus is required.
Nevertheless, for a multiple-word I/O transfer, DMA is far more efficient than
interrupt-driven or programmed I/O.
Multiprocessor Systems
Types
Asymmetric multiprocessing, in which each processor is assigned a specific task. A boss
processor controls the system; the other processors either look to the boss for instructions or
have predefined tasks.
• This scheme defines a boss–worker relationship. The boss processor schedules and
allocates work to the worker processors.
Symmetric multiprocessing (SMP), in which each processor performs all tasks within the
operating system.
SMP means that all processors are peers; no boss–worker relationship exists between
processors.
Symmetric Multiprocessors
DEFINITION An SMP can be defined as a stand-alone computer system with the following
characteristics:
1. There are two or more similar processors of comparable capability.
2. These processors share the same main memory and I/O facilities and are interconnected
by a bus or other internal connection scheme, such that memory access time is approximately
the same for each processor.
3. All processors share access to I/O devices, either through the same channels or through
different channels that provide paths to the same device.
4. All processors can perform the same functions (hence the term symmetric ).
5. The system is controlled by an integrated operating system that provides interaction
between processors and their programs at the job, task, file, and data element levels.
In an SMP, individual data elements can constitute the level of interaction, and there can be
a high degree of cooperation between processes.
Advantages
• Performance: If the work to be done by a computer can be organized so that some portions
of the work can be done in parallel, then a system with multiple processors will yield greater
performance than one with a single processor of the same type.
• Availability: In a symmetric multiprocessor, because all processors can perform the same
functions, the failure of a single processor does not halt the machine. Instead, the system can
continue to function at reduced performance.
• Incremental growth: A user can enhance the performance of a system by adding an
additional processor.
• Scaling: Vendors can offer a range of products with different price and performance
characteristics based on the number of processors configured in the system.
ORGANIZATION Figure 1.19 illustrates the general organization of an SMP. There are multiple
processors, each of which contains its own control unit, arithmetic logic unit, and registers.
Each processor has access to a shared main memory and the I/O devices through some
form of interconnection mechanism; a shared bus is a common facility.
The processors can communicate with each other through memory (messages and
status information left in shared address spaces).
It may also be possible for processors to exchange signals directly.
The memory is often organized so that multiple simultaneous accesses to separate
blocks of memory are possible.
In modern computers, processors generally have at least one level of cache memory
that is private to the processor.
This use of cache introduces some new design considerations. Because each local
cache contains an image of a portion of main memory, if a word is altered in one cache,
it could conceivably invalidate a word in another cache.
To prevent this, the other processors must be alerted that an update has taken place.
This problem is known as the cache coherence problem and is typically addressed in
hardware rather than by the OS.
Multicore Computers
A multicore computer, also known as a chip multiprocessor , combines two or more
processors (called cores) on a single piece of silicon (called a die).
Typically, each core consists of all of the components of an independent processor,
such as registers, ALU, pipeline hardware, and control unit, plus L1 instruction and data
caches.
In addition to the multiple cores, contemporary multicore chips also include L2 cache
and, in some cases, L3 cache.
Designers have found that the best way to improve performance and take advantage of
advances in hardware is to put multiple processors and a substantial amount of cache
memory on a single chip.
An example of a multicore system is the Intel Core i7, which includes four x86
processors, each with a dedicated L2 cache, and with a shared L3 cache.
One mechanism Intel uses to make its caches more effective is prefetching, in which
the hardware examines memory access patterns and attempts to fill the caches with
data that’s likely to be requested soon.
The Core i7 chip supports two forms of external communication to other chips: the DDR3
memory controller and the QuickPath Interconnect (QPI).
The DDR3 memory controller brings the memory controller for the DDR (double data
rate) main memory onto the chip. The interface supports three channels that are 8
bytes (64 bits) wide, for a total bus width of 192 bits and an aggregate data rate of up to
32 GB/s.
The QuickPath Interconnect (QPI) is a point-to-point electrical interconnect
specification. It enables high-speed communication among connected processor
chips. The QPI link operates at 6.4 GT/s (gigatransfers per second).