Direct Memory Access (DMA) optimizes data transfer between I/O devices and memory
without burdening the CPU, allowing for efficient operation in systems with shared
buses. Interrupts are mechanisms that notify the CPU of I/O device needs, with a
hierarchical structure for handling them, while modern CPUs face challenges with
precise and imprecise interrupts due to pipelining and superscalar execution. I/O
software principles focus on device independence, uniform naming, error handling,
and efficient data management, with methods like Programmed I/O, Interrupt-Driven
I/O, and DMA differing in efficiency and CPU involvement.

### Summary: Direct Memory Access (DMA)

**Direct Memory Access (DMA)** is a technique to optimize data transfer between I/O
devices and memory without burdening the CPU. This method is crucial for efficient
operation in systems where the CPU communicates with device controllers over a
shared system bus.
#### **How DMA Works:**
1. **Initialization**:
- The CPU programs the DMA controller by setting registers for (see the sketch
after this list):
- **Memory address**: Where data will be transferred.
- **Byte count**: Number of bytes to transfer.
- **Control information**: Transfer direction (read/write), unit size
(byte/word), and I/O port to use.

2. **Data Transfer**:
- The DMA controller requests data from the device controller over the bus.
- The device controller reads from/writes to memory as directed by the DMA
controller.
- The DMA controller:
- Tracks the transfer using acknowledgments.
- Increments the memory address.
- Decrements the byte count until the transfer is complete.
- Once the transfer is finished, the DMA controller interrupts the CPU to signal
completion.

3. **Modes of Operation**:
- **Cycle Stealing**: The DMA controller transfers one word at a time,
occasionally using the bus, causing slight delays for the CPU.
- **Burst Mode**: The DMA controller acquires the bus, transfers a block of
data, then releases it. This is more efficient but can block the CPU and other
devices.

4. **Fly-by Mode vs. Buffered Mode**:
- **Fly-by Mode**: Data is transferred directly from the device to memory.
- **Buffered Mode**: Data is first buffered in the device controller,
simplifying bus usage and allowing checksum verification before transfer.
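
To make steps 1 and 2 concrete, here is a minimal C sketch of programming such a
controller through memory-mapped registers. The register layout, base address, and
flag values are hypothetical, invented for illustration; real controllers differ.

```c
#include <stdint.h>

/* Hypothetical memory-mapped DMA controller; the layout and address are
 * illustrative only, not those of any real chip. */
#define DMA_BASE 0xFFFF0000u

typedef volatile struct {
    uint32_t mem_addr;   /* memory address for the transfer            */
    uint32_t byte_count; /* bytes remaining; controller decrements it  */
    uint32_t control;    /* direction, unit size, I/O port, start bit  */
} dma_regs_t;

#define DMA_READ  0x1    /* direction: device -> memory */
#define DMA_WORD  0x2    /* transfer unit: word         */
#define DMA_START 0x4    /* begin the transfer          */

/* Step 1 of the list above: the CPU initializes the controller and moves on;
 * the controller then performs step 2 on its own, incrementing mem_addr and
 * decrementing byte_count, and interrupts the CPU when the count hits zero. */
void dma_start_read(uint32_t phys_addr, uint32_t nbytes)
{
    dma_regs_t *dma = (dma_regs_t *)DMA_BASE;
    dma->mem_addr   = phys_addr;
    dma->byte_count = nbytes;
    dma->control    = DMA_READ | DMA_WORD | DMA_START;
}
```

Whether the controller then steals occasional bus cycles or holds the bus for a
burst is invisible at this interface; it only changes how the transfer competes
with the CPU for the bus.
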
#### **Advantages of DMA**:
- **CPU Efficiency**: Frees the CPU from handling byte-by-byte data transfers.
- **Faster Data Transfer**: Exploits bus bandwidth efficiently, especially in burst
mode.
- **Checksum Verification**: Device controllers can verify data integrity before
transferring to memory.
- **Simplified Design**: Internal buffering in device controllers reduces timing
issues during high bus contention.
#### **Challenges and Complexities**:
- **Multiple Devices**: More sophisticated DMA controllers manage concurrent
transfers using multiple register sets and prioritize requests using round-robin or
priority schemes.
- **Virtual vs. Physical Memory**:
- Most DMA controllers use physical addresses, requiring the OS to translate
virtual addresses.
- Some advanced controllers support virtual addresses, relying on the MMU for
translation.
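
Because most controllers see only physical addresses, the OS must pin and
translate the buffer before programming the transfer. A hedged sketch, assuming
hypothetical `pin_pages()` and `virt_to_phys()` helpers and the `dma_start_read()`
routine sketched earlier; real kernels use richer APIs that also handle cache
coherence.

```c
#include <stdint.h>

/* Hypothetical helpers: pin the buffer so it cannot be paged out during the
 * transfer, and translate a virtual address to a physical one. */
extern void     pin_pages(void *vaddr, uint32_t nbytes);
extern uint32_t virt_to_phys(void *vaddr);
extern void     dma_start_read(uint32_t phys_addr, uint32_t nbytes);

void dma_read_into(void *buf, uint32_t nbytes)
{
    pin_pages(buf, nbytes);            /* keep the buffer resident        */
    uint32_t phys = virt_to_phys(buf); /* controller needs a physical addr */
    dma_start_read(phys, nbytes);      /* program the controller          */
    /* The buffer must stay pinned until the completion interrupt arrives. */
}
```

A buffer spanning several pages may not be physically contiguous, which is why
real systems add scatter/gather lists or bounce buffers on top of this idea.
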
#### **When DMA is Not Used**:
- In some low-end systems, the CPU handles data transfers directly, eliminating the
DMA controller to save costs.
- This approach is viable when the CPU is faster than the I/O device and has no
other tasks.
#### **Design Considerations**:
- **Internal Buffering**: Ensures smooth data handling during bus contention and
simplifies controller design.
- **Limitations**: If the CPU is otherwise idle and faster than the DMA
controller, waiting for the slower controller wastes time; in such cases
programmed I/O can be quicker.

DMA remains a widely used mechanism for efficient data transfer in systems with
high-performance requirements and multiple I/O devices.

### Summary: Interrupts Revisited

Interrupts are mechanisms used by I/O devices to notify the CPU that they require
attention. The interrupt system in typical personal computers involves a
**hierarchical structure** with device controllers, an interrupt controller, and
the CPU.
#### **How Interrupts Work**:
1. **Triggering an Interrupt**:
- When an I/O device completes its task, it asserts a signal on its assigned bus
line.
- The **interrupt controller** detects the signal and decides whether to handle
the interrupt based on priority or whether another interrupt is in progress.

2. **Handling Interrupts**:
- If the interrupt is processed immediately, the controller places an
identifying number (device ID) on the bus’s address lines and signals the CPU.
- The CPU halts its current execution and:
- Uses the device ID to index the **interrupt vector**, a table storing the
addresses of interrupt service routines (ISRs).
- Fetches the ISR's address and starts executing it (a dispatch sketch follows
this list).

3. **Interrupt Acknowledgement**:
- The ISR acknowledges the interrupt by writing a specific value to the
interrupt controller.
- The controller can then process new interrupt requests, ensuring **race
conditions** are avoided.
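
A minimal sketch of the dispatch described in steps 2 and 3, assuming a
hypothetical interrupt controller with a memory-mapped end-of-interrupt (EOI)
register; the vector-table layout and register address are invented for
illustration.

```c
#include <stdint.h>

#define NVECTORS 256
typedef void (*isr_t)(void);

/* Interrupt vector: a table of service-routine addresses, indexed by the
 * device ID the interrupt controller places on the address lines. */
static isr_t interrupt_vector[NVECTORS];

/* Hypothetical end-of-interrupt register on the interrupt controller. */
#define EOI_REG (*(volatile uint32_t *)0xFFFF1000u)

/* Called (conceptually) by low-level trap code once the CPU has saved the
 * program counter and other state of the interrupted process. */
void dispatch_interrupt(uint32_t device_id)
{
    isr_t isr = interrupt_vector[device_id]; /* index the vector table  */
    if (isr)
        isr();                               /* run the service routine */
    EOI_REG = device_id; /* acknowledge: the controller may now deliver
                            the next pending interrupt */
}
```

Acknowledging only after the saved state is safely recorded is exactly what
avoids the race condition mentioned in step 3.
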
#### **Saving CPU State During Interrupts**:
- The CPU saves its state (at minimum, the **program counter**) before executing
the ISR to ensure the interrupted process can resume correctly.
- Where this information is saved depends on the system:
- **Internal registers**: Can result in **race conditions** if a second interrupt
overwrites data before it is read.
- **Stack**: Most systems use the stack for saving state, but this introduces
challenges:
- **User stack issues**:
- The stack pointer may be invalid (e.g., out of memory bounds).
- Writing to an illegal or boundary address could cause a **page fault**,
complicating the interrupt process.
- **Kernel stack advantage**:
- More likely to have valid memory and avoid page faults.
- However, switching to the kernel stack may require **MMU context changes**,
invalidating the **cache** and **TLB**, increasing overhead.
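
What "saving state" amounts to can be pictured as a small record pushed onto the
(kernel) stack; the exact fields are architecture-specific, and this struct is
purely illustrative.

```c
#include <stdint.h>

/* Illustrative trap frame: the bare minimum is the program counter, but most
 * systems also save the status word and general registers so the interrupted
 * computation can resume exactly where it left off. */
struct trap_frame {
    uint32_t pc;        /* program counter of the interrupted code */
    uint32_t psw;       /* processor status word (flags, mode)     */
    uint32_t regs[16];  /* general-purpose registers               */
    uint32_t sp;        /* user stack pointer                      */
};
```

Pushing such a frame onto the kernel stack avoids the user-stack pitfalls above,
at the cost of the mode and MMU switches the text describes.
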
#### **Interrupt Prioritization**:
- When multiple interrupts occur:
- The controller determines priority, typically allowing higher-priority
interrupts to preempt lower-priority ones.
- Devices with lower priority may continue asserting their signals until
serviced.
#### **Challenges with Interrupts**:
- **Lost Data**: Delayed or unacknowledged interrupts can cause data loss if
subsequent interrupts overwrite state.
- **Overhead**: Switching modes, saving/restoring state, and handling MMU/cache
invalidate operations increase interrupt-handling latency.
- **Older Systems**: In older architectures lacking a centralized interrupt
controller, each device signals the CPU directly, adding wiring and software
complexity.

Interrupts are critical for efficient I/O handling but require careful design to
balance responsiveness, system stability, and performance.

### Summary: Precise and Imprecise Interrupts

Modern CPUs face challenges with interrupts due to features like **pipelining** and
**superscalar execution**, where instructions are executed out of order or
partially completed at any given time. Interrupt handling in these architectures
differs from older systems, where instructions were strictly sequential.
#### **Interrupt Handling in Modern CPUs**:
1. **Pipelined Architectures**:
- In pipelined CPUs, when an interrupt occurs, multiple instructions are in
various stages of execution.
- The program counter (PC) typically points to the next instruction to be
fetched, not the one being executed, complicating interrupt processing.

2. **Superscalar Architectures**:
- Instructions are decomposed into **micro-operations** and may execute out of
order.
- At the time of an interrupt, some earlier instructions may not have started,
while newer ones may be almost finished.
- This irregular state makes it difficult to determine the exact program state
during an interrupt.
#### **Precise vs. Imprecise Interrupts**:
- **Precise Interrupts**:
- Leave the machine in a well-defined state, satisfying four properties:
1. The **program counter (PC)** is saved in a known location.
2. All instructions before the one pointed to by the PC are completed.
3. No instructions beyond the one pointed to by the PC have finished.
4. The state of the instruction at the PC is clearly known (either executed or
not).
- Example: When handling **I/O interrupts**, instructions beyond the PC have not
started, or their effects are rolled back.

- **Imprecise Interrupts**:
- Occur when instructions near the PC are in varying stages of completion, and
the execution state is inconsistent.
- To handle these, the CPU dumps a large amount of internal state onto the stack,
making interrupt handling and recovery slow and complex.
- Imprecise interrupts make real-time tasks harder, as they slow down interrupt
response and complicate restarting processes.
#### **Trade-offs and Design Approaches**:
1. **Performance vs. Precision**:
- Precise interrupts require additional logic to maintain a clean machine state,
consuming **chip area** and increasing **design complexity**.
- Imprecise interrupts simplify CPU design but burden the operating system,
making recovery slower and more error-prone.

2. **Selective Precision**:
- Some systems make specific interrupts (e.g., I/O) precise while allowing
others (e.g., fatal traps) to be imprecise, as recovery is unnecessary for
programming errors like division by zero.

3. **Backward Compatibility**:
- CPUs like the **x86 family** implement precise interrupts to support older
software, using complex interrupt logic to preserve a defined machine state.

4. **Performance Impact**:
- Precise interrupts require logging and shadow copies of registers, impacting
CPU speed.
- Conversely, imprecise interrupts complicate the OS, potentially leading to
slower overall system performance.

### Conclusion:
The choice between precise and imprecise interrupts involves trade-offs between CPU
design complexity, performance, and operating system simplicity. Precise interrupts
favor compatibility and OS simplicity at the cost of chip area and complexity,
while imprecise interrupts prioritize CPU performance but complicate interrupt
handling and recovery.

### Summary: Principles of I/O Software

The principles of I/O software focus on goals and techniques for designing
efficient, user-friendly, and robust systems that abstract the complexities of
hardware.
#### **Goals of I/O Software**:
1. **Device Independence**:
- Programs should work with any I/O device without being modified for specific
devices.
- For example, a program reading input or writing output should function
seamlessly with hard disks, USB sticks, or keyboards.
- The operating system handles device-specific differences.

2. **Uniform Naming**:
- Devices and files should be named in a consistent manner, such as using path
names or integers, without reflecting the underlying hardware.
- Example: In UNIX, all devices appear in the file-system hierarchy (e.g., under
/dev), so programs interact with them uniformly regardless of the device (see
the sketch after this list).

3. **Error Handling**:
- Errors should be managed at the lowest possible level (e.g., the controller or
device driver) to shield upper layers from transient issues like read errors.
- If lower layers cannot resolve an error, only then should it escalate to
higher levels.

4. **Synchronous (Blocking) vs. Asynchronous I/O**:
- Physical I/O is typically asynchronous, but blocking I/O is easier for user
programs to handle.
- Operating systems often provide a blocking interface while internally managing
interrupt-driven operations.
- High-performance applications may require direct access to asynchronous I/O
for precise control.

5. **Buffering**:
- Data often needs temporary storage before reaching its final destination.
- Example: Network packets must be buffered for examination before being placed
in their final location.
- Buffering also handles real-time constraints, such as preparing audio data in
advance to avoid underruns.
- While necessary, buffering involves extra copying, which impacts performance.

6. **Shared vs. Dedicated Devices**:
- Some devices (e.g., disks) can be shared by multiple users simultaneously
without issues.
- Others (e.g., printers) require exclusive access to avoid conflicts (e.g.,
mixed output from multiple users).
- Dedicated devices introduce complexities like potential deadlocks, which the
operating system must manage.
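
A small UNIX example of device independence and uniform naming (goals 1 and 2):
the same open()/read() calls work on a regular file and on a device node, because
the operating system hides the difference. The two paths are just examples.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* read_some() neither knows nor cares what kind of object the path names:
 * a disk file, a device, or a pipe all behave the same way. */
static void read_some(const char *path)
{
    char buf[16];
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror(path); return; }
    ssize_t n = read(fd, buf, sizeof(buf));
    printf("%s: read %zd bytes\n", path, n);
    close(fd);
}

int main(void)
{
    read_some("/etc/hostname");  /* ordinary file on disk */
    read_some("/dev/urandom");   /* character device      */
    return 0;
}
```
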
### Key Takeaways:
I/O software abstracts the complexities of hardware, ensuring device independence,
uniformity, error resilience, and efficient data management. The operating system
plays a crucial role in balancing these goals while addressing performance
constraints and user needs.

### Summary: Methods of Performing I/O

I/O operations can be executed through three main techniques: **Programmed I/O**,
**Interrupt-Driven I/O**, and **Direct Memory Access (DMA)**. These methods differ
in their implementation, efficiency, and CPU involvement.
#### **5.2.2 Programmed I/O**
1. **Overview**:
- The CPU performs all I/O tasks, actively controlling the data transfer.
- Example: Printing a string via a serial interface where characters are
processed one at a time.

2. **Steps**:
- The user process assembles data in a user-space buffer and makes a system
call.
- The operating system transfers the buffer to kernel space.
- Data is output character-by-character to the device's data register.
- The CPU continuously **polls** (busy-waits) the device's status register to
check readiness for the next character, as in the sketch below.

3. **Advantages**:
- Simple to implement.
- Effective when devices operate very quickly or in systems with no competing
tasks (e.g., embedded systems).

4. **Disadvantages**:
- Inefficient in multitasking systems as the CPU is fully occupied, unable to
perform other tasks.
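
A sketch of the polling loop for the string-printing example above. The register
names, READY bit, buffer size, and `copy_from_user()` helper are hypothetical
stand-ins for real hardware and kernel facilities.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical memory-mapped printer registers. */
#define PRINTER_STATUS (*(volatile uint32_t *)0xFFFF2000u)
#define PRINTER_DATA   (*(volatile uint32_t *)0xFFFF2004u)
#define READY 0x1

extern void copy_from_user(char *dst, const char *src, size_t n);

/* Programmed I/O: the CPU does everything itself, busy-waiting on the status
 * register between characters. Assumes count <= sizeof(p). */
void print_string(const char *user_buf, size_t count)
{
    static char p[1024];
    copy_from_user(p, user_buf, count);   /* buffer into kernel space  */
    for (size_t i = 0; i < count; i++) {
        while (!(PRINTER_STATUS & READY)) /* poll until device ready   */
            ;                             /* busy-wait: CPU is tied up */
        PRINTER_DATA = (uint32_t)p[i];    /* output one character      */
    }
}
```
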
#### **5.2.3 Interrupt-Driven I/O**
1. **Overview**:
- The CPU outputs data and performs other tasks while the device operates.
- Interrupts notify the CPU when the device is ready for the next action.

2. **Steps**:
- The user process makes a system call, and data is copied to kernel space.
- The first character is written to the device, and the **scheduler** allows
other processes to execute.
- When the device is ready, it generates an **interrupt** to signal the CPU.
- The interrupt handler either outputs the next character or unblocks the user
process once all data has been written (see the sketch below).

3. **Advantages**:
- Reduces CPU idle time compared to programmed I/O.
- Efficient in multitasking systems, allowing the CPU to perform other tasks.

4. **Disadvantages**:
- Interrupts occur for every character, adding overhead and wasting CPU time
when processing many characters.
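
The interrupt-driven version splits the work: the system call writes the first
character and blocks the caller, and a handler like the sketch below (same
hypothetical registers as before, with invented `unblock_user()` and
`acknowledge_interrupt()` helpers) does the rest, one character per interrupt.

```c
#include <stdint.h>
#include <stddef.h>

#define PRINTER_DATA (*(volatile uint32_t *)0xFFFF2004u)

extern void unblock_user(void);          /* wake the waiting process      */
extern void acknowledge_interrupt(void); /* tell the interrupt controller */

static char   p[1024]; /* kernel buffer, filled by the system call */
static size_t count;   /* characters left to print                 */
static size_t i;       /* index of the next character              */

/* Runs once per character, each time the printer raises an interrupt. */
void printer_interrupt_handler(void)
{
    if (count == 0) {
        unblock_user();                /* whole string printed: wake caller */
    } else {
        PRINTER_DATA = (uint32_t)p[i]; /* output the next character */
        count--;
        i++;
    }
    acknowledge_interrupt();
}
```
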
#### **5.2.4 I/O Using DMA (Direct Memory Access)**
1. **Overview**:
- A **DMA controller** handles data transfer directly between memory and the
device, minimizing CPU involvement.

2. **Steps** (sketched below):
- The user process makes a system call, and data is copied to kernel space.
- The DMA controller is configured to manage data transfer.
- Once the entire buffer is transferred, the device generates a single interrupt
to notify completion.

3. **Advantages**:
- Reduces the number of interrupts from one per character to one per buffer,
significantly saving CPU time.
- Frees the CPU to handle other tasks during the transfer.

4. **Disadvantages**:
- Requires specialized hardware (DMA controller).
- A slow DMA controller may not drive devices at full speed; a fast, otherwise
idle CPU could sometimes perform the transfer more quickly itself.
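
With DMA, the per-character work moves into the controller, and the driver shrinks
to a setup call plus a single completion handler. `set_up_dma_controller()` and
the other helpers are hypothetical, as in the earlier sketches.

```c
#include <stddef.h>

extern void copy_from_user(char *dst, const char *src, size_t n);
extern void set_up_dma_controller(char *buf, size_t n); /* program the DMA chip */
extern void scheduler(void);                            /* run another process  */
extern void unblock_user(void);
extern void acknowledge_interrupt(void);

static char p[1024];

/* System-call side: configure the transfer, then do something useful. */
void print_string_dma(const char *user_buf, size_t count)
{
    copy_from_user(p, user_buf, count);
    set_up_dma_controller(p, count);
    scheduler(); /* CPU is free for the whole transfer */
}

/* One interrupt for the entire buffer, not one per character. */
void dma_complete_handler(void)
{
    acknowledge_interrupt();
    unblock_user();
}
```
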
### **Comparison**
- **Programmed I/O**: Simple but inefficient in multitasking systems due to CPU
busy-waiting.
- **Interrupt-Driven I/O**: Efficient for multitasking but incurs interrupt
overhead.
- **DMA**: Most efficient for large data transfers, reducing CPU involvement but
requiring additional hardware.

DMA is often preferred in systems prioritizing performance, while simpler methods
may suit specific scenarios like embedded systems or low-speed devices.
