Computer Architecture Important Question
Computer Organization:
Computer Organization deals with the implementation details — how everything in the
architecture is actually built and made to work.
It includes:
Control signals
Data paths
Memory technology
Processor logic
Circuit-level implementation
So this is more about the physical realization, i.e., how a computer actually performs its operations.
🔍 Key Differences:
Feature | Computer Architecture | Computer Organization
Focus | High-level design, functionality, and performance aspects | Implementation details and physical realization
Perspective | Programmer’s view | Hardware engineer’s view
Example | Types of instructions a CPU can execute | How those instructions are implemented in hardware
Deals with | Instruction sets, data formats, addressing modes | Control units, ALUs, buses, memory hardware
Abstraction Level | Abstract / Logical | Concrete / Physical
How It Works:
The CPU fetches an instruction from memory.
It decodes and executes it.
It then moves to the next instruction in a sequential manner.
iii. Registers
Function: Small, high-speed storage locations within the CPU.
Common Registers:
o Program Counter (PC): Holds the address of the next instruction.
o Instruction Register (IR): Holds the current instruction being executed.
o Accumulator: Temporarily holds data during operations.
o General Purpose Registers: Used for temporary data storage during execution.
v. Clock
Function: Generates a consistent timing signal to synchronize all CPU operations.
Clock Speed: Measured in GHz (gigahertz) — higher means more instructions per
second.
Memory Hierarchy:
The memory hierarchy is a structure that organizes computer memory/storage systems by speed, cost, and capacity. It ranges from the fastest and most expensive levels (small, holding frequently used data) to the slowest and cheapest (large, holding rarely used data).
Levels of the Memory Hierarchy (Top to Bottom):
Level | Type | Speed | Cost/Bit | Size | Volatility
1. Registers | Inside the CPU | Fastest | Very high | Very small | Volatile
2. Cache | L1, L2, L3 | Very fast | High | Small | Volatile
3. Main Memory | RAM (DRAM) | Moderate | Medium | Medium | Volatile
4. Secondary Storage | SSD, HDD | Slower | Low | Large | Non-volatile
5. Tertiary Storage | Tape, Cloud Storage | Slowest | Very low | Very large | Non-volatile
Q. Why Is Memory Hierarchy Important?
1. Performance Optimization:
o Faster memory (like cache) is close to the CPU, allowing quicker data access and
reducing wait time.
2. Cost Efficiency:
o Faster memory is expensive. A hierarchy allows a blend of speed and
affordability by using fast memory for critical tasks and slower, cheaper memory
for bulk storage.
3. Efficient Data Management:
o Frequently accessed data is stored in higher (faster) levels, while rarely accessed
data stays in lower levels — improving overall speed.
4. Scalability:
o As programs grow in size and complexity, the hierarchy allows systems to scale
without a massive cost increase.
5. What is cache memory? What are different cache mapping techniques (direct, associative, and
set-associative)?
Answer:
Cache Memory:
Cache memory is a small, high-speed memory located close to the CPU (often inside it) that
stores frequently accessed data and instructions to speed up processing.
Since accessing data from main memory (RAM) is relatively slow, the cache reduces the
average time to access memory by keeping copies of frequently used data closer to the CPU.
Q. Why Is It Fast?
Cache uses SRAM (Static RAM) which is faster (but more expensive) than the DRAM
used in main memory.
It operates at CPU speeds and drastically improves performance by reducing memory
access latency.
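To make the latency benefit concrete, the standard average memory access time (AMAT) formula can be worked through in a few lines. The hit time, miss rate, and miss penalty below are illustrative assumptions, not figures for any particular CPU.

```python
# AMAT = hit_time + miss_rate * miss_penalty.
# All latencies below are illustrative assumptions.

def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average time per memory access with a cache in front of RAM."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# Without a cache, every access pays the full RAM latency (say 100 ns).
# With a cache that hits 95% of the time at 1 ns, the average drops sharply.
no_cache = 100.0
with_cache = amat(hit_time_ns=1.0, miss_rate=0.05, miss_penalty_ns=100.0)
print(with_cache)  # 1 + 0.05 * 100 = 6.0 ns on average
```

Even a modest hit rate makes the average access time far closer to cache speed than to RAM speed, which is why the hierarchy works.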
🔍 Cache Levels:
L1: Smallest, fastest, located on the processor core.
L2: Larger, slightly slower, also often on the chip.
L3: Even larger, shared between cores, slower than L1/L2 but faster than RAM.
Cache Mapping Techniques:
These are the methods used to place and find data in the cache from main memory.
I. Direct Mapping
Q. How It Works:
Each block of main memory maps to exactly one cache line.
Simple and fast but can cause conflicts if multiple blocks map to the same line.
Formula:
Cache Line Index = (Block Address) MOD (Number of Cache Lines)
Pros:
Simple and cheap to implement.
Fast access time.
Cons:
High chance of cache misses if multiple data blocks map to the same line (conflict
misses).
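The direct-mapping rule above (index = block address MOD number of lines) can be sketched in a few lines. The cache size and the one-tag-per-line model are simplifying assumptions for illustration.

```python
# Direct mapping sketch: each memory block maps to exactly one cache line
# via (block address) mod (number of lines). Sizes are illustrative.

NUM_LINES = 8
cache = [None] * NUM_LINES  # each entry holds the tag of the stored block

def access(block_address):
    """Return True on a hit, False on a miss (installing the block)."""
    index = block_address % NUM_LINES  # the one line this block may use
    tag = block_address // NUM_LINES   # distinguishes blocks sharing a line
    if cache[index] == tag:
        return True
    cache[index] = tag                 # evict whatever was in this line
    return False

# Blocks 0 and 8 both map to line 0, so they keep evicting each other
# (conflict misses), even though the rest of the cache is empty.
print(access(0), access(8), access(0))  # False False False
```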
II. Fully Associative Mapping
Q. How It Works:
A block of main memory can be placed in any cache line; the cache hardware compares the block’s tag against every line to find it.
Pros:
Very low conflict misses.
More flexible placement.
Cons:
Expensive and slower to implement (needs hardware to compare tags for all cache lines).
III. Set-Associative Mapping
Q. How It Works:
The cache is divided into sets; each memory block maps to exactly one set but can be placed in any line within that set.
Example (4-way):
A block maps to 1 of N sets, and has 4 possible lines within that set where it can be placed.
Pros:
Balance between speed and flexibility.
Lower conflict rate than direct mapping.
Less expensive than fully associative.
Cons:
Slightly more complex than direct mapping.
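The 4-way behavior can be sketched as follows, using LRU to pick a victim within a set. The set count, associativity, and LRU policy are illustrative assumptions.

```python
# 4-way set-associative sketch: a block maps to one set, but may occupy
# any of the 4 lines in that set; LRU picks the victim on eviction.
from collections import OrderedDict

NUM_SETS = 4
WAYS = 4
sets = [OrderedDict() for _ in range(NUM_SETS)]  # tags in LRU order

def access(block_address):
    """Return True on a hit, False on a miss (installing the block)."""
    s = sets[block_address % NUM_SETS]
    tag = block_address // NUM_SETS
    if tag in s:
        s.move_to_end(tag)      # mark as most-recently-used
        return True
    if len(s) >= WAYS:
        s.popitem(last=False)   # evict the least-recently-used way
    s[tag] = None
    return False

# Blocks 0, 4, 8, 12 all share set 0 but coexist in its 4 ways,
# unlike direct mapping where they would evict one another.
print([access(b) for b in (0, 4, 8, 12)])  # [False, False, False, False]
print(access(0))  # True: block 0 is still cached
```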
6. Difference between SRAM and DRAM.
Feature | SRAM (Static RAM) | DRAM (Dynamic RAM)
Full Form | Static Random Access Memory | Dynamic Random Access Memory
Speed | Faster | Slower
Cost | More expensive | Cheaper
Density | Lower (takes more space per bit) | Higher (more compact)
Power Consumption | Lower (no need for frequent refresh) | Higher (requires periodic refresh)
Storage Cell | Uses 6 transistors per bit | Uses 1 transistor + 1 capacitor per bit
Data Refreshing | Not required (holds data as long as powered) | Required (data must be refreshed periodically)
Used In | Cache memory (L1, L2, L3) inside CPU | Main memory (RAM modules)
Complexity | More complex design | Simpler design
Volatility | Volatile | Volatile (loses data when power is off)
Virtual memory is managed by the Operating System (OS) using a combo of hardware (like the
Memory Management Unit, or MMU) and software.
🔍 Process (Simplified):
1. Program requests data at a virtual address.
2. The MMU checks the page table to find the corresponding physical frame.
3. If the page is in RAM (a page hit), it retrieves it.
4. If the page is not in RAM (a page fault):
o The OS loads the page from the disk (swap space) into RAM.
o If RAM is full, it may evict another page to make room (based on a replacement
algorithm like LRU).
🔍 Page Table:
A data structure that maps virtual page numbers to physical frame numbers. Can include:
Present/absent bit (is it in RAM?)
Access permissions
Dirty bit (has it been modified?)
Examples (RISC):
ARM processors (used in smartphones, tablets)
RISC-V (open-source ISA, growing in popularity)
MIPS, SPARC
Advantages (CISC):
More powerful individual instructions
Compact programs (smaller binaries)
Less burden on compilers
Examples:
x86 architecture (used in most desktops and laptops)
Intel 8086, Pentium, AMD Ryzen
9. What is pipelining? What are the types of hazards (data, control, structural)?
Answer:
Pipelining:
Pipelining is a technique where multiple instruction stages are overlapped during execution. It’s
similar to an assembly line in a factory.
Instead of waiting for one instruction to finish entirely before starting the next, pipelining breaks
instruction execution into stages:
1. Fetch
2. Decode
3. Execute
4. Memory Access
5. Write Back
Each stage works simultaneously on different instructions, so multiple instructions are in flight
at once.
Goal:
Increase instruction throughput (instructions per unit time) — not the execution time of a
single instruction.
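The throughput gain is easy to quantify under the idealized assumption of one cycle per stage and no hazards: an N-instruction program takes 5 + (N − 1) cycles pipelined instead of 5 × N unpipelined.

```python
# Idealized pipeline throughput: one cycle per stage, no hazards assumed.

STAGES = 5  # Fetch, Decode, Execute, Memory Access, Write Back

def cycles_unpipelined(n):
    """Each instruction runs all 5 stages before the next one starts."""
    return STAGES * n

def cycles_pipelined(n):
    """First instruction fills the pipeline; later ones finish 1 cycle apart."""
    return STAGES + (n - 1)

n = 100
print(cycles_unpipelined(n), cycles_pipelined(n))  # 500 vs 104
```

Note that a single instruction still takes 5 cycles either way; only the overall throughput improves, matching the goal stated above.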
Hazards in Pipelining:
While pipelining improves speed, it also introduces hazards — situations that prevent the next
instruction from executing in the next cycle.
a) Data Hazards:
Occurs when instructions depend on the results of previous instructions that haven’t
completed yet.
Types:
RAW (Read After Write) – Most common:
o ADD R1, R2, R3 ; R1 = R2 + R3
o SUB R4, R1, R5 ; uses R1 before it's written
WAR (Write After Read) – Rare in simple in-order pipelines.
WAW (Write After Write) – Can happen in out-of-order execution.
Solutions:
Forwarding/Bypassing
Stalling (pipeline bubble)
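The RAW check from the ADD/SUB example above can be sketched as a simple dependence test between adjacent instructions. The (dest, src1, src2) tuple encoding is an invented simplification.

```python
# RAW hazard detection sketch: an instruction that reads a register written
# by the immediately preceding instruction must stall or be forwarded to.
# The (dest, src1, src2) tuple encoding is invented for illustration.

def has_raw_hazard(prev, curr):
    """prev and curr are (dest_reg, src_reg1, src_reg2) tuples."""
    dest, _, _ = prev
    _, s1, s2 = curr
    return dest in (s1, s2)

add = ("R1", "R2", "R3")  # ADD R1, R2, R3
sub = ("R4", "R1", "R5")  # SUB R4, R1, R5 reads R1 before it is written back
print(has_raw_hazard(add, sub))  # True: hardware must forward or stall
```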
b) Control Hazards:
Occur when the pipeline cannot decide which instruction to fetch next because a branch or jump has not yet been resolved.
Example:
BEQ R1, R2, LABEL
The CPU doesn’t know which instruction to fetch next until the branch decision is
made.
Solutions:
Branch prediction
Delayed branching
Stalling until decision is known
c) Structural Hazards
Occurs when hardware resources are insufficient to handle overlapping instructions.
Example:
If the CPU has only one memory unit and both an instruction fetch and data access
occur simultaneously — they conflict.
Solutions:
Duplicate resources (e.g., separate instruction/data caches)
Pipeline scheduling
10. What is an instruction cycle? Explain its stages.
Answer:
Instruction Cycle:
An instruction cycle is the complete process by which a computer fetches, decodes, and
executes an instruction.
Each instruction a program runs must go through this cycle to be processed by the CPU.
a. Fetch:
Goal: Retrieve the next instruction from memory.
The CPU reads the instruction at the address held in the Program Counter (PC), then advances the PC.
b. Decode:
Goal: Understand what the instruction means.
The Control Unit (CU) interprets the binary instruction.
It identifies the operation (opcode) and the operands (data or addresses involved).
c. Execute:
Goal: Perform the operation.
The Arithmetic Logic Unit (ALU) or other CPU components perform the task.
This could be arithmetic, logical operations, memory access, etc.
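The whole cycle can be illustrated with a toy accumulator machine run in software. The three-instruction ISA (LOAD/ADD/HALT) is invented here purely to show the fetch-decode-execute loop.

```python
# Toy fetch-decode-execute loop for an invented 3-instruction ISA.

memory = [("LOAD", 5), ("ADD", 7), ("HALT", 0)]  # program in "main memory"
pc = 0    # Program Counter: address of the next instruction
acc = 0   # Accumulator: temporary data during operations

while True:
    opcode, operand = memory[pc]  # Fetch: read the instruction at PC
    pc += 1                       # advance PC to the next instruction
    if opcode == "LOAD":          # Decode + Execute
        acc = operand
    elif opcode == "ADD":
        acc += operand
    elif opcode == "HALT":
        break

print(acc)  # 12
```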
Components Involved:
1. I/O Device (e.g., hard disk, NIC, etc.)
2. Main Memory (RAM)
3. DMA Controller
4. CPU (only initiates and gets notified when done)
Mode | Description
Cycle Stealing | DMA transfers one byte/word at a time, "stealing" CPU cycles occasionally.
Transparent Mode | DMA transfers only when the CPU is idle — causes no interference but is slower.
Benefits of DMA:
Increases system throughput
Reduces CPU load
Enables efficient handling of large data transfers (like file copying, streaming, etc.)
A. I/O-Mapped (Isolated) I/O:
How It Works:
I/O devices have their own address space, accessed with special I/O instructions (e.g., IN, OUT).
Characteristics:
Separate address space for memory and I/O.
CPU knows whether it’s accessing memory or I/O based on the instruction used.
Typically fewer address lines are needed for I/O.
Pros:
Keeps memory and I/O spaces separate — no overlap.
May use simpler, faster hardware for basic I/O operations.
Cons:
Requires special CPU instructions.
Less flexible (can’t use regular load/store instructions on I/O).
B. Memory-Mapped I/O:
How It Works:
I/O devices are assigned addresses within the same address space as RAM.
The CPU accesses I/O just like it accesses memory using standard instructions (like
MOV, LOAD, STORE).
Characteristics:
No separate I/O instructions needed — regular memory instructions work.
Requires careful address allocation to avoid overlapping memory and I/O.
Pros:
Simple and flexible (regular memory instructions work directly on I/O).
Easier to design for systems with unified memory and I/O space.
Can use powerful addressing modes and optimizations.
Cons:
Reduces available address space for RAM.
May need more complex address decoding.
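The memory-mapped idea can be sketched as a single store routine that reaches both RAM and a device register, depending only on the address. The address layout and device register here are invented for illustration.

```python
# Memory-mapped I/O sketch: one address space, with one address carved out
# for a hypothetical device register. Addresses are invented examples.

RAM_SIZE = 0x1000
DEVICE_REG = 0x2000        # hypothetical device register address
ram = bytearray(RAM_SIZE)
device_output = []

def store(address, value):
    """One ordinary 'store' path serves both RAM and the device."""
    if address == DEVICE_REG:
        device_output.append(value)  # writing this address drives the device
    else:
        ram[address] = value         # ordinary memory write

store(0x0010, 42)       # regular memory store
store(DEVICE_REG, 65)   # same instruction form, but it talks to the device
print(ram[0x0010], device_output)  # 42 [65]
```

This is why memory-mapped I/O needs no special instructions, but also why the device range reduces the address space available to RAM.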
Comparison Table:
Feature I/O-Mapped I/O Memory-Mapped I/O
Address Space Separate I/O address space Shared with memory
Instructions Special I/O instructions (e.g., IN, OUT) Regular memory instructions
Address Size Typically smaller Same as memory address size
Hardware Simplicity Simpler for small systems More integrated/flexible
Flexibility Less flexible More flexible