Computer Organization Complete Notes_45112494_2024_12_27_12_57
Computer Architecture
Flynn’s Taxonomy
Single Instruction, Single Data (SISD):
This is a uniprocessor machine which is capable of executing a single
instruction, operating on a single data stream.
The machine instructions are processed in a sequential manner and
computers adopting this model are popularly called sequential computers.
Most conventional computers have SISD architecture.
All the instructions and data to be processed have to be stored in primary
memory.
The speed of the processing element in the SISD model is limited by the rate
at which the computer can transfer information internally.
CISC vs RISC
1. The CISC architecture processes complex instructions that require several clock cycles for execution, taking on average two to five clock cycles per instruction (CPI). The RISC architecture executes simple yet optimized instructions in a single clock cycle, averaging about 1.5 CPI.
2. CISC implements complex instructions through microcode held in memory units. RISC lacks such special memory and uses dedicated hardware to execute its instructions.
3. CISC devices are built around a microprogrammed control unit. RISC devices use a hardwired control unit.
4. CISC uses a variety of instructions to accomplish complex tasks. RISC provides a reduced instruction set, which is typically primitive in nature.
5. A CISC architecture traditionally uses one cache to store both data and instructions, although recent CISC designs employ split caches. The RISC architecture relies on split caches, one for data and the other for instructions.
6. CISC processors use a memory-to-memory framework to execute instructions such as ADD, LOAD, and even STORE. RISC processors rely on a register-to-register mechanism, with independent LOAD and STORE instructions.
7. The CISC architecture uses only one register set. The RISC architecture utilizes multiple register sets.
8. CISC processors are capable of processing high-level programming language statements more efficiently, thanks to the support of complex addressing modes. Since RISC processors support only a limited set of addressing modes, complex operations are synthesized through software.
9. CISC offers little support for parallelism and pipelining; as such, CISC instructions are less pipelined. RISC processors support instruction pipelining.
10. CISC instructions require high execution time. RISC instructions require less time for execution.
11. In the CISC architecture, the task of decoding instructions is quite complex. In RISC processors, instruction decoding is simpler than in CISC.
12. CISC examples: Intel x86 CPUs, System/360, VAX, PDP-11, Motorola 68000 family, and AMD. RISC examples: Alpha, ARC, ARM, AVR, MIPS, PA-RISC, PIC, Power Architecture, and SPARC.
13. CISC processors are used for low-end applications such as home automation devices and security systems. RISC processors are suitable for high-end applications, including image and video processing and telecommunications.
Computer Organization Infeepedia By: Infee Tripathi
Machine Instructions
Machine instructions are machine code programs or commands which instruct the processor to perform a
specific task.
A machine instruction is a sequence of binary bits (0/1) which is executed by the processor.
An instruction can be divided into 3 parts:
Opcode
Operand
Addressing Mode
Types of Instructions
By function: Data Transfer Instructions, Data Manipulation Instructions, Program Control Instructions.
By format: Memory-reference Instructions, Register-reference Instructions, Input-Output Instructions.
PUSH: Pushes data from a register towards the top of the stack.
STORE: Transfer of data from the register to the memory.
POP: Fetches the data at the top of the stack into a register.
XCHG: Transfers information between registers and memory.
Memory-Reference Instruction
Format (16 bits):
Bit 15: Mode bit I (0 = direct, 1 = indirect)
Bits 14-12: opcode (000-110)
Bits 11-0: Address
Hexadecimal opcodes:
SNo. Instruction I=0 (direct) I=1 (indirect) Description
1. AND 0XXX 8XXX Performs AND operation between memory word and accumulator.
2. ADD 1XXX 9XXX Performs add operation between memory word and accumulator.
3. LDA 2XXX AXXX Load memory word into the accumulator.
4. STA 3XXX BXXX Store the accumulator in memory.
5. BUN 4XXX CXXX Branch unconditionally.
6. BSA 5XXX DXXX Branch and save return address.
7. ISZ 6XXX EXXX Increment memory word and skip next instruction if zero.
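The field layout above can be checked with a short sketch (the helper name and the sample encodings are illustrative):

```python
def decode(word):
    """Split a 16-bit memory-reference instruction into (I, opcode, address).

    Bit 15 is the indirect bit I, bits 14-12 the opcode, bits 11-0 the address.
    """
    i = (word >> 15) & 0x1        # mode bit
    opcode = (word >> 12) & 0x7   # 3-bit opcode
    address = word & 0xFFF        # 12-bit address field
    return i, opcode, address

i, op, addr = decode(0x9123)      # ADD indirect: 9XXX
print(i, op, hex(addr))           # 1 1 0x123
```

Note how the direct and indirect encodings of the same instruction differ only in bit 15, which is why the direct code 1XXX and the indirect code 9XXX are 0x8000 apart.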
Register-reference instructions
Format (16 bits):
Bit 15: Mode bit I = 0
Bits 14-12: opcode = 111
Bits 11-0: specify the register operation
Input-Output Instructions
Format (16 bits):
Bit 15: Mode bit I = 1
Bits 14-12: opcode = 111
Bits 11-0: specify the input-output operation
Instruction cycle
The instruction cycle is a process in which a computer system
fetches an instruction from memory, decodes it, and then executes
it.
It is also called Fetch-Decode-Execute Cycle.
Each phase of Instruction Cycle can be decomposed into a
sequence of elementary micro-operations called microinstructions.
Each instruction cycle consists of the following phases:
1. Fetch instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory.
4. Execute the instruction.
Upon the completion of step 4, the control goes back to step 1 to fetch, decode, and execute the next
instruction.
This cycle repeats indefinitely unless a HALT instruction is encountered.
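The four phases can be sketched as a toy simulation (the instruction set and memory layout here are invented for illustration, not any real ISA):

```python
# Minimal fetch-decode-execute loop for a toy accumulator machine.
def run(memory):
    pc, ac = 0, 0
    while True:
        instr = memory[pc]       # fetch: MAR <- PC, MBR <- M[MAR], IR <- MBR
        pc += 1                  # increment PC for the next instruction
        op, addr = instr         # decode: split opcode and address field
        if op == "LDA":          # execute
            ac = memory[addr]
        elif op == "ADD":
            ac += memory[addr]
        elif op == "STA":
            memory[addr] = ac
        elif op == "HLT":
            return ac            # HALT ends the cycle

program = [("LDA", 4), ("ADD", 5), ("HLT", 0), 0, 7, 35]
print(run(program))  # 42
```

The loop structure mirrors the phases listed above: each iteration is one instruction cycle, and control returns to the fetch step until HLT is reached.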
1. Fetch cycle:-
Step 1:- The address of the next instruction is taken from the Program Counter (PC) and transferred to the Memory Address Register (MAR). This is done at clock pulse T0.
Step 2:- The address in MAR is put on the address bus, the control unit requests a memory read, and the data at that address is returned over the data bus and copied into the Memory Buffer Register (MBR).
The Program Counter is then incremented by one, to get ready for the next instruction.
This is done at clock pulse T1.
Step 3:- The content of the MBR is moved to the Instruction Register (IR) for decoding. This is done at clock pulse T2.
2. Decode cycle:- Once the instruction is in the IR, it is decoded by the decoder at clock pulse T2.
The result is a register-reference, I/O-reference, or memory-reference (direct/indirect) instruction, depending on the values of D7 and I.
The indirect bit is transferred to flip-flop I, and the address part is transferred to the AR.
3. Indirect Cycle:-
The indirect cycle is sometimes required when instructions involve accessing memory locations that contain
addresses or pointers to the actual data or another address.
During this cycle, the CPU may use an address obtained from the previous instruction to access another
memory location, which holds the data or another address to be used in the next cycle.
The indirect cycle enables the CPU to follow memory references and retrieve the actual data required for
execution.
4. Execute cycle:-
Case1:- In case of a memory instruction (direct or indirect) the execution phase will be in the next clock pulse T3.
Case1.1- If the instruction is direct, nothing is done at T3 clock pulse.
Case1.2-If the instruction has an indirect address, the effective address is read from main memory, and any
required data is fetched from main memory to be processed and then placed into data registers(Clock Pulse: T3).
Case2:-If the instruction is an I/O reference or a Register reference, the operation is performed (executed) at
clock Pulse: T3.
5. Interrupt cycle:-
When an interrupt occurs, the current contents of the PC must be saved so that the processor can resume normal activity after the interrupt.
Thus, the contents of the PC are transferred to the MBR to be written into memory.
The special memory location reserved for this purpose is loaded into the MAR from the control unit.
It might, for example, be a stack pointer.
The PC is loaded with the address of the interrupt routine. As a result, the next instruction cycle will begin by
fetching the appropriate instruction.
Advantages
Standardization: Provides a consistent way for CPUs to work, making it easier for software and hardware to
work together.
Efficiency: Breaks down tasks into smaller steps, helping CPUs work faster.
Pipelining: Lets CPUs work on multiple instructions at once, improving speed.
Disadvantages
Overhead: Adds extra steps to every instruction, slowing things down a bit.
Complexity: Can be hard to design and understand, especially for complex CPUs.
Limited parallelism: Sometimes, instructions depend on each other, so they can't all be done at once, slowing
things down.
Instruction Format
Instruction formats in computer architecture refer to the way instructions are encoded and represented in
Machine Language.
The instruction formats are a sequence of bits (0 and 1) which contains some fields (addressing mode, opcode
and operands).
Each field of the instruction provides specific information to the CPU related to the mode, operation and
location of the operands.
In other words we can say the instruction format also defines the layout of an instruction in bits.
The instruction format depends upon the CPU organization and CPU depends upon the Instructions Set
Architecture implemented by the processor.
As per CPU organization instructions may have several different lengths containing varying number of
addresses ( 3, 2, 1 or 0).
The number of address field in the instruction format depends on the internal organization of its registers.
CPU organizations
CPU organization is the general structure and behaviour of a computer. The CPU is made up of three main parts:
Register set
Control unit (CU)
Arithmetic and logic unit (ALU)
2. General Register Organization
Advantages:
Using lots of registers makes the CPU work faster.
Programs take up less memory because they're written more efficiently.
Disadvantages:
Compilers have to be smart to avoid using too many registers.
It costs more because of all the extra registers.
3. Stack Organization
As we know, the stack maintains a stack pointer which points to the top of the stack to access the data.
The operands are implicitly specified on top of the stack.
It does not use an address field for data manipulation instructions (ADD, MUL, SUB, etc.).
Subscribe Infeepedia youtube channel for computer science competitive exams
Download Infeepedia app and call or wapp on 8004391758
Computer Organization Infeepedia By: Infee Tripathi
Format: Opcode only (operand locations are implied by the stack pointer); PUSH and POP carry an address field.
Example:
PUSH A // TOS <- M[A]
ADD // pops the top two operands and pushes their sum
Important Points
In instruction format design, instruction length is always a challenge: the longer the instruction, the longer the time taken to fetch it.
The number of address bits is directly proportional to the memory range, i.e., if a larger range is required, a larger number of address bits is required.
If a system supports virtual memory, then the memory range that needs to be addressed by the instruction will be larger than the physical memory.
Instruction length should be equal to, or a multiple of, the data bus length.
Addressing Modes
1. Implied Mode
Operands are specified implicitly in the definition of the instruction; here operands are always implied to be present on the top of the stack (or in the accumulator).
2. Immediate Mode
The operand is specified in the instruction explicitly.
Instead of address of location, a data value is present in the instruction address field.
It is represented with a ‘#’ symbol.
Format: Opcode | Operand
Example:
ADD #15 // AC <- AC + 15
It will increment the value stored in the accumulator by 15.
MOV R #20 //R <- 20
It initializes register R to a constant value 20.
Advantages:
It is a fast mode, since no memory reference is needed to fetch the operand.
Limitation:-
The range of constants is restricted by the size of the address field.
Indirect Mode
Advantages:-
The largest address space is available.
Disadvantages:-
Slower than direct mode, due to the multiple memory references needed to find the data.
Register Mode
Advantages:-
Instruction length is short due to the limited number of registers.
Fast execution due to no memory reference.
Disadvantages:-
Limited address space.
Using multiple registers complicates the instructions and is also not cost-effective.
Register Indirect Mode
Advantages:-
Faster than indirect mode, because it needs one less memory reference.
Large address space (2^n).
Disadvantages:-
Slower than register direct mode.
Displacement (Indexed / Base-Register) Mode
Example:
ADD R1, 10(R2) // R1 <- R1 + M[10 + R2]
ADD R1, 201(XR) // R1 <- R1 + M[201 + XR]
Memory Organization
Associative Memory
Associative memories are also commonly known as content-addressable memories (CAMs).
In associative memory, data is stored together with additional tags or metadata that describe its content. When a search
is performed, the associative memory compares the search query with the tags of all stored data, and retrieves the data
that matches the query.
It reduces search time by searching on content rather than on address.
All locations are accessed simultaneously, in parallel.
When a word is written into an associative memory, no address is given.
The memory is capable of finding an empty location to store the word.
To read a word from associative memory the content of the word or part of the word is specified.
The memory locates all the words which matched the specified content and marks them for reading.
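The match-and-mark behaviour can be sketched in a few lines (the function name and the bit patterns are illustrative; real CAM hardware compares all words simultaneously, while this loop does it sequentially):

```python
# Toy content-addressable search: every stored word is compared against the
# search key, masked so that only the selected bit positions must match.
def cam_search(words, key, mask):
    """Return the indices of all words matching `key` on the bits set in `mask`."""
    return [i for i, w in enumerate(words) if (w & mask) == (key & mask)]

words = [0b1010, 0b1110, 0b0011, 0b1011]
# Match on the upper two bits only (mask 0b1100), looking for the pattern 10xx.
print(cam_search(words, 0b1000, 0b1100))  # [0, 3]
```

The mask plays the role of the argument/key register in a CAM: it selects which part of the word participates in the comparison.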
Cache memory
Cache is the fastest memory.
The currently running or ready-to-run process is placed in cache for fast execution.
The CPU accesses the cache first whenever it needs a word.
If the word is found in the cache, it is read from the cache.
If the word required by the CPU is not found in the cache, the main memory is accessed to read the block containing it.
Data transfer between CPU and cache is in words; between cache and main memory it is in blocks:
CPU <-(words)-> Cache <-(blocks)-> Main Memory
Locality of Reference
Locality of reference is the tendency of the computer program where a program repeatedly accesses a specific set of
memory locations over a short period of time.
The term locality of reference refers to the fact that there is usually some predictability in the sequence of memory
addresses accessed by a program during its execution.
There are two types of locality of reference:-
1. Temporal:- It means that a recently executed instruction is likely to be executed again very soon.
2. Spatial:- It means that instructions in close proximity to a recently executed instruction are also likely to be executed
soon.
Cache Performance
When the CPU refers to memory and finds the word in the cache, it is called a cache hit; otherwise, it is a cache miss.
The performance of cache memory is measured in terms of hit ratio.
Hit ratio (h) = hits / (hits + misses) = number of hits / total number of CPU references
Miss ratio = misses / (hits + misses) = 1 - h
Cache miss penalty = cache access time + main memory access time
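These formulas translate directly into code; the sketch below also derives the average access time, assuming (as in the miss-penalty formula above) that a miss costs one cache access plus one main memory access (function name and sample numbers are illustrative):

```python
def cache_stats(hits, misses, t_cache, t_main):
    """Return (hit ratio, miss ratio, average access time)."""
    total = hits + misses
    hit_ratio = hits / total
    miss_ratio = misses / total
    # A hit costs t_cache; a miss costs t_cache + t_main (the miss penalty).
    t_avg = (hits * t_cache + misses * (t_cache + t_main)) / total
    return hit_ratio, miss_ratio, t_avg

h, m, t = cache_stats(hits=90, misses=10, t_cache=10, t_main=100)
print(h, m, t)  # 0.9 0.1 20.0
```

With a 90% hit ratio, a 10 ns cache and a 100 ns main memory, the average access time drops to 20 ns, which is the whole point of the cache.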
Direct mapping
It is the simplest mapping technique and can be implemented with ordinary RAM, without associative hardware.
A particular block of the main memory is mapped only in a certain line of the cache.
Suppose cache memory is divided in k number of lines and main memory has n blocks then jth number block can be
mapped in the cache as follows:
Cache line number = (Address of the Main Memory Block ) Modulo (Total number of lines in Cache)
L = j mod k
Example:-
Suppose block j of the main memory is mapped to a cache with k lines, at line j mod k.
Consider a main memory of 256 words divided into 64 blocks.
Number of words in each block = 256/64 = 4 words/block.
Consider a cache memory of 32 words.
As we know, block size and line size are equal; hence the line size is 4 words.
Total number of lines = 32/4 = 8.
Mapping of blocks to lines:
L0 <- B0, B8, B16, ..., B56
L1 <- B1, B9, B17, ..., B57
...
L7 <- B7, B15, B23, ..., B63
(Main memory blocks: B0 = W0-W3, B1 = W4-W7, ..., B63 = W252-W255.)
To access the data in the cache, the CPU generates an 8-bit logical address:
| Tag: 3 bits | Index: 5 bits |
The 5-bit index selects the cache line (3 bits) and the word within the line (2 bits); the 3-bit tag identifies which of the blocks that map to that line is currently stored there.
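The address split for the example above can be sketched as follows (the function name is illustrative; the bit widths come from the 256-word memory, 4-word blocks, and 8 cache lines):

```python
# Split an 8-bit word address: 2 offset bits, 3 line bits, 3 tag bits.
def split_direct(addr):
    word = addr & 0b11           # word offset within the block
    line = (addr >> 2) & 0b111   # cache line = block number mod 8
    tag = addr >> 5              # remaining high-order bits
    return tag, line, word

# Word W37 lives in block 37 // 4 = 9, which maps to line 9 mod 8 = 1.
print(split_direct(37))  # (1, 1, 1)
```

Because the line number is just the low bits of the block number, no search is needed: the hardware indexes the line directly and only compares the stored tag.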
Associative mapping
In associative mapping, any block of main memory can be placed in any line of the cache:
L0 <- any of B0, B1, ..., B63
L1 <- any of B0, B1, ..., B63
...
L7 <- any of B0, B1, ..., B63
(Main memory blocks: B0 = W0-W3, B1 = W4-W7, ..., B63 = W252-W255.)
To access the data in the cache, the CPU generates an 8-bit logical address:
| Tag: 6 bits | Word: 2 bits |
The 6-bit tag (the block number) is compared in parallel against the tags of all cache lines; the 2-bit field selects the word within the matching line.
Disadvantages:-
Expensive (parallel comparison hardware is needed to speed up the search).
A replacement algorithm is needed to choose which line to evict.
Set-associative mapping
Cache lines are grouped into sets of K lines each. A block maps to exactly one set (set number = block number mod number of sets) but may be placed in any of the K lines of that set. The CPU address is accordingly divided into a tag, a set index, and a word offset.
Note:
If K = 1, it becomes direct mapping.
If K equals the total number of lines, it becomes fully associative mapping.
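The note above can be verified with a two-line sketch (the function name is illustrative):

```python
# K-way set-associative placement: a block maps to set (block mod number of sets).
def cache_set(block, lines, k):
    sets = lines // k   # K lines per set
    return block % sets

# With 8 lines: K=1 behaves like direct mapping, K=8 like fully associative.
print(cache_set(13, lines=8, k=1))  # 5  (same as 13 mod 8: direct mapping)
print(cache_set(13, lines=8, k=8))  # 0  (a single set: any line may hold the block)
```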
Virtual Memory
Virtual memory is a memory management technique that allows the illusion of a larger memory space than physically
available.
It separates logical memory from physical memory, assigning each
program its own virtual address space.
Page tables are used to map virtual addresses to physical addresses,
with memory divided into fixed-size blocks called pages.
When a program accesses a page not in physical memory, a page
fault occurs, triggering the retrieval of the required page from
secondary storage.
Demand paging brings pages into memory only when needed,
optimizing memory utilization.
Swapping is a memory management technique where entire
processes are moved between RAM and disk.
Processes are swapped out from RAM to free up memory and
stored in the swap space on disk.
When needed, swapped-out processes are swapped back into RAM from the swap space, called swap in.
Swap space acts as a backing store, providing additional storage when RAM is insufficient.
Page replacement algorithms like FIFO and LRU decide which pages to evict from memory when it's full.
Drawbacks include overhead from page faults and complexity.
Example:- The virtual address space is 4 GB and page size is 128 KB. Given that the memory address space is of 512 MB.
What is the size of frame and bits required to represent virtual addresses and physical addresses.
Solution:
1. Virtual Address Space (VAS):
VAS = 4 GB = 4 × 2^30 bytes = 2^32 bytes.
Page size = 128 KB = 128 × 2^10 bytes = 2^17 bytes.
Number of pages = VAS / page size = 2^32 / 2^17 = 2^15.
Virtual address bits = 32 (since VAS = 2^32 bytes).
2. Physical Address Space (PAS):
PAS = 512 MB = 512 × 2^20 bytes = 2^29 bytes.
Frame size = page size = 128 KB = 2^17 bytes.
Number of frames = PAS / frame size = 2^29 / 2^17 = 2^12.
Physical address bits = 29 (since PAS = 2^29 bytes).
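The same arithmetic can be checked in a few lines (the constants mirror the example above):

```python
# Address-size arithmetic for the virtual memory example (sizes in bytes).
GB, MB, KB = 2**30, 2**20, 2**10

vas, pas, page = 4 * GB, 512 * MB, 128 * KB
pages = vas // page    # number of virtual pages
frames = pas // page   # number of physical frames (frame size = page size)

print(pages, frames, vas.bit_length() - 1, pas.bit_length() - 1)
# 32768 4096 32 29  -> 2^15 pages, 2^12 frames, 32-bit virtual, 29-bit physical
```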
Example: Suppose we have a system with 3 frames (physical memory) and a sequence of page accesses:
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
LRU page replacement (F = fault, H = hit):
Ref:  1  2  3  4  1  2  5  1  2  3  4  5
F1:   1  1  1  4  4  4  5  5  5  3  3  3
F2:      2  2  2  1  1  1  1  1  1  4  4
F3:         3  3  3  2  2  2  2  2  2  5
      F  F  F  F  F  F  F  H  H  F  F  F
With 3 frames, the given reference string under the LRU page replacement algorithm yields a total of 10 page faults.
Under the LFU page replacement algorithm, the same reference string also yields a total of 10 page faults.
Optimal page replacement (F = fault, H = hit):
Ref:  1  2  3  4  1  2  5  1  2  3  4  5
F1:   1  1  1  1  1  1  1  1  1  3  3  3
F2:      2  2  2  2  2  2  2  2  2  4  4
F3:         3  4  4  4  5  5  5  5  5  5
      F  F  F  F  H  H  F  H  H  F  F  H
With 3 frames, the given reference string under the Optimal page replacement algorithm yields a total of 7 page faults.
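The fault counts for the reference string above can be reproduced with a small simulation (function names are illustrative; this is a sketch, not an OS implementation):

```python
def lru_faults(refs, frames):
    """Count page faults under LRU: evict the least recently used page."""
    mem, faults = [], 0
    for p in refs:
        if p in mem:
            mem.remove(p)        # refresh recency on a hit
        else:
            faults += 1
            if len(mem) == frames:
                mem.pop(0)       # front of the list = least recently used
        mem.append(p)
    return faults

def optimal_faults(refs, frames):
    """Count page faults under Optimal: evict the page used furthest in the future."""
    mem, faults = [], 0
    for i, p in enumerate(refs):
        if p in mem:
            continue
        faults += 1
        if len(mem) < frames:
            mem.append(p)
        else:
            future = refs[i + 1:]
            # Pages never used again rank past the end of the future list.
            victim = max(mem, key=lambda q: future.index(q) if q in future
                         else len(future) + 1)
            mem[mem.index(victim)] = p
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 3), optimal_faults(refs, 3))  # 10 7
```

Optimal needs knowledge of future references, so it is unrealizable in practice; it serves as the lower bound that LRU and LFU are compared against.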
IO Organization
Input/Output Subsystem:-
The I/O subsystem of a computer provides an efficient mode of communication between the central system and the
outside environment.
It handles all the input output operations of the computer system.
Peripheral Devices:
Input or output devices that are connected to computer are called peripheral devices.
These devices are designed to read information into or out of the memory unit upon command from the CPU and are
considered to be the part of computer system.
These devices are also called peripherals.
For example: Keyboards, Mouse,Scanner and printers are common peripheral devices.
Interfaces
An interface is a shared boundary between two separate components of the computer system, used to attach two or more components to the system for communication purposes.
There are two types of interface:
1. CPU Interface
2. I/O Interface
Input-Output Interface
1. Peripherals connected to a computer need special communication links for interfacing with CPU.
2. In computer system, there are special hardware components between the CPU and peripherals to control or manage the
input-output transfers.
3. These components are called input-output interface units because they provide communication links between processor
bus and peripherals.
4. They provide a method for transferring information between internal system and input-output devices.
5. The Input/Output interface is required because many differences exist between the central computer and each peripheral while transferring information.
Some major differences are:
a) Peripherals are electromechanical and electromagnetic devices, and their manner of operation differs from that of the CPU and memory, which are electronic devices. Therefore, a conversion of signal values may be required.
b) The data transfer rate of peripherals is usually slower than the transfer rate of CPU, and consequently a synchronisation
mechanism is needed.
c) Data codes and formats in peripherals differ from the word format in the CPU and Memory.
d) The operating modes of peripherals differ from each other, and each must be controlled so as not to disturb the operation of the other peripherals connected to the CPU.
There are three ways that computer buses can be used to communicate with memory and I/O:
1. Use two separate buses, one for memory and the other for I/O.
2. Use one common bus for both memory and I/O but have separate control lines for each.
3. Use one common bus for memory and I/O with common control lines.
1. Programmed I/O
1. In this mode, the CPU is responsible for all data transfers between the I/O device and memory.
2. The CPU continuously checks the status of the I/O device to see if it is ready for data transfer.
3. This process is called polling.
4. The CPU waits in a loop until the I/O device signals that it is ready.
5. Once ready, the CPU executes a data transfer instruction to read
from or write to the I/O device.
Advantages:
Simple to implement.
Direct control over I/O devices.
Disadvantages:
CPU time is wasted in constantly checking the device status
(polling).
Inefficient as the CPU cannot perform other tasks during this
waiting period.
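The polling loop described above can be sketched as follows (the Device class and its fields are invented purely for illustration):

```python
# Busy-wait polling sketch: the CPU reads a status flag in a loop and only
# transfers data once the (simulated) device reports ready.
class Device:
    def __init__(self, delay, data):
        self.delay = delay       # polls needed before the device is ready
        self.data = data

    def status_ready(self):
        self.delay -= 1
        return self.delay <= 0

def programmed_io_read(dev):
    polls = 0
    while not dev.status_ready():   # the CPU does no useful work here
        polls += 1
    return dev.data, polls          # ready: perform the actual transfer

data, polls = programmed_io_read(Device(delay=4, data=0x5A))
print(hex(data), polls)  # 0x5a 3
```

Every iteration of the while loop is a wasted CPU cycle, which is exactly the inefficiency that interrupt-driven I/O (next section) removes.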
2. Interrupt-Driven I/O
1. In this mode, the CPU is interrupted by the I/O device when the device is ready for data transfer.
2. The CPU issues a command to the I/O device and continues with
other tasks.
3. When the I/O device is ready (e.g., it has finished reading or
writing), it sends an interrupt signal to the CPU.
4. The CPU stops its current task, saves its state, and executes an
interrupt service routine (ISR) to handle the I/O transfer.
5. After the I/O operation is complete, the CPU resumes its previous
task.
Advantages:
More efficient than programmed I/O as the CPU can perform
other tasks while waiting for the I/O operation.
Reduces CPU idle time.
Disadvantages:
More complex to implement compared to programmed I/O.
May introduce interrupt latency if multiple interrupts occur.
Memory-Mapped I/O
1. In this mode, I/O devices are treated as if they are part of the system memory, and specific memory addresses are
assigned to each I/O device.
2. The CPU uses regular memory instructions (like LOAD and STORE) to read or write data to the I/O device addresses.
3. No special I/O instructions are needed.
4. This allows I/O devices to be controlled and accessed just like memory.
5. A single set of read/write control lines is used (no distinction between memory and I/O transfers).
6. Memory and I/O addresses share a common address space, which reduces the memory address range available.
7. No specific input or output instructions are needed, so the same memory-reference instructions can be used for I/O transfers, giving considerable flexibility in handling I/O operations.
Advantages:
Simplifies the I/O access mechanism as the same instructions can be used for both memory and I/O.
Easier to program and optimize.
Disadvantages:
Consumes a part of the memory address space for I/O operations.
May lead to address conflicts.
Applications:
Common in microcontrollers and simpler computer architectures.
Isolated (I/O-Mapped) I/O
In this mode, I/O devices have their own address space, separate from memory, and are accessed with dedicated I/O instructions.
Advantages:
No conflict between memory and I/O addresses.
Efficient use of memory address space.
Disadvantages:
Requires additional instructions for I/O operations.
May require complex programming.
Modes of Transfer
Mode                       | CPU Involvement               | Efficiency        | Use Case
Programmed I/O             | High (polling-based)          | Low (CPU is idle) | Simple, low-speed devices
Interrupt-Driven I/O       | Moderate (interrupt handling) | Moderate to high  | Moderate-speed devices
Direct Memory Access (DMA) | Low (only initialization)     | High              | High-speed devices