COA MODULE 1
Input Unit
Computers accept coded information through input units. The most common input device is the keyboard. Whenever a key is pressed, the corresponding letter or digit is automatically translated into its binary code and transmitted to the processor. Many other kinds of input devices for human-computer interaction are available, including the touchpad, mouse, joystick, and trackball. These are often used as graphic input devices in conjunction with displays. Microphones can be used to capture audio input, which is then sampled and converted into digital codes for storage and processing. Similarly, cameras can be used to capture video input. Digital communication facilities, such as the Internet, can also provide input to a computer from other computers and database servers.
Central Processing Unit (CPU): Once the information is entered into the computer by the input device, the processor processes it. The CPU is called the brain of the computer because it is the control center of the computer. It first fetches instructions from memory and then interprets them so as to know what is to be done. If required, data is fetched from memory or from an input device. Thereafter the CPU executes or performs the required computation and then either stores the output or displays it on the output device. The CPU has three main components which are responsible for different functions: the Arithmetic Logic Unit (ALU), the Control Unit (CU), and memory registers.
Arithmetic and Logic Unit (ALU): The ALU, as its name suggests, performs mathematical calculations and takes logical decisions. Arithmetic calculations include addition, subtraction, multiplication, and division. Logical decisions involve comparison of two data items to see which one is larger, smaller, or equal.
Memory Unit--The function of the memory unit is to store programs and data. There are two
classes of storage, called primary and secondary.
Primary Memory ---- also called main memory, is a fast memory that operates at electronic speeds. Programs must be stored in this memory while they are being executed. The memory consists of a large number of semiconductor storage cells, each capable of storing one bit of information. These cells are rarely read or written individually. Instead, they are handled in groups of fixed size called words. The memory is organized so that one word can be stored or retrieved in one basic operation. The number of bits in each word is referred to as the word length of the computer, typically 16, 32, or 64 bits. To provide easy access to any word in the memory, a distinct address is associated with each word location. Addresses are consecutive numbers, starting from 0, that identify successive locations. Instructions and data can be written into or read from the memory under the control of the processor. A memory in which any location can be accessed in a short and fixed amount of time after specifying its address is called a random-access memory (RAM). The time required to access one word is called the memory access time. This time is independent of the location of the word being accessed. It typically ranges from a few nanoseconds (ns) to about 100 ns for current RAM units.
Cache Memory--As an adjunct to the main memory, a smaller, faster RAM unit, called a cache, is used to hold sections of a program that are currently being executed, along with any associated data. The cache is tightly coupled with the processor and is usually contained on the same integrated-circuit chip. The purpose of the cache is to facilitate high instruction execution rates. At the start of program execution, the cache is empty. As execution proceeds, instructions are fetched into the processor chip, and a copy of each is placed in the cache. When the execution of an instruction requires data located in the main memory, the data are fetched and copies are also placed in the cache. If these instructions or data are needed again soon, as in a program loop, they can be fetched quickly from the cache rather than from main memory.
Secondary Storage--Although primary memory is essential, it tends to be expensive and does not retain information when power is turned off. Thus additional, less expensive, permanent secondary storage is used when large amounts of data and many programs have to be stored, particularly for information that is accessed infrequently. Access times for secondary storage are longer than for primary memory. The available devices include magnetic disks, optical disks (DVD and CD), and flash memory devices.
Output Unit--The function of the output unit is to send processed results to the outside world. A familiar example of such a device is a printer. Most printers employ either photocopying techniques, as in laser printers, or ink jet streams. Such printers may generate output at speeds of 20 or more pages per minute. However, printers are mechanical devices, and as such are quite slow compared to the electronic speed of a processor. Some units, such as graphic displays, provide both an output function, showing text and graphics, and an input function, through touchscreen capability. The dual role of such units is the reason for using the single name input/output (I/O) unit in many cases.
Control Unit--The memory, arithmetic and logic, and I/O units store and process information and perform input and output operations. The operation of these units must be coordinated in some way. This is the responsibility of the control unit. The control unit is effectively the nerve center that sends control signals to other units and senses their states. I/O transfers, consisting of input and output operations, are controlled by program instructions that identify the devices involved and the information to be transferred. Control circuits are responsible for generating the timing signals that govern the transfers. They determine when a given action is to take place. Data transfers between the processor and the memory are also managed by the control unit through timing signals. A large set of control lines (wires) carries the signals used for timing and synchronization of events in all units.
REGISTERS--An x86 (IA-32) processor, for example, provides 16 registers for use in general system and application programming. These registers can be grouped as follows:
• General-purpose data registers. These eight registers are available for storing operands and pointers.
• Segment registers. These registers hold up to six segment selectors.
• Status and control registers. These registers report and allow modification of the state of the processor and of the program being executed.
1. Registers
Registers are small, fast storage locations within the CPU, used to store temporary data and
instructions during processing. They are an essential part of a CPU’s ISA and are closer to the ALU
(Arithmetic Logic Unit) than main memory, which allows for rapid data access.
• General-Purpose Registers (GPRs): Used to store temporary data for computation. Some architectures have a small number of GPRs, while others have dozens.
• Special-Purpose Registers: Include the program counter (PC), stack pointer (SP), and instruction register (IR), among others.
o Program Counter (PC): Holds the memory address of the next instruction to be
executed.
o Stack Pointer (SP): Points to the top of the stack, which is used for managing
function calls and local variables.
o Status Register/Flags Register: Holds flags that provide information about the
outcome of operations, such as zero, carry, overflow, and sign flags.
Registers are crucial for performance since they allow the CPU to store and access data extremely
quickly compared to memory access.
2. Instruction Execution Cycle
The instruction execution cycle (or fetch-decode-execute cycle) is the fundamental process by which
a CPU executes instructions. It consists of a series of steps that are repeated for each instruction in a
program.
1. Fetch: The CPU retrieves the instruction from memory at the address pointed to by the
Program Counter (PC). It places the instruction in the Instruction Register (IR) and
increments the PC.
2. Decode: The control unit decodes the instruction in the IR to determine which operation to
perform and the operands needed.
3. Execute: The decoded instruction is executed by the CPU. This might involve arithmetic or
logical operations, memory access, or control instructions.
4. Writeback: Any results generated by the operation are written back to the appropriate
destination, such as a register or memory location.
Each of these steps requires different parts of the CPU, such as the ALU, registers, and control unit.
This cycle repeats for each instruction, allowing the CPU to carry out the instructions in a program
sequentially (or out-of-order in modern processors with optimized pipelines).
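To make the cycle concrete, here is a minimal Python sketch of a fetch-decode-execute loop. The three-instruction program, the register file, and the tuple-based instruction format are all invented for illustration; real CPUs decode binary-encoded instructions in hardware.

    # Minimal fetch-decode-execute loop over a toy instruction format.
    # Each instruction is a tuple: (opcode, dest, src1, src2).
    memory = [
        ("ADD", 0, 1, 2),    # R0 <- R1 + R2
        ("SUB", 3, 0, 1),    # R3 <- R0 - R1
        ("HALT", 0, 0, 0),
    ]
    regs = [0, 7, 5, 0]      # R0..R3 with sample initial values
    pc = 0                   # Program Counter

    while True:
        ir = memory[pc]      # Fetch: the instruction at the address in PC goes to IR
        pc += 1              # ...and the PC is incremented
        op, d, s1, s2 = ir   # Decode: extract opcode and operand fields
        if op == "HALT":
            break
        elif op == "ADD":    # Execute and write back the result
            regs[d] = regs[s1] + regs[s2]
        elif op == "SUB":
            regs[d] = regs[s1] - regs[s2]

    print(regs)              # [12, 7, 5, 5]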
Register Transfer Language (RTL) is a symbolic notation used to describe the low-level operations
performed by the CPU as it executes instructions. RTL describes how data moves between registers,
memory, and the ALU, and is particularly useful for understanding the exact operations at each stage
of the instruction cycle.
For example:
An instruction like ADD R1, R2, R3 (add contents of R2 and R3, store in R1) might have the
following RTL interpretation:
o R1 ← R2 + R3
RTL statements describe the transfers and transformations on data at the register level, revealing
what each part of the CPU does for a given instruction. RTL helps computer architects understand
and design the internal data flow of the CPU, providing a foundation for hardware implementation
and optimization.
Digital systems are composed of modules that are constructed from digital components, such as registers, decoders, arithmetic elements, and control logic. The modules are interconnected with common data and control paths to form a digital computer system. The operations executed on data stored in registers are called microoperations; some of the digital components mentioned above are registers that implement such microoperations. A programming language is a procedure for writing symbols to specify a given computational process. Register transfer language defines symbols for the various types of microoperations and describes the associated hardware that can implement them.
Register Transfer--Computer registers are designated by capital letters that denote their function. The register that holds an address for the memory unit is called MAR, the program counter register is called PC, IR is the instruction register, and R1 is a processor register. The individual flip-flops in an n-bit register are numbered in sequence from 0 to n-1.
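As a concrete illustration of this notation, the following Python sketch mimics the register-transfer statements for an instruction fetch. The memory contents are invented; the register names follow the text.

    # Register-transfer view of a fetch: MAR <- PC, IR <- M[MAR], PC <- PC + 1.
    M = {0: 0b1011, 1: 0b0110}   # a tiny memory, one word per address (contents invented)
    PC = 0                        # program counter
    MAR = PC                      # MAR <- PC      (address transferred to MAR)
    IR = M[MAR]                   # IR <- M[MAR]   (memory word read into IR)
    PC = PC + 1                   # PC <- PC + 1   (point to the next word)
    print(bin(IR), PC)            # 0b1011 1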
4. Addressing Modes
Addressing modes specify how the CPU should interpret the operands of an instruction. They
determine how the CPU locates data in memory or registers, offering flexibility and efficiency in
programming. Common addressing modes include:
1. Immediate Addressing: The operand is specified directly within the instruction itself. Useful for constants.
o Example: MOV R1, #5 (load the constant 5 into R1).
2. Direct Addressing: The operand is located at a specific memory address, which is provided
within the instruction.
o Example: MOV R1, 0x1000 (move the data at address 0x1000 into R1).
3. Indirect Addressing: The address of the operand is specified in a register, which indirectly
points to the actual data location.
o Example: MOV R1, [R2] (move data at the address stored in R2 into R1).
4. Register Addressing: The operand is held in a CPU register that is named in the instruction.
o Example: ADD R1, R2, R3 (add values in R2 and R3, store in R1).
5. Indexed Addressing: Combines a base address (often stored in a register) with an offset to
calculate the operand’s location. This is commonly used for array access.
o Example: MOV R1, 0x1000[R2] (move data from address 0x1000 + R2 into R1).
6. Relative Addressing: The address is specified relative to the Program Counter. Often used in
branch instructions.
Addressing modes enable efficient data access and manipulation, allowing flexible instruction
encoding and reducing memory access times in complex programs.
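The differences between these modes can be summarized in a small Python sketch that resolves a source operand under each mode; the register and memory contents here are hypothetical.

    # Resolving a source operand under different addressing modes (toy machine state).
    regs = {"R2": 0x0004}
    mem = {0x1000: 42, 0x0004: 99, 0x1004: 7}

    def operand(mode, value):
        if mode == "immediate":        # the value itself is the operand
            return value
        if mode == "direct":           # the value is a memory address
            return mem[value]
        if mode == "register":         # the value names a register
            return regs[value]
        if mode == "indirect":         # a register holds the operand's address
            return mem[regs[value]]
        if mode == "indexed":          # base address plus a register offset
            base, reg = value
            return mem[base + regs[reg]]

    print(operand("immediate", 42))            # 42
    print(operand("direct", 0x1000))           # 42
    print(operand("indirect", "R2"))           # 99
    print(operand("indexed", (0x1000, "R2")))  # 7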
5. Instruction Set
The instruction set of a CPU refers to the set of all operations that the CPU can perform. It forms the
basis of programming for that CPU and includes the types of instructions it supports. A CPU’s
instruction set can typically be divided into several categories:
1. Data Movement Instructions: For moving data between registers, memory, and other storage locations.
2. Arithmetic Instructions: For performing arithmetic operations such as addition, subtraction, multiplication, and division.
o Examples: ADD, SUB, MUL, DIV
3. Logical Instructions: For performing logical operations like AND, OR, XOR, and NOT.
4. Control Instructions: For directing the flow of execution. This includes jumps, branches, and
function calls.
o Examples: JMP, CALL, RET, BZ (branch if zero), BNZ (branch if not zero)
5. Bit Manipulation Instructions: For bit-level operations such as shifting and rotating.
o Examples: SHL (shift left), SHR (shift right), ROL (rotate left)
6. System Instructions: For special-purpose tasks, such as changing CPU mode or managing memory.
1. x86 Architecture
The x86 architecture is one of the oldest and most widely used instruction sets, developed by Intel in the late 1970s. It is based on complex instruction set computing (CISC), which provides a wide variety of instructions that can perform multiple steps within a single instruction. The x86 ISA has evolved through numerous versions (e.g., x86, x86-32, x86-64), adding new features and extending the instruction set for modern computing needs.
• CISC design philosophy: The x86 ISA includes many instructions that perform complex operations, allowing for compact code but requiring more complex decoding. Instructions can vary in length from 1 to 15 bytes, enabling a rich set of operations but making pipelining and decoding more challenging.
• Data movement instructions: x86 provides multiple ways to move data between registers, memory, and the stack, including MOV for general data transfer and specialized instructions like PUSH and POP for stack management.
• Arithmetic and logic instructions: Instructions include basic operations like ADD, SUB, and MUL, as well as advanced operations such as IMUL (integer multiplication) and DIV for division. x86 also supports bitwise operations (AND, OR, XOR) and shifts (SHL, SHR).
• Control flow instructions: The x86 ISA includes jump and branch instructions for altering program flow: JMP for unconditional jumps, JZ/JNZ for conditional jumps based on zero/non-zero flags, and CALL/RET for function calls. Branch prediction is essential for high performance due to the varied length and complexity of instructions.
• SIMD extensions (MMX, SSE, AVX): Over the years, x86 has added SIMD instructions to handle multimedia processing more efficiently. These include MMX (for integer operations), SSE (for floating-point), and AVX (for larger registers and parallelism). AVX-512, a recent addition, offers 512-bit SIMD operations.
Due to its flexibility and backward compatibility, x86 is widely used in desktop, laptop, and server processors. It provides a broad instruction set, which can make it more power-hungry but versatile. Intel and AMD continually optimize the x86 design to enhance performance for complex tasks like gaming, data processing, and software development.
2. ARM Architecture
ARM architecture is based on the reduced instruction set computing (RISC) philosophy. It was developed by Arm Holdings and is widely used in mobile devices, embedded systems, and IoT devices due to its power efficiency and straightforward design. The ARM ISA comes in multiple versions, including ARMv7 (32-bit) and ARMv8 (64-bit), and recent iterations (like ARMv9) continue to enhance performance and security features.
• RISC design philosophy: ARM uses a simplified, fixed-length 32-bit instruction format (16-bit for Thumb instructions), leading to easier decoding and efficient pipelining. RISC architecture generally has fewer, simpler instructions, reducing power consumption and increasing processing efficiency.
• Data movement instructions: ARM instructions include LDR (load), STR (store), and MOV (move) for data handling. ARM employs a load-store architecture, meaning data is moved to/from registers before performing arithmetic, reducing direct memory manipulation.
• Arithmetic and logic instructions: ARM supports basic arithmetic (ADD, SUB) and logic (AND, ORR, EOR) instructions. Multiplication and division are handled by MUL and DIV. ARM also supports conditionally executed instructions, which allow certain instructions to execute based on the status flags, reducing the need for branching.
• Branching and control flow instructions: ARM includes both unconditional (B, BL) and conditional (BEQ, BNE) branch instructions. The conditional instructions rely on the condition codes, which are typically set by arithmetic and logic operations. ARM's use of conditional execution minimizes the performance cost of branches.
• SIMD and vector processing: ARM's NEON technology provides SIMD processing capabilities, allowing for parallel processing of multimedia tasks, such as image processing and cryptographic operations, making it suitable for high-performance mobile applications.
• Power efficiency and Thumb instructions: ARM introduced the Thumb instruction set, which uses 16-bit instructions to reduce code size, enhancing efficiency for embedded systems. ARM processors use power-efficient designs, making ARM a popular choice for battery-powered devices.
ARM's power efficiency and scalable design make it a staple in smartphones, tablets, and embedded systems. Companies like Apple (with its A-series and M-series chips), Qualcomm, and Samsung utilize customized ARM architectures to provide high-performance, low-power CPUs for mobile and consumer electronics markets. ARM's flexibility allows it to balance performance and efficiency for both general-purpose and specialized computing.
3. MIPS Architecture
• RISC design philosophy: Like ARM, MIPS follows the RISC design with a small, simple, and consistent set of instructions, aiming for one instruction per cycle. All instructions are 32 bits in length, simplifying decoding and pipeline design.
• Data movement instructions: MIPS uses a load-store model, with LW (load word) and SW (store word) as the primary instructions for moving data between registers and memory. It also supports direct register-to-register data transfer (MOVE).
• Arithmetic and logical instructions: MIPS provides basic arithmetic operations (ADD, SUB) and logical operations (AND, OR, XOR). MIPS instructions are register-based, meaning that arithmetic operations typically use registers as both source and destination.
• Branching and control flow instructions: MIPS uses simple branching instructions like BEQ (branch if equal) and BNE (branch if not equal) for decision-making, with J for unconditional jumps. MIPS also has JAL (jump and link) for function calls, which saves the return address (the current PC value) in a register.
• Load-store architecture: MIPS adheres to the load-store model strictly, with arithmetic and logical operations only operating on registers, not directly on memory.
The simplicity and efficiency of MIPS make it popular in embedded systems, routers, and academic use. Due to its straightforward design, MIPS is often used in networking equipment and IoT devices. The architecture's predictability and ease of understanding make it a prime choice for education, where it serves as a model to teach fundamental computer architecture concepts.
Feature              | x86 (Intel/AMD)     | ARM              | MIPS
Design philosophy    | CISC                | RISC             | RISC
SIMD support         | Yes (MMX, SSE, AVX) | Yes (NEON)       | Limited (MIPS-3D)
Typical applications | Desktops, servers   | Mobile, embedded | Embedded, education
Data representation: signed number representation, fixed and floating point representations,
character representation
Signed number representation allows computers to encode both positive and negative values. Since
binary format only naturally represents positive numbers, specific techniques are needed to denote
negative values.
Sign-and-Magnitude Representation:
o In this method, the leftmost bit (most significant bit) is used as the sign bit, with 0
representing positive and 1 representing negative. The remaining bits represent the
magnitude (absolute value) of the number.
o Example: In an 8-bit format, 00000101 represents +5, while 10000101 represents -5.
One's Complement Representation:
o Here, negative numbers are represented by inverting all bits of their positive counterparts. The leftmost bit is still a sign bit (0 for positive, 1 for negative).
o Example: In an 8-bit format, +5 is 00000101 and -5 is 11111010.
Two's Complement Representation:
o In two's complement, negative numbers are represented by inverting all bits of the positive number and then adding one to the result. The leftmost bit acts as the sign bit (0 for positive, 1 for negative).
o Example: To represent -5, start with the binary for 5 (00000101), invert it
(11111010), and add one to get 11111011.
o Advantage: Two’s complement has only one representation for zero and simplifies
arithmetic operations since addition and subtraction work directly with binary
values.
Two’s complement is the most widely used method for representing signed integers in modern
computers due to its efficient handling of addition, subtraction, and zero representation.
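The encoding can be verified with a short Python sketch of 8-bit two's complement (the helper names here are ours):

    # 8-bit two's complement: encode a signed value, then decode it back.
    def to_twos(n, bits=8):
        return n & ((1 << bits) - 1)       # wraps negatives modulo 2^bits

    def from_twos(u, bits=8):
        if u & (1 << (bits - 1)):          # sign bit set -> value is negative
            return u - (1 << bits)
        return u

    code = to_twos(-5)
    print(format(code, "08b"))             # 11111011 (matches the example above)
    print(from_twos(code))                 # -5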
Fixed-point and floating-point representation are the two techniques used to represent real (fractional) numbers, allowing computers to work with non-integer values.
Fixed-Point Representation:
Fixed-point representation stores numbers with a specific number of digits (bits) allocated
for the integer part and the fractional part.
In fixed-point binary notation, a designated binary point separates the integer bits from the
fractional bits. The location of the binary point is fixed, meaning the number of fractional
bits is constant.
Example: In an 8-bit fixed-point system with 4 bits for the integer part and 4 bits for the
fractional part, 0010.1100 would represent 2.75 in decimal (binary 0010 is 2, and .1100 is
0.75).
Fixed-point representation is often used in embedded systems where precision and range are
limited and predictable. However, it’s less flexible than floating-point representation when working
with numbers of varying magnitudes.
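A short Python sketch of the 4.4 format used in the example (4 integer bits, 4 fractional bits) shows the idea: the stored value is simply an integer interpreted as the real value divided by 2^4.

    # 4.4 fixed-point: value = stored_integer / 2^4.
    FRAC_BITS = 4
    SCALE = 1 << FRAC_BITS            # 16

    def to_fixed(x):
        return round(x * SCALE)       # 2.75 -> 44

    def from_fixed(f):
        return f / SCALE

    f = to_fixed(2.75)
    print(format(f, "08b"))           # 00101100, i.e. 0010.1100 from the example
    print(from_fixed(f))              # 2.75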
Floating-Point Representation:
The IEEE 754 single-precision (32-bit) and double-precision (64-bit) standards are most
widely adopted. In IEEE 754 single precision:
o Sign Bit: 1 bit for the sign (0 for positive, 1 for negative).
o Exponent: 8 bits for the exponent, stored in "biased" form, where 127 is added to
the exponent to avoid negative values.
o Mantissa: 23 bits for the mantissa (or significand), which represents the precision of
the number.
Example: The decimal value 5.75 is 101.11 in binary, which normalizes to 1.0111 × 2^2. In single precision it is stored as sign = 0, exponent = 2 + 127 = 129 (10000001), and mantissa = 01110000000000000000000.
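These fields can be pulled out of an actual 32-bit encoding with Python's standard struct module, as in this sketch:

    # Inspect the IEEE 754 single-precision fields of 5.75.
    import struct

    bits = struct.unpack(">I", struct.pack(">f", 5.75))[0]  # raw 32-bit pattern
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF       # stored with a bias of 127
    mantissa = bits & 0x7FFFFF           # 23 fraction bits
    print(sign)                          # 0
    print(format(exponent, "08b"))       # 10000001 (129 = 2 + 127)
    print(format(mantissa, "023b"))      # 01110000000000000000000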
Character representation encodes text characters, symbols, and control codes, allowing computers
to store and manipulate human-readable text. The two most commonly used character encoding
standards are ASCII and Unicode.
ASCII is a 7-bit encoding scheme that represents 128 characters, including upper- and
lowercase English letters, digits, punctuation marks, and control characters.
Extended ASCII (8-bit) represents 256 characters, allowing additional symbols, characters,
and graphic symbols used in various languages.
Unicode:
Unicode uses several encoding formats, with UTF-8, UTF-16, and UTF-32 as the most
common. UTF-8 is widely used on the web and is backward-compatible with ASCII, as it uses
1-4 bytes to represent each character.
Unicode enables computers to represent a vast range of characters and symbols from
languages worldwide, as well as emoji and specialized symbols, ensuring broad compatibility
across platforms and applications.
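The variable-length nature of UTF-8 is easy to demonstrate in Python; ASCII characters take one byte, while other characters take two to four:

    # UTF-8 byte lengths: 1 byte for ASCII, up to 4 for other characters.
    for ch in ["A", "é", "€", "😀"]:
        encoded = ch.encode("utf-8")
        print(ch, len(encoded), encoded.hex())
    # A 1 41 / é 2 c3a9 / € 3 e282ac / 😀 4 f09f9880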
Summary
Representation               | Description                                                                       | Typical Uses
Signed Number Representation | Binary representation of positive and negative integers (e.g., two's complement) | Integer arithmetic in computer programs
Character Representation     | Encoding of text characters using ASCII or Unicode                               | Text processing, web applications, data storage
Each data representation method is chosen based on the specific requirements for range, precision,
storage efficiency, and compatibility with applications, contributing to the efficient handling of
different types of data in computing systems.
Computer arithmetic – integer addition and subtraction, ripple carry adder, carry look-ahead adder.
Computers perform integer arithmetic using binary numbers, specifically two’s complement
representation for signed integers. In two’s complement, positive and negative numbers are
represented in a way that simplifies addition and subtraction, enabling the same circuitry to handle
both operations without distinguishing between positive and negative signs explicitly.
Binary Addition:
Binary addition operates similarly to decimal addition, with bitwise addition using the following basic
rules:
0+0=0
1+0=1
0+1=1
1 + 1 = 0 with a carry of 1
In cases where two bits add up to 2 in binary (i.e., 10), the 0 is placed in the sum, and 1 is carried to
the next higher bit. This carry propagation can potentially slow down the operation, especially for
large binary numbers.
Binary Subtraction:
Binary subtraction is typically performed by adding the two's complement of the subtrahend (the number being subtracted) to the minuend (the number being subtracted from). To find the two's complement, invert all bits of the subtrahend and add 1.
Example: To compute 5 - 3 in 4 bits, the two's complement of 0011 (3) is 1101. Add 0101 and 1101 to obtain 1 0010, where the leftmost 1 is discarded in a fixed-bit system, yielding 0010 (binary 2), which is the correct result.
A ripple carry adder is a simple, basic circuit used to perform binary addition on two binary numbers.
It consists of a series of full adders connected in sequence, where each full adder adds a single bit
from each number, taking into account the carry bit from the previous stage.
This process continues until the final full adder produces the sum and final carry-out for the
most significant bit.
For example, adding two 4-bit binary numbers 1101 and 1011:
1. The least significant bits (1 and 1) are added first, producing a sum bit of 0 and a carry of 1.
2. This carry is added to the next bit, and the process continues until all bits are processed.
Propagation Delay: The primary drawback is the carry propagation delay, where the carry
signal must travel through each full adder in sequence. For an n-bit adder, the delay is
proportional to n.
For large numbers, this delay can be significant, making ripple carry adders slower compared
to other types of adders.
Despite its simplicity, the ripple carry adder is not used in high-performance processors due to this
delay. However, it remains useful in applications where speed is less critical.
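A bit-level Python sketch of a 4-bit ripple carry adder makes the sequential carry chain explicit (bit lists are written least significant bit first):

    # 4-bit ripple carry adder: each full adder waits for the previous carry.
    def full_adder(a, b, cin):
        s = a ^ b ^ cin                     # sum bit
        cout = (a & b) | (cin & (a ^ b))    # carry-out
        return s, cout

    def ripple_add(a_bits, b_bits):         # bits listed LSB first
        carry, result = 0, []
        for a, b in zip(a_bits, b_bits):
            s, carry = full_adder(a, b, carry)
            result.append(s)
        return result, carry

    # 1101 (13) + 1011 (11), written LSB first:
    s, c = ripple_add([1, 0, 1, 1], [1, 1, 0, 1])
    print(s[::-1], c)                       # [1, 0, 0, 0] with carry-out 1 -> 11000 = 24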
A carry look-ahead adder (CLA) addresses the speed limitations of the ripple carry adder by reducing
the carry propagation delay. Instead of waiting for the carry to ripple through each bit position, the
CLA adder uses logic to calculate the carry bits in advance.
The carry look-ahead adder uses two main functions for each bit:
1. Generate (G): A bit position generates a carry if both bits being added are 1.
2. Propagate (P): A bit position propagates a carry from a previous bit if at least one of the bits
is 1.
The carry-out for each bit can then be calculated based on the generate and propagate functions:
Carry-Out Formula: C_{i+1} = G_i + (P_i · C_i)
Using these formulas, a carry look-ahead adder can compute the carry for each bit position
independently and simultaneously, drastically reducing the delay caused by carry propagation.
For a 4-bit adder, the carry-out at each bit position expands to:
C_1 = G_0 + (P_0 · C_0)
C_2 = G_1 + (P_1 · G_0) + (P_1 · P_0 · C_0)
C_3 = G_2 + (P_2 · G_1) + (P_2 · P_1 · G_0) + (P_2 · P_1 · P_0 · C_0)
C_4 = G_3 + (P_3 · G_2) + (P_3 · P_2 · G_1) + (P_3 · P_2 · P_1 · G_0) + (P_3 · P_2 · P_1 · P_0 · C_0)
Because every carry is expressed directly in terms of the generate and propagate signals and the initial carry C_0, the result is an addition that is much faster, as all carries are determined in parallel.
Reduced Delay: By calculating carries in parallel, carry look-ahead adders are significantly
faster than ripple carry adders, especially as the number of bits grows.
Scalability: Though CLAs are faster for larger bit lengths, they become more complex with
higher bit counts due to the increasing number of gates required.
For high-speed applications, such as in ALUs (Arithmetic Logic Units) of modern CPUs, CLAs are
commonly used because they improve the speed of addition operations by minimizing carry
propagation time.
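The carry equations above can be checked with a small Python sketch. The loop below evaluates them one after another only for convenience; the hardware expands each carry into a sum-of-products over G, P, and C_0 so that all carries settle in a fixed, small number of gate delays.

    # Carry look-ahead terms for a 4-bit addition (bits listed LSB first).
    def cla_carries(a_bits, b_bits, c0=0):
        G = [a & b for a, b in zip(a_bits, b_bits)]      # generate: both bits are 1
        P = [a | b for a, b in zip(a_bits, b_bits)]      # propagate: at least one bit is 1
        carries = [c0]
        for i in range(len(G)):
            carries.append(G[i] | (P[i] & carries[i]))   # C(i+1) = Gi + Pi*Ci
        return carries

    # Same operands as the ripple carry example, 1101 + 1011:
    print(cla_carries([1, 0, 1, 1], [1, 1, 0, 1]))       # [0, 1, 1, 1, 1]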
Adder                  | Approach                                          | Advantages                 | Disadvantages
Ripple Carry Adder     | Sequential adder with carry bits propagating sequentially | Simple design, fewer gates | Slow due to carry propagation delay
Carry Look-Ahead Adder | Parallel computation of carry bits for faster addition    | Faster for large bit numbers | More complex, more gates required
Multiplication – shift-and-add, Booth multiplier, carry save multiplier, etc. Division – restoring and non-restoring techniques, floating point arithmetic
Multiplication Techniques
In computer arithmetic, multiplication is more complex than addition due to the need for both
shifting and adding partial products. Several techniques have been developed to optimize this
process.
1. Shift-and-Add Multiplication:
o A simple method similar to manual multiplication, where the multiplier's bits are evaluated from least significant to most significant.
o For each 1 bit in the multiplier, the multiplicand is shifted and added to a cumulative sum (a sketch of this method follows this list).
2. Booth's Multiplier:
o Handles signed (two's complement) operands directly and reduces computation by recoding runs of 1s in the multiplier, replacing a sequence of additions with one addition and one subtraction.
3. Carry-Save Multiplier:
o Adds the partial products using carry-save adders, which defer carry propagation by keeping separate sum and carry vectors at each stage.
o After all partial products are calculated, a final addition resolves the carries. This technique is particularly useful in high-speed multipliers and in ALUs for faster performance.
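Here is a minimal Python sketch of the shift-and-add method for unsigned operands (Booth's and carry-save multipliers refine this basic loop):

    # Shift-and-add multiplication of unsigned integers.
    def shift_and_add(multiplicand, multiplier):
        product = 0
        while multiplier:
            if multiplier & 1:              # current multiplier bit is 1
                product += multiplicand     # add the (shifted) multiplicand
            multiplicand <<= 1              # shift the multiplicand one place left
            multiplier >>= 1                # move to the next multiplier bit
        return product

    print(shift_and_add(13, 11))            # 143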
Division Techniques
Division is more complex than multiplication and requires specific algorithms for efficient
computation. There are two main techniques used in computer systems:
1. Restoring Division:
o In each step, the remainder is shifted left, the next dividend bit is brought down, and the divisor is subtracted from the result (a sketch follows this list).
o If the result is positive, a 1 is placed in the quotient, and the remainder is updated.
o If the result is negative, the previous value is "restored" by adding back the divisor, and a 0 is placed in the quotient.
o It is simple to implement but can be slower because it requires restoring the value when the subtraction is negative.
2. Non-Restoring Division:
o Similar to restoring division but avoids restoring the remainder when a subtraction
results in a negative value.
o Instead, it keeps track of the sign and adds the divisor in the next step if the previous
subtraction was negative.
o This method eliminates the restoring step, making it faster in some cases and is
often used in digital hardware implementations.
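A Python sketch of restoring division for unsigned integers shows the trial-subtract-and-restore step; the bit width is a parameter of this toy version.

    # Restoring division of unsigned integers (sketch).
    def restoring_divide(dividend, divisor, bits=8):
        remainder, quotient = 0, 0
        for i in range(bits - 1, -1, -1):
            remainder = (remainder << 1) | ((dividend >> i) & 1)  # bring down next bit
            remainder -= divisor                                  # trial subtraction
            if remainder < 0:
                remainder += divisor            # restore: the subtraction went negative
                quotient = (quotient << 1) | 0
            else:
                quotient = (quotient << 1) | 1
        return quotient, remainder

    print(restoring_divide(143, 11))            # (13, 0)

Non-restoring division replaces this restore-and-resubtract pattern with an add of the divisor in the following iteration.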
Floating-point arithmetic allows computers to handle real numbers with fractional components and
is standardized by the IEEE 754 format.
Representation:
o The 32-bit format includes 1 sign bit, 8 bits for the exponent (biased by 127), and 23
bits for the mantissa (or fraction), which gives it a wide dynamic range.
Operations:
o Floating-point addition and subtraction first align the exponents of the two operands, then add or subtract the significands, and finally normalize and round the result.
o Multiplication adds the exponents (adjusting for the bias) and multiplies the significands; division subtracts the exponents and divides the significands.
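The alignment step can be illustrated conceptually in Python using math.frexp and math.ldexp to split a value into significand and exponent; this is only a sketch on top of Python's own floats, not a bit-level implementation.

    # Conceptual floating-point addition: align exponents, add significands, renormalize.
    import math

    def fp_add(x, y):
        mx, ex = math.frexp(x)             # x = mx * 2**ex with 0.5 <= |mx| < 1
        my, ey = math.frexp(y)
        if ex < ey:                        # align the smaller operand to the larger exponent
            mx, ex = mx / (1 << (ey - ex)), ey
        else:
            my, ey = my / (1 << (ex - ey)), ex
        return math.ldexp(mx + my, ex)     # combine and renormalize

    print(fp_add(5.75, 0.5))               # 6.25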