
COMPUTER ARCHITECTURE AND ORGANIZATION

CHAPTER ONE
COMPUTER
A computer is an electronic device or machine that accepts data, processes it, and gives the desired output.
As an electronic device, a computer consists of electronic components such as resistors, capacitors, etc.
Although for the most part a human being can do whatever a computer can do, the computer, as a machine, is expected to do it faster, more easily, and with greater accuracy.
The computer has found its way into innumerable areas of life and affects many parts of our lives. A computer is faster and more accurate than people are, but it must be given a complete set of instructions that tell it exactly what to do at each step of its operation. This set of instructions, called a program, is prepared by one or more persons for each job the computer is to do. Programs are placed in the computer's memory unit in binary-coded form, with each instruction having a unique code. The computer takes these instruction codes from memory one at a time and performs the operation called for by the code.
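
To make this cycle concrete, here is a minimal Python sketch of a fetch-and-execute loop; the opcodes (1 = LOAD, 2 = ADD, 0 = HALT) and the memory layout are invented purely for illustration, not taken from any real machine.

# Toy fetch-decode-execute loop; opcodes and memory layout are hypothetical.
memory = [1, 10, 2, 11, 0,        # program: LOAD [10]; ADD [11]; HALT
          0, 0, 0, 0, 0, 7, 5]    # data: memory[10] = 7, memory[11] = 5
accumulator = 0
pc = 0                            # program counter

while True:
    opcode = memory[pc]           # fetch the instruction code from memory
    if opcode == 0:               # HALT: stop the machine
        break
    operand_addr = memory[pc + 1]
    if opcode == 1:               # LOAD: copy a memory word into the accumulator
        accumulator = memory[operand_addr]
    elif opcode == 2:             # ADD: add a memory word to the accumulator
        accumulator += memory[operand_addr]
    pc += 2                       # step to the next instruction code

print(accumulator)                # prints 12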
Fundamentals of Computer Design
ENIAC (Electronic Numerical Integrator And Computer), designed and constructed at the University of Pennsylvania, was the world's first general-purpose electronic digital computer. The project was a response to U.S. needs during World War II.
John Mauchly, a professor of electrical engineering at the University of Pennsylvania, and John Eckert, one of his graduate students, proposed to build a general-purpose computer using vacuum tubes for the applications of the Army's Ballistics Research Laboratory (BRL). In 1943, the Army accepted this proposal, and work began on the ENIAC. The resulting machine was enormous, weighing 30 tons, occupying 1500 square feet of floor space, and containing more than 18,000 vacuum tubes; when operating, it consumed 140 kilowatts of power. It was also substantially faster than any electromechanical computer, capable of 5000 additions per second.
The ENIAC was completed in 1946, too late to be used in the war effort. The use of the ENIAC for a purpose other than that for which it was built demonstrated its general-purpose nature. The ENIAC continued to operate under BRL management until 1955, when it was disassembled.
What was the issue with the ENIAC, and how could it be solved?
The task of entering and altering programs for the ENIAC was extremely tedious. The programming process could be easier if the program could be represented in a form suitable for storing in memory alongside the data. Then, a computer could get its instructions by reading them from memory, and a program could be set or altered by setting the values of a portion of memory. This idea is known as the stored-program concept. The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC (Electronic Discrete Variable Computer).
VON NEUMANN’S ARCHITECTURE
In 1946, von Neumann and his colleagues began the design of a new stored-program computer, referred to as the IAS computer, at the Princeton Institute for Advanced Studies. The IAS computer, although not completed until 1952, is the prototype of all subsequent general-purpose computers.
EDVAC
The concept of stored-program computers appeared in 1945, when John von Neumann drafted the first version of the EDVAC (Electronic Discrete Variable Computer). Those ideas have since been the milestones of computers:
 An input device through which data and instructions can be entered
 Storage in which data can be read and written; instructions are like data and reside in the same memory
 A control unit which fetches instructions, decodes, and executes them
 Output devices for the user to access the results
Improved Von Neumann Architecture

HARVARD ARCHITECTURE
The Harvard architecture is a computer architecture with separate storage and signal pathways for instructions and data. It is often contrasted with the von Neumann architecture, where program instructions and data share the same memory and pathways.
The term is often stated as having originated from the Harvard Mark I relay-based computer, which stored instructions on punched tape (24 bits wide) and data in electromechanical counters. These early machines had data storage entirely contained within the central processing unit and provided no access to the instruction storage as data. Programs needed to be loaded by an operator; the processor could not initialize itself.
Harvard and von Neumann architectures are often portrayed as a dichotomy, but the various devices labeled as the former have far more in common with the latter than they do with each other. The term Harvard architecture was coined in the context of microcontroller design, only retrospectively applied to the Harvard machines, and subsequently applied to RISC microprocessors with separated caches.
Modern processors appear to the user to be systems with von Neumann architecture, with the
program code stored in the same main memory as the data. For performance reasons, internally
and largely invisible to the user, most designs have separate processor caches for the instructions
and data, with separate pathways into the processor for each. This is one form of what is known
as the modified Harvard architecture.
Memory details
In a Harvard architecture, there is no need for the two memories to share characteristics. In particular, the word width, timing, implementation technology, and memory address structure can differ. In some systems, instructions for pre-programmed tasks can be stored in read-only memory, while data memory generally requires read-write memory. In some systems, there is much more instruction memory than data memory, so instruction addresses are wider than data addresses.
Contrast with von Neumann architectures
In a system with a pure von Neumann architecture, instructions and data are stored in the same memory, so instructions are fetched over the same data path used to fetch data. This means that the CPU cannot simultaneously read an instruction and read or write data from or to the memory. In a computer using the Harvard architecture, the CPU can both read an instruction and perform a data memory access at the same time, even without a cache. A Harvard architecture computer can thus be faster for a given circuit complexity, because instruction fetches and data accesses do not contend for a single memory pathway.
Also, a Harvard architecture machine has distinct code and data address spaces: instruction address zero is not the same as data address zero. Instruction address zero might identify a twenty-four-bit value, while data address zero might indicate an eight-bit byte that is not part of that twenty-four-bit value.
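
As a rough illustration of these distinct address spaces, the following Python sketch models two independent memories, so that "address 0" names unrelated storage on each path; the word widths and contents are made up for the example.

# Sketch of the Harvard separation: two independent memories, so
# instruction address 0 and data address 0 refer to different storage.
instruction_memory = [0b110000000000000000000001]   # one 24-bit instruction word
data_memory = bytearray([0x5A])                     # one 8-bit data byte

def fetch_instruction(addr):
    return instruction_memory[addr]    # travels over the instruction pathway

def read_data(addr):
    return data_memory[addr]           # travels over the separate data pathway

# Both accesses use address 0 yet touch unrelated storage; on real
# Harvard hardware they could even proceed in the same cycle.
print(hex(fetch_instruction(0)))       # prints 0xc00001
print(hex(read_data(0)))               # prints 0x5a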
MODIFIED HARVARD ARCHITECTURE
A modified Harvard architecture machine is very much like a Harvard architecture machine, but it relaxes the strict separation between instructions and data while still letting the CPU concurrently access two (or more) memory buses. The most common modification includes separate instruction and data caches backed by a common address space. While the CPU executes from cache, it acts as a pure Harvard machine. When accessing backing memory, it acts like a von Neumann machine (where code can be moved around like data, which is a powerful technique).
The improvements in computer technology have been tremendous since the first machine
appeared.
EVOLUTION OF COMPUTER
Four lines of evolution have emerged from the first computers:
Mainframe computers: large computers that can support very many users while delivering great computing power. It is mainly in mainframes where most of the innovations (both in architecture and organization) have been made.
Minicomputers: have adopted many of the mainframe techniques, yet are designed to sell for less, satisfying the computing needs of smaller groups of users. It is the minicomputer group that improved at the fastest pace (since 1965, when DEC introduced the first minicomputer), mainly due to the evolution of integrated circuit technology (the first IC appeared in 1958).
Supercomputers: designed for scientific applications, they are the most expensive computers. Processing is usually done in batch mode, for reasons of performance.
Microcomputers: appeared in the microprocessor era (the first microprocessor, the Intel 4004, was introduced in 1971). The term micro refers only to physical dimensions, not to computing performance. A typical microcomputer (either a PC or a workstation) nicely fits on a desk. Microcomputers are a direct product of technological advances: faster CPUs, semiconductor memories, etc.
Basic Organization of a computer
Most of the computers available today are so-called von Neumann computers, simply because their building parts (CPU or processor, memory, and I/O) are interconnected the way von Neumann suggested.
The figure below shows the building blocks of a computer. Even though there are many variations, and the level at which these blocks can be found differs (sometimes at the system level, other times at board level or even at chip level), their meaning is the same:
(1) CPU: this is the core of the computer; all computation is done here, and the whole system is controlled by the CPU.
(2) Memory: this is where the program and the data for the program are stored.
(3) I/O: this provides the means of entering the program and data into the system. It also allows the user to get the results of the computation.
Figure: organization of a CPU using a single internal data bus (IR is the Instruction Register).
Computation and control in the CPU
The computation part of the CPU, called the datapath, consists of the following units:
(a) the ALU (Arithmetic and Logic Unit), which performs arithmetic and logic operations;
(b) registers, which hold variables or intermediary results of computation, as well as special-purpose registers;
(c) buses, which interconnect them.

Central Processing Unit (CPU)


In a computer's central processing unit (CPU), the accumulator is the register in which intermediate arithmetic and logic results are stored. Without a register like an accumulator, it would be necessary to write the result of each calculation (addition, multiplication, shift, etc.) to main memory, perhaps only to be read right back again for use in the next operation.
Access to main memory is slower than to a register like an accumulator, because the technology used for the large main memory is slower (but cheaper) than that used for a register. Early electronic computer systems were often split into two groups: those with accumulators and those without.
Modern computer systems often have multiple general-purpose registers that can operate as accumulators, and the term is no longer as common as it once was. However, to simplify their design, a number of special-purpose processors still use a single accumulator.

Arithmetic and logic unit


The arithmetic and logic unit (ALU) performs the arithmetic and logical functions that are the work of the computer. General-purpose registers hold the input data, and the accumulator receives the result of the operation. The instruction register contains the instruction that the ALU is to perform.
For example, when adding two numbers, one number is placed in one general-purpose register and the other in another general-purpose register. The ALU performs the addition and puts the result in the accumulator. If the operation is a logical one, the data to be compared are placed into the general-purpose registers. The result of the comparison, a 1 or 0, is put in the accumulator. Whether the operation is logical or arithmetic, the accumulator content is then placed in the memory location reserved by the program for the result.

Major Parts of a Computer


The several types of computer systems can all be broken down into the same functional units. Each unit performs specific functions, and all units function together to carry out the instructions given in the program. The figure shows the five major functional parts of a digital computer and their interaction. The solid lines with arrows represent the flow of data and information; the dashed lines with arrows represent the flow of timing and control signals.
Input unit: the complete set of instructions and data is fed into the computer system through the input unit and placed in the memory unit, to be stored until needed.
Memory unit: the memory stores the instructions and data received from the input unit. It stores the results of arithmetic operations received from the arithmetic unit. It also supplies information to the output unit.
Central processing unit (CPU): this consists of the control unit, the arithmetic/logic unit, registers, the clock, and buses. The CPU contains all of the circuitry for fetching and interpreting instructions and for controlling and performing the various operations called for by the instructions.
Control unit: this unit takes instructions from the memory unit one at a time and interprets them. It sends appropriate signals to all the other units to cause the specific instruction to be executed.
Arithmetic/logic unit: all arithmetic calculations and logical decisions are performed in this unit, which can then send results to the memory unit to be stored.
One essential function of a computer is the performance of arithmetic operations, where logic gates and flip-flops are combined so that they can add, subtract, multiply, and divide binary numbers. These circuits perform arithmetic operations at speeds that are not humanly possible.
Output unit: this unit takes data from the memory unit and presents the information to the operator or a display.

Arithmetic and logic unit


All arithmetic operations take place in the arithmetic/logic unit (ALU) of a computer. The figure is a block diagram showing the major elements included in a typical ALU. The main purpose of the ALU is to accept binary data that are stored in the memory and to execute arithmetic and logic operations on these data according to instructions from the control unit.
The arithmetic/logic unit contains at least two flip-flop registers: the B register and the accumulator register. It also contains combinational logic, which performs the arithmetic and logic operations on the binary numbers that are stored in the B register and the accumulator. A typical sequence of operations may occur as follows:
a. The control unit receives an instruction (from the memory unit) specifying that a number stored in a particular memory location (address) is to be added to the number presently stored in the accumulator register.
b. The number to be added is transferred from memory to the B register.
c. The number in the B register and the number in the accumulator register are added together in the logic circuits (upon command from the control unit). The resulting sum is then sent to the accumulator to be stored.
d. The new number in the accumulator can remain there so that another number can be added to it; if the particular arithmetic process is finished, it can be transferred to memory for storage.
These steps should make it apparent how the accumulator register derives its name. This register "accumulates" the sums that occur when performing successive additions between new numbers acquired from memory and the previously accumulated sum. In fact, for any arithmetic problem containing several steps, the accumulator holds the results of the intermediate steps as they are completed, as well as the final result when the problem is finished.
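
The following minimal Python sketch walks through steps (a) to (d) above; the memory addresses and operand values are invented for illustration.

# Toy model of the B register / accumulator sequence described above.
memory = {0x20: 4, 0x21: 9, 0x22: 0}        # two operands and a result slot
accumulator = 0
b_register = 0

for address in (0x20, 0x21):
    b_register = memory[address]            # (b) operand moves into the B register
    accumulator = accumulator + b_register  # (c) adder output goes to the accumulator
    # (d) the running sum simply stays in the accumulator

memory[0x22] = accumulator                  # process finished: result stored to memory
print(memory[0x22])                         # prints 13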

What Drives the Work of a Computer


Designing a computer involves hardware at all levels, including functional organization, logic design, and implementation. Implementation itself deals with designing/specifying ICs, packaging, power, cooling, etc. It also involves software (at least at the level of designing the instruction set).

Architecture is the art and science of building. It involves the terms commodity, firmness, and delight. Architecture embraces functional, technological, and aesthetic aspects.
The computer architect has to specify the performance requirements of the various parts of a computer system, to define the interconnections between them, and to keep the system harmoniously balanced. The computer architect's job is more than designing the instruction set.
Instruction set architecture refers to what the programmer sees as the machine's instruction set. The instruction set is the boundary between the hardware and the software. Most of the decisions concerning the instruction set affect the hardware, and the converse is also true: many hardware decisions may beneficially or adversely affect the instruction set.
The implementation of a machine refers to the logical and physical design techniques used to implement an instance of the architecture. The implementation has two aspects:
(1) The organization, which refers to the logical aspects of an implementation; the high-level aspects of the design: CPU design, memory system, bus structure(s), etc.
(2) The hardware, which refers to the specifics of an implementation. Detailed logic design and packaging are included here.
Factors to be considered in designing
(a) Integrated circuit technology: the transistor count on a chip should be considered. The transistor count on a chip increases by about 25% per year, thus doubling every three years. Device speed increases at almost the same pace.
(b) Semiconductor RAM: density increases by some 60% per year, thus quadrupling every three years; the cycle time has decreased very slowly in the last decade, by only 33%.
(c) Disk technology: density increases by about 25% per year, doubling every three years. The access time has decreased by only one third in ten years.
A new design must support not only the circuits that are available now, at design time, and that will eventually become obsolete, but also the circuits that will be available when the product gets to market.
The designer must also be aware of software trends:
(1) The amount of memory an average program requires has grown by a factor of 1.5 to 2 per year. At this rhythm, the 32-bit address space of the processors dominating the market today will soon be exhausted. As a matter of fact, the most recently appeared designs, such as DEC's Alpha, have shifted to larger address spaces: the virtual address of the Alpha architecture is 64 bits, and the various implementations must provide at least 43 bits of address.
(2) Increased dependency on compiler technology: the compiler is now the major interface between the programmer and the machine. While in the pioneering era of computers a compiler had to deal with ill-designed instruction sets, architectures now move toward supporting efficient compiling and ease of writing compilers.

Computer organization is also called microarchitecture (abbreviated μarch or uarch): the way a given instruction set architecture (ISA) is implemented in a particular processor. A given ISA may be implemented with different microarchitectures; implementations may vary due to different goals of a given design or due to shifts in technology.
Microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA.
The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be anything from single gates and registers to complete arithmetic logic units (ALUs) and even larger elements. These diagrams generally separate the datapath (where data is placed) and the control path (which can be said to steer the data).
The person designing a system usually draws the specific microarchitecture as a kind of data flow diagram. Like a block diagram, the microarchitecture diagram shows microarchitectural elements such as the arithmetic and logic unit and the register file as a single schematic symbol. Typically, the diagram connects those elements with arrows, thick lines, and thin lines to distinguish between three-state buses (which require a three-state buffer for each device that drives the bus), unidirectional buses (always driven by a single source, such as the way the address bus on simpler computers is always driven by the memory address register), and individual control lines. Very simple computers have a single-data-bus organization: they have a single three-state bus. The diagrams of more complex computers usually show multiple three-state buses, which help the machine do more operations simultaneously.
Each microarchitectural element is in turn represented by a schematic describing the interconnections of the logic gates used to implement it. Each logic gate is in turn represented by a circuit diagram describing the connections of the transistors used to implement it in some particular logic family. Machines with different microarchitectures may have the same instruction set architecture, and thus be capable of executing the same programs. New microarchitectures and/or circuitry solutions, along with advances in semiconductor manufacturing, are what allow newer generations of processors to achieve higher performance while using the same ISA.
In principle, a single microarchitecture could execute several different ISAs with only minor changes to the microcode.
Interconnections inside the CPU
If small size and low price are a must, then a single internal bus may be considered. But it has big drawbacks: (1) little flexibility in choosing the instruction set; (2) most operations have as an operand the content of the accumulator, and this is also the place where the result goes. Due to its simplicity (simple also means cheap!), this was the solution adopted by the first CPUs.
When we say simple, we mean both hardware simplicity and software simplicity: because one operand is always in the accumulator, and the accumulator is also the destination, the instruction encoding is very simple: the instruction must only specify the operation to be performed and its second operand.
As technology allowed a move to wider datapaths (16, 32, 64, and even larger bits in the future), it also became possible to specify more complex instruction formats: more explicit operands, more registers, larger offsets, etc.
Newer CPU generations are faster due to:
(a) a faster clock rate (lower Tck): as technology features decreased, more transistors fit on the surface and they may operate at higher speed;
(b) a lower instruction count (IC): it takes fewer instructions to perform an operation on 32-bit integers if the datapath is 32 bits wide, as compared with an 8-bit datapath;
(c) a lower CPI (cycles per instruction): with more involved hardware it is possible to make large transfers (read/store from/to memory) in a single clock cycle, instead of the several cycles needed by narrower-datapath CPUs.
The datapath contains most of the CPU's state; this is the information the programmer has to save when a program is suspended; restoring this information makes the computation look as if nothing had happened.

CHAPTER TWO

Instruction set design


Instructions
An instruction specifies an operation to be performed and the operands involved.
The tasks carried out by a computer program consist of a sequence of small steps, such as adding two numbers, testing for a particular condition, reading a character from the keyboard, or sending a character to be shown on a display screen. A computer must have instructions capable of performing four types of operations:
1) Data transfers between the memory and the processor registers
2) Arithmetic and logic operations on data
3) Program sequencing and control
4) I/O transfers
Machine language is built up from discrete statements or instructions. Depending on the processor architecture, a given instruction may specify:
 the opcode (the operation to be performed), e.g. add, copy, test
 any explicit operands:
  registers
  literal/constant values
  the addressing mode used to access memory
More complex operations are built up by combining these simple instructions, which are
executed sequentially, or as otherwise directed by control flow instructions.
One instruction may have several fields, which identify the logical operation, and may also
include source and destination addresses and constant values.
On traditional architectures, an instruction includes an opcode that specifies the operation to perform (such as "add contents of memory to register") and zero or more operand specifiers, which may specify registers, memory locations, or literal data.
An instruction set may have instructions with the following formats:
Operation destination, operand1, operand2
or
Operation destination, operand
where Operation specifies what is to be performed with the operands operand1 and operand2, or with operand, and destination is the place where the result is to be stored.
The figure (not reproduced here) shows the MIPS "Add Immediate" instruction, which allows selection of source and destination registers and inclusion of a small constant.
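
As a stand-in for the missing figure, the following Python sketch packs the standard MIPS I-type fields (6-bit opcode, 5-bit source register rs, 5-bit target register rt, 16-bit constant) for Add Immediate; the sample register numbers follow MIPS conventions.

# Packs the MIPS I-type fields for addi: opcode | rs | rt | immediate.
def encode_addi(rt, rs, imm):
    OPCODE_ADDI = 0b001000                  # the MIPS opcode for addi
    return (OPCODE_ADDI << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# addi $t0, $s0, 5  ->  register $t0 (number 8) = register $s0 (number 16) + 5
print(hex(encode_addi(rt=8, rs=16, imm=5)))  # prints 0x22080005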

The operand specifiers may have addressing modes determining their meaning or may be in
fixed fields. In very long instruction word (VLIW) architectures, which include many
microcode architectures, multiple simultaneous opcodes and operands are specified in a single
instruction.
Some exotic instruction sets, such as transport triggered architectures (TTA), do not have an opcode field, only operand(s).
Most stack machines have "0-operand" instruction sets in which arithmetic and logical
operations lack any operand specifier fields; only instructions that push operands onto the
evaluation stack or that pop operands from the stack into variables have operand specifiers. The
instruction set carries out most ALU actions with postfix (reverse Polish notation) operations that
work only on the expression stack, not on data registers or arbitrary main memory cells. This can
be very convenient for compiling high-level languages, because most arithmetic expressions can
be easily translated into postfix notation.
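
As a concrete illustration of this 0-operand style, here is a minimal Python sketch of a stack machine evaluating c = a + b with the sequence push a, push b, add, pop c; the variable values are made up.

# Toy evaluation stack: arithmetic takes its operands from the stack top.
variables = {"a": 2, "b": 3, "c": 0}
stack = []

program = [("push", "a"), ("push", "b"), ("add", None), ("pop", "c")]

for op, arg in program:
    if op == "push":                    # memory -> top of stack
        stack.append(variables[arg])
    elif op == "add":                   # operate on the two topmost entries
        right, left = stack.pop(), stack.pop()
        stack.append(left + right)
    elif op == "pop":                   # top of stack -> memory
        variables[arg] = stack.pop()

print(variables["c"])                   # prints 5, i.e. c = a + b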

Instruction types
Examples of operations common to many instruction sets include:
Data handling and memory operations
 Set a register to a fixed constant value.
 Copy data from a memory location or a register to a memory location or a register (such a machine instruction is often called move; however, the term is misleading). These instructions are used to store the contents of a register, the contents of another memory location, or the result of a computation, or to retrieve stored data to perform a computation on it later. They are often called load or store operations.
 Read or write data from/to hardware devices.
Arithmetic and logic operations

 Add, subtract, multiply, or divide the values of two registers, placing the result in a register, possibly setting one or more condition codes in a status register.
 Increment or decrement in some ISAs, saving an operand fetch in trivial cases.
 Perform bitwise operations, e.g., taking the conjunction and disjunction of corresponding bits in a pair of registers, or taking the negation of each bit in a register.
 Compare two values in registers (for example, to see if one is less, or if they are equal).
 Floating-point instructions for arithmetic on floating-point numbers.
Control flow operations
 Branch to another location in the program and execute instructions there.
 Conditionally branch to another location if a certain condition holds.
 Indirectly branch to another location.
 Call another block of code, while saving the location of the next instruction as a point to
return to.
Coprocessor instructions
 Load/store data to and from a coprocessor or exchange data with CPU registers.
 Perform coprocessor operations.
Complex instructions
Processors may include "complex" instructions in their instruction set. A single "complex"
instruction does something that may take many instructions on other computers. Such
instructions are typified by instructions that take multiple steps, control multiple functional units,
or otherwise appear on a larger scale than the bulk of simple instructions implemented by the
given processor. Some examples of "complex" instructions include:
 transferring multiple registers to or from memory (especially the stack) at once
 moving large blocks of memory (e.g. string copy or DMA transfer)
 complicated integer and floating-point arithmetic (e.g. square root, or transcendental
functions such as logarithm, sine, cosine, etc.)
 SIMD instructions, a single instruction performing an operation on many homogeneous
values in parallel, possibly in dedicated SIMD registers
 performing an atomic test-and-set instruction or other read-modify-write atomic
instruction
 instructions that perform ALU operations with an operand from memory rather than a
register
Complex instructions are more common in CISC instruction sets than in RISC instruction sets,
but RISC instruction sets may include them as well. RISC instruction sets generally do not
include ALU operations with memory operands, or instructions to move large blocks of memory,
but most RISC instruction sets include SIMD or vector instructions that perform the same
arithmetic operation on multiple pieces of data at the same time. SIMD instructions have the ability to manipulate large vectors and matrices in minimal time. SIMD instructions allow easy parallelization of algorithms commonly involved in sound, image, and video processing.

The program instructions and data operands are stored in the memory. The sequence of
instructions is brought from the memory into the processor and executed to perform a given task.
The vast majority of programs are written in high-level languages such as C, C++, or Java. To
execute a high-level language program on a processor, the program must be translated into the
machine language for that processor, which is done by a compiler program. Assembly language
is a readable symbolic representation of machine language.

The design of a new machine is not a smooth process; the designer of the architecture must be aware of possible hardware limitations when setting up the instruction set, while the hardware designers must be aware of the consequences their decisions have for the software.
It is not seldom that some architectural features cannot be implemented (at a reasonable price, in a reasonable time, using a reasonable area of silicon, or cannot be implemented at all!); in this case the architecture has to be redefined. Very often small changes in the instruction set can greatly simplify the hardware; the converse is also true: the process of designing the hardware often suggests improvements in the instruction set.
The design of an instruction set consists of deciding (1) what should be included in the instruction set (what is a must for the machine) and what can be left as an option, (2) how instructions look, and (3) what the relation between the hardware and the instruction set is.

Conditional instructions often have a predicate field—a few bits that encode the specific
condition to cause an operation to be performed rather than not performed. For example, a
conditional branch instruction will transfer control if the condition is true, so that execution
proceeds to a different part of the program, and not transfer control if the condition is false, so
that execution continues sequentially. Some instruction sets also have conditional moves, so that
the move will be executed, and the data stored in the target location, if the condition is true, and
not executed, and the target location not modified, if the condition is false. Similarly, IBM
z/Architecture has a conditional store instruction. A few instruction sets include a predicate field
in every instruction; this is called branch predication.

A complete instruction set, including operand addressing methods, is often referred to as the
instruction set architecture (ISA) of a processor.
Instruction set architecture
Why the ISA is important
Understanding what the instruction set can do and how the compiler makes use of those
instructions can help developers write more efficient code. It can also help them understand the
output of the compiler which can be useful for debugging.
An Instruction Set Architecture (ISA) is a part of the abstract model of a computer that defines
how the CPU is controlled by the software. The ISA acts as an interface between the hardware
and the software, specifying both what the processor is capable of doing as well as how it gets
done. A device or program that executes instructions described by that ISA, such as a central
processing unit (CPU), is called an implementation of that ISA.
The ISA provides the only way through which a user is able to interact with the hardware. It can be viewed as the programmer's manual, because it is the portion of the machine that is visible to the assembly language programmer, the compiler writer, and the application programmer.
The ISA defines the supported instructions, the data types, the registers, how the hardware manages main memory, key features (such as memory consistency, addressing modes, and virtual memory), and the input/output model. The ISA can be extended by adding instructions or other capabilities, or by adding support for larger addresses and data values.
The Instruction Set Architecture (ISA) serves as the boundary between software and hardware.
ISA is the part of the processor that is visible to the programmer or compiler writer.
The ISA of a processor can be described using five (5) categories:
1. Operand storage in the CPU: where are the operands kept, other than in memory?
2. Number of explicitly named operands: how many operands are named in a typical instruction?
3. Operand location: can any ALU instruction operand be located in memory, or must all operands be kept internally in the CPU?
4. Operations: what operations are provided in the ISA?
5. Type and size of operands: what is the type and size of each operand, and how is it specified?
Of all the above, the most distinguishing factor is the first.
Number of operands
Instruction sets may be categorized by the maximum number of operands explicitly specified in
instructions.
Each instruction specifies some number of operands (registers, memory locations, or immediate
values) explicitly. Some instructions give one or both operands implicitly, such as by being
stored on top of the stack or in an implicit register. If some of the operands are given implicitly,
fewer operands need be specified in the instruction. When a "destination operand" explicitly
specifies the destination, an additional operand must be supplied. Consequently, the number of
operands encoded in an instruction may differ from the mathematically necessary number of
arguments for a logical or arithmetic operation (the arity). Operands are either encoded in the
"opcode" representation of the instruction, or else are given as values or addresses following the
opcode.
The three (3) most common types of ISAs are:
1. Stack – the operands are implicitly on top of the stack.
2. Accumulator – one operand is implicitly the accumulator.
3. General-Purpose Register (GPR) – all operands are explicitly mentioned; they are either registers or memory locations.
Let's look at how the statement
C = A + B;
is implemented in the three architectures; a runnable sketch contrasting two of these styles follows the list below.
(In the equation, A, B, and C are (direct or calculated) addresses referring to memory cells, while reg1 and so on refer to machine registers.)

 0-operand (zero-address machines), so-called stack machines: all arithmetic operations take place using the top one or two positions on the stack: push a, push b, add, pop c.
   C = A+B needs four instructions. For stack machines, the terms "0-operand" and "zero-address" apply to arithmetic instructions, but not to all instructions, as 1-operand push and pop instructions are used to access memory.
 1-operand (one-address machines), so-called accumulator machines, include early computers and many small microcontrollers: most instructions specify a single right operand (that is, a constant, a register, or a memory location), with the implicit accumulator as the left operand (and the destination, if there is one): load a, add b, store c.
   C = A+B needs three instructions.
 2-operand — many CISC and RISC machines fall under this category:
   CISC — move A to C; then add B to C.
     C = A+B needs two instructions. This effectively 'stores' the result without an explicit store instruction.
   CISC — often machines are limited to one memory operand per instruction: load a,reg1; add b,reg1; store reg1,c. This requires a load/store pair for any memory movement, regardless of whether the add result is an augmentation stored to a different place, as in C = A+B, or to the same memory location: A = A+B.
     C = A+B needs three instructions.
   RISC — requiring explicit memory loads, the instructions would be: load a,reg1; load b,reg2; add reg1,reg2; store reg2,c.
     C = A+B needs four instructions.
 3-operand, allowing better reuse of data:
   CISC — it becomes either a single instruction: add a,b,c
     C = A+B needs one instruction.
   CISC — or, on machines limited to two memory operands per instruction: move a,reg1; add reg1,b,c.
     C = A+B needs two instructions.
   RISC — arithmetic instructions use registers only, so explicit 2-operand load/store instructions are needed: load a,reg1; load b,reg2; add reg1+reg2->reg3; store reg3,c.
     C = A+B needs four instructions.
     Unlike with 2-operand or 1-operand instructions, this leaves all three values a, b, and c in registers, available for further reuse.
 more operands — some CISC machines permit a variety of addressing modes that allow more than 3 operands (registers or memory accesses), such as the VAX "POLY" polynomial evaluation instruction.
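
As promised above the list, this Python sketch contrasts two of the styles on C = A + B: a 1-operand accumulator machine, which needs three instructions, and a 3-operand load/store register machine, which needs four; the tiny interpreters and instruction spellings are illustrative only.

memory = {"A": 2, "B": 3, "C": 0}

# 1-operand accumulator style: the accumulator is implicit everywhere.
def run_accumulator(program):
    acc = 0
    for op, addr in program:
        if op == "load":    acc = memory[addr]          # memory -> accumulator
        elif op == "add":   acc = acc + memory[addr]    # accumulator is implicit
        elif op == "store": memory[addr] = acc          # accumulator -> memory

run_accumulator([("load", "A"), ("add", "B"), ("store", "C")])   # 3 instructions
print(memory["C"])   # prints 5

# 3-operand register style: every operand is named; loads/stores are explicit.
def run_risc(program):
    regs = {}
    for op, a, b, c in program:
        if op == "load":    regs[a] = memory[b]          # register <- memory
        elif op == "add":   regs[a] = regs[b] + regs[c]  # all three operands named
        elif op == "store": memory[b] = regs[a]          # memory <- register

run_risc([("load", "r1", "A", None), ("load", "r2", "B", None),
          ("add", "r3", "r1", "r2"), ("store", "r3", "C", None)])  # 4 instructions
print(memory["C"])   # prints 5 again; the operands and sum also remain in registers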
Due to the large number of bits needed to encode the three registers of a 3-operand instruction,
RISC architectures that have 16-bit instructions are invariably 2-operand designs, such as the
Atmel AVR, TI MSP430, and some versions of ARM Thumb. RISC architectures that have 32-
bit instructions are usually 3-operand designs, such as the ARM, AVR32, MIPS, Power ISA, and
SPARC architectures.
Not all processors can be neatly tagged into one of the above categories. The i8086 has many instructions that use implicit operands, although it has a general register set. The i8051 is another example: it has 4 banks of GPRs, but most instructions must have the A register as one of their operands.
Advantages and disadvantages of each architecture
Stack
Advantage: simple model of expression evaluation (reverse Polish notation); short instructions.
Disadvantage: a stack cannot be randomly accessed, which makes it hard to generate efficient code. The stack itself is accessed on every operation and becomes a bottleneck.
Accumulator
Advantage: short instructions.
Disadvantage: the accumulator is only temporary storage, so memory traffic is the highest for this approach.
GPR
Advantage: makes code generation easy; data can be stored in registers for long periods.
Disadvantage: all operands must be named, leading to longer instructions.
Earlier CPUs were of the stack and accumulator types, but in the last 15 years all CPUs made have been GPR processors. The two (2) major reasons are that registers are faster than memory (and the more data that can be kept internally in the CPU, the faster the program will run) and that registers are easier for a compiler to use.
Classification of ISAs
An ISA may be classified in a number of different ways. A common classification is by
architectural complexity. A complex instruction set computer (CISC) has many specialized
instructions, some of which may only be rarely used in practical programs. A reduced instruction
set computer (RISC) simplifies the processor by efficiently implementing only the instructions
that are frequently used in programs, while the less common operations are implemented as
subroutines, having their resulting additional processor execution time offset by infrequent use.
Other types include very long instruction word (VLIW) architectures, and the closely
related long instruction word (LIW) and explicitly parallel instruction computing (EPIC)
architectures. These architectures seek to exploit instruction-level parallelism with less hardware
than RISC and CISC by making the compiler responsible for instruction issue and scheduling.
Architectures with even less complexity have been studied, such as the minimal instruction set
computer (MISC) and one-instruction set computer (OISC). These are theoretically important
types, but have not been commercialized.
Instruction length
The size or length of an instruction varies widely, from as little as four bits in some
microcontrollers to many hundreds of bits in some VLIW systems. Processors used in personal
computers, mainframes, and supercomputers have minimum instruction sizes between 8 and 64
bits. The longest possible instruction on x86 is 15 bytes (120 bits). Within an instruction set,
different instructions may have different lengths. In some architectures, notably most reduced
instruction set computers (RISC), instructions are a fixed length, typically corresponding with
that architecture's word size. In other architectures, instructions have variable length, typically
integral multiples of a byte or a halfword. Some, such as the ARM with Thumb extension, have mixed variable encoding, that is, two fixed (usually 32-bit and 16-bit) encodings, where instructions cannot be mixed freely but must be switched between on a branch (or an exception boundary in ARMv8).
Fixed-length instructions are less complicated to handle than variable-length instructions for
several reasons (not having to check whether an instruction straddles a cache line or virtual
memory page boundary, for instance), and are therefore somewhat easier to optimize for speed.

CHAPTER THREE
A processor register is a quickly accessible location available to a computer's
processor. Registers usually consist of a small amount of fast storage, although some registers
have specific hardware functions, and may be read-only or write-only. In computer architecture, registers are typically addressed by mechanisms other than main memory, but may in some cases be assigned a memory address, e.g. DEC PDP-10, ICT 1900.
Almost all computers, whether load/store architecture or not, load items of data from a larger
memory into registers where they are used for arithmetic operations, bitwise operations, and
other operations, and are manipulated or tested by machine instructions. Manipulated items are
then often stored back to main memory, either by the same instruction or by a subsequent one.
Modern processors use either static or dynamic random-access memory (RAM) as main memory,
with the latter usually accessed via one or more cache levels.
Processor registers are normally at the top of the memory hierarchy, and provide the fastest way
to access data. The term normally refers only to the group of registers that are directly encoded
as part of an instruction, as defined by the instruction set. However, modern high-performance
CPUs often have duplicates of these "architectural registers" in order to improve performance via
register renaming, allowing parallel and speculative execution.
When a computer program accesses the same data repeatedly, this is called locality of reference.
Holding frequently used values in registers can be critical to a program's performance. Register
allocation is performed either by a compiler in the code generation phase, or manually by an
assembly language programmer.

Size
Registers are normally measured by the number of bits they can hold, for example, an 8-
bit register, 32-bit register, 64-bit register, 128-bit register, or more. In some instruction sets, the
registers can operate in various modes, breaking down their storage memory into smaller parts
(32-bit into four 8-bit ones, for instance) to which multiple data (vector, or one-dimensional
array of data) can be loaded and operated upon at the same time. Typically, it is implemented by
adding extra registers that map their memory into a larger register. Processors that have the
ability to execute single instructions on multiple data are called vector processors.
Types
A processor often contains several kinds of registers, which can be classified according to the
types of values they can store or the instructions that operate on them:
 User-accessible registers can be read or written by machine instructions. The most
common division of user-accessible registers is a division into data registers and address
registers.
 Data registers can hold numeric data values such as integers and, in some
architectures, floating-point numbers, as well as characters, small bit arrays and
other data. In some older architectures, such as the IBM 704, the IBM 709 and
successors, the PDP-1, the PDP-4/PDP-7/PDP-9/PDP-15, the PDP-5/PDP-8, and
the HP 2100, a special data register known as the accumulator is used implicitly
for many operations.
 Address registers hold addresses and are used by instructions that indirectly
access primary memory.
 Some processors contain registers that may only be used to hold
an address or only to hold numeric values (in some cases used as an index
register whose value is added as an offset from some address); others
allow registers to hold either kind of quantity. A wide variety of possible
addressing modes, used to specify the effective address of an operand,
exist.
 The stack pointer is used to manage the run-time stack. Rarely, other data
stacks are addressed by dedicated address registers (see stack machine).
 General-purpose registers (GPRs) can store both data and addresses, i.e., they are
combined data/address registers; in some architectures, the register
file is unified so that the GPRs can store floating-point numbers as well.
 Status registers hold truth values often used to determine whether some
instruction should or should not be executed.
 Floating-point registers (FPRs) store floating-point numbers in many
architectures.
 Constant registers hold read-only values such as zero, one, or pi.
 Vector registers hold data for vector processing done by SIMD instructions
(Single Instruction, Multiple Data).
 Special-purpose registers (SPRs) hold some elements of the program state; they
usually include the program counter, also called the instruction pointer, and the
status register; the program counter and status register might be combined in a
program status word (PSW) register. The aforementioned stack pointer is
sometimes also included in this group. Embedded microprocessors, such as
microcontrollers, can also have special function registers corresponding to
specialized hardware elements.
 Model-specific registers (also called machine-specific registers) store data and
settings related to the processor itself. Because their meanings are attached to the
design of a specific processor, they are not expected to remain standard between
processor generations.
 Memory type range registers (MTRRs)
 Internal registers are not accessible by instructions and are used internally for processor
operations.
 The instruction register holds the instruction currently being executed.
 Registers related to fetching information from RAM, a collection of storage
registers located on separate chips from the CPU:
 Memory buffer register (MBR), also known as memory data
register (MDR)
 Memory address register (MAR)
 Architectural registers are the registers visible to software and are defined by an
architecture. They may not correspond to the physical hardware if register renaming is
being performed by the underlying hardware.
Hardware registers are similar, but occur outside CPUs.
In some architectures (such as SPARC and MIPS), the first or last register in the integer register
file is a pseudo-register in that it is hardwired to always return zero when read (mostly to
simplify indexing modes), and it cannot be overwritten. In Alpha, this is also done for the
floating-point register file. As a result of this, register files are commonly quoted as having one
register more than how many of them are actually usable; for example, 32 registers are quoted
when only 31 of them fit within the above definition of a register.

Program Status Word (PSW)


The program status word (PSW) contains status bits that reflect the current CPU state. It is a 16-bit register; seven bits remain unused while the other nine are used. Of the nine bits, six are status flags and three are control flags; the 8th bit is the trap flag. The control flags are DF (Direction Flag), IF (Interrupt Flag), and TF (Trap Flag). The status flags are CF (Carry Flag), AF (Auxiliary Carry Flag), SF (Sign Flag), ZF (Zero Flag), PF (Parity Flag), and OF (Overflow Flag).
The flag register is a special-purpose register: depending upon the value of the result after any arithmetic or logical operation, the flag bits become set (1) or reset (0).
Generally, the flag register of the 8085 microprocessor is an 8-bit register; it contains three don't-care bits and 5 active flags. The trap flag is NOT present in the 8085. The flags of the 8085 are given as:

D7: S | D6: Z | D5: X | D4: AC | D3: X | D2: P | D1: X | D0: CY (X is don't care).

Sign Flag (S): after any operation, if the MSB (B7) of the result is 1, it indicates that the number is negative, and the sign flag becomes set.
Zero Flag (Z): after any arithmetic or logical operation, if the result is 00H, the zero flag becomes set.
Auxiliary Carry Flag (AC): if, after any arithmetic or logical operation, bit D3 generates a carry that is passed on to D4, this flag becomes set.
Parity Flag (P): if, after any arithmetic or logical operation, the result has even parity (an even number of 1 bits), the parity flag becomes set.
Carry Flag (CY): a carry is generated when an n-bit operation produces a result of more than n bits; this flag then becomes set.
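
A minimal Python sketch of how such flags can be derived from an 8-bit addition follows; it mirrors the 8085 flag definitions above but is written for illustration, not for cycle accuracy.

# Derive 8085-style flags from an 8-bit addition (illustrative only).
def add8_with_flags(x, y):
    full = x + y                    # unclamped sum, may exceed 8 bits
    result = full & 0xFF            # the 8-bit result the machine keeps
    flags = {
        "S":  (result >> 7) & 1,                              # MSB of the result
        "Z":  1 if result == 0 else 0,                        # result is 00H
        "AC": 1 if ((x & 0x0F) + (y & 0x0F)) > 0x0F else 0,   # carry from D3 to D4
        "P":  1 if bin(result).count("1") % 2 == 0 else 0,    # even number of 1 bits
        "CY": 1 if full > 0xFF else 0,                        # carry out of bit 7
    }
    return result, flags

print(add8_with_flags(0x3A, 0xC6))
# (0, {'S': 0, 'Z': 1, 'AC': 1, 'P': 1, 'CY': 1})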
Register Transfer Notation
Possible locations that may be involved in the transfer of information from one location in a
computer to another are memory locations, processor registers, or registers in the I/O subsystem.
Most of the time, such locations are identified symbolically with convenient names. For example, names that represent the addresses of memory locations may be LOC, PLACE, A, or VAR2. Predefined names for the processor registers may be R0 or R6. Registers in the I/O subsystem may be identified by names such as OUTSTATUS or DATAIN. To describe the transfer of information, the contents of any location are denoted by placing square brackets around its name. Thus, the expression
R3 ← [LOC]
means that the contents of memory location LOC are transferred into processor register R3.
Example 2: consider the operation that adds the contents of registers R1 and R2 and places their sum into register R3. This action is indicated as
R3 ← [R1] + [R2]
This type of notation is known as Register Transfer Notation (RTN). The right-hand side of an RTN expression always denotes a value, and the left-hand side is the name of a location where the value is to be placed, overwriting the old contents of that location.
In computer jargon, the words "transfer" and "move" are commonly used to mean "copy". Transferring data from a source location A to a destination location B means that the contents of location A are read and then written into location B. In this operation, only the contents of the destination will change; the contents of the source will stay the same.

register-transfer level (RTL)


In digital circuit design, register-transfer level (RTL) is a design abstraction which models a
synchronous digital circuit in terms of the flow of digital signals (data) between hardware
registers, and the logical operations performed on those signals.
Register-transfer-level abstraction is used in hardware description languages (HDLs) like
Verilog and VHDL to create high-level representations of a circuit, from which lower-level
representations and ultimately actual wiring can be derived. Design at the RTL level is typical
practice in modern digital design.
Unlike in software compiler design, where the register-transfer level is an intermediate representation, in circuit design the RTL level is the usual input that circuit designers operate on. In fact, in circuit synthesis, an intermediate language between the input register-transfer-level representation and the target netlist is sometimes used. Unlike a netlist, constructs such as cells, functions, and multi-bit registers are available. Examples include FIRRTL and RTLIL.
Transaction-level modelling is a higher level of electronic system design.
RTL description
When designing digital integrated circuits with a hardware description language (HDL), the
designs are usually engineered at a higher level of abstraction than transistor level (logic
families) or logic gate level. In HDLs the designer declares the registers (which roughly
correspond to variables in computer programming languages), and describes the combinational
logic by using constructs that are familiar from programming languages such as if-then-else and
arithmetic operations. This level is called register-transfer level. The term refers to the fact that
RTL focuses on describing the flow of signals between registers.

A synchronous circuit consists of two kinds of elements: registers (sequential logic) and
combinational logic. Registers (usually implemented as D flip-flops) synchronize the circuit's
operation to the edges of the clock signal, and are the only elements in the circuit that have
memory properties. Combinational logic performs all the logical functions in the circuit and it
typically consists of logic gates.
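
To make this register/combinational-logic split concrete, here is a toy Python model, with invented signal names, in which the combinational logic continuously computes a next value and the register captures it only on a clock edge.

# Toy synchronous circuit: state lives only in the register.
class Register:
    def __init__(self):
        self.q = 0                 # stored state, like a bank of D flip-flops

    def clock_edge(self, d):
        self.q = d                 # capture the D input on the rising edge

def next_count(current):           # combinational logic: a 4-bit incrementer
    return (current + 1) & 0xF

counter = Register()
for _ in range(3):
    d = next_count(counter.q)      # settles between clock edges
    counter.clock_edge(d)          # the state changes only here
print(counter.q)                   # prints 3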
Register pressure
Register pressure measures the availability of free registers at any point in time during the
program execution. Register pressure is high when a large number of the available registers are
in use; thus, the higher the register pressure, the more often the register contents must be
spilled into memory. Increasing the number of registers in an architecture decreases register
pressure but increases the cost.
While embedded instruction sets such as Thumb suffer from extremely high register pressure
because they have small register sets, general-purpose RISC ISAs like MIPS and Alpha enjoy
low register pressure. CISC ISAs like x86-64 offer low register pressure despite having smaller
register sets. This is due to the many addressing modes and optimizations (such as sub-register
addressing, memory operands in ALU instructions, absolute addressing, PC-relative addressing,
and register-to-register spills) that CISC ISAs offer.

CHAPTER FOUR
MEMORY OPERATIONS
Both program instructions and data operands are stored in the memory. To execute an instruction, the processor control circuits must cause the word (or words) containing the instruction to be transferred from the memory to the processor. Thus, two basic operations involving the memory are needed, namely Read and Write.
The Read operation transfers a copy of the contents of a specific memory location to the processor. The memory contents remain unchanged. To start a Read operation, the processor sends the address of the desired location to the memory and requests that its contents be read. The memory reads the data stored at that address and sends them to the processor.
The Write operation transfers an item of information from the processor to a specific memory
location, overwriting the former contents of that location. To initiate a Write operation, the
processor sends the address of the desired location to the memory, together with the data to be
written into that location. The memory then uses the address and data to perform the write.
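
A minimal Python sketch of these two operations, with an invented word-addressed memory, follows.

# Read returns a copy and leaves the location unchanged; Write overwrites it.
class Memory:
    def __init__(self, size):
        self.words = [0] * size

    def read(self, address):
        return self.words[address]      # contents are copied out, unchanged

    def write(self, address, data):
        self.words[address] = data      # former contents are overwritten

mem = Memory(16)
mem.write(5, 42)
print(mem.read(5))   # prints 42
print(mem.read(5))   # prints 42 again: reading did not change the location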

CHAPTER FIVE
ASSEMBLY LANGUAGE NOTATION
Assembly language notation is another notation used to represent machine instructions and programs. A generic instruction that transfers the contents of memory location LOC to processor register R1 can be specified by the statement
Load R1, LOC
The contents of LOC are unchanged by the execution of the instruction, but the old contents of register R1 are overwritten. The name Load is appropriate for the instruction because the contents read from a memory location are loaded into a processor register.
Example 2: adding two numbers contained in processor registers R1 and R2 and placing their sum in R3 can be specified by the assembly-language statement
Add R3, R2, R1
In this case, registers R1 and R2 hold the source operands, while R3 is the destination.
The English words Load and Add are used to denote the required operations. In the assembly-
language instructions of actual (commercial) processors, such operations are defined by using
mnemonics, which are typically abbreviations of the words describing the operations. For
example, the operation Load may be written as LD, while the operation Store, which transfers a
word from a processor register to the memory, may be written as STR or ST. Assembly
languages for different processors often use different mnemonics for a given operation.
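
As a closing illustration, the following toy Python interpreter executes the two statements above; the register file and the memory map are invented for the example.

# Toy interpreter for the Load/Add statements used in this chapter.
memory = {"LOC": 11}
registers = {"R1": 0, "R2": 6, "R3": 0}

def execute(statement):
    mnemonic, operand_field = statement.split(None, 1)
    ops = [o.strip() for o in operand_field.split(",")]
    if mnemonic == "Load":                 # Load Rdst, LOC
        registers[ops[0]] = memory[ops[1]]
    elif mnemonic == "Add":                # Add Rdst, Rsrc1, Rsrc2
        registers[ops[0]] = registers[ops[1]] + registers[ops[2]]

execute("Load R1, LOC")
execute("Add R3, R2, R1")
print(registers["R3"])   # prints 17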
