CA Unit-1
CISC (Complex Instruction Set Computer) | RISC (Reduced Instruction Set Computer)
A large number of instructions are present in the architecture. | Far fewer instructions are present; the number of instructions is generally less than 100.
CISC supports arrays. | RISC does not support arrays.
Some instructions take long execution times. | No instruction has a long execution time, because the instruction set is very simple.
Variable-length encodings of the instructions are used. | Fixed-length encodings of the instructions are used.
12. What is an opcode? How many bits are needed to specify 32 distinct operations?
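As a quick check on the second part of this question: an opcode field of n bits can encode 2^n distinct operations, so the smallest n with 2^n >= 32 is needed. A minimal sketch of that arithmetic follows; the function name opcode_bits is only illustrative and does not come from these notes.

```python
def opcode_bits(num_operations: int) -> int:
    """Smallest number of bits n such that 2**n >= num_operations."""
    if num_operations < 1:
        raise ValueError("need at least one operation")
    # (k - 1).bit_length() equals ceil(log2(k)) for k >= 1, using integers only
    return (num_operations - 1).bit_length()


print(opcode_bits(32))  # 5, since 2**5 = 32
```

With 33 operations the same function returns 6, because 5 bits can no longer distinguish all of them.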
PART-B
What is Computer Architecture?
Computer architecture is a set of rules and methods that describe the functionality,
organization, and implementation of computer systems.
It is defined as the functional operation of the individual hardware units in a computer
system and the flow of information among, and the control of, those units.
Computer hardware is the electronic circuitry.
Computer architecture is a specification detailing how a set of software and
hardware technology standards interact to form a computer system or platform.
Computer architecture involves instruction set architecture design, microarchitecture
design, logic design, and implementation.
Computer architecture = Set of Instructions + Computer organization
5. Datapath
The data path manipulates the data coming through the processor. It also provides a
small amount of temporary data storage.
The data path consists of the following components.
Programmable Registers - Small units of data storage that are directly visible to
assembly language programmers. They can be used like simple variables in a
high-level program.
Program Counter (PC) - holds the address for fetching instructions.
Multiplexers - have control inputs coming from the control unit. They are used for
routing data through the data path.
Processing Elements - compute new data values from old data values. In simple
processors the major processing elements are grouped into an Arithmetic-Logic
Unit (ALU).
Special-Purpose Registers - hold data that is needed for processor operation but
is not directly visible to assembly language programmers.
Processor Operation
The processor executes a sequence of instructions that are located in memory.
Execution of each instruction involves at least the first three of the following activities.
The last four activities are required for some, but not all, instructions.
The activities are approximately in time order.
However, some of the activities can be overlapped in time.
Instruction fetch
Program counter (PC) update
Instruction decode
Source operand fetch
Arithmetic-logic unit (ALU) operation
Memory access
Register write
In these activities:
The program counter (PC) holds the address of the next instruction.
For a simple processor, the arithmetic-logic unit (ALU) performs all arithmetic and
logical operations.
The organization of the data path can be determined from these activities.
Where an activity requires selecting among different options depending on the
instruction, there will be a multiplexer that selects the appropriate option as directed by a
control signal.
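The seven activities listed above can be sketched as a small interpreter loop. This is only an illustration under stated assumptions: the three-instruction toy set (load, add, store), the tuple encoding, and the function name run are invented for the example and are not a real processor's instruction set. The if/elif chain plays the role of the multiplexers, selecting a path according to the decoded instruction.

```python
def run(program, data_memory, num_registers=4):
    """Execute a toy program; return (registers, data_memory)."""
    regs = [0] * num_registers
    pc = 0                                   # program counter
    while pc < len(program):
        instr = program[pc]                  # 1. instruction fetch
        pc += 1                              # 2. PC update
        op, *args = instr                    # 3. instruction decode
        if op == "load":                     # rd <- memory[addr]
            rd, addr = args
            value = data_memory[addr]        # 6. memory access
            regs[rd] = value                 # 7. register write
        elif op == "add":                    # rd <- rs + rt
            rd, rs, rt = args
            a, b = regs[rs], regs[rt]        # 4. source operand fetch
            result = a + b                   # 5. ALU operation
            regs[rd] = result                # 7. register write
        elif op == "store":                  # memory[addr] <- rs
            rs, addr = args
            value = regs[rs]                 # 4. source operand fetch
            data_memory[addr] = value        # 6. memory access
        else:
            raise ValueError(f"unknown operation: {op}")
    return regs, data_memory


# Hypothetical program: memory[2] = memory[0] + memory[1]
program = [("load", 0, 0), ("load", 1, 1), ("add", 2, 0, 1), ("store", 2, 2)]
regs, mem = run(program, [7, 5, 0])
print(regs, mem)  # [7, 5, 12, 0] [7, 5, 12]
```

Note that every instruction performs the first three activities, while only some of them perform the remaining four, matching the description above.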
2. Technology
As per Moore’s Law, feature size shrinks by 30% every 2-3 years, resulting in an
increase in transistor count, capacity, and performance, and a reduction in manufacturing cost.
The following figure shows the technologies that have been used over time period, with
an estimate of the relative performance/unit cost for each technology.
A transistor is simply an on/off switch controlled by electricity.
The integrated circuit (IC) combined dozens to hundreds of transistors into a single chip.
Memory: DRAM capacity doubles every 2 years, a 64x size improvement in the last
decade.
Processor: speed doubles every 1.5 years, a 100x performance improvement in the last decade.
Disk: capacity doubles every year, a 250x size improvement in the last decade.
As an individual computer user, you are interested in reducing response time, whereas
datacenter managers are often interested in increasing throughput or bandwidth.
Response time: the time between the start and completion of a task, also referred to as
execution time.
Throughput: the total amount of work done in a given time.
Measuring Performance:
The computer that performs the same amount of work in the least time is the fastest.
Program execution time is measured in seconds per program.
CPU execution time or simply CPU time, which recognizes this distinction, is the
time the CPU spends computing for this task and does not include time spent waiting
for I/O or running other programs.
CPU time can be further divided into the CPU time spent in the program, called user
CPU time, and the CPU time spent in the operating system performing tasks on behalf of
the program, called system CPU time.
We use the term system performance to refer to elapsed time on an unloaded system and
CPU performance to refer to user CPU time.
Instruction Performance:
The performance equations above did not include any reference to the number of
instructions needed for the program.
The execution time must depend on the number of instructions in a program. One way to
think about execution time is that it equals the number of instructions executed multiplied
by the average time per instruction.
Clock cycles required for a program can be written as
CPU clock cycles = Instructions for a program x Average clock cycles per instruction
The term clock cycles per instruction, which is the average number of clock cycles each
instruction takes to execute, is often abbreviated as CPI.
CPI provides one way of comparing two different implementations of the same instruction set
architecture, since the number of instructions executed for a program will be the same.
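The comparison described above can be sketched with made-up numbers. Everything numeric here is an assumption for illustration: implementation A has a CPI of 2.0 and a 250 ps clock cycle, implementation B has a CPI of 1.2 and a 500 ps clock cycle, and both run the same one-million-instruction program. Execution time is taken as clock cycles multiplied by the clock cycle time.

```python
def cpu_clock_cycles(instruction_count, cpi):
    """CPU clock cycles = instructions for a program x average CPI."""
    return instruction_count * cpi


# Same ISA, same program, so the instruction count is identical (assumed value).
instruction_count = 1_000_000

cycles_a = cpu_clock_cycles(instruction_count, cpi=2.0)
cycles_b = cpu_clock_cycles(instruction_count, cpi=1.2)

# Assumed clock cycle times in picoseconds.
time_a_ps = cycles_a * 250
time_b_ps = cycles_b * 500

print(f"A: {cycles_a:.0f} cycles, B: {cycles_b:.0f} cycles")
print(f"A is {time_b_ps / time_a_ps:.1f} times faster than B")
```

Although B needs fewer clock cycles, its longer cycle time makes it slower overall, which is why CPI alone is not enough to rank two implementations.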
The Classic CPU Performance Equation:
The basic performance equation in terms of instruction count (the number of instructions
executed by the program), CPI, and clock cycle time:
CPU time = Instruction count x CPI x Clock cycle time
or, since the clock rate is the inverse of clock cycle time:
CPU time = (Instruction count x CPI) / Clock rate
These formulas are particularly useful because they separate the three key factors that affect
performance.
Components of performance | Units of measure
CPU execution time for a program | Seconds for the program
Instruction count | Instructions executed for the program
Clock cycles per instruction (CPI) | Average number of clock cycles per instruction
Clock cycle time | Seconds per clock cycle
We can measure the CPU execution time by running the program, and the clock cycle time is
usually published as part of the documentation for a computer.
The instruction count and CPI can be more difficult to obtain. Of course, if we know the
clock rate and CPU execution time, we need only one of the instruction count or the CPI to
determine the other.
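A short sketch of the last point: given the clock rate and a measured CPU execution time, knowing either the instruction count or the CPI fixes the other. The numbers below (2 billion instructions, CPI of 1.5, 3 GHz clock) are assumptions chosen for the example, and the function names are illustrative.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = (instruction count x CPI) / clock rate, in seconds."""
    return instruction_count * cpi / clock_rate_hz


def cpi_from_measurement(cpu_time_s, instruction_count, clock_rate_hz):
    """Rearranged equation: CPI = (CPU time x clock rate) / instruction count."""
    return cpu_time_s * clock_rate_hz / instruction_count


# Assumed values for illustration only.
instruction_count = 2_000_000_000
clock_rate_hz = 3_000_000_000

measured = cpu_time(instruction_count, cpi=1.5, clock_rate_hz=clock_rate_hz)
print(measured)                                                          # 1.0 second
print(cpi_from_measurement(measured, instruction_count, clock_rate_hz))  # 1.5
```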
4. Power wall
SPEC (Standard Performance Evaluation Corporation) states that the performance of the
hottest chip grew by 52% per year from 1986 to 2002, and then grew only 20% in the
next three years (about 6% per year). This problem is now called “the Power Wall”.
More transistors mean more power, and thus more heat generated. The design goal for the
late 1990’s and early 2000’s was to drive the clock rate up. This was done by adding
more transistors to a smaller chip.
Unfortunately, this increased the power dissipation (consumption) of the CPU chip
beyond the capacity of inexpensive cooling techniques.
Dense chips were literally “hot”; they radiated considerable thermal power and were
difficult to cool.
There are two solutions to the problem of the Power Wall.
1. The technique taken by IBM servers was to include sophisticated and costly cooling
technologies (such as water cooling systems).
2. The technique taken by commodity processor providers, such as Intel and AMD, was to
move to a multicore design, in which the chips are cooled by simple fans and all-metal
heat radiators.
CPU chips with multiple processors per chip are called “multicore”.
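The dynamic energy depends on the capacitive loading of each transistor and the voltage applied:
Energy of a single transition ∝ 1/2 x Capacitive load x Voltage^2
The power required per transistor is the energy of a transition multiplied by the frequency of transitions:
Power ∝ 1/2 x Capacitive load x Voltage^2 x Frequency switched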
Frequency switched is a function of the clock rate. The capacitive load per transistor is a
function of both the number of transistors connected to an output (called the fanout) and
the technology, which determines the capacitance of both wires and transistors.
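To see why lowering voltage is so effective, the standard proportionality for dynamic power, Power ∝ 1/2 x Capacitive load x Voltage^2 x Frequency switched, can be evaluated as a ratio between a new and an old design. The scaling factors below (capacitive load, voltage, and frequency each reduced to 85% of their old values) are assumptions for illustration, and the function name is hypothetical.

```python
def relative_dynamic_power(cap_scale, voltage_scale, freq_scale):
    """Ratio new/old of dynamic power under P ∝ 1/2 * C * V^2 * f.

    Each argument is the new value divided by the old value; the constant
    factor 1/2 cancels in the ratio.
    """
    return cap_scale * voltage_scale ** 2 * freq_scale


# Assumed scenario: capacitive load, voltage, and frequency all drop by 15%.
ratio = relative_dynamic_power(0.85, 0.85, 0.85)
print(round(ratio, 2))  # 0.52, i.e. roughly half the original power
```

Because voltage enters squared, a 15% voltage reduction alone already cuts power to about 72% of the original.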
5. Uniprocessors to multiprocessors
The power wall has forced a dramatic change in the design of microprocessors.
The clock speed of a uniprocessor has reached saturation and cannot be
increased beyond a certain limit because of power consumption and heat dissipation
issues.
As the physical size of the chip decreased while the number of transistors per chip increased,
clock speed increased, which boosted the heat dissipation across the chip to a dangerous level.
Cooling and heat-sink requirements became a problem.
There were limitations in the use of silicon surface area and in reducing the size of
individual gates further.
To gain performance within a single core, many techniques such as pipelining,
superpipelining, and superscalar architectures are used.
6. Instructions
The instruction format of an instruction is usually depicted in a rectangular box
symbolizing the bits of the instruction as they appear in memory words or in a control
register.
An instruction format defines the layout of the bits of an instruction, in terms of its
constituent parts.
The bits of an instruction are divided into groups called fields.
The most common fields found in instruction formats are:
An operation code field that specifies the operation to be performed.
An address field that designates a memory address or a processor register.
A mode field that specifies the way the operand or the effective address is determined.
The operation code field of an instruction is a group of bits that defines various
processor operations, such as add, subtract, complement, and shift.
Address fields contain either a memory address or a register address. Mode fields
offer a variety of ways in which an operand is chosen.
There are mainly four types of instruction formats:
Three address instructions
Two address instructions
One address instructions
Zero address instructions
The MOV instruction moves or transfers the operands to and from memory and processor
registers.
The first symbol listed in an instruction is assumed to be both a source and the destination where
the result of the operation is transferred.
One address instructions
One address instructions use an implied accumulator (AC) register for all data manipulation.
For multiplication and division there is a need for a second register.
However, here we will neglect the second register and assume that the AC contains the result of
all operations.
The program to evaluate X= (A+B)*(C+D) is
All operations are done between the AC register and a memory operand.
T is the address of a temporary memory location required for storing the intermediate result.
Commercially available computers also use this type of instruction format.
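The program listing itself is missing from these notes. One program consistent with the description above (every operation works between AC and a memory operand, with T holding the intermediate result) can be sketched as a tiny accumulator-machine simulator. The mnemonics LOAD, ADD, MUL, and STORE and the operand values are assumptions made for this example.

```python
def run_accumulator(program, memory):
    """Run a one-address program; every instruction uses the implied AC."""
    ac = 0
    for op, addr in program:
        if op == "LOAD":        # AC <- M[addr]
            ac = memory[addr]
        elif op == "ADD":       # AC <- AC + M[addr]
            ac += memory[addr]
        elif op == "MUL":       # AC <- AC * M[addr]
            ac *= memory[addr]
        elif op == "STORE":     # M[addr] <- AC
            memory[addr] = ac
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# X = (A + B) * (C + D), with assumed mnemonics
one_address_program = [
    ("LOAD", "A"),   # AC <- A
    ("ADD", "B"),    # AC <- A + B
    ("STORE", "T"),  # T  <- A + B
    ("LOAD", "C"),   # AC <- C
    ("ADD", "D"),    # AC <- C + D
    ("MUL", "T"),    # AC <- (C + D) * (A + B)
    ("STORE", "X"),  # X  <- AC
]

memory = {"A": 2, "B": 3, "C": 4, "D": 5, "T": 0, "X": 0}
print(run_accumulator(one_address_program, memory)["X"])  # 45
```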
Zero address instructions
A stack organized computer does not use an address field for the instructions ADD and MUL.
The PUSH and POP instructions, however, need an address field to specify the operand that
communicates with the stack.
The following program shows how X = (A+B)*(C+D) will be written for a stack organized
computer. (TOS stands for top of stack.)
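The stack program listing is not reproduced in these notes. A minimal sketch of how such a program could look, run on a small stack-machine simulator, is given here. The exact instruction sequence and the operand values are assumptions; only PUSH, POP, ADD, and MUL are named in the text above.

```python
def run_stack_machine(program, memory):
    """Run a zero-address program; ADD and MUL take operands from the stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":                    # TOS <- M[addr]
            stack.append(memory[instr[1]])
        elif op == "POP":                   # M[addr] <- TOS
            memory[instr[1]] = stack.pop()
        elif op == "ADD":                   # TOS <- sum of the top two entries
            stack.append(stack.pop() + stack.pop())
        elif op == "MUL":                   # TOS <- product of the top two entries
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# X = (A + B) * (C + D)
stack_program = [
    ("PUSH", "A"),   # TOS <- A
    ("PUSH", "B"),   # TOS <- B
    ("ADD",),        # TOS <- A + B
    ("PUSH", "C"),   # TOS <- C
    ("PUSH", "D"),   # TOS <- D
    ("ADD",),        # TOS <- C + D
    ("MUL",),        # TOS <- (C + D) * (A + B)
    ("POP", "X"),    # X   <- TOS
]

result = run_stack_machine(stack_program, {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0})
print(result["X"])  # 45
```

Only PUSH and POP carry an address field; ADD and MUL are single-element tuples because their operands are implied by the stack.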
In MIPS, this design compromise produces instructions that are 32 bits long. The governing
principle follows:
MIPS instructions have three different formats:
1. R format - Arithmetic instructions
2. I format - Branch, transfer, and immediate instructions
3. J format - Jump instructions
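As an illustration of how a 32-bit instruction is built from fields, the sketch below packs an R-format instruction. The field layout used (op 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6) is the standard MIPS R format but is not spelled out in these notes, and the helper name encode_r_format is hypothetical. The example encodes add $t0, $s1, $s2, using the conventional register numbers 8, 17, and 18 and funct value 32.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value   # append this field on the right
    return word


# add $t0, $s1, $s2  ->  rd = $t0 (8), rs = $s1 (17), rt = $s2 (18)
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:#010x}")  # 0x02324020
print(f"{word:032b}")   # the same word as 32 binary digits
```

The field widths add up to 6 + 5 + 5 + 5 + 5 + 6 = 32 bits, which is the fixed instruction length mentioned above.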
7. Representing instructions
8. Logical operations
9. Control operations
10. Addressing and addressing modes
Addressing modes are the ways used to identify the location of an operand in an instruction.
Each instruction needs to specify the data on which the operation is to be performed.
The operand (data) may be in the accumulator, in a general-purpose register, or at some
specified memory location.
The way used to identify the location of an operand specified in an instruction is
therefore called its addressing mode.
The addressing modes in computer architecture define how an operand is chosen to
execute an instruction. Assume operand size = 2 bytes.
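The individual modes are not listed in these notes, so the following sketch uses five commonly taught ones (immediate, register, direct, indirect, register indirect) as an assumption. It shows how the same 2-byte operand can be reached in different ways. The memory contents, register names, and little-endian byte order are all invented for the example.

```python
# Byte-addressable memory; every operand occupies 2 bytes (little-endian assumed).
memory = bytearray(16)
memory[4:6] = (0x1234).to_bytes(2, "little")   # the operand lives at address 4
memory[8:10] = (4).to_bytes(2, "little")       # address 8 holds a pointer to address 4
registers = {"R1": 4, "R2": 0x00FF}            # hypothetical register file


def read_operand(address):
    """Read a 2-byte operand starting at the given byte address."""
    return int.from_bytes(memory[address:address + 2], "little")


def fetch_operand(mode, field):
    """Return the operand selected by the addressing mode and address field."""
    if mode == "immediate":           # operand is the field itself
        return field
    if mode == "register":            # operand is in the named register
        return registers[field]
    if mode == "direct":              # field is the operand's memory address
        return read_operand(field)
    if mode == "indirect":            # field is the address of the operand's address
        return read_operand(read_operand(field))
    if mode == "register_indirect":   # register holds the operand's address
        return read_operand(registers[field])
    raise ValueError(f"unknown addressing mode: {mode}")


print(hex(fetch_operand("direct", 4)))              # 0x1234
print(hex(fetch_operand("indirect", 8)))            # 0x1234
print(hex(fetch_operand("register_indirect", "R1")))  # 0x1234
```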
11. Stacks and queues