COA Notes Part 1: Basic Structure of Computers
ARCHITECTURE
PART 1
▪ Functional units
▪ Basic operational concepts
▪ Bus structures
▪ Performance and metrics
▪ Instructions and instruction sequencing
▪ Hardware/software interface
▪ Instruction set architecture
▪ Addressing modes
▪ RISC
▪ CISC
▪ ALU design
Computer Organization:
Computer organization deals with the operational units of a computer and their interconnections, i.e. how the hardware components operate and how they are connected together to form the computer system.
Computer hardware:
Computer hardware consists of the physical electronic circuits, displays, storage media, and communication facilities that make up a computer system.
Computer Architecture:
Computer architecture deals with the structure and behaviour of the computer as seen by the programmer: the instruction set, the instruction formats, and the addressing modes.
Functional Units
A computer consists of five functionally independent main parts: input, memory, arithmetic and logic, output, and control units.
RAM:
Memory in which any location can be reached in a short and fixed amount of time after specifying its address is called random-access memory (RAM).
The time required to access one word is called the memory access time.
Cache Memory:
The small, fast RAM units are called caches. They are tightly coupled with the processor to achieve high performance.
Main Memory:
The largest and the slowest unit is called the main memory.
ALU:
Most computer operations are executed in the Arithmetic and Logic Unit (ALU).
Consider an example:
Suppose two numbers located in memory are to be added. They are brought into the processor, and the actual addition is carried out by the ALU. The sum may then be stored in memory or retained in the processor for immediate use.
Access time to registers is shorter than access time to the fastest cache unit in memory.
Output Unit:
Its function is to send the processed results to the outside world.
Eg: Printer
Printers are capable of printing as many as 10,000 lines per minute, but even this is very slow compared with the speed of the processor.
Eg: 1
Add LOC A, R0
The instruction is fetched from memory, and the operand at LOC A is fetched. It is then added to the contents of R0, and the resulting sum is stored in register R0.
Eg: 2
Load LOC A, R1
Add R1, R0
The Load instruction transfers the contents of LOC A into register R1; the Add instruction then adds the contents of registers R1 and R0 and places the sum into R0.
Fig: Connection between the processor and the main memory
Instruction Register (IR):
The IR holds the instruction that is currently being executed. Its output is available to the control circuits.
Program Counter (PC):
The PC contains the memory address of the next instruction to be fetched and executed.
Memory Address Register (MAR):
The MAR holds the address of the location to be accessed.
Memory Data Register (MDR):
The MDR contains the data to be written into or read out of the addressed location.
The MAR and MDR facilitate communication with the memory.
Operation Steps:
The program resides in memory, and the PC is set to point to its first instruction. The contents of the PC are transferred to the MAR and a Read signal is sent to the memory. The addressed word is read into the MDR and then loaded into the IR. The instruction is decoded and executed, fetching any operands it needs from memory or registers, and the PC is incremented to point to the next instruction.
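As a rough sketch of these steps, the following C fragment walks an invented one-instruction machine through a fetch and an Add; the memory contents, register names, and the opcode value 1 are all made up for illustration.

    #include <stdio.h>

    /* A hypothetical illustration of the fetch/execute steps for
       "Add LOC A, R0": names and encodings are invented for this sketch. */
    int main(void) {
        int MEM[16] = {0};      /* main memory (word addressed)       */
        int PC = 0, IR, MAR, MDR, R0 = 5;
        int LOCA = 10;          /* address of the operand             */

        MEM[0]    = 1;          /* 1 = our invented "Add" opcode      */
        MEM[LOCA] = 7;          /* the operand at LOC A               */

        MAR = PC;               /* step 1: PC -> MAR                  */
        MDR = MEM[MAR];         /* step 2: read memory                */
        IR  = MDR;              /* step 3: MDR -> IR (decode follows) */
        PC  = PC + 1;           /* step 4: point to next instruction  */

        if (IR == 1) {          /* execute: fetch operand and add     */
            MAR = LOCA;
            MDR = MEM[MAR];
            R0  = R0 + MDR;     /* [LOC A] + [R0] -> R0               */
        }
        printf("R0 = %d\n", R0);   /* prints R0 = 12 */
        return 0;
    }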
Interrupt:
Normal execution of a program may be preempted if some device requires urgent service. The device raises an interrupt signal; the processor suspends the current program, saves its state, and executes an interrupt-service routine before resuming the interrupted program.
BUS STRUCTURES:
A bus is a group of lines that serves as a connecting path for several devices; it carries address, data, and control signals between the processor, the memory, and the I/O units.
A buffer register, when connected to the bus, holds the information during a transfer. The buffer register prevents the high-speed processor from being locked to a slow I/O device during a sequence of data transfers.
SOFTWARE:
Software is divided into two broad categories:
▪ Application programs
▪ System programs
Application Program:
An application program is written in a high-level language to perform a particular task for the user, e.g. a word processor. System programs, such as compilers and the operating system (OS), manage the computer and support the development and execution of application programs.
Functions of OS:
The OS coordinates all activities in the computer system: it assigns memory and other resources to programs, schedules program execution, and manages the transfer of data between main memory and I/O devices.
Steps:
To run an application program, the OS loads it from disk into main memory, starts its execution, and services its requests, for example transferring results to a printer, before regaining control when the program terminates.
PERFORMANCE:
Elapsed Time:
The total time required to execute a program is called the elapsed time. It depends on all the units in the computer system.
Processor Time:
The period during which the processor is active is called the processor time. It depends on the hardware involved in the execution of individual machine instructions.
Processor clock:
The processor circuits are controlled by a timing signal called a clock. The clock defines regular time intervals called clock cycles; if the length of one cycle is P, the clock rate is R = 1/P cycles per second.
Basic Performance Equation:
T = (N*S)/R
where N is the number of machine instructions executed, S is the average number of basic steps (clock cycles) per instruction, and R is the clock rate in cycles per second. To achieve high performance, T must be reduced by reducing N and S and by increasing R.
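As a quick worked example of the equation (the values of N, S, and R below are made up purely for illustration), the following C fragment evaluates T:

    #include <stdio.h>

    /* Basic performance equation T = (N * S) / R, with sample values. */
    int main(void) {
        double N = 100e6;   /* instructions executed          */
        double S = 4.0;     /* average basic steps per instr. */
        double R = 500e6;   /* clock rate in cycles/second    */
        printf("T = %.2f s\n", (N * S) / R);  /* prints T = 0.80 s */
        return 0;
    }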
Pipelining and Superscalar operation:
Pipelining overlaps the execution of successive instructions, so that different steps of several instructions proceed in parallel. A superscalar processor goes further: it has multiple execution units and can begin the execution of several instructions in the same clock cycle, making the effective value of S less than 1.
Clock Rate:
The clock rate R can be increased in two ways: by improving the integrated-circuit technology, which makes the logic circuits faster, and by reducing the amount of processing done in one basic step, which shortens the clock period P.
RISC vs CISC:
A RISC processor uses simple instructions, each requiring a small number of basic steps; a program therefore contains a larger number of instructions N, but S is small. A CISC processor uses complex instructions, so N is smaller, but each instruction requires more steps, making S larger.
Functions of Compiler:
A compiler translates a high-level language program into a sequence of machine instructions. An optimizing compiler rearranges and simplifies this sequence so as to reduce N and S, and thus the execution time T.
Performance Measurement:
The performance of a computer is measured by running benchmark programs selected by the System Performance Evaluation Corporation (SPEC). The SPEC rating for one program is the ratio of the running time on a reference computer to the running time on the computer under test.
The overall SPEC rating is the geometric mean of the individual ratings:
SPEC rating = ( SPEC1 × SPEC2 × … × SPECn )^(1/n)
where n is the number of programs in the suite and SPECi is the rating for program i in the suite.
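A minimal C sketch of this computation, using invented SPECi values:

    #include <stdio.h>
    #include <math.h>

    /* SPEC rating as the geometric mean of per-program ratings.
       The sample ratings below are invented for illustration. */
    int main(void) {
        double spec[] = {12.0, 15.0, 10.0, 14.0};  /* SPECi values */
        int n = sizeof spec / sizeof spec[0];
        double product = 1.0;
        for (int i = 0; i < n; i++)
            product *= spec[i];
        printf("SPEC rating = %.2f\n", pow(product, 1.0 / n));
        return 0;
    }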
INSTRUCTION AND INSTRUCTION SEQUENCING
A computer must have instructions capable of transferring data, performing arithmetic and logic operations, and controlling program sequencing. The possible locations of operands are:
▪ Memory locations
▪ Processor registers
▪ Registers in the I/O subsystem
Eg: Decrement R1, which subtracts 1 from the contents of register R1.
Condition Codes:
The processor keeps track of information about the result of the most recent operation in condition code flags stored in the status register: N (negative), Z (zero), V (overflow), and C (carry). These flags are tested by conditional branch instructions.
The different ways in which the location of an operand is specified in an instruction are called addressing modes:
▪ Immediate mode
▪ Register mode
▪ Absolute mode
▪ Indirect mode
▪ Index mode
▪ Base with index
▪ Base with index and offset
▪ Relative mode
▪ Auto-increment mode
▪ Auto-decrement mode
Variables:
A variable is represented by allocating a register or a memory location to hold its value. Variables are accessed using the Register and Absolute modes.
Register Mode:
The operand is the contents of a processor register; the name of the register is given in the instruction. EA = Ri.
Absolute Mode:
The operand is in a memory location; the address of this location is given explicitly in the instruction. EA = LOC.
Constants:
Address and data constants are represented using the Immediate mode.
Immediate Mode:
The operand is given explicitly in the instruction.
Eg: Move #200, R0
This places the value 200 in the register R0. The sharp sign (#) in front of the value indicates an immediate operand; the Immediate mode can be used only to specify the value of a source operand.
Indirect Mode:
The effective address of the operand is the contents of a register or memory location whose address appears in the instruction.
Eg: Add (R1), R0
The address B of the operand is stored in register R1. To obtain the operand, the processor first reads the contents of R1 (the address B) and then reads memory location B; this extra step is the indirection.
The register or memory location that contains the address of an operand is called a pointer.
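A C pointer gives a close analogue of this indirection; the variable names R0, R1 and B below simply mirror the example above:

    #include <stdio.h>

    /* A C pointer behaves like the Indirect mode: R1 holds the
       address of the operand, and the operand is reached through it. */
    int main(void) {
        int B   = 42;     /* the operand, stored at address "B"     */
        int *R1 = &B;     /* R1 contains the address of B           */
        int R0  = 8;
        R0 = R0 + *R1;    /* Add (R1),R0 : one level of indirection */
        printf("R0 = %d\n", R0);   /* prints R0 = 50 */
        return 0;
    }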
Index Mode:
The effective address of the operand is generated by adding a constant value X to the contents of a register Ri:
EA = X + [Ri]
In the usual case, the index register Ri contains the address of a memory location, and the value X defines an offset (also called a displacement) from this address to the location of the operand.
In an alternative use, the constant X corresponds to a memory address, and the contents of the index register define the offset to the operand.
In either case, the effective address is the sum of two values; one is given explicitly in the instruction, and the other is stored in a register.
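In C terms, indexing an array computes exactly this kind of effective address; the array contents and the values of X and Ri below are illustrative only:

    #include <stdio.h>

    /* Index mode: EA = X + [Ri]. Stepping through a list with a
       constant offset is the classic use of this mode. */
    int main(void) {
        int list[5] = {3, 1, 4, 1, 5};
        int Ri = 0;     /* index register: variable offset           */
        int X  = 2;     /* constant given in the instruction         */
        /* operand = contents of location X + [Ri] */
        printf("operand = %d\n", list[X + Ri]);  /* prints 4 */
        return 0;
    }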
Relative Addressing:
Relative Mode:
The effective address is determined by the Index mode using the program counter (PC) in place of a general-purpose register (GPR).
This mode can be used to access data operands, but its most common use is to specify the target address in branch instructions.
Eg: Branch>0 LOOP
This causes program execution to go to the branch target location, identified by the name LOOP, if the branch condition is satisfied.
Additional Modes:
Auto-increment mode
Auto-decrement mode
Auto-increment mode:
The effective address of the operand is the contents of a register specified in the instruction. After accessing the operand, the contents of this register are automatically incremented to point to the next item in a list. It is written as (Ri)+.
Auto-decrement mode:
The contents of a register specified in the instruction are first automatically decremented, and the new contents are then used as the effective address of the operand. It is written as -(Ri).
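The C idiom *p++ is a close analogue of the auto-increment mode; the list contents below are illustrative only:

    #include <stdio.h>

    /* *Ri++ mirrors the Auto-increment mode (Ri)+ : use the address
       in the register, then advance it past the item just accessed. */
    int main(void) {
        int list[4] = {10, 20, 30, 40};
        int *Ri = list;             /* register holding an address */
        int sum = 0;
        for (int k = 0; k < 4; k++)
            sum += *Ri++;           /* access operand, then advance Ri */
        printf("sum = %d\n", sum);  /* prints sum = 100 */
        return 0;
    }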
RISC
Pronounced "risk", RISC stands for Reduced Instruction Set Computer. RISC chips evolved around the mid-1980s as a reaction to CISC chips. The philosophy behind them is that almost no one uses the complex assembly-language instructions provided by CISC; people mostly use compilers, which rarely generate complex instructions. Apple, for instance, uses RISC chips. Therefore fewer, simpler and faster instructions would be better than the large, complex and slower CISC instructions. However, more instructions are needed to accomplish a given task.
Another advantage of RISC is that, in theory, the simpler instructions mean RISC chips require fewer transistors, which makes them easier to design and cheaper to produce.
Finally, it is easier to write powerful optimised compilers, since fewer instructions exist.
RISC vs CISC
There is still considerable controversy among experts about which architecture is better. Some say that RISC is cheaper and faster and therefore the architecture of the future. Others note that by making the hardware simpler, RISC puts a greater burden on the software: software needs to become more complex, and developers need to write more lines of code for the same tasks.
They therefore argue that RISC is not the architecture of the future, since conventional CISC chips are becoming faster and cheaper anyway.
RISC has now existed for more than 10 years and hasn't been able to push CISC out of the market. If we set aside the embedded market and look mainly at the market for PCs, workstations and servers, probably at least 75% of the processors are based on the CISC architecture. Most of them follow the x86 standard (Intel, AMD, etc.), and even in mainframe territory CISC is dominant via the IBM/390 chip. It looks like CISC is here to stay. Is RISC then really not better? The answer isn't quite that simple. RISC and CISC architectures are becoming more and more alike. Many of today's RISC chips support just as many instructions as yesterday's CISC chips. The PowerPC 601, for example, supports more instructions than the Pentium, yet the 601 is considered a RISC chip while the Pentium is definitely CISC. Furthermore, today's CISC chips use many techniques formerly associated with RISC chips.
ALU Design
The mathematician John von Neumann proposed the ALU concept in 1945, when he wrote a report on the foundations for a new computer called the EDVAC. Research into ALUs remains an important part of computer science, falling under "Arithmetic and logic structures" in the ACM Computing Classification System.
Numerical systems
An ALU must process numbers using the same format as the rest of the digital circuit. The format of modern processors is almost always the two's complement binary number representation. Early computers used a wide variety of number systems, including ones' complement, two's complement, sign-magnitude format, and even true decimal systems with ten tubes per digit.
ALUs for each of these numeric systems had different designs, and that influenced the current preference for two's complement, as this is the representation that makes it easiest for the ALU to calculate additions and subtractions.
The ones' complement and two's complement number systems allow subtraction to be accomplished by adding the negative of a number in a very simple way, which removes the need for specialized subtraction circuits; however, calculating the negative in two's complement requires adding a one to the low-order bit and propagating the carry. An alternative way to do two's complement subtraction of A - B is to present a 1 to the carry input of the adder and use ~B rather than B as the second input.
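A small C demonstration of this trick, using unsigned integers (whose arithmetic C guarantees to wrap modulo a power of two, matching the adder's behaviour):

    #include <stdio.h>

    /* A - B in two's complement: feed ~B and a carry-in of 1 to the
       adder, so no separate subtractor circuit is needed. */
    int main(void) {
        unsigned a = 9, b = 5;
        unsigned diff = a + ~b + 1u;    /* ~B plus carry-in of 1 equals -B */
        printf("%u - %u = %u\n", a, b, diff);  /* prints 9 - 5 = 4 */
        return 0;
    }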
Practical overview
Simple operations
Fig: A simple example arithmetic logic unit (2-bit ALU) that performs AND, OR, XOR, and addition.
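A minimal C sketch of such an ALU; the opcode encoding (0 = AND, 1 = OR, 2 = XOR, otherwise ADD) is an arbitrary choice for this example:

    #include <stdio.h>

    /* A 2-bit ALU like the one in the figure: the opcode selects
       AND, OR, XOR, or ADD, and the result is truncated to 2 bits. */
    unsigned alu(unsigned op, unsigned a, unsigned b) {
        unsigned r;
        switch (op) {
        case 0:  r = a & b; break;      /* AND */
        case 1:  r = a | b; break;      /* OR  */
        case 2:  r = a ^ b; break;      /* XOR */
        default: r = a + b; break;      /* ADD */
        }
        return r & 0x3;                 /* keep only 2 result bits */
    }

    int main(void) {
        printf("2 + 3 (mod 4) = %u\n", alu(3, 2, 3));  /* prints 1 */
        printf("2 & 3         = %u\n", alu(0, 2, 3));  /* prints 2 */
        return 0;
    }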
Complex operations
Engineers can design an ALU to calculate any operation. The more complex the operation, the more expensive the ALU is, the more space it uses in the processor, and the more power it dissipates. Therefore, engineers compromise: they make the ALU powerful enough to make the processor fast, but not so complex as to become prohibitive. For example, computing the square root of a number might use:
▪ Calculation in a single clock: a very complex ALU that calculates the square root in one operation.
▪ Calculation pipeline: a group of simpler circuits that calculates the square root in stages, with intermediate results passing through the circuits like a factory production line.
▪ Iterative calculation: a simple ALU that calculates the square root through several steps under the direction of a control unit.
▪ Co-processor: a separate circuit or chip that calculates the square root.
▪ Software library: a routine built from simpler operations that the program calls to calculate the square root.
The options above go from the fastest and most expensive to the slowest and least expensive. Therefore, while even the simplest computer can calculate the most complicated formula, the simplest computers will usually take a long time to do so, because of the several steps needed to calculate the formula.
Powerful processors like the Intel Core and AMD64 implement option #1 for several simple operations, #2 for the most common complex operations, and #3 for the extremely complex operations.
Inputs and outputs
The inputs to the ALU are the data to be operated on (called operands) and a
code from the control unit indicating which operation to perform. Its output is
the result of the computation.
In many designs the ALU also takes or generates as inputs or outputs a set of
condition codes from or to a status register. These codes are used to indicate
cases such as carry-in or carry-out, overflow, divide-by-zero, etc.
In modern practice, engineers typically refer to the ALU as the circuit that
performs integer arithmetic operations (like two's complement and BCD).
Circuits that calculate more complex formats like floating point, complex
numbers, etc. usually receive a more specific name such as FPU.
Representation
The maximum value of a fixed-point type is simply the largest value that can be represented in the underlying integer type, multiplied by the scaling factor, and similarly for the minimum value. For example, consider a fixed-point type represented as a binary integer with b bits in two's complement format, with a scaling factor of 1/2^f (that is, the last f bits are fraction bits): the minimum representable value is −2^(b−1)/2^f and the maximum value is (2^(b−1) − 1)/2^f.
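A short C check of these bounds for an assumed Q8.8 format (b = 16, f = 8):

    #include <stdio.h>

    /* Range of a b-bit two's complement fixed-point type with f
       fraction bits: min = -2^(b-1)/2^f, max = (2^(b-1)-1)/2^f. */
    int main(void) {
        int b = 16, f = 8;                      /* a Q8.8 format */
        double scale = 1.0 / (1 << f);
        double min = -(double)(1 << (b - 1)) * scale;
        double max =  (double)((1 << (b - 1)) - 1) * scale;
        printf("min = %f, max = %f\n", min, max);
        /* prints min = -128.000000, max = 127.996094 */
        return 0;
    }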
Operations
To convert a number from a fixed-point type with scaling factor R to another type with scaling factor S, the underlying integer must be multiplied by R and divided by S; that is, multiplied by the ratio R/S. Thus, for example, to convert the value 1.23 = 123/100 from a type with scaling factor R = 1/100 to one with scaling factor S = 1/1000, the underlying integer 123 must be multiplied by (1/100)/(1/1000) = 10, yielding the representation 1230/1000. If S does not divide R (in particular, if the new scaling factor S is larger than the original R), the new integer will have to be rounded. The rounding rules and methods are usually part of the language's specification.
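The same conversion in C, with the factor 10 = R/S written out:

    #include <stdio.h>

    /* Converting 1.23 from scaling factor R = 1/100 to S = 1/1000:
       multiply the underlying integer by R/S = 10. */
    int main(void) {
        int x = 123;       /* represents 1.23 with scale 1/100 */
        int y = x * 10;    /* same value with scale 1/1000     */
        printf("%d/1000 = %.3f\n", y, y / 1000.0);  /* 1230/1000 = 1.230 */
        return 0;
    }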
To add or subtract two values of the same fixed-point type, it is sufficient to add or subtract the underlying integers and keep their common scaling factor. The result can be exactly represented in the same type, as long as no overflow occurs (i.e. provided that the sum of the two integers fits in the underlying integer type). If the numbers have different fixed-point types, with different scaling factors, then one of them must be converted to the other before the sum.
To divide two fixed-point numbers, one takes the integer quotient of their underlying integers and assumes that the scaling factor is the quotient of their scaling factors. The first division involves rounding in general. For example, division of 3456 scaled by 1/100 (34.56) by 1234 scaled by 1/1000 (1.234) yields the integer 3456÷1234 = 3 (rounded) with scale factor (1/100)/(1/1000) = 10, that is, 30. One can obtain a more accurate result by first converting the dividend to a more precise type: in the same example, converting 3456 scaled by 1/100 (34.56) to 3456000 scaled by 1/100000, before dividing by 1234 scaled by 1/1000 (1.234), would yield 3456000÷1234 = 2801 (rounded) with scaling factor (1/100000)/(1/1000) = 1/100, that is, 28.01 (instead of 30). If both operands and the desired result are represented in the same fixed-point type, then the quotient of the two integers must be explicitly divided by the common scaling factor.
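A C sketch reproducing both divisions; the explicit rounding (adding half the divisor before dividing) is one common choice, not the only one:

    #include <stdio.h>

    /* Reproducing the example: 34.56 / 1.234 computed coarsely, and
       again after first widening the dividend to a finer scale. */
    int main(void) {
        /* coarse: 3456 (scale 1/100) / 1234 (scale 1/1000) */
        int q1 = (3456 + 1234 / 2) / 1234;       /* rounded quotient: 3 */
        printf("coarse: %d, scale 10 -> %d\n", q1, q1 * 10);    /* 30  */

        /* widen dividend to scale 1/100000 first: 3456000 / 1234 */
        long q2 = (3456000L + 1234 / 2) / 1234;  /* rounded: 2801       */
        printf("fine: %ld, scale 1/100 -> %.2f\n", q2, q2 / 100.0);
        /* prints 28.01 */
        return 0;
    }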
Precision loss and overflow
Because fixed-point operations can produce results that have more bits than the operands, there is a possibility of information loss. For instance, the result of a fixed-point multiplication could potentially have as many bits as the sum of the number of bits in the two operands. In order to fit the result into the same number of bits as the operands, the answer must be rounded or truncated. If this is the case, the choice of which bits to keep is very important. When multiplying two fixed-point numbers with the same format, for instance with I integer bits and Q fractional bits, the answer could have up to 2I integer bits and 2Q fractional bits.
For simplicity, fixed-point multiply procedures often use the same result format as the operands. This has the effect of keeping the middle bits: the I least significant integer bits and the Q most significant fractional bits. Fractional bits lost below this point represent a precision loss, which is common in fractional multiplication. If any integer bits are lost, however, the value will be radically inaccurate.
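A sketch of this middle-bits rule for an assumed Q16.16 format, where the 64-bit product carries 32 fraction bits and a right shift by 16 keeps the middle:

    #include <stdio.h>
    #include <stdint.h>

    /* Multiplying two Q16.16 numbers: the 64-bit product has 32
       fraction bits, so shifting right by 16 keeps the middle bits. */
    int32_t qmul(int32_t a, int32_t b) {
        return (int32_t)(((int64_t)a * b) >> 16);
    }

    int main(void) {
        int32_t a = 3 << 16;                    /* 3.0 in Q16.16 */
        int32_t b = 1 << 15;                    /* 0.5 in Q16.16 */
        printf("%f\n", qmul(a, b) / 65536.0);   /* prints 1.500000 */
        return 0;
    }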
Some operations, like division, often have built-in result limiting, so that any positive overflow results in the largest positive number that can be represented by the current format. Likewise, negative overflow results in the largest negative number representable by the current format. This built-in limiting is often referred to as saturation.
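A minimal sketch of saturation for a 16-bit format:

    #include <stdint.h>
    #include <stdio.h>

    /* Saturating 16-bit add: clamp to the largest/smallest value the
       format can hold instead of wrapping around on overflow. */
    int16_t sat_add16(int16_t a, int16_t b) {
        int32_t s = (int32_t)a + b;     /* widen so the sum cannot wrap */
        if (s > INT16_MAX) return INT16_MAX;
        if (s < INT16_MIN) return INT16_MIN;
        return (int16_t)s;
    }

    int main(void) {
        printf("%d\n", sat_add16(30000, 10000));
        /* prints 32767 (saturated), not the wrapped value -25536 */
        return 0;
    }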
If the position of the binary point is allowed to vary, the binary point is said to float, and the numbers are called floating-point numbers.