COMPUTER ORGANIZATION (18EC35)
MODULE 1: BASIC STRUCTURE OF COMPUTERS
DEPT. OF ELECTRONICS & COMMUNICATION ENGG.
• Individual instructions are brought from the memory to the processor.
ADD LOCA, R0
• This instruction is an addition operation. The following are the steps to execute the
instruction:
Step 1: Fetch the instruction from main-memory into the processor.
Step 2: Fetch the operand at location LOCA from main-memory into the processor.
Step 3: Add the memory operand (i.e. fetched contents of LOCA) to the contents of register R0.
Step 4: Store the result (sum) in R0.
• If the operand is first brought into a register, Steps 2 and 3 become:
Step 2: Fetch the operand at location LOCA from main-memory into the register R1.
Step 3: Add the contents of register R1 and the contents of register R0.
• The control-unit generates the timing-signals that determine when a given action is to
take place.
• During the execution of an instruction, the contents of PC are updated to point to the next
instruction.
• The MDR contains the data to be written into or read out of the addressed location.
• MAR and MDR facilitate communication with memory. (IR = Instruction Register,
PC = Program Counter, MAR = Memory Address Register, MDR = Memory Data Register)
1) The address of first instruction (to be executed) gets loaded into PC.
2) The contents of PC (i.e. address) are transferred to the MAR & control-unit issues
Read signal to memory.
3) After a certain amount of elapsed time, the first instruction is read out of memory and
placed into MDR.
4) Next, the contents of MDR are transferred to IR. At this point, the instruction can be
decoded & executed.
5) To fetch an operand, its address is placed into MAR & control-unit issues Read
signal. As a result, the operand is transferred from memory into MDR, and then it is
transferred from MDR to ALU.
8) If the result of this operation is to be stored in the memory, then the result is sent to
the MDR.
9) The address of the location where the result is to be stored is sent to the MAR and a
Write cycle is initiated.
10) At some point during execution, the contents of PC are incremented to point to the next
instruction in the program.
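The register transfers in the steps above can be traced with a small simulation. This is only a sketch: the memory contents, the addresses 100 and 104, and the 4-byte instruction size are made-up values for illustration, and the PC is incremented inside the fetch for simplicity.

```python
# Toy model of the instruction-fetch steps. Memory contents, addresses and
# the instruction size are hypothetical illustration values.
memory = {100: "ADD LOCA, R0", 104: "STORE R0, SUM"}

def fetch(regs, memory, instr_size=4):
    """Fetch one instruction: PC -> MAR, memory[MAR] -> MDR, MDR -> IR."""
    regs["MAR"] = regs["PC"]            # step 2: address is transferred to MAR
    regs["MDR"] = memory[regs["MAR"]]   # step 3: Read places the word in MDR
    regs["IR"] = regs["MDR"]            # step 4: instruction is moved to IR
    regs["PC"] += instr_size            # step 10: PC now points to the next instruction
    return regs["IR"]

# Step 1: the address of the first instruction is loaded into PC.
regs = {"PC": 100, "MAR": None, "MDR": None, "IR": None}
first = fetch(regs, memory)
second = fetch(regs, memory)
print(first, "|", second, "| PC =", regs["PC"])
```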
• A bus is a group of lines that serves as a connecting path for several devices.
• There are 2 types of Bus structures: 1) Single Bus Structure and 2) Multiple Bus Structure.
Because the bus can be used for only one transfer at a time, only 2 units can
actively use the bus at any given time.
Bus control lines are used to arbitrate multiple requests for use of the bus.
Advantages:
• Buffer Registers
→ are included with the devices to hold the information during transfers.
→ prevent a high-speed processor from being locked to a slow I/O device during data
transfers.
1.5 PERFORMANCE
• The most important measure of performance of a computer is how quickly it can execute
programs.
1) Instruction-set.
• Because programs are usually written in a HLL, performance is also affected by the
compiler that translates programs into machine language. (HLL = High-Level Language).
• For best performance, it is necessary to design the compiler, machine instruction set and
hardware in a co-ordinated way.
• Let us examine the flow of program instructions and data between the memory & the processor.
• At the start of execution, all program instructions are stored in the main-memory.
• As execution proceeds, instructions are fetched into the processor, and a copy is placed in
the cache.
• Later, if the same instruction is needed a second time, it is read directly from the cache.
• To execute a machine instruction, the processor divides the action to be performed into a
sequence of basic steps such that each step can be completed in one clock cycle.
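• Let T be the time required to execute a program, and let
N = Number of machine instructions executed for the program.
S = Average number of basic steps needed to execute one machine instruction.
R = Clock rate in cycles per second.
• The program execution time is then given by the basic performance equation
T = (N × S) / R ------(1)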
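Equation (1) can be evaluated directly. The sketch below uses made-up values for N, S and R purely for illustration.

```python
def execution_time(n, s, r):
    """Basic performance equation T = (N x S) / R, with T in seconds."""
    return (n * s) / r

# Hypothetical workload: 1 million instructions, 4 basic steps each, 500 MHz clock.
t = execution_time(n=1_000_000, s=4, r=500e6)
print(f"T = {t * 1e3:.1f} ms")
```

Doubling R in this call halves T, provided N and S stay the same.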
• To achieve high performance, the computer designer must reduce the value of T, which
means reducing N and S, and increasing R.
CLOCK RATE
• There are two ways to increase the clock rate R:
1) Improving the IC technology makes logic circuits faster. This reduces the time needed
to complete a basic step. (IC = integrated circuit). This allows the clock period P to be
reduced and the clock rate R to be increased.
2) Reducing the amount of processing done in one basic step also reduces the clock
period P.
• When R is increased purely through faster technology, much of the expected
performance-gain can be realized: the value of T is reduced by the same factor as R is
increased, because S & N are not affected.
• A benchmark is a standard task used to measure how well a processor operates.
• The Performance Measure is the time taken by a computer to execute a given benchmark.
• SPEC selects & publishes the standard programs along with their test results for different
application domains. (SPEC = Standard Performance Evaluation Corporation).
• The test is repeated for all the programs in the SPEC suite. Then, the geometric mean of
the results is computed.
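The geometric mean can be computed as in the sketch below. It assumes that each program's result is the ratio of its running time on a reference computer to its running time on the computer under test. That ratio definition and all timings here are illustrative assumptions, not figures from these notes.

```python
import math

def overall_rating(reference_times, test_times):
    """Geometric mean of the per-program ratios (reference time / test time)."""
    ratios = [ref / test for ref, test in zip(reference_times, test_times)]
    return math.prod(ratios) ** (1 / len(ratios))

# Hypothetical suite of two programs: the ratios are 2 and 8.
rating = overall_rating([100, 400], [50, 50])
print(rating)
```

A geometric mean is less dominated by one unusually large ratio than an arithmetic mean would be.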
RISC                                                   CISC
Simple instructions taking one cycle.                  Complex instructions taking multiple cycles.
Instructions are executed by hardwired                 Instructions are executed by microprogrammed
control unit.                                          control unit.
Few addressing modes, and most instructions have       Many addressing modes.
register to register addressing mode.
Problem 1:
in terms of transfers between the components of processor and some simple control
commands. Assume that the address of the memory-location containing this instruction is
initially in register PC. Solution:
And, then wait until it has transferred the requested word into register MDR.
8. Perform addition of the two operands in the ALU and transfer result into R0.
Problem 2:
in terms of transfers between the components of processor and some simple control
commands. Assume that the address of the memory-location containing this instruction is
initially in register PC. Solution:
And, then wait until it has transferred the requested word into register MDR.
5. Perform addition of two operands in the ALU and transfer answer into R3.
Problem 3:
(a) Give a short sequence of machine instructions for the task “Add the contents of memory-
location A to those of location B, and place the answer in location C”. Instructions:
are the only instructions available to transfer data between memory and the general purpose
registers. Add instructions are described in Section 1.3. Do not change contents of either
location A or B.
(b) Suppose that Move and Add instructions are available with the formats:
These instructions move or add a copy of the operand at the second location to the first
location, overwriting the original operand at the first location. Either or both of the operands
can be in the memory or the general-purpose registers. Is it possible to use fewer instructions
of these types to accomplish the task in part (a)? If yes, give the sequence.
Solution:
(a)
Load A, R0
Load B, R1
Add R0, R1
Store R1, C
(b) Yes;
Move B, C
Add A, C
Problem 4:
A program contains 1000 instructions. Of these, 25% require 4 clock cycles, 40% require
5 clock cycles, and the remaining instructions require 3 clock cycles for execution.
Find the total time required to execute the program on a 1 GHz machine.
Solution:
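N = 1000
25% of N = 250 instructions require 4 clock cycles.
40% of N = 400 instructions require 5 clock cycles.
35% of N = 350 instructions require 3 clock cycles.
Total clock cycles = (250 × 4) + (400 × 5) + (350 × 3) = 4050.
At 1 GHz, one clock cycle takes 1 ns, so T = 4050 × 1 ns = 4.05 µs.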
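The arithmetic for Problem 4 can be checked with a short script. This is a sketch that uses only the figures given in the problem statement; the variable names are chosen here for illustration.

```python
# (fraction of instructions, clock cycles per instruction) from Problem 4
mix = [(0.25, 4), (0.40, 5), (0.35, 3)]
n = 1000          # instructions in the program
clock_hz = 1e9    # 1 GHz

# Count the instructions in each group, then add up their clock cycles.
total_cycles = sum(round(fraction * n) * cycles for fraction, cycles in mix)
time_s = total_cycles / clock_hz
print(total_cycles, "cycles,", time_s * 1e6, "microseconds")
```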
Problem 5:
Solution:
Problem 6:
(a) Program execution time T is to be examined for a certain high-level language program.
The program can be run on a RISC or a CISC computer. Both computers use pipelined
instruction execution, but pipelining in the RISC machine is more effective than in the
CISC machine. Specifically, the effective value of S in the T expression for the RISC
machine is 1.2, but it is 1.5 for the CISC machine. Both machines have the same clock
rate R. What is the largest allowable value for N, the number of instructions executed on
the CISC machine, expressed as a percentage of the N value for the RISC machine, if time
for execution on the CISC machine is to be no longer than on the RISC machine?
(b) Repeat Part (a) if the clock rate R for the RISC machine is 15 percent higher than that for
the CISC machine.
Solution:
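(a) Let TR = (NR × SR)/RR & TC = (NC × SC)/RC be execution times on RISC and
CISC processors. Equating execution times, with equal clock rates, we have
1.2NR = 1.5NC
Then
NC/NR = 1.2/1.5 = 0.8
Therefore, the largest allowable value for NC is 80% of NR.
(b) In this case
1.2NR/1.15 = 1.5NC/1.00
Then
NC/NR = 1.2/(1.15 × 1.5) = 0.696
Therefore, the largest allowable value for NC is 69.6% of NR.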
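Both parts of Problem 6 follow from setting TC equal to TR in T = (N × S)/R and solving for NC/NR. The sketch below does that with a hypothetical helper function; clock rates are expressed relative to the CISC machine.

```python
def max_cisc_fraction(s_risc, s_cisc, r_risc=1.0, r_cisc=1.0):
    """Largest NC/NR for which the CISC time does not exceed the RISC time.

    From NC*SC/RC = NR*SR/RR it follows that NC/NR = (SR/SC) * (RC/RR).
    """
    return (s_risc / s_cisc) * (r_cisc / r_risc)

part_a = max_cisc_fraction(1.2, 1.5)               # same clock rate
part_b = max_cisc_fraction(1.2, 1.5, r_risc=1.15)  # RISC clock 15 percent higher
print(f"(a) {part_a:.1%}  (b) {part_b:.1%}")
```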
Problem 7:
(a) Suppose that execution time for a program is proportional to instruction fetch time.
Assume that fetching an instruction from the cache takes 1 time unit, but fetching it from
the main-memory takes 10 time units. Also, assume that a requested instruction is found in
the cache with probability 0.96. Finally, assume that if an instruction is not found in the
cache, it must first be fetched from the main-memory into the cache and then fetched from
the cache to be executed. Compute the ratio of program execution time without the cache
to program execution time with the cache. This ratio is called the speedup resulting from
the presence of the cache.
(b) If the size of the cache is doubled, assume that the probability of not finding a requested
instruction there is cut in half. Repeat part (a) for a doubled cache size.
Solution:
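(a) Let cache access time be 1 and main-memory access time be 10. Every instruction
that is executed must be fetched from the cache, and an additional fetch from the
main-memory must be performed for 4% of these cache accesses.
Therefore,
Speedup = (1.0 × 10) / ((1.0 × 1) + (0.04 × 10)) = 10/1.4 = 7.14
(b) With the doubled cache, the additional fetch is needed for only 2% of the accesses.
Therefore,
Speedup = (1.0 × 10) / ((1.0 × 1) + (0.02 × 10)) = 10/1.2 = 8.33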
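The speedup in Problem 7 can be computed as sketched below. The figures are taken from the problem statement (cache access 1 time unit, main-memory access 10 time units, hit probability 0.96). The access times are parameters, so other values can be tried.

```python
def cache_speedup(hit_probability, t_cache=1, t_main=10):
    """Ratio of fetch time without a cache to fetch time with a cache."""
    without_cache = t_main
    # Every fetch goes through the cache; a miss adds one main-memory access.
    with_cache = t_cache + (1 - hit_probability) * t_main
    return without_cache / with_cache

part_a = cache_speedup(0.96)   # miss probability 0.04
part_b = cache_speedup(0.98)   # doubled cache: miss probability cut in half
print(round(part_a, 2), round(part_b, 2))
```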
1) Sign-and-magnitude
2) 1's complement
3) 2's complement
• In all three formats, MSB=0 for +ve numbers & MSB=1 for -ve numbers.
• In sign-and-magnitude system,
a negative value is obtained by changing the MSB of the corresponding positive value from
0 to 1. For ex, +5 is represented by 0101 & -5 is represented by 1101.
• In 1's complement system,
negative values are obtained by complementing each bit of the corresponding positive
number. For ex, -5 is obtained by complementing each bit in 0101 to yield 1010.
(In other words, the operation of forming the 1's complement of a given number is equivalent
to subtracting that number from 2^n - 1).
• In 2's complement system,
forming the 2's complement of a number is done by subtracting that number from 2^n.
For ex, -5 is obtained by complementing each bit in 0101 & then adding 1 to yield 1011. (In
other words, the 2's complement of a number is obtained by adding 1 to the 1's complement
of that number).
• 2's complement system yields the most efficient way to carry out addition/subtraction
operations.
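The three encodings of -5 in 4 bits can be reproduced with the sketch below. The function names are chosen here for illustration, and no range checking is done, so the caller is assumed to pass a value that fits in n bits.

```python
def sign_magnitude(value, n):
    """MSB holds the sign; the remaining n-1 bits hold the magnitude."""
    if value >= 0:
        return format(value, f"0{n}b")
    return "1" + format(-value, f"0{n - 1}b")

def ones_complement(value, n):
    """A negative value is (2^n - 1) minus the magnitude."""
    if value >= 0:
        return format(value, f"0{n}b")
    return format((2**n - 1) - (-value), f"0{n}b")

def twos_complement(value, n):
    """A negative value is 2^n minus the magnitude."""
    return format(value % 2**n, f"0{n}b")

for encode in (sign_magnitude, ones_complement, twos_complement):
    print(encode.__name__, encode(5, 4), encode(-5, 4))
```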
The sum of 1 & 1 requires the 2-bit vector 10 to represent the value 2. We say that the sum
is 0 and the carry-out is 1.
• Following are the two rules for addition and subtraction of n-bit signed numbers using the
2's complement representation system (Figure 1.6).
Rule 1:
To Add two numbers, add their n-bits and ignore the carry-out signal from the MSB
position.
Rule 2:
To Subtract two numbers X and Y (that is to perform X-Y), take the 2's
complement of Y and then add it to X as in rule 1.
• To represent a signed number in 2's complement form using a larger number of bits, repeat
the sign bit as many times as needed to the left. This operation is called sign extension.
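Rule 1, Rule 2 and sign extension can be sketched on n-bit patterns as follows. The patterns are held as non-negative Python integers and the function names are chosen here for illustration.

```python
def add(x, y, n):
    """Rule 1: add the n-bit patterns and ignore the carry-out from the MSB."""
    return (x + y) & ((1 << n) - 1)

def negate(y, n):
    """2's complement of y: complement every bit, then add 1."""
    return (~y + 1) & ((1 << n) - 1)

def subtract(x, y, n):
    """Rule 2: X - Y is X plus the 2's complement of Y."""
    return add(x, negate(y, n), n)

def sign_extend(x, n, m):
    """Widen an n-bit pattern to m bits by repeating the sign bit to the left."""
    if (x >> (n - 1)) & 1:
        return x | (((1 << (m - n)) - 1) << n)
    return x

print(format(add(0b0010, 0b1011, 4), "04b"))          # (+2) + (-5)
print(format(subtract(0b0010, 0b0100, 4), "04b"))     # (+2) - (+4)
print(format(sign_extend(0b1011, 4, 8), "08b"))       # -5 widened to 8 bits
```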
• In 1's complement representation, the result obtained after an addition operation is not
always correct. The carry-out(cn) cannot be ignored. If cn=0, the result obtained is correct.
If cn=1, then a 1 must be added to the result to make it correct.
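The correction step for 1's complement addition (often called an end-around carry) can be sketched as below. The function name and the 4-bit operands are illustrative choices.

```python
def ones_complement_add(x, y, n):
    """Add two n-bit 1's complement patterns; if cn = 1, add 1 to the result."""
    mask = (1 << n) - 1
    total = x + y
    carry_out = total >> n        # cn, the carry-out from the MSB position
    result = total & mask
    if carry_out:
        result = (result + 1) & mask
    return result

# (+5) + (-2): 0101 + 1101 gives carry-out 1, so 1 is added to 0010.
print(format(ones_complement_add(0b0101, 0b1101, 4), "04b"))
# (-5) + (+2): 1010 + 0010 gives carry-out 0, so the result stands.
print(format(ones_complement_add(0b1010, 0b0010, 4), "04b"))
```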
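• When the result of an arithmetic operation is outside the representable range, an
arithmetic overflow is said to occur.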
• For example: If we add two numbers +7 and +4, then the output sum S is
1011(0111+0100), which is the code for -5, an incorrect result.
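The +7 and +4 example can be reproduced with the sketch below. The detection rule used here is a common one and is stated as an assumption: overflow is flagged when both operands have the same sign bit and the sign bit of the sum differs from it.

```python
def add_with_overflow(x, y, n):
    """Add two n-bit 2's complement patterns and report overflow."""
    mask = (1 << n) - 1
    s = (x + y) & mask
    sign_x = (x >> (n - 1)) & 1
    sign_y = (y >> (n - 1)) & 1
    sign_s = (s >> (n - 1)) & 1
    # Assumed rule: same-sign operands whose sum has the opposite sign.
    overflow = (sign_x == sign_y) and (sign_s != sign_x)
    return s, overflow

print(add_with_overflow(0b0111, 0b0100, 4))   # (+7) + (+4)
print(add_with_overflow(0b0111, 0b1100, 4))   # (+7) + (-4)
```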
1) Overflow can occur only when adding two numbers that have the same sign.
The carry-out signal from the sign-bit position is not a sufficient indicator of overflow when
adding signed numbers.
The scale factor has a range of 2^-126 to 2^+127 (which is approximately equal to 10^±38).
• The 32 bit word is divided into 3 fields: sign(1 bit), exponent(8 bits) and mantissa(23 bits).
• Signed exponent=E.
• The last 23 bits represent the mantissa. Since binary normalization is used, the MSB of the
mantissa is always equal to 1. (M represents fractional-part).
• The 24-bit mantissa provides a precision equivalent to about 7 decimal-digits (Figure 9.24).
• Double precision representation occupies a single 64-bit word. Its excess-1023 exponent
E' is in the range 1 ≤ E' ≤ 2046.
NORMALIZATION
• When the binary point is placed to the right of the first (non-zero) significant digit, the
number is said to be normalized.
• If a number is not normalized, it can always be put in normalized form by shifting the
fraction and adjusting the exponent. As computations proceed, a number that does not fall in
the representable range of normal numbers might be generated.
• In single precision, it requires an exponent less than -126 (underflow) or greater than +127
(overflow). Both are exceptions that need to be considered.
SPECIAL VALUES
• The end values 0 and 255 of the excess-127 exponent E’ are used to represent special
values.
• When E’=0 and the mantissa fraction M is zero, the exact value 0 is represented.
• When E’=255 and M=0, the value ∞ is represented, where ∞ is the result of dividing a
normal number by zero.
• When E’=0 and M≠0, denormal numbers are represented. Their value is ±0.M × 2^-126.
• When E’=255 and M≠0, the value represented is called Not a Number (NaN). A NaN is the
result of performing an invalid operation such as 0/0 or √-1.
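The three fields of a single-precision word, and the special cases listed above, can be inspected with the sketch below. It assumes the 32-bit format produced by Python's `struct` module with the `f` code (IEEE 754 single precision). The function names are chosen here for illustration.

```python
import struct

def decode_single(x):
    """Return (sign, E', M) for x stored as a 32-bit single-precision word."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    e_prime = (bits >> 23) & 0xFF   # 8-bit excess-127 exponent
    m = bits & 0x7FFFFF             # 23-bit mantissa fraction
    return sign, e_prime, m

def classify(e_prime, m):
    """Apply the special-value rules for the end values E' = 0 and E' = 255."""
    if e_prime == 0:
        return "zero" if m == 0 else "denormal"
    if e_prime == 255:
        return "infinity" if m == 0 else "NaN"
    return "normal"

for value in (1.0, -0.75, 0.0, float("inf"), float("nan")):
    sign, e_prime, m = decode_single(value)
    print(value, sign, e_prime, hex(m), classify(e_prime, m))
```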
• Each group of n bits is referred to as a word of information, and n is called the word
length.
• Accessing the memory to store or retrieve a single item of information (word/byte) requires
distinct addresses for each item location. (It is customary to use numbers from 0 through
2^k - 1 as the addresses of successive locations in the memory).
For example, a 24-bit address generates an address-space of 2^24 (16,777,216, i.e. 16M) locations.
1.8.1 BYTE-ADDRESSABILITY
• If the word-length is 32 bits, successive words are located at addresses 0, 4, 8, ..., with
each word having 4 bytes.
• There are two ways in which byte-addresses are arranged (Figure 2.3).
1) Big-Endian: Lower byte-addresses are used for the more significant bytes of
the word.
2) Little-Endian: Lower byte-addresses are used for the less significant bytes of
the word.
Consider a 32-bit integer (in hex): 0x12345678 which consists of 4 bytes: 12, 34, 56, and 78.
Little-Endian:
Address    Value
1000       78
1001       56
1002       34
1003       12
Big-Endian:
Address    Value
1000       12
1001       34
1002       56
1003       78
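The two byte layouts of 0x12345678 can be reproduced with Python's built-in `int.to_bytes`. This is a sketch in which the starting address 1000 is just the example address used above.

```python
value = 0x12345678
base_address = 1000   # example starting address

little = value.to_bytes(4, byteorder="little")   # least significant byte first
big = value.to_bytes(4, byteorder="big")         # most significant byte first

print("Address  Little  Big")
for offset in range(4):
    print(base_address + offset, format(little[offset], "02X"), format(big[offset], "02X"))
```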
WORD ALIGNMENT
• Words are said to be Aligned in memory if they begin at a byte-address that is a multiple
of the number of bytes in a word.
• For example, if the word-length is 32 bits (4 bytes), aligned words begin at byte-addresses
0, 4, 8, ...
• Words are said to have Unaligned Addresses, if they begin at an arbitrary byte-address.
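The alignment condition is a simple divisibility test. In this sketch the function name and the sample addresses are illustrative.

```python
def is_aligned(byte_address, word_bytes):
    """True if the address is a multiple of the number of bytes in a word."""
    return byte_address % word_bytes == 0

# With 4-byte (32-bit) words, only every fourth byte-address is a word boundary.
aligned = [a for a in range(0, 13) if is_aligned(a, 4)]
print(aligned)
```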
• A number usually occupies one word. It can be accessed in the memory by specifying its
word address. Similarly, individual characters can be accessed by their byte-address.
1) A special control character with the meaning "end of string" can be used as the last
character in the string.
2) A separate memory word location or register can contain a number indicating the
length of the string in bytes.
• The Load operation transfers a copy of the contents of a specific memory-location to the
processor. The memory contents remain unchanged.
• The Store operation transfers the information from the register to the specified memory-
location. This will destroy the original contents of that memory-location.
1) Processor sends the address of the memory-location where it wants to store data.
1) Data transfers between the memory and the registers (MOV, PUSH, POP,
XCHG).
2) Arithmetic and logic operations on data (ADD, SUB, MUL, DIV, AND, OR,
NOT).
Processor        R0, R1, R2          R3 <- [R1]+[R2]    Add the contents of registers R1 & R2, and
                                                        place the sum into register R3.
I/O Registers    DATAIN, DATAOUT     R1 <- [DATAIN]     Contents of I/O register DATAIN are
                                                        transferred into register R1.

Add R1, R2, R3   Add the contents of registers R1 and R2, and place their sum into register R3.
Two Address
  Syntax: Opcode Source, Destination
  Example: Add A,B
  Description: Add the contents of memory-locations A & B. Then, place the result into
  location B, replacing the original contents of this location. Operand B is both a
  source and a destination.
  Sequence for C <- [A]+[B]: Move B,C
                             Add A,C
One Address
  Syntax: Opcode Source/Destination
  Example: Load A
  Description: Copy contents of memory-location A into accumulator.
  Example: Add B
  Description: Add contents of memory-location B to contents of accumulator register &
  place sum back into accumulator.
  Sequence for C <- [A]+[B]: Load A
                             Add B
                             Store C
Zero Address
  Syntax: Opcode [no Source/Destination]
  Example: Push
  Description: Locations of all operands are defined implicitly. The operands are stored
  in a pushdown stack.
  Sequence for C <- [A]+[B]: Not possible
• Access to data in the registers is much faster than to data stored in memory-locations.
• Let Ri represent a general-purpose register. The instructions:
Load A,Ri
Store Ri,A
Add A,Ri
are generalizations of the Load, Store and Add Instructions for the single-accumulator case,
in which register Ri performs the function of the accumulator.
• In processors where arithmetic operations are allowed only on operands that are in
registers, the task C<-[A]+[B] can be performed by the instruction sequence:
Move A,Ri
Move B,Rj
Add Ri,Rj
Move Rj,C
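The four-instruction sequence can be traced with a toy model in which memory and registers are Python dictionaries. The initial values of A and B are made up, and `move` and `add` are hypothetical helpers that follow the "source, destination" operand order used above.

```python
memory = {"A": 7, "B": 5, "C": 0}   # made-up initial contents
regs = {"Ri": 0, "Rj": 0}

def read(location):
    return regs[location] if location in regs else memory[location]

def write(location, value):
    if location in regs:
        regs[location] = value
    else:
        memory[location] = value

def move(source, destination):
    write(destination, read(source))

def add(source, destination):
    write(destination, read(source) + read(destination))

move("A", "Ri")    # Move A,Ri
move("B", "Rj")    # Move B,Rj
add("Ri", "Rj")    # Add Ri,Rj
move("Rj", "C")    # Move Rj,C
print(memory)
```

Locations A and B keep their original contents; only C and the two registers change.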
1) Initially, the address of the first instruction is loaded into PC (Figure 2.8).
2) Then, the processor control circuits use the information in the PC to fetch and
execute instructions, one at a time, in the order of increasing addresses. This is
called Straight-Line sequencing.
1) Fetch Phase: The instruction is fetched from the memory-location and placed in
the IR.
Program Explanation
• The addresses of the memory-locations containing the n numbers are symbolically given as
NUM1, NUM2, ..., NUMn.
• A separate Add instruction is used to add each number to the contents of register R0.
• After all the numbers have been added, the result is placed in memory-location SUM.
BRANCHING
• Register R1 is used as a counter to determine the number of times the loop is executed.
• The Loop is a straight line sequence of instructions executed as many times as needed. The
loop starts at location LOOP and ends at the instruction Branch>0.
• The instruction Decrement R1 reduces the contents of R1 by 1 each time through the loop.
• The Branch instruction then loads a new value into the program counter. As a result, the
processor fetches and executes the instruction at this new address called the Branch
Target.
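The loop described above can be mirrored in Python. This is a sketch, not the program from the figure: the variables `r0` and `r1` stand in for the registers, the list stands in for the numbers in memory, and the list is assumed to hold at least one number.

```python
def sum_with_branch(numbers):
    """Add n numbers using a counter register and a Branch>0 style test."""
    r1 = len(numbers)    # counter: number of times the loop is executed
    r0 = 0               # running sum
    index = 0            # position of the next number
    while True:          # LOOP:
        r0 += numbers[index]   # add the next number to R0
        index += 1
        r1 -= 1                # Decrement R1
        if not r1 > 0:         # Branch>0 LOOP: go back only while R1 > 0
            break
    return r0            # value that would be placed in SUM

total = sum_with_branch([3, 5, 9])
print(total)
```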
• The processor keeps track of information about the results of various operations. This is
accomplished by recording the required information in individual bits, called Condition
Code Flags.
• These flags are grouped together in a special processor-register called the condition code
register (or status register).