1: Basic Structure of Computers: Dept of CSE, JIT
1: Basic Structure of Computers: Dept of CSE, JIT
BASIC CONCEPTS
• Computer Architecture (CA) is concerned with the structure and behaviour of the computer.
• CA includes the information formats, the instruction set and techniques for addressing memory.
• In general covers, CA covers 3 aspects of computer-design namely: 1) Computer Hardware, 2)
Instruction set Architecture and 3) Computer Organization.
1. Computer Hardware
It consists of electronic circuits, displays, magnetic and optical storage media and
communication facilities.
2. Instruction Set Architecture
It is programmer visible machine interface such as instruction set, registers, memory
organization and exception handling.
Two main approaches are 1) CISC and 2) RISC.
(CISCComplex Instruction Set Computer, RISCReduced Instruction Set Computer)
3. Computer Organization
It includes the high level aspects of a design, such as
→ memory-system
→ bus-structure &
→ design of the internal CPU.
It refers to the operational units and their interconnections that realize the architectural
specifications.
It describes the function of and design of the various units of digital computer that store and
process information.
FUNCTIONAL UNITS
• A computer consists of 5 functionally independent main parts:
1) Input
2) Memory
3) ALU
4) Output &
5) Control units.
• Let us examine the flow of program instructions and data between the memory & the processor.
• At the start of execution, all program instructions are stored in the main-memory.
• As execution proceeds, instructions are fetched into the processor, and a copy is placed in the cache.
• Later, if the same instruction is needed a second time, it is read directly from the cache.
• A program will be executed faster
if movement of instruction/data between the main-memory and the processor is minimized
which is achieved by using the cache.
PROCESSOR CLOCK
• Processor circuits are controlled by a timing signal called a Clock.
• The clock defines regular time intervals called Clock Cycles.
• To execute a machine instruction, the processor divides the action to be performed into a sequence
of basic steps such that each step can be completed in one clock cycle.
• Let P = Length of one clock
cycle R = Clock rate.
• Relation between P and R is given by
------(1)
• Equ1 is referred to as the basic performance equation.
• To achieve high performance, the computer designer must reduce the value of T, which means
reducing N and S, and increasing R.
The value of N is reduced if source program is compiled into fewer machine instructions.
The value of S is reduced if instructions have a smaller number of basic steps to perform.
The value of R can be increased by using a higher frequency clock.
PERFORMANCE MEASUREMENT
• Benchmark refers to standard task used to measure how well a processor operates.
• The Performance Measure is the time taken by a computer to execute a given benchmark.
• SPEC selects & publishes the standard programs along with their test results for different application
domains. (SPEC System Performance Evaluation Corporation).
• SPEC Rating is given by
Problem 1:
List the steps needed to execute the machine instruction:
Load R2, LOC
in terms of transfers between the components of processor and some simple control commands.
Assume that the address of the memory-location containing this instruction is initially in register PC.
Solution:
1. Transfer the contents of register PC to register MAR.
2. Issue a Read command to memory.
And, then wait until it has transferred the requested word into register MDR.
3. Transfer the instruction from MDR into IR and decode it.
4. Transfer the address LOCA from IR to MAR.
5. Issue a Read command and wait until MDR is loaded.
6. Transfer contents of MDR to the ALU.
7. Transfer contents of R0 to the ALU.
8. Perform addition of the two operands in the ALU and transfer result into R0.
Problem 3:
(a) Give a short sequence of machine instructions for the task “Add the contents of memory-location A
to those of location B, and place the answer in location C”. Instructions:
Load Ri, LOC
and
Store Ri, LOC
are the only instructions available to transfer data between memory and the general purpose registers.
Add instructions are described in Section 1.3. Do not change contents of either location A or B.
(b) Suppose that Move and Add instructions are available with the formats:
Move Location1, Location2
and
Add Location1, Location2
These instructions move or add a copy of the operand at the second location to the first location,
overwriting the original operand at the first location. Either or both of the operands can be in the
memory or the general-purpose registers. Is it possible to use fewer instructions of these types to
accomplish the task in part (a)? If yes, give the sequence.
Solution:
(a)
Load A, R0
Load B, R1
Add R0,
R1
Store R1, C
(b) Yes;
Move B, C
Add A, C
Problem 4:
A program contains 1000 instructions. Out of that 25% instructions requires 4 clock cycles,40%
instructions requires 5 clock cycles and remaining require 3 clock cycles for execution. Find the total
time required to execute the program running in a 1 GHz machine.
Solution:
N = 1000
25% of N= 250 instructions require 4 clock cycles.
40% of N =400 instructions require 5 clock cycles.
35% of N=350 instructions require 3 clock cycles.
9 9
T = (N*S)/R= (250*4+400*5+350*3)/1X10 =(1000+2000+1050)/1*10 = 4.05 μs.
Problem 6:
(a) Program execution time T is to be examined for a certain high-level language program. The
program can be run on a RISC or a CISC computer. Both computers use pipelined instruction
execution, but pipelining in the RISC machine is more effective than in the CISC machine. Specifically,
the effective value of S in the T expression for the RISC machine is 1.2, bit it is only 1.5 for the CISC
machine. Both machines have the same clock rate R. What is the largest allowable value for N, the
number of instructions executed on the CISC machine, expressed as a percentage of the N value for
the RISC machine, if time for execution on the CISC machine is to be longer than on the RISC
machine?
(b) Repeat Part (a) if the clock rate R for the RISC machine is 15 percent higher than that for the CISC
machine.
Solution:
(a) Let TR = (NR X SR)/RR & TC = (NC X SC)/RC be execution times on RISC and CISC processors.
Equating execution times and clock rates, we have
1.2NR = 1.5NC
Then
NC/NR = 1.2/1.5 = 0.8
Therefore, the largest allowable value for NC is 80% of NR.
Problem 7:
(a) Suppose that execution time for a program is proportional to instruction fetch time. Assume that
fetching an instruction from the cache takes 1 time unit, but fetching it from the main-memory takes
10 time units. Also, assume that a requested instruction is found in the cache with probability 0.96.
Finally, assume that if an instruction is not found in the cache it must first be fetched from the main-
memory into the cache and then fetched from the cache to be executed. Compute the ratio of program
execution time without the cache to program execution time with the cache. This ratio is called the
speedup resulting from the presence of the cache.
(b) If the size of the cache is doubled, assume that the probability of not finding a requested
instruction there is cut in half. Repeat part (a) for a doubled cache size.
Solution:
(a) Let cache access time be 1 and main-memory access time be 20. Every instruction that is
executed must be fetched from the cache, and an additional fetch from the main-memory must
be performed for 4% of these cache accesses.
Therefore,
(b)
• Consider a 32-bit integer (in hex): 0x12345678 which consists of 4 bytes: 12, 34, 56, and 78.
Hence this integer will occupy 4 bytes in memory.
Assume, we store it at memory address starting 1000.
On little-endian, memory will look like
Address Value
1000 78
1001 56
1002 34
1003 12
WORD ALIGNMENT
• Words are said to be Aligned in memory if they begin at a byte-address that is a multiple of the
number of bytes in a word.
• For example,
If the word length is 16(2 bytes), aligned words begin at byte-addresses 0, 2, 4 . . . . .
If the word length is 64(2 bytes), aligned words begin at byte-addresses 0, 8, 16 . . . . .
• Words are said to have Unaligned Addresses, if they begin at an arbitrary byte-address.
Dept of CSE, JIT Page 12
ACCESSING NUMBERS, CHARACTERS & CHARACTERS STRINGS
• A number usually occupies one word. It can be accessed in the memory by specifying its word
address. Similarly, individual characters can be accessed by their byte-address.
• There are two ways to indicate the length of the string:
1) A special control character with the meaning "end of string" can be used as the last character
in the string.
2) A separate memory word location or register can contain a number indicating the length of
the string in bytes.
MEMORY OPERATIONS
• Two memory operations are:
1) Load (Read/Fetch) &
2) Store (Write).
• The Load operation transfers a copy of the contents of a specific memory-location to the processor.
The memory contents remain unchanged.
• Steps for Load operation:
1) Processor sends the address of the desired location to the memory.
2) Processor issues „read‟ signal to memory to fetch the data.
3) Memory reads the data stored at that address.
4) Memory sends the read data to the processor.
• The Store operation transfers the information from the register to the specified memory-location.
This will destroy the original contents of that memory-location.
• Steps for Store operation are:
1) Processor sends the address of the memory-location where it wants to store data.
2) Processor issues „write‟ signal to memory to store the data.
3) Content of register(MDR) is written into the specified memory-location.
Program Explanation
• Consider the program for adding a list of n numbers (Figure 2.9).
• The Address of the memory-locations containing the n numbers are symbolically given as NUM1,
NUM2…..NUMn.
• Separate Add instruction is used to add each number to the contents of register R0.
• After all the numbers have been added, the result is placed in memory-location SUM.
CONDITION CODES
• The processor keeps track of information about the results of various operations. This is
accomplished by recording the required information in individual bits, called Condition Code Flags.
• These flags are grouped together in a special processor-register called the condition code register (or
statue register).
• Four commonly used flags are:
1) N (negative) set to 1 if the result is negative, otherwise cleared to 0.
2) Z (zero) set to 1 if the result is 0; otherwise, cleared to 0.
3) V (overflow) set to 1 if arithmetic overflow occurs; otherwise, cleared to 0.
• To execute the Add instruction in fig 2.11 (a), the processor uses the value which is in register R1, as
the EA of the operand.
• It requests a read operation from the memory to read the contents of location B. The value read is
the desired operand, which the processor adds to the contents of register R0.
• Indirect addressing through a memory-location is also possible as shown in fig 2.11(b). In this case,
the processor first reads the contents of memory-location A, then requests a second read operation
using the value B as an address to obtain the operand.
Program Explanation
• In above program, Register R2 is used as a pointer to the numbers in the list, and the operands are accessed
indirectly through R2.
• The initialization-section of the program loads the counter-value n from memory-location N into R1 and uses the
immediate addressing-mode to place the address value NUM1, which is the address of the first number in the list,
into R2. Then it clears R0 to 0.
• The first two instructions in the loop implement the unspecified instruction block starting at LOOP.
• The first time through the loop, the instruction Add (R2), R0 fetches the operand at location NUM1 and adds it to
R0.
• The second Add instruction adds 4 to the contents of the pointer R2, so that it will contain the address value
NUM2 when the above instruction is executed in the second pass through the loop.
• Fig(a) illustrates two ways of using the Index mode. In fig(a), the index register, R1, contains the
address of a memory-location, and the value X defines an offset(also called a displacement) from this
address to the location where the operand is found.
• To find EA of operand:
Eg: Add 20(R1),
R2
EA=>1000+20=1020
• An alternative use is illustrated in fig(b). Here, the constant X corresponds to a memory address, and
the contents of the index register define the offset to the operand. In either case, the effective-address
is the sum of two values; one is given explicitly in the instruction, and the other is stored in a register.
RELATIVE MODE
• This is similar to index-mode with one difference:
The effective-address is determined using the PC in place of the general purpose register Ri.
• The operation is indicated as X(PC).
• X(PC) denotes an effective-address of the operand which is X locations above or below the current
contents of PC.
• Since the addressed-location is identified "relative" to the PC, the name Relative mode is associated
with this type of addressing.
• This mode is used commonly in conditional branch instructions.
• An instruction such as
Branch > 0 LOOP ;Causes program execution to go to the branch target
location identified by name LOOP if branch condition is
satisfied.
ASSEMBLER DIRECTIVES
• Directives are the assembler commands to the assembler concerning the program being assembled.
• These commands are not translated into machine opcode in the object-program.
• EQU informs the assembler about the value of an identifier (Figure: 2.18).
Ex: SUM EQU 200 ;Informs assembler that the name SUM should be replaced by the value 200.
• ORIGIN tells the assembler about the starting-address of memory-area to place the data block.
Ex: ORIGIN 204 ;Instructs assembler to initiate data-block at memory-locations starting from 204.
• DATAWORD directive tells the assembler to load a value into the location.
Ex: N DATAWORD 100 ;Informs the assembler to load data 100 into the memory-location N(204).
• RESERVE directive is used to reserve a block of memory.
Ex: NUM1 RESERVE 400 ;declares a memory-block of 400 bytes is to be reserved for data.
• END directive tells the assembler that this is the end of the source-program text.
• RETURN directive identifies the point at which execution of the program should be terminated.
• Any statement that makes instructions or data being placed in a memory-location may be given a
label. The label(say N or NUM1) is assigned a value equal to the address of that location.
SUBROUTINES
PARAMETER PASSING
• The exchange of information between a calling-program and a subroutine is referred to as
Parameter Passing (Figure: 2.25).
• The parameters may be placed in registers or in memory-location, where they can be accessed by
the subroutine.
• Alternatively, parameters may be placed on the processor-stack used for saving the return-address.
• Following is a program for adding a list of numbers using subroutine with the parameters passed
through registers.
• Fig: 2.27 show an example of a commonly used layout for information in a stack-frame.
Problem 2:
Registers R1 and R2 of a computer contains the decimal values 1200 and 4600. What is the effective-
address of the memory operand in each of the following instructions?
(a) Load 20(R1), R5
(b) Move #3000,R5
(c) Store R5,30(R1,R2)
(d) Add -(R2),R5
(e) Subtract (R1)+,R5
Solution:
(a) EA = [R1]+Offset=1200+20 = 1220
(b) EA = 3000
(c) EA = [R1]+[R2]+Offset = 1200+4600+30=5830
(d) EA = [R2]-1 = 4599
(e) EA = [R1] = 1200
Problem 3:
Registers R1 and R2 of a computer contains the decimal values 2900 and 3300. What is the effective-
address of the memory operand in each of the following instructions?
(a) Load R1,55(R2)
(b) Move #2000,R7
(c) Store 95(R1,R2),R5
(d) Add (R1)+,R5
(e) Subtract‐(R2),R5
Solution:
a) Load R1,55(R2) This is indexed addressing mode. So EA = 55+R2=55+3300=3355.
b) Move #2000,R7 This is an immediate addressing mode. So, EA = 2000
c) Store 95(R1,R2),R5 This is a variation of indexed addressing mode, in which contents of 2
registers are added with the offset or index to generate EA. So,
95+R1+R2=95+2900+3300=6255.
d) Add (R1)+,R5 This is Autoincrement mode. Contents of R1 are the EA so, 2900 is the EA.
e) Subtract -(R2),R5 This is Auto decrement mode. Here, R2 is subtracted by 4 bytes
(assuming 32‐bt processor) to generate the EA, so, EA= 3300‐4=3296.
Problem 4:
Given a binary pattern in some memory-location, is it possible to tell whether this pattern represents a
machine instruction or a number?
Solution:
No; any binary pattern can be interpreted as a number or as an instruction.
Move #300,1000
Explain the difference.
Solution:
The assembler directives ORIGIN and DATAWORD cause the object
program memory image constructed by the assembler to indicate that
300 is to be placed at memory word location 1000 at the time the
program is loaded into memory prior to execution.
The Move instruction places 300 into memory word location 1000 when
the instruction is executed as part of a program.
problem 6:
Consider the following possibilities for saving the return address of a subroutine:
(a) In the processor register.
(b) In a memory-location associated with the call, so that a different location is
used when the subroutine is called from different places
(c) On a stack.
Which of these possibilities supports subroutine nesting and which supports
subroutine recursion(that is, a subroutine that calls itself)?
Solution:
(a) Neither nesting nor recursion is supported.
(b) Nesting is supported, because different Call instructions will save the
return address at different memory-locations. Recursion is not
supported.
(c) Both nesting and recursion are supported.