Digital 1st sem
BINARY ARITHMETIC
Binary arithmetic is an essential part of all digital computers and many other digital systems.
Binary Addition
It is the key to binary subtraction, multiplication, and division.
There are four rules of binary addition.
0+0=0
0+1=1
1+0=1
1+1=0 plus a carry of 1 to the next higher column
In the fourth case, binary addition creates a sum of 1 + 1 = 10, i.e., 0 is written in the given column and a 1 is carried over to the next column.
Example 1
Add the binary numbers 10011 and 1001.
Solution
10011 + 1001 = 11100 (19 + 9 = 28)
Example 2
Add the binary numbers 100111 and 11011.
Solution
100111 + 11011 = 1000010 (39 + 27 = 66)
Binary Subtraction
Subtraction and borrow: these two words will be used very frequently for binary subtraction.
There are four rules of binary subtraction.
0-0=0
0-1=1 with a borrow from the next column
1-0=1
1-1=0
Example 1
Subtract the binary number 01110 from 10101.
Solution
10101 − 01110 = 00111 (21 − 14 = 7)
Binary Multiplication
Binary multiplication is similar to decimal multiplication. It is simpler than decimal multiplication
because only 0s and 1s are involved.
There are four rules of binary multiplication.
0 x 0=0
0 x 1=0
1 x 0=0
1 x 1=1
Example
Multiply the binary numbers 1010 and 1001.
Solution
1010 × 1001 = 1011010 (10 × 9 = 90)
Binary Division
Binary division is similar to decimal division; it follows the long division procedure.
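All four operations can be checked quickly with a short program. Below is a minimal sketch in Python (the helper name binary_op is illustrative, not from the text): it parses the binary strings with int(x, 2), applies the operation, and formats the result back with bin().

# Verify the worked examples using Python's built-in base-2 parsing.
def binary_op(a: str, b: str, op: str) -> str:
    x, y = int(a, 2), int(b, 2)
    result = {"+": x + y, "-": x - y, "*": x * y, "/": x // y}[op]
    return bin(result)[2:]  # strip the '0b' prefix

print(binary_op("10011", "1001", "+"))   # 11100   (19 + 9 = 28)
print(binary_op("10101", "01110", "-"))  # 111     (21 - 14 = 7)
print(binary_op("1010", "1001", "*"))    # 1011010 (10 * 9 = 90)
print(binary_op("1100", "100", "/"))     # 11      (12 / 4 = 3)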
Sign-magnitude Form
When a signed binary number is represented in sign-magnitude, the left-most bit is the sign bit
and the remaining bits are the magnitude bits. A 0 sign bit indicates a positive number, and a 1 sign bit indicates a negative number. The magnitude bits are in true binary for both positive and negative numbers.
For example, the decimal number +25 is expressed as an 8-bit signed binary number using the
sign-magnitude form as,
00011001
(sign bit followed by the magnitude bits)
The decimal number -25 is expressed as 10011001.
Notice that the only difference between +25 and -25 is the sign bit, because the magnitude bits are in true binary for both positive and negative numbers.
In the sign-magnitude form, a negative number has the same magnitude bits as the
corresponding positive number but the sign bit is a 1 rather than a zero.
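As a check, sign-magnitude encoding is easy to reproduce in code. A minimal sketch, assuming 8-bit words (the function name is illustrative):

def sign_magnitude(n: int, bits: int = 8) -> str:
    # MSB is the sign bit; the remaining bits hold the true-binary magnitude.
    sign = "0" if n >= 0 else "1"
    return sign + format(abs(n), f"0{bits - 1}b")

print(sign_magnitude(+25))  # 00011001
print(sign_magnitude(-25))  # 10011001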
Example:
1. Subtract 14 from 25 using 8-bit 1's complement arithmetic.
25 = 00011001
14 = 00001110
-14 = 11110001 (1's complement of 14)
  00011001   (binary of 25)
+ 11110001   (1's complement of 14)
= 1 00001010
+         1  (add the end-around carry)
= 00001011
The MSB is 0. So, the result is positive and is in its normal binary form. Therefore, the result is +11.
2. Add -25 to +14 using the 8-bit 1's complement method.
14 = 00001110
25 = 00011001
-25 = 11100110 (1's complement of 25)
  00001110   (binary of 14)
+ 11100110   (1's complement of 25)
= 11110100   (no carry)
There is no carry and the MSB is 1. So, the result is negative and is in its 1's complement form. The 1's complement of 11110100 is 00001011. The answer is -11.
Further examples (only the concluding steps survive):
• A sum of 11010111 plus the end-around carry of 1 gives 11011000. The MSB is 1, so the result is negative and is in its 1's complement form. The 1's complement of 11011000 is 00100111. The answer is -39.
• A sum of 1 00100000: ignore the carry. The MSB is 0, so the result is positive and is in normal binary form.
• A sum of 11001111 with no carry: the MSB is 1, so the result is negative and is in 2's complement form.
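The end-around-carry procedure can be expressed directly in code. A minimal sketch, assuming 8-bit words (helper names are illustrative):

BITS = 8
MASK = (1 << BITS) - 1  # 0xFF for 8 bits

def ones_complement(n: int) -> int:
    return n ^ MASK  # invert all 8 bits

def add_ones_complement(a: int, b: int) -> int:
    s = a + b
    if s > MASK:            # a carry out of the MSB occurred:
        s = (s & MASK) + 1  # drop it and add the end-around carry
    return s

# 25 - 14: add 25 to the 1's complement of 14.
print(format(add_ones_complement(25, ones_complement(14)), "08b"))  # 00001011 = +11
# 14 - 25: no carry, so the result is negative and in 1's complement form.
print(format(add_ones_complement(14, ones_complement(25)), "08b"))  # 11110100 -> -11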
Binary Codes
WHAT IS BINARY CODE?
Binary code, used in digital computers, is based on a binary number system in which there are only two possible states, off and on, usually symbolized by 0 and 1. In BCD, each decimal digit (0-9) is represented by a set of four binary digits, or bits.
BCD Code
BCD Arithmetic
• A disadvantage of the 8421 code is that the rules of binary addition and subtraction apply only to the individual 4-bit groups, not to the entire 8421 number.
• BCD addition is therefore performed by individually adding the corresponding digits of the decimal numbers, expressed in 4-bit binary groups, starting from the LSD.
• If there is a carry out of one group to the next group, or if the result is an illegal code, then 6 (0110) is added to the sum term of that group and the resulting carry is added to the next group.
• BCD subtraction is performed by subtracting the digits of each 4-bit group of the subtrahend from the corresponding 4-bit group of the minuend in binary, starting from the LSD.
• If there is a borrow from the next group, then 6 (0110) is subtracted from the difference term of this group. This is done to skip the six illegal states.
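A minimal sketch of one-digit BCD addition, applying the add-6 correction when the 4-bit group overflows or produces an illegal code (function names are illustrative):

def bcd_add_digit(a: int, b: int, carry_in: int = 0):
    """Add two BCD digits (each 0-9); return (carry_out, 4-bit sum)."""
    s = a + b + carry_in
    if s > 9:          # carry out of the group or an illegal code (1010-1111)
        s += 6         # add 0110 to skip the six illegal states
        return 1, s & 0b1111
    return 0, s

carry, digit = bcd_add_digit(0b0111, 0b0110)  # 7 + 6
print(carry, format(digit, "04b"))            # 1 0011 -> BCD 13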
Nine’s complement and ten’s complement
● 9’s complement of a decimal number is obtained by subtracting each digit of that decimal
number from 9
● The 10's complement of a decimal number is obtained by adding a 1 to its 9's complement
● To perform decimal subtraction using the 9's complement method, obtain the 9's complement of the subtrahend and add it to the minuend
● Call this number the intermediate result
● If there is a carry, it indicates that the answer is positive; add the carry to the LSD of this result to get the answer (this carry is called the end-around carry)
● If there is no carry, it indicates that the answer is negative and the intermediate result is its 9's complement; take the 9's complement of this result and place a negative sign in front to get the answer
● To perform decimal subtraction using the 10's complement method, obtain the 10's complement of the subtrahend and add it to the minuend
● If there is a carry, ignore it; the presence of the carry indicates that the answer is positive, and the result obtained is itself the answer
● If there is no carry, it indicates that the answer is negative and the result obtained is its 10's complement; obtain the 10's complement of the result and place a negative sign in front to get the answer
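The 10's complement procedure translates directly into code. A minimal sketch, assuming 3-digit decimal operands (function names are illustrative):

def nines_complement(n: int, digits: int) -> int:
    # Subtract each digit from 9, i.e. (10^digits - 1) - n.
    return (10 ** digits - 1) - n

def subtract_tens_complement(minuend: int, subtrahend: int, digits: int = 3) -> int:
    total = minuend + nines_complement(subtrahend, digits) + 1  # add 10's complement
    if total >= 10 ** digits:          # carry: answer is positive, ignore the carry
        return total - 10 ** digits
    # no carry: answer is negative and in 10's complement form
    return -(10 ** digits - total)

print(subtract_tens_complement(745, 436))  # 309
print(subtract_tens_complement(436, 745))  # -309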
Gray Code
● The Gray code is unweighted and is not an arithmetic code; that is, there are no specific weights assigned to the bit positions.
● The most important feature of the Gray code is that it exhibits only a single bit change from one codeword to the next in sequence. This property is important in many applications, such as shaft position encoders.
Binary to gray code conversion
The rules for conversion are:
1. The most significant bit (leftmost) in the Gray code is the same as the corresponding MSB in the binary number.
2. Going from left to right, add each adjacent pair of binary code bits to get the next Gray code bit. Discard carries.
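These two rules are equivalent to XOR-ing the binary number with a copy of itself shifted right by one bit, since adding adjacent bits and discarding the carry is exactly XOR. A minimal sketch:

def binary_to_gray(b: int) -> int:
    # MSB unchanged; every other Gray bit is the XOR of adjacent binary bits.
    return b ^ (b >> 1)

for n in range(8):
    print(format(n, "03b"), "->", format(binary_to_gray(n), "03b"))
# Successive outputs differ in exactly one bit position.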
Parity
• The simplest technique for detecting errors is that of adding an extra bit, known as the parity bit, to each word being transmitted.
• There are two types of parity: odd parity and even parity.
• For odd parity, the parity bit is set to a 0 or a 1 at the transmitter such that the total number of 1 bits in the word, including the parity bit, is an odd number.
• For even parity, the parity bit is set to a 0 or a 1 at the transmitter such that the total number of 1 bits in the word, including the parity bit, is an even number.
Parity
• When the digital data is received, a parity checking system generates an error signal if the total number of 1s is even in an odd-parity system or odd in an even-parity system.
• This parity check can always detect a single-bit error but cannot detect two or more errors within the same word.
• Odd parity is used more often than even parity because even parity does not detect the situation where all zeros are created by a short circuit or some other fault condition.
In an odd parity scheme, which of the following words contain an error? (The candidate words were given in a figure.)
Solution:
The number of 1s in the first word is even (6). So this word has an error.
The number of 1s in the second word is even (4). So this word has an error.
The number of 1s in the third word is odd (5). Therefore there is no error.
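Both parity generation and checking reduce to a bit count. A minimal sketch (function names are illustrative):

def parity_bit(word: str, scheme: str = "odd") -> str:
    ones = word.count("1")
    if scheme == "odd":                   # total 1s (including parity) must be odd
        return "0" if ones % 2 == 1 else "1"
    return "0" if ones % 2 == 0 else "1"  # even parity

def has_error(word_with_parity: str, scheme: str = "odd") -> bool:
    ones = word_with_parity.count("1")
    return ones % 2 == 0 if scheme == "odd" else ones % 2 == 1

print(parity_bit("1011001", "odd"))  # '1' -> transmitted word 11011001
print(has_error("110101", "odd"))    # True: 4 ones is even, illegal under odd parity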
Check Sums
• Simple parity cannot detect two errors within the same word.
• To overcome this, a checksum (or two-dimensional parity) can be used.
• As each word is transmitted, it is added to the sum of the previously transmitted words, and the sum is retained at the transmitter's end.
• At the end of the transmission, the sum (called the checksum) up to that time is sent to the receiver.
• The receiver can check its sum against the transmitted sum.
• If the two sums are the same, then no errors were detected at the receiver's end.
• If an error is detected at the receiver's end, the receiving end asks for retransmission.
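A minimal sketch of a running checksum kept to the word width, as the transmitter and receiver would both compute it (names are illustrative):

def checksum(words, bits=8):
    # Running sum of all transmitted words, truncated to the word width.
    mask = (1 << bits) - 1
    total = 0
    for w in words:
        total = (total + w) & mask
    return total

data = [0x12, 0x34, 0x56]
sent = checksum(data)            # transmitter appends this to the block
assert checksum(data) == sent    # receiver recomputes and compares
print(hex(sent))                 # 0x9c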
Block Parity
• When several binary words are transmitted or stored in succession, the resulting collection of bits can be regarded as a block of data having rows and columns.
• Parity bits can be assigned to both rows and columns.
Error-Detecting Codes
➢ When binary data is transmitted and processed, it is susceptible to noise that can alter or distort its contents.
➢ Because digital systems must be accurate to the digit, errors can pose a serious problem.
➢ Several schemes have been devised to detect the occurrence of a single-bit error in a binary word, so that whenever such an error occurs the concerned binary word can be corrected and retransmitted.
Error-Correcting Codes
● A code is said to be an error-correcting code if the correct code word can always be deduced from the erroneous word.
● If the location of an error has been determined, then by complementing the erroneous digit, the message can be corrected.
In the 7-bit code word
D7 D6 D5 P4 D3 P2 P1
the D bits are data bits and the P bits are the parity bits,
i.e., 4 data bits + 3 parity bits give a 7-bit code word.
The parity bits are placed in the positions that are powers of 2.
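As an illustration of placing parity bits at the power-of-2 positions, here is a minimal sketch of a Hamming (7,4) encoder; even parity and the standard position-number groupings are assumed, and the function name is illustrative:

def hamming74_encode(d: str) -> str:
    """d is the 4 data bits 'd7 d6 d5 d3' -> 7-bit codeword D7 D6 D5 P4 D3 P2 P1."""
    d7, d6, d5, d3 = (int(b) for b in d)
    p4 = d7 ^ d6 ^ d5          # even parity over positions 5, 6, 7
    p2 = d7 ^ d6 ^ d3          # even parity over positions 3, 6, 7
    p1 = d7 ^ d5 ^ d3          # even parity over positions 3, 5, 7
    return f"{d7}{d6}{d5}{p4}{d3}{p2}{p1}"

print(hamming74_encode("1011"))  # data 1011 -> codeword 1010101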
ALPHANUMERIC CODES
● Alphanumeric codes are codes used to encode the characters of alphabets in addition to the decimal digits.
● They are used primarily for transmitting data between computers and their I/O devices such as printers, keyboards and video display terminals.
○ EBCDIC code
The EBCDIC Code
● EBCDIC stands for Extended Binary Coded Decimal Interchange Code.
● Pronounced as 'eb-si-dik'.
● The EBCDIC code can be used to encode all the symbols and control characters found in ASCII.
Examples (simplify the following Boolean expressions):
1. A[B + C(AB + AC)]
2. A + B[AC + (BCD)]
3. A + BC(AB + ABC)
4. (B + BC)(B + BC)(B + D)
5. AB + ABC + BC = AC + BC
Additional Theorems
Theorem 1
X·f(X, X', Y, ....., Z) = X·f(1, 0, Y, ....., Z)
This theorem states that if a function containing terms with X and X' is multiplied by X, then all the Xs and X's in the function can be replaced by 1s and 0s, respectively. This is permissible because the product is evaluated only where X = 1, and there X = 1 and X' = 0.
Theorem 2
X + f(X, X', Y, ....., Z) = X + f(0, 1, Y, ....., Z)
Expansion Of A Boolean Expression To SOP Form
1. Write down all the terms.
2. If one or more variables are missing in any term, expand that term by multiplying it with the sum of each missing variable and its complement; each resulting product term is a minterm.
3. Replace the non-complemented variables by 1s and the complemented variables by 0s, and use all combinations of Xs in terms of 0s and 1s to generate minterms.
4. Drop out all the redundant terms.
Example
Expand A' + B' to minterms and maxterms.
The given expression is a two-variable function.
In the first term A', the variable B is missing; so, multiply it by (B + B').
In the second term B', the variable A is missing; so, multiply it by (A + A').
A' + B'
= A'(B + B') + B'(A + A')
= A'B + A'B' + AB' + A'B'
= A'B + A'B' + AB'
= 01 + 00 + 10
= m1 + m0 + m2
= Σm(0, 1, 2)
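The same expansion can be checked by brute force: evaluate the expression for every combination of inputs and collect the rows where it is 1. A minimal sketch:

from itertools import product

def minterms(f, nvars):
    # Row number treats the first variable as the MSB (A, then B, ...).
    return [i for i, bits in enumerate(product((0, 1), repeat=nvars))
            if f(*bits)]

# f(A, B) = A' + B'
print(minterms(lambda a, b: (not a) or (not b), 2))  # [0, 1, 2]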
A standard SOP form can always be converted to a standard POS form by treating the missing minterms of the SOP form as the maxterms of the POS form. For example, Σm(0, 1, 2) in two variables becomes ΠM(3).
Similarly, a standard POS form can be converted to a standard SOP form by treating the missing maxterms of the POS form as the minterms of the corresponding SOP form.
Example
The same expansion using the 0, 1, X position notation:
A' + B' → 0X + X0
= 00 + 01 + 00 + 10
= 00 + 01 + 10
= Σm(0, 1, 2)
Expansion Of A Boolean Expression To POS Form
1. If one or more variables are missing in any sum term, expand that term by adding the product of each missing variable and its complement; each resulting sum term is a maxterm.
2. Replace the complemented variables by 1s and the non-complemented variables by 0s, and use all combinations of Xs in terms of 0s and 1s to generate maxterms.
3. Drop out the redundant terms.
Example: expand A(A + B')B to maxterms.
A → 0X = (00)(01) = M0·M1
(A + B') → (01) = M1
B → X0 = (00)(10) = M0·M2
Therefore,
A(A + B')B = ΠM(0, 1, 2)
Example:
Expand A(A' + B)(A' + B + C') to maxterms and minterms.
The given expression is a three-variable function in POS form.
The variables B and C are missing in the first term A. So, add BB' and CC' to it.
The variable C is missing in the second term (A' + B). So, add CC' to it.
The third term (A' + B + C') contains all three variables. So, leave it as it is.
A = A + BB' + CC' = (A + B)(A + B') + CC'
= (A + B + CC')(A + B' + CC')
= (A + B + C)(A + B + C')(A + B' + C)(A + B' + C')
A' + B = A' + B + CC'
= (A' + B + C)(A' + B + C')
Therefore,
A(A' + B)(A' + B + C') = (A + B + C)(A + B + C')(A + B' + C)(A + B' + C')(A' + B + C)(A' + B + C')
Also,
A → 0XX = (000)(001)(010)(011) = M0·M1·M2·M3
A' + B → 10X = (100)(101) = M4·M5
A' + B + C' → 101 = M5
Therefore,
A(A' + B)(A' + B + C') = ΠM(0, 1, 2, 3, 4, 5) = Σm(6, 7)
Two-Variable K-Map

Figure: Two-variable K-map cell labels
        B=0    B=1
A=0     A'B'   A'B
A=1     AB'    AB
Figure: Two-variable K-map with Σm(0, 2, 3) plotted
        B=0    B=1
A=0     1      0
A=1     1      1
Map the expression A'B + AB'
        B=0    B=1
A=0     0      1
A=1     1      0
(The 1s are entered in cells m1 and m2.)
Minimization of SOP Expressions

Figure: f1 = A'
        B=0    B=1
A=0     1      1
A=1     0      0
(The pair m0, m1 reduces to A'.)
For example:
Figure: f2 = B'
        B=0    B=1
A=0     1      0
A=1     1      0
For example:
Figure: f3 = B
        B=0    B=1
A=0     0      1
A=1     0      1
For example:
Figure: f4 = A
        B=0    B=1
A=0     0      0
A=1     1      1
For example:
m0, m1, m2 and m3 can be combined to yield
f5 = m0 + m1 + m2 + m3 = A'B' + A'B + AB' + AB = A' + A = 1

Figure: f5 = 1
        B=0    B=1
A=0     1      1
A=1     1      1
Figure: f6 = A' + B
        B=0    B=1
A=0     1      1
A=1            1
(The 1s in cells m0, m1 and m3 group as A' and B, giving f6 = A' + B.)
Mapping of POS Expressions

Figure: Two-variable K-map cell labels (maxterms)
        B=0      B=1
A=0     A+B      A+B'
A=1     A'+B     A'+B'
Plot the expression (A + B)(A' + B)(A' + B') on the K-map.
The given expression in terms of maxterms is ΠM(0, 2, 3), so 0s are entered in cells 0, 2 and 3.
        B=0    B=1
A=0     0      1
A=1     0      0
Minimization of POS Expressions

Figures: two-variable POS K-maps in which the groups are formed from 0s instead of 1s (duals of the SOP examples above); the recovered captions read f2 = B' and f4 = A.
Reduce the expression (A + B)(A + B')(A' + B') using mapping.
The given expression in terms of maxterms is ΠM(0, 1, 3).
The K-map and its reduction are shown below: the pair M0, M1 gives the sum term A, and the pair M1, M3 gives the sum term B'.
        B=0    B=1
A=0     0      0
A=1     1      0
Figure: f = (A)(B') = AB'
Figure: (a) Minterms, (b) Maxterms of a three-variable K-map

(a)     BC=00    BC=01    BC=11    BC=10
A=0     A'B'C'   A'B'C    A'BC     A'BC'
A=1     AB'C'    AB'C     ABC      ABC'

(b)     BC=00     BC=01      BC=11       BC=10
A=0     A+B+C     A+B+C'     A+B'+C'     A+B'+C
A=1     A'+B+C    A'+B+C'    A'+B'+C'    A'+B'+C
Map the expression A'B'C + A'BC' + AB'C + ABC + ABC'
In terms of minterms this is Σm(1, 2, 5, 6, 7).

        BC=00   BC=01   BC=11   BC=10
A=0     0       1       0       1
A=1     0       1       1       1
Map the expression (A + B + C)(A + B' + C')(A' + B + C')(A' + B' + C)(A' + B' + C')
In terms of maxterms this is ΠM(0, 3, 5, 6, 7).

        BC=00   BC=01   BC=11   BC=10
A=0     0       1       0       1
A=1     1       0       0       0
Minimization of SOP and POS Expressions

Reduce the expression Σm(0, 2, 3, 4, 5, 6) using mapping.

        BC=00   BC=01   BC=11   BC=10
A=0     1       0       1       1
A=1     1       1       0       1

Grouping the four corner cells (m0, m2, m4, m6) gives C', the pair m2, m3 gives A'B, and the pair m4, m5 gives AB'.
f = C' + A'B + AB'
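A reduced expression can be verified against the original minterm list by exhaustive evaluation. A minimal sketch checking f = C' + A'B + AB' against Σm(0, 2, 3, 4, 5, 6):

from itertools import product

f = lambda a, b, c: (not c) or ((not a) and b) or (a and (not b))
found = [i for i, (a, b, c) in enumerate(product((0, 1), repeat=3)) if f(a, b, c)]
assert found == [0, 2, 3, 4, 5, 6]   # matches the given minterm list
print(found)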
Reduce the expression ΠM(0, 1, 2, 3, 4, 7) using mapping.

        BC=00   BC=01   BC=11   BC=10
A=0     0       0       0       0
A=1     0       1       0       1

Grouping the 0s: the A = 0 row gives the sum term A, the pair M0, M4 gives (B + C), and the pair M3, M7 gives (B' + C').
f = (B + C)(B' + C')(A)
Obtain the minimal expression for Σm(1, 2, 4, 6, 7).
Four-Variable K-Map
A four-variable (A, B, C, D) expression can have 2^4 = 16 possible combinations, so the map has 16 cells: the rows are labelled AB = 00, 01, 11, 10 and the columns CD = 00, 01, 11, 10.
Reduce using mapping the expression Σm(0, 1, 2, 3, 5, 7, 8, 9, 10, 12, 13)

        CD=00   CD=01   CD=11   CD=10
AB=00   1       1       1       1
AB=01   0       1       1       0
AB=11   1       1       0       0
AB=10   1       1       0       1

Figure: f = B'D' + AC' + A'D
The same function in POS form: f = (A + B' + D)(A' + C' + D')(A' + B' + C')
Reduce using mapping the expression ΠM(2, 8, 9, 10, 11, 12, 14)

        CD=00   CD=01   CD=11   CD=10
AB=00   1       1       1       0
AB=01   1       1       1       1
AB=11   0       1       1       0
AB=10   0       0       0       0

Grouping the 0s: the AB = 10 row gives (A' + B), the pair M2, M10 gives (B + C' + D), and the pair M12, M14 gives (A' + B' + D).
f = (A' + B)(B + C' + D)(A' + B' + D)
During the process of design using an SOP map, each don't care is treated as a 1 if it is helpful in map reduction; otherwise it is treated as a 0 and left alone.
During the process of design using a POS map, each don't care is treated as a 0 if it is useful in map reduction; otherwise it is treated as a 1 and left alone.
An SOP expression with don't cares can be converted into POS form by keeping the don't cares as they are and writing the missing minterms of the SOP form as the maxterms of the POS form.
Similarly, to convert a POS expression with don't cares into an SOP expression, keep the don't cares of the POS expression as they are and write the missing maxterms of the POS expression as the minterms of the SOP expression.
Don't Care Combinations

Reduce the expression Σm(1, 5, 6, 12, 13, 14) + d(2, 4)

        CD=00   CD=01   CD=11   CD=10
AB=00   0       1       0       X
AB=01   X       1       0       1
AB=11   1       1       0       1
AB=10   0       0       0       0

f = BC' + BD' + A'C'D
Don't Care Combinations

The same function reduced on a POS map, with each don't care treated as a 0 where useful. Grouping the 0s of the map above (the don't cares at cells 2 and 4 are available as 0s):
f = (B + D)(A' + B)(C' + D')
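With don't cares, the reduced SOP and POS forms need only agree on the specified minterms; they are free to differ on d(2, 4). A minimal sketch verifying both forms:

from itertools import product

sop = lambda a, b, c, d: (b and not c) or (b and not d) or ((not a) and (not c) and d)
pos = lambda a, b, c, d: (b or d) and ((not a) or b) and ((not c) or (not d))

ones, dont_care = {1, 5, 6, 12, 13, 14}, {2, 4}
for i, bits in enumerate(product((0, 1), repeat=4)):
    if i in dont_care:
        continue                       # unspecified: either value is acceptable
    assert bool(sop(*bits)) == bool(pos(*bits)) == (i in ones)
print("both forms realize Σm(1, 5, 6, 12, 13, 14) + d(2, 4)")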
COMBINATIONAL CIRCUITS
A combinational circuit is a digital logic circuit in which the output depends only on the present combination of inputs.
Given the current inputs, one can analyse and say what the output must be.
HALF-ADDER
A combinational circuit that performs the addition of two bits is called a half adder. A half adder has two inputs and two outputs, namely SUM and CARRY.
The Boolean expressions for SUM and CARRY are
SUM = AB' + A'B
CARRY = AB
These expressions show that the SUM output is an EX-OR gate and the CARRY output is an AND gate. The figure shows the implementation of the half adder with all the combinations, including the implementation using NAND gates only.
Truth table of the half adder:
A B | SUM CARRY
0 0 |  0    0
0 1 |  1    0
1 0 |  1    0
1 1 |  0    1
FULL-ADDER
A circuit that performs the addition of 3 bits (two significant bits and a previous carry) is a full adder.
Truth table of full adder:
X Y Z C S
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
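Both adders follow directly from their truth tables: SUM is the XOR of the inputs and CARRY is generated by the AND/OR terms. A minimal gate-level sketch:

def half_adder(a: int, b: int):
    return a ^ b, a & b                        # (SUM, CARRY)

def full_adder(x: int, y: int, z: int):
    s1, c1 = half_adder(x, y)
    s, c2 = half_adder(s1, z)                  # z is the carry-in
    return s, c1 | c2                          # (S, C)

for x, y, z in [(0, 1, 1), (1, 1, 1)]:
    print(x, y, z, "->", full_adder(x, y, z))  # (0, 1) and (1, 1), matching the table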
MAGNITUDE COMPARATOR
A magnitude comparator is a combinational circuit that compares two binary numbers in order to find out whether one number is equal to, less than, or greater than the other. We logically design a circuit with two inputs, one for A and the other for B, and three output terminals: one for the A > B condition, one for A = B, and one for A < B.
1-BIT COMPARATOR
A comparator used to compare two bits is called a single-bit comparator. It consists of two inputs, one for each single-bit number, and three outputs to generate the less than, equal to and greater than conditions between the two binary numbers.
Truth table of 1 bit comparator:
A B A<B A=B A>B
0 0 0 1 0
0 1 1 0 0
1 0 0 0 1
1 1 0 1 0
From the above truth table logical expressions for each output can be
expressed as follows:
A>B: AB'
A<B: A'B
A=B: A'B' + AB
From the above expressions we can derive the following formula:
(A<B) + (A>B) = A'B + AB'
Taking the complement of both sides:
((A<B) + (A>B))' = (A'B + AB')'
((A<B) + (A>B))' = (A'B)'(AB')'
((A<B) + (A>B))' = (A + B')(A' + B)
((A<B) + (A>B))' = (AA' + AB + A'B' + B'B)
((A<B) + (A>B))' = (AB + A'B')
Thus ((A<B) + (A>B))’ = (A=B)
By using these Boolean expressions, we can implement a logic circuit for this
comparator as given below:
2-BIT COMPARATOR
A comparator used to compare two binary numbers each of two bits is called a
2-bit Magnitude comparator. It consists of four inputs and three outputs to
generate less than, equal to and greater than between two binary numbers.
The truth table for a 2-bit comparator has 16 rows, one for each combination of the inputs A1A0 and B1B0.
From the truth table, a K-map for each output can be drawn, and from the K-maps logical expressions for each output can be derived.
By using these Boolean expressions, we can implement a logic circuit for this
comparator as given below:
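Because the truth table has only 16 rows, the three outputs can also be generated (and any expressions read off the K-maps can be checked) by simple enumeration. A minimal sketch:

# Enumerate all (A1 A0, B1 B0) pairs and derive the three comparator outputs.
for a in range(4):
    for b in range(4):
        a1, a0 = a >> 1, a & 1
        b1, b0 = b >> 1, b & 1
        print(a1, a0, b1, b0, "->",
              int(a > b), int(a == b), int(a < b))  # A>B, A=B, A<B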
MULTIPLEXERS
It is a combinational circuit which has many data inputs and a single output, depending on control or select inputs. Multiplexers are mainly used to increase the amount of data that can be sent over a network within a certain amount of time and bandwidth.
2 INPUT MULTIPLEXER
In a 2-input multiplexer, there are only two inputs. The block diagram and truth table are given below:
Y=S0'.A0+S0.A1
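The defining equation Y = S0'.A0 + S0.A1 translates directly into code. A minimal sketch:

def mux2(a0: int, a1: int, s0: int) -> int:
    # Y = S0'.A0 + S0.A1 : select A0 when S0 = 0, A1 when S0 = 1
    return ((1 - s0) & a0) | (s0 & a1)

print(mux2(a0=1, a1=0, s0=0))  # 1 (A0 selected)
print(mux2(a0=1, a1=0, s0=1))  # 0 (A1 selected)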
● 1. Introduction:
Switching circuits may be combinational switching circuits or sequential switching circuits.
1. Combinational switching circuits are those whose output levels at any instant of time depend only on the levels present at the inputs at that time. Prior input-level conditions have no effect on the present output because combinational logic circuits have no memory.
2. Sequential switching circuits are those whose output levels at any instant of time depend not only on the levels present at the inputs at that time, but also on the prior input-level conditions.
It means that sequential switching circuits have memory, and thus they are made up of combinational circuits and memory elements. The most important memory element is the flip-flop, which is made up of an assembly of logic gates. Even though a logic gate by itself has no storage capability, several logic gates can be connected together in ways that permit information to be stored.
A flip-flop (FF), also known as a bistable multivibrator, has two stable states. It can remain in either of the states indefinitely. Its state can be changed by applying the proper triggering signal. The general flip-flop symbol is shown below.
The flip-flop has two outputs, labelled Q and Q'. Q is the normal output and Q' is the inverted output, as shown in the figure above. The state of the flip-flop always refers to the state of the normal output Q, whereas the inverted output Q' is in the opposite state. A flip-flop is said to be in the HIGH state or logic 1 state or SET state when Q = 1, and in the LOW state or logic 0 state or RESET (clear) state when Q = 0. A flip-flop can have one or more inputs. These inputs are used to cause the flip-flop to switch back (flip) and forth (flop) between its possible output states. A flip-flop input has to be pulsed only momentarily to cause a change in the flip-flop output, and the output will remain in that new state even after the input pulse has been removed. This is the flip-flop's memory characteristic. A flip-flop serves as a storage device: it stores a 1 when its Q output is 1, and stores a 0 when its Q output is 0.
The term 'latch' is used for certain flip-flops. It refers to non-clocked flip-flops, because these flip-flops 'latch on' to a 1 or a 0 immediately upon receiving the input pulse called SET or RESET.
Gated latches are latches which respond to the inputs and latch on to a 1 or 0 only when they are enabled. In the absence of the ENABLE or gating signal, the latch does not respond to changes in its inputs.
The simplest type of flip-flop is called an S-R latch. It has two outputs, labelled Q and Q', and two inputs, labelled S and R. The state of the latch corresponds to the level of Q (HIGH or LOW, 1 or 0), and Q' is the complement of that state. The output as well as its complement are available from each flip-flop.
Active-HIGH S-R latch and its truth table are shown below:
Q0 represents the state of the flip-flop before applying the inputs. The name of the latch, S-R or SET-RESET, is derived from the names of its inputs.
When the SET input is made HIGH, Q becomes 1 (and Q' becomes 0). When the RESET input is made HIGH, Q becomes 0 (and Q' becomes 1).
If both the inputs S and R are made LOW, there is no change in the state of the latch.
If both the inputs are made HIGH, the output is unpredictable.
The S-R latch is also called the R-S latch or the S-C (SET-CLEAR) latch.
An S-R latch can be constructed using two cross-coupled NOR gates or NAND gates. Using two NOR gates, an active-HIGH S-R latch can be constructed, and using two NAND gates, an active-LOW S-R latch can be constructed.
The figure below shows the logic diagram of an active-HIGH S-R latch composed of two cross-coupled NOR gates, where the output of each gate is connected to one of the inputs of the other gate.
1. Let us assume that the latch is initially SET, i.e., Q = 1, Q' = 0. If the inputs are S = 0 and R = 0, the inputs to G1 are 0 (R) and 0 (Q'), and so its output is 1, i.e., Q remains 1. The inputs to G2 are 0 (S) and 1 (Q), and so its output is 0, i.e., Q' remains 0. That is, S = 0 and R = 0 do not result in a change of state.
Similarly, if Q = 0 and Q' = 1 initially, and S = 0 and R = 0 are the inputs applied, the inputs to G2 are 0 (S) and 0 (Q), and so its output is 1, i.e., Q' remains 1. The inputs to G1 are 0 (R) and 1 (Q'), and so its output is 0, i.e., Q remains 0. This implies that the latch remains in the same state when S = 0, R = 0 is applied.
2. If Q = 1 and Q' = 0 initially, and inputs S = 1 and R = 0 are applied, the inputs to G2 are 1 (S) and 1 (Q), and so its output is 0, i.e., Q' remains 0. The inputs to G1 are 0 (R) and 0 (Q'), and so its output is 1, i.e., Q remains 1. If Q = 0 and Q' = 1 initially, and inputs S = 1 and R = 0 are applied, the inputs to G2 are 1 (S) and 0 (Q), and so its output is 0, i.e., Q' goes to 0. The inputs to G1 are 0 (R) and 0 (Q'), and so its output is 1, i.e., Q goes to 1. This implies that, irrespective of the present state, the output of the S-R latch goes to the SET state, i.e., state 1, after application of the inputs S = 1 and R = 0.
3. If Q = 1 and Q' = 0 initially, and inputs S = 0, R = 1 are applied, the inputs to G1 are 1 (R) and 0 (Q'), and so its output is 0, i.e., Q goes to 0. The inputs to G2 are 0 (S) and 0 (Q), and so its output is 1, i.e., Q' goes to 1. If Q = 0 and Q' = 1 initially, and inputs S = 0 and R = 1 are applied, the inputs to G1 are 1 (R) and 1 (Q'), and so the output of G1 is 0, i.e., Q remains 0. The inputs to G2 are 0 (S) and 0 (Q), and so the output of G2 is 1, i.e., Q' remains 1. This implies that, whatever its present state, when S = 0, R = 1 are applied the flip-flop goes to the RESET state, i.e., state 0.
4. When both the inputs S and R are 1, the corresponding outputs will be Q = 0 and Q' = 0, which is invalid.
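The four cases can be traced automatically by iterating the two cross-coupled NOR gates until the outputs settle. A minimal behavioural sketch (the invalid input S = R = 1 is excluded; names are illustrative):

def nor(a: int, b: int) -> int:
    return 1 - (a | b)

def sr_latch(s: int, r: int, q: int, qbar: int):
    # Iterate the cross-coupled NOR gates until they reach a stable state.
    for _ in range(4):
        q, qbar = nor(r, qbar), nor(s, q)
    return q, qbar

q, qbar = 1, 0                     # assume the latch is initially SET
q, qbar = sr_latch(0, 1, q, qbar)  # apply S=0, R=1
print(q, qbar)                     # 0 1 -> RESET, as case 3 describes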
An active-LOW S-R latch can be constructed using two cross-coupled NAND gates.
Since the NAND gate is equivalent to an active-LOW OR gate, an active-LOW S-R latch using OR gates, as well as the logic diagram and truth table of an active-LOW S-R latch, are represented below.
An active-LOW NAND latch can be converted into an active-HIGH NAND latch by inserting inverters at the S and R inputs. The logic diagram and truth table of an active-HIGH NAND latch are represented below.
● 3. GATED LATCHES:
The output of an asynchronous latch can change state any time the input conditions are changed. A gated S-R latch requires an ENABLE (EN) input. Its S and R inputs will control the state of the flip-flop only when the ENABLE is HIGH. When the ENABLE is LOW, the inputs become ineffective and no change of state can take place. The ENABLE input may be a clock, so a gated S-R latch is also called a clocked S-R latch or synchronous S-R latch. Since this type of flip-flop responds to changes in its inputs only as long as the clock is HIGH, such flip-flops are also called level-triggered flip-flops. The logic diagram, the logic symbol and the truth table for the gated S-R latch are shown below; the invalid state occurs when both S and R are simultaneously HIGH.
● 3.2 The Gated D-latch:
It is not necessary to have separate S and R inputs to a latch if the input combinations S = R = 0 and S = R = 1 are never needed; in that case S and R are always the complement of each other. So we can construct a latch with a single input (S) and obtain the R input by inverting it. This single input is labelled D (for data) and the device is called a D-latch. So another type of gated latch is the gated D-latch. It differs from the S-R latch in that it has only one input in addition to EN. When D = 1, we have S = 1 and R = 0, causing the latch to SET when enabled. When D = 0, we have S = 0 and R = 1, causing the latch to RESET when enabled. When EN is LOW, the latch is ineffective. The logic diagram, the logic symbol and the truth table of the gated D-latch are shown below.
A narrow positive spike is generated at the rising edge of the clock using an inverter and an AND gate. The inverter produces a delay of a few nanoseconds (ns). The AND gate produces an output spike that is HIGH only for a few ns, when CLK and CLK' are both HIGH. This results in a narrow pulse at the output of the AND gate, which occurs at the positive-going transition of the clock signal. Similarly, a narrow positive spike is generated at the falling edge of the clock by using an inverter and an active-LOW AND gate, as shown below.
The S and R inputs of the S-R flip-flop are synchronous control inputs, because data on these inputs affect the flip-flop's output only on the triggering edge of the clock pulse.
1. When S is HIGH and R is LOW, the Q output goes HIGH on the positive-going edge of the clock pulse and the flip-flop is SET (if it is already in the SET state, it remains SET).
2. When S is LOW and R is HIGH, the Q output goes LOW on the positive-going edge of the clock pulse and the flip-flop is RESET, that is, cleared (if it is already in the RESET state, it remains RESET).
3. When both S and R are LOW, the output does not change from its prior state.
Similarly, a negative edge-triggered S-R flip-flop triggers only when the clock input goes negative, i.e., from 1 to 0, as shown below.
The D flip-flop has only one input terminal. A D flip-flop may be obtained from an S-R flip-flop by just putting one inverter between the S and R terminals. The flip-flop has only one synchronous control input in addition to the clock; this is called the D (data) input. The level present at D will be stored in the flip-flop at the instant the positive-going transition occurs, i.e., if D is 1 and the clock is applied, Q goes to 1 and Q' to 0 at the rising edge of the pulse. If D is 0 and the clock is applied, Q goes to 0 and Q' to 1 at the rising edge of the clock pulse, and thereafter they remain so.
The negative edge-triggered D flip-flop operates in the same way as a positive edge-triggered D flip-flop, except that the change of state takes place at the negative-going edge of the clock pulse, as shown below.
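A minimal behavioural sketch of a positive edge-triggered D flip-flop: the stored value changes only when the clock input goes from 0 to 1 (class and attribute names are illustrative):

class DFlipFlop:
    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def tick(self, d: int, clk: int) -> int:
        if self._prev_clk == 0 and clk == 1:  # positive-going edge only
            self.q = d
        self._prev_clk = clk
        return self.q

ff = DFlipFlop()
print(ff.tick(d=1, clk=0))  # 0 : no edge, Q holds
print(ff.tick(d=1, clk=1))  # 1 : rising edge stores D
print(ff.tick(d=0, clk=1))  # 1 : clock held HIGH, no new edge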
The logic symbol and truth table of the negative edge-triggered J-K flip-flop are shown below.
Explanation:
Serial in, serial out (implementation with S-R and J-K FF):
Explanation:
● There are four data lines A, B, C and D through which the data is
entered into the register in parallel form.
● The Shift/Load signal allows:
a. The data to be entered in parallel form into the register.
b. The data to be shifted out serially from terminal Q4.
● When the Shift/Load line is HIGH, gates G1, G2 and G3 are disabled, but gates G4, G5 and G6 are enabled, allowing the data bits to shift right from one stage to the next.
● When a clock pulse is applied, the data bits are shifted to the Q output terminals of the FFs, and therefore the data is entered in one step.
● When a clock pulse occurs, the data bits are then effectively shifted one place to the right.
● A LOW on the Right/Left control input:
➢ enables the AND gates G5, G6, G7 and G8,
➢ disables the AND gates G1, G2, G3 and G4, and
➢ passes the Q output of each FF to the D input of the preceding FF.
● When a clock pulse occurs, the data bits are then effectively shifted one place to the left.
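The shift behaviour can be sketched as a list of flip-flop outputs updated once per clock pulse; this is a behavioural model, not the gate-level G1-G8 network in the figure (names are illustrative):

class ShiftRegister:
    def __init__(self, nbits: int = 4):
        self.q = [0] * nbits             # Q outputs of the flip-flops

    def load(self, bits):                # parallel entry of A, B, C, D in one step
        self.q = list(bits)

    def shift_right(self, serial_in: int = 0) -> int:
        out = self.q[-1]                 # serial output from the last stage (Q4)
        self.q = [serial_in] + self.q[:-1]
        return out

reg = ShiftRegister()
reg.load([1, 0, 1, 1])
print([reg.shift_right() for _ in range(4)])  # [1, 1, 0, 1] shifted out serially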
Supercomputers
Functional units
Input Unit
Output Unit
Memory Unit
Secondary storage
Performance
Byte Addressability
We look at three new terms:
• the bit
• the byte: 8 bits
• the word: ranges from 16 to 64 bits
• Impractical: assign an individual address to each bit in memory.
• Practical solution: successive addresses refer to successive byte locations in the memory.
• Byte-addressable memory is used for this assignment.
• Byte locations have addresses 0, 1, 2, .... Thus, if the word length of the machine is 32 bits, successive words are located at addresses 0, 4, 8, ..., each word consisting of four bytes.
Memory Operations:
• To execute an instruction
➢ The processor control circuits must cause the word containing the instruction to
be transferred from memory to the processor
➢ Operands and results must be moved between the memory and the processor.
• Two basic operations involving the memory are needed
• Load(or Read or Fetch)
• Store (or Write)
• Load operation:
➢ Transfer a copy of the contents of a specific memory location to a processor register.
➢ Only the address needs to be specified – Load.
➢ The memory content remains unchanged. The Load operation is performed as follows:
1. The processor sends the address of the desired location to the memory and
requests that its contents be read.
2. The memory reads the data stored at that address and sends them to the
processor.
• Store operation:
➢ Transfer an item of information from the processor to a specific memory location, destroying the former contents of that location.
➢ Overwrite the content in memory.
➢ Both the address and the data need to be specified – Store.
➢ The processor sends the address of the desired location to the memory, together with the data to be written into that location.
Another example:
Consider the operation that adds the contents of registers R1 and R2 and then places their sum into R3.
R3 ← [R1] + [R2]
This type of notation is known as Register Transfer Notation (RTN)
MOV LOC,R1
• Contents of LOC are unchanged by execution of the statement
• Old contents of R1 are overwritten
Add R1, R2, R3
• adds the two numbers in processor registers R1 and R2
• places the sum in R3
Add A,B,C
• Operands A and B are called source operands, and C is the destination operand.
• Add is the operation to be performed on the operands.
Storage In Memory of the Above Instruction
• Suppose each operand needs k bits.
• The above instruction has 3 operands, so 3k bits are needed.
• Add is denoted by additional bits.
• With the modern 32-bit address space of a computer, a three-address instruction is too large to fit in one 32-bit word.
• Therefore, multiple words need to be used for a single instruction.
Branching
• Task: add a list of n numbers
• Addresses of the memory locations containing the n numbers: NUM1, NUM2, NUM3, ..., NUMn
• Add instruction: add each number to contents of register R0
• Result after addition placed in R0
Condition Codes
• The processor keeps track of information about the results of various operations, for use by subsequent conditional branch instructions.
• This information is recorded in individual bits called condition code flags.
• The flags are grouped in a special processor register called the condition code register or status register.
• In the loop to add numbers, the Add instruction must refer to a different address during each pass.
• The memory address cannot be given directly in a single Add instruction in the loop; otherwise it would need to be modified on each pass through the loop.
• Only possibility: a processor register Ri is used to hold the memory address of an operand, and it is incremented by 4 on each pass.
Addressing Modes
• Computer programs are written in high-level languages.
• A high-level language has constants, local and global variables, pointers and arrays.
• When a high-level language program is translated into assembly language, the compiler must implement the elements of the high-level language using the facilities provided in the instruction set of the computer.
Definition
The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
A declaration:
Integer A, B;
• In a high-level language program will cause the compiler to allocate a memory location
to each of the variables A and B.
• Whenever they are referenced later in the program, the compiler can generate assembly
language instructions that use the Absolute mode to access these variables.
Implementation of Constants
Address and data constants can be represented in assembly language using the immediate mode.
Definition
Immediate mode - The operand is given explicitly in the instruction.
For example, the instruction
Move 200 immediate, RO places the value 200 in register RO.
A common convention is to use the sharp sign (#) in front of the value to indicate that this value
is to be used as an immediate operand.
We denote indirection by placing the name of the register or the memory address given in the
instruction in parentheses.
Relative Addressing
• The Index mode is defined using general-purpose processor registers.
• The program counter, PC, may be used instead of a general-purpose register.
• Then, X(PC) can be used to address a memory location that is X bytes away from the location presently pointed to by the program counter.
• Since the addressed location is identified "relative" to the program counter, the name Relative mode is associated with this type of addressing.
Definition
Relative mode - The effective address is determined by the Index mode using the program
counter in place of the general-purpose register Ri.
Additional Modes
• These modes are useful for accessing data items in successive locations.
➢ Autoincrement mode: the effective address of the operand is the contents of a register specified in the instruction.
➢ After accessing the operand, the contents of this register are automatically incremented to point to the next item in a list.
➢ Notation: (Ri)+
➢ The increment amount is implicitly 1, but in a byte-addressable memory with a 32-bit word length the increment must be 4.
If resultant sum is positive, you can find the magnitude of it directly. But, if
the resultant sum is negative, then take 2’s complement of it in order to get
the magnitude.
Example 1
Let us perform the addition of the two decimal numbers +7 and +4 using the 2's complement method.
+7₁₀ = 00111₂
+4₁₀ = 00100₂
The sum is 01011₂. The resultant sum contains 5 bits, so there is no carry out from the sign bit. The sign bit '0' indicates that the resultant sum is positive. So, the magnitude of the sum is 11 in the decimal number system. Therefore, the addition of two positive numbers gives another positive number.
Example 2
Let us perform the addition of the two decimal numbers -7 and -4 using the 2's complement method.
−7₁₀ = 11001₂
−4₁₀ = 11100₂
The resultant sum, 110101₂, contains 6 bits. In this case, a carry is obtained from the sign bit, so we can remove it:
−7₁₀ + (−4)₁₀ = 10101₂
The sign bit '1' indicates that the resultant sum is negative. So, by taking the 2's complement of it we get the magnitude of the resultant sum as 11 in the decimal number system. Therefore, the addition of two negative numbers gives another negative number.
Subtraction of two Signed Binary Numbers
Consider the two signed binary numbers A & B, which are represented in
2’s complement form. We know that 2’s complement of positive number
gives a negative number. So, whenever we have to subtract a number B
from number A, then take 2’s complement of B and add it to A. So,
mathematically we can write it as
A - B = A + (2's complement of B)
Similarly, if we have to subtract the number A from number B, then take 2’s
complement of A and add it to B. So, mathematically we can write it as
B - A = B + (2's complement of A)
So, the subtraction of two signed binary numbers is similar to the addition
of two signed binary numbers. But, we have to take 2’s complement of the
number, which is supposed to be subtracted. This is the advantage of 2’s
complement technique. Follow, the same rules of addition of two signed
binary numbers.
Example 3
+7₁₀ = 00111₂
The 2's complement of +4₁₀ (00100₂) is 11100₂.
Adding them gives 1 00011₂; the carry from the sign bit is ignored.
The sign bit '0' indicates that the result is positive. So, its magnitude is 3 in the decimal number system. Therefore, the subtraction of the two decimal numbers +7 and +4 is +3.
Example 4
+4₁₀ = 00100₂
The 2's complement of +7₁₀ is 11001₂.
Adding them gives 11101₂. Here, no carry is obtained from the sign bit. The sign bit '1' indicates that the result is negative. So, by taking the 2's complement of it we get the magnitude of the result as 3 in the decimal number system. Therefore, the subtraction of the two decimal numbers +4 and +7 is -3.
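The add/subtract rules above condense to a few lines of code. A minimal sketch, assuming 5-bit words as in the examples (function names are illustrative):

BITS = 5
MASK = (1 << BITS) - 1

def twos_complement(n: int) -> int:
    return (-n) & MASK          # equivalently: invert all bits and add 1

def to_signed(word: int) -> int:
    # Interpret a 5-bit word: MSB set means negative, in 2's complement form.
    return word - (1 << BITS) if word & (1 << (BITS - 1)) else word

diff = (0b00111 + twos_complement(0b00100)) & MASK  # +7 - (+4), carry discarded
print(format(diff, "05b"), "=", to_signed(diff))    # 00011 = 3

diff = (0b00100 + twos_complement(0b00111)) & MASK  # +4 - (+7), no carry
print(format(diff, "05b"), "=", to_signed(diff))    # 11101 = -3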
Multiplication Algorithm in Signed Magnitude
Representation
Multiplication of two fixed-point binary numbers in signed-magnitude representation is done by a process of successive shift and add operations.
The numbers copied down in successive lines are shifted one position to the left from the previous number.
Finally, the numbers are added and their sum forms the product.
The sign of the product is determined from the signs of the multiplicand and multiplier: if they are alike, the sign of the product is positive; otherwise it is negative.
Booth’s Algorithm
Booth's algorithm gives a procedure for multiplying binary integers in signed 2's complement representation in an efficient way, i.e., with fewer additions/subtractions. It relies on the fact that a string of 0s in the multiplier requires no addition, only shifting, and a string of 1s in the multiplier from bit weight 2^k down to weight 2^m can be treated as 2^(k+1) − 2^m.
When the two bits (Qn and Qn+1) are equal, the partial product does not change. An overflow cannot occur, because the additions and subtractions of the multiplicand follow each other; as a consequence, the two numbers that are added always have opposite signs, a condition that excludes an overflow.
The next step is to shift right the partial product and the multiplier (including Qn+1). This is an arithmetic shift right (ashr) operation, which shifts AC and QR to the right and leaves the sign bit in AC unchanged. The sequence counter is decremented and the computational loop is repeated n times.
Example: multiply −5 (BR = 1011) by −7 (MR = 1001) using Booth's algorithm.

OPERATION         AC    MR    Qn+1  SC
Initial values    0000  1001  0     4
Subtract BR       0101  1001  0
ashr              0010  1100  1     3
Add BR            1101  1100  1
ashr              1110  1110  0     2
ashr              1111  0111  0     1
Subtract BR       0100  0111  0
ashr              0010  0011  1     0

Product = AC MR = 0010 0011 = 35
The worst case is when there are pairs of alternating 0s and 1s, either 01 or 10, in the multiplier, so that the maximum number of additions and subtractions is required.
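A minimal sketch of Booth's algorithm on n-bit registers, following the AC / QR / Qn+1 / SC scheme described above (register names follow the text; the coding details are illustrative):

def booth_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    mask = (1 << n) - 1
    br = multiplicand & mask            # BR register
    ac, qr, qn1 = 0, multiplier & mask, 0
    for _ in range(n):                  # SC counts down from n
        pair = (qr & 1, qn1)
        if pair == (1, 0):              # start of a string of 1s: AC <- AC - BR
            ac = (ac - br) & mask
        elif pair == (0, 1):            # end of a string of 1s: AC <- AC + BR
            ac = (ac + br) & mask
        # arithmetic shift right of AC, QR, Qn+1 (sign bit of AC preserved)
        qn1 = qr & 1
        qr = ((qr >> 1) | ((ac & 1) << (n - 1))) & mask
        ac = (ac >> 1) | (ac & (1 << (n - 1)))
    return (ac << n) | qr               # product = AC QR

print(format(booth_multiply(-5 & 0xF, -7 & 0xF), "08b"))  # 00100011 = 35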
IEEE Floating-Point Number Representation
IEEE (Institute of Electrical and Electronics Engineers) has standardized floating-point representation as shown in the following diagram.
Half Precision (16 bit): 1 sign bit, 5 bit exponent, and 10 bit mantissa
Single Precision (32 bit): 1 sign bit, 8 bit exponent, and 23 bit
mantissa
Double Precision (64 bit): 1 sign bit, 11 bit exponent, and 52 bit
mantissa
Quadruple Precision (128 bit): 1 sign bit, 15 bit exponent, and 112 bit
mantissa
There are some special values, depending upon the values of the exponent and mantissa, in the IEEE 754 standard.
All the exponent bits 0 with all mantissa bits 0 represents 0. If sign bit
is 0, then +0, else -0.
All the exponent bits 1 with all mantissa bits 0 represents infinity. If
sign bit is 0, then +∞, else -∞.
All the exponent bits 0 and mantissa bits non-zero represents
denormalized number.
All the exponent bits 1 and mantissa bits non-zero represents NaN (not a number).
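The single-precision field widths can be checked with Python's struct module, which packs a float into its IEEE 754 bit pattern. A minimal sketch:

import struct

def float_fields(x: float):
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # 32-bit pattern
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF          # 23-bit fraction
    return sign, exponent, mantissa

print(float_fields(-6.25))  # (1, 129, 4718592) : -1.5625 x 2^(129-127)
print(float_fields(0.0))    # (0, 0, 0)         : +0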
Instruction Execution Cycle
Instruction Execution
An instruction is a command given by the user to the computer. Execution is the process by which a computer performs an instruction. A program to be executed by a processor consists of a set of instructions stored in memory; instruction execution is the process of carrying them out.
Terminologies
Program Counter is a register in a computer processor that contains the address of the next
instruction which will be executed.
Memory Address Register (MAR) holds the Memory Location of data that needs to be accessed.
Instruction Register (IR) is a part of CPU control unit that stores the instruction currently being
executed or decoded.
Memory Buffer Register (MBR) stores the data being transferred to and from the immediate access store; it is also known as the Memory Data Register (MDR).
Control Unit (CU) decodes the program instruction in the IR, selecting machine resources such
as a data source register and a particular arithmetic operation.
Instruction Cycle
❖ The instruction cycle is the time period during which one instruction is fetched from memory and executed when the computer is given an instruction in machine language.
❖ Each instruction is further divided into a sequence of phases.
❖ After the execution, the program counter is incremented to point to the next instruction.
Process
• The Program Counter (PC) contains the address of the next instruction to be fetched.
• The address contained in the PC is copied to the Memory Address Register (MAR).
• The instruction is copied from the memory location contained in the MAR and placed in
the Memory Buffer Register (MBR).
• The entire instruction is copied from the MBR and placed in the Current Instruction
Register (CIR)
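The four fetch steps can be sketched as a loop over a toy memory; the register names follow the text, while the instruction encoding here is invented purely for illustration:

memory = {0: ("LOAD", 10), 1: ("ADD", 11), 2: ("HALT", None),
          10: 7, 11: 5}                     # toy program and data
pc, acc = 0, 0

while True:
    mar = pc                                # 1. PC -> MAR
    mbr = memory[mar]                       # 2. memory[MAR] -> MBR
    cir = mbr                               # 3. MBR -> CIR
    pc += 1                                 # PC incremented for the next fetch
    op, addr = cir                          # decode, then execute:
    if op == "LOAD":
        acc = memory[addr]
    elif op == "ADD":
        acc += memory[addr]
    elif op == "HALT":
        break

print(acc)  # 12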
Execute Cycle
• To execute the instructions, the CPU must generate the control signals in the proper
sequence.
• Two techniques to generate the control signals are
1. Hardwired control
2. Micro programmed control
Hardwired Control
The Hardwired Control organization involves the control logic to be implemented with gates,
flip-flops, decoders, and other digital circuits.
The following image shows the block diagram of a Hardwired Control organization.
Micro Programmed Control
The following image shows the block diagram of a Micro programmed Control organization.
• The Control memory address register specifies the address of the micro-instruction.
• The Control memory is assumed to be a ROM, within which all control information is
permanently stored.
• The control register holds the microinstruction fetched from the memory.
• The micro-instruction contains a control word that specifies one or more micro-operations for the data processor.
• While the micro-operations are being executed, the next address is computed in the
next address generator circuit and then transferred into the control address register to
read the next microinstruction.
• The next address generator is often referred to as a micro-program sequencer, as it
determines the address sequence that is read from control memory.
Control Signals
A control signal is a pulse or frequency of electricity or light that represents a control command as it travels over a network, a computer channel or a wireless link. Control signals are of two types: clocks, and signals that set up communication channels and control the flow of data.
There are three main types of control signals namely;
• Those that activate an ALU function.
• Those that activate a data path.
• Those that are signals on the external system bus or other external interface.
Clock: This is how the control unit “keeps time.” The control unit causes
one micro-operation (or a set of simultaneous micro-operations) to be
performed for each clock pulse. This is sometimes referred to as the
processor cycle time, or the clock cycle time.
Microinstructions
• A symbolic microprogram can be translated into its binary equivalent by means of an assembler.
• Each line of the assembly language microprogram defines a symbolic microinstruction.
• Each symbolic microinstruction is divided into five fields: label, microoperations, CD, BR, and AD.
Micro program:
• A sequence of microinstructions constitutes a microprogram.
• Since alterations of the microprogram are not needed once the control unit is in operation,
the control memory can be a read-only memory (ROM).
• ROM words are made permanent during the hardware production of the unit.
• The use of a micro program involves placing all control variables in words of ROM for
use by the control unit through successive read operations.
• The content of the word in ROM at a given address specifies a microinstruction.
Microcode:
• Microinstructions can be saved by employing subroutines that use common sections of microcode.
• For example, the sequence of micro-operations needed to generate the effective address of the operand
for an instruction is common to all memory reference instructions.
• This sequence could be a subroutine that is called from within many other routines to execute the
effective address computation.
Organization of micro programmed control unit
• The general configuration of a micro-programmed control unit is demonstrated in the block diagram of
Figure 4.1.
• The control memory is assumed to be a ROM, within which all control information is permanently
stored.
Figure 4.1: Micro-programmed control organization
The control memory address register specifies the address of the microinstruction, and the control data register
holds the microinstruction read from memory.
The microinstruction contains a control word that specifies one or more microoperations for the data processor.
Once these operations are executed, the control must determine the next address.
The location of the next microinstruction may be the one next in sequence, or it may be located somewhere
else in the control memory.
While the microoperations are being executed, the next address is computed in the next address generator
circuit and then transferred into the control address register to read the next microinstruction.
Thus a microinstruction contains bits for initiating microoperations in the data processor part and bits that
determine the address sequence for the control memory.
The next address generator is sometimes called a micro-program sequencer, as it determines the address
sequence that is read from control memory.
Typical functions of a micro-program sequencer are incrementing the control address register by one, loading
into the control address register an address from control memory, transferring an external address, or loading an
initial address to start the control operations.
The control data register holds the present microinstruction while the next address is computed and read from
memory.
The data register is sometimes called a pipeline register.
It allows the execution of the microoperations specified by the control word simultaneously with the generation
of the next microinstruction.
This configuration requires a two-phase clock, with one clock applied to the address register and the other to
the data register.
The main advantage of the microprogrammed control is the fact that, once the hardware configuration is established, there should be no need for further hardware or wiring changes.
If we want to establish a different control sequence for the system, all we need to do is specify a different set
of microinstructions for control memory.
Address Sequencing
Microinstructions are stored in control memory in groups, with each group specifying a routine.
To appreciate the address sequencing in a micro-program control unit, let us specify the steps that the control
must undergo during the execution of a single computer instruction.
Step-1:
An initial address is loaded into the control address register when power is turned on in the computer.
This address is usually the address of the first microinstruction that activates the instruction fetch routine.
The fetch routine may be sequenced by incrementing the control address register through the rest of its
microinstructions.
At the end of the fetch routine, the instruction is in the instruction register of the computer.
Step-2:
The control memory next must go through the routine that determines the effective address of the operand.
A machine instruction may have bits that specify various addressing modes, such as indirect address and
index registers.
The effective address computation routine in control memory can be reached through a branch
microinstruction, which is conditioned on the status of the mode bits of the instruction.
When the effective address computation routine is completed, the address of the operand is available in the
memory address register.
Step-3:
The next step is to generate the microoperations that execute the instruction fetched from memory.
The microoperation steps to be generated in processor registers depend on the operation code part of the
instruction.
Each instruction has its own micro-program routine stored in a given location of control memory.
The transformation from the instruction code bits to an address in control memory where the routine is
located is referred to as a mapping process.
A mapping procedure is a rule that transforms the instruction code into a control memory address.
Step-4:
Once the required routine is reached, the microinstructions that execute the instruction may be sequenced by
incrementing the control address register.
Micro-programs that employ subroutines will require an external register for storing the return address.
Return addresses cannot be stored in ROM because the unit has no writing capability.
When the execution of the instruction is completed, control must return to the fetch routine.
This is accomplished by executing an unconditional branch microinstruction to the first address of the fetch
routine.
Mapping of an Instruction
A special type of branch exists when a microinstruction specifies a branch to the first word in control
memory where a microprogram routine for an instruction is located.
The status bits for this type of branch are the bits in the operation code part of the instruction.
For example, a computer with a simple instruction format as shown in figure 4.3 has an operation code of four
bits which can specify up to 16 distinct instructions.
Assume further that the control memory has 128 words, requiring an address of seven bits.
One simple mapping process that converts the 4-bit operation code to a 7-bit address for control memory is
shown in figure 4.3.
This mapping consists of placing a 0 in the most significant bit of the address, transferring the four operation
code bits, and clearing the two least significant bits of the control address register.
This provides for each computer instruction a microprogram routine with a capacity of four
microinstructions.
If the routine needs more than four microinstructions, it can use addresses 1000000 through 1111111. If it
uses fewer than four microinstructions, the unused memory locations would be available for other routines.
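The mapping rule (a 0, the four opcode bits, then 00) is a simple bit manipulation. A minimal sketch:

def map_opcode(opcode: int) -> int:
    # 4-bit opcode -> 7-bit control-memory address: 0 xxxx 00
    return (opcode & 0b1111) << 2

print(format(map_opcode(0b0101), "07b"))  # 0010100: routine for opcode 0101
# Each routine then occupies the four addresses xxxx00 through xxxx11.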
One can extend this concept to a more general mapping rule by using a ROM to specify the mapping
function.
The contents of the mapping ROM give the bits for the control address register.
In this way the microprogram routine that executes the instruction can be placed in any desired location in
control memory.
The mapping concept provides flexibility for adding instructions for control memory as the need arises.
Computer Hardware Configuration
Figure 4.4: Computer hardware configuration
The block diagram of the computer is shown in Figure 4.4. It consists of
1. Two memory units:
Main memory -> for storing instructions and data, and
Control memory -> for storing the microprogram.
2. Six Registers:
Processor unit registers: AC (accumulator), PC (program counter), AR (address register), DR (data register)
Control unit registers: CAR (control address register), SBR (subroutine register)
3. Multiplexers:
The transfer of information among the registers in the processor is done through
multiplexers rather than a common bus.
4. ALU:
The arithmetic, logic, and shift unit performs microoperations with data from AC and DR
and places the result in AC.
Microinstruction Format
The microinstruction format for the control memory is shown in figure 4.5. The 20 bits of the
microinstruction are divided into four functional parts as follows:
1. The three fields F1, F2, and F3 specify microoperations for the computer.
The microoperations are subdivided into three fields of three bits each. The three bits in each field are
encoded to specify seven distinct microoperations. This gives a total of 21 microoperations.
2. The CD field selects status bit conditions.
3. The BR field specifies the type of branch to be used.
4. The AD field contains a branch address. The address field is seven bits wide, since the control memory has 128 = 2^7 words.
□ The BR (branch) field consists of two bits. It is used, in conjunction with the address field AD, to choose the address of the next microinstruction, as shown in Table 4.2.
Symbolic Microinstruction.
□ Each line of the assembly language microprogram defines a symbolic microinstruction.
□ Each symbolic microinstruction is divided into five fields: label, microoperations, CD, BR, and AD. The fields are specified as shown in Table 4.3.
Figure: Microprogram sequencer for a control memory. The input logic receives I0 and I1 (from the BR field) and the test input T; MUX 1 selects the source of the next address loaded into CAR, MUX 2 selects the status bit to be tested, L loads the return address into SBR, and the control memory word supplies the Microops, CD, BR, and AD fields.
Boolean Functions:
S0 = I0
S1 = I0I1 + I0'T
L = I0'I1T
□ Typical sequencer operations are: increment, branch or jump, call and return from subroutine, load
an external address, push or pop the stack, and other address sequencing operations.
□ With three inputs, the sequencer can provide up to eight address sequencing operations.
□ Some commercial sequencers have three or four inputs in addition to the T input and thus provide a wider range of operations.
The Central Processing Unit (CPU) is called the brain of the computer that performs data-processing
operations. Figure 3.1 shows the three major parts of CPU.
Intermediate data is stored in the register set during the execution of the instructions. The microoperations required for executing the instructions are performed by the arithmetic logic unit, while the control unit supervises the transfer of information among the registers and instructs the ALU about which operation is to be performed. The computer instruction set provides the specifications for the design of the CPU, and the design of the CPU largely involves choosing the hardware for implementing the machine instructions. Memory locations are needed for storing pointers, counters, return addresses, temporary results, and partial products. Memory access consumes most of the time of an operation in a computer, so it is more convenient and more efficient to store these intermediate values in processor registers. When a large number of registers are included in the CPU, a common bus system is employed to connect them. Communication between registers is needed not only for direct data transfers but also for performing various micro-operations. In the bus organization for such CPU registers shown in Figure 3.2, the registers are connected to two multiplexers (MUX) that form two buses A and B. The selection lines in each multiplexer select one register as the input data for the particular bus.
OPERATION OF CONTROL UNIT:
The control unit directs the information flow through ALU by:
- Selecting various Components in the system
- Selecting the Function of ALU
Example: R1 <- R2 + R3
[1] MUX A selector (SELA): place R2 onto bus A
[2] MUX B selector (SELB): place R3 onto bus B
[3] ALU operation selector (OPR): set the ALU to ADD
[4] Decoder destination selector (SELD): transfer the output bus into R1
Control Word
The control word consists of the four fields SELA, SELB, SELD, and OPR; the encoding of the register selection fields and the symbolic designation of each microoperation are given in the corresponding tables (not reproduced here).
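As a minimal sketch of how such a control word could be assembled (the 3-bit register codes and the 5-bit OPR codes below follow Mano's textbook example CPU; they are assumptions here, not values given in these notes):

    # Hypothetical field encodings, following Mano's example CPU (assumed).
    REG = {"Input": 0b000, "R1": 0b001, "R2": 0b010, "R3": 0b011,
           "R4": 0b100, "R5": 0b101, "R6": 0b110, "R7": 0b111}
    OPR = {"TSFA": 0b00000, "ADD": 0b00010, "SUB": 0b00101}   # assumed opcodes

    def control_word(sela, selb, seld, opr):
        # Pack SELA(3) | SELB(3) | SELD(3) | OPR(5) into a 14-bit control word.
        return (REG[sela] << 11) | (REG[selb] << 8) | (REG[seld] << 5) | OPR[opr]

    # R1 <- R2 + R3: SELA = R2, SELB = R3, SELD = R1, OPR = ADD
    print(format(control_word("R2", "R3", "R1", "ADD"), "014b"))
    # -> 01001100100010, i.e. 010 011 001 00010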
Stack organization:
□ A stack is a storage device that stores information in such a manner that the item stored last is the first item retrieved.
□ The stack in digital computers is essentially a memory unit with an address register that can only count (after an initial value is loaded into it). The register that holds the address for the stack is called a stack pointer (SP) because its value always points at the top item in the stack.
□ The physical registers of a stack are always available for reading or writing. It is the content of the word that is inserted or deleted.
Register stack:
□ A stack can be placed in a portion of a large memory or it can be organized as a collection of a finite number of memory words or registers. The figure shows the organization of a 64-word register stack.
□ The stack pointer register SP contains a binary number whose value is equal to the address of the word that is currently on top of the stack. Three items are placed in the stack: A, B, and C, in that order. Item C is on top of the stack, so the content of SP is now 3.
□ To remove the top item, the stack is popped by reading the memory word at address 3 and decrementing the content of SP. Item B is now on top of the stack since SP holds address 2.
□ To insert a new item, the stack is pushed by incrementing SP and writing a word into the next-higher location in the stack.
□ In a 64-word stack, the stack pointer contains 6 bits because 2^6 = 64.
□ Since SP has only six bits, it cannot hold a value greater than 63 (111111 in binary). When 63 is incremented by 1, the result is 0, since 111111 + 1 = 1000000 in binary, but SP can accommodate only the six least significant bits.
□ Similarly, when 000000 is decremented by 1, the result is 111111. The one-bit register FULL is set to 1 when the stack is full, and the one-bit register EMTY is set to 1 when the stack is empty of items.
□ DR is the data register that holds the binary data to be written into or read out of the stack.
PUSH:
□ If the stack is not full (FULL = 0), a new item is inserted with a push operation. The push operation consists of the following sequence of microoperations:
SP ← SP + 1
M[SP] ← DR
If (SP = 0) then (FULL ← 1)
EMTY ← 0
□ The stack pointer is incremented so that it points to the address of the next-higher word. A memory write operation inserts the word from DR into the top of the stack.
□ SP holds the address of the top of the stack, and M[SP] denotes the memory word specified by the address presently available in SP.
□ The first item stored in the stack is at address 1. The last item is stored at address 0. If SP reaches 0, the stack is full of items, so FULL is set to 1. This condition is reached if the top item prior to the last push was in location 63 and, after incrementing SP, the last item is stored in location 0.
□ Once an item is stored in location 0, there are no more empty registers in the stack. If an item is written in the stack, obviously the stack cannot be empty, so EMTY is cleared to 0.
POP:
□ A new item is deleted from the stack if the stack is not empty (if EMTY = 0). The pop operation consists of the following sequence of microoperations:
DR ← M[SP]
SP ← SP - 1
If (SP = 0) then (EMTY ← 1)
FULL ← 0
□ The top item is read into DR, and the stack pointer is then decremented. If SP reaches 0, the stack becomes empty, so EMTY is set to 1; since an item has just been removed, the stack cannot be full, so FULL is cleared to 0.
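A minimal Python sketch of this register stack, following the push and pop sequences above (SP is treated as a 6-bit register, so arithmetic on it is modulo 64):

    # Sketch of the 64-word register stack with FULL and EMTY flags.
    class RegisterStack:
        def __init__(self, size=64):
            self.mem = [0] * size
            self.size = size
            self.sp = 0            # stack pointer (6-bit, wraps modulo size)
            self.full = False      # FULL flag
            self.empty = True      # EMTY flag

        def push(self, word):
            if self.full:
                raise OverflowError("stack full")
            self.sp = (self.sp + 1) % self.size    # SP <- SP + 1
            self.mem[self.sp] = word               # M[SP] <- DR
            if self.sp == 0:                       # wrapped: last item at address 0
                self.full = True
            self.empty = False                     # EMTY <- 0

        def pop(self):
            if self.empty:
                raise IndexError("stack empty")
            word = self.mem[self.sp]               # DR <- M[SP]
            self.sp = (self.sp - 1) % self.size    # SP <- SP - 1
            if self.sp == 0:                       # first item was at address 1
                self.empty = True
            self.full = False                      # FULL <- 0
            return word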
Memory Stack.
Figure 5.2: Computer memory with program, data, and stack segments
□ The implementation of a stack in the CPU is done by assigning a portion of memory to astack
operation and using a processor register as a stack pointer.
□ Figure 5.2 shows a portion of computer memory partitioned into three segments: program,data, and
stack.
□ The program counter PC points at the address of the next instruction in the program; it is used during the fetch phase to read an instruction.
□ The address register AR points at an array of data; it is used during the execute phase to read an operand.
□ The stack pointer SP points at the top of the stack which is used to push or pop items into or from the
stack.
□ The three registers are connected to a common address bus, and either one can provide an address for
memory.
□ As shown in Figure 5.2, the initial value of SP is 4001 and the stack grows with decreasing addresses.
Thus the first item stored in the stack is at address 4000, the second item is stored at address 3999, and the
last address that can be used for the stack is 3000.
□ We assume that the items in the stack communicate with a data register DR.
PUSH
□ A new item is inserted with the push operation as follows:
SP ← SP - 1
M[SP] ← DR
□ The stack pointer is decremented so that it points at the address of the next word.
□ A memory write operation inserts the word from DR into the top of the stack.
POP
□ A new item is deleted with a pop operation as follows:
DR ← M[SP]
SP ← SP + 1
□ The top item is read from the stack into DR.
□ The stack pointer is then incremented to point at the next item in the stack.
□ The two microoperations needed for either the push or the pop are (1) an access to memory through SP, and (2) updating SP.
□ Which of the two microoperations is done first, and whether SP is updated by incrementing or decrementing, depends on the organization of the stack.
□ In Figure 5.2 the stack grows by decreasing memory addresses. The stack may also be constructed to grow by increasing memory addresses.
□ The advantage of a memory stack is that the CPU can refer to it without having to specify an address,
since the address is always available and automatically updated in the stack pointer.
Instruction formats:
Instruction fields:
* Mode field – specifies a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced).
Numerical example: a two-word instruction "load to AC" is stored at addresses 200 and 201; its address field holds 500, and the next instruction is at address 202, so PC = 202 when the effective address is computed.

PC = 202, R1 = 400, XR = 100
Memory contents: M[399] = 450, M[400] = 700, M[500] = 800, M[600] = 900, M[702] = 325, M[800] = 300

Addressing Mode       Effective Address              Content of AC
Direct address        500   /* AC <- (500) */        800
Immediate operand     -     /* AC <- 500 */          500
Indirect address      800   /* AC <- ((500)) */      300
Relative address      702   /* AC <- (PC+500) */     325
Indexed address       600   /* AC <- (XR+500) */     900
Register              -     /* AC <- R1 */           400
Register indirect     400   /* AC <- (R1) */         700
Autoincrement         400   /* AC <- (R1)+ */        700
Autodecrement         399   /* AC <- -(R1) */        450
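The entries in this table can be reproduced with a small sketch that applies each addressing rule to the example state above:

    # State from the numerical example above.
    M = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
    PC, R1, XR = 202, 400, 100     # PC already points past the two-word instruction
    addr = 500                     # address field of the instruction

    def effective(mode):
        # Return (effective address, value loaded into AC) for each mode.
        if mode == "direct":            return addr, M[addr]
        if mode == "immediate":         return None, addr            # operand = field itself
        if mode == "indirect":          return M[addr], M[M[addr]]
        if mode == "relative":          return PC + addr, M[PC + addr]
        if mode == "indexed":           return XR + addr, M[XR + addr]
        if mode == "register":          return None, R1
        if mode == "register_indirect": return R1, M[R1]
        if mode == "autoincrement":     return R1, M[R1]             # R1 incremented afterwards
        if mode == "autodecrement":     return R1 - 1, M[R1 - 1]     # R1 decremented first

    for m in ("direct", "immediate", "indirect", "relative", "indexed",
              "register", "register_indirect", "autoincrement", "autodecrement"):
        print(m, effective(m))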
Arithmetic Instructions
Name Mnemonic
Increment INC
Decrement DEC
Add ADD
Subtract SUB
Multiply MUL
Divide DIV
Add with Carry ADDC
Subtract with Borrow SUBB
Negate(2’s Complement) NEG
CMP and TST instructions do not retain the results of their operations (subtraction and AND, respectively). They only set or clear certain flags.
CONDITIONAL BRANCH INSTRUCTIONS
□ The interrupt facility allows the running program to proceed until the input or output device
sets its ready flag. Whenever a flag is set to 1, the computer completes the execution of the
instruction in progress and then acknowledges the interrupt.
□ The result of this action is that the return address is stored in location 0. The instruction in location 1 is then performed; this initiates a service routine for the input or output transfer. The branch to the service routine is stored in location 1.
□ The service routine must have instructions to perform the following tasks: save the contents of processor registers, check which device flag is set, service the device whose flag is set, restore the processor registers, turn the interrupt facility on, and return to the running program.
There are three types of interrupts:
1) External interrupts:
□ External interrupts come from input-output (I/O) devices, from a timing device, from a circuit monitoring the power supply, or from any other external source.
□ Examples of causes of external interrupts are an I/O device requesting transfer of data, an I/O device finishing a transfer of data, the elapsed time of an event, or power failure. A timeout interrupt may result from a program that is in an endless loop and thus has exceeded its time allocation.
□ Power failure interrupt may have as its service routine a program that transfers the complete
state of the CPU into a nondestructive memory in the few milliseconds before power ceases.
□ External interrupts are asynchronous. External interrupts depend on external conditions that
are independent of the program being executed at the time.
2) Internal interrupts:
□ Internal interrupts arise from illegal or erroneous use of an instruction or data. Internal
interrupts are also called traps.
□ Examples of interrupts caused by internal error conditions are register overflow, attempt to
divide by zero, an invalid operation code, stack overflow, and protection violation. These
error conditions usually occur as a result of a premature termination of the instruction
execution. The service program that processes the internal interrupt determines the corrective
measure to be taken.
□ Internal interrupts are synchronous with the program. If the program is rerun, the internal interrupts will occur at the same place each time.
3) Software interrupts:
□ A software interrupt is a special call instruction that behaves like an interrupt rather than a
subroutine call. It can be used by the programmer to initiate an interrupt procedure at any
desired point in the program.
□ The most common use of software interrupt is associated with a supervisor call instruction. This
instruction provides means for switching from a CPU user mode to the supervisor mode. Certain
operations in the computer may be assigned to the supervisor mode only, as for example, a
complex input or output transfer procedure. A program written by a user must run in the user
mode.
□ When an input or output transfer is required, the supervisor mode is requested by means of a
supervisor call instruction. This instruction causes a software interrupt that stores the old CPU
state and brings in a new PSW that belongs to the supervisor mode.
□ The calling program must pass information to the operating system in order to specify the
particular task requested.
Reverse Polish Notation (RPN) with appropriate example.
□ The postfix notation, referred to as Reverse Polish Notation (RPN), places the operator after the operands.
□ The following examples demonstrate the three representations:
A+B Infix notation
+AB Prefix or Polish notation
AB+ Postfix or reverse Polish notation
□ The reverse Polish notation is in a form suitable for stack manipulation.
The expression A * B + C * D is written in reverse Polish notation as A B * C D * +.
□ The conversion from infix notation to reverse Polish notation must take into consideration the operational hierarchy adopted for infix notation.
□ This hierarchy dictates that we first perform all arithmetic inside inner parentheses, then
inside outer parentheses, and do multiplication and division operations before addition
and subtraction operations.
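A minimal stack-based evaluator for reverse Polish expressions (a sketch; operand letters are looked up in a caller-supplied table of values):

    # Evaluate a reverse Polish (postfix) expression using a stack.
    def eval_rpn(tokens, values):
        stack = []
        for tok in tokens:
            if tok in "+-*/":
                b = stack.pop()                      # second operand is on top
                a = stack.pop()
                stack.append({"+": a + b, "-": a - b,
                              "*": a * b, "/": a / b}[tok])
            else:
                stack.append(values[tok])            # push the operand's value
        return stack.pop()

    # A * B + C * D  ->  A B * C D * +
    print(eval_rpn(list("AB*CD*+"), {"A": 2, "B": 3, "C": 4, "D": 5}))   # 26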
Characteristics of RISC –
• Simpler instructions, hence simple instruction decoding.
• Instructions fit within one word.
• Instructions take a single clock cycle to execute.
• A larger number of general-purpose registers.
• Simple addressing modes.
• Fewer data types.
• Pipelining is easy to achieve.
Characteristics of CISC –
• Complex instructions, hence complex instruction decoding.
• Instructions may be larger than one word.
• Instructions may take more than a single clock cycle to execute.
• A smaller number of general-purpose registers, since operations can be performed directly in memory.
• Complex addressing modes.
• More data types.
Example – Suppose we have to add two 8-bit numbers:
CISC approach: There will be a single command or instruction for this, like ADD, which will perform the whole task.
RISC approach: Here the programmer writes a load command first to bring the data into registers, then uses a suitable operation, and then stores the result in the desired location.
So the add operation is divided into parts, i.e. load, operate, store, due to which RISC programs are longer and require more memory, but RISC processors need fewer transistors because of the less complex instructions.
Difference between RISC and CISC:
1. RISC requires multiple register sets to store instructions; CISC requires a single register set.
2. RISC has simple decoding of instructions; CISC has complex decoding of instructions.
3. Use of the pipeline is simple in RISC; use of the pipeline is difficult in CISC.
4. RISC uses a limited number of instructions that require less time to execute; CISC uses a large number of instructions that require more time to execute.
5. RISC uses LOAD and STORE as independent instructions in register-to-register interaction; CISC uses LOAD and STORE within a program's memory-to-memory interaction.
6. The execution time of RISC is very short; the execution time of CISC is longer.
7. RISC architecture is used in high-end applications like telecommunication, image processing, and video processing; CISC architecture is used in low-end applications like home automation and security systems.
8. A program written for RISC architecture tends to take more space in memory; a program written for CISC architecture tends to take less space in memory.
9. Examples of RISC: ARM, PA-RISC, Power Architecture, Alpha, AVR, ARC and SPARC. Examples of CISC: VAX, the Motorola 68000 family, System/360, and AMD and Intel x86 CPUs.
University Questions
1. Compare the functions of RISC and CISC (9 Marks) (2020)
2. Explain the steps for branch address modification (9 Marks) (2018)
Cache Memories
● Main memory (still) slow in comparison to processor speed.
● Main memory constrained by packaging, electronic characteristics and costs.
● Cache memory on the processor chip typically ten times faster than main memory.
● Temporal locality of reference: recently accessed items are likely to be accessed again soon, which is what makes a small, fast cache effective.
● The correspondence between the main memory blocks and those in the cache is specified by a mapping function.
● When the cache is full and a memory word (instruction or data) that is not in the cache is
referenced, the cache control hardware must decide which block should be removed to create
space for the new block that contains the referenced word.
● The collection of rules for making this decision constitutes the replacement algorithm.
● The cache control circuitry determines whether the requested word currently exists in the
cache.
● If it does, the Read or Write operation is performed on the appropriate cache location.
● In this case, a read or write hit is said to have occurred.
● In a Read operation, the main memory is not involved.
● For a Write operation, the system can proceed in two ways:
■ Write-through protocol: the cache location and the main memory location are updated simultaneously.
■ Write-back (or copy-back) protocol: update only the cache location and mark it as updated with an associated flag bit, often called the dirty or modified bit.
● The main memory location of the word is updated later, when the block containing this marked word is to be removed from the cache to make room for a new block.
● When the addressed word in a Read operation is not in the cache, a read miss occurs.
● The block of words that contains the requested word is copied from the main memory into the
cache.
● After the entire block is loaded into the cache, the particular word requested is forwarded to
the processor.
● Alternatively, this word may be sent to the processor as soon as it is read from the main memory; this approach is called load-through, or early restart.
● During a Write operation, if the addressed word is not in the cache, a write miss occurs.
● Then, if the write-through protocol is used, the information is written directly into the main
memory.
● In the case of the write-back protocol, the block containing the addressed word is first brought
into the cache, and then the desired word in the cache is overwritten with the new information.
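A toy sketch contrasting the two write policies on a single cached word (real caches manage many blocks through the mapping function; this is only meant to show when main memory gets updated):

    # Write-through vs. write-back for one cached word.
    class CachedWord:
        def __init__(self, memory, address, write_back=True):
            self.memory = memory           # main memory, modelled as a list
            self.address = address
            self.value = memory[address]   # block brought into the cache
            self.dirty = False             # 'modified' bit, used by write-back
            self.write_back = write_back

        def write(self, value):
            self.value = value                     # update the cache location
            if self.write_back:
                self.dirty = True                  # mark block; memory updated later
            else:
                self.memory[self.address] = value  # write-through: update both

        def evict(self):
            if self.write_back and self.dirty:     # copy back only if modified
                self.memory[self.address] = self.value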
SECONDARY MEMORY
• As we know, primary memory is expensive as well as limited, and the faster primary memory is also volatile. If we need to store a large amount of data or programs permanently, we need a cheaper and permanent memory. Such memory is called secondary memory.
Hard Disk
CD- ROM
Floppy Disk
Memory Card
Flash Drive
1. HARD DISK
• A hard disk drive is made up of a series of circular disks called platters, arranged one over the other almost ½ inch apart around a spindle.
• Disks are made of non-magnetic material like aluminum alloy and coated with 10-20 nm of magnetic material.
• The standard diameter of these disks is 14 inches, and they rotate at speeds varying from 4200 rpm (rotations per minute) for personal computers to 15000 rpm for servers.
• Data is stored by magnetizing or demagnetizing the magnetic coating.
• A magnetic reader arm is used to read data from and write data to the disks. A typical modern HDD has a capacity in terabytes (TB).
2. CD-ROM
• CD stands for Compact Disk. CDs are circular disks that use optical rays, usually lasers, to read and write data.
• They are very cheap, as you can get 700 MB of storage space for less than a dollar. CDs are inserted into CD drives built into the CPU cabinet.
• They are portable, as you can eject the drive, remove the CD and carry it with you.
There are three types of CDs.
Example
Figure: Transfer of control through the use of interrupts. The processor is executing the COMPUTE routine; when an interrupt occurs at instruction i, control transfers to the PRINT routine (the interrupt-service routine, ending at instruction M), and after servicing the interrupt, execution resumes at instruction i+1 of COMPUTE.
Interrupts
•An interrupt is the automatic transfer of software execution in response to a Hardware/Software event that is asynchronous
with the current software execution.
• While the processor is executing a program an ‘interrupt’ breaks the sequence of execution of that program and start
execution of another program.
•ISR –routine executed in response to interrupt requests.
The information that needs to be saved and restored typically includes the condition code flags and the contents of any
registers used by the current program.
Saving registers also increases the delay between the time an interrupt request is received and the start of execution of the interrupt-service routine. This delay is called interrupt latency. Typically, the processor saves the contents of the program counter, the processor status register, and some additional information on the STACK.
Interrupt Hardware: An equivalent circuit for an open-drain bus used to implement a common interrupt-request
line.
(In the figure, a resistor R pulls the common interrupt-request line INTR up to Vdd; each device can pull the line low through its switch to signal the processor.)
•Most computers have several I/O devices that can request an interrupt.
• A single interrupt request line may be used to serve n devices.
•Devices connected via switches to ground.
•Request – by closing switch.
•The line is in its inactive state, at Vdd, only when all the request signals (INTR1 to INTRn) are inactive.
•When any device requests service by closing its switch, the INTR line drops to 0, raising the INTR signal to the CPU.
➔ All computers fundamentally should be able to enable and disable interruptions as desired. When a device
activates the interrupt-request signal, it keeps this signal activated until it learns that the processor has accepted
its request.
• The device is informed that its request has been recognized and deactivates the interrupt request
signal.
• The action requested by the interrupt is performed by the interrupt-service routine.
Polling
Vectored Interrupts
Interrupts nesting (priority)
When a device raises an interrupt request, it sets to 1 one of the bits in its status register, which we will call the IRQ bit.
The simplest way to identify the interrupting device is to have the interrupt-service routine poll all the I/O devices connected to the bus. The first device encountered with its IRQ bit set is serviced. This is a time-consuming process.
• A device requesting an interrupt may identify itself directly to the processor by sending its own special code over the bus. This code includes the starting address of the ISR for this interrupt; this address is called the interrupt vector.
Then the processor can immediately start executing the corresponding interrupt-service routine. It also activates the interrupt-acknowledge line INTA. This scheme is called vectored interrupts.
Nesting interrupts
The CPU may get another interrupt while handling one.
E.g., tracking the time of day based on a real-time clock: this device interrupts at regular intervals to update counters, so the second interrupt must be handled even while another is being serviced.
An interrupt request from a high-priority device should be accepted while the processor is servicing another request from a lower-
priority device
Interrupt Priority
The processor’s priority is usually encoded in a few bits of the processor status word. It can be changed by program instructions
that write into the program status register (PS). These are privileged instructions, which can be executed only while the processor
is running in the supervisor mode
The processor is in the supervisor mode only when executing operating system routines. It switches to the user mode before beginning to execute application programs.
An attempt to execute a privileged instruction while in the user mode leads to a special type of interrupt called a privilege
exception
Figure: Implementation of interrupt priority using individual interrupt-request (INTR1 ... INTRp) and interrupt-acknowledge (INTA1 ... INTAp) lines. Devices 1 to p connect to the processor through a priority arbitration circuit.
Simultaneous Requests
Consider the problem of simultaneous arrival of interrupt requests on a single line from two or more devices. The processor must have some means of deciding which request to service first: an interrupt priority scheme with a daisy chain.
Devices are connected to a single line and arranged on a priority basis. More than one device can pull INTR low. The processor responds by sending INTA, which passes from device to device in daisy-chain fashion.
The requesting device seizes INTA and puts its vector on the bus.
Figure: Daisy-chain arrangement. The devices share a common INTR line to the processor, and the INTA signal propagates serially through the devices.
•At the device end, an interrupt enable bit determines whether it is allowed to generate an interrupt request.
•At the processor end, either Enable bit or Priority scheme determines whether a given interrupt request will be
accepted.
Exceptions
•The term exception is used to refer to any event that causes an interruption.
•Hence, I/O interrupts are one example of an exception.
• Recovery from errors – These are techniques to ensure that all hardware components are operating
properly.
• Debugging – find errors in a program, trace and breakpoints (only at specific points selected by the user).
• Privilege exception – attempting to execute privileged instructions, which exist to protect the OS of a computer. A user program is not allowed to change the priority level.
Use of interrupts in Operating Systems
•The operating system is system software, also termed a resource manager, as it manages the whole variety of computer peripheral devices efficiently.
The OS uses interrupts to manage these devices and I/O transfers.
Direct Memory Access (DMA):
A special control unit is provided to allow transfer of a block of data directly between an
external device and the main memory, without continuous intervention by the processor. This
approach is called direct memory access, or DMA.
DMA transfers are performed by a control circuit that is part of the I/O device interface. We
refer to this circuit as a DMA controller. The DMA controller performs the functions that
would normally be carried out by the processor when accessing the main memory. For each
word transferred, it provides the memory address and all the bus signals that control data
transfer. Since it has to transfer blocks of data, the DMA controller must increment the
memory address for successive words and keep track of the number of transfers.
Although a DMA controller can transfer data without intervention by the processor, its
operation must be under the control of a program executed by the processor. To initiate the
transfer of a block of words, the processor sends the starting address, the number of words in
the block, and the direction of the transfer. On receiving this information, the DMA controller
proceeds to perform the requested operation. When the entire block has been transferred, the
controller informs the processor by raising an interrupt signal.
While a DMA transfer is taking place, the program that requested the transfer cannot
continue, and the processor can be used to execute another program. After the DMA transfer
is completed, the processor can return to the program that requested the transfer.
I/O operations are always performed by the operating system of the computer in response to a request from an application program. The OS is also responsible for suspending the execution of that program: it puts the program in the Blocked state, initiates the DMA operation, and starts the execution of another program. When the transfer is completed, the DMA controller informs the processor by sending an interrupt request. In response, the OS puts the suspended program in the Runnable state so that it can be selected by the scheduler to continue execution.
Two registers are used for storing the starting address and the word count. The third register
contains status and control flags. The R/W bit determines the direction of the transfer. When
this bit is set to 1 by a program instruction, the controller performs a read operation, that is, it
transfers data from the memory to the I/O device. Otherwise, it performs a write operation.
When the controller has completed transferring a block of data and is ready to receive another
command, it sets the Done flag to 1. Bit 30 is the Interrupt-enable flag, IE. When this flag is
set to 1, it causes the controller to raise an interrupt after it has completed transferring a block
of data. Finally, the controller sets the IRQ bit to 1 when it has requested an interrupt
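As a rough sketch of this register organization (only the position of IE, bit 30, is stated above; the bit positions chosen below for R/W, Done, and IRQ are illustrative assumptions):

    # Toy model of the DMA controller registers described above.
    class DMAController:
        RW, DONE, IE, IRQ = 0, 31, 30, 29   # IE = bit 30 per the notes; others assumed

        def __init__(self, memory):
            self.memory = memory        # main memory, modelled as a list
            self.start_addr = 0         # starting-address register
            self.count = 0              # word-count register
            self.status = 0             # status and control flags

        def program(self, start, count, read, ie):
            # The processor writes the registers to initiate a transfer.
            self.start_addr, self.count = start, count
            self.status = (read << self.RW) | (ie << self.IE)

        def run(self, device_words=None):
            # Perform the whole block transfer, then set Done (and IRQ if IE is set).
            if (self.status >> self.RW) & 1:       # read: memory -> device
                data = [self.memory[self.start_addr + i] for i in range(self.count)]
            else:                                  # write: device -> memory
                for i, w in enumerate(device_words):
                    self.memory[self.start_addr + i] = w
                data = None
            self.status |= 1 << self.DONE          # Done flag
            if (self.status >> self.IE) & 1:
                self.status |= 1 << self.IRQ       # raise an interrupt request
            return data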
Use of DMA controllers in a computer system
A DMA controller connects a high-speed network to the computer bus. The disk controller,
which controls two disks, also has DMA capability and provides two DMA channels. It can
perform two independent DMA operations, as if each disk had its own DMA controller. The
registers needed to store the memory address, the word count, and so on are duplicated, so
that one set can be used with each device.
To start a DMA transfer of a block of data from the main memory to one of the disks, a
program writes the address and word count information into the registers of the
corresponding channel of the disk controller. It also provides the disk controller with
information to identify the data for future retrieval. The DMA controller proceeds
independently to implement the specified operation. When the DMA transfer is completed,
this fact is recorded in the status and control register of the DMA channel by setting the Done
bit. At the same time, if the IE bit is set, the controller sends an interrupt request to the
processor and sets the IRQ bit. The status register can also be used to record other
information, such as whether the transfer took place correctly or errors occurred.
Memory accesses by the processor and the DMA controllers are interwoven. Requests by
DMA devices for using the bus are always given higher priority than processor requests.
Among different DMA devices, top priority is given to high-speed peripherals such as a disk,
a high-speed network interface, or a graphics display device. Since the processor originates
most memory access cycles, the DMA controller can be said to “steal” memory cycles from
the processor. Hence, this interweaving technique is usually called cycle stealing.
Alternatively, the DMA controller may be given exclusive access to the main memory to
transfer a block of data without interruption. This is known as block or burst mode
Bus Arbitration
The device that is allowed to initiate data transfers on the bus at any given time is called the bus master. When the current master relinquishes control of the bus, another device can acquire this status. Bus arbitration is the process by which the next device to become the bus master is selected and bus mastership is transferred to it. The selection of the bus master must take into account the needs of various devices by establishing a priority system for gaining access to the bus.
There are two approaches to bus arbitration: centralized and distributed. In centralized arbitration, a single bus arbiter performs the required arbitration. In distributed arbitration, all devices participate in the selection of the next bus master.
o Centralized Arbitration:
The bus arbiter may be the processor or a separate unit connected to the bus. In a basic arrangement, the processor contains the bus arbitration circuitry. In this case, the processor is normally the bus master unless it grants bus mastership to one of the DMA controllers. A DMA controller indicates that it needs to become the bus master by activating the Bus-Request line, BR. The signal on the Bus-Request line is the logical OR of the bus requests from all the devices connected to it. When Bus-Request is activated, the processor activates the Bus-Grant signal, BG1, indicating to the DMA controllers that they may use the bus when it becomes free. This signal is connected to all DMA controllers in a daisy-chain arrangement. Thus, if DMA controller 1 is requesting the bus, it blocks the propagation of the grant signal to other devices; otherwise, it passes the grant downstream by asserting BG2. The current bus master indicates to all devices that it is using the bus by activating another open-collector line called Bus-Busy, BBSY. Hence, after receiving the Bus-Grant signal, a DMA controller waits for Bus-Busy to become inactive, then assumes mastership of the bus. At this time, it activates Bus-Busy to prevent other devices from using the bus at the same time.
The timing diagram in Figure 4.21 shows the sequence of events for the devices in Figure
4.20 as DMA controller 2 requests and acquires bus mastership and later releases the bus.
During its tenure as the bus master, it may perform one or more data transfer operations,
depending on whether it is operating in the cycle stealing or block mode. After it releases the
bus, the processor resumes bus mastership. This figure shows the causal relationships among
the signals involved in the arbitration process. Details of timing, which vary significantly
from one computer bus to another, are not shown.
o Distributed arbitration
Distributed arbitration means that all devices waiting to use the bus have equal responsibility in carrying out the arbitration process, without using a central arbiter. A simple method for distributed arbitration is illustrated in figure 6. Each device on the bus is assigned a 4-bit identification number. When one or more devices request the bus, they assert the Start-Arbitration signal and place their 4-bit ID numbers on four open-collector lines, ARB0 through ARB3. A winner is selected as a result of the interaction among the signals transmitted over these lines by all contenders. The net outcome is that the code on the four lines represents the request that has the highest ID number.
Decentralized arbitration has the advantage of offering higher reliability, because operation of the bus is not dependent on any single device.
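A sketch of the self-selection just described: the ARB lines are modelled as a wired OR that is re-evaluated until it settles, and a device stops driving a line (and all lower-order lines) as soon as it sees a 1 where its own ID has a 0. The code that remains on the lines is the highest contending ID.

    # Distributed arbitration by self-selection (sketch).
    def arbitrate(ids, width=4):
        lines = 0
        while True:
            new_lines = 0
            for d in ids:
                drive = 0
                for i in range(width - 1, -1, -1):   # scan from MSB to LSB
                    if (d >> i) & 1:
                        drive |= 1 << i              # drive a 1 on line ARBi
                    elif (lines >> i) & 1:
                        break                        # outvoted: disable lower drivers
                new_lines |= drive                   # open-collector wired OR
            if new_lines == lines:                   # lines have settled
                return lines                         # = highest contending ID
            lines = new_lines

    print(format(arbitrate([0b0101, 0b0110, 0b1001]), "04b"))   # -> 1001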
FLYNN'S CLASSIFICATION:
•Since most array processors operate asynchronously from the host CPU, they improve the overall capacity of the system.
•An array processor has its own local memory, providing extra memory to systems with low memory.
Multiprocessors
These systems have multiple processors working in parallel that share
the computer clock, memory, bus, peripheral devices etc.
In the MIMD architecture, an interconnection network allows the n processors to access the k memories, so that any processor can access any memory. One CPU can write data into memory and another can read it out.
Multiprocessors are classified by the way their memory is organised, mainly into two types:
• Tightly coupled multiprocessors (Shared Memory)
• Loosely coupled Multiprocessors (Distributed Memory)
Interconnection structures include:
1. Time-shared common bus
2. Multiport memory
3. Crossbar switch
4. Multistage switching network
5. Hypercube interconnection
Others:
•Tree
•Ring
• Mesh
Memory organisation in multiprocessors
UMA (Uniform Memory Access) architecture
• Locality of reference is exploited by cache memory to improve speed.
• A primary cache is built into each processor chip.
NUMA (Non-Uniform Memory Access) architecture
Parallel Processing
Parallel processing can be described as a class of techniques which enables the system to
achieve simultaneous data-processing tasks to increase the computational speed of a
computer system.
A parallel processing system can carry out simultaneous data-processing to achieve faster
execution time. For instance, while an instruction is being processed in the ALU
component of the CPU, the next instruction can be read from memory.
The following diagram shows one possible way of separating the execution unit into eight
functional units operating in parallel.
The operation performed in each functional unit is indicated in each block of the diagram:
o The adder and integer multiplier performs the arithmetic operation with integer numbers.
o The floating-point operations are separated into three circuits operating in parallel.
o The logic, shift, and increment operations can be performed concurrently on different data.
All units are independent of each other, so one number can be shifted while another
number is being incremented.
Pipelining
The term Pipelining refers to a technique of decomposing a sequential process into sub-
operations, with each sub-operation being executed in a dedicated segment that operates
concurrently with all other segments.
The most important characteristic of a pipeline technique is that several computations can
be in progress in distinct segments at the same time. The overlapping of computation is
made possible by associating a register with each segment in the pipeline. The registers
provide isolation between each segment so that each can operate on distinct data
simultaneously.
The following block diagram represents the combined as well as the sub-operations
performed in each segment of the pipeline.
Registers R1, R2, R3, and R4 hold the data and the combinational circuits operate in a
particular segment.
The output generated by the combinational circuit in a given segment is applied as an
input register of the next segment. For instance, from the block diagram, we can see that
the register R3 is used as one of the input registers for the combinational adder circuit.
In general, the pipeline organization is applicable for two areas of computer design which
includes:
1. Arithmetic Pipeline
2. Instruction Pipeline
Arithmetic Pipeline
Arithmetic Pipelines are mostly used in high-speed computers. They are used to
implement floating-point operations, multiplication of fixed-point numbers, and similar
computations encountered in scientific problems.
The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers defined as:
X = A * 2^a
Y = B * 2^b
where A and B are two fractions that represent the mantissas and a and b are the exponents. For illustration, the running example uses decimal numbers: X = 0.9504 * 10^3 and Y = 0.8200 * 10^2.
The combined operation of floating-point addition and subtraction is divided into four
segments. Each segment contains the corresponding suboperation to be performed in
the given pipeline. The suboperations that are shown in the four segments are:
1. Compare the exponents.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.
We will discuss each suboperation in a more detailed manner later in this section.
The following block diagram represents the suboperations performed in each segment of
the pipeline.
1. Compare exponents by subtraction:
The exponents are compared by subtracting them to determine their difference. The
larger exponent is chosen as the exponent of the result.
The difference of the exponents, i.e., 3 - 2 = 1, determines how many times the mantissa associated with the smaller exponent must be shifted to the right.
2. Align the mantissas:
X = 0.9504 * 10^3
Y = 0.08200 * 10^3
3. Add the mantissas:
The two mantissas are added in segment three.
Z = X + Y = 1.0324 * 10^3
4. Normalize the result:
Z = 0.1324 * 10^4
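A small sketch tracing the four suboperations on (mantissa, exponent) pairs, using the decimal example above (real hardware operates on binary mantissas and also handles subtraction, rounding, and overflow, all omitted here):

    # Four-segment floating-point addition on (mantissa, exponent) pairs.
    def fp_add(a, ea, b, eb):
        # Segment 1: compare exponents by subtraction.
        diff = ea - eb
        if diff < 0:                               # ensure X has the larger exponent
            (a, ea), (b, eb), diff = (b, eb), (a, ea), -diff
        # Segment 2: align mantissas (shift the smaller-exponent mantissa right).
        b /= 10 ** diff
        # Segment 3: add mantissas.
        z, ez = a + b, ea
        # Segment 4: normalize the result (mantissa must be a fraction < 1).
        while z >= 1:
            z /= 10
            ez += 1
        return z, ez

    print(fp_add(0.9504, 3, 0.8200, 2))   # -> approximately (0.10324, 4)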
Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream as
well.
Most of the digital computers with complex instructions require instruction pipeline to
carry out operations like fetch, decode and execute instructions.
In general, the computer needs to process each instruction with the following sequence
of steps.
1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
Each step is executed in a particular segment, and there are times when different
segments may take different times to operate on the incoming information. Moreover,
there are times when two or more segments may require memory access at the same
time, causing one segment to wait until another is finished with the memory.
The organization of an instruction pipeline will be more efficient if the instruction cycle is
divided into segments of equal duration. One of the most common examples of this type
of organization is a Four-segment instruction pipeline.
A four-segment instruction pipeline combines two or more of the different segments into a single one. For instance, the decoding of the instruction can be combined with the calculation of the effective address into one segment.
The following block diagram shows a typical example of a four-segment instruction pipeline. The instruction cycle is completed in four segments.
Segment 1:
The instruction fetch segment can be implemented using first in, first out (FIFO) buffer.
Segment 2:
The instruction fetched from memory is decoded in the second segment, and eventually,
the effective address is calculated in a separate arithmetic circuit.
Segment 3:
The operands are fetched from memory.
Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
INSTRUCTION LEVEL PARALLELISM (ILP)
SUPERSCALAR PROCESSOR
MULTICORE SYSTEMS
INSTRUCTION LEVEL PARALLELISM (ILP)
ILP processors have the same execution hardware as RISC processors. Machines without ILP use complex hardware that is hard to implement. A typical ILP processor allows multiple-cycle operations to be pipelined.
Example:
Suppose 4 operations can be carried out in a single clock cycle. So there will be 4 functional units, each attached to one of the operations, plus a branch unit and a common register file in the ILP execution hardware. The operations that can be performed by the functional units are integer ALU, integer multiplication, floating-point operations, load, and store. Let the respective latencies be 1, 2, 3, 2, and 1.
Let the sequence of instructions be –
1. y1 = x1*1010
2. y2 = x2*1100
3. z1 = y1+0010
4. z2 = y2+0101
5. t1 = t1+1
6. p = q*1000
7. clr = clr+0010
8. r = r+0001
Fig. a shows sequential execution of operations.
Fig. b shows use of ILP in improving performance of the processor
The 'nop's in the diagram indicate idle time of the processor. Since the latency of floating-point operations is 3, multiplications take 3 cycles, and the processor has to remain idle for that time period. However, in Fig. b the processor can utilize those nop slots to execute other operations while the previous ones are still being executed.
While in sequential execution each cycle has only one operation being executed, in the processor with ILP cycle 1 has 4 operations and cycle 2 has 2 operations. In cycle 3 there is a 'nop', as the next two operations are dependent on the first two multiplication operations. The sequential processor takes 12 cycles to execute the 8 operations, whereas the processor with ILP takes only 4 cycles.
2.Instruction pipelining
3.Out-of-order execution
4.Register renaming
5.Branch prediction
LOOP LEVEL PARALLELISM
Each iteration of the loop takes the value from the current index of L , and
increments it by 10. If statement S1 takes T time to execute, then the loop
takes time n * T to execute sequentially, ignoring time taken by loop
constructs. Now, consider a system with p processors where p > n .
If n threads run in parallel, the time to execute all n steps is reduced to T .
Less simple cases produce inconsistent, i.e. non-serializable outcomes.
Consider the following loop operating on the same list L .
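The loops themselves are not reproduced in these notes, so the following sketch uses a standard pair of examples consistent with the surrounding text (an assumption): the first loop's statement S1 touches only L[i], so its iterations are independent; the second loop carries a dependence from one iteration to the next.

    # Parallelizable: each iteration touches only L[i] (no cross-iteration data).
    def loop1(L):
        for i in range(len(L)):
            L[i] = L[i] + 10          # statement S1

    # Not safely parallelizable: iteration i reads L[i-1], written by iteration
    # i-1, so running iterations concurrently gives non-serializable outcomes.
    def loop2(L):
        for i in range(1, len(L)):
            L[i] = L[i - 1] + 10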
INSTRUCTION PIPELINING
Pipeline processing can occur not only in the data stream but in the instruction stream as well.
Most of the digital computers with complex instructions require instruction pipeline to carry
out operations like fetch, decode and execute instructions.
In general, the computer needs to process each instruction with the sequence of steps listed earlier: fetch the instruction, decode it, calculate the effective address, fetch the operands, execute, and store the result.
Each step is executed in a particular segment, and there are times when different segments
may take different times to operate on the incoming information. Moreover, there are times
when two or more segments may require memory access at the same time, causing one
segment to wait until another is finished with the memory.
The organization of an instruction pipeline will be more efficient if the instruction cycle is
divided into segments of equal duration. One of the most common examples of this type of
organization is a Four-segment instruction pipeline.
A four-segment instruction pipeline combines two or more of the different segments into a single one. For instance, the decoding of the instruction can be combined with the calculation of the effective address into one segment.
The following block diagram shows a typical example of a four-segment instruction pipeline.
The instruction cycle is completed in four segments.
Segment 1:
The instruction fetch segment can be implemented using first in, first out (FIFO) buffer.
Segment 2:
The instruction fetched from memory is decoded in the second segment, and eventually, the
effective address is calculated in a separate arithmetic circuit.
Segment 3:
The operands are fetched from memory.
Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
OUT-OF-ORDER EXECUTION
In computer engineering, out-of-order execution (or more formally dynamic execution) is
a paradigm used in most high-performance central processing units to make use of instruction cycles that
would otherwise be wasted. In this paradigm, a processor executes instructions in an order governed by
the availability of input data and execution units, rather than by their original order in a program. In
doing so, the processor can avoid being idle while waiting for the preceding instruction to complete and
can, in the meantime, process the next instructions that are able to run immediately and independently.
REGISTER RENAMING
Register renaming is a form of pipelining that deals with data dependences
between instructions by renaming their register operands. An assembly
language programmer or a compiler specifies these operands
using architectural registers - the registers that are explicit in the instruction
set architecture. Renaming replaces architectural register names by, in effect,
value names, with a new value name for each instruction destination operand.
This eliminates the name dependences (output dependences and
antidependences) between instructions and automatically recognizes true
dependences.
The recognition of true data dependences between instructions permits a
more flexible life cycle for instructions. By maintaining a status bit for each
value indicating whether or not it has been computed yet, it allows the
execution phase of two instruction operations to be performed out of order
when there are no true data dependences between them. This is called out-of-
order execution.
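A minimal sketch of the renaming idea (instructions are hypothetical 3-tuples of architectural register names; memory operands and branch recovery are ignored):

    # Rename architectural destination registers to fresh value names.
    # Instructions are (dest, src1, src2) tuples.
    def rename(instructions):
        mapping = {}          # architectural register -> current value name
        next_name = 0
        renamed = []
        for dest, src1, src2 in instructions:
            s1 = mapping.get(src1, src1)       # read current value names
            s2 = mapping.get(src2, src2)
            mapping[dest] = f"v{next_name}"    # fresh name per destination operand
            next_name += 1
            renamed.append((mapping[dest], s1, s2))
        return renamed

    # R1 = R2+R3; R2 = R1+R1; R1 = R2+R3 -- the reuse of R1/R2 (name
    # dependences) disappears; only the true data dependences remain.
    print(rename([("R1", "R2", "R3"), ("R2", "R1", "R1"), ("R1", "R2", "R3")]))
    # [('v0', 'R2', 'R3'), ('v1', 'v0', 'v0'), ('v2', 'v1', 'R3')]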
BRANCH PREDICTION
Branch prediction is a technique used in CPU design that attempts to guess the outcome of
a conditional operation and prepare for the most likely result. A digital circuit that performs
this operation is known as a branch predictor. It is an important component of modern
CPU architectures, such as the x86.
The first time a conditional operation is seen, the branch predictor does not have
much information to use as the basis of a guess. But the more frequently the same
operation is used, the more accurate its guess can become.
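A common concrete scheme (a sketch, not claimed from these notes) is the 2-bit saturating counter, one per branch; real predictors index a table of such counters by branch address:

    # Two-bit saturating-counter branch predictor.
    # States 0,1 predict not-taken; states 2,3 predict taken.  Two consecutive
    # mispredictions are needed to flip the prediction, so one odd outcome does
    # not disturb a well-established pattern.
    class TwoBitPredictor:
        def __init__(self):
            self.state = 1             # weakly not-taken

        def predict(self):
            return self.state >= 2     # True = predict taken

        def update(self, taken):
            if taken:
                self.state = min(3, self.state + 1)
            else:
                self.state = max(0, self.state - 1)

    p = TwoBitPredictor()
    for outcome in [True, True, False, True]:    # actual branch history
        print(p.predict(), end=" ")              # False True True True
        p.update(outcome)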
Dependences are of three types:
1. Data dependences
2. Name dependences
3. Control dependences
DATA DEPENDENCE
A data dependency in computer science is a situation in which a program
statement (instruction) refers to the data of a preceding statement. In
compiler theory, the technique used to discover data dependencies among
statements (or instructions) is called dependence analysis.
For example, consider the following code sequence that increments a vector of
values in memory (starting at 0(R1) and with the last element at 8(R2)) by a
scalar in register F2:
Loop: L.D    F0,0(R1)     ; F0 = array element
      ADD.D  F4,F0,F2     ; add scalar in F2
      S.D    F4,0(R1)     ; store result
      DADDUI R1,R1,#-8    ; decrement pointer by 8 bytes (per doubleword)
      BNE    R1,R2,Loop   ; branch if R1 != R2
The dependence implies that there would be a chain of one or more data hazards between the two instructions. Executing the instructions simultaneously will cause a processor with pipeline interlocks to detect a hazard and stall, thereby reducing or eliminating the overlap. Dependences are a property of programs.
The presence of the dependence indicates the potential for a hazard, but the actual hazard and the length of any stall is a property of the pipeline. The importance of the data dependences is that a dependence (1) indicates the possibility of a hazard, (2) determines the order in which results must be calculated, and (3) sets an upper bound on how much parallelism can possibly be exploited.
NAME DEPENDENCES
The name dependence occurs when two instructions use the same register or
memory location, called a name, but there is no flow of data between the
instructions associated with that name.
There are two types of name dependences between an instruction i that precedes
instruction j in program order:
• An antidependence between instruction i and instruction j occurs when instruction j writes a register or memory location that instruction i reads. The original ordering must be preserved to ensure that i reads the correct value.
• An output dependence occurs when instruction i and instruction j write the same register or memory location. The ordering must be preserved to ensure that the value finally written corresponds to instruction j.
CONTROL DEPENDENCES:
Consider, for example, the statements
if (p1) { S1; }
if (p2) { S2; }
S1 is control dependent on p1, and S2 is control dependent on p2 but not on p1. In general, there are two constraints imposed by control dependences:
1. An instruction that is control dependent on a branch cannot be moved before the branch, so that its execution is no longer controlled by the branch.
2. An instruction that is not control dependent on a branch cannot be moved after the branch, so that its execution is controlled by the branch.
HAZARDS
A hazard is a situation that prevents the next instruction in the instruction stream from executing during its designated clock cycle. Hazards reduce the performance from the ideal speedup gained by pipelining.
ARCHITECTURE:
Instruction-level parallelism is achieved when multiple operations are performed in a single cycle, either by executing them simultaneously or by utilizing the gaps between two successive operations that are created by their latencies.
3. Independence Architecture:
Here, the program gives information about which operations are independent of each other, so that they can be executed instead of the 'nop's.
In order to apply ILP, the compiler and hardware must determine the data dependences, the independent operations, the scheduling of these independent operations, the assignment of functional units, and the registers to store data.
SUPERSCALAR PROCESSOR
A superscalar processor is a specific type of microprocessor that uses instruction-level parallelism to execute more than one instruction during a clock cycle. This depends on analysis of the instructions to be carried out and the use of multiple execution units to issue these instructions in parallel.
SUPERSCALAR DESIGN TECHNIQUES TYPICALLY INCLUDE
1. Parallel instruction decoding
2. Register renaming
3. Out-of-order execution
4. ILP
• Deeper pipelines
MULTICORE SYSTEMS
Along with developments like the superscalar design that use microprocessor innovation to speed up the implementation of multiple instructions, the microprocessor industry has also seen the emergence of multicore design, where builders simply incorporate more than one processor or core into a multicore CPU.
APPLICATIONS
Multicore technology is very effective in challenging tasks and applications such as encoding, 3-D gaming, and video editing.
PREVIOUS YEAR QUESTIONS
1. Define superscalar (3 marks) (2019)
2. Define hazards (3 marks) (2020)
3. What is an instruction pipeline? (3 marks) (2020)