CS31001 COMPUTER ORGANIZATION AND ARCHITECTURE
Debdeep Mukhopadhyay, CSE, IIT Kharagpur
Datapath Elements and Their Designs
Why Datapaths?
• Datapath elements include shifters, adders, multipliers, etc.
• The speed of these elements often dominates overall system performance, so optimization techniques are important.
• However, as we will see, the task is non-trivial: there are multiple equivalent logic and circuit topologies to choose from, each with advantages and disadvantages in terms of speed, power and area.
Bit-slicing method of constructing ALU
• Bit slicing is a technique for constructing a processor from modules of smaller bit width.
• Each of these components processes one bit field or "slice" of an operand.
• Grouped together, the components can process the full word length chosen for a particular software design.
Bit slicing
How can we develop architectures which are bit sliced?
Shifters
Sel1  Sel0  Operation  Function
0     0     Y<-A       No shift
0     1     Y<-shlA    Shift left
1     0     Y<-shrA    Shift right
1     1     Y<-0       Zero outputs

What would be a bit sliced architecture of this simple shifter?

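Using Muxes
[Figure: bit-sliced shifter built from three 4-to-1 multiplexers, one per output bit, all selected by Con[1:0]. The multiplexer for Y[i] chooses among A[i] (no shift), A[i-1] (shift left), A[i+1] (shift right) and 0; for Y[0] the shift-left input is 0, and for Y[2] the shift-right input is 0.]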
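One possible answer to the bit-slicing question is to describe a single slice and replicate it. The following is a minimal sketch under stated assumptions, not the course's reference code: the module names shift_slice and shifter_sliced and the port names a_self, a_lower and a_upper are hypothetical, chosen only for illustration.

```verilog
// One bit slice: a 4-to-1 multiplexer selected by Con[1:0].
// Module and port names are illustrative assumptions.
module shift_slice(Con, a_self, a_lower, a_upper, y);
  input  [1:0] Con;
  input  a_self;   // A[i]:   selected when Con = 0 (no shift)
  input  a_lower;  // A[i-1]: selected when Con = 1 (shift left)
  input  a_upper;  // A[i+1]: selected when Con = 2 (shift right)
  output y;

  assign y = (Con == 2'b00) ? a_self  :
             (Con == 2'b01) ? a_lower :
             (Con == 2'b10) ? a_upper : 1'b0;  // Con = 3: zero output
endmodule

// 3-bit shifter assembled from three identical slices.
module shifter_sliced(Con, A, Y);
  input  [1:0] Con;
  input  [2:0] A;
  output [2:0] Y;

  // Neighbours that fall outside A[2:0] are tied to 0.
  shift_slice s0(Con, A[0], 1'b0, A[1], Y[0]);
  shift_slice s1(Con, A[1], A[0], A[2], Y[1]);
  shift_slice s2(Con, A[2], A[1], 1'b0, Y[2]);
endmodule
```

Widening the shifter then only means adding slices; the slice itself does not change.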
Verilog Code
module shifter(Con,A,Y);
input [1:0] Con;
input [2:0] A;
output [2:0] Y;
reg [2:0] Y;
always @(A or Con)
begin
  case(Con)
    0: Y=A;          /* no shift */
    1: Y=A<<1;       /* shift left */
    2: Y=A>>1;       /* shift right */
    default: Y=3'b0; /* zero outputs */
  endcase
end
endmodule
Combinational logic shifters with shiftin and shiftout
Sel  Operation                                          Function
0    Y<=A,      ShiftLeftOut=0,    ShiftRightOut=0      No shift
1    Y<=shl(A), ShiftLeftOut=A[5], ShiftRightOut=0      Shift left
2    Y<=shr(A), ShiftLeftOut=0,    ShiftRightOut=A[0]   Shift right
3    Y<=0,      ShiftLeftOut=0,    ShiftRightOut=0      Zero outputs
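Verilog Code
/* Shown for a 3-bit A, so A_wide and Y_wide are 5 bits wide;
   the 6-bit A of the table needs 8-bit A_wide and Y_wide. */
always @(Sel or A or ShiftLeftIn or ShiftRightIn)
begin
  A_wide={ShiftLeftIn,A,ShiftRightIn}; /* pad A with the shift-in bit on each side */
  case(Sel)
    0: Y_wide={1'b0,A,1'b0}; /* no shift: both shift-outs are 0, as in the table */
    1: Y_wide=A_wide<<1;     /* MSB of A moves into Y_wide[4] */
    2: Y_wide=A_wide>>1;     /* LSB of A moves into Y_wide[0] */
    3: Y_wide=5'b0;
    default: Y_wide=A_wide;
  endcase
  ShiftLeftOut=Y_wide[4];
  Y=Y_wide[3:1];
  ShiftRightOut=Y_wide[0];
end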
Combinational 6 bit Barrel Shifter
Sel  Operation    Function
0    Y<-A         No shift
1    Y<-A rol 1   Rotate once
2    Y<-A rol 2   Rotate twice
3    Y<-A rol 3   Rotate thrice
4    Y<-A rol 4   Rotate four times
5    Y<-A rol 5   Rotate five times
Verilog Coding
function [5:0] rotate_left; /* returns the rotated 6-bit value */
input [5:0] A;
input [2:0] NumberShifts;
reg [5:0] Shifting;
integer N;
begin
  Shifting = A;
  for(N=1;N<=NumberShifts;N=N+1)
  begin
    Shifting={Shifting[4:0],Shifting[5]}; /* rotate left by one position */
  end
  rotate_left=Shifting;
end
endfunction
Verilog
always @(Rotate or A)
begin
  case(Rotate)
    0: Y=A;
    1: Y=rotate_left(A,1);
    2: Y=rotate_left(A,2);
    3: Y=rotate_left(A,3);
    4: Y=rotate_left(A,4);
    5: Y=rotate_left(A,5);
    default: Y=6'bx;
  endcase
end
Another Way
[Figure: shifter with two n-bit inputs, data 1 and data 2, and one n-bit output.]
Code is left as an exercise…

Single-Bit Addition
Half Adder (inputs A, B):
S = A ⊕ B
Cout = A·B

A  B  |  Co  S
0  0  |  0   0
0  1  |  0   1
1  0  |  0   1
1  1  |  1   0

Full Adder (inputs A, B, C):
S = A ⊕ B ⊕ C
Cout = MAJ(A, B, C)

A  B  C  |  Co  S
0  0  0  |  0   0
0  0  1  |  0   1
0  1  0  |  0   1
0  1  1  |  1   0
1  0  0  |  0   1
1  0  1  |  1   0
1  1  0  |  1   0
1  1  1  |  1   1
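Carry-Ripple Adder
• Simplest design: cascade full adders
• Critical path goes from Cin to Cout
• Design full adder to have fast carry delay
[Figure: four cascaded full adders with inputs A4 B4, A3 B3, A2 B2, A1 B1; the carry ripples from Cin through C1, C2, C3 to Cout, and the sum outputs are S4, S3, S2, S1.]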
Full adder
• Computes one-bit sum, carry:
    • s_i = a_i XOR b_i XOR c_i
    • c_{i+1} = a_i b_i + a_i c_i + b_i c_i
• Half adder adds two bits and produces a two-bit result (sum and carry).
• Ripple-carry adder: n-bit adder built from full adders.
• The critical-path delay of a ripple-carry adder passes through every carry bit.
Verilog for full adder
module fulladd(a,b,carryin,sum,carryout);
input a, b, carryin; /* add these bits*/
output sum, carryout; /* results */

assign {carryout, sum} = a + b + carryin;

/* compute the sum and carry */
endmodule
Verilog for ripple-carry adder
module nbitfulladd(a,b,carryin,sum,carryout);
input [7:0] a, b; /* add these bits */
input carryin; /* carry in*/
output [7:0] sum; /* result */
output carryout;
wire [7:1] carry; /* transfers the carry between bits */

fulladd a0(a[0],b[0],carryin,sum[0],carry[1]);
fulladd a1(a[1],b[1],carry[1],sum[1],carry[2]);
…
fulladd a7(a[7],b[7],carry[7],sum[7],carryout);
endmodule
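The same cascade can be written once for any width with a generate loop. This is a sketch assuming a Verilog-2001 tool; the module name ripple_add and the parameter WIDTH are illustrative and do not come from the slides. The full-adder equations are written inline so the block stands alone.

```verilog
// Parameterised ripple-carry adder (illustrative names, Verilog-2001 generate).
module ripple_add #(parameter WIDTH = 8) (a, b, carryin, sum, carryout);
  input  [WIDTH-1:0] a, b;
  input  carryin;
  output [WIDTH-1:0] sum;
  output carryout;
  wire   [WIDTH:0] carry;   // carry[i] is the carry into bit i

  assign carry[0] = carryin;
  assign carryout = carry[WIDTH];

  genvar i;
  generate
    for (i = 0; i < WIDTH; i = i + 1) begin : stage
      // One full adder per bit position.
      assign sum[i]     = a[i] ^ b[i] ^ carry[i];
      assign carry[i+1] = (a[i] & b[i]) | (a[i] & carry[i]) | (b[i] & carry[i]);
    end
  endgenerate
endmodule
```

The critical path still runs through every carry[i], so this changes only how the adder is written, not its delay.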
Generate and Propagate
Two methods to develop C[i] and S[i]:

Method 1                        Method 2
G[i] = A[i].B[i]                G[i] = A[i].B[i]
P[i] = A[i] ⊕ B[i]              P[i] = A[i] + B[i]
C[i] = G[i] + P[i].C[i−1]       C[i] = G[i] + P[i].C[i−1]
S[i] = P[i] ⊕ C[i−1]            S[i] = A[i] ⊕ B[i] ⊕ C[i−1]

Both are correct
• The two definitions of P[i] differ only when A[i]=1 and B[i]=1, and that case is already taken care of by the term G[i] = A[i].B[i].
• How do we make an n bit adder?
• The delay of the adder chain needs to be optimized.
Carry-lookahead adder
• First compute carry propagate, generate:
    • P_i = a_i + b_i
    • G_i = a_i b_i
• Compute sum and carry from P and G:
    • s_i = c_i XOR P_i XOR G_i (valid because P_i XOR G_i = a_i XOR b_i)
    • c_{i+1} = G_i + P_i c_i
Carry-lookahead expansion
• Can recursively expand carry formula:
    • c_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} c_{i-1})
    • c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} (G_{i-2} + P_{i-2} c_{i-2})
• Expanded formula does not depend on intermediate carries.
• Allows carry for each bit to be computed independently.
Depth-4 carry-lookahead
Analysis
• As we look ahead further, the logic becomes complicated.
• Takes longer to compute
• Becomes less regular.
• There is no similarity of logic structure in each cell.
• More regular lookahead structures have been developed, such as the Brent-Kung adder.
Verilog for carry-lookahead carry block
module carry_block(a,b,carryin,carry);
input [3:0] a, b; /* add these bits*/
input carryin; /* carry into the block */
output [3:0] carry; /* carries for each bit in the block */
wire [3:0] g, p; /* generate and propagate */

assign g[0] = a[0] & b[0]; /* generate 0 */
assign p[0] = a[0] ^ b[0]; /* propagate 0 */
assign g[1] = a[1] & b[1]; /* generate 1 */
assign p[1] = a[1] ^ b[1]; /* propagate 1 */
…
/* each carry follows ci+1 = Gi + Pi(Gi-1 + Pi-1ci-1), expanded down to carryin */
assign carry[0] = g[0] | (p[0] & carryin);
assign carry[1] = g[1] | p[1] & (g[0] | (p[0] & carryin));
assign carry[2] = g[2] | p[2] &
  (g[1] | p[1] & (g[0] | (p[0] & carryin)));
assign carry[3] = g[3] | p[3] &
  (g[2] | p[2] & (g[1] | p[1] & (g[0] | (p[0] & carryin))));

endmodule
Verilog for carry-lookahead sum unit
module sum(a,b,carryin,result);
input a, b, carryin; /* add these bits*/
output result; /* sum */

assign result = a ^ b ^ carryin;

/* compute the sum */
endmodule
Verilog for carry-lookahead adder
module carry_lookahead_adder(a,b,carryin,sum,carryout);
input [15:0] a, b; /* add these together */
input carryin;
output [15:0] sum; /* result */
output carryout;
wire [16:1] carry; /* intermediate carries */

assign carryout = carry[16]; /* for simplicity */

/* build the carry-lookahead units */
carry_block b0(a[3:0],b[3:0],carryin,carry[4:1]);
carry_block b1(a[7:4],b[7:4],carry[4],carry[8:5]);
carry_block b2(a[11:8],b[11:8],carry[8],carry[12:9]);
carry_block b3(a[15:12],b[15:12],carry[12],carry[16:13]);
/* build the sum */
sum a0(a[0],b[0],carryin,sum[0]);
sum a1(a[1],b[1],carry[1],sum[1]);
…
sum a15(a[15],b[15],carry[15],sum[15]);
endmodule
Dealing with the problem of carry propagation
1. Reduce the carry propagation time.
2. Detect the completion of carry propagation.
We have seen some ways to do the former. How do we do the second one?
Motivation
Carry Completion Sensing
A = 0 0 1 1 1 0 1 1 0 1 1 0 1 1 0 1
B = 0 1 0 0 1 1 1 0 0 0 0 1 0 1 0 1
---------------------------------------------
Carry chain lengths: 4 1 5 1
Can we compute the average length of carry chain?
• What is the probability that a chain generated at position i terminates at j?
• It terminates at j if the inputs A[j] and B[j] are both 0 or both 1.
• From i+1 to j-1 the carry has to propagate.
• p = (1/2)^(j-i)
• So, what is the expected length?
• Define a random variable L, which denotes the length of the chain.
Expected length
• The chain can terminate at j=i+1 to j=k (the MSB position of the adder)
• Thus L=j-i for a choice of j.
• Thus expected length is: approximately 2!

\[
E[L] = \sum_{j=i+1}^{k-1} (j-i)\,2^{-(j-i)} + (k-i)\,2^{-(k-1-i)}
\]

(the carry definitely ends at position k, so we do not multiply $2^{-(k-1-i)}$ with 1/2.)

\[
E[L] = \sum_{l=1}^{k-1-i} l\,2^{-l} + (k-i)\,2^{-(k-1-i)}
     = 2 - (k-i+1)\,2^{-(k-1-i)} + (k-i)\,2^{-(k-1-i)}
     = 2 - 2^{-(k-1-i)}
\]

[Using $\sum_{l=1}^{p} l\,2^{-l} = 2 - (p+2)\,2^{-p}$]
Carry completion sensing adder
A=011101101101101
B=100111000010101
Each step updates C and N from the values of the previous step:

          C                 N
Start     000000000000000   000000000000000
Step 1    000101000000101   000000010000010
Step 2    001111000001101   000000110000010
Step 3    011111000011101   000000110000010
Step 4    111111000111101   000000110000010
Step 5    111111001111101   000000110000010

After step 5 every position has either C or N equal to 1, so the carries are complete.
Carry completion sensing adder
• (A[i],B[i])=(0,0) => (C_i,N_i)=(0,1)
• (A[i],B[i])=(1,1) => (C_i,N_i)=(1,0)
• (A[i],B[i])=(0,1) => (C_i,N_i)=(C_{i-1},N_{i-1})
• (A[i],B[i])=(1,0) => (C_i,N_i)=(C_{i-1},N_{i-1})
• Stop when, for all i, C_i ∨ N_i = 1
Justification
• C_i and N_i together form a code for the carry.
• When the carry is known to be 1, set C_i=1 and N_i=0.
• When the carry is known to be 0, indicate this by N_i=1 (with C_i=0).
• The carry can be stated with certainty when A_i and B_i are both 1 or both 0.
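The dual-rail carry rules can be sketched as combinational Verilog. This is only an illustration under several assumptions: the module name ccs_carry, the parameter WIDTH and the internal wires c and n are hypothetical; the carry-in is assumed to be already known, so it is encoded as (cin, ~cin); and the clearing of C and N to 0 before each addition, which a real self-timed circuit needs, is not modelled.

```verilog
// Dual-rail carry logic for carry completion sensing (illustrative sketch).
// C[i] = 1: the carry out of bit i is known to be 1.
// N[i] = 1: the carry out of bit i is known to be 0.
module ccs_carry #(parameter WIDTH = 15) (A, B, cin, C, N, done);
  input  [WIDTH-1:0] A, B;
  input  cin;
  output [WIDTH-1:0] C, N;
  output done;

  wire [WIDTH:0] c, n;      // c[i], n[i]: dual-rail carry INTO bit i
  assign c[0] = cin;        // assumption: carry-in is already resolved
  assign n[0] = ~cin;

  genvar i;
  generate
    for (i = 0; i < WIDTH; i = i + 1) begin : bitpos
      // Equal inputs decide the carry locally; unequal inputs pass it on.
      assign c[i+1] = (A[i] & B[i])   | ((A[i] ^ B[i]) & c[i]);
      assign n[i+1] = (~A[i] & ~B[i]) | ((A[i] ^ B[i]) & n[i]);
    end
  endgenerate

  assign C = c[WIDTH:1];
  assign N = n[WIDTH:1];
  assign done = &(C | N);   // stop condition: C_i OR N_i = 1 for all i
endmodule
```

In a zero-delay simulation done is asserted as soon as the values settle. In hardware it rises only when the longest carry chain has finished, which is what makes completion sensing useful.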
Carry-skip adder
• Looks for cases in which carry out of a set of bits is identical to carry in.
• Typically organized into b-bit stages.
• Can bypass the carry around all bits of a group when all propagates are true: P_i P_{i+1} … P_{i+b-1} = 1.
• The group produces a carry out when the last bit in the group produces a carry out or when the carry is bypassed.
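Carry-skip structure
[Figure: skip logic for one group. An AND gate combines c_i with P_i, P_{i+1}, …, P_{i+b-1}; an OR gate combines the AND output with C_{i+b-1}, the carry out of the last bit in the group.]
Carry-skip structure
[Figure: three groups of b adder stages in cascade, starting from Cin. Each group has a carry out and a skip path controlled by its block propagate signal: P[0,b-1], P[b,2b-1] and P[2b,3b-1].]
Worst-case carry-skip
• The worst-case carry-propagation path ripples through the first and last stages and skips the stages in between.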
Verilog for carry-skip add with P
module fulladd_p(a,b,carryin,sum,carryout,p);
input a, b, carryin; /* add these bits*/
output sum, carryout, p; /* results including propagate */

assign {carryout, sum} = a + b + carryin;

/* compute the sum and carry */
assign p = a ^ b;
endmodule
Want to use ripple carry adder for the blocks
module fulladd_p(a,b,carryin,sum,carryout,p);
input a, b, carryin; /* add these bits*/
output sum, carryout, p; /* results including propagate */
$rtl_binding="ADD3_RPL";
assign {carryout, sum} = a + b + carryin;
/* compute the sum and carry */
assign p = a ^ b;
endmodule

The $rtl_binding line is a directive to a synthesis tool!

Verilog for carry-skip adder
module carryskip(a,b,carryin,sum,carryout);
input [7:0] a, b; /* add these bits */
input carryin; /* carry in*/
output [7:0] sum; /* result */
output carryout;
wire [8:1] carry; /* transfers the carry between bits */
wire [7:0] p; /* propagate for each bit */
wire cs4; /* final carry for first group */

fulladd_p a0(a[0],b[0],carryin,sum[0],carry[1],p[0]);
fulladd_p a1(a[1],b[1],carry[1],sum[1],carry[2],p[1]);
fulladd_p a2(a[2],b[2],carry[2],sum[2],carry[3],p[2]);
fulladd_p a3(a[3],b[3],carry[3],sum[3],carry[4],p[3]);
assign cs4 = carry[4] | (p[0] & p[1] & p[2] & p[3] & carryin);
fulladd_p a4(a[4],b[4],cs4, sum[4],carry[5],p[4]);
…
assign carryout = carry[8] | (p[4] & p[5] & p[6] & p[7] & cs4);
endmodule
Delay analysis
• Assume that skip delay = 1 bit carry delay.
• Delay of k-bit adder with block size b:
    • T = (b-1) + 0.5 + (k/b - 2) + (b-1)
    • The four terms are, in order: ripple through block 0, the OR gate, the skips, and ripple through the last block.
• For equal sized blocks, optimal block size is sqrt(k/2).
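The stated optimum follows from minimising the delay expression, treating b as a continuous variable. That is an idealisation, since in practice b must be an integer that divides k.

```latex
\begin{aligned}
T(b) &= (b-1) + 0.5 + \left(\frac{k}{b} - 2\right) + (b-1) = 2b + \frac{k}{b} - 3.5,\\
\frac{dT}{db} &= 2 - \frac{k}{b^{2}} = 0 \;\Longrightarrow\; b_{\mathrm{opt}} = \sqrt{k/2},\\
T(b_{\mathrm{opt}}) &= 2\sqrt{2k} - 3.5 .
\end{aligned}
```

For example, k = 32 gives b = 4 and T = 12.5 bit-carry delays under the slide's delay assumptions.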
Delay of Carry-Skip Adder
t_d = 2(k − 1) t_RCA + (N/(2k) − 2) t_SKIP
[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder; the curves cross at about N = 4..8.]
Carry-select adder
• Computes two results in parallel, each for different carry input assumptions.
• Uses actual carry in to select correct result.
• Reduces the delay from carry in to result to that of a multiplexer.
Carry-select structure
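A minimal Verilog sketch of the idea, assuming an 8-bit adder split into two 4-bit halves. The module name carry_select8 and the signal names are illustrative and are not taken from the slides.

```verilog
// 8-bit carry-select adder sketch: the low nibble is added directly; the high
// nibble is added twice (for carry-in 0 and carry-in 1) and then selected.
module carry_select8(a, b, carryin, sum, carryout);
  input  [7:0] a, b;
  input  carryin;
  output [7:0] sum;
  output carryout;

  wire       c4;         // actual carry out of the low nibble
  wire [3:0] hi0, hi1;   // high-nibble sums for carry-in 0 and 1
  wire       co0, co1;   // matching carry-outs

  assign {c4,  sum[3:0]} = a[3:0] + b[3:0] + carryin;
  assign {co0, hi0}      = a[7:4] + b[7:4];          // assumes carry-in = 0
  assign {co1, hi1}      = a[7:4] + b[7:4] + 1'b1;   // assumes carry-in = 1

  // The real carry only drives the multiplexers.
  assign sum[7:4] = c4 ? hi1 : hi0;
  assign carryout = c4 ? co1 : co0;
endmodule
```

The price is area: the high half of the adder is duplicated so that c4 arrives at a multiplexer select input instead of at the start of another carry chain.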
Carry-save adder
• Useful in multiplication.
• Input: 3 n-bit operands.
• Output: n-bit partial sum, n-bit carry.
• Use carry propagate adder for final sum.
• Operations (for each bit position):
    • s = (x + y + z) mod 2.
    • c = [(x + y + z) - s] / 2.
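The per-bit operations map directly onto bitwise logic. The sketch below assumes an illustrative module name carry_save and parameter WIDTH, neither of which appears in the slides.

```verilog
// Carry-save adder: reduces three operands to a sum word and a carry word
// with no carry propagation between bit positions (illustrative names).
module carry_save #(parameter WIDTH = 8) (x, y, z, s, c);
  input  [WIDTH-1:0] x, y, z;
  output [WIDTH-1:0] s, c;

  assign s = x ^ y ^ z;                     // per bit: (x + y + z) mod 2
  assign c = (x & y) | (x & z) | (y & z);   // per bit: carry, worth twice s
endmodule
```

The three-operand total is s + 2c, so the carry word must be shifted left by one position before the final carry-propagate addition.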
Carry Network is the Essence of a Fast Adder
g_i = x_i y_i
p_i = x_i ⊕ y_i

g_i  p_i  Carry is:
0    0    annihilated or killed
0    1    propagated
1    0    generated
1    1    (impossible)

[Figure: generic structure of a binary adder, highlighting its carry network. The inputs x_i, y_i produce g_i and p_i for positions k−1 down to 0; the carry network takes these together with c_0 and produces c_1 … c_k. The carry network may be ripple, skip, lookahead, or parallel-prefix.]
Apr. 2012 Computer Arithmetic, Addition/Subtraction Slide 52
Ripple-Carry Adder Revisited
The carry recurrence: c_{i+1} = g_i ∨ p_i c_i

Latency of k-bit adder is roughly 2k gate delays:

1 gate delay for production of p and g signals, plus
2(k – 1) gate delays for carry propagation, plus
1 XOR gate delay for generation of the sum bits

[Figure: alternate view of a ripple-carry network in connection with the generic adder structure shown in Fig. 5.14; the carries c_0 … c_k ripple through cells fed by g_0 p_0 … g_{k−1} p_{k−1}.]
The Complete Design of a Ripple-Carry Adder
[Figure: the generic adder structure (g_i = x_i y_i, p_i = x_i ⊕ y_i, carry network producing c_1 … c_k from c_0), completed with a ripple-carry network.]


6.1 Unrolling the Carry Recurrence
Recall the generate, propagate, annihilate (absorb), and transfer signals:

Signal  Radix r                        Binary
g_i     is 1 iff x_i + y_i ≥ r         x_i y_i
p_i     is 1 iff x_i + y_i = r – 1     x_i ⊕ y_i
a_i     is 1 iff x_i + y_i < r – 1     x_i′ y_i′ = (x_i ∨ y_i)′
t_i     is 1 iff x_i + y_i ≥ r – 1     x_i ∨ y_i
s_i     (x_i + y_i + c_i) mod r        x_i ⊕ y_i ⊕ c_i

The carry recurrence can be unrolled to obtain each carry signal directly from inputs, rather than through propagation.
Note: + is the addition symbol, whereas ∨ is logical OR.

c_i = g_{i–1} ∨ c_{i–1} p_{i–1}
    = g_{i–1} ∨ (g_{i–2} ∨ c_{i–2} p_{i–2}) p_{i–1}
    = g_{i–1} ∨ g_{i–2} p_{i–1} ∨ c_{i–2} p_{i–2} p_{i–1}
    = g_{i–1} ∨ g_{i–2} p_{i–1} ∨ g_{i–3} p_{i–2} p_{i–1} ∨ c_{i–3} p_{i–3} p_{i–2} p_{i–1}
    = g_{i–1} ∨ g_{i–2} p_{i–1} ∨ g_{i–3} p_{i–2} p_{i–1} ∨ g_{i–4} p_{i–3} p_{i–2} p_{i–1} ∨ c_{i–4} p_{i–4} p_{i–3} p_{i–2} p_{i–1}
    = ...
Full Carry Lookahead
[Figure: the inputs x3 y3, x2 y2, x1 y1, x0 y0 and cin feed separate circuits that produce s3, s2, s1, s0 directly.]
Theoretically, it is possible to derive each sum digit directly from the inputs that affect it.
Carry-lookahead adder design is simply a way of reducing the complexity of this ideal, but impractical, arrangement by hardware sharing among the various lookahead circuits.
Four-Bit Carry-Lookahead Adder
Complexity reduced by deriving the carry-out indirectly
Full carry lookahead is quite practical for a 4-bit adder

c1 = g0 ∨ c0 p0
c2 = g1 ∨ g0 p1 ∨ c0 p0 p1
c3 = g2 ∨ g1 p2 ∨ g0 p1 p2 ∨ c0 p0 p1 p2
c4 = g3 ∨ g2 p3 ∨ g1 p2 p3 ∨ g0 p1 p2 p3 ∨ c0 p0 p1 p2 p3

[Figure: four-bit carry network with full lookahead, producing c1 … c4 from g0 … g3, p0 … p3 and c0.]
Carry Lookahead Beyond 4 Bits
Consider a 32-bit adder
No circuit sharing: repeated computations

c1 = g0 ∨ c0 p0
c2 = g1 ∨ g0 p1 ∨ c0 p0 p1
c3 = g2 ∨ g1 p2 ∨ g0 p1 p2 ∨ c0 p0 p1 p2
. . .
c31 = g30 ∨ g29 p30 ∨ g28 p29 p30 ∨ g27 p28 p29 p30 ∨ . . . ∨ c0 p0 p1 p2 p3 ... p29 p30

The last product term of c31 needs a 32-input AND, and the terms are combined by a 32-input OR.
High fan-ins necessitate tree-structured circuits
Solution to the Fan-in Problem
High-radix addition (i.e., radix 2^h)
  Increases the latency for generating g and p signals and sum digits, but simplifies the carry network (optimal radix?)

Multilevel lookahead

Example: 16-bit addition
  Radix-16 (four digits)
  Two-level carry lookahead (four 4-bit blocks)

Either way, the carries c4, c8, and c12 are determined first

[Figure: the carries c16 (cout) down to c0 (cin), with c12, c8 and c4 marked "?" as the carries to determine first.]


Carry-Lookahead Adder Design
Block generate and propagate signals

g[i,i+3] = g_{i+3} ∨ g_{i+2} p_{i+3} ∨ g_{i+1} p_{i+2} p_{i+3} ∨ g_i p_{i+1} p_{i+2} p_{i+3}
p[i,i+3] = p_i p_{i+1} p_{i+2} p_{i+3}

[Figure: schematic diagram of a 4-bit lookahead carry generator. Inputs g_{i+3} p_{i+3} … g_i p_i and c_i; outputs c_{i+3}, c_{i+2}, c_{i+1}, g[i,i+3] and p[i,i+3].]
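The block generate and propagate equations, together with the unrolled intermediate carries, can be sketched as one Verilog module. The module name lookahead_gen4 and the port names gblk and pblk are illustrative assumptions; index 0 in the vectors stands for bit i of the block and index 3 for bit i+3.

```verilog
// 4-bit lookahead carry generator: intermediate carries plus block g and p.
module lookahead_gen4(g, p, cin, c, gblk, pblk);
  input  [3:0] g, p;     // bit generate and propagate signals
  input  cin;            // c_i, the carry into the block
  output [3:1] c;        // c_{i+1}, c_{i+2}, c_{i+3}
  output gblk, pblk;     // g[i,i+3] and p[i,i+3]

  assign c[1] = g[0] | (p[0] & cin);
  assign c[2] = g[1] | (g[0] & p[1]) | (p[0] & p[1] & cin);
  assign c[3] = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2])
              | (p[0] & p[1] & p[2] & cin);

  assign gblk = g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
              | (g[0] & p[1] & p[2] & p[3]);
  assign pblk = p[0] & p[1] & p[2] & p[3];
endmodule
```

The block's own carry-out is not computed here. It is available as gblk | (pblk & cin), or it can be left to a second-level generator that treats gblk and pblk as if they were bit signals.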

A Building Block for Carry-Lookahead Addition
[Figure: a 4-bit carry network (inputs g0 … g3, p0 … p3 and c0; outputs c1 … c4) beside a 4-bit lookahead carry generator (inputs g_i … g_{i+3}, p_i … p_{i+3} and c_i). The generator produces the intermediate carries c_{i+1}, c_{i+2}, c_{i+3} and, through its block signal generation logic, g[i,i+3] and p[i,i+3].]
Combining Block g and p Signals
Block generate and propagate signals can be combined in the same way as bit g and p signals to form g and p signals for wider blocks
[Fig. 6.3: Combining of g and p signals of four (contiguous or overlapping) blocks of arbitrary widths into the g and p signals for the overall block [i0, j3]. A 4-bit lookahead carry generator takes the (g, p) pairs of the blocks [i0, j0] … [i3, j3] together with c_{i0}, and produces c_{j0+1}, c_{j1+1}, c_{j2+1} and the overall g and p.]
A Two-Level Carry-Lookahead Adder
[Fig. 6.4: Building a 64-bit carry-lookahead adder from 16 4-bit adders and 5 lookahead carry generators. Within each 16-bit carry-lookahead adder, a 4-bit lookahead carry generator takes g[0,3], p[0,3], …, g[12,15], p[12,15] and c0 and produces c4, c8, c12. A second-level generator takes g[0,15], p[0,15], …, g[48,63], p[48,63] and produces c16, c32, c48 together with g[0,63] and p[0,63].]

Carry-out: c_out = g[0,k–1] ∨ c0 p[0,k–1] = x_{k–1} y_{k–1} ∨ s_{k–1}′ (x_{k–1} ∨ y_{k–1})


Latency of a Multilevel Carry-Lookahead Adder
Latency through the 16-bit CLA adder consists of finding:

g and p for individual bit positions      1 gate level
g and p signals for 4-bit blocks          2 gate levels
Block carry-in signals c4, c8, and c12    2 gate levels
Internal carries within 4-bit blocks      2 gate levels
Sum bits                                  2 gate levels
Total latency for the 16-bit adder        9 gate levels

(compare to 32 gate levels for a 16-bit ripple-carry adder)

Each additional lookahead level adds 4 gate levels of latency

Latency for k-bit CLA adder: T_lookahead-add = 4 log_4 k + 1 gate levels


Carry Determination as Prefix Computation
(g, p) = (g′, p′) ¢ (g″, p″)
g = g″ + g′p″
p = p′p″
[Figure: combining of g and p signals of two (contiguous or overlapping) blocks B′ and B″ of arbitrary widths into the g and p signals for block B.]
Formulating the Prefix Computation Problem
The problem of carry determination can be formulated as:

Given  (g_0, p_0) (g_1, p_1) . . . (g_{k–2}, p_{k–2}) (g_{k–1}, p_{k–1})
Find   (g[0,0], p[0,0]) (g[0,1], p[0,1]) . . . (g[0,k–2], p[0,k–2]) (g[0,k–1], p[0,k–1])
       which give c_1, c_2, . . . , c_{k–1}, c_k

Carry-in can be viewed as an extra (−1) position: (g_{–1}, p_{–1}) = (c_in, 0)

The desired pairs are found by evaluating all prefixes of

(g_0, p_0) ¢ (g_1, p_1) ¢ . . . ¢ (g_{k–2}, p_{k–2}) ¢ (g_{k–1}, p_{k–1})

The carry operator ¢ is associative, but not commutative

[(g_1, p_1) ¢ (g_2, p_2)] ¢ (g_3, p_3) = (g_1, p_1) ¢ [(g_2, p_2) ¢ (g_3, p_3)]

Prefix sums analogy:

Given  x_0 x_1 x_2 . . . x_{k–1}
Find   x_0 x_0+x_1 x_0+x_1+x_2 . . . x_0+x_1+...+x_{k–1}

Example Prefix-Based Carry Network
[Fig. 6.6: Four-input parallel prefix sums network and its corresponding carry network.
(a) A 4-input prefix sums network: inputs 6, −1, 2, 5; outputs 12, 6, 7, 5 (scan order runs from the rightmost input).
(b) A 4-bit carry lookahead network: inputs (g3, p3), (g2, p2), (g1, p1), (g0, p0); outputs g[0,3], p[0,3] = (c4, --); g[0,2], p[0,2] = (c3, --); g[0,1], p[0,1] = (c2, --); g[0,0], p[0,0] = (c1, --).]
Brent-Kung Carry Network (8-Bit Adder)
[Figure: Brent-Kung carry network for an 8-bit adder. Inputs are the pairs for blocks [7,7] … [0,0]. The first level of ¢ operators forms [6,7], [4,5], [2,3] and [0,1]; the second forms [4,7] and [0,3]; the remaining levels produce the outputs [0,7], [0,6], …, [0,0]. An inset shows one ¢ operator combining (g[0,0], p[0,0]) and (g[1,1], p[1,1]) into (g[0,1], p[0,1]).]

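The 8-input Brent-Kung network can be sketched in Verilog from a single carry-operator cell. This is an illustration under stated assumptions rather than a reference design: the module names carry_op and brent_kung8, and the wire naming gX_Y/pX_Y for block [X,Y], are hypothetical. The carry-in is not handled here; with c0 = 0 the carry c_{i+1} equals gpre[i].

```verilog
// Carry operator: combines a less significant block (gl, pl) with the
// adjacent more significant block (gh, ph).
module carry_op(gl, pl, gh, ph, g, p);
  input  gl, pl, gh, ph;
  output g, p;
  assign g = gh | (gl & ph);
  assign p = pl & ph;
endmodule

// 8-input Brent-Kung carry network built from 11 carry operators.
module brent_kung8(g, p, gpre, ppre);
  input  [7:0] g, p;          // bit signals: block [i,i]
  output [7:0] gpre, ppre;    // prefix signals: block [0,i]

  wire g0_1, p0_1, g2_3, p2_3, g4_5, p4_5, g6_7, p6_7;
  wire g0_3, p0_3, g4_7, p4_7;

  // Level 1: adjacent pairs
  carry_op u01(g[0], p[0], g[1], p[1], g0_1, p0_1);
  carry_op u23(g[2], p[2], g[3], p[3], g2_3, p2_3);
  carry_op u45(g[4], p[4], g[5], p[5], g4_5, p4_5);
  carry_op u67(g[6], p[6], g[7], p[7], g6_7, p6_7);
  // Level 2: blocks of four
  carry_op u03(g0_1, p0_1, g2_3, p2_3, g0_3, p0_3);
  carry_op u47(g4_5, p4_5, g6_7, p6_7, g4_7, p4_7);
  // Level 3: [0,7] and [0,5]
  carry_op u07(g0_3, p0_3, g4_7, p4_7, gpre[7], ppre[7]);
  carry_op u05(g0_3, p0_3, g4_5, p4_5, gpre[5], ppre[5]);
  // Level 4: remaining even positions
  carry_op u02(g0_1,    p0_1,    g[2], p[2], gpre[2], ppre[2]);
  carry_op u04(g0_3,    p0_3,    g[4], p[4], gpre[4], ppre[4]);
  carry_op u06(gpre[5], ppre[5], g[6], p[6], gpre[6], ppre[6]);

  // Prefixes that need no further combining
  assign gpre[0] = g[0];  assign ppre[0] = p[0];
  assign gpre[1] = g0_1;  assign ppre[1] = p0_1;
  assign gpre[3] = g0_3;  assign ppre[3] = p0_3;
endmodule
```

The longest path passes through four operators, which matches 2 log2 k – 2 for k = 8.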

Brent-Kung Carry Network (16-Bit Adder)
[Figure: Brent-Kung parallel prefix graph for 16 inputs, from x15 … x0 at the top to s15 … s0 at the bottom, drawn in levels 1 to 6. The level structure is the reason for the latency being 2 log2 k – 2.]


Adder comparison
• Ripple-carry adder has highest performance/cost.
• Optimized adders are most effective in very long bit widths (> 48 bits).
ALUs
• ALU computes a variety of logical and arithmetic functions based on opcode.
• May offer complete set of functions of two variables or a subset.
• ALU built around adder, since carry chain determines delay.
