2018 International Conference on Advanced Technologies for Communications (ATC)

The Merged Clock Gating Architecture For Low Power Digital Clock Application On FPGA

Minh Huan Vo
Ho Chi Minh University of Technology and Education

978-1-5386-6542-8/18/$31.00 ©2018 IEEE

Abstract: We propose a novel merged clock gating architecture for designing a low power digital clock. The architecture groups all clock gating signals into a single clock gating signal and then uses one DEMUX gate to split that signal into the individual gated clocks. We compare the proposed technique with the conventional clock gating technique and with a design without clock gating in terms of clock power, dynamic power and total power consumption. Simulation results on a Spartan-3E show that the proposed architecture saves 3.45%, 26.53%, 50.69%, 53.15% and 53.13% of total power compared to no clock gating, and 1.19%, 1.85%, 11.83%, 14.76% and 15.67% compared to the conventional clock gating technique, at operating frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively. Moreover, the proposed technique reduces the operating temperature by 20.88% and 5.28% compared to no clock gating and conventional clock gating, respectively. The number of LUTs also decreases to 175, from 179 in the other two designs.

Keywords: Clock Gating, Low Power, Dynamic Power, Clock Power, Digital Clock.

I. INTRODUCTION

Reducing on-chip power consumption has become a major concern in the design of low power applications, particularly battery powered ones. The dynamic power consumed when the circuit switches is the main component of the total power consumption of the circuit [1]. The dynamic power is given by

$$P = \frac{1}{2} \alpha_{0 \to 1} C_{load} V_{DD}^{2} f$$

Here, $C_{load}$ is the load of the driven circuit, $V_{DD}$ is the supply voltage, $f$ is the operating frequency of the circuit, and $\alpha_{0 \to 1}$ is the switching coefficient [2]. Based on this equation, to reduce the dynamic power consumption we need to reduce the load driven by the control circuit, the supply voltage, or the switching frequency. Researchers have proposed various solutions at the device, circuit and technology levels to handle these parameters of the dynamic power consumption [3-4]. In particular, clock gating techniques have been studied and applied effectively to reduce the dynamic power in various applications [5-7]. Clock gating is a power optimization technique used in both ASIC and FPGA designs, which aims to eliminate unnecessary clock switching operations when the circuit is in active operation.

Normally, clock gating is implemented by identifying groups of registers that are controlled by a common enable signal. There are two clock gating styles. In the first, an AND gate combines the enable signal with the clock. Alternatively, a latch can be used to gate register clocks [8-9]. Inserting these additional gates into the circuit adds a power penalty to the total power consumption, so the clock gating cells must be chosen carefully to avoid excessive power overhead. To overcome this problem, we propose an optimized method of merged clock gating control.

In this paper, the author proposes a merged clock gating technique to reduce the dynamic power of a digital clock circuit design. As a result, the power consumption is reduced compared to the conventional clock gating technique. The author uses this clock gating technique to build an integrated digital clock application that counts seconds, minutes, hours, days of the month and months of the year, taking leap years into account. To evaluate the proposed method, we implemented it alongside the conventional clock gating technique and a design without clock gating, and compared the three in terms of power dissipation, operating temperature and area overhead.

II. DIGITAL CLOCK CIRCUIT DESIGN

Fig. 1: Block diagram of general digital clock circuit application

In Figure 1, the digital clock circuit consists of two blocks. The first is the digital clock block, which performs the function of a normal digital clock circuit. The second is the gating block, which plays a key role in reducing the power consumed by the digital clock by applying clock gating to the whole system to save dynamic power. Whether or not this low power clock gating technique is integrated, the digital clock block performs the same counting function. It counts seconds, minutes, hours, days, months, years and days of the week, including the number of days per month with leap years taken into account, using the registers s[5:0], m[5:0], h[5:0], d[5:0], mo[5:0], y[5:0] and dotw[5:0]. Here, the next_enable[5:0] signal determines whether the function of each block is enabled or disabled.

Fig. 2: The digital clock circuit counting seconds, minutes, hours, days, months and years.
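The counting function of the digital clock block described above can be sketched in software. What follows is a minimal behavioural sketch in Python, not the authors' HDL. The names `DigitalClock`, `tick` and `days_in_month` are hypothetical and introduced only for illustration, the year is held as a plain integer rather than a 6-bit register, leap years are simply taken as years divisible by 4, and the day-of-week register dotw is left out.

```python
from dataclasses import dataclass


def days_in_month(month: int, year: int) -> int:
    """Days in a month; a year counts as leap when it is divisible by 4."""
    if month == 2:
        # Shift right then left by 2 bits; equality means year % 4 == 0.
        return 29 if ((year >> 2) << 2) == year else 28
    return 30 if month in (4, 6, 9, 11) else 31


@dataclass
class DigitalClock:
    """Hypothetical software stand-in for the digital clock block."""
    s: int = 0      # seconds, 0..59
    m: int = 0      # minutes, 0..59
    h: int = 0      # hours, 0..23
    d: int = 1      # day of the month, 1..days_in_month
    mo: int = 1     # month, 1..12
    y: int = 2015   # year (plain integer, an assumption of this sketch)

    def tick(self) -> None:
        """Advance one second; a counter advances only when the previous one wraps."""
        self.s = (self.s + 1) % 60
        if self.s != 0:
            return
        self.m = (self.m + 1) % 60
        if self.m != 0:
            return
        self.h = (self.h + 1) % 24
        if self.h != 0:
            return
        if self.d < days_in_month(self.mo, self.y):
            self.d += 1
            return
        self.d = 1
        if self.mo < 12:
            self.mo += 1
            return
        self.mo = 1
        self.y += 1


# One tick from 31/12/2015 23h59'59'' rolls every counter over.
clock = DigitalClock(s=59, m=59, h=23, d=31, mo=12, y=2015)
clock.tick()
print(clock)
```

Each counter here advances only when the one before it wraps around, which is the condition that the enable signals encode in the hardware design.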
The second block, the gating block, performs clock gating to save dynamic energy. Specifically, it operates on the 6-bit next_enable[5:0] signal to determine the ON/OFF periods of the clk_s, clk_mi, clk_h, clk_d, clk_mo and clk_y signals, while ensuring that the circuit still functions properly.

In this paper, the author presents a merged clock gating technique and compares it with the conventional clock gating design technique in order to identify the design with the lowest power consumption and resource usage, using the digital clock application as an example.

In Figure 2, the digital clock circuit includes the Mod_60s block for counting seconds, the Mod_60mi block for counting minutes, the Mod_24h block for counting hours, the Mod_days block for counting days, the Mod_12m block for counting months, the Mod_y block for counting years, and the Mod_50M block for counting pulses of the 50 MHz clock. When the Mod_50M block has counted 50 million clock pulses, it asserts the dk_ms signal to the following Mod_60s block. The Mod_60s block counts up when there is a rising edge on clk_s and dk_ms is active high. Here, Mod_60s means that the counter counts clk_s pulses from 0 to 59 and resets at the 60th pulse. Its output is set to 1 during the 60th pulse period. The remaining blocks, the counters for minutes, hours, days, months and years, operate on the same principle. The minute counting block starts counting when both the clk_s signal and the q_next_en_mi signal are active high at the same time. That is, the signals take effect when their logic levels rise to 1.

Similarly, the other blocks count when their conditions are satisfied. The hour counting block starts if q_next_en_h and clk_h are active high simultaneously. The day counting block starts if q_next_en_d and clk_d are active high simultaneously. The month counting block starts if q_next_en_mo and clk_m are active high simultaneously. The year counting block starts if q_next_en_y and clk_y are active high simultaneously. The signals clk_s, clk_mi, clk_h, clk_d, clk_m and clk_y are generated conventionally as described in Figure 6.

Fig. 3: Generating enable signals for controlling clock gating

The signals dk_ms, q_next_en_mi, q_next_en_h, q_next_en_d, q_next_en_mo and q_next_en_y are counter condition signals used to enable the digital clock circuit block, as shown in Figure 3. The signals dk1, dk2, dk3, dk_day and dk_mo are generated when the corresponding modulo counter finishes counting. To let the next counter operate, an enable signal is created to activate it. For example, to start the minute counter, the q_next_en_mi signal must be active high, which is produced by an AND gate between dk1 and dk_ms. Here, dk1 is generated after the Mod_60s block finishes counting seconds from 0 to 59. The dk_ms signal is generated after counting 50 million pulses from the clock source of the FPGA kit. Similarly, the hour counter is active only when the Mod_60mi block has finished counting minutes from 0 to 59 and, at the same time, the second counter has finished counting seconds from 0 to 59. That is, the q_next_en_mi and dk2 signals are both high.

Fig. 4: Circuit determining the number of days in each month. (31 days for months 1, 3, 5, 7, 8, 10, 12; 30 days for months 4, 6, 9, 11; feb for month 2; selected by q_reg_mo[5:0])

In Figure 4, the q_reg_mo[5:0] register, taken from the output of the Mod_12m block, is used as the input of this block to determine the current month. Depending on the type of month, the circuit determines how many days are in the month. For February, the number of days is the temporary value feb, which is computed as illustrated in Figure 5.

Fig. 5: Processing leap year with 28 days in February

In Figure 5, the input is the year value register q_next_y. In theory, a leap year is divisible by 4. The divider is built from two shift registers. A leap year contains one additional day, i.e. February has 29 days. The output equals the integer part of q_next_y divided by 4 (a right shift by 2 bits), multiplied by 4 (a left shift by 2 bits). When q_next_y is divisible by 4, the result of the two shifts equals q_next_y itself, so the subtraction yields 0 and the feb signal value is 29. Conversely, when q_next_y is not divisible by 4, the value obtained by shifting right and then left differs from q_next_y, the subtraction result is not 0, and the feb signal output is 28.

Figure 6 and Figure 7 show the two clock gating techniques applied to the digital clock circuit application: the conventional clock gating technique and the novel merged clock gating technique.

Fig. 6: The conventional clock gating circuit technique

When next_enable[5] is equal to 1, the output of the D flip-flop (FF_D), q_reg_enable[5], becomes 1 at the rising edge of clk. The clk_s output is the logical AND of clk and q_reg_enable[5], so it follows clk when q_reg_enable[5] = 1. In contrast, when next_enable[5] is set to 0, the FF_D output becomes 0 at the rising edge of clk. The clk_s signal, being the AND of clk and the FF_D output, is then zero. The gated clocks clk_mi, clk_h, clk_d, clk_mo and clk_y work in the same way as clk_s.

Here, instead of using six conventional clock gating circuits as shown in Figure 6, we improve the clock gating solution by designing a novel clock gating circuit that consists of one basic clock gating cell, 5 latches for holding condition values, and a DEMUX gate for dividing the clock signal. Thus, the clock drives only one clock gating block. This greatly reduces the load on the clock network, the number of buffers, and the switching power in the clock network. Furthermore, reducing the number of logic gates reduces the resources used and the switching power caused by these gates. Therefore, this improved clock gating method consumes less power than the conventional clock gating technique.

Fig. 7: The proposed merged clock gating circuit

Fig. 8: Simulation waveform of the merged clock gating circuit.

Figure 8 shows the waveform of the improved clock gating circuit, which is explained in detail as follows. Assume that one second is equal to 10 clk pulses. When the q_reg_ms signal has counted to 10, the current state of next_enable[5:0] is 110000, so next_enable[5] is 1 and q_next_enable is 1 at the rising edge of clk. At the same time, because next_enable[5] is 1, the latch updates its state, so next_enable_tmp[4:0] = next_enable[4:0] = 10000.

At the next clk pulse, the q_reg_ms signal is reset to 0, which leads to next_enable[5:0] = 000000. As a result, next_enable[5] is 0 and q_next_enable is 0. Also, while q_reg_ms = 0, the q_reg_enable signal takes the value that q_next_enable had in the previous cycle, which is 1. The AND of q_reg_enable and clk creates the clk_enable signal. This lets one cycle of clk through, so clk_s pulses to 1 and updates the second counter, as seen in Figure 8. Because next_enable_tmp[4:0] still holds the value 10000, next_enable_tmp[4] is 1. Thus, the AND of clk_enable and next_enable_tmp[4] raises clk_mi to 1. As a consequence,


the clk_mi signal updates the minute counter, as seen in Figure 8. The circuit operates as expected.

III. SIMULATION RESULTS

Fig. 9: Function simulation of digital clock circuit

In the function simulation of Figure 9, the initial time is set to 31/12/2015, 23h59'59''. At the moment of transition, the timer is updated to 1/1/2016, 0h0'0''. Since December has 31 days, the clock gating outputs can be seen to update with the new values: the signals clk_s, clk_mi, clk_h and so on are switched ON for one clk period and then turned off at the next clk pulse, returning to the zero state.

TABLE 1 COMPARISON IN CLOCK POWER CONSUMPTION

(W)                         100 MHz   1 GHz   10 GHz   100 GHz   1 THz
No-clock gating             0.005     0.050   0.497    5         49.7
Conventional clock gating   0.002     0.025   0.252    2.5       25.2
Proposed clock gating       0.002     0.023   0.211    2.1       21.2

According to Table 1, the clock power of the conventional clock gating technique is always smaller than that of the design without clock gating. In particular, the conventional clock gating technique saves 60%, 50%, 49.30%, 49.32% and 49.32% at operating frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively, compared to no clock gating. Applying the improved clock gating technique reduces the clock power further, by approximately 0%, 8%, 16.27%, 16.43% and 16.44% at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively. In summary, the clock power is reduced by 11.43% on average compared to the conventional clock gating technique.

TABLE 2 COMPARISON IN DYNAMIC POWER

(W)                         100 MHz   1 GHz   10 GHz   100 GHz   1 THz
No-clock gating             0.006     0.065   0.63     5.92      58.3
Conventional clock gating   0.002     0.027   0.27     2.71      27.3
Proposed clock gating       0.002     0.025   0.23     2.3       23

Similarly to the clock power, Table 2 shows that the dynamic power with the conventional clock gating technique is always smaller than without clock gating. Specifically, the dynamic power with the conventional clock gating technique decreases by 66.67%, 58.46%, 57.14%, 54.27% and 53.24% at simulation frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively, compared to no clock gating. Applying the improved clock gating technique decreases the dynamic power further, by about 0%, 7.41%, 15.19%, 15.44% and 15.74% at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively, compared to the conventional clock gating technique. In summary, the average reduction is about 10.76% compared to the conventional clock gating method.

TABLE 3 COMPARISON IN TOTAL CONSUMPTION POWER

(W)                         100 MHz   1 GHz   10 GHz   100 GHz   1 THz
No-clock gating             0.087     0.147   0.7      6.1       58.5
Conventional clock gating   0.084     0.108   0.36     2.8       27.4
Proposed clock gating       0.083     0.106   0.31     2.4       23.1

As seen in Table 3, the proposed clock gating technique is only slightly effective at the low frequency of 100 MHz, with a 3.45% reduction. However, the higher the frequency, the greater the reduction rate: 26.53%, 50.69%, 53.15% and 53.13% at 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively, compared to the no-clock gating method. In summary, the power consumption decreases by 37.39% on average. Compared to the conventional technique, applying the improved clock gating technique to the digital clock circuit decreases the total power consumption by 1.19%, 1.85%, 11.83%, 14.76% and 15.67% at operating frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz, respectively. In sum, the merged clock gating technique reduces power consumption by about 9.06% on average compared to the conventional clock gating technique, as shown in Table 3.

TABLE 4 COMPARISON IN OPERATION TEMPERATURE

(°C)                        100 MHz   1 GHz   10 GHz   100 GHz   1 THz
No-clock gating             27.3      28.8    43.8     125.0     125.0
Conventional clock gating   27.2      27.8    34.3     98.9      125.0
Proposed clock gating       27.2      27.8    33.7     93.7      125.0
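The percentages quoted in this section are relative reductions of the form 100 x (baseline - gated) / baseline. The following Python sketch recomputes them from the clock power entries of Table 1 and averages them; the function names `reduction_percent` and `mean` are hypothetical helpers introduced for illustration. Because the tabulated powers are rounded, the recomputed values at the higher frequencies need not match the quoted percentages to the last digit.

```python
FREQS = ["100 MHz", "1 GHz", "10 GHz", "100 GHz", "1 THz"]

# Clock power in W, copied from Table 1.
CLOCK_POWER = {
    "none": [0.005, 0.050, 0.497, 5.0, 49.7],
    "conventional": [0.002, 0.025, 0.252, 2.5, 25.2],
    "proposed": [0.002, 0.023, 0.211, 2.1, 21.2],
}


def reduction_percent(baseline, improved):
    """Relative saving of `improved` over `baseline`, in percent, per frequency."""
    return [100.0 * (b - i) / b for b, i in zip(baseline, improved)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


conv_vs_none = reduction_percent(CLOCK_POWER["none"], CLOCK_POWER["conventional"])
prop_vs_conv = reduction_percent(CLOCK_POWER["conventional"], CLOCK_POWER["proposed"])

for freq, a, b in zip(FREQS, conv_vs_none, prop_vs_conv):
    print(f"{freq:>8}: conventional vs none {a:6.2f}%, proposed vs conventional {b:6.2f}%")
print(f"mean extra saving of the proposed technique: {mean(prop_vs_conv):.2f}%")
```

The same two helpers can be applied to the rows of Tables 2, 3 and 4 by swapping in the corresponding values.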


The operating temperature is produced by the circuit's power dissipation. The conventional clock gating technique yields a lower operating temperature than no clock gating. Specifically, the temperature decreases by 0.36%, 3.47%, 21.69% and 20.88% at operating frequencies of 100 MHz, 1 GHz, 10 GHz and 100 GHz, respectively. There is no drop in temperature at 1 THz, as shown in Table 4, because the operating frequency is so high that the temperature reaches the limit specified in the power simulation tool; the operating temperature is therefore the same for all three techniques. The new merged clock gating technique yields a lower operating temperature than the conventional clock gating technique. Specifically, the temperature decreases by 1.75% and 5.26% at 10 GHz and 100 GHz, respectively.

TABLE 5 COMPARISON IN RESOURCE USAGE

Resource usage              FF    LUT
No-clock gating             75    179
Conventional clock gating   75    179
Proposed clock gating       75    175

From Table 5, the resource usage of the three schemes is approximately equal. The number of flip-flops is equal in all three cases. The number of FFs and the LUT logic resources are the same without clock gating and with the conventional clock gating technique because they share the same circuit. For the novel improved clock gating technique, the block_design implements the same counting function as in the other two techniques, so its logic resource usage should be equal. The conventional clock gating technique uses 6 flip-flops for the 6 signals clk_s, clk_mi, clk_h, clk_d, clk_mo and clk_y, and they are all loads of the system clock. The proposed technique uses one flip-flop as the general clock gating block and 5 latches for conditional signal processing, so the number of FFs is the same for the conventional clock gating technique and the merged clock gating technique. However, the proposed technique has only one clock load. As explained in the previous analysis, this is why the proposed technique saves more dissipated power than the conventional one. In terms of LUT resources, the proposed improved clock gating technique reduces the digital clock circuit system by 4 LUTs.

IV. CONCLUSION

This study analyzed and compared energy saving methods using different clock gating techniques, applied to a digital clock circuit and implemented on a 90 nm Spartan-3E FPGA. The paper presents a novel clock gating solution that groups all clock gating signals into a single clock gating signal and then uses one DEMUX gate to split the signal into the individual clocks. By doing so, the proposed design reduces the number of inserted logic gates and reduces the total power of the digital clock design by about 15.67% compared to the conventional clock gating technique at an operating frequency of 1 THz. As a result, the number of LUTs decreases to 175, from 179 in the other clock gating techniques, and the operating temperature is also reduced by using this merged clock gating technique.

REFERENCES

[1] Semiconductor Industry Assoc., ITRS, 2003 update; http://public.itrs.net

[2] Vo Minh Huan, Tong Van On, "Solutions to minimize the power consumption in nanometer designs", ISEE Proceedings, International Symposium on Electrical & Electronics Engineering, pp. 64-72, Oct. 2007.

[3] Ya-Ting Shyu, et al., "Effective and Efficient Approach for Power Reduction by Using Multi-Bit Flip-Flops", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 4, April 2013.

[4] Huan Minh Vo, Chul-Moon Jung, Eun-Sub Lee, and Kyeong-Sik Min, "Carry select adder with sub-block power gating for reducing active-mode leakage in sub-32-nm VLSIs", IEICE Electronics Express, vol. 8, no. 16, pp. 1322-1329, Aug. 2011. (SCIE)

[5] Shmuel Wimer and Israel Koren, "Design flow for flipflop grouping in Data-Driven gating", IEEE Transactions on VLSI Systems, pp. 771-778, vol. 2, 2014.

[6] Xiaoxiao Ahang, et al., "32bit X32bit multiprecision razor based dynamic voltage scaling multiplier with operands scheduler", IEEE Transactions on VLSI Systems, vol. 22, pp. 759-770, April 2014.

[7] Bishwajeet Pandey, Jyotsana Yadav, M. Pattanaik, Nitish Rajoria, "Clock gating based energy efficient ALU design and implementation on FPGA", 2013 International Conference on Energy Efficient Technologies for Sustainability (ICEETS), pp. 93-97, 2013.

[8] Endri Bezati, Simone Casale-Brunet, Marco Mattavelli, and Jorn W. Janneck, "Clock-Gating of Streaming Applications for Energy Efficient Implementations on FPGAs", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 36, no. 4, pp. 699-703, April 2017.

[9] H. Li, S. Bhunia, Y. Chen, K. Roy, and T. Vijaykumar, "Dcg: deterministic clock-gating for low-power microprocessor design", IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 3, pp. 245-254, 2004.

[10] Elio Consoli, et al., "Novel Class of Energy-Efficient Very High-Speed Conditional Push–Pull Pulsed Latches", IEEE Transactions on VLSI Systems, vol. 22, no. 7, pp. 1593-1605, July 2014.
