Power Consumption and Thermal Management 2:
Low Power Digital Design and Management
Pablo Ituero and Rubén San Segundo
Outline
• 4 Relationship between Energy and Delay
• 5 Circuit-level Strategies
• 6 Gate-level Strategies
• 7 Architecture-level Strategies
• 8 Software-level Strategies
• 9 System-level Strategies
• 10 Power Management Examples
▪ 10.1 Arduino
▪ 10.2 Raspberry Pi

Slides Credits
• Low Power Design Essentials. Jan Rabaey. Springer
• Low Power VLSI Design. Dr.-Ing. Frank Sill.
Department of Electrical Engineering, Federal
University of Minas Gerais, Brazil.
• www.arduino.cc
• www.raspberrypi.org
• Own Material

Lecture Recap 1
• In current electronic circuits, power is mainly
consumed through the charge and discharge of
capacitors through resistances (CMOS gates) that
store the information that the circuit processes.
• In each charge cycle, half of the energy provided
by the power supply is stored in the capacitance
and the other half is dissipated in the pull-up
resistance (turned into thermal energy, heat),
regardless of the value of the resistance.
• In the discharge cycle, the remaining half of the energy is
dissipated in the pull-down resistance.

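The half-and-half split stated in the recap follows from integrating the supply current while the capacitor charges. A short derivation, assuming an ideal supply step from 0 to $V_{DD}$ and a linear pull-up resistance $R$ (the symbols $E_{supply}$, $E_C$ and $E_R$ are introduced here for illustration only):

$$E_{supply} = \int_0^{\infty} V_{DD}\, i(t)\, dt = V_{DD} \int_0^{V_{DD}} C\, dv = C V_{DD}^2$$

$$E_C = \int_0^{V_{DD}} C\, v\, dv = \tfrac{1}{2} C V_{DD}^2$$

$$E_R = E_{supply} - E_C = \tfrac{1}{2} C V_{DD}^2$$

$R$ does not appear in any of the three results; it only sets how long the charging takes.
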
Lecture Recap 2
The total power consumption of a circuit is given by

$$P = \alpha f C_L V_{DD}^2 + V_{DD} I_{peak} (P_{0\to1} + P_{1\to0}) + V_{DD} I_{leak}$$

• Dynamic power (first term): ≈ 40 - 70% today and decreasing relatively
• Short-circuit power (second term): ≈ 10% today and decreasing absolutely
• Leakage power (third term): ≈ 20 - 50% today and increasing

Threshold voltage

Sub-threshold Leakage
The dominant component of the leakage currents

Off-current increases exponentially when reducing VTH

$$I_{leak} = I_0 \frac{W}{W_0} 10^{-V_{TH}/S} \qquad P_{leak} = V_{DD} \cdot I_{leak}$$

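The exponential dependence on VTH can be made concrete with a few numbers. The sketch below evaluates the leakage expression from this slide; the values chosen for I0, W/W0 and the sub-threshold slope S (100 mV/decade) are placeholders for illustration, not data from the lecture.

```python
import math

def leakage_current(v_th, i0=1e-7, w_ratio=1.0, s=0.1):
    """Sub-threshold leakage I_leak = I0 * (W/W0) * 10**(-V_TH / S).

    i0 (A), w_ratio (W/W0) and s (V per decade) are assumed placeholder values.
    """
    return i0 * w_ratio * 10.0 ** (-v_th / s)

def leakage_power(v_dd, v_th, **kwargs):
    """P_leak = V_DD * I_leak."""
    return v_dd * leakage_current(v_th, **kwargs)

# Each reduction of V_TH by one slope S multiplies the leakage by ten.
for v_th in (0.5, 0.4, 0.3, 0.2):
    print(f"VTH = {v_th:.1f} V -> Pleak = {leakage_power(1.2, v_th):.3e} W")
```

With S = 100 mV/decade, every 100 mV taken off VTH costs one order of magnitude in leakage power.
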
Leakage current and temperature

The Traditional Design Philosophy
• Maximum performance is primary goal
▪ Minimum delay at circuit level
• Architecture implements the required function with
target throughput, latency
• Performance achieved through optimum sizing, logic
mapping, architectural transformations.
• Supplies, thresholds set to achieve maximum
performance, subject to reliability constraints
Trend: Power

Source: Moore, ISSCC 2003

The New Design Philosophy
• Maximum performance (in terms of propagation delay)
is too power-hungry, and/or not even practically
achievable
• Many (if not most) applications either can tolerate
larger latency, or can live with lower than maximum
clock-speeds
• Excess performance (as offered by technology) to be
used for energy/power reduction

Trading off speed for power

4. Relationship between Energy and Delay

Lowering Vdd
• One of the most straightforward ways to reduce
power is lowering 𝑉𝐷𝐷
• However, lowering 𝑉𝐷𝐷 also affects an important
metric of the circuit: Speed.

Threshold voltage

Energy-Delay Interaction

Delay decreases with supply voltage but energy/power increases

$$P_{dyn} = V_{DD}^2 \cdot f \cdot C \cdot \alpha \qquad \mathit{Delay} = k \cdot C \cdot \frac{V_{DD}}{(V_{DD} - V_T)^2}$$

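The two expressions on this slide can be evaluated side by side to see the trade-off. This is a minimal sketch: the constant k, the capacitance, the activity, the frequency and VT = 0.3 V are all assumed values chosen only to produce readable numbers.

```python
import math

def dynamic_power(v_dd, f, c, alpha):
    """P_dyn = V_DD^2 * f * C * alpha."""
    return alpha * f * c * v_dd ** 2

def gate_delay(v_dd, v_t, c, k=1.0):
    """Delay = k * C * V_DD / (V_DD - V_T)^2, valid only for V_DD above V_T."""
    if v_dd <= v_t:
        raise ValueError("model only valid for VDD above VT")
    return k * c * v_dd / (v_dd - v_t) ** 2

# Assumed example values: 10 fF load, 100 MHz clock, activity 0.1, VT = 0.3 V.
C, F, ALPHA, V_T = 10e-15, 100e6, 0.1, 0.3
for v_dd in (1.2, 1.0, 0.8, 0.6):
    p = dynamic_power(v_dd, F, C, ALPHA)
    d = gate_delay(v_dd, V_T, C)
    print(f"VDD = {v_dd:.1f} V  Pdyn = {p:.2e} W  delay (arbitrary units) = {d:.2e}")
```

Power falls quadratically as VDD is lowered, while the delay grows faster and faster as VDD approaches VT.
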
Static-Energy Delay Interaction

Static energy increases exponentially with decrease in threshold voltage

Delay increases with threshold voltage

Relationship Between Power and Delay
[Figure: two surface plots over VDD and VTH, power (W, scale ×10⁻⁴) and delay (s, scale ×10⁻¹⁰), with operating points A and B marked on both]

For a given activity level, power is reduced while delay is unchanged if both VDD
and VTH are lowered such as from A to B.

[Ref: T. Sakurai and T. Kuroda, numerous references]

Effect of VDD reduction on RON
$$\mathit{Delay} = k \cdot C \cdot \frac{V_{DD}}{(V_{DD} - V_T)^2}$$

$$\mathit{Delay} = 0.69 \cdot R_{ON} \cdot C$$

$$R_{ON} = k_2 \cdot \frac{V_{DD}}{(V_{DD} - V_T)^2}$$

4.5 Reducing power: global overview

Exploring the Energy-Delay Space
[Figure: energy versus delay plot with an unoptimized design above a curve of Pareto-optimal designs, bounded by Emin, Emax, Dmin and Dmax]

In an energy-constrained world, design is a trade-off process

♦ Minimize energy for a given performance requirement
♦ Maximize performance for given energy budget
[Ref: D. Markovic, JSSC’04]
The Design Abstraction Stack
A very rich set of design parameters to consider!
It helps to consider options in relation to their abstraction
layer

System/Application:   Choice of algorithm
Software:             Amount of concurrency
(Micro-)Architecture: Parallel versus pipelined, general purpose versus application specific
Logic/RT:             Logic family, standard cell versus custom
Circuit:              Sizing, supply, thresholds
Device:               Bulk versus SOI
Reducing Active Energy @ Design Time

$$E_{active} \sim \alpha \cdot C_L \cdot V_{swing} \cdot V_{DD}$$

$$P_{active} \sim \alpha \cdot C_L \cdot V_{swing} \cdot V_{DD} \cdot f$$

• Reducing VDD has a quadratic effect!
▪ Has a negative effect on performance, especially as VDD approaches 2VT
• Reducing transistor sizes (CL)
▪ Slows down logic
• Reducing activity (α)
▪ Reducing switching activity through transformations
▪ Reducing glitching by balancing logic
▪ Impacted by logic and architecture design decisions
Lowering Dynamic Power

5 Circuit-Level Strategies

Transistor Sizing for Power Minimization

• Small W's: lower capacitance, but a higher voltage is needed to keep performance
• Large W's: higher capacitance, but a lower voltage is enough to keep performance

• Larger sized devices: useful only when interconnects dominate

• Minimum sized devices: usually optimal for low power

Source: Timmernann, 2007

Micro transductors ‘08, Low Power

6 Gate-Level Strategies

Gate-Level Strategies for Low-Power
6.1 Algebraic transformations
6.2 Restructuring
6.3 Input Ordering
6.4 Dealing with glitches
6.5 Multiple VDD
Algebraic Transformations
Idea: Modify network to reduce capacitance

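[Figure: two implementations of f from inputs a, b, c with pa = 0.1, pb = 0.5, pc = 0.5. Left network: internal signal probabilities p1 = 0.05 and p2 = 0.05, output p3 = 0.075. Right network: internal signal probability p4 = 0.75, output p5 = 0.075]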

Caveat: This may increase activity!
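The probabilities on this slide are consistent with the two networks being f = a·b + a·c and f = a·(b + c); that reading is an assumption, since the figure itself did not survive. The sketch below recomputes the signal probabilities by weighting every input combination, and then sums the 0→1 transition probabilities p·(1 − p) of all nodes to illustrate the caveat.

```python
from itertools import product

# Input '1' probabilities taken from the slide.
P_IN = {"a": 0.1, "b": 0.5, "c": 0.5}

def signal_probability(fn):
    """Probability that fn(a, b, c) is 1, weighting every input combination."""
    total = 0.0
    for bits in product((0, 1), repeat=3):
        weight = 1.0
        for name, bit in zip("abc", bits):
            weight *= P_IN[name] if bit else 1.0 - P_IN[name]
        if fn(*bits):
            total += weight
    return total

def activity(p):
    """0->1 transition probability of a node whose '1' probability is p."""
    return p * (1.0 - p)

# Assumed left network: two AND gates feeding an OR gate.
sum_of_products = {
    "p1": signal_probability(lambda a, b, c: a & b),
    "p2": signal_probability(lambda a, b, c: a & c),
    "p3": signal_probability(lambda a, b, c: (a & b) | (a & c)),
}
# Assumed right network: one OR gate feeding one AND gate.
factored = {
    "p4": signal_probability(lambda a, b, c: b | c),
    "p5": signal_probability(lambda a, b, c: a & (b | c)),
}

def total_activity(nodes):
    return sum(activity(p) for p in nodes.values())

print(sum_of_products, total_activity(sum_of_products))
print(factored, total_activity(factored))
```

Under these assumptions the factored network has one gate less (lower capacitance), but its internal node p4 sits close to 0.5, so its summed activity is higher than that of the three-gate network.
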

Logic Restructuring
▪ Logic restructuring: changing the topology of a logic
network to reduce transitions

AND: P0→1 = P0 * P1 = (1 - PAPB) * PAPB

All inputs A, B, C, D have a '1' probability of 0.5.

Chain implementation:
  W = A·B: (1-0.25)*0.25 = 3/16 = 0.188
  X = W·C: 7/64 = 0.109
  F = X·D: 15/256

Tree implementation:
  Y = A·B: 3/16
  Z = C·D: 3/16
  F = Y·Z: 15/256
➔Chain implementation has a lower overall switching activity than tree
implementation for random inputs
▪ BUT: Ignores glitching effects
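The chain and tree figures can be checked with exact fractions. This sketch assumes independent inputs with a '1' probability of 0.5, as on the slide, and (like the slide) ignores glitches.

```python
from fractions import Fraction

def and_gate(p_a, p_b):
    """Return (output '1' probability, 0->1 transition probability) of an AND gate."""
    p_out = p_a * p_b
    return p_out, (1 - p_out) * p_out

half = Fraction(1, 2)

# Chain: W = A.B, X = W.C, F = X.D
p_w, a_w = and_gate(half, half)
p_x, a_x = and_gate(p_w, half)
p_f_chain, a_f_chain = and_gate(p_x, half)
chain_total = a_w + a_x + a_f_chain

# Tree: Y = A.B, Z = C.D, F = Y.Z
p_y, a_y = and_gate(half, half)
p_z, a_z = and_gate(half, half)
p_f_tree, a_f_tree = and_gate(p_y, p_z)
tree_total = a_y + a_z + a_f_tree

print("chain:", a_w, a_x, a_f_chain, "total", chain_total)
print("tree: ", a_y, a_z, a_f_tree, "total", tree_total)
```

The output node has the same activity in both cases; the difference comes from the intermediate node X (7/64) in the chain versus the second first-level gate Z (3/16) in the tree.
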
Source: Timmernann, 2007

Input Ordering
Inputs: PA = 0.5, PB = 0.2, PC = 0.1

Ordering 1: X = A·B, F = X·C
  Activity at X: (1-0.5x0.2)*(0.5x0.2) = 0.09

Ordering 2: X = B·C, F = X·A
  Activity at X: (1-0.2x0.1)*(0.2x0.1) = 0.0196

AND: P0→1 = (1 - PAPB) * PAPB

Beneficial: postponing introduction of signals with a high transition rate
(signals with signal probability close to 0.5)

Source: Timmernann, 2007

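The ordering rule can be tested by trying every pair of inputs on the first gate. A minimal sketch for the two-gate AND example of this slide, assuming independent inputs and no glitches:

```python
import math
from itertools import combinations

# '1' probabilities of the three inputs, from the slide.
P_ONE = {"A": 0.5, "B": 0.2, "C": 0.1}

def and_activity(p_a, p_b):
    """0->1 transition probability at the output of a 2-input AND gate."""
    p_out = p_a * p_b
    return (1.0 - p_out) * p_out

def first_gate_activities():
    """Activity of the internal node X for each choice of first-gate inputs."""
    return {
        pair: and_activity(P_ONE[pair[0]], P_ONE[pair[1]])
        for pair in combinations(sorted(P_ONE), 2)
    }

activities = first_gate_activities()
best_pair = min(activities, key=activities.get)
for pair, act in activities.items():
    print(pair, round(act, 4))
print("lowest internal activity with first-gate inputs:", best_pair)
```

The output F has the same probability in every ordering, so only the internal node X matters here. The lowest activity is obtained when A, the input closest to 0.5, is introduced last.
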
Glitching
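[Figure: a gate with inputs A and B produces X, which feeds a second gate together with C to produce Z; timing diagram for the input transition ABC = 101 → 000 under a unit-delay model]
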
Example 1: Chain of NAND Gates
[Figure: chain of NAND gates with outputs out1, out2, out3, ...; simulated waveforms of out1 to out8, V (Volt) versus t (nsec), with VDD / 2 marked]

Example 2: Adder Circuit

[Figure: adder with carry input Cin and sum outputs S15 ... S0; sum output voltage (V) versus time (ps) for Cin, S0, S1, S2, S3, S4, S5, S10 and S15, with VDD / 2 marked]

How to Cope with Glitching?

[Figure: blocks F1, F2 and F3 connected in two different arrangements, with signal arrival times annotated]

Equalize Lengths of Timing Paths Through Design

Dealing with Glitches

[Figure: logic restructuring to minimize glitches]

[Figure: buffer insertion for path balancing]

Multiple VDD
• Main ideas:
▪ Use of different supply voltages within the same design
▪ High VDD for critical parts (high performance needed)
▪ Low VDD for non-critical parts (only low performance demands)

• At design phase:
▪ Determine critical path(s)
▪ High VDD for gates on those paths
▪ Lower VDD on the other gates (in non-critical paths)
▪ For low VDD: prefer gates that drive large capacitances (yields the
largest energy benefits)
• Usually two different VDD (but more are possible)

Data Paths
• Data propagate through different data paths between registers (flip-flops, FF)
• Paths mostly differ in propagation delay times
• Frequency of clock signal (CLK) depends on path with longest delay ➔
critical path

[Figure: three columns of flip-flops (FF) clocked by CLK, connected by several paths]

Data Paths: Slack

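[Figure: gate G1 (inputs A, B; output Y) drives gate G2 together with input C. Timing diagram: once all inputs of G1 have arrived, G1 is ready with its evaluation after the delay of G1; the remaining time until all inputs of G2 have arrived is the slack for G1]
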
Multiple VDD in Data Paths
• Minimum energy consumption when all logic paths are critical (same delay)
• Possible Algorithm: clustered voltage-scaling
▪ Each path starts with VDDH and switches to VDDL (blue gates) when slack
is available
▪ Level conversion in flipflops at end of paths

[Figure legend: gates connected with VDDL and gates connected with VDDH]

11. The following section represents a segment of a pipelined architecture.

11.a Signals 𝑆1 , 𝑆2 and 𝑆3 have a ‘1’ probability of 0.5. Find the ‘1’ probability of the rest of the
signals of the circuit.

11.b Find the activity factor of each signal.

11.c The circuit operates at 100 MHz, from a 1.2 V supply voltage and the average load
capacitance is 10 fF. Find the dynamic power consumption of the circuit. Do not consider the
effect of glitches in your analysis.

12. Considering the circuit in the previous exercise,

12.a Draw the timing diagram when the input signals change from 𝑆1 = 0, 𝑆2 = 0, 𝑆3 = 1 to 𝑆1 =
1, 𝑆2 = 1, 𝑆3 = 0.

12.b Is there any glitch in the circuit? What can you say about the power consumption results
of the previous exercise?

14. You need to implement a 3-OR function with two 2-OR gates. Find the input ordering that
minimizes power consumption, knowing that PA = 0.7, PB = 0.5, PC = 0.2.


7 Architecture-Level Strategies

Strategies
• 7.1 Review of architectural metrics and design
techniques
• 7.2 Reducing supply voltage while maintaining
performance
• 7.3 Clock Gating
• 7.4 Bus Power Reduction

Design Layer: Architecture Level
• Also known as Register transfer level (RTL)
• Base elements:
▪ Register structures
▪ Arithmetic logic units (ALU)
▪ Memory elements
• Only behavior is described
(no inner structure)

Performance Metrics

• Two common metrics

▪ Latency (how long to do X)
• Also called response time and execution time
▪ Throughput (how often can it do X)
• Example of car assembly line
▪ Takes 6 hours to make a car
(latency is 6 hours)
▪ A car leaves every 5 minutes
(throughput is 12 cars per hour)
▪ Overlap results in Throughput > 1/Latency
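The assembly-line numbers can be checked directly. A small sketch using only the figures from this slide (6 hours per car, one car leaving every 5 minutes); the function and variable names are illustrative:

```python
def throughput_per_hour(interval_minutes):
    """Items completed per hour when one item finishes every interval_minutes."""
    return 60.0 / interval_minutes

LATENCY_HOURS = 6.0        # time to build one car
INTERVAL_MINUTES = 5.0     # a finished car leaves the line this often

throughput = throughput_per_hour(INTERVAL_MINUTES)
no_overlap_throughput = 1.0 / LATENCY_HOURS   # one car at a time, no overlap

print(f"throughput with overlap:    {throughput:.1f} cars/hour")
print(f"throughput without overlap: {no_overlap_throughput:.3f} cars/hour")
```

Latency is unchanged by the overlap; only the rate at which finished cars appear improves.
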
Basic Concepts: Pipelining

No pipeline: 1 operation every 1 ns
[Figure: one logic block with a delay of 1 ns]

Pipeline: 1 operation every 200 ps
[Figure: the same logic split into five stages of 200 ps each]

Basic Concepts: Parallelism

Parallel implementation: 5 operations every 1 ns
[Figure: five copies of the 1 ns logic block operating side by side]

Motivation for Power Reduction
• Optimizations at the architecture or system level can enable
more effective power minimization at the circuit level (while
maintaining performance), such as
▪ Enabling a reduction in supply voltage
▪ Reducing the effective switching capacitance for a given function
(physical capacitance, activity)
▪ Reducing the switching rates
▪ Reducing leakage

• Optimizations at higher abstraction levels tend to have greater potential impact
▪ While circuit techniques may yield improvements in the 10-50% range,
architecture and algorithm optimizations have reported orders of
magnitude power reduction
Expanding the Playing Field

[Figure: three energy (E) versus delay (D) sketches: removing inefficiencies (1), alternative topologies (2), discrete options (3)]

Architecture and system transformations and optimizations reshape the E-D curves

Reducing the Supply Voltage
(while maintaining performance)
Concurrency:
trading off clock frequency versus area to reduce power

Consider the following reference design

[Figure: reference datapath: register R, logic block F1, register R, logic block F2, clocked at fref]

R: register
F1, F2: combinational logic blocks (adders, ALUs, etc.)
Cref: average switching capacitance

[A. Chandrakasan, JSSC’92]

A Parallel Implementation
[Figure: two copies of the reference datapath (registers R, logic blocks F1 and F2), each clocked at fref /2; figure annotation: "Almost cancels"]

Running slower reduces required supply voltage

Yields quadratic reduction in power
A Pipelined Implementation
[Figure: reference datapath with additional pipeline registers R around F1 and F2, clocked at fref]

Shallower logic reduces required supply voltage

(this example assumes equal Vdd for par / pipe designs)

Assuming ovpipe = 10%
Parallel Architecture: Example

• Reference data path (for example)

• Critical path delay: Tadder + Tcomparator (= 25 ns) ➔ fref = 40 MHz
• Total capacitance being switched = Cref
• VDD = Vref = 5 V
• Power for reference datapath: $P_{ref} = C_{ref} V_{ref}^2 f_{ref}$

Source: Irwin, 2000

Parallel Architecture: Example cont’d

Area = 1476 x 1219 µm²

• The clock rate can be reduced by half with the same throughput: fpar = fref / 2
• Vpar = Vref / 1.7, Cpar = 2.15 Cref
• $P_{par} = (2.15\, C_{ref}) (V_{ref}/1.7)^2 (f_{ref}/2) \approx 0.36\, P_{ref}$

Source: Irwin, 2000

Pipelined Architecture: Example

◼ fpipe = fref, Cpipe = 1.1 Cref, Vpipe = Vref / 1.7

◼ Voltage can be dropped while maintaining the original throughput
◼ $P_{pipe} = C_{pipe} V_{pipe}^2 f_{pipe} = (1.1\, C_{ref}) (V_{ref}/1.7)^2 f_{ref} \approx 0.37\, P_{ref}$

Source: Irwin, 2000

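The power ratios of the parallel and pipelined examples can be recomputed from the scaling factors on the slides. The sketch below does this twice: with the exact factor Vref / 1.7, and with the scaled supply rounded to 2.9 V. The rounding to 2.9 V is an assumption made here because it reproduces the quoted 0.36 and 0.37; the slides do not state it.

```python
def relative_power(c_ratio, v_ratio, f_ratio):
    """P / Pref for P = C * V^2 * f, given each factor relative to the reference."""
    return c_ratio * v_ratio ** 2 * f_ratio

V_REF = 5.0

# Exact voltage scaling by 1.7, as written on the slides.
par_exact = relative_power(2.15, 1.0 / 1.7, 0.5)
pipe_exact = relative_power(1.1, 1.0 / 1.7, 1.0)

# Assumed rounding of the scaled supply to 2.9 V.
par_rounded = relative_power(2.15, 2.9 / V_REF, 0.5)
pipe_rounded = relative_power(1.1, 2.9 / V_REF, 1.0)

print(f"parallel : exact {par_exact:.3f}, with 2.9 V {par_rounded:.3f}")
print(f"pipelined: exact {pipe_exact:.3f}, with 2.9 V {pipe_rounded:.3f}")
```

Either way both architectures land at a little over one third of the reference power, which is the point of the example.
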
Approximate Trend
                 N-parallel proc.        N-stage pipeline proc.
Capacitance      N·Cref                  Cref
Voltage          Vref/N                  Vref/N
Frequency        fref/N                  fref
Dynamic power    Cref·Vref²·fref/N²      Cref·Vref²·fref/N²
Chip area        N times                 10-20% increase

Source: G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Kluwer Academic Publishers, 1998.

15. Consider the circuit of Figure 1. Modules A and B have a delay of 20 ns and 65 ns at 5 V,
and switch 30 pF and 112 pF, respectively. The register has a delay of 4 ns and switches
0.2 pF. Adding a pipeline register allows for reduction of the supply voltage while maintaining
throughput. How much power can be saved this way? Delay with respect to Vdd can be
approximated from the lower figure.

16. Repeat problem 15, using parallelism instead of pipelining. Assume that a 2-to-1
multiplexer has a delay of 4 ns at 2.5 V and switches 0.3 pF. Try parallelism levels of 2 and 4.
Which one is preferred?

Increasing use of Concurrency Saturates
▪ Can combine parallelism and pipelining to drive VDD down
▪ But close to the process threshold, the overhead of excessive concurrency starts to dominate

[Figure: normalized power (0.1 to 1) versus concurrency (2 to 16), assuming constant % overhead]
Increasing use of Concurrency Saturates

[Figure: power P versus VDD at fixed throughput. Concurrency moves the nominal design (no concurrency) toward lower VDD and lower power, until overhead + leakage dominate at Pmin]

Only option: Reduce VTH as well!

But: Must consider Leakage …
Mapping into the Energy-Delay Space
[Figure: energy per operation (E Op) versus Delay = 1/Throughput for the nominal design and for N = 2, 3, 4, 5 (increasing level of parallelism), with fixed-throughput lines and the optimum energy-delay point marked. © IEEE 2004]

▪ For each level of performance, there is an optimum amount of concurrency

▪ Concurrency is only energy-optimal if the requested throughput is larger than that of the
optimal operation point of the nominal function

[Ref: D. Markovic, JSSC’04]

Some Energy-Inspired Design Guidelines

• For maximum performance

▪ Maximize use of concurrency at the cost of area
• For given performance
▪ Optimal amount of concurrency for minimum energy
• For given energy
▪ Least amount of concurrency that meets performance
goals
• For minimum energy
▪ Solution with minimum overhead (that is – direct mapping
between function and architecture)
Concepts Slowly Embraced in Late 90’s
[Figure: trends from 1960 to 2010 on logarithmic axes. Left axis: transistors/chip (10⁰ to 10¹²) for memory and microprocessor/DSP. Right axis (0.001 to 100): normalized processor speed for memory and processors, and computational efficiency [mA/MIP]]

[Ref: R. Subramanyan, Tampere’99]

And Finally Accepted in the 00’s

[Figure: processor performance (for constant power envelope) from 2000 to 2008+, logarithmic axis 1 to 100. Single core: 3x. Dual/many core: 10x]

[Ref: S. Chou, ISSCC’05]

Fully Accepted in 00’s
• UCB Pleiades (heterogeneous reconfigurable fabric, ARM)
• Xilinx Virtex-4
• Intel Montecito
• AMD DualCore
• NTT Video codec (4 Tensilica cores)
• IBM/Sony Cell Processor

[© Xilinx, Intel, AMD, IBM, NTT]

The Quest for Concurrency
[Figure: performance (0 to 10) versus number of cores (0 to 30) for serial fractions of 0%, 6.7% and 20%]

Amdahl’s Law:
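The formula did not survive on this slide. The standard statement of Amdahl's law is speedup = 1 / (s + (1 − s) / N), where s is the serial fraction and N the number of cores; the sketch below evaluates it for the three serial fractions named in the plot. Whether the slide used exactly this form is an assumption.

```python
def amdahl_speedup(serial_fraction, n_cores):
    """Standard Amdahl's law: speedup over one core with a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

for serial in (0.0, 0.067, 0.20):
    row = ", ".join(f"N={n}: {amdahl_speedup(serial, n):.2f}" for n in (2, 10, 30))
    print(f"serial = {serial:.1%} -> {row}")

# The serial part bounds the speedup at 1 / serial_fraction, however many cores are used.
print("limit for 20% serial:", 1.0 / 0.20)
```

With a 20% serial fraction the speedup stays below 5 regardless of the core count, which is why the curves in the plot flatten.
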
Clock Gating
• Most popular method for power reduction of clock signals and
functional units
• Gate off clock to idle functional units
• Logic for generation of disable signal necessary
▪ Higher complexity of control logic
▪ Higher power consumption
▪ Timing is critical to avoid clock glitches at the OR gate output
▪ Additional gate delay on clock signal

[Figure: register (Reg) in front of a functional unit; the register clock is the OR of the clock and disable signals]

Source: Irwin, 2000

Clock Gating: Example
Without clock gating: 30.6 mW

With clock gating: 8.5 mW

[Figure: bar chart of power [mW] (axis 0 to 25) next to the chip layout with blocks DEU, VDE, MIF, DSP/HIF and 896Kb SRAM]

▪ 90% of flip-flops clock-gated

▪ 70% power reduction by clock-gating

MPEG4 decoder
Source: M. Ohashi, Matsushita, 2002

Bus Power
• Buses are significant source of power dissipation
▪ 50% of dynamic power for interconnect switching (Magen, SLIP 04)
▪ MIT Raw processor’s on-chip network consumes 36% of total chip power
(Wang et al. 2003)
• Caused by:
▪ High switching activities
▪ Large capacitive loading

[Figure: shared bus with bus drivers on inputs Ain, Bin, Cin, Din and bus receivers on outputs Wout, Xout, Yout, Zout]
Source: Irwin, 2000

Bus Power Reduction
• For an n-bit bus: $P_{bus} = n \cdot \alpha\, f_{Clk}\, C_{load}\, V_{DD}^2$
• Alternative bus structures
▪ Segmented buses (lower Cload)
▪ Charge recovery buses
▪ Bus multiplexing (lower fClk possible)
• Minimizing bus traffic (n)
▪ Code compression
▪ Instruction loop buffers
• Minimization of bit switching activity (fClk) by data encoding
• Minimize voltage swing ($V_{DD}^2$) using differential signaling

Source: Irwin, 2000

Reducing Shared Resources
• Shared resources incur switching overhead
• Local bus structures reduce overhead

Global bus architecture Local bus architecture

Source: Irwin, 2000

Reducing Shared Resources cont’d
• Bus segmentation
▪ Another way to reduce shared buses
▪ Control of bus segment by controller blocks (B)

Shared Bus
B

Segmented Bus

Source: Evgeny Bolotin – Jan 2004

8 Software-Level Strategies

Design Layer: Algorithm Level
• Base elements:
▪ Functions
▪ Procedures
▪ Processes
▪ Control structures
• Description of design behavior

Coding styles
• Use processor-specific instruction style:
▪ Variable types
▪ Function calls style
▪ Conditionalized instructions (for ARM)
• Follow general guidelines for software coding
▪ Use table look-up instead of conditionals
▪ Make local copies of global variables so that they can be assigned to
registers
▪ Avoid multiple memory look-ups with pointer chains

Source-code Transformations
• Minimize power-consuming activity:
▪ Computation
    A*B+A*C  ➔  A*(B+C)

▪ Communication
    Before:
      for (c = 1..N)
        receive (A)
        B=c*A
    After:
      receive (A)
      for (c = 1..N)
        B=c*A

▪ Storage
    Before:
      for (c = 1..N)
        B[c] = A[c]*D[c]
      for (c = 1..N)
        F[c] = B[c]-1
    After:
      for (c = 1..N)
        F[c] = A[c]*D[c]-1

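The storage transformation can be written out as runnable code to confirm that both forms compute the same result. This is an illustrative sketch in Python rather than the slide's pseudocode; the function names are made up for the example.

```python
def storage_before(a, d):
    """Two loops: the intermediate array B is written out and read back."""
    b = [x * y for x, y in zip(a, d)]
    return [value - 1 for value in b]

def storage_after(a, d):
    """One fused loop: no intermediate array is stored."""
    return [x * y - 1 for x, y in zip(a, d)]

A = [1, 2, 3, 4]
D = [5, 6, 7, 8]
print(storage_before(A, D))
print(storage_after(A, D))
```

The fused version removes N writes and N reads of B, which is where the storage-related savings come from.
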
Adaptive Dynamic Voltage Scaling (DVS)
• Slow down processor to fill idle time
• More delay ➔ lower operational voltage

[Figure: an Active / Idle / Active / Idle schedule at 3.3 V compared with a continuously Active schedule at 2.4 V]

• Runtime scheduler determines processor speed and selects
appropriate voltage
• Transition delay for frequency changes: ~150 µs
• Potential to realize 10x energy savings

Adaptive DVS: Example
• Task with 100 ms deadline, requires 50 ms CPU time at full speed
▪ Normal system gives 50 ms computation, 50 ms idle/stopped time
▪ Half speed/voltage system gives 100 ms computation, 0 ms idle
▪ Same number of CPU cycles but: $E = C (V_{DD}/2)^2 = E_{ref}/4$
▪ Dynamic Voltage Scaling adapts voltage to workload

[Figure: speed versus time between T1 and T2. Left: task at full speed followed by idle time. Right: task at reduced speed filling the whole period. Same work, lower energy]

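The DVS example can be worked through numerically. The sketch assumes that dynamic energy per cycle is C·VDD² and that halving the voltage allows exactly half the clock frequency, as on the slide; the reference frequency, supply voltage and capacitance are placeholder values.

```python
import math

def task_energy(cycles, c_eff, v_dd):
    """Dynamic energy of a task: every cycle switches c_eff at supply v_dd."""
    return cycles * c_eff * v_dd ** 2

# Placeholder values for illustration only.
F_REF = 100_000_000      # Hz, full-speed clock
V_REF = 3.3              # V, full-speed supply
C_EFF = 1e-9             # F, effective switched capacitance per cycle
CYCLES = 5_000_000       # 50 ms of work at F_REF
DEADLINE = 0.100         # s

time_full = CYCLES / F_REF            # run fast, then idle until the deadline
time_half = CYCLES / (F_REF / 2)      # run at half speed, no idle time

e_full = task_energy(CYCLES, C_EFF, V_REF)
e_half = task_energy(CYCLES, C_EFF, V_REF / 2)

print(f"full speed: {time_full * 1e3:.0f} ms busy, energy {e_full:.3e} J")
print(f"half speed: {time_half * 1e3:.0f} ms busy, energy {e_half:.3e} J")
print(f"energy ratio: {e_half / e_full:.2f}")
```

Both schedules meet the 100 ms deadline with the same number of cycles; only the supply voltage, and therefore the energy per cycle, differs.
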
9 System-Level Strategies

Design Layer: System Level
• Basic Elements:
▪ Complex modules
▪ Processors
▪ Calculation and control units
▪ Sensors
[Figure: example system with ALU, memory (MEM) and MP3 modules]

Dynamic Power Management
• Systems are:
▪ Designed to deliver peak performance, but …
▪ Not needing peak performance most of the time
• Components are idle sometimes
• Dynamic power management (DPM):
▪ Puts idle components in low-power non-operational
states when idle
• Power manager:
▪ Observes and controls the system
▪ Power consumption of power manager is negligible

Processor Sleep Modes
• Software power control - power management
  DOZE   Most units stopped except on-chip cache memory (cache coherency)
  NAP    Cache also turned off, PLL still on; time out or external interrupt to resume
  SLEEP  PLL off; external interrupt to resume

Deeper sleep mode requires more latency to resume

Deeper sleep mode consumes less power

Processor Sleep Modes: Example
• PowerPC sleep modes
Mode                   66 MHz    80 MHz
No power mgmt          2.18 W    2.54 W
Dynamic power mgmt     1.89 W    2.20 W
DOZE                   307 mW    366 mW
NAP                    113 mW    135 mW
SLEEP                  89 mW     105 mW
SLEEP without PLL      18 mW     19 mW
SLEEP without clock    2 mW      2 mW

10 cycles to wake up from SLEEP

100 µs to wake up from SLEEP+
Source: Irwin, 2000

Transmeta LongRun
• Applies adaptive DVS
• LongRun policies:
▪ Detection of different workload scenarios
▪ Based on runtime performance information
• After detection ➔ corresponding adaptation of:
▪ Processor supply voltage
▪ Processor frequency
▪ Clock frequency always within limits required by supply voltage to avoid clock
skew problems
• Use of core frequency/voltage hard coded operating points

➔ Best trade-off between performance and power possible

Transmeta LongRun cont’d
[Figure: % of max power consumption (0 to 100) versus frequency (300 to 1000 MHz), with a typical operating region and a peak performance region marked]

Operating points:
Frequency   Voltage
300 MHz     0.80 V
433 MHz     0.87 V
533 MHz     0.95 V
667 MHz     1.05 V
800 MHz     1.15 V
900 MHz     1.25 V
1000 MHz    1.30 V

Source: Transmeta

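The LongRun operating points can be turned into an estimate of relative power. The sketch assumes that power is purely dynamic and scales as f·V² (leakage and other fixed contributions are ignored), so the percentages are indicative only and need not match the measured curve on the slide.

```python
# (frequency in MHz, core voltage in V) operating points from the slide.
OPERATING_POINTS = [
    (300, 0.80), (433, 0.87), (533, 0.95), (667, 1.05),
    (800, 1.15), (900, 1.25), (1000, 1.30),
]

def relative_dynamic_power(points):
    """Return (f, v, f*v^2 normalized to the last, fastest point) for each point."""
    f_max, v_max = points[-1]
    reference = f_max * v_max ** 2
    return [(f, v, f * v ** 2 / reference) for f, v in points]

relative = relative_dynamic_power(OPERATING_POINTS)
for f, v, ratio in relative:
    print(f"{f:5d} MHz @ {v:.2f} V -> {ratio:6.1%} of peak dynamic power")
```

Under this assumption the lowest operating point delivers 30% of the peak frequency for roughly 11% of the peak dynamic power.
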
Transmeta LongRun: Example

Source: Transmeta

10 Power Management Examples

10.1 Reducing power consumption with Arduino

Standard situation

Running from a 9V battery through the "power in" plug, it draws about 50 mA.

Running on 5V through the +5V pin, it draws about 49 mA.

Sleep modes

#include <avr/sleep.h>

void setup () {
  set_sleep_mode (SLEEP_MODE_PWR_DOWN);  // select the deepest sleep mode
  sleep_enable ();                       // allow the CPU to go to sleep
  sleep_cpu ();                          // go to sleep now
}  // end of setup

void loop () { }

Sleep modes

SLEEP_MODE_IDLE: 50 mA
SLEEP_MODE_ADC: 42 mA
SLEEP_MODE_PWR_SAVE: 36 mA
SLEEP_MODE_EXT_STANDBY: 36 mA
SLEEP_MODE_STANDBY : 35 mA
SLEEP_MODE_PWR_DOWN : 34.5 mA

1. Power-save mode keeps Timer 2 running (provided it is clocked
from an external source).
2. Stand-by mode is similar to power-down mode, except that the
oscillator is kept running. This lets it wake up faster.
3. In IDLE mode the clocks keep running, so the timer behind millis()
wakes the processor up every millisecond.

Sleep modes

1. SLEEP_MODE_IDLE provides the least power savings but also
retains the most functionality.
2. SLEEP_MODE_PWR_DOWN uses the least power but turns
almost everything off, so your options for wake interrupts and
the like are limited.

Sleep modes: ATMEGA Datasheet

Power Reduction Mode

In addition to putting the whole chip to sleep, you can turn off
parts of the chip with the chip's Power Reduction Manager.

Turn the ADC off: power_adc_disable();

If you do not need any serial communication:

power_usi_disable();

For maximum power reduction, just run this:

power_all_disable()
and then enable only what you actually need:
http://www.nongnu.org/avr-libc/user-manual/group__avr__power.html

Configuring pins: current increment

1. All pins as outputs, and LOW: 0.0 µA (same as before).

2. All pins as outputs, and HIGH: 1.86 µA.

3. All pins as inputs, and LOW (in other words, internal pull-ups disabled): 0.0 µA (same as before).

4. All pins as inputs, and HIGH (in other words, internal pull-ups enabled): 1.25 µA.

Down-clock

typedef enum
{
  clock_div_1 = 1, clock_div_2 = 2, clock_div_4 = 4,
  clock_div_8 = 8, clock_div_16 = 16, clock_div_32 = 32,
  clock_div_64 = 64, clock_div_128 = 128
} clock_div_t;

clock_prescale_set (clock_div_t x)

Down-voltage

At 8MHz

1. 5.0V : 11.67 mA
2. 4.5V : 7.74 mA
3. 4.0V : 5.60 mA
4. 3.5V : 4.10 mA
5. 3.3V : 3.70 mA
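
The current readings above translate into power by multiplying each one by its supply voltage. A small sketch using only the five measurements from this slide:

```python
import math

# (supply voltage in V, measured current in mA) at 8 MHz, from the slide.
MEASUREMENTS = [(5.0, 11.67), (4.5, 7.74), (4.0, 5.60), (3.5, 4.10), (3.3, 3.70)]

def power_mw(voltage_v, current_ma):
    """Power in mW from a voltage in V and a current in mA."""
    return voltage_v * current_ma

powers = [power_mw(v, i) for v, i in MEASUREMENTS]
saving = 1.0 - powers[-1] / powers[0]

for (v, i), p in zip(MEASUREMENTS, powers):
    print(f"{v:.1f} V x {i:5.2f} mA = {p:6.2f} mW")
print(f"saving from 5.0 V to 3.3 V: {saving:.0%}")
```

Because the current falls together with the voltage, the power drops from about 58 mW at 5.0 V to about 12 mW at 3.3 V.
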

Max frequency vs. Voltage

10.2 Reducing power consumption with Raspberry Pi

General comparison

Power Management in Linux

Suspend(suspend.c)/Resume

Hibernation (hibernate.c)

Restore

Disconnect Unnecessary Peripherals

Shut down the USB Hub

#!/bin/bash
# Code to stop
/etc/init.d/networking stop
echo 0 > /sys/devices/platform/bcm2708_usb/buspower
echo "Bus power stopping"

#!/bin/bash
# Code to start
echo 1 > /sys/devices/platform/bcm2708_usb/buspower
echo "Bus power starting"
sleep 2
/etc/init.d/networking start

Shut down the USB Hub

To locate buspower:

find /sys/devices/ -name `dmesg -t | grep dwc_otg | grep "DWC OTG Controller" | awk '{print $2}' | cut -d":" -f1`

Power Consumption Reduction
❑ With USB Hub
560-580 mA with LCD display and USB WIFI dongle (about 2.9 W). Temperature: 48.

❑ Without USB Hub
220 mA, about 1.1 W. Temperature: 42.

Turn off video output

Turn off the HDMI port with:

sudo /opt/vc/bin/tvservice -o

To turn it back on:

sudo /opt/vc/bin/tvservice -p

Turning the HDMI port off saves around 20-30 mA.

Down-clock the Core

Add these lines to your config.txt (values in MHz):

arm_freq=700
arm_freq_min=250
core_freq=250
core_freq_min=100
sdram_freq_min=150

over_voltage_min=0, or 4 if you have overclocked your Raspberry Pi.

Halving the frequency reduces the ARM power consumption by a factor of 8 (2³),
assuming the supply voltage is scaled down in proportion to the frequency
(P ∝ f·VDD²).

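The factor of 8 comes from a cubic dependence of power on frequency. A minimal sketch, assuming dynamic power P ∝ f·V² with the supply voltage scaled in proportion to the frequency (so P ∝ f³) and ignoring leakage; the function name is illustrative:

```python
import math

def power_scale(freq_ratio):
    """Relative dynamic power when frequency and voltage both scale by freq_ratio."""
    voltage_ratio = freq_ratio          # assumption: V scales linearly with f
    return freq_ratio * voltage_ratio ** 2

print(f"half the frequency: power x {power_scale(0.5):.3f}")   # 1/8
print(f"20% more frequency: power x {power_scale(1.2):.3f}")   # (1.2)^3
```

The same model gives the (1.2)³ ≈ 1.73 increase quoted for a GPU running 20% faster.
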
Example

Example
Multithread
GPU Module

If the GPU frequency increases by 20%, the power consumption increases by a factor of (1.2)³ ≈ 1.73.
