Unit-2 Embedded System Analysis and Design
1
Major levels of abstraction in the
design process.
2
Requirements
• Before we design a system, we must know what
we are designing.
• May be developed in several ways:
• talking directly to customers;
• talking to marketing representatives;
• providing prototypes to users for comment.
• Requirements may be functional or
nonfunctional
3
Functional vs. non-functional
requirements
• Functional requirements:
• output as a function of input.
• Non-functional requirements:
• Performance, Cost;
• time required to compute output;
• Physical size and weight;
• power consumption;
• reliability;
• etc.
4
A sample requirements form that can be filled out at
the start of the project.
name
purpose
Inputs
outputs
functions
performance
manufacturing cost
power
physical size/weight
5
Requirements analysis of a GPS
moving map
• The map display changes as the user and the
map device change position.
• The moving map obtains its position from
the GPS, a satellite-based navigation
system.
• The moving map displays the current location.
(Figure: moving-map display showing Scotch Road, with the readout lat: 40 13 lon: 32 19.)
6
GPS moving map needs
7
GPS moving map needs, cont’d.
• Performance: Map should scroll smoothly. Power-up should take no
more than 1 sec. The system should be able to verify its
position and display the current map within 15 s.
• Cost: Should be economical
• Physical size/weight: Should fit in hand.
• Power consumption: Should run for 8 hours on four AA
batteries.
8
GPS moving map requirements form
name: GPS moving map
purpose: consumer-grade moving map for driving
inputs: power button, two control buttons
outputs: back-lit LCD, 400 × 600
functions: 5-receiver GPS; three resolutions; displays current lat/lon
performance: updates screen within 0.25 sec of movement
manufacturing cost: $100 cost-of-goods-sold
power: 100 mW
physical size/weight: no more than 2" × 6", 12 oz.
9
Specification
10
GPS specification Should include:
11
Architecture design
• The architecture is a plan for the overall structure of the
system.
• The specification does not say how the system does things,
only what the system does.
12
GPS moving map block diagram
user
database interface
13
GPS moving map hardware
architecture
memory
panel I/O
14
GPS moving map software
architecture
user
timer
interface
15
Designing hardware and software
components
• The architectural description tells us what components
we need.
• The components will in general include both hardware—
FPGAs, boards, and so on—and software modules.
• Some components are ready-made, some can be
modified from existing designs, others must be designed
from scratch.
• You will have to design some components yourself.
• Even if you are using only standard integrated circuits,
you may have to design the PCB that connects them
16
System integration
17
FORMALISMS FOR SYSTEM DESIGN
18
Structural Description
• By structural description, we mean the basic components of the
system.
• We will learn how to describe how these components act in the
next section.
• An object includes a set of attributes that define its internal
state.
• These attributes usually become variables or constants held in a
data structure.
• An object describing a display (such as a CRT screen) is shown
in UML notation.
• The attribute is, in this case, an array of pixels that holds the
contents of the display.
19
Objects and classes
20
UML object
object name
class name
d1: Display
comment
attributes
21
UML class
pixels
elements
menu_items
mouse_click()
operations
draw_box
22
The class interface
23
Choose your interface properly
25
Class derivation
Derived_class
UML
generalization
Base_class
26
Class derivation example
Display
base
pixels class
elements
menu_items
pixel()
derived class set_pixel()
mouse_click()
draw_box
BW_display Color_map_display
27
Multiple inheritance
base classes
Speaker Display
Multimedia_display
derived class
28
Behavioral description
29
State machines
transition
a b
30
Event-driven state machines
31
Types of events
32
Signal event
<<signal>>
mouse_click a
b
declaration
event description
33
Call event
draw_box (10,5,3,2,blue)
c d
34
Timer event
tm(time-value)
e f
35
Example state machine
start input/output
mouse_click(x,y,button)/ region = menu/
find_region(region) which_menu(i) call_menu(I)
region got menu called
found item menu item
region = drawing/
find_object(objid) highlight(objid)
found object
object highlighted
finish
36
Sequence diagram
37
Sequence diagram example
mouse_click(x,y,button)
which_menu(x,y,i)
time
call_menu(i)
38
MEMORY SYSTEM MECHANISMS
40
Caches and CPUs
address data
cache
controller
cache main
CPU
memory
address
data data
41
Cache operation
• Many main memory locations are mapped
onto one cache entry.
• May have caches for:
• instructions;
• data;
• data + instructions (unified).
• Memory access time is no longer
deterministic.
42
Terms
43
Types of misses
44
Memory system performance
• The average memory access time is given as
$t_{av} = h\,t_{cache} + (1 - h)\,t_{main}$, where $h$ is the hit rate.
• The hit rate depends on the program being executed and the
cache organization, and is typically measured using simulators.
• $t_{main}$ is typically 50–60 ns for DRAM, while $t_{cache}$ is at most
a few nanoseconds.
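As a rough illustration of how the hit rate drives average access time, the sketch below evaluates t_av = h * t_cache + (1 - h) * t_main for a few hit rates. The function name and the sample timings (2 ns cache, 60 ns DRAM) are assumptions picked from the ranges quoted above, not measurements.

```python
def average_access_time(hit_rate: float, t_cache_ns: float, t_main_ns: float) -> float:
    """Average memory access time: t_av = h * t_cache + (1 - h) * t_main."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * t_cache_ns + (1.0 - hit_rate) * t_main_ns


# Assumed timings: 2 ns cache, 60 ns DRAM.
for h in (0.80, 0.90, 0.99):
    print(f"hit rate {h:.2f}: {average_access_time(h, 2.0, 60.0):.2f} ns")
```

Even a small drop in hit rate matters because each miss pays the full main-memory time.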
45
Multiple levels of cache
Modern CPUs may use multiple levels of cache as shown
46
Multi-level cache access time
47
Replacement policies
• Replacement policy: strategy for choosing
which cache entry to throw out to make
room for a new memory location.
• Two popular strategies:
• Random.
• Least-recently used (LRU).
48
Cache organizations
• Fully-associative: any memory location
can be stored anywhere in the cache
(almost never implemented).
• Direct-mapped: each memory location
maps onto exactly one cache entry.
• N-way set-associative: each memory
location maps onto one set and can go into any of that set's n entries.
49
Cache performance benefits
• Keep frequently-accessed locations in fast
cache.
• Cache retrieves more than one word at a
time.
• Sequential accesses are faster after first
access.
50
Direct-mapped cache
hit value
byte
51
Write operations
52
Direct-mapped cache locations
• Many locations map onto the same cache
block.
• Conflict misses are easy to generate:
• Array a[] uses locations 0, 1, 2, …
• Array b[] uses locations 1024, 1025, 1026, …
• Operation a[i] + b[i] generates conflict
misses.
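The conflict described above can be reproduced with a tiny direct-mapped cache model. This is a minimal sketch under stated assumptions: one word per block, a cache of 1024 blocks (so addresses 0 and 1024 share an index), and a loop that walks both arrays twice; none of these sizes come from the slides beyond the two base addresses.

```python
def count_misses(addresses, num_blocks):
    """Count misses in a direct-mapped cache with one word per block."""
    tags = [None] * num_blocks          # tag currently held by each cache entry
    misses = 0
    for addr in addresses:
        index = addr % num_blocks       # which cache entry the address maps onto
        tag = addr // num_blocks        # identifies which memory block is cached
        if tags[index] != tag:
            misses += 1
            tags[index] = tag           # replace whatever was there
    return misses


def access_trace(n, passes, a_base=0, b_base=1024):
    """Addresses touched by computing a[i] + b[i] for i in 0..n-1, repeated."""
    trace = []
    for _ in range(passes):
        for i in range(n):
            trace += [a_base + i, b_base + i]
    return trace


trace = access_trace(n=8, passes=2)
print("1024-entry cache:", count_misses(trace, 1024), "misses out of", len(trace))
print("2048-entry cache:", count_misses(trace, 2048), "misses out of", len(trace))
```

With 1024 entries, a[i] and b[i] keep evicting each other, so every access misses; doubling the cache separates the two arrays and only the first pass misses.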
53
Set-associative cache
hit data
54
Example: direct-mapped vs.
set-associative
address data
000 0101
001 1111
010 0000
011 0110
100 1000
101 0001
110 1010
111 0100
55
Direct-mapped cache behavior
56
Direct-mapped cache behavior,
cont’d.
57
Direct-mapped cache behavior,
cont’d.
58
2-way set-associative cache
behavior
• Final state of cache (twice as big as
direct-mapped):
set   blk 0 tag   blk 0 data   blk 1 tag   blk 1 data
00    1           1000         -           -
01    0           1111         1           0001
10    0           0000         -           -
11    0           0110         1           0100
59
2-way set-associative cache
behavior
• Final state of cache (same size as direct-
mapped):
set   blk 0 tag   blk 0 data   blk 1 tag   blk 1 data
0     01          0000         10          1000
1     10          0001         11          0100
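A small simulation can help check final cache states like these. The sketch below models an n-way set-associative cache with LRU replacement over the 8-word memory from the example. The access sequence (001, 010, 011, 100, 101, 111) is an assumption, since the slides' sequence is not reproduced here, and the function names are illustrative only.

```python
from collections import OrderedDict

# Memory contents from the example: address -> data.
MEMORY = {0b000: "0101", 0b001: "1111", 0b010: "0000", 0b011: "0110",
          0b100: "1000", 0b101: "0001", 0b110: "1010", 0b111: "0100"}

# Assumed access sequence (not given in the text).
ACCESSES = [0b001, 0b010, 0b011, 0b100, 0b101, 0b111]


def simulate(accesses, num_sets, ways):
    """Return the final contents of each set as an OrderedDict tag -> data."""
    sets = [OrderedDict() for _ in range(num_sets)]
    for addr in accesses:
        index = addr % num_sets          # low-order bits select the set
        tag = addr // num_sets           # remaining high-order bits form the tag
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)       # hit: mark as most recently used
        else:
            if len(lines) == ways:
                lines.popitem(last=False)  # miss in a full set: evict the LRU entry
            lines[tag] = MEMORY[addr]
    return sets


for label, num_sets in (("same size as direct-mapped", 2), ("twice as big", 4)):
    print(label)
    for index, lines in enumerate(simulate(ACCESSES, num_sets, ways=2)):
        print("  set", index, dict(lines))
```

Direct-mapped behavior is the special case ways=1, which makes it easy to compare organizations on the same trace.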
60
Example caches
• StrongARM:
• 16 Kbyte, 32-way, 32-byte block instruction
cache.
• 16 Kbyte, 32-way, 32-byte block data cache
(write-back).
• SHARC:
• 32-instruction, 2-way instruction cache.
61
Memory management units
logical physical
address memory address main
CPU management
memory
unit
62
Memory management tasks
63
Address translation
64
Segments and pages
page 1
page 2
segment 1
memory
segment 2
65
Segment address translation
physical address
66
Page address translation
page offset
page i base
concatenate
page offset
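The concatenation step in page address translation can be sketched in a few lines. This is an illustrative model only: the page table is a plain dictionary, and the 12-bit offset (4-kbyte pages) and the table contents are assumptions.

```python
def translate(logical_address, page_table, offset_bits=12):
    """Translate a logical address by concatenating page base and offset."""
    page = logical_address >> offset_bits                  # upper bits: page number
    offset = logical_address & ((1 << offset_bits) - 1)    # lower bits: page offset
    base = page_table[page]      # a missing entry (KeyError) stands in for a page fault
    return (base << offset_bits) | offset                  # concatenate base and offset


# Hypothetical page table: logical page 1 lives in physical page 7.
table = {0x1: 0x7}
print(hex(translate(0x1234, table)))
```

Because pages have a fixed power-of-two size, the offset passes through unchanged and no addition is needed.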
67
Page table organizations
page
descriptor
page descriptor
flat tree
68
Caching address translations
• Large translation tables require main
memory access.
• TLB: cache for address translation.
• Typically small.
69
ARM memory management
• Memory region types:
• section: 1 Mbyte block;
• large page: 64 kbytes;
• small page: 4 kbytes.
• An address is marked as section-mapped
or page-mapped.
• Two-level translation scheme.
70
ARM address translation
descriptor concatenate
1st level table
concatenate
descriptor
2nd level table physical address
71
CPU performance
72
Pipelining
• Several instructions are executed simultaneously at
different stages of completion.
• Pipelining greatly increases the efficiency of the
CPU.
• The ARM7 processor has a three-stage pipeline.
• Fetch : the instruction is fetched from memory.
• Decode: the instruction’s opcode and operands are
decoded to determine what function to
perform.
• Execute : the decoded instruction is executed.
• Each of these operations requires one clock cycle for
typical instructions.
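To make the overlap concrete, the sketch below lists which instruction occupies which stage in each clock cycle. It assumes every instruction takes exactly one cycle per stage with no stalls or branches, which the text says holds only for typical instructions; the instruction strings are placeholders.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_cycles(num_instructions, stages=len(STAGES)):
    """Cycles to finish a stall-free instruction stream: fill time plus one per instruction."""
    if num_instructions <= 0:
        return 0
    return stages + (num_instructions - 1)


def stage_schedule(instructions):
    """Map each clock cycle (starting at 1) to the (instruction, stage) pairs active in it."""
    schedule = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            schedule.setdefault(i + s + 1, []).append((instr, stage))
    return schedule


program = ["add r0,r1,#5", "sub r2,r3,r6", "cmp r2,r4"]
for cycle, active in sorted(stage_schedule(program).items()):
    print(cycle, active)
print("total cycles:", pipeline_cycles(len(program)))
```

Once the pipeline is full, one instruction completes per cycle even though each one still takes three cycles from fetch to execute.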
73
ARM pipeline execution
add r0,r1,#5
fetch decode execute
time
1 2 3
74
Performance measures
75
CPU power consumption
• Most modern CPUs are designed with
power consumption in mind to some
degree.
• Power is energy consumption per unit
time.
• Heat generation depends on power
consumption.
• Battery life depends on energy consumption.
76
CMOS power consumption
• Voltage drops: power consumption is proportional to $V^2$, where $V$ is the power
supply voltage.
• Therefore, by reducing the power supply voltage, we
can significantly reduce power consumption.
• Toggling: a CMOS circuit uses most of its power when it is
changing its output value.
• By reducing the speed at which the circuit operates, we can
reduce its power consumption.
• Leakage: The only way to eliminate leakage current is to
remove the power supply when the circuit is not
active.
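The quadratic dependence on supply voltage is easy to underestimate, so a short calculation may help. The sketch below only applies the proportionality to V squared stated above, holding everything else fixed; the example voltages are assumptions.

```python
def relative_power(v_new, v_old):
    """Ratio of power after a supply-voltage change, assuming power is proportional to V**2."""
    return (v_new / v_old) ** 2


# Hypothetical supply voltages.
for v_old, v_new in ((5.0, 3.3), (3.3, 2.5), (5.0, 2.5)):
    print(f"{v_old} V -> {v_new} V: {relative_power(v_new, v_old):.0%} of original power")
```

Halving the supply voltage cuts this component of power to a quarter, which is why voltage reduction is usually the first power-saving lever.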
77
CPU power-saving techniques
78
Power management features
79
Power-down costs
• A power-down mode provides the opportunity to
reduce power consumption
• Going into a power-down mode costs:
• The power-down or power-up transition consumes time and
energy in order to control the CPU’s internal logic.
80
StrongARM SA-1100 power saving
81
SA-1100 power state machine
Prun = 400 mW
run
10 ms
160 ms
90 ms
10 ms
90 ms
idle sleep
82
The CPU bus
83
Bus protocols
• Bus protocol determines how devices communicate.
• Devices on the bus go through sequences of states.
• Protocols are specified by state machines.
• The basic building block of most bus protocols is the
four-cycle handshake.
• The handshake ensures that when two devices want to
communicate, one is ready to transmit and the other is
ready to receive.
• The handshake uses a pair of wires dedicated to the
handshake:
• enq (enquiry) and
• ack (acknowledge).
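The four cycles can be written out as a trace of the two wires. The step ordering below (device 1 raises enq, device 2 raises ack, device 2 lowers ack, device 1 lowers enq) is the commonly described form of the handshake and is stated here as an assumption; the function name is illustrative.

```python
def four_cycle_handshake():
    """Return (description, enq, ack) after each step of one handshake."""
    enq, ack = 0, 0
    trace = [("idle", enq, ack)]
    enq = 1   # cycle 1: device 1 signals that it wants to transfer
    trace.append(("1: device 1 raises enq", enq, ack))
    ack = 1   # cycle 2: device 2 signals that it is ready to receive
    trace.append(("2: device 2 raises ack", enq, ack))
    ack = 0   # cycle 3: transfer complete, device 2 releases ack
    trace.append(("3: device 2 lowers ack", enq, ack))
    enq = 0   # cycle 4: device 1 sees ack released and releases enq
    trace.append(("4: device 1 lowers enq", enq, ack))
    return trace


for description, enq, ack in four_cycle_handshake():
    print(f"{description:26s} enq={enq} ack={ack}")
```

Both wires end where they started, so the bus is immediately ready for the next transfer.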
84
Four-cycle handshake
enq device 1
device 1 device 2
ack
device 2
1 2 3 4
time
85
Four-cycle handshake, cont’d.
86
Microprocessor busses
• Clock provides
synchronization.
• R/W is true when
reading (R/W’ is false
when reading).
• Address is an a-bit bundle
of address lines.
• Data is an n-bit bundle of
data lines.
• Data ready signals
when n-bit data is
ready.
87
Timing diagrams
88
Bus read
89
State diagrams for bus read
device
CPU start
90
Bus multiplexing
adrs
Adrs enable
91
DMA
• Direct memory access
(DMA) is a bus operation that
allows reads and writes not
controlled by the CPU.
• A DMA transfer is controlled by
a DMA controller.
• The bus request is an input
to the CPU through which DMA
controllers ask for ownership
of the bus.
• The bus grant signals that
the bus has been granted to
the DMA controller.
92
Bus mastership
• By default, CPU is bus master and initiates transfers.
• DMA must become bus master to perform its work.
• CPU can’t use bus while DMA operates.
• A device that can initiate its own bus transfer is
known as a bus master.
• Bus mastership protocol:
• Bus request.
• Bus grant.
93
DMA operation
94
System bus configurations
• Slow devices on one bus.
• Fast devices on separate bus.
• A bridge connects two busses.
(Figure labels: memory, high-speed device, bridge, slow device.)
95
ARM AMBA bus
• Two varieties:
• AHB is high-
performance.
• APB is lower-speed,
lower cost.
• AHB supports
pipelining, burst
transfers, split
transactions, multiple
bus masters.
• All devices are slaves
on APB.
96
Memory Devices
• Several different
types of memory:
• DRAM.
• SRAM.
• Flash.
• Each type of memory
comes in varying:
• Capacities.
• Widths.
97
Random-access memory
98
Static RAM
99
Read-only memory
100
Timers and counters
• Very similar:
• a timer is incremented by a periodic signal;
• a counter is incremented by an
asynchronous, occasional signal.
101
Watchdog timer
• Watchdog timer is periodically reset by
system timer.
• If watchdog is not reset, it generates an
interrupt to reset the host.
interrupt
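The watchdog idea can be modeled with a counter that the rest of the system must keep clearing. This is a minimal sketch: the class, the tick-based timing, and the timeout values are all assumptions for illustration, and a real watchdog is a hardware peripheral rather than a software object.

```python
class Watchdog:
    """Counts ticks since the last reset and fires when the timeout is reached."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.count = 0
        self.fired = False

    def kick(self):
        """Periodic reset issued by a healthy system."""
        self.count = 0

    def tick(self):
        """Advance one tick; returns True once the watchdog has fired."""
        self.count += 1
        if self.count >= self.timeout_ticks:
            self.fired = True   # stands in for the interrupt that resets the host
        return self.fired


def run(timeout_ticks, kick_every, total_ticks):
    """Return the tick at which the watchdog fires, or None if it never does."""
    wd = Watchdog(timeout_ticks)
    for t in range(1, total_ticks + 1):
        if kick_every and t % kick_every == 0:
            wd.kick()
        if wd.tick():
            return t
    return None


print("healthy system:", run(timeout_ticks=3, kick_every=2, total_ticks=20))
print("hung system:   ", run(timeout_ticks=3, kick_every=0, total_ticks=20))
```

The timeout has to be longer than the longest legitimate gap between resets, otherwise a healthy system gets reset.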
102
Switch debouncing
• A switch must be debounced to eliminate multiple
contacts caused by mechanical
bouncing.
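Debouncing can also be done in software by accepting a new switch value only after it has been stable for several consecutive samples. The sketch below is one such approach; the sampling scheme and the threshold of three samples are assumptions, and hardware debouncing circuits are an equally common alternative.

```python
def debounce(samples, stable_count=3):
    """Return the debounced state after each raw sample of a binary switch input."""
    if not samples:
        return []
    state = samples[0]    # currently accepted switch state
    run = 0               # consecutive samples that disagree with the accepted state
    output = []
    for sample in samples:
        if sample != state:
            run += 1
            if run >= stable_count:   # new value held long enough: accept it
                state = sample
                run = 0
        else:
            run = 0                   # bounce back: restart the count
        output.append(state)
    return output


raw = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]   # a press with contact bounce
print(debounce(raw))
```

The raw input changes value five times, while the debounced output changes only once.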
103
Encoded keyboard
• An array of switches is read by an
encoder.
• N-key rollover remembers multiple key
depressions.
row
104
LED
105
7-segment LCD display
106
Touchscreen position sensing
ADC
voltage
107
Digital-to-analog conversion
• Use resistor tree:
R
bn Vout
2R
bn-1
4R
bn-2
8R
bn-3
108
Flash A/D conversion
Vin
encoder
...
109
Debugging embedded systems
• Challenges:
• target system may be hard to observe;
• target may be hard to control;
• may be hard to generate realistic inputs;
• setup sequence may be complex.
110
Host/target design
target
system
serial line
host system
111
Host-based tools
• Cross compiler:
• compiles code on host for target system.
• Cross debugger:
• displays target state, allows target system to
be controlled.
112
Software debuggers
113
Breakpoints
• A breakpoint allows the user to stop
execution, examine system state, and
change state.
• Replace the breakpointed instruction with
a subroutine call to the monitor program.
114
In-circuit emulators
115
Logic analyzers
116
Program design and analysis
• Software components.
• Representations of programs.
• Assembly and linking.
117
Models of programs
• Source code is not a good representation for
programs:
• Compilers derive intermediate representations
to manipulate and optimize the program.
118
Data flow graph
• DFG: data flow graph.
• Does not represent control.
• Models basic block: code with a single entry and a single
exit (no branches into or out of the middle).
• Describes the minimal ordering
requirements on operations.
119
Data flow graph
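Single assignment form:
x = a + b;
y = c - d;
z = x * y;
y1 = b + d;
(Figure: DFG. Inputs a, b, c, d; a + node produces x from a and b; a - node produces y from c and d; a * node produces z from x and y; a + node produces y1 from b and d.)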
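The ordering constraints a DFG captures can be derived mechanically from single-assignment code. The sketch below encodes the four statements of the example as a dictionary (an assumed representation, not one used in the slides) and uses the standard library's topological sorter to produce one valid evaluation order.

```python
from graphlib import TopologicalSorter

# Each computed value maps to (operator, operand, operand).
STATEMENTS = {
    "x": ("+", "a", "b"),
    "y": ("-", "c", "d"),
    "z": ("*", "x", "y"),
    "y1": ("+", "b", "d"),
}


def build_dfg(statements):
    """Map each computed value to the computed values it depends on.

    Primary inputs (a, b, c, d) impose no ordering, so they are left out.
    """
    return {
        target: {src for src in sources if src in statements}
        for target, (_, *sources) in statements.items()
    }


dfg = build_dfg(STATEMENTS)
order = list(TopologicalSorter(dfg).static_order())
print("dependencies:", dfg)
print("one valid order:", order)
```

Only z is constrained: it must follow x and y. The other three operations may run in any order, which is the freedom a compiler or scheduler can exploit.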
120
Control-data flow graph
121
CDFG example
T
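if (cond1) bb1();
else bb2();
bb3();
switch (test1) {
case c1: bb4(); break;
case c2: bb5(); break;
case c3: bb6(); break;
}
(Figure: CDFG. Decision node cond1 leads to bb1() on T and bb2() on F; both paths join at bb3(); decision node test1 then leads to bb4(), bb5(), or bb6() on c1, c2, or c3.)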
122
for loop
equivalent
123
Assembly and linking
124
Assemblers
• Major tasks:
• generate binary for symbolic instructions;
• translate labels into addresses;
• handle pseudo-ops (data, etc.).
• Generally one-to-one translation.
• Assembly labels:
ORG 100
label1 ADR r4,c
125
Symbol table
126
Program performance metrics
• Try to use registers efficiently.
• Make use of page mode accesses in the memory
system whenever possible.
• Analyze cache behavior to find major cache
conflicts.
127
Energy/power optimization
• The first step in optimizing a program’s energy consumption is
knowing how much energy the program consumes.
• It is possible to measure power consumption for an
instruction or a small code fragment.
• Energy consumption varies somewhat from
instruction to instruction.
• The sequence of instructions has some influence.
• The opcode and the locations of the operands also
matter.
128
Measuring energy consumption
129
Efficient loops
• General rules:
• Don’t use function calls.
• Keep loop body small to enable local repeat
(only forward branches).
• Use unsigned integer for loop counter.
• Use <= to test loop counter.
• Make use of the compiler: global optimization,
software pipelining.
130
Optimizing for program size
• Goal:
• reduce hardware cost of memory;
• reduce power consumption of memory units.
• Two opportunities:
• data;
• instructions.
131
Data size minimization
• Reuse constants, variables, data buffers in
different parts of code.
• Requires careful verification of correctness.
• Generate data using instructions.
132
Program validation and testing
• The two major types of testing strategies:
• Black-box methods generate tests
without looking at the internal structure of
the program.
• Clear-box (also known as white-box)
methods generate tests based on the
program structure.
133
Clear-box testing
• Examine the source code to determine whether
it works:
• Can you actually exercise a path?
• Do you get the value you expect along a path?
• Testing procedure:
• Controllability: provide program with inputs.
• Execute.
• Observability: examine outputs.
134
Black-box test vectors
• Random tests.
• May weight distribution based on software
specification.
• Regression tests.
• Tests of previous versions, bugs, etc.
• May be clear-box tests of previous versions.
135
Hardware platform
• CPU.
• A/D converter.
• D/A converter.
• Timer.
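• Tasks:
• spark control
• crankshaft sensing
• fuel/air mixture
• oxygen sensor
• Kalman filter
(Figure: the tasks run on the engine controller.)
(Figure: timing requirements on processes. A periodic process P1 is initiated at the start of its period and has a deadline; an aperiodic process is initiated by an initiating event.)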
Overheads for Computers as Components 2nd ed. © 2008 Wayne Wolf
Rate requirements on
processes
• Period: interval between process activations.
• Rate: reciprocal of period.
• The initiation interval may be shorter than the period: several copies of the
process run at once.
(Figure: copies P11, P12, P13, P14 of the process on CPU 1 to CPU 4, along a time axis.)
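• A process can be in one of three states:
• executing on the CPU;
• ready to run;
• waiting for data.
(Figure: state diagram. ready → executing: gets CPU; executing → ready: preempted; executing → waiting: needs data; waiting → executing: gets data and CPU; waiting → ready: gets data; ready → waiting: needs data.)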
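The three process states and their transitions can be captured as a small table-driven state machine. The pairing of event labels with transitions below is a reading of the state diagram and should be treated as an assumption; the function names are illustrative.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("ready", "gets CPU"): "executing",
    ("executing", "preempted"): "ready",
    ("executing", "needs data"): "waiting",
    ("waiting", "gets data"): "ready",
    ("waiting", "gets data and CPU"): "executing",
    ("ready", "needs data"): "waiting",
}


def step(state, event):
    """Return the next state, rejecting events that are not legal in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state!r}") from None


def run(start, events):
    """Apply a sequence of events and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state


print(run("ready", ["gets CPU", "needs data", "gets data", "gets CPU", "preempted"]))
```

An operating system scheduler is, in effect, the component that decides when the "gets CPU" and "preempted" events happen.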
• Resource constraints make schedulability analysis NP-hard.
• Assume:
• No resource conflicts.
• Constant process execution times.
• Require:
• $T \geq \sum_i T_i$
• Can’t use more than 100% of the CPU.
(Figure: execution times T1, T2, T3 laid out within period T; labels P1, P2.)
Hyperperiod
• Hyperperiod: least common multiple
(LCM) of the task periods.
• Must look at the hyperperiod schedule to
find all task interactions.
• Hyperperiod can be very long if task
periods are not chosen carefully.
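Computing the hyperperiod shows how quickly it grows when periods share few common factors. The sketch below is a direct LCM calculation; the example period sets are assumptions chosen to contrast harmonic and co-prime periods.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of integer task periods (all in the same time unit)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


# Hypothetical period sets, in milliseconds.
print(hyperperiod([4, 6, 12]))    # harmonic-ish periods
print(hyperperiod([4, 6, 10]))
print(hyperperiod([7, 11, 13]))   # pairwise co-prime periods
```

Choosing periods that divide one another keeps the hyperperiod equal to the longest period, which keeps the schedule that must be analyzed short.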
• Schedule in time slots.
• Same process activation irrespective of workload.
• Time slots may be equal size or unequal.
(Figure: slots T1 T2 T3 repeating in each period P.)
TDMA assumptions
• Schedule based on least common multiple (LCM) of the process periods.
• Trivial scheduler -> very small scheduling overhead.
(Figure: activations of P1 and P2 laid out over the interval PLCM.)
TDMA schedulability
• Always same CPU utilization (assuming
constant process execution times).
• Can’t handle unexpected loads.
• Must schedule a time slot for aperiodic
events.
• TDMA period = 10 ms.
• P1 CPU time 1 ms.
• P2 CPU time 3 ms.
• P3 CPU time 2 ms.
• P4 CPU time 2 ms.

              CPU time (s)
TDMA period   1.00E-02
P1            1.00E-03
P2            3.00E-03
P3            2.00E-03
P4            2.00E-03
total         8.00E-03
utilization   8.00E-01
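The utilization figure in this example follows directly from the slot lengths. The sketch below repeats the arithmetic using the values given above, in milliseconds; the function name is illustrative.

```python
def tdma_utilization(period_ms, cpu_times_ms):
    """Fraction of the TDMA period occupied by the processes' time slots."""
    total = sum(cpu_times_ms.values())
    if total > period_ms:
        raise ValueError("slots do not fit in the TDMA period")
    return total / period_ms


slots = {"P1": 1, "P2": 3, "P3": 2, "P4": 2}   # CPU time per process, in ms
print("total CPU time:", sum(slots.values()), "ms")
print("utilization:", tdma_utilization(10, slots))
```

The unused 2 ms per period is the slack available for, say, a slot reserved for aperiodic events.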
• Schedule process only if ready.
• Always test processes in the same order.
• Variations:
• Constant system period.
• Start round-robin again after finishing a round.
(Figure: slots T1 T2 T3 in one period P, then T2 T3 in the next.)
(Figure: schedule on a time axis from 0 to 60. P2 ready at t=0, P1 ready at t=15, P3 ready at t=18; execution order P2, P1, P2, P3.)
Overheads for Computers as Components © 2000 Morgan Kaufman
The scheduling problem
• Can we meet all deadlines?
• Must be able to meet deadlines in all cases.
• How much CPU horsepower do we need
to meet our deadlines?
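(Figure: CPU 1 and CPU 2 communicating through shared memory; CPU 1 and CPU 2 communicating by passing messages.)
(Figure: process Pi with period τi and computation time Ti.)
(Figure: critical instant of P4, shown against the activations of P1, P2, and P3.)
(Figure: P1 and P2 periods on a time axis from 0 to 10.)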
RMS CPU utilization
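• Utilization for n processes is
• $U = \sum_{i=1}^{n} T_i / \tau_i$, where $T_i$ is the computation time and $\tau_i$ the period of process $P_i$.
• As number of tasks approaches infinity,
maximum utilization approaches 69%.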
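The 69% figure is the limit of a bound that can be evaluated for any number of tasks. The sketch below uses the standard Liu and Layland bound n(2^(1/n) − 1), which is not written out in the slides and is included here as background; the task sets are hypothetical (computation time, period) pairs.

```python
def utilization(tasks):
    """Total CPU utilization of (computation_time, period) pairs."""
    return sum(c / p for c, p in tasks)


def rms_bound(n):
    """Utilization below which n tasks are guaranteed schedulable under RMS."""
    return n * (2 ** (1 / n) - 1)


def passes_rms_test(tasks):
    """Sufficient (not necessary) test: a task set that fails may still be schedulable."""
    return utilization(tasks) <= rms_bound(len(tasks))


for n in (1, 2, 3, 10, 1000):
    print(f"n={n:4d}: bound = {rms_bound(n):.4f}")

light = [(1, 4), (1, 6), (1, 12)]   # hypothetical task set
heavy = [(1, 4), (2, 6), (3, 12)]   # hypothetical task set
print(utilization(light), passes_rms_test(light))
print(utilization(heavy), passes_rms_test(heavy))
```

A task set above the bound is not necessarily unschedulable; it simply needs an exact analysis, for example checking the schedule from the critical instant.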
• Data dependencies allow us to improve utilization.
• Restrict combination of processes that can run simultaneously.
• P1 and P2 can’t run simultaneously.
(Figure: data dependency from P1 to P2.)
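(Figure: UML signal. Class someClass, with behavior sigbehavior(), sends (<<send>>) the <<signal>> aSig, which carries p : integer.)
(Figure: ACPI architecture. Applications; OS kernel with power management; device drivers; ACPI BIOS; hardware platform.)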
ACPI global power states
• G3: mechanical off
• G2: soft off
• G1: sleeping state, with substates:
• S1: low wake-up latency with no loss of context
• S2: low latency with loss of CPU/cache state
• S3: low latency with loss of all state except memory
• S4: lowest-power state with all devices off
• G0: working state
(Figure: analog signal over time and its ADPCM coding, with code values 3, 2, 1, -1, -2, -3.)
ADPCM coding
• Coded in a small alphabet with positive
and negative values.
• {-3,-2,-1,1,2,3}
• Minimize error between predicted value
and actual signal value.
(Figure: ADPCM encoder with summing node (Σ), quantizer, inverse quantizer, and integrator; ADPCM decoder in which samples pass through an inverse quantizer and integrator.)
Telephone system terms
• Subscriber line: line to phone.
• Central office: telephone switching
system.
• Off-hook: phone active.
• On-hook: phone inactive.
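(Figure: UML class diagram for the answering machine. Classes shown include Lights*, Buttons*, and Speaker*; operations shown include sample(), ring-indicator(), and pick-up().)
(Figure: Message class with attributes length, start-adrs, next-msg, samples; derived classes Incoming-message and Outgoing-message.)
(Figure: behavior steps: Activations?, Play OGM, Wait for timeout, Allocate ICM, Erase, Record ICM.)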
Record-msg/playback-msg
behaviors
record-msg:
nextadrs = 0
msg.samples[nextadrs] = sample(source)
End(source)? F: repeat; T: done

playback-msg:
nextadrs = 0
speaker.samples() = msg.samples[nextadrs];
nextadrs++
nextadrs=msg.length? F: repeat; T: done
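The two behaviors can be sketched as a pair of loops over a message buffer. This is a hedged reading of the flowcharts: the dictionary-based message, the iterable standing in for the sample source, and the list standing in for the speaker are all assumptions, and the record loop advances nextadrs on every sample so that the stored length is meaningful.

```python
def record_msg(source):
    """Store samples from the source until it is exhausted (End(source) is true)."""
    msg = {"samples": [], "length": 0}
    nextadrs = 0
    for sample in source:
        msg["samples"].append(sample)   # msg.samples[nextadrs] = sample(source)
        nextadrs += 1
    msg["length"] = nextadrs
    return msg


def playback_msg(msg, speaker):
    """Send stored samples to the speaker until nextadrs reaches msg.length."""
    nextadrs = 0
    while nextadrs != msg["length"]:
        speaker.append(msg["samples"][nextadrs])
        nextadrs += 1


incoming = record_msg(iter([3, -1, 2, 2, -3]))   # hypothetical ADPCM code values
speaker_out = []
playback_msg(incoming, speaker_out)
print(incoming["length"], speaker_out)
```

In a real answering machine the source would be the line-side A/D converter and the speaker a D/A converter, with memory limits checked before each store.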
Hardware platform
• CPU.
• Memory.
• Front panel.
• 2 A/Ds:
• subscriber line, microphone.
• 2 D/A:
• subscriber line, speaker.
Why multiprocessors?
• Better cost/performance.
• Match each CPU to its tasks or use custom
logic (smaller, cheaper).
• CPU cost is a non-linear function of
performance.
(Figure: cost versus performance.)
Why multiprocessors? cont’d.
• Better real-time performance.
• Put time-critical functions on less-loaded
processing elements.
• Remember RMS utilization: extra CPU cycles
must be reserved to meet deadlines.
(Figure: cost versus performance, marking the deadline and the deadline with RMS overhead.)
Why multiprocessors? cont’d.
• Using specialized processors or custom logic saves power.
• Desktop uniprocessors are not power-efficient
(Figure credit: [Aus04] © 2004 IEEE Computer Society.)
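(Figure: accelerated system. The CPU sends requests and data to the accelerator, which returns results and data; memory and I/O are also shown.)
(Figure: single-threaded versus multi-threaded execution of processes P1 to P4 with accelerator A1.)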
Execution time analysis
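• Single-threaded:
• Count execution time of all component processes.
• Multi-threaded:
• Find longest path through execution.
(Figure: P1 on M1 and P2 on M2 send data d1 and d2 to P3.)
(Figure: two schedules on M1 and M2 over time. One places P1, P1C, P2, P2C on M1 and P3 on M2; the other places P1, P1C on M1 and P2, P3 on M2.)
(Figure: d1 and d2 sent to P3. Transmission time = 4.)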
Initial schedule
(Figure: initial schedule. P1 on M1, P2 on M2, P3 on M3; d1 then d2 on the network; time axis 0 to 20; Time = 15.)
New design
• Modify P3:
• reads one packet of d1, one packet of d2
• computes partial result
• continues to next packet
(Figure: new schedule. P1 on M1, P2 on M2, P3 split into four pieces on M3; packets of d1 and d2 alternate on the network; time axis 0 to 20; Time = 12.)
Buffering and performance
• Buffering may sequentialize operations.
• Next process must wait for data to enter
buffer before it can continue.
• Buffer policy (queue, RAM) affects
available parallelism.
• Three processes separated by buffers:
(Figure: B1 → A → B2 → B → B3 → C.)
(Figure: two execution orders. One runs A[0], A[1], …, then B[0], B[1], …, then C[0], C[1], …: must wait for all of A before getting any B. The other interleaves A[0], B[0], C[0], A[1], B[1], C[1], ….)
Multiprocessors
• Consumer electronics systems.
• Cell phones.
• CDs and DVDs.
• Audio players.
• Digital still cameras.
• Multimedia: stored in
compressed form,
uncompressed on
viewing.
• Data storage and
management: keep track
of your multimedia, etc.
• Communication:
download, upload, chat.
• Most popular CE
device in history;
most widely used
computing device.
• 1 billion sold per year.
• Handset talks to cell.
• Cells hand off
handset as it moves.
(Figure: CD player block diagram. Labels: audio CPU, servo CPU, memories, jog memory, error corrector, display, DAC, I2S, analog in, analog out, amps, drive, head, motor; focus, tracking, and sled control; FE, TE.)
CD medium
• Rotational speed: 1.2-1.4 m/s (CLV).
• Track pitch: 1.6 microns.
• Diameter: 120 mm.
• Pit length: 0.8 -3 microns.
• Pit depth: .11 microns.
• Pit width: 0.5 microns.
• Laser wavelength: 780 nm.
(Figure: optical pickup. Laser, diffraction grating, detectors, and sled, positioned over the track.)
Laser focus
(Figure: detector layout. Main detectors A, B, C, D; side spot detectors E and F.)
• Level: A+B+C+D
• Focus error: (A+C)-(B+D)
• Tracking error: E-F
• Eight-to-fourteen modulation:
• Fourteen-bit code guarantees a maximum
distance between transitions.
• Example: 00000011 → 00100100000000
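(Figure: audio encoder. Filter bank, FFT, masking model, choose scale factor, requantize, mux; output bit stream 0101..)
(Figure: audio decoder. Bit stream 0101.., demux, inverse quantize with step size, expand with scale factor, inverse filter bank.)
Bayer pattern
(Figure: UML class diagram. PC, with memory[]; Motion-estimator, with compute-mv().)
(Figure: sequence diagram. :PC calls compute-mv() on :Motion-estimator; search area and macroblocks are read from memory[].)
(Figure: motion estimator architecture. Address generator and ctrl; search area and macroblock networks feeding PE 0 through PE 15; comparator producing the motion vector.)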
Pixel schedules
M        PE 0               PE 1               PE 2
M(0,0)   |M(0,0)-S(0,0)|
         |M(0,1)-S(0,1)|    |M(0,0)-S(0,1)|
(Figure: PEs connected by communication links through a network.)
PEs may be CPUs or ASICs.
Networks in embedded
systems
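(Figure: PEs attached to a sensor and an actuator; initial processing near the sensor, more processing elsewhere.)
(Figure: PE 1, PE 2, PE 3 connected by link 1 and link 2.)
(Figure: PE 1 to PE 4 on a shared bus. Arbitration among requests A, B, C: fixed priority grants A B C, A B C; round-robin grants A B C, B C A.)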
Crossbar
(Figure: crossbar with inputs in1 to in4 and outputs out1 to out4.)
Crossbar characteristics
• Non-blocking.
• Can handle arbitrary multi-cast
combinations.
• Size proportional to $n^2$.
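(Figure: bus with master 1, master 2, slave 1, slave 2 sharing the data line SDL and the clock line SCL.)
(Figure: electrical interface showing SDL, SCL, and the + supply.)
(Figure: multi-byte write: S, adrs, 1, data, P.)
(Figure: A, B, C on a time axis.)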
Ethernet packet format
(Figure labels: A, B, C; Network 1, Network 2; P1, P2.)
• Computational requirements:
• sum up process requirements over least-common multiple of periods, average over one period.
• Communication requirements:
• Count all transmissions in one period.
(Figure: OSI layers at each end node: application, presentation, session, transport, network, data link, physical. The intermediate IP node has only network, data link, and physical.)
(Figure: Internet protocols. TCP and UDP (User Datagram Protocol) on top of IP.)
(Figure: QuickCam HTTP server on a Java VM and Java nanokernel running on a 486, connected to a QuickCam.)
• 11 bit destination
address.
• RTR bit determines
read/write from/to
destination.
• Any node can detect
bus error, interrupt
packet for
retransmission.
CAN controller
• Controller implements
physical and data link
layers.
• No network layer needed: the bus
provides end-to-end
connections.
(Figure: elevator system with floors served by Hoistway 1 and Hoistway 2.)
Theory of operation
• Each floor has control panel, display.
• Each car has control panel:
• one button per floor;
• emergency stop.
• Controlled by a single controller.
(Figure: coarse and fine position sensors.)
(Figure: UML class diagram. Classes Coarse-sensor*, Fine-sensor*, Master-control-panel*, Car, Controller, Car-control-panel*, Floor, Floor-control-panel*, Motor*, with 1, N, and F multiplicities.)
Sensor*: hit: boolean (derived classes Coarse-sensor*, Fine-sensor*)
Car-control-panel*: Floors[1..F]: boolean; emergency-stop: boolean; open-door, close-door: boolean
Master-control-panel...
Motor*: speed: {o,s,f}
Floor-control-panel*: up, down: boolean
Car and Floor classes
Car: request-lights[1..F]: boolean; current-floor: integer
Floor: up-light, down-light: boolean
Controller: car-floor[1..H]: integer; emergency-stop[1..H]: integer; scan-cars(); scan-floors(); scan-master-panel(); operate()
(Figure: waterfall model: requirements, architecture, coding, testing, maintenance.)
Waterfall model steps
• Requirements: determine basic
characteristics.
• Architecture: decompose into basic
modules.
• Coding: implement and integrate.
• Testing: exercise and uncover bugs.
• Maintenance: deploy, fix bugs, upgrade.
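(Figure: spiral model: system feasibility, specification, prototype, initial system, enhanced system; each turn passes through requirements, design, test.)
(Figure: successive refinement: specify, architect, design, build, test, repeated for each system version.)
(Figure: design flow: requirements and specification, architecture, integration, testing.)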
Co-design methodology
• Must architect hardware and software
together:
• provide sufficient resources;
• avoid software bottlenecks.
• Can build pieces somewhat
independently, but integration is major
step.
• Also requires bottom-up feedback.
(Figure: hierarchical design flow. System, hardware, and software each proceed through spec, architecture, detailed design, integration, test; the system architecture splits into HW and SW.)
Concurrent engineering
• Large projects use many people from
multiple disciplines.
• Work on several tasks at once to reduce
design time.
• Feedback between tasks helps improve
quality, reduce number of later design
problems.
• Used in telecommunications protocol design.
• Event-oriented state machine model.
(Figure: telephone example. From on-hook, caller goes off-hook; dial tone; caller gets dial tone.)
(Figure: traditional OR state; labels i2, S3.)
Statechart AND state
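(Figure: traditional AND state versus statechart AND state. The traditional machine has product states S1-3, S1-4, S2-3, S2-4, and S5; the statechart AND state sab has concurrent components S1/S2 and S3/S4, plus S5; transitions are labeled a, b, c, d, r.)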
AND-OR tables
• Alternate way of specifying complex
conditions:
cond1 or (cond2 and !cond3)

                OR
       cond1   T   -
AND    cond2   -   T
       cond3   -   F

(Figure: state description with outputs; labels b, c, d.)
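An AND-OR table can be evaluated mechanically: each column is an AND of its non-blank entries, and the columns are ORed together. The sketch below encodes the table for cond1 or (cond2 and !cond3) as a list of columns; this dictionary representation is an assumption made for illustration.

```python
from itertools import product

# One dict per column; a missing condition means "don't care" (the "-" entries).
TABLE = [
    {"cond1": True},
    {"cond2": True, "cond3": False},
]


def evaluate(table, values):
    """True if every entry in at least one column matches the given values."""
    return any(
        all(values[name] == required for name, required in column.items())
        for column in table
    )


# Compare the table against the Boolean expression for every input combination.
for c1, c2, c3 in product((False, True), repeat=3):
    values = {"cond1": c1, "cond2": c2, "cond3": c3}
    print(values, evaluate(TABLE, values))
```

Adding a condition to the specification means adding a row, and adding an alternative means adding a column, which tends to be easier to review than a long nested expression.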
TCAS top-level description
CAS (states: power-on, power-off)
Inputs:
TCAS-operational-status {operational, not-operational}
Components: fully-operational (C), standby, own-aircraft, other-aircraft i:[1..30], mode-s-ground-station i:[1..15]
Outputs:
sound-aural-alarm: {true, false}
aural-alarm-inhibit: {true, false}
combined-control-out: enumerated, etc.
CRC cards
• Well-known method for analyzing a
system and developing an architecture.
• CRC:
• classes;
• responsibilities of each class;
• collaborators are other classes that work with
a class.
• Team-oriented methodology.
(Figure: CRC card, front and back.)
(Figure labels: requirements, coding, bug.)