Embedded
Bernard Boigelot
E-mail: [email protected]
WWW: http://www.montefiore.ulg.ac.be/boigelot/
     http://www.montefiore.ulg.ac.be/boigelot/courses/embedded/
Chapter 1
Introduction
Embedded systems
Definition: An embedded system is a computer system used as a component of a more
complex entity.
Typical applications:
Advantages
Chapter 2
Hardware
Memory:
Static or dynamic RAM, ROM (EEPROM, FLASH, . . . ).
Either internal to the microcontroller, external, or integrated in a System on Chip
(SoC).
Parallel or serial interface.
Possibility of addressing peripherals in memory.
Communication buses.
Auxiliary components:
Power supply,
Clock generator,
Bus controllers,
...
Pinout:
[Figure: package pinout. Pin labels: RA2, RA3, RA4, MCLR, VSS, RB0, RB1, RB2, RB3, RA1, RA0, OSC1, OSC2, VDD, RB7, RB6, RB5, RB4.]
Description:
[Figure: schematic built around a PIC 16F716. Labels: 9V, 1N4001, 7805, 10 µF, 100nF, 10K (twice), 22K (twice), 220R, 33pF (twice), 4MHz, NTC, LED, Piezo, Reset, Test; pins VDD, VSS, MCLR, OSC1, OSC2, RA0, RA1, RB0, RB1, RB2.]
Bus topology.
Small number of communication lines.
Flexible configuration.
Mechanisms for addressing devices, managing transactions, for performing arbitration
and flow control.
The bus consists of a pair of two-way lines: SDA (Serial DAta) and SCL (Serial Clock).
The value of each line stays high whenever it is unused.
Each device connected to the bus can read the value of SDA and SCL, but is only able
to force them down, i.e., to write a low value.
[Figure: the SDA and SCL lines, with VCC, Device 1 and Device 2.]
I2C: Transactions
The master of a transaction is responsible for
When a transaction is in progress (i.e., between S and P), transitions of SDA are only
allowed when SCL is low.
Illustration:
[Figure: timing diagram of SDA and SCL.]
During a transaction, the sender of data can either be the master or the slave.
The value of each bit of data sent on the bus corresponds to the value of SDA during a
low-to-high transition of SCL.
Data is exchanged in 8-bit groups, the most significant bit (MSB) being sent first.
Each group of 8 bits must be followed by an acknowledgment, represented by a low
value placed on SDA by the receiver.
If a group of bits is not acknowledged, then the master immediately aborts the
transaction, and the slave stops sending or receiving data.
I2C: Addressing
When a transaction is initiated, the master has to specify which device is the other
participant.
Principles:
The first 8 bits exchanged in a transaction are always sent by the master.
The first 7 bits of this group correspond to the address of the intended slave.
The 8th bit then specifies the direction of the following data transfer:
0 : The master is the sender;
1 : The master is the receiver.
Remark: The first group of 8 bits must thus be acknowledged by the addressed slave,
regardless of the data transfer direction.
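As a minimal sketch of the addressing rule above, the first byte of a transaction can be assembled from the 7-bit slave address and the direction bit. The function name i2c_address_byte and the address 0x50 used in the note below are illustrative assumptions and do not come from any particular I2C driver.

```c
#include <stdint.h>

/* Hypothetical helper: builds the first byte sent by the master.
   addr7:           7-bit address of the intended slave.
   master_receives: 0 if the master is the sender, 1 if it is the receiver. */
uint8_t i2c_address_byte(uint8_t addr7, int master_receives)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (master_receives ? 1u : 0u));
}
```

With the assumed address 0x50, the master sends 0xA0 before writing to the slave and 0xA1 before reading from it.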
I2C: Arbitration
It is possible to have several devices attempting to initiate transactions at the same time, by
generating simultaneous Start signals.
For detecting potential conflicts, each master constantly monitors the value of SDA when it
sends data. If the observed value differs from the sent one, then the master performing this
observation immediately and silently withdraws from the transaction.
Remarks:
A conflict can only be detected by the device that sends a high value.
Transmitting simultaneously two exactly identical frames does not lead to a conflict!
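The arbitration rule can be illustrated by a small host-side simulation, given here as a sketch only. It assumes two masters that each send a single byte, MSB first, and models SDA as the logical AND of the values driven by the two devices. The function name i2c_arbitrate is hypothetical.

```c
#include <stdint.h>

/* Simulates two masters sending one byte each on a shared SDA line.
   Returns 0 if master A keeps the bus, 1 if master B keeps the bus,
   and -1 if both frames are identical (no conflict is detected). */
int i2c_arbitrate(uint8_t byte_a, uint8_t byte_b)
{
    for (int bit = 7; bit >= 0; bit--) {
        int a = (byte_a >> bit) & 1;
        int b = (byte_b >> bit) & 1;
        int sda = a & b;   /* a device can only force the line low */

        if (a != sda)
            return 1;      /* A sent high but observed low: A withdraws */
        if (b != sda)
            return 0;      /* B sent high but observed low: B withdraws */
    }
    return -1;
}
```

In this model the master whose frame contains the first low bit at a position where the frames differ keeps the bus, and the other one withdraws without disturbing the transfer.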
[Figure: timing diagram of SCL, annotated "released by the slave".]
[Figure: possible pin configurations: open-drain output, digital input with pull-up (VCC, GND, Schmitt trigger), analog output (D/A), analog input (A/D).]
This feature makes it possible to build simple circuits in which the processor can interact
with a large number of peripherals.
Solution:
The screen and the keyboard are scanned: At a given time, one can only display a
single digit, or read a single column of keys.
The 8 remaining pins drive the screen segments during display and channel reading
phases (8 digital outputs), and are also able to scan the keyboard (4 digital outputs
+ 4 digital inputs with pull-up).
Schematics:
[Figure: schematic centered on the MCU.]
Chapter 3
Interrupts
Introduction
An interrupt is a signal that requests the processor to temporarily suspend program
execution, in order to execute an interrupt routine.
Advantages:
timer expiration,
arithmetic or instruction exception,
software interrupt request,
...
Interrupt control
Some critical operations can never be interrupted. It is then necessary to temporarily
disable interrupts prior to their execution, and to enable them again afterwards.
Some processors allow specific priorities to be assigned to interrupts originating from different
sources. Such architectures generally provide a mechanism for disabling the interrupts
having a priority less than some specified threshold. Interrupt priorities are also used for
resolving simultaneous interrupt requests.
Enabling and disabling interrupts is performed by executing specific instructions, or by
setting the value of dedicated registers.
Notes:
When an interrupt request is received, the processor sets interrupt flags, in order to
trigger the interrupt as soon as it becomes enabled. Interrupt flags have to be cleared
explicitly by the interrupt routine.
The context is either saved on the execution stack or in a specific memory area.
Some processors automatically save the context (either totally or in part) when an
interrupt is triggered.
Context save and restore operations can sometimes be simplified by using dedicated
instructions.
Programming interrupts
The compilers aimed at embedded applications provide language extension mechanisms
for programming interrupts without going down to assembly language.
Enabling and disabling interrupts is performed with the help of macros or specific
compilation directives (e.g., enable()/disable(), critical keyword).
It is sometimes necessary to inform the compiler that the value of a variable can be
modified by interrupt routines, in order to prevent incorrect optimizations (e.g.,
volatile keyword in C).
Notes:
Carrying out the comparison between the two measurements in a single C instruction
does not solve the problem:
...
void controller(void)
{
for (;;)
if (temp[0] != temp[1]) !! sound the alarm;
}
...
Correct solution:
The instructions that read the measurements sent by the interrupt routine to the controller
form a critical section, the execution of which cannot be interrupted.
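A minimal sketch of such a critical section is given below. It is written so that it can be compiled on a host: disable() and enable() are replaced by stand-ins that only toggle a flag, whereas on a real target they would be the macros or directives provided by the compiler. The array temp[] follows the name used in the controller code above, and the function name measurements_differ is an assumption.

```c
/* Host-side stand-ins for the compiler-specific primitives (assumptions). */
static volatile int interrupts_enabled = 1;
static void disable(void) { interrupts_enabled = 0; }
static void enable(void)  { interrupts_enabled = 1; }

/* Written by the interrupt routine, read by the controller. */
static volatile int temp[2];

/* Returns nonzero if the two measurements differ. Both values are copied
   while interrupts are disabled, so they belong to the same measurement. */
int measurements_differ(void)
{
    int temp0, temp1;

    disable();            /* start of the critical section */
    temp0 = temp[0];
    temp1 = temp[1];
    enable();             /* end of the critical section */

    return temp0 != temp1;
}
```

The comparison itself is carried out on the local copies, outside the critical section, which keeps the interval with interrupts disabled as short as possible.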
Other solution:
Notes:
Improved solution:
#define MAX_FIFO 10 /* Must be even! */
static volatile int temp_fifo[MAX_FIFO];
static volatile int first = 0;
static volatile int last = 0; /* written by controller(), read by measure() */
interrupt void measure(void)
{
/* If the buffer is not saturated */
if (!((first + 2 == last)
|| (first == MAX_FIFO - 2 && last == 0)))
{
temp_fifo[first] = !! first measurement ;
temp_fifo[first + 1] = !! second measurement ;
first += 2;
if (first == MAX_FIFO)
first = 0;
}
else !! discard measurements;
}
void controller(void)
{
int temp0, temp1;
for (;;)
if (first != last)
/* If the buffer is not empty */
{
temp0 = temp_fifo[last];
temp1 = temp_fifo[last + 1];
last += 2;
if (last == MAX_FIFO)
last = 0;
if (temp0 != temp1) !! sound the alarm;
}
}
Note: For this solution to be correct, it is necessary that the instruction last += 2
executes atomically.
This kind of solution is thus very sensitive to implementation details!
In practice, disabling interrupts during communications with interrupt routines is acceptable
in most situations. The more complex solutions are used only when disabling interrupts is
impossible or forbidden.
Interrupt latency
The delay between an interrupt request I and the end of execution of urgent operations in
an interrupt routine RI is called the response time, or latency of the interrupt.
This latency is influenced by four parameters:
1. The longest interval during which interrupts of priority greater than or equal to that of I are disabled.
2. The time needed for executing the interrupt routines with a higher priority than RI .
3. The maximum delay between an interrupt trigger and the branch to the corresponding
interrupt routine.
4. The time spent in RI before having executed the urgent operations.
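As a first-order illustration, the four parameters can be combined into an additive worst-case estimate. This is only a sketch: the structure and field names are assumptions, and the estimate presumes that the second field already accounts for every execution of a higher-priority routine that can occur while the request is pending.

```c
/* All durations are expressed in the same time unit. */
struct latency_params {
    unsigned longest_disabled;  /* 1. longest interval with interrupts disabled */
    unsigned higher_priority;   /* 2. time spent in higher-priority routines */
    unsigned dispatch_delay;    /* 3. delay between the trigger and the branch */
    unsigned urgent_part;       /* 4. time spent in the routine until the urgent
                                      operations are completed */
};

/* Additive estimate of the worst-case latency. */
unsigned latency_estimate(const struct latency_params *p)
{
    return p->longest_disabled + p->higher_priority
         + p->dispatch_delay + p->urgent_part;
}
```

For instance, with the hypothetical values 250, 300, 10 and 40, the estimate is 600 time units.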
Example
A system implements the following interrupt routines, sharing the same priority.

Name | Description             | Execution time | Period
I1   | Temperature measurement | 100 µs         | 500 µs
I2   | Timer expiration        | 200 µs         | 1000 µs
I3   | Network I/O             | 300 µs         | > 1000 µs

The main program disables interrupts during 200 µs and 250 µs, respectively, for exchanging
data with I1 and I2.
The time needed for triggering I3 and executing the corresponding urgent operations is
equal to 100 µs.
Answer:
It is sufficient to study the system during an interval of length equal to 1000 µs. The highest
possible latency is obtained with the following delays:
Notes:
Only the largest interval in which interrupts are disabled has to be taken into account!
[Figure: timeline graduated from 100 to 1000 showing the main program and the routines I1, I2 and I3, with the IRQ1 request and the enable() call.]
Chapter 4
Software architectures
void main(void)
{
for (;;)
{
if (!! task 1 is ready )
{
!! operations of task 1;
}
if (!! task 2 is ready )
{
!! operations of task 2 ;
}
..
.
if (!! task n is ready )
{
!! operations of task n;
}
}
}
Advantages:
Drawbacks:
The worst-case latency of an external request is equal to the execution time of the
entire main loop.
Example (multimeter):
#include "types.h"
#include "multimeter.h"
static UINT1 phase = 0; /* 0-3: display, 4: keyboard, 5: channels */
static UINT1 display_content[4];
static SINT4 measures[4];
static keyboard_state keys;
static multimeter_state parameters;
void main(void)
{
!! initialize global data ;
for (;;)
{
switch (phase)
{
case 4:
handle_keyboard();
if (keys.new_keypress)
{
keypress_action();
keys.new_keypress= 0;
}
break;
case 5:
handle_channels();
update_display_content();
break;
default:
handle_display();
}
if (++phase > 5)
phase = 0;
}
}
void handle_display(void)
{
UINT1 digit, segments;
!! PORTA: 4 digital outputs ;
!! PORTB: 8 digital outputs ;
digit
}
void handle_keyboard()
{
static UINT1 column = 0;
UINT1 row;
!! PORTA: 4 digital outputs ;
!! PORTB: 4 digital outputs (low nibble),
!!
4 digital inputs with pull-ups (high nibble) ;
out(PORTA, 0);
out(PORTB, 1 << column);
row = in(PORTB) >> 4;
!! update keys according to the content of row;
if (++column >= 4)
column = 0;
}
void keypress_action()
{
!! update parameters according to the key that has
!! been pressed (specified in keys) ;
}
void update_display_content()
{
!! update display_content according to the values in
!! measures and parameters;
}
void main(void)
{
for (;;)
{
if (ready1)
{
!! non-urgent operations of task 1;
ready1 = 0;
}
if (ready2)
{
!! non-urgent operations of task 2 ;
ready2 = 0;
}
..
.
if (readyn)
{
!! non-urgent operations of task n;
readyn = 0;
}
}
}
Advantage: The urgent operations take priority over the non-urgent ones.
[Figure: priority diagram comparing round-robin (Task 1, Task 2, ..., Task n) with the present architecture (Urgent 1, ..., Urgent n above Task 1, ..., Task n).]
Drawbacks:
The non-urgent tasks share the same effective priority. This yields high latencies when
at least one task has a large execution time (e.g., raster generation in laser printers).
Important note: Moving non-urgent operations from tasks to interrupt routines is
not a good solution!
Indeed,
performing non-urgent operations in an interrupt routine increases the latency of
interrupts with a lower or equal priority;
interrupts do not offer flexible synchronization mechanisms.
Data exchange operations between interrupt routines and tasks have to be correctly
implemented (cf. Chapter 3).
[Figure: a CPU connected to two UARTs.]
Principles:
Incoming bytes are signaled by interrupt requests, which must be answered as soon
as possible (before the next received byte).
When a UART is ready to send a byte on its output line, it requests an interrupt. The
processor is then free to wait for an arbitrarily long time before providing this byte.
Solution:
#include "types.h"
#include "fifo.h"
#include "filter.h"
static volatile BOOL uart1_ready, uart2_ready;
static volatile fifo rx1, tx1,
rx2, tx2;
interrupt void uart1_rx(void)
{
char byte;
byte = !! reception from UART1;
fifo_put(rx1, byte);
}
interrupt void uart2_rx(void)
{
char byte;
byte = !! reception from UART2 ;
fifo_put(rx2, byte);
}
interrupt void uart1_ready_to_send(void)
{
uart1_ready = 1;
}
interrupt void uart2_ready_to_send(void)
{
uart2_ready = 1;
}
void main(void)
{
!! initialize global data ;
!! initialize interrupt vectors ;
enable();
for (;;)
{
if (fifo_content_size(rx1) >= FILTER_THRESHOLD)
{
!! remove data from rx1;
!! filter ;
!! add the result to tx2;
}
if (fifo_content_size(rx2) >= FILTER_THRESHOLD)
{
!! remove data from rx2;
!! filter ;
!! add the result to tx1;
}
if (uart1_ready && !fifo_is_empty(tx1))
{
char byte;
byte = fifo_get(tx1);
disable();
!! send byte to UART1;
uart1_ready = 0;
enable();
}
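if (uart2_ready && !fifo_is_empty(tx2))
{
char byte;
byte = fifo_get(tx2);
disable();
!! send byte to UART2;
uart2_ready = 0;
enable();
}
}
}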
Notes:
Attempting to add data to a saturated FIFO buffer cannot be a blocking operation (i.e.,
it must instead discard data).
The functions for handling FIFO buffers must execute correctly both in the interrupt
routines and in the main code.
Example of implementation:
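The following sketch shows one way such FIFO functions might be written. It is not the content of fifo.h: the type fifo_sketch, the function names, the pointer-based interface and the capacity are all assumptions. It relies on having a single producer and a single consumer, and on index updates being atomic on the target processor.

```c
#include <stddef.h>

#define FIFO_SIZE 16   /* assumed; one slot always stays unused */

typedef struct {
    volatile unsigned char data[FIFO_SIZE];
    volatile size_t head;   /* next slot to write, modified only by the producer */
    volatile size_t tail;   /* next slot to read, modified only by the consumer */
} fifo_sketch;

static size_t next_index(size_t i)
{
    return (i + 1 == FIFO_SIZE) ? 0 : i + 1;
}

int fifo_sketch_is_empty(const fifo_sketch *f)
{
    return f->head == f->tail;
}

/* Never blocks: returns 0 and discards the byte if the buffer is saturated. */
int fifo_sketch_put(fifo_sketch *f, unsigned char byte)
{
    size_t next = next_index(f->head);

    if (next == f->tail)
        return 0;
    f->data[f->head] = byte;
    f->head = next;         /* the byte becomes visible only once it is stored */
    return 1;
}

/* Returns 1 and stores the oldest byte in *byte, or 0 if the buffer is empty. */
int fifo_sketch_get(fifo_sketch *f, unsigned char *byte)
{
    if (f->head == f->tail)
        return 0;
    *byte = f->data[f->tail];
    f->tail = next_index(f->tail);
    return 1;
}
```

Since each index is written by only one side, the same functions can be called from an interrupt routine and from the main code without disabling interrupts, as long as the assumptions above hold.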
In the same way as the round-robin with interrupts architecture, the operations are
partitioned into urgent and non-urgent tasks.
Interrupt routines perform urgent operations, and then place in a waiting queue
requests for executing non-urgent tasks.
The main program retrieves execution requests from the queue and calls the
corresponding functions. These requests are not necessarily processed in FIFO order.
(For instance, different selection priorities can be assigned to non-urgent tasks.)
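A minimal sketch of a waiting queue with selection priorities follows. It assumes that each non-urgent task has its own priority level, used as an index, and that one pending flag per task is enough (several requests for the same task are merged). All names are hypothetical, and the two demo tasks only record which one ran last.

```c
#include <stddef.h>

#define MAX_TASKS 8                      /* assumed upper bound */

typedef void (*task_fn)(void);

static task_fn task_table[MAX_TASKS];    /* index = priority, 0 is the highest */
static volatile unsigned char pending[MAX_TASKS];

void register_task(size_t priority, task_fn fn)
{
    task_table[priority] = fn;
}

/* Called by an interrupt routine once its urgent operations are done. */
void request_task(size_t priority)
{
    pending[priority] = 1;
}

/* Called by the main loop: runs the pending task that has the highest
   priority. Returns 1 if a task was executed, and 0 otherwise. */
int run_next_task(void)
{
    for (size_t p = 0; p < MAX_TASKS; p++) {
        if (pending[p] && task_table[p] != NULL) {
            pending[p] = 0;              /* cleared first, so that a new request
                                            issued during the task is kept */
            task_table[p]();
            return 1;
        }
    }
    return 0;
}

/* Two placeholder tasks used for demonstration purposes. */
static int last_run = -1;
static void demo_task_1(void) { last_run = 1; }
static void demo_task_3(void) { last_run = 3; }
```

If the tasks at priorities 3 and 1 are requested in that order, run_next_task() executes the task at priority 1 first, which shows that requests are not processed in FIFO order.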
Illustration:
#include "queue.h"
static volatile queue waiting_queue;
interrupt void urgent1(void)
{
!! urgent operations of task 1;
!! add task1 to waiting_queue;
}
interrupt void urgent2(void)
{
!! urgent operations of task 2;
!! add task2 to waiting_queue;
}
..
.
interrupt void urgentn(void)
{
!! urgent operations of task n;
!! add taskn to waiting_queue;
}
void main(void)
{
!! initialize waiting_queue with an empty content ;
for (;;)
{
while (!queue_is_empty(waiting_queue))
{
!! extract a function from waiting_queue;
!! execute this function;
}
}
}
void task1(void)
{
!! non-urgent operations of task 1;
}
void task2(void)
{
!! non-urgent operations of task 2 ;
}
..
.
void taskn(void)
{
!! non-urgent operations of task n;
}
Advantage: The latency of a non-urgent high-priority task can become smaller than the
execution time of all the non-urgent operations.
Drawbacks:
The maximum latency of a non-urgent task is still at least as large as the execution
time of the slowest task.
Urgent operations are performed by interrupt routines. Those are able to signal to
other tasks that non-urgent operations are ready to be carried out.
The non-urgent tasks are invoked dynamically rather than in a predefined order. The
responsibility of calling tasks is assigned to the operating system, implemented as an
additional software component.
The operating system is able to suspend the execution of a task before its completion,
in order to transfer the processor to another task.
The signals exchanged between tasks are handled by the operating system, instead of
being implemented with shared variables.
Illustration:
#include "signal.h"
interrupt void urgent1(void)
{
!! urgent operations of task 1;
!! send signal 1;
}
interrupt void urgent2(void)
{
!! urgent operations of task 2 ;
!! send signal 2 ;
}
..
.
void task1(void)
{
!! wait for signal 1;
!! non-urgent operations of task 1;
}
void task2(void)
{
!! wait for signal 2 ;
!! non-urgent operations of task 2 ;
}
..
.
void main(void)
{
!! initialize the operating system;
!! create and enable tasks;
!! start task sequencing ;
}
Advantages:
One can easily combine low-latency operations together with long computations.
[Figure: priority diagram comparing round-robin with interrupts and an operating system; labels: Urgent 1, Urgent 2, ..., Urgent n, Task 1, Task 2, ..., Task n.]
The system is efficient: When a non-urgent task is waiting for a signal, the processor
remains available for other computations.
The structure of the system is robust: New features can easily be added without
affecting the latencies of urgent operations or of high-priority tasks.
Drawbacks:
The system is complex (but this complexity is mainly located in the operating system,
which can be reused over many projects).
The operating system consumes some amount of system resources (a typical figure is
2 to 4 % of the instructions executed by the processor).
Summary
Task priorities and latencies:

Architecture                | Available priorities                                  | Maximum latency
round-robin                 | none                                                  |
round-robin with interrupts | interrupt routines; all tasks share the same priority |
waiting queue               | interrupt routines, then tasks                        | execution time of the longest task + interrupt routines
operating system            | interrupt routines, then tasks                        | execution time of interrupt routines
Architecture                | Robustness against modifications | Complexity
round-robin                 | poor                             | very simple
round-robin with interrupts |                                  |
waiting queue               | fair                             |
operating system            | very good                        | quite complex
Chapter 5
Introduction
An operating system (OS) is a software component responsible for coordinating the
concurrent execution of several tasks, by
Execution levels
At a given time, the instruction currently executed by the processor can either be
The processes
Each task managed by an OS is represented by a process. At a given time, a process is in
one out of five possible states:
Dormant: The task is not scheduled (e.g., because it is not yet known to the OS).
Executable: The task is ready to execute instructions, but is not currently running.
Active: The instructions of the task are now being executed by the processor.
Blocked: The execution of the task is suspended while waiting for a signal, or for a
resource to become available.
[Figure: state diagram of a process, with the states Dormant, Executable, Active, Blocked and Interrupted.]
The scheduler
The scheduler is the kernel component responsible for managing the state of the
processes, i.e., for assigning the processor to the processes.
Principles:
The scheduler always assigns the processor to the non-dormant and non-blocked task
that has the highest priority.
If several tasks share the highest priority, then the conflict can be solved in several ways:
The time slicing approach consists in assigning the processor in turn to each of these
tasks, in order to execute a bounded sequence of instructions.
One can alternatively assign the processor to a task that has been arbitrarily chosen.
Another solution is to forbid different tasks to share the same priority.
Note: With the first two strategies, computing the deadline of a task becomes difficult. Most
real-time operating systems thus implement the third solution.
Preemption
If a task T2 has a higher priority than the active task T1 and switches from the blocked to
the executable state, then there are two possible scheduling strategies:
The task T2 remains suspended (in executable state) until completion of T1. The
scheduler is said to be non-preemptive.
[Figure: timing diagram of T1, the interrupt routine and T2 with a non-preemptive scheduler.]
Drawback: The latency of a task is influenced by the behavior of tasks with a lower
priority.
The scheduler turns the task T1 executable, and assigns the processor to T2. The
scheduler is said to be preemptive.
[Figure: timing diagram of T1, the interrupt routine and T2 with a preemptive scheduler.]
Context switching
The scheduler performs a context switch when it transfers the processor from a process to
another.
Principles:
The suspended task must be able to resume its execution later. The state of the
processor thus has to be saved when the task is suspended.
The kernel memory maintains for each non-dormant process a context storage area
for this purpose.
Illustration:
[Figure: the kernel and the tasks T1, T2, ..., with their context storage areas.]
The working data of the suspended task has to be preserved until its execution can be
resumed.
This data is located on the runtime stack of the task, which contains
the context (return address, stack register values) of the active function calls, and
the arguments and local variables of these function calls.
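Example:
f(int a, int b)
{
    int c;
    ...
    c = g(a);
    ...
}
g(int d)
{
    int e, f;
    ...
    e = g(f);
    ...
}
...
[Figure: runtime stack holding the frame of the call to f (context; a, b; c) followed by the frames of two nested calls to g (context; d; e, f), with the marks B and SP.]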
Notes:
Since a task can be suspended at any time, it is necessary for each process to
manage its own stack.
In general the stack pointers (e.g., top of stack, base of current stack frame) are
particular processor registers. Those pointers are therefore saved, together with
the other registers, during a context switch.
The kernel also manages its own stack.
Reentrancy
With a preemptive scheduler, calling the same function in different tasks can be
problematic.
Example:
static int aux;
void swap(int *p1, int *p2)
{
aux = *p1;
*p1 = *p2;
*p2 = aux;
}
[Figure: interleaved execution of swap(&x1, &y1) and swap(&x2, &y2) in two tasks, tracing the values of aux, x1, y1, x2 and y2.]
Reentrant function:
void swap(int *p1, int *p2)
{
int aux;
aux = *p1;
*p1 = *p2;
*p2 = aux;
}
Non-reentrant function:
volatile int is_new;
void display(int v)
{
if (is_new)
{
printf(" %d", v);
is_new = 0;
}
else
printf(" ---");
}
The semaphores
A semaphore s is an object that
if at least one task is suspended as the result of an operation wait(s), make one
of them become executable;
Notes:
The operations that test and modify the value of a semaphore must be implemented
atomically.
Binary semaphores are semaphores with a value restricted to the set {0, 1}.
There are several possible strategies for selecting a task blocked on a semaphore in
order to make it executable again: arbitrary choice, FIFO policy, priorities, . . .
void task1(void)
{
    for (;;)
    {
        wait(s);
        !! critical section;
        signal(s);
        !! other operations ;
    }
}
void task2(void)
{
    for (;;)
    {
        wait(s);
        !! critical section;
        signal(s);
        !! other operations ;
    }
}
The maximum capacity of a queue (i.e., the maximum number of messages that have
been written and not yet read) and the size of each message are fixed.
Variants:
Several data access policies are possible: FIFO order, arbitrary selection, priority
mechanism.
Sending data to a saturated message queue can either discard the new message,
block the sender, block the sender during a bounded amount of time, . . .
When a task is blocked waiting for data from an empty queue, a maximum suspension
delay (i.e., a timeout) can be specified.
First rule:
Indeed, if this rule is not respected, then an interrupt routine can get suspended, which
amounts to assigning to this interrupt routine an effective priority smaller than the one
of a task.
Example:
[Figure: timing diagram of the tasks T1, T2 and T3 and of an interrupt routine; events: T1 is suspended, T1 is resumed, T1 becomes active, End of interrupt.]
Moreover, the interrupt routine might get called again before its completion. If this
routine is not reentrant, then erroneous behaviors are possible (e.g., overwriting a
saved processor context).
[Figure: timing diagram of the tasks T1 and T2 and of an interrupt routine; events: T1 is suspended, Reentrant call, End of interrupt, T1 is resumed, End of interrupt.]
Second rule:
If an interrupt routine calls an OS service that can lead to a context switch, then
the scheduler must be informed that this service call is performed inside an
interrupt routine.
If this rule is not respected, then the scheduler can suspend the execution of an interrupt
routine.
Example:
[Figure: timing diagram of the tasks T1 and T2 and of an interrupt routine; events: T1 is preempted, End of interrupt.]
Solution: At the beginning and at the end of each interrupt routine programmed by the
user, special OS services must be called in order to inform the kernel that the processor is
currently executing an interrupt routine.
Notes:
In the case of many levels (i.e., priorities) of interrupts, those services must handle
correctly nested interrupt routine calls.
Interrupt latencies are increased by the time needed for executing the notification
services.
Example:
[Figure: timing diagram of the tasks T1 and T2, the interrupt routine and the kernel; events: Interrupt request, Possible preemption, T1 is preempted, Context switch, End of interrupt.]
Note: An interrupt routine containing explicit instructions for saving the processor state
must perform the corresponding restore operations before informing the kernel that the
interrupt routine is about to end.
Time-oriented services
The real-time operating systems offer timed services, for instance for suspending a task for
a predefined amount of time.
Principles:
Note: The precision is limited. Asking to suspend the task during k ticks only ensures that
the suspension delay is greater than or equal to (k - 1) times the clock period.
Example:
[Figure: timing diagram of the periodic interrupt, the higher-priority tasks and a timed task calling delay(1) twice.]
Chapter 6
The following operations need to be efficient (i.e., ideally, to have a maximum execution
time that is independent from the number of tasks managed by the operating system):
Identifying the executable process with the highest priority in order to make it
active.
Performing a context switch.
Selecting the process that has to be unblocked following an operation over a
communication object.
The context of the task, i.e., the state of the processor, saved when the task was last
suspended.
Notes:
This context contains, in particular, a pointer to the runtime stack of the process.
Some operating systems (e.g., µC/OS-III) save the bulk of the context on this stack.
Pointers linking the TCB to the global data structures of the kernel.
Auxiliary data aimed at speeding up some operations (e.g., values derived from the
task priority).
Additional information managed by the user (e.g., configuration data for a peripheral
controlled by the task).
A structure for identifying efficiently the executable process with the highest priority.
An index for accessing directly a task control block from its corresponding process
identifier.
Example: µC/OS-III
The set of processes suspended for some delay is represented by a hash table
OSCfg_TickWheel[], indexed by their deadline.
Managing a list of free TCBs is avoided by letting the user code allocate TCBs when
tasks are created.
A global counter OSIntNestingCtr keeps track of the current interrupt nesting level.
[Figure: kernel data structures: the priority table OSPrioTbl[], the ready list OSRdyList[] with entries 0, 1, 2, ..., 14, ..., 42, the OS_TCB blocks of tasks 14-1, 14-2 and 42, the pointer OSTCBCurPtr, and the stack of task 42.]
Note: Identifying quickly the executable process with the highest priority is achieved by
The scheduler
The scheduler is implemented by a kernel function called after each operation that can
potentially influence the state of processes:
This function must be kept simple and efficient, and only performs the following operations:
1. Checking whether the scheduler is allowed to preempt tasks.
2. Identifying the executable task with the highest priority.
3. Performing a context switch in order to assign the processor to this task.
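The second step can be kept cheap by recording the ready priorities in a bitmap, as in the sketch below. This is an illustration under assumptions and not the code of a specific kernel: it assumes 64 priority levels, with priority 0 being the highest, and all names are hypothetical.

```c
#include <stdint.h>

#define NUM_PRIORITIES 64   /* assumed; priority 0 is the highest */

static uint32_t ready_bits[NUM_PRIORITIES / 32];

void mark_ready(unsigned prio)
{
    ready_bits[prio / 32] |= (UINT32_C(1) << (prio % 32));
}

void mark_not_ready(unsigned prio)
{
    ready_bits[prio / 32] &= ~(UINT32_C(1) << (prio % 32));
}

/* Returns the highest ready priority (smallest number), or -1 if no
   priority level is ready. */
int highest_ready(void)
{
    for (unsigned w = 0; w < NUM_PRIORITIES / 32; w++) {
        uint32_t bits = ready_bits[w];

        if (bits != 0) {
            unsigned pos = 0;

            while ((bits & 1u) == 0) {   /* at most 31 iterations */
                bits >>= 1;
                pos++;
            }
            return (int)(w * 32 + pos);
        }
    }
    return -1;
}
```

The cost of highest_ready() is bounded by the number of priority levels and does not depend on the number of tasks. On processors that offer a bit-scan instruction, the inner loop can be replaced by that instruction.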
preempting the current task should be prevented inside interrupt routines (cf.
Chapter 5).
Context switching
The main issue for implementing context switching is to be able to save and restore all the
processor registers.
For many processors, these operations are automatically performed (totally or in part)
during interrupts:
When an interrupt routine is called: The current value of the registers is saved on the
runtime stack of the interrupted task.
When an interrupt routine returns: The values extracted from the current stack are
loaded into the processor registers.
Notes:
This approach avoids the need to store the entire state of the processor into a TCB.
Preserving the state of the processor between a kernel service call and the
subsequent context switch can be tricky.
The user can sometimes define a hook function that will be called at every context
switch (a typical application is to put a peripheral in sleep mode).
The runtime stack of a new process is allocated by the task that asks this process to
be created.
The initial processor context of a new task, including its entry point, is built when its
stack is initialized.
One must take care of removing references to a task that is destroyed from the
structures managing communication objects.
The scheduler can be more efficient, since it does not have to check whether there
exists at least one executable task.
In the case of a mobile system, an idle task can put the processor and some
peripherals in sleep mode in order to conserve energy.
Time management
Quantitative time management is performed by the clock interrupt routine.
Principles: At each clock tick:
One computes the set of suspended tasks that must become executable again.
One increments a counter aimed at measuring elapsed time.
Notes:
The maximum execution time of the clock interrupt routine depends on the mechanism
used for waking up tasks.
With the help of suitable data structures, this time can become proportional to the number of
tasks becoming executable.
It is often necessary to configure and calibrate the clock interrupt source during
initialization.
Example: µC/OS-III
The time management operations are not directly performed in the clock interrupt
routine, but in an internal task OS_TickTask() woken up by this routine.
Advantage: Some user-defined tasks can have a higher priority than the time
management operations.
The deadline of the tasks suspended with a timeout is expressed with respect to a global
clock counter.
A hash table OSCfg_TickWheel[] stores pointers to the TCBs of those tasks, indexed
by their deadline. Tasks sharing the same table entries are sorted in increasing
deadline order.
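The idea behind a table such as OSCfg_TickWheel[] can be sketched as follows. This is a simplified illustration and not the actual kernel code: the structure timed_task, the table size and the function names are assumptions, the clock counter is assumed not to wrap around, and wheel_pop_expired() is assumed to be called at every tick.

```c
#include <stddef.h>

#define WHEEL_SIZE 8            /* assumed number of table entries */

struct timed_task {
    unsigned long deadline;     /* value of the clock counter at wake-up time */
    struct timed_task *next;
};

static struct timed_task *wheel[WHEEL_SIZE];

/* Inserts t in the entry selected by its deadline, keeping the tasks of
   this entry sorted in increasing deadline order. */
void wheel_insert(struct timed_task *t)
{
    struct timed_task **link = &wheel[t->deadline % WHEEL_SIZE];

    while (*link != NULL && (*link)->deadline <= t->deadline)
        link = &(*link)->next;
    t->next = *link;
    *link = t;
}

/* Removes and returns one task whose deadline is equal to now, or returns
   NULL if there is none. Only the head of one entry is inspected. */
struct timed_task *wheel_pop_expired(unsigned long now)
{
    struct timed_task **head = &wheel[now % WHEEL_SIZE];
    struct timed_task *t = *head;

    if (t == NULL || t->deadline != now)
        return NULL;
    *head = t->next;
    return t;
}
```

Because each entry is sorted, the work done at a given tick is limited to the tasks that actually become executable, plus one comparison.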
The operations of the host operating system are suspended when the real-time OS is
started, and get the processor back when the RTOS terminates (e.g., µC/OS-III).
The host operating system is seen as a special task that has a lower priority than all the
tasks managed by the real-time OS (e.g., RTAI).
For implementing this approach, it is necessary to ensure that the host OS can never
disable the interrupts managed by the kernel or the real-time tasks.
Chapter 7
Scheduling problems
Priority inversion
Priority inversion happens when a task is suspended waiting for a resource controlled by
another task with a lower priority.
Example:
[Figure: timing diagram of the tasks T1, T2 and T3; events: wait(s), T1 is preempted, T2 is preempted, T3 is suspended, wait(s), T2 terminates, T3 is resumed, signal(s).]
Problem:
In such a situation, the effective priority of T3 becomes equal to the one of T1.
Solution:
The priority of T1 can be momentarily increased (becoming equal to that of T3) during all
the time that T3 is suspended waiting for the semaphore acquired by T1.
This priority inheritance mechanism is automatically applied by some operating systems.
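The bookkeeping behind priority inheritance can be sketched as follows. The sketch is deliberately reduced: it handles a single semaphore, does not manage the list of blocked tasks, and assumes that a larger number denotes a higher priority. The types pi_task and pi_semaphore and the function names are hypothetical.

```c
#include <stddef.h>

struct pi_task {
    int base_priority;        /* larger value = higher priority (assumption) */
    int effective_priority;
};

struct pi_semaphore {
    struct pi_task *owner;    /* NULL when the semaphore is free */
};

/* Returns 1 if t acquires the semaphore, and 0 if t has to block. In the
   latter case, the owner inherits the priority of t if it is higher. */
int pi_wait(struct pi_semaphore *s, struct pi_task *t)
{
    if (s->owner == NULL) {
        s->owner = t;
        return 1;
    }
    if (t->effective_priority > s->owner->effective_priority)
        s->owner->effective_priority = t->effective_priority;
    return 0;
}

/* Releases the semaphore and restores the base priority of its owner. */
void pi_signal(struct pi_semaphore *s)
{
    if (s->owner != NULL) {
        s->owner->effective_priority = s->owner->base_priority;
        s->owner = NULL;
    }
}
```

With a low-priority task of priority 1 owning the semaphore and a task of priority 3 calling pi_wait(), the owner runs at priority 3 until it calls pi_signal().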
Illustration:
[Figure: timing diagram of the tasks T1, T2 and T3 with priority inheritance; events: wait(s), wait(s), signal(s); annotation: Priority = 3.]
Definitions:
The response time of an execution request for τ_i is the delay between this request and
the end of the corresponding execution of this task.
A critical instant for the task τ_i is an occurrence of an execution request for τ_i that
leads to the largest possible response time for this task.
A critical zone for τ_i is an interval of duration T_i that starts at a critical instant (for τ_i).
Theorem 1: A critical instant for τ_i occurs when an execution request for this task coincides
with requests for executing all the tasks that have a higher priority than τ_i.
Proof: Assume that an execution request for τ_i occurs at t = t_1, and that an execution
request for a higher-priority task τ_j is received at t = t_2.
[Figure: time axis marking t_1, t_2, t_2 + C_j, t_2 + T_j and t_1 + T_i, with the interval of length T_i.]
Advancing the request for τ_j from t_2 to t_1 can never decrease the response time of τ_i.
The same reasoning can be applied to all the tasks that have a higher priority than τ_i.
Schedulable tasks
Definition: A set of tasks is schedulable (with respect to a given assignment of priorities) if
the response time of each task τ_i is always less than or equal to its period T_i.
Thanks to Theorem 1, checking whether a given set of tasks is schedulable reduces to
simulating the scheduling strategy in the particular case of simultaneous execution
requests for all tasks at t = 0.
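Such a simulation can be sketched in a few lines. The sketch below assumes a preemptive scheduler with fixed priorities, integer periods and execution times of at least one time unit, and a task array sorted by decreasing priority. The names periodic_task and is_schedulable are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8             /* assumed upper bound */

struct periodic_task {
    int period;                 /* period of the task, in time units */
    int exec_time;              /* execution time of the task, in time units */
};

/* tasks[0] has the highest priority and tasks[n - 1] the lowest one, with
   n <= MAX_TASKS. All tasks are requested simultaneously at t = 0. */
bool is_schedulable(const struct periodic_task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int remaining[MAX_TASKS] = {0};
        bool done = false;

        remaining[i] = tasks[i].exec_time;      /* request for task i at t = 0 */
        for (int t = 0; t < tasks[i].period && !done; t++) {
            /* New requests for the tasks with a higher priority. */
            for (size_t j = 0; j < i; j++)
                if (t % tasks[j].period == 0)
                    remaining[j] += tasks[j].exec_time;
            /* The highest-priority task with work left runs for one unit. */
            for (size_t j = 0; j <= i; j++) {
                if (remaining[j] > 0) {
                    remaining[j]--;
                    if (j == i && remaining[i] == 0)
                        done = true;            /* response time is t + 1 */
                    break;
                }
            }
        }
        if (!done)
            return false;       /* task i misses the end of its period */
    }
    return true;
}
```

For two tasks with periods 2 and 5 and execution times 1 and 2, the function returns true when the task with period 2 comes first in the array, and false when the order is reversed.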
Examples: Consider two tasks τ_1 and τ_2, with T_1 = 2, T_2 = 5, C_1 = 1 and C_2 = 1.
[Figure: schedule of the two tasks over the critical zone, for t from 0 to 5.]
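The tasks are schedulable, and remain schedulable even if the execution time of τ_2 is
increased by one time unit (C_2 = 2):
[Figure: schedule of the two tasks over the critical zone, for t from 0 to 5.]
[Figure: schedule of the two tasks over the critical zone, for t from 0 to 5.]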
Rate-Monotonic Scheduling
In the previous example, the best strategy was to assign the highest priority to the task that
has the smallest period.
Definition: Given a set of tasks $\tau_1, \tau_2, \ldots, \tau_n$ with respective periods $T_1, T_2, \ldots, T_n$, the
Rate-Monotonic Scheduling (RMS) strategy consists in assigning distinct priorities
$P_1, P_2, \ldots, P_n$ to the tasks, such that for all $i, j$:
\[ T_i < T_j \;\Rightarrow\; P_i > P_j . \]
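As an illustration of this definition, the following Python sketch assigns priorities by sorting the tasks by period. The function name `rms_priorities`, the convention that a larger number means a higher priority, and the tie-breaking by task index are assumptions made for the example.

```python
def rms_priorities(periods):
    """Return one distinct priority per task (larger number = higher priority)."""
    n = len(periods)
    # Task indices sorted by increasing period; ties are broken by index.
    order = sorted(range(n), key=lambda i: (periods[i], i))
    priorities = [0] * n
    for rank, i in enumerate(order):
        priorities[i] = n - rank          # the smallest period gets priority n
    return priorities


print(rms_priorities([5, 2, 7]))
```

For periods 5, 2 and 7, the task with period 2 receives the highest priority and the task with period 7 the lowest.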
The following result establishes that the RMS strategy is optimal:
Theorem 2: If a set of tasks is schedulable with respect to some priority assignment, then
it is also schedulable with respect to the priorities defined by the RMS strategy.
Proof: Consider a set of tasks $\tau_1, \tau_2, \ldots, \tau_n$ for which there exists a priority assignment
$P_1, P_2, \ldots, P_n$ that makes them schedulable.
Let $\tau_i$ and $\tau_j$ be two tasks with adjacent priorities $P_i$ and $P_j$, such that $P_i > P_j$.
If $T_i > T_j$, then the priorities of $\tau_i$ and $\tau_j$ can be swapped:
[Figure: executions of $\tau_i$ and $\tau_j$ on a time axis marked with 0, $T_j$ and $T_i$.]
Processor load factor:
\[ U = \sum_{i=1}^{n} \frac{C_i}{T_i} \]
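Notes:
A set of tasks that has a processor load factor less than 1 is not necessarily
schedulable:
Example:
$\tau_1$ : $T_1 = 5$, $C_1 = 2$
$\tau_2$ : $T_2 = 7$, $C_2 = 4$
\[ U = \frac{2}{5} + \frac{4}{7} \approx 97\% \]
[Figure: schedule of $\tau_1$ and $\tau_2$ up to $t = 7$.]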
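The arithmetic behind this example can be checked with a short Python sketch. It assumes that $\tau_1$ has the highest priority (as RMS prescribes, since $T_1 < T_2$) and that both tasks are requested at t = 0; the variable names are chosen for the illustration only.

```python
from fractions import Fraction

T1, C1 = 5, 2
T2, C2 = 7, 4

# Processor load factor U = C1/T1 + C2/T2, computed exactly.
U = Fraction(C1, T1) + Fraction(C2, T2)

# Time taken by tau_1 inside the critical zone [0, T2) of tau_2:
# complete periods of tau_1, plus the part of its last request that fits.
full_periods, rest = divmod(T2, T1)
busy = full_periods * C1 + min(C1, rest)
available = T2 - busy                     # time left for tau_2 before t = T2

print(U, float(U))
print(available, available >= C2)
```

Only 3 time units remain for $\tau_2$ before t = 7, which is less than $C_2 = 4$, even though the load factor is 34/35.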
[Figure: scale of processor load factors starting at 0%, with the bound $U_L$ marked.]
The best lower bound $U_L$ on the processor load factor of the sets of tasks that fully use the
processor is such that:
If the processor load factor of a set of tasks is less than or equal to $U_L$, then this set of
tasks is schedulable (regardless of the periods and execution times of the tasks!).
If the processor load factor of a set of tasks is greater than $U_L$, then this set of tasks
may or may not be schedulable, depending on the details of the tasks.
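[Figure: executions of $\tau_1$ and $\tau_2$ on a time axis marked with $T_1$ and $T_2$; labels $C_1$, $C_2$ and $\lceil T_2/T_1 \rceil$.]
\[ C_1 \le T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor . \]
In this case, the largest possible value of $C_2$ is
\[ C_2 = T_2 - C_1 \left\lceil \frac{T_2}{T_1} \right\rceil . \]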
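It follows that the highest possible processor load factor is equal to
\[ U = \frac{C_1}{T_1} + \frac{C_2}{T_2} = 1 + C_1 \left( \frac{1}{T_1} - \frac{1}{T_2} \left\lceil \frac{T_2}{T_1} \right\rceil \right). \]
We have
\[ \frac{1}{T_1} - \frac{1}{T_2} \left\lceil \frac{T_2}{T_1} \right\rceil \le 0. \]
Therefore, for given values of $T_1$ and $T_2$, the maximum processor load factor
decreases with $C_1$.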
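[Figure: plot of $U$ as a function of $C_1$, starting from 1 and following $1 + C_1 \left( \frac{1}{T_1} - \frac{1}{T_2} \lceil T_2/T_1 \rceil \right)$, for $C_1$ between 0 and $T_2 - T_1 \lfloor T_2/T_1 \rfloor$.]
[Figure: executions of $\tau_1$ and $\tau_2$ on a time axis marked with $T_1$ and $T_2$; labels $C_1$ and $C_2$.]
\[ C_1 > T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor . \]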
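For a given value of $C_1$, the largest possible value of $C_2$ is given by
\[ C_2 = (T_1 - C_1) \left\lfloor \frac{T_2}{T_1} \right\rfloor . \]
The processor load factor then becomes
\[ U = \frac{T_1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor + C_1 \left( \frac{1}{T_1} - \frac{1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor \right). \]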
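For given values of $T_1$ and $T_2$, this expression increases with $C_1$, since
\[ \frac{1}{T_1} - \frac{1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor \ge 0. \]
[Figure: plot of $U$ as a function of $C_1$, following $\frac{T_1}{T_2} \lfloor T_2/T_1 \rfloor + C_1 \left( \frac{1}{T_1} - \frac{1}{T_2} \lfloor T_2/T_1 \rfloor \right)$, with the abscissas 0, $T_2 - T_1 \lfloor T_2/T_1 \rfloor$ and $T_1$ marked.]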
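Summary:
[Figure: plot of $U$ as a function of $C_1$, with the expression $\frac{T_1}{T_2} \lfloor T_2/T_1 \rfloor + C_1 \left( \frac{1}{T_1} - \frac{1}{T_2} \lfloor T_2/T_1 \rfloor \right)$ and the abscissas 0, $T_2 - T_1 \lfloor T_2/T_1 \rfloor$ and $T_1$ marked.]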
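The smallest value of $U$ corresponds to the boundary between the two cases, where we
have
\[ C_1 = T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor . \]
By introducing this value in the expression of $U$, one obtains
\[
U = \frac{T_1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor
    + \left( T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor \right)
      \left( \frac{1}{T_1} - \frac{1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor \right)
  = \frac{T_1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor + \frac{T_2}{T_1}
    - 2 \left\lfloor \frac{T_2}{T_1} \right\rfloor
    + \frac{T_1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor^2 .
\]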
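Let us define $I = \left\lfloor \frac{T_2}{T_1} \right\rfloor$ and $f = \frac{T_2}{T_1} - \left\lfloor \frac{T_2}{T_1} \right\rfloor$.
\[ U = \frac{I}{I + f} + (I + f) - 2I + \frac{I^2}{I + f} = 1 - f \, \frac{1 - f}{I + f} . \]
The smallest possible value of $U$ is obtained with $I = 1$. We then have
\[ U = 1 - f \, \frac{1 - f}{1 + f} , \]
and
\[ \frac{dU}{df} = \frac{f^2 + 2f - 1}{(1 + f)^2} . \]
This derivative vanishes for $f = \sqrt{2} - 1$, which gives
\[ U_L = 1 - (\sqrt{2} - 1) \, \frac{2 - \sqrt{2}}{\sqrt{2}} = 2(\sqrt{2} - 1) \approx 0.83. \]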
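The value of this minimum can be cross-checked numerically. The sketch below is only a sanity check under stated assumptions: it evaluates $U = 1 - f(1 - f)/(I + f)$ on a regular grid of values of $f$ in $[0, 1)$ with $I = 1$; the function name `load_factor` and the grid resolution are choices made for the illustration.

```python
from math import sqrt


def load_factor(I, f):
    """U = 1 - f (1 - f) / (I + f), with I >= 1 and 0 <= f < 1."""
    return 1 - f * (1 - f) / (I + f)


# Grid search over f in [0, 1) for I = 1.
steps = 100_000
best_f = min((k / steps for k in range(steps)), key=lambda f: load_factor(1, f))

print(best_f, load_factor(1, best_f))
print(sqrt(2) - 1, 2 * (sqrt(2) - 1))     # closed-form minimiser and minimum
```

The grid minimum lies next to $f = \sqrt{2} - 1$ and the corresponding load factor agrees with $2(\sqrt{2} - 1)$ to within the grid resolution.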
This sufficient criterion is independent of the periods and execution times of the
tasks.
[...] ($\frac{C_1}{T_1} + \frac{C_2}{T_2} \le 1$ !) are thus [...]
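$U_L$: Case of n tasks
The goal is now to compute the value of $U_L$
\[
\begin{aligned}
C_1 &= T_2 - T_1, \\
C_2 &= T_3 - T_2, \\
&\;\;\vdots \\
C_{n-1} &= T_n - T_{n-1}, \\
C_n &= T_n - 2(C_1 + C_2 + \cdots + C_{n-1}) \\
    &= 2T_1 - T_n .
\end{aligned}
\]
[Figure: executions of durations $C_1, C_2, \ldots, C_{n-1}, C_n$ on a time axis marked with $T_1, T_2, T_3, \ldots, T_{n-1}, T_n$.]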
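If $C_1 = T_2 - T_1 + \Delta$, with $\Delta > 0$.
We modify the execution time of tasks in the following way:
\[
\begin{aligned}
C_1' &= C_1 - \Delta, \\
C_2' &= C_2 + \Delta, \\
C_3' &= C_3, \\
&\;\;\vdots \\
C_{n-1}' &= C_{n-1}, \\
C_n' &= C_n .
\end{aligned}
\]
[Figure: executions before ($C_1$, $C_2$) and after ($C_1'$, $C_2'$) the modification, on a time axis marked with $T_1$ and $T_2$.]
After the modification, the new set of tasks still fully uses the processor. However, the
processor load factor now becomes
\[ U' = U - \frac{\Delta}{T_1} + \frac{\Delta}{T_2} < U . \]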
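If $C_1 = T_2 - T_1 - \Delta$, with $\Delta > 0$.
We now modify the execution time of tasks as follows:
\[
\begin{aligned}
C_1'' &= C_1 + \Delta, \\
C_2'' &= C_2, \\
C_3'' &= C_3, \\
&\;\;\vdots \\
C_{n-1}'' &= C_{n-1}, \\
C_n'' &= C_n - 2\Delta .
\end{aligned}
\]
[Figure: executions before ($C_1$, $C_2$) and after ($C_1''$, $C_2''$) the modification, on a time axis marked with $T_1$ and $T_2$.]
The resulting set of tasks fully uses the processor. The processor load factor becomes
\[ U' = U + \frac{\Delta}{T_1} - \frac{2\Delta}{T_n} . \]
Since we have by hypothesis $T_n < 2T_1$, this gives $U' < U$, which contradicts the minimality of $U$.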
\[
\begin{aligned}
C_2 &= T_3 - T_2, \\
C_3 &= T_4 - T_3, \\
&\;\;\vdots \\
C_{n-1} &= T_n - T_{n-1} .
\end{aligned}
\]
\[ C_n = T_n - 2(C_1 + C_2 + \cdots + C_{n-1}) . \]
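Corollary: For each set of tasks that satisfies the hypotheses of Lemma 1, the processor
load factor is equal to
\[
U = \frac{T_2 - T_1}{T_1} + \frac{T_3 - T_2}{T_2} + \cdots + \frac{T_n - T_{n-1}}{T_{n-1}} + \frac{2T_1 - T_n}{T_n}
  = \frac{T_2}{T_1} + \frac{T_3}{T_2} + \cdots + \frac{T_n}{T_{n-1}} + 2\,\frac{T_1}{T_n} - n .
\]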
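For each $i = 1, 2, \ldots, n - 1$, let us define $q_i = \frac{T_{i+1}}{T_i}$. We then have
\[ U = q_1 + q_2 + \cdots + q_{n-1} + \frac{2}{q_1 q_2 \cdots q_{n-1}} - n , \]
and thus for each $i$,
\[ \frac{\partial U}{\partial q_i} = 1 - 2 \, \frac{q_1 q_2 \cdots q_{i-1} q_{i+1} \cdots q_{n-1}}{(q_1 q_2 \cdots q_{n-1})^2} . \]
The best lower bound $U_L$ of $U$ therefore corresponds to
\[ \frac{\partial U}{\partial q_i} = 0 , \]
that is,
\[ 1 - \frac{1}{q_i} \, \frac{2}{q_1 q_2 \cdots q_{n-1}} = 0 . \]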
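\[ q_i = \frac{2}{q_1 q_2 \cdots q_{n-1}} , \]
hence
\[ q_1 = q_2 = \cdots = q_{n-1} = 2^{1/n} . \]
One then obtains
\[
U_L = (n - 1)\, 2^{1/n} + \frac{2}{2^{(n-1)/n}} - n
    = (n - 1)\, 2^{1/n} + 2^{1/n} - n
    = n \left( 2^{1/n} - 1 \right) .
\]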
We thus have the following result:
Theorem 4: If the periods $T_1, T_2, \ldots, T_n$ of a set of n tasks are such that [...]
$n \left( 2^{1/n} - 1 \right)$ [...]
In the hypotheses of Theorem 4, the constraint over the task periods is actually not
necessary:
Theorem 5: If a set of n periodic tasks has a processor load factor that is less than or equal
to $n \left( 2^{1/n} - 1 \right)$, then this set of tasks is schedulable.
Proof: Let $\tau_1, \tau_2, \ldots, \tau_n$ be tasks with respective periods and execution times $T_1, T_2, \ldots, T_n$
and $C_1, C_2, \ldots, C_n$. We assume that this set of tasks fully uses the processor.
If there exists $i \in \{1, 2, \ldots, n - 1\}$ such that $2T_i \le T_n$, then we define $q = \left\lfloor \frac{T_n}{T_i} \right\rfloor$ and
$r = T_n - qT_i$ (we thus have $q > 1$ and $r \ge 0$).
We replace $\tau_i$ by $\tau_i'$ with the period $T_i' = qT_i$ and the execution time $C_i' = C_i$.
We replace $\tau_n$ by $\tau_n'$, with the period $T_n' = T_n$ and an execution time $C_n'$ chosen so as to
fully use the processor.
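[Figure: executions of $\tau_i$ (duration $C_i$ in each period) on a time axis marked with 0, $T_i$, $2T_i$, ..., $(q - 1)T_i$, $qT_i = T_i'$ and $T_n$, compared with the single execution $C_i' = C_i$ of $\tau_i'$.]
In the critical zone of $\tau_n$, the amount of execution time used by $\tau_i$ and left unused by $\tau_i'$
is at most equal to $(q - 1)C_i$. Therefore, one has
\[ C_n' - C_n \le (q - 1) C_i . \]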
After modifying the set of tasks, the processor load factor $U'$ satisfies
\[ U' \le U + \frac{C_i'}{T_i'} - \frac{C_i}{T_i} + \frac{(q - 1) C_i}{T_n} , \]
where $U$ is the processor load factor of the initial set of tasks.
One then obtains
\[ U' \le U + C_i \left( \frac{1}{qT_i} - \frac{1}{T_i} + \frac{q - 1}{T_n} \right) . \]
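Since $qT_i \le T_n$, we have
\[
\frac{1}{qT_i} - \frac{1}{T_i} + \frac{q - 1}{T_n}
\;\le\; \frac{1}{qT_i} - \frac{1}{T_i} + \frac{q - 1}{qT_i}
\;=\; 0 .
\]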
As a consequence, we have $U' \le U$. This implies that our modification of the set of tasks
did not increase the processor load factor.
By repeatedly performing such a modification, one eventually obtains a set of tasks to
which Theorem 4 can be applied.
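The criterion of Theorem 5 translates directly into a short test. The following Python sketch is an illustration only; the names `rms_bound` and `passes_sufficient_test` and the example task sets are assumptions. The test is sufficient but not necessary: a set of tasks that fails it may still be schedulable.

```python
def rms_bound(n):
    """Bound n (2^(1/n) - 1) on the processor load factor for n tasks."""
    return n * (2 ** (1 / n) - 1)


def passes_sufficient_test(tasks):
    """`tasks` is a list of (period, execution_time) pairs."""
    load = sum(exec_time / period for period, exec_time in tasks)
    return load <= rms_bound(len(tasks))


print(passes_sufficient_test([(4, 1), (6, 1), (10, 2)]))   # load factor about 0.62
print(passes_sufficient_test([(5, 2), (7, 4)]))            # load factor about 0.97
```

The first set is guaranteed to be schedulable under RMS; for the second one the test is inconclusive and a finer analysis is needed.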
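\[ \frac{dU_L}{dn} = \left( 1 - \frac{\ln 2}{n} \right) 2^{1/n} - 1 = (1 - x)\, e^{x} - 1 , \]
by defining $x = \frac{\ln 2}{n}$.
We have
\[ (1 - x)\, e^{x} < 1 \]
for all $x > 0$ (which implies $\frac{dU_L}{dn} < 0$ for all $n > 0$).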
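For all $x > 0$, we have $e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, hence
\[
\begin{aligned}
(1 - x)\, e^{x} &= (1 - x) + (1 - x)\, x + (1 - x)\, \frac{x^2}{2!} + (1 - x)\, \frac{x^3}{3!} + \cdots \\
&= 1 - \left( 1 - \frac{1}{2!} \right) x^2 - \left( \frac{1}{2!} - \frac{1}{3!} \right) x^3 - \left( \frac{1}{3!} - \frac{1}{4!} \right) x^4 - \cdots \\
&< 1 .
\end{aligned}
\]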
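\[
\lim_{n \to \infty} n \left( 2^{1/n} - 1 \right)
= \lim_{n \to \infty} \frac{2^{1/n} - 1}{\frac{1}{n}}
= \lim_{n \to \infty} \frac{-\frac{\ln 2}{n^2} \, 2^{1/n}}{-\frac{1}{n^2}}
= \ln 2 \approx 0.69
\]
[Figure labeled $n \left( 2^{1/n} - 1 \right)$.]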
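The behaviour of the bound can be observed numerically. The short Python sketch below tabulates $n(2^{1/n} - 1)$ for a few values of n and compares it with $\ln 2$; the function name `rms_bound` and the chosen values of n are assumptions made for the illustration.

```python
from math import log


def rms_bound(n):
    """n (2^(1/n) - 1), the bound on the processor load factor for n tasks."""
    return n * (2 ** (1 / n) - 1)


for n in (1, 2, 3, 4, 10, 100, 1000):
    print(n, round(rms_bound(n), 4))

print("ln 2 =", round(log(2), 4))
```

The printed values decrease with n and approach $\ln 2$ from above, in agreement with the sign of the derivative and with the limit computed above.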
Notes
In situations where $U \le 69\%$ for the periodic tasks, the processor does not have to
remain unused during 31% of the time! One can instead run low-priority tasks that are
not bound by real-time constraints.
For some specific classes of sets of tasks, one can obtain $U_L = 100\%$, which guarantees
that every set of tasks for which $U \le 100\%$ is schedulable.
Example: Let $\tau_1, \tau_2, \ldots, \tau_n$ be a set of tasks with respective periods and execution
times $T_1, T_2, \ldots, T_n$ and $C_1, C_2, \ldots, C_n$, such that
$0 < T_1 \le T_2 \le \cdots \le T_n$,
$\forall i, j : i < j \Rightarrow T_j$ is an integer multiple of $T_i$,
\[ U = \sum_{i=1}^{n} \frac{C_i}{T_i} \le 1 . \]
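The critical zone of $\tau_2$ contains $\frac{T_2}{T_1}$ complete executions of $\tau_1$:
[Figure: time axis marked with 0, $T_1$, $2T_1$, ..., $(k - 1)T_1$, $kT_1$ and $T_2$.]
More generally, the critical zone of $\tau_j$ contains
$\frac{T_j}{T_1}$ complete executions of $\tau_1$,
...
$\frac{T_j}{T_{j-1}}$ complete executions of $\tau_{j-1}$.
The condition that must be satisfied in order to finish the execution of $\tau_j$ before the end
of its critical zone is thus
\[ C_j \le T_j - \frac{T_j}{T_1} C_1 - \frac{T_j}{T_2} C_2 - \cdots - \frac{T_j}{T_{j-1}} C_{j-1} , \]
that is,
\[ \frac{C_1}{T_1} + \frac{C_2}{T_2} + \cdots + \frac{C_j}{T_j} \le 1 , \]
which holds since $U \le 1$.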
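For this class of task sets, the schedulability check reduces to verifying the divisibility of the periods and the condition $U \le 1$. A minimal Python sketch follows; the function names and the example values are assumptions, and exact fractions are used to avoid rounding issues when $U$ is exactly 1.

```python
from fractions import Fraction


def periods_are_harmonic(periods):
    """True if each period is an integer multiple of every smaller one."""
    ordered = sorted(periods)
    # Divisibility between neighbours implies divisibility between all pairs.
    return all(b % a == 0 for a, b in zip(ordered, ordered[1:]))


def schedulable_with_harmonic_periods(tasks):
    """`tasks` is a list of (period, execution_time) pairs of positive integers."""
    if not periods_are_harmonic([period for period, _ in tasks]):
        raise ValueError("the criterion U <= 1 only applies to harmonic periods")
    return sum(Fraction(c, t) for t, c in tasks) <= 1


print(schedulable_with_harmonic_periods([(2, 1), (4, 1), (8, 2)]))   # U = 1
print(schedulable_with_harmonic_periods([(2, 1), (4, 1), (8, 3)]))   # U = 9/8
```

The first set uses the processor at exactly 100% and still passes; the second exceeds 100% and cannot be schedulable.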
Chapter 8
Introduction
In order to analyze the properties of a complex system, it is not always sufficient to study
the individual behavior of its components.
Example: An embedded system controlling a railroad crossing is composed of the following
elements:
Two sensors located on the tracks 1000 meters before and 100 meters after the
crossing, aimed at detecting (respectively) that a train approaches or has passed the
crossing.
A receiver that processes the signals emitted by the sensors, and sends orders to
open or close the gate.
[Figure: railroad crossing with the receiver and the two sensors, located 1000 m before and 100 m after the crossing.]
The speed of the approaching trains is between 48 and 52 m/s. Then, after reaching
the first sensor, their speed is reduced to a value between 40 and 52 m/s.
After it receives a signal from a sensor, the receiver waits for at most 5 seconds before
sending an order to close or to open the gate. During this delay, the receiver ignores
incoming signals.
The gate is closed (resp. open) when its angle is equal to 0 (resp. 90) deg. The gate is
able to move at the rate of 20 deg/s.
Question: Is the gate always closed when a train passes the crossing?
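A first, informal estimate can be obtained from the figures above. The Python sketch below only covers the simplest scenario, an isolated train approaching an open gate while the receiver is idle; the constant names are chosen for the illustration. It does not account for interactions between successive events (for instance signals ignored by the receiver during its delay), so it is not a verification of the property.

```python
SENSOR_DISTANCE = 1000.0      # m, first sensor to crossing
MAX_SPEED = 52.0              # m/s, fastest train after the first sensor
MAX_RECEIVER_DELAY = 5.0      # s, longest wait before sending an order
GATE_RATE = 20.0              # deg/s
GATE_TRAVEL = 90.0            # deg, from open (90) to closed (0)

# Shortest time for a train to travel from the first sensor to the crossing.
min_travel_time = SENSOR_DISTANCE / MAX_SPEED

# Longest time between the sensor signal and the gate being fully closed.
max_closing_time = MAX_RECEIVER_DELAY + GATE_TRAVEL / GATE_RATE

print(round(min_travel_time, 2), max_closing_time)
print(max_closing_time < min_travel_time)
```

In this isolated scenario the gate needs at most 9.5 s to close, while the train needs more than 19 s to reach the crossing. Answering the question in general requires analyzing all the possible behaviors of the combined system.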
Modeling a system
In order to analyze the properties of a system, the first step consists in building a model,
i.e., an abstract representation of the system that describes its relevant properties without
any ambiguity.
For embedded applications, the modeling formalism must be able to express
Hybrid systems
Hybrid systems are a modeling formalism that meets those requirements.
Syntax:
A hybrid system is composed of:
An activity dif(v), expressed as a conjunction of linear constraints over the variables
$x_1, x_2, \ldots, x_n$ and their first temporal derivatives $\dot{x}_1, \dot{x}_2, \ldots, \dot{x}_n$.
An invariant inv (v), expressed as a conjunction of linear constraints over the variables
x1, x2, . . . , xn.
A guard guard (e), that represents a condition that must be satisfied in order to enable
this transition.
An action act (e), composed of constraints that specify how the values of the variables
are modified when this transition is followed.
In practice, the guard and the action can be combined into a conjunction of constraints
over the values of the variables before ($x_1, x_2, x_3, \ldots$) and after ($x_1', x_2', x_3', \ldots$)
following the transition.
An optional label sync(e) $\in L$ that makes it possible to synchronize this transition with
a transition belonging to another process.
Finally, one defines an initial control location for each process, and assigns a set of
possible initial values for each variable, specified as a conjunction of linear constraints.
Example: Process modeling the behavior of a train and the two sensors.
The distance between the train and the crossing is represented by a variable x1.
The signals emitted by the sensors are modeled by two synchronization labels app
and exit.
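[Figure: train process.
Initially, $x_1 \ge 1500$.
Location [1]: invariant $x_1 \ge 1000$, activity $-52 \le \dot{x}_1 \le -48$.
Transition from [1] to [2]: guard $x_1 = 1000$, label app.
Location [2]: invariant $0 \le x_1 \le 1000$, activity $-52 \le \dot{x}_1 \le -40$.
Transition from [2] to [3]: guard $x_1 = 0$.
Location [3]: invariant $x_1 \ge -100$, activity $-52 \le \dot{x}_1 \le -40$.
Transition from [3] to [1]: guard $x_1 = -100$, label exit, action $x_1' \ge 1500$.]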
The delay between receiving a sensor signal and sending an order to the gate is
represented by a variable x2.
The labels raise and lower model the orders sent to the gate.
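[Figure: receiver process with control locations [1], [2] and [3]. Locations [2] and [3] carry the constraints $0 \le x_2 \le 5$ and $\dot{x}_2 = 1$. Transitions are labeled app, exit, lower and raise, with the action $x_2' = 0$ on transitions labeled app and exit.]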
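[Figure: gate process over the variable $x_3$, with control locations [1] ($x_3 = 90$), [2] ($0 \le x_3 \le 90$), [3] ($x_3 = 0$) and [4] ($0 \le x_3 \le 90$). In locations [2] and [4], $x_3$ changes at a rate of 20. Transitions are labeled raise and lower.]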
Semantics:
At any given time, the current state of a hybrid system is characterized by
By letting time elapse (time steps). The control locations of processes stay
unchanged, and the values of the variables evolve according to the invariants and
activities associated to these locations.
In both cases, a transition can only be followed provided that its guard is satisfied by
the current variable values.
When a transition is followed, the variable values are modified according to the action
associated to the transition. The invariant of the destination location must be satisfied
by the new variable values (otherwise, the transition cannot be followed).
A state s2 is reachable from a state s1 if there exists a finite sequence of time steps and
transition steps that lead from s1 to s2.
A state s is reachable if it is reachable from an initial state.
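The rule for following a transition (guard satisfied, action applied, invariant of the destination satisfied) can be sketched in Python as follows. Everything in this sketch is a simplifying assumption made for illustration: the class and function names are hypothetical, guards and invariants are plain Python predicates, and the action is a deterministic update rather than a general constraint. The two locations loosely mimic a receiver that starts a delay measured by x2.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Valuation = Dict[str, float]


@dataclass
class Location:
    name: str
    invariant: Callable[[Valuation], bool]


@dataclass
class Transition:
    source: str
    target: str
    guard: Callable[[Valuation], bool]
    action: Callable[[Valuation], Valuation]   # simplification: deterministic


def follow(transition, locations, location, values):
    """Return the new (location, values), or None if the step is not allowed."""
    if location != transition.source or not transition.guard(values):
        return None
    new_values = transition.action(values)
    if not locations[transition.target].invariant(new_values):
        return None                            # destination invariant violated
    return transition.target, new_values


locations = {
    "idle": Location("idle", lambda v: True),
    "waiting": Location("waiting", lambda v: 0 <= v["x2"] <= 5),
}
start_delay = Transition("idle", "waiting",
                         guard=lambda v: True,
                         action=lambda v: {**v, "x2": 0.0})
bad_reset = Transition("idle", "waiting",
                       guard=lambda v: True,
                       action=lambda v: {**v, "x2": 7.0})

print(follow(start_delay, locations, "idle", {"x2": 3.0}))
print(follow(bad_reset, locations, "idle", {"x2": 3.0}))
```

The first call succeeds and resets x2 to 0; the second returns None because the new value of x2 violates the invariant of the destination location.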
Example: The state ([2], [2], [2], 800, 4, 90) of the railroad crossing controller model
corresponds to each of the three processes being at its control location [2], with
$x_1 = 800$, $x_2 = 4$ and $x_3 = 90$.
The time spent at a control location may not be precisely constrained by the invariant.
A control location can have several outgoing transitions enabled at a given time.
An execution s1, s2, s3, . . . beginning at time t = 0 is said to be divergent if for every T > 0,
there exists i such that the state si is reached later than time t = T .
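[Figure: hybrid system with initial values $x_1 = 10$ and $x_2 = 0$, invariant $x_1 \ge 0$, and activities $\dot{x}_1 = x_2$ and $\dot{x}_2 = -g$; plot of $x_1$.]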
Remarks:
Such models are inconsistent with physical reality and must be avoided!
For some restricted classes of hybrid systems, automatic methods have been
developed for transforming any given model into another one that does not have the
Zeno property, and admits the same divergent executions.
State-space exploration
A large number of interesting properties of a hybrid system can be checked by computing
its reachable states.
This computation can be carried out by building, from every initial state, a tree in which
each node q represents a reachable state s(q), and the children of q correspond to the
states that are reachable from s(q) by
a time step, or
a transition step.
Problems:
Since executions are infinite, the trees also have an infinite depth.
Solutions:
Sets of states sharing the same control locations and differing only in the elapsed time
in those locations can be grouped into regions. A tree can be built in which the nodes
are associated with regions instead of individual states.
At each exploration step, a first operation saturates the current region by letting time
elapse during all possible delays. Then, the enabled transitions are individually
followed, creating new branches.
The branches of the exploration tree that only contain already visited states can be
pruned.
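The overall exploration loop can be sketched as a worklist algorithm. The Python sketch below is generic and makes several assumptions: regions are represented by hashable values, the hypothetical `successors` function is supposed to perform both the time saturation and the transition steps, and pruning is done by testing equality with already visited regions (actual tools rely on dedicated region representations and inclusion tests). The toy successor table is invented for the example.

```python
from collections import deque


def explore(initial_regions, successors):
    """Return the set of regions reachable from `initial_regions`."""
    visited = set(initial_regions)
    pending = deque(initial_regions)
    while pending:
        region = pending.popleft()
        for nxt in successors(region):
            if nxt not in visited:        # prune branches already explored
                visited.add(nxt)
                pending.append(nxt)
    return visited


# Toy abstract graph standing in for the region successor relation.
toy_successors = {"A": ["B"], "B": ["C", "A"], "C": ["A"], "D": ["A"]}
reachable = explore(["A"], lambda region: toy_successors[region])
print(sorted(reachable))
```

Because visited regions are never explored twice, the loop terminates whenever the number of distinct regions is finite, even though the executions themselves are infinite. In the toy example, region D is never reached.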
Notes:
For general hybrid systems, the region tree can still be infinite. It is however possible
to define restricted classes of models, for which a finite region tree can always be
computed.
Example: Timed automata are hybrid systems in which
the activities are of the form $\dot{x}_i = 1$,
all invariants, guards and actions are conjunctions of constraints of the form $x_i \,\#\, c$ or
$x_i - x_j \,\#\, c$, where $c$ is an integer number, and $\# \in \{<, \le, =, \ge, >\}$.
Some tools are available for exploring automatically the state space of hybrid systems
(e.g., HyTech) or timed automata (e.g., Uppaal).
Notes: These tools
represent and handle regions with the help of dedicated data structures, based on
logic formulas, convex polyhedra, difference matrices, . . .
are able to check properties that go beyond simple reachability.
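[Figure: region tree of the railroad crossing model, with transitions labeled lower, exit, raise and app, and regions annotated with $x_1 = 0$, $x_3 = 0$, $x_3 = 90$ and the values 9/2 and 5/2.]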
Notes:
In this example, the regions correspond to the sets of states obtained after each
time-step operation (denoted by =).
Checking whether the gate is always closed when a train reaches the crossing
amounts to verifying that in each reachable region, x1 = 0 implies x3 = 0.
This particular system shows a very deterministic behavior: In each reachable state,
there is at most one transition (or a pair of synchronized transitions) that is enabled.
(This is generally not the case!)