21CS43 Module 5 Microcontroller and Embedded Systems Prof VANARASAN
MODULE 5
RTOS AND IDE FOR EMBEDDED SYSTEM DESIGN
The following figure gives an insight into the basic components of an operating system and their interfaces with
the rest of the world.
(Figure: User Applications access the Kernel Services (Memory Management, Process Management, Time Management) through the Application Programming Interface (API); the kernel runs on the Underlying Hardware.)
The Kernel:
» The kernel is the core of the operating system. It is responsible for managing the system resources and the
communication between the hardware and other system services.
» Kernel acts as the abstraction layer between system resources and user
applications.
» Kernel contains a set of system libraries and services.
» For a general purpose OS, the kernel contains different services like memory management, process management, time
management, file system management, I/O system management
Process Management: deals with managing the processes/ tasks.
» Process management includes –
» setting up a memory for the process
» loading process code into memory
File System Management: File is a collection of related information. A file could be a program (source code or
executable), text files, image files, word documents, audio/ video files, etc.
» A file system management service of kernel is responsible for –
» The creation, deletion and alteration of files
» Creation, deletion, and alteration of directories
» Saving of files in the secondary storage memory
» Providing automatic allocation of file space based on the amount of free space available
» Providing flexible naming conventions for the files.
I/O System (Device) Management: Kernel is responsible for routing the I/O requests coming from different user
applications to the appropriate I/O devices of the system.
» In a well structured OS, direct access to I/O devices is not allowed; access to
them is established through the Application Programming Interface (API).
» The kernel maintains a list of all the I/O devices of the system.
» The 'Device Manager' of the kernel is responsible for handling all I/O related operations.
» The Device Manager is responsible for –
» Loading and unloading of device drivers
» Exchanging information and the system specific control signals to and from the device.
Microkernel: The microkernel design incorporates only the essential set of operating system services into the kernel. The
rest of the operating system services are implemented in programs known as 'Servers', which run in user space.
Memory management, timer systems and interrupt handlers are the essential services which form part of
the microkernel.
Task/ Process Scheduling: Deals with sharing the CPU among various tasks/ processes. A kernel application called
the 'Scheduler' handles the task scheduling. The Scheduler is an algorithm implementation which performs efficient
and optimal scheduling of tasks to provide deterministic behavior.
Task/ Process Synchronization: Deals with synchronizing the concurrent access of a resource, which is shared
across multiple tasks and the communication between various tasks.
Error/ Exception Handling: Deals with registering and handling the errors that occur/ exceptions that are raised during the
execution of tasks.
Insufficient memory, timeouts, deadlocks, deadline missing, bus error, divide by zero, unknown
instruction execution etc, are examples of errors/exceptions.
Errors/ Exceptions can happen at the kernel level services or at task level.
o Deadlock is an example for kernel level exception, whereas timeout is an example for a task level exception.
Deadlock is a situation where a set of processes are blocked because each process is holding a
resource and waiting for another resource acquired by some other process.
Timeout and retry are two techniques used together. The task retries an event/ message a
certain number of times; if no response is received after exhausting the limit, the feature might be
aborted.
The OS kernel gives the information about the error in the form of
a system call (API).
Memory Management: The memory management function of an RTOS kernel is slightly different compared to the
General Purpose Operating Systems.
In general, the memory allocation time increases depending on the size of the block of memory that needs to be
allocated and the state of the allocated memory block. An RTOS achieves predictable timing and deterministic
behavior by compromising the effectiveness of memory allocation.
An RTOS generally uses a 'block' based memory allocation technique, instead of the usual dynamic memory
allocation techniques used by a GPOS. The RTOS kernel uses blocks of fixed-size dynamic memory and
a block is allocated to a task on a need basis. The blocks are stored in a 'Free buffer Queue'.
Most of the RTOS kernels allow tasks to access any of the memory blocks without any memory protection
to achieve predictable timing and avoid the timing overheads.
Some commercial RTOS kernels allow memory protection as optional and
the kernel enters a fail-safe mode when an illegal memory access occurs.
» In this memory management scheme, a block of fixed-size memory is always allocated to a task on a need basis and is taken
as a unit. Hence, there will not be any memory fragmentation issues.
Interrupt Handling: Deals with the handling of various interrupts. Interrupts inform the processor that an external
device or an associated task requires immediate attention of the CPU.
Interrupts can be either Synchronous or Asynchronous.
Interrupts which occur in sync with the currently executing task are known as Synchronous
interrupts. Usually the software interrupts fall under the Synchronous Interrupt category.
Divide by zero, memory segmentation error, etc. are examples of Synchronous interrupts.
For synchronous interrupts, the interrupt handler runs in the same context of the interrupting task.
Interrupts which occurs at any point of execution of any task, and are not in sync with the currently executing task are
Asynchronous interrupts.
Timer overflow interrupts, serial data reception/
transmission interrupts, etc., are examples of asynchronous interrupts.
For asynchronous interrupts, the interrupt handler is usually written as a separate task (depending on the OS
kernel implementation) and it runs in a different context. Hence, a context switch happens while handling
the asynchronous interrupts.
» Priority levels can be assigned to the interrupts and each interrupt can be enabled or disabled individually. Most
RTOS kernels implement a 'Nested Interrupts' architecture.
Time Management: Accurate time management is essential for providing precise time reference for all applications.
The time reference to kernel is provided by a high-resolution Real Time Clock (RTC) hardware chip (hardware
timer).
The hardware timer is programmed to interrupt the processor/ controller at a fixed rate. This timer
interrupt is referred to as the 'Timer tick'. The 'Timer tick' is taken as the timing reference by the kernel.
The 'Timer tick' interval may vary depending on the hardware timer.
Usually, the 'Timer tick' varies in the microseconds range.
The time parameters for tasks are expressed as multiples of the 'Timer tick'.
The System time is updated based on the 'Timer tick'.
If the System time register is 32 bits wide and the 'Timer tick' interval is 1
microsecond, the System time register will reset in
2^32 x 10^-6 / (24 x 60 x 60) = ~0.0497 days = ~1.19 hours
If the 'Timer tick' interval is 1 millisecond, the System time register will
reset in
2^32 x 10^-3 / (24 x 60 x 60) = ~49.7 days = ~50 days
The 'Timer tick' interrupt is handled by the 'Timer Interrupt' handler of
the kernel.
The 'Timer tick' interrupt can be utilized for implementing the following actions:
Save the current context (context of the currently executing task).
Increment the System time register by one. Generate a timing error and reset the System time register if the
timer tick count is greater than the maximum range available for the System time register.
Update the timers implemented in the kernel (increment or decrement the timer registers for each timer
depending on the count direction setting for each register: increment registers with count direction setting =
'count up' and decrement registers with count direction setting = 'count down').
Structure of a Process:
The concept of a 'Process' leads to concurrent execution of tasks and thereby efficient utilization of the CPU and other
system resources. Concurrent execution is achieved through the sharing of the CPU among the processes.
A process mimics a processor in properties and holds a set of registers, process status, a Program Counter (PC) to point
to the next executable instruction of the process, a stack for holding the local variables associated with the process
and the code corresponding to the process.
» This can be visualized as shown in the following Figure.
A process inherits all the properties of the CPU. When the process gets its turn, its registers and
Program Counter register become mapped to the physical registers of the CPU.
The memory occupied by the process is segregated into three regions
namely; Stack memory, Data memory and Code memory
The 'Stack' memory holds all temporary data such as variables local to the process.
The 'Data' memory holds all global data for the process.
The 'Code' memory contains the program code (instructions) corresponding to the process.
On loading a process into the main memory, a specific area of memory is allocated for the process. The
stack memory usually starts at the highest memory address from the memory area allocated for the
process.
Process States & State Transition: The creation of a process to its
termination is not a single step operation.
»The process traverses through a series of states during its transition from the newly created state to the terminated
state.
The cycle through which a process changes its state from 'newly created'
to 'execution completed' is known as the 'Process Life Cycle'.
The various states through which a process traverses during a Process Life Cycle indicate the
current status of the process with respect to time and also provide information on what it is allowed to do
next.
The transition of a process from one state to another is known as a 'State transition'.
»The Process states and state transition representation are shown in the following Figure.
Threads:
A thread is the primitive that can execute code. A thread is a single sequential flow of control within a process. A
thread is also known as a lightweight process.
» A process can have many threads of execution.
» Different threads, which are part of a process, share the same address space; meaning they share the data memory,
code memory and heap memory area.
» Threads maintain their own thread status (CPU register values), Program
Counter (PC) and stack.
» The memory model for a process and its associated threads are given in the following figure.
The Concept of Multithreading: The process is split into multiple threads, each of which executes a portion of the process;
there will be a main thread and the rest of the threads will be created within the main thread.
The multithreaded architecture of a process can be visualized with the
thread-process diagram, shown below.
Since the process is split into different threads, when one thread enters a wait state, the CPU can
be utilized by other threads of the process that do not require the event for which the waiting
thread is blocked. This speeds up the execution of the process and gives
efficient CPU utilization: the CPU is engaged all the time.
Thread Standards: deal with the different standards available for thread creation and management. These standards
are utilized by the Operating Systems for thread creation and thread management. It is a set of thread class
libraries. The commonly available thread class libraries are –
POSIX Threads: POSIX stands for Portable Operating System Interface. The POSIX.4 standard deals with
the Real Time extensions and POSIX.4a standard deals with thread extensions.
The POSIX standard library for thread creation and management is
'Pthreads'.
The 'Pthreads' library defines the set of POSIX thread creation and management
functions in the 'C' language. (Example 1 – Self study).
Win32 Threads: Win32 threads are the threads supported by various flavors of Windows Operating
Systems.
The Win32 Application Programming Interface (Win32 API) libraries provide the standard set of
Win32 thread creation and management functions. Win32 threads are created with the API.
Java Threads: Java threads are the threads supported by Java programming Language.
The Java thread class 'Thread' is defined in the package 'java.lang'.
This package needs to be imported for using the thread creation
functions supported by the Java thread class.
There are two ways of creating threads in Java: either by extending the
base 'Thread' class or by implementing the 'Runnable' interface.
Thread Pre-emption: is the act of pre-empting the currently running thread (stopping it temporarily). It is dependent on
the Operating System.
» It is performed for sharing the CPU time among all the threads.
» The execution switching among threads is known as 'Thread context
switching'. Threads fall into one of the following types:
User Level Thread: User level threads do not have kernel/ Operating System support and they exist only in
the running process.
A process may have multiple user level threads; but the OS treats it as a single thread and will not
switch the execution among the different threads of it.
It is the responsibility of the process to schedule each thread as and
when required.
Kernel Level/ System Level Thread: Kernel level threads are individual units of execution, which the OS
treats as separate threads.
The OS interrupts the execution of the currently running kernel thread and switches the execution to
another kernel thread based on the scheduling policies implemented by the OS.
There are many ways for binding user level threads with kernel/ system level threads; which are
explained below:
Many-to-One Model: Many user level threads are mapped to a single
kernel thread. Eg: Solaris Green threads and GNU Portable Threads
One-to-One Model: Each user level thread is bound to a kernel/ system
level thread. Eg: Windows XP/NT/2000 and Linux threads
Many-to-Many Model: In this model many user level threads are
allowed to be mapped to many kernel threads. Eg: Windows NT/2000 with the ThreadFiber package.
Thread vs. Process:
» A thread is a single unit of execution and is part of a process; a process is a program in execution and contains one or more threads.
» A thread does not have its own data memory and heap memory; a process has its own code memory, data memory and stack memory.
» A thread cannot live independently; it lives within the process. A process contains at least one thread.
» There can be multiple threads in a process; the first (main) thread calls the main function and occupies the start of the stack memory of the process. Threads within a process share the code, data and heap memory; each thread holds a separate memory area for its stack.
» Threads are very inexpensive to create; processes are very expensive to create and involve many OS overheads.
» Thread context switching is inexpensive and fast; process context switching is complex, involves lots of OS overhead and is comparatively slow.
» If a thread expires, its stack is reclaimed by the process; if a process dies, the resources allocated to it are reclaimed by the OS and all the associated threads of the process also die.
The act of switching the CPU among the processes or changing the current execution context is known as
'Context switching'.
The act of saving the current context (details like register details, memory details, system resource
usage details, execution details, etc.) for the currently running process at the time of CPU switching is
known as 'Context saving'.
The process of retrieving the saved context details for a process which is going to be executed due to CPU
switching is known as 'Context retrieval'.
Types of Multitasking:
» Depending on how the task/ process execution switching act is implemented,
multitasking can be classified into –
» Co-operative Multitasking: Co-operative multitasking is the most primitive form of multitasking in
which a task/ process gets a chance to execute only when the currently executing task/ process voluntarily
relinquishes the CPU.
» In this method, any task/ process can avail the CPU for as much time as it wants. Since this type of implementation
relies on the mercy of the tasks towards each other for getting CPU time for execution, it is known as co-operative
multitasking. If the currently executing task is non-cooperative, the other tasks may have to wait for a long time to get
the CPU.
Preemptive Multitasking: Preemptive multitasking ensures that every
task/ process gets a chance to execute.
» When and how much time a process gets is dependent on the implementation of the preemptive scheduling.
» As the name indicates, in preemptive multitasking, the currently running task/process is preempted to give a chance
to other tasks/process to execute.
» The preemption of task may be based on time slots or task/ process
priority.
Non-preemptive Multitasking: The process/ task which is currently given
the CPU time is allowed to execute until it terminates (enters the
'Completed' state) or enters the 'Blocked/ Wait' state, waiting for an I/O.
» The co-operative and non-preemptive multitasking models differ in their behavior when they are in the
'Blocked/ Wait' state.
» In co-operative multitasking, the currently executing process/ task need not relinquish the CPU when it
enters the 'Blocked/ Wait' state, waiting for an I/O, a shared resource access or an event to occur,
whereas in non-preemptive multitasking the currently executing task relinquishes the CPU when it waits
for an I/O.
TASK COMMUNICATION:
In a multitasking system, multiple tasks/ processes run concurrently (in pseudo parallelism) and each process may or
may not interact with the others. Based on the degree of interaction, the processes/ tasks running on an OS are classified
as –
Co-operating Processes: In the co-operating interaction model, one process
requires the inputs from other processes to complete its execution.
Competing Processes: The competing processes do not share anything among themselves but they share
the system resources. The competing processes compete for the system resources such as file, display
device, etc.
The co-operating processes exchange information and communicate through
the following methods:
Co-operation through sharing: Exchange data through some shared resources.
Co-operation through communication: No data is shared between the processes; they communicate
with each other for synchronization.
Memory Mapped Objects: A memory mapped object is a shared memory technique adopted by certain Real Time
Operating Systems for allocating a shared block of memory which can be accessed by multiple processes
simultaneously.
» In this approach, a mapping object is created and physical storage for it is reserved and committed.
» A process can map the entire committed physical area or a block of it to its virtual address space.
» All read and write operation to this virtual address space by a process is directed to its committed physical area.
» Any process which wants to share data with other processes can map the physical memory area of the mapped object
to its virtual memory space and use it for sharing the data.
The concept of a memory mapped object is shown below.
Message Queues: A process which wants to talk to another process posts the message to a First-In-First-
Out (FIFO) queue called the 'Message queue', which stores the messages temporarily in a system defined
memory object, to pass it to the desired process.
» Messages are sent and received through the send (name of the process to which the message is to be sent,
message) and receive (name of the process from which the message is to be received, message) methods.
» The messages are exchanged through a message queue.
» The implementation of the message queue, send and receive methods is OS kernel dependent.
Mailbox: An alternate form of message queue, usually used for one-way messaging.
» One task/process creates the mailbox and other tasks/processes can subscribe to this mailbox for getting
message notification.
» The implementation of the mailbox is OS kernel dependent.
» The MicroC/ OS-II RTOS implements the mailbox as a mechanism for inter task
communication.
Sockets: Sockets are used for RPC (Remote Procedure Call) communication. A socket is a logical endpoint in a two-way communication link
between two applications running on a network. A port number is associated with a socket so that the
network layer of the communication channel can deliver the data to the designated application.
» Sockets are of different types namely; Internet sockets (INET), UNIX sockets, etc.
» The INET Socket works on Internet Communication protocol. TCP/ IP, UDP, etc., are the communication
protocols used by INET sockets.
» INET sockets are classified into:
Stream Sockets: are connection oriented and they use TCP to establish a reliable connection.
Datagram Sockets: rely on UDP for establishing a connection.
TASK SYNCHRONIZATION:
In a multitasking environment, multiple processes run concurrently and share the system resources.
» Also, the processes may communicate with each other through different IPC
mechanisms.
» Hence, there may be situations where two processes try to access a shared memory area, and one process
tries to write to the memory location while the other process is trying to read from the same memory
location. This will lead to unexpected results.
» The solution is, make each process aware of access of a shared resource.
The act of making the processes aware of the access of shared resources by each process to avoid conflicts is
known as “Task/ Process Synchronization”.
Task/ Process Synchronization is essential for –
1. Avoiding conflicts in resource access (racing, deadlock, etc.) in multitasking
environment.
2. Ensuring proper sequence of operation across processes.
3. Establishing proper communication between processes.
» The code memory area which holds the program instructions (piece of code) for accessing a shared
resource is known as the 'Critical Section'.
» In order to synchronize the access to shared resources, the access to the critical section should be
exclusive.
Task Communication/ Synchronization Issues:
» Various synchronization issues may arise in a multitasking environment; if
processes are not synchronized properly in shared resource access, such as:
» 1. Racing: Consider the following piece of code:
From a programmer's perspective, the value of counter will be 10 at the end of the execution of processes A & B.
But it need not always be so.
»The program statement counter++; looks like a single statement from a high level programming language
(C Language) perspective.
»The low level implementation of this statement is dependent on the underlying processor instruction set
and the (cross) compiler in use.
» The low level implementation of the high level program statement counter++; under Windows XP
operating system running on an Intel Centrino Duo processor is given below.
At the processor instruction level, the value of the variable counter is loaded to the Accumulator register
(EAX Register).
» The memory variable counter is represented using a pointer.
» The base pointer register (EBP Register) is used for pointing to the memory
variable counter.
» After loading the contents of the variable counter to the Accumulator, the Accumulator content is
incremented by one using the add instruction.
» Finally the content of Accumulator is loaded to the memory location which represents the variable counter.
Both the processes, Process A and Process B, contain the program statement counter++; translated
into machine instructions:
Process A                          Process B
mov eax, dword ptr [ebp-4]         mov eax, dword ptr [ebp-4]
add eax, 1                         add eax, 1
mov dword ptr [ebp-4], eax         mov dword ptr [ebp-4], eax
Imagine a situation where a process switching (context switching) happens from Process A to Process B
when Process A is executing the counter++; statement. Process A accomplishes the counter++;
statement through three different low level instructions.
» Now imagine that the process switching happened at the point, where Process A executed the low level
instruction mov eax, dword ptr [ebp-4] and is about to execute the next instruction add eax, 1. The
scenario is illustrated in the following Figure.
Process B increments the shared variable 'counter' in the middle of the operation where Process A tries to
increment it. When Process A gets the CPU time for execution, it starts from the point where it got
interrupted. (If Process B is also using the same registers eax and ebp for executing the counter++;
instruction, the original content of these registers will be saved as part of context saving and will be
retrieved back as part of context retrieval when Process A gets the CPU for execution.
Hence the contents of eax and ebp remain intact irrespective of context switching.)
» Though the variable counter is incremented by Process B, Process A is unaware of it and it increments the
variable with the old value.
» This leads to the loss of one increment for the variable counter.
2. Deadlock: Deadlock is the condition in which a process is waiting for a resource held by another
process which is waiting for a resource held by the first process; hence, none of the processes are able to
make any progress in their execution.
Process A holds a resource 'x' and it wants a resource 'y' held by Process B. Process B is
currently holding resource 'y' and it wants the resource 'x' which is currently held by Process A.
Both hold their respective resources and compete with each other to get the resource held by the
other process.
Handling Deadlock: The OS may adopt any of the following techniques to detect and prevent deadlock conditions.
Ignore Deadlocks: Always assume that the system design is deadlock free.
This is acceptable on the reasoning that the cost of removing a deadlock is large compared to the
chance of a deadlock happening.
UNIX is an example of an OS following this principle.
A life critical system cannot pretend that it is deadlock free for any
reason.
Detect and Recover: This approach suggests the detection of a deadlock situation and recovery from it.
This is similar to the deadlock condition that may arise at a traffic junction. When vehicles
from different directions compete to cross the junction, a deadlock (traffic jam) condition
results. Once a deadlock (traffic jam) has happened at the junction, the only solution is to back up
the vehicles from one direction and allow the vehicles from the opposite direction to cross the
junction. If the traffic is too high, lots of vehicles may have to be backed up to resolve the traffic
jam. This technique is also known as the 'back up cars' technique.
Operating Systems keep a resource graph in their memory. The resource graph is updated on each
resource request and release.
A deadlock condition can be detected by analyzing the resource graph by graph analyzer
algorithms.
Once a deadlock condition is detected, the system can terminate a process or preempt the resource to break the
deadlocking cycle.
Avoid Deadlocks: Deadlock is avoided by the careful resource allocation techniques by the Operating System. It is
similar to the traffic light mechanism at junctions to avoid the traffic jams.
Prevent Deadlocks: Prevent the deadlock condition by negating one of the four conditions favoring the deadlock
situation (Mutual Exclusion, Hold and Wait, No Resource Preemption, Circular Wait).
» Ensure that a process does not hold any other resources when it requests a resource. This can be achieved by
implementing the following set of rules/ guidelines in allocating resources to processes:
1. A process must request all its required resources, and the resources should be allocated before the process
begins its execution.
Ensure that resource preemption (resource releasing) is possible at the operating system level. This can be achieved by
implementing the following set of rules/ guidelines in resource allocation and releasing:
1. Release all the resources currently held by a process if a request made by the
process for a new resource cannot be fulfilled immediately.
2. Add the resources which are preempted (released) to a resource list describing the resources which the
process requires to complete its execution.
3. Reschedule the process for execution only when the process gets its old resources and the new resource
which is requested by the process.
The resources shared among processes can be either for exclusive use by one process or for use by a number
of processes at a time.
» The display device of an embedded system is a typical example of a shared resource which needs exclusive access
by a process.
» The Hard disk (secondary storage) of a system is a typical example for
sharing the resource among a limited number of multiple processes.
» Based on the implementation, Semaphores can be classified into Binary Semaphore and Counting Semaphore.
Binary Semaphore: Implements exclusive access to shared resource by allocating the resource to a single
process at a time and not allowing the other processes to access it when it is being used by a process.
'Only one process/ thread' can own the binary semaphore at a time.
The state of a 'binary semaphore' object is set to signaled when it is not owned by any process/ thread,
and set to non-signaled when it is owned by any process/ thread.
The implementation of the binary semaphore is OS kernel dependent. Under certain OS kernels it is referred
to as a mutex.
Counting Semaphore: Maintains a count between zero and a maximum value. It limits the usage of
resource by a fixed number of processes/ threads.
The count associated with a 'Semaphore object' is decremented by one when a process/ thread
acquires it, and incremented by one when a process/ thread releases the 'Semaphore
object'.
The state of the counting semaphore object is set to 'signaled' when the count
of the object is greater than zero.
The state of the 'Semaphore object' is set to non-signaled when the semaphore is acquired by the
maximum number of processes/ threads that the semaphore can support (i.e. when the count
associated with the 'Semaphore object' becomes zero).
The creation and usage of a 'counting semaphore object' is OS kernel
dependent.
Functional Requirements:
» Inter Process Communication and Task Synchronization: The implementation of Inter Process Communication and
Synchronization is OS kernel dependent. Certain kernels may provide a bunch of options whereas others provide
very limited options. Certain kernels implement policies for avoiding priority inversion issues in resource sharing.
» Modularization Support: Most operating systems provide a bunch of features. At times all of them may not be necessary
for the functioning of an embedded product. It is very useful if the OS supports modularization, wherein the
developer can choose the essential modules and re-compile the OS image for functioning.
Windows CE is an example for a highly modular operating system.
Support for Networking and Communication: The OS kernel may provide stack implementation and driver support
for a bunch of communication interfaces and networking. Ensure that the OS under consideration provides support
for all the interfaces required by the embedded product.
Development Language Support: Certain operating systems include the run time libraries required for running
applications written in languages like Java and C#. A Java Virtual Machine (JVM) customized for the Operating
System is essential for running Java applications. Similarly the .NET Compact Framework (.NETCF) is required
for running Microsoft .NET applications on top of the Operating System. The OS may include these components
as built-in components; if not, check their availability from a third-party vendor for the OS under
consideration.
Non-functional Requirements:
» Custom Developed or Off the Shelf: Depending on the OS requirement, it is possible to go for complete
development of an operating system suiting the embedded system needs, or to use an off the shelf, readily available
operating system – either a commercial product or an Open Source product – that closely matches
the system requirements. Sometimes it may be possible to build the required features by customizing an Open
source OS. The decision on which to select is purely dependent on the development cost, licensing fees for the
OS, development time and availability of skilled resources.
» Cost: The total cost for developing or buying the OS and maintaining it in terms of commercial product and custom
build needs to be evaluated before taking a decision on the selection of OS.
Development and Debugging Tools Availability: The availability of development and debugging tools is a critical
decision making factor in the selection of an OS for embedded design. Certain Operating Systems may be superior
in performance, but the availability of tools for supporting the development may be limited. Explore the different
tools available for the OS under consideration.
» Ease of Use: How easy it is to use a commercial RTOS is another important
feature that needs to be considered in the RTOS selection.
» After Sales: For a commercial embedded RTOS, after sales support in the form of e-mail, on-call services, etc., for bug fixes,
critical patch updates and support for production issues should be analyzed thoroughly.
» The target embedded hardware without embedding the firmware is a dumb device and cannot function properly. If
you power up the hardware without embedding the firmware, the device may behave in an unpredicted manner.
Both embedded hardware and firmware should be independently tested (Unit
Tested) to ensure their proper functioning.
» Functioning of individual hardware sections can be done by writing small
utilities which checks the operation of the specified part.
» The functionalities of embedded firmware can easily be checked by the simulator environment provided by the
embedded firmware development tool's IDE. By simulating the firmware, the memory contents, register details,
status of various flags and registers can easily be monitored and it gives an approximate picture of "What happens
inside the processor/ controller and what are the states of various peripherals" when the firmware is running on the
target hardware. The IDE gives necessary support for simulating the various inputs required from the external
world, like inputting data on ports, generating an interrupt condition, etc.
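The kind of register/ flag visibility a simulator gives can be illustrated with a toy sketch. This is a hypothetical two-instruction machine written for illustration only, not the simulator of any real IDE or any real controller's instruction set.

```python
# Toy instruction-set simulator: shows how a simulator exposes register
# contents and flag states while "executing" firmware.
def simulate(program):
    state = {"ACC": 0, "ZERO_FLAG": False}
    for op, operand in program:
        if op == "MOV":          # load immediate into the accumulator
            state["ACC"] = operand & 0xFF
        elif op == "ADD":        # add immediate, 8-bit wrap-around
            state["ACC"] = (state["ACC"] + operand) & 0xFF
        state["ZERO_FLAG"] = (state["ACC"] == 0)
    return state

firmware = [("MOV", 0xFE), ("ADD", 0x02)]   # 0xFE + 0x02 wraps to 0x00
final = simulate(firmware)
print(final)   # {'ACC': 0, 'ZERO_FLAG': True}
```

A real simulator does the same thing at full instruction-set fidelity: after each step the debugger window shows the accumulator, flags and peripheral registers, which is the "approximate picture" the text refers to.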
Out-of-Circuit Programming:
» Out-of-circuit programming is performed outside the target board. The processor or memory chip into which the
firmware needs to be embedded is taken out of the target board and it is programmed with the help of a
programming device.
» The programming device is a dedicated unit which contains the necessary hardware circuit to generate the
programming signals. Most of the programming devices available in the market are capable of programming
different family of devices.
» The programming device will be under the control of a utility program running on a PC. Usually the programming
device is interfaced to the PC through RS-232C/USB/Parallel Port Interface.
The commands to control the programmer are sent from the utility program to the programmer through the interface
(see the following Figure).
The sequence of operations for embedding the firmware with a programmer is listed below:
1. Connect the programming device to the specified port of PC (USB/COM
port/Parallel port)
2. Power up the device (Most programmers incorporate an LED to indicate device power up. Ensure that
the power indication LED is ON)
3. Execute the programming utility on the PC and ensure proper connectivity is established between PC and
programmer. In case of error turn off device power and try connecting it again
4. Unlock the ZIF socket by turning the lock pin
5. Insert the device to be programmed into the open socket as per the insert diagram shown on the
programmer
6. Lock the ZIF socket
7. Select the device name from the list of supported devices
8. Load the hex file which is to be embedded into the device
9. Program the device using the 'Program' option of the utility program
10. Wait till the completion of programming operation (Till busy LED of programmer is off)
11. Ensure that the programming was successful by checking the status LED on the programmer (usually 'Green' for
success and 'Red' for an error condition) or by noting the feedback from the utility program
12. Unlock the ZIF socket and take the device out of the programmer.
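The device-selection, load and program steps of the utility can be sketched as a control flow. The programmer object, device names and status strings below are mock illustrations; a real utility drives vendor-specific commands over the RS-232C/USB/parallel interface.

```python
# Mock PC-side programming utility flow for out-of-circuit programming.
class MockProgrammer:
    SUPPORTED = {"AT89S52", "AT89C51"}      # hypothetical supported-device list

    def __init__(self):
        self.device = None
        self.flash = b""

    def select_device(self, name):          # step 7: pick from supported list
        if name not in self.SUPPORTED:
            raise ValueError("unsupported device: " + name)
        self.device = name

    def program(self, image):               # step 9: burn the image
        self.flash = bytes(image)
        return "success"                    # corresponds to the 'Green' LED

    def verify(self, image):                # read-back check
        return self.flash == bytes(image)

prog = MockProgrammer()
prog.select_device("AT89S52")
image = bytes([0x02, 0x00, 0x30])           # toy stand-in for the loaded hex file
status = prog.program(image)
ok = prog.verify(image)
print(status, ok)                           # success True
```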
» The major drawback of out-of-circuit programming is the high development time. Whenever the firmware is
changed, the chip should be taken out of the development board for re-programming. This is tedious and
prone to chip damages due to frequent insertion and removal.
» The out-of-system programming technique is used for firmware integration for low end embedded products
which run without an operating system.
In System Programming (ISP):
» With ISP, programming is done 'within the system', meaning the firmware is embedded into the target device
without removing it from the target board. It is the most flexible and easy way of firmware embedding.
The only pre- requisite is that the target device must have an ISP support. Apart from the target board, PC,
ISP cable and ISP utility, no other additional hardware is required for ISP.
» The target board can be interfaced to the utility program running on PC through Serial Port/ Parallel Port/
USB. The communication between the target device and ISP will be in a serial format. The serial protocols
used for ISP may be 'Joint Test Action Group (JTAG)', 'Serial Peripheral Interface (SPI)' or any other
proprietary protocol.
In System Programming with SPI Protocol: Devices with SPI (Serial Peripheral Interface) ISP (In System
Programming) support contain a built-in SPI interface and on-chip EEPROM or FLASH memory.
The primary I/O lines involved in SPI-In System Programming are listed below:
» MOSI – Master Out Slave In
» MISO – Master In Slave Out
» SCK – Serial Clock
» RST – Reset of Target Device
» GND – Ground of Target Device
» PC acts as the master and target device acts as the slave in ISP. The program data is sent to the MOSI pin of
target device and the device acknowledgement is originated from the MISO pin of the device. SCK pin
acts as the clock for data transfer. A utility program can be developed on the PC side to generate the above
signal lines. Standard SPI-ISP utilities are freely available on the internet, so there is no need to write
one's own program. For ISP operations, the target device needs to be powered up in a pre-defined
sequence.
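The MOSI/ MISO/ SCK exchange described above can be sketched as a bit-banged byte transfer of the kind a PC-side ISP utility would perform. The pins here are simulated variables, not real port I/O; data is shifted MSB first with both sides sampling on the SCK edge.

```python
# Simulated bit-banged SPI byte transfer (full duplex, MSB first).
def spi_transfer_byte(byte_out, slave_shift_reg):
    """Shift one byte out on MOSI while the slave's shift register is
    clocked back in on MISO. Returns (byte_in, new_slave_reg)."""
    byte_in = 0
    for bit in range(7, -1, -1):
        mosi = (byte_out >> bit) & 1            # master drives MOSI
        miso = (slave_shift_reg >> 7) & 1       # slave drives MISO (its MSB)
        # One SCK pulse: both sides sample and shift
        byte_in = (byte_in << 1) | miso
        slave_shift_reg = ((slave_shift_reg << 1) | mosi) & 0xFF

    return byte_in, slave_shift_reg

# Master sends 0xAC while the slave register initially holds 0x69
# (illustrative values only).
rx, slave = spi_transfer_byte(0xAC, 0x69)
print(hex(rx), hex(slave))   # 0x69 0xac
```

After eight clocks the two bytes have swapped places: the master has received the slave's old register contents and the slave holds the master's byte, which is exactly how program data and acknowledgements cross the MOSI/ MISO pair.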
» The power up sequence for In System Programming for Atmel's AT89S
series microcontroller family is listed below:
1. Apply supply voltage between VCC and GND pins of target chip
2. Set RST pin to "HIGH" state
3. If a crystal is not connected across pins XTAL1 and XTAL2, apply a
3 MHz to 24 MHz clock to the XTAL1 pin and wait for at least 10 milliseconds
4. Enable serial programming by sending the Programming Enable serial instruction to pin MOSI/ P1.5. The
frequency of the shift clock supplied at pin SCK/ P1.7 needs to be less than the CPU clock at XTAL1
divided by 40
5. The Code or Data array is programmed one byte at a time by supplying the address and data together with
the appropriate Write instruction. The selected memory location is first erased before the new data is
written. The write cycle is self-timed and typically takes less than 2.5 ms at 5V
6. Any memory location can be verified by using the Read instruction, which
returns the content at the selected address at serial output MISO/ P1.6
7. After successfully programming the device, set RST pin low or turn off the chip power supply and turn it
ON to commence the normal operation.
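The shift-clock constraint in step 4 (SCK less than the XTAL1 clock divided by 40) can be checked with simple arithmetic. The crystal and SCK values below are example figures, not taken from any particular board.

```python
# SCK frequency limit for AT89S serial programming (step 4 above):
# the shift clock must be less than the CPU clock divided by 40.
def max_sck_hz(xtal_hz):
    return xtal_hz / 40

xtal = 12_000_000                  # 12 MHz crystal, within the 3-24 MHz range
sck_limit = max_sck_hz(xtal)
chosen_sck = 250_000               # 250 kHz: safely below the limit
print(sck_limit, chosen_sck < sck_limit)   # 300000.0 True
```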
The key player behind ISP is a factory programmed memory (ROM) called 'Boot ROM'. The Boot ROM normally
resides at the top end of the code memory space and is of the order of a few Kilobytes (for a controller
with 64K code memory space and a 1K Boot ROM, the Boot ROM resides at memory locations FC00H to
FFFFH).
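The address range quoted above follows directly from the sizes involved, as a quick check shows:

```python
# Boot ROM placement: a 1K ROM at the top of a 64K code space
# occupies FC00H-FFFFH.
CODE_SPACE = 64 * 1024        # 0x10000 bytes of code memory
BOOT_ROM = 1 * 1024           # 0x400 bytes of Boot ROM

start = CODE_SPACE - BOOT_ROM # first Boot ROM address
end = CODE_SPACE - 1          # last addressable code location
print(hex(start), hex(end))   # 0xfc00 0xffff
```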
» It contains a set of Low-level Instruction APIs and these APIs allow the processor/ controller to perform the
FLASH memory programming, erasing and Reading operations.
» The contents of the Boot ROM are provided by the chip manufacturer and the same is masked into every
device.
hardware and firmware function as expected. The bring up process includes basic hardware spot checks/
validations to make sure that the individual components and buses/ interconnects are operational – which
involves checking power, clocks, and basic functional connectivity;
» basic firmware verification to make sure that the processor is fetching the code and the firmware execution is
happening in the expected manner;
» running advanced validations such as memory validations, signal integrity validation, etc.
DISASSEMBLER/ DECOMPILER:
Disassembler is a utility program which converts machine codes into target processor specific Assembly codes/
instructions.
» The process of converting machine codes into Assembly code is known as 'Disassembling'. In operation,
disassembling is complementary to assembling/ cross-assembling.
» Decompiler is the utility program for translating machine codes into corresponding high level language
instructions.
» Decompiler performs the reverse operation of compiler/ cross-compiler.
» The disassemblers/ decompilers for different families of processors/ controllers are different.
Disassemblers/ Decompilers are deployed in reverse engineering.
Reverse engineering is the process of revealing the technology behind the working of a product. Reverse
engineering in Embedded Product development is employed to find out the secret behind the working of
popular proprietary products.
» Disassemblers/ Decompilers are powerful tools for analyzing the presence of malicious codes (virus
information) in an executable image.
» Disassemblers/ Decompilers are available as either freeware tools readily available for free download from
internet or as commercial tools.
» It is not possible for a disassembler/ decompiler to generate an exact replica of the original assembly code/
high level source code in terms of the symbolic constants and comments used. However, disassemblers/
decompilers generate source code which
closely resembles the original source code from which the binary code was generated.
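At its core, disassembling is a table lookup from opcode bytes back to mnemonics, as a minimal sketch shows. The opcode table below covers only three instructions chosen to resemble the 8051 encoding; treat the table values as illustrative rather than a verified reference.

```python
# Minimal disassembler: opcode byte -> (mnemonic prefix, operand count).
OPCODES = {
    0x00: ("NOP", 0),        # no operands
    0x74: ("MOV A, #", 1),   # move immediate to accumulator
    0x24: ("ADD A, #", 1),   # add immediate to accumulator
}

def disassemble(machine_code):
    listing, i = [], 0
    while i < len(machine_code):
        mnemonic, n_operands = OPCODES[machine_code[i]]
        operands = machine_code[i + 1 : i + 1 + n_operands]
        listing.append(mnemonic + "".join("%02XH" % b for b in operands))
        i += 1 + n_operands
    return listing

print(disassemble(bytes([0x74, 0x55, 0x24, 0x01, 0x00])))
# ['MOV A, #55H', 'ADD A, #01H', 'NOP']
```

Note what is irrecoverably lost: the output has no symbolic constants, labels or comments, which is exactly why a disassembler can only approximate the original source.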
Simulators:
» Simulators simulate the target hardware and the firmware execution can be inspected using simulators.
» The features of simulator based debugging are listed below.
1. Purely software based
2. Doesn't require a real target system
3. Very primitive (Lack of featured I/O support. Everything is simulated)
4. Lack of Real-time behavior.
Advantages of Simulator Based Debugging: Simulator based debugging techniques are simple and straightforward.
The major advantages of simulator based firmware debugging techniques are explained below.
No Need for Original Target Board: Technique is purely software oriented; Simulates the CPU of the target board;
User only needs to know about the memory map of various devices within the target board; Real hardware is not
required, hence, firmware development can start well in advance – This saves development time.
» Simulate I/O Peripherals: Option to simulate various I/O peripherals; Can edit the values for I/O registers –
Eliminates the need for connecting I/O devices for debugging the firmware.
» Simulates Abnormal Conditions: With simulator's simulation support you can input any desired value for any
parameter during debugging the firmware and can observe the control flow of firmware.
Limitations of Simulator Based Debugging: Though simulation based firmware debugging is very helpful in
embedded applications, it possesses certain limitations and we cannot fully rely on simulator-based firmware
debugging. Some of the limitations of simulator-based debugging are explained below:
» Deviation from Real Behavior: Developer may not be able to debug the firmware under all possible combinations of
input; Under certain operating conditions, we may get some particular result and it need not be the same when the
firmware runs in a production environment.
» Lack of Real Timeliness: The major limitation of simulator based debugging is that it is not real-time in behaviour.
In a real application the I/O conditions may be varying or unpredictable.
Monitor Program Based Firmware Debugging: Monitor program based firmware debugging is the first adopted
invasive method for firmware debugging (see the following Figure). In this approach a monitor program which
acts as a supervisor is developed.
» The monitor program controls the downloading of user code into the code memory, inspects and modifies register/
memory locations; allows single stepping of source code, etc.
» The monitor program implements the debug functions as per a pre-defined
command set from the debug application interface.
» The monitor program always listens to the serial port of the target device and, according to the command received
from the serial interface, it performs command specific actions like firmware downloading, memory
inspection/ modification, firmware single stepping, and sends back the debug information.
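The command-driven behaviour of the monitor can be sketched as a dispatch loop. The command names and framing here are hypothetical; each real monitor defines its own protocol over the serial link.

```python
# Sketch of a monitor program's command dispatcher with a simulated
# target memory (addresses and values handled as hex strings).
def make_monitor():
    memory = {}

    def handle(command):
        parts = command.split()
        if parts[0] == "WRITE":     # WRITE <addr> <value>: modify memory
            memory[int(parts[1], 16)] = int(parts[2], 16)
            return "OK"
        if parts[0] == "READ":      # READ <addr>: inspect memory
            return "%02X" % memory.get(int(parts[1], 16), 0xFF)
        return "ERR"                # unknown command

    return handle

monitor = make_monitor()
print(monitor("WRITE 8000 5A"))   # OK
print(monitor("READ 8000"))       # 5A
```

In a real setup the `handle` loop would sit on the target, reading command bytes from the serial port and replying to the debug application on the PC.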
Emulation Device: is a replica of the target CPU which receives various signals from the target board
through a device adaptor connected to the target board and performs the execution of firmware under the
control of debug commands from the debug application.
Emulation Memory: is the Random Access Memory (RAM) incorporated in the emulator device. It acts as
a replacement for the target board's EEPROM, where the code is supposed to be downloaded after each
firmware modification. Hence the original EEPROM memory is emulated by the RAM of the emulator. This
is known as 'ROM Emulation'. ROM emulation eliminates the hassles of ROM burning and offers the
benefit of an unlimited number of reprogramming cycles.
Emulator Control Logic: is the logic circuits used for implementing complex hardware breakpoints, trace
buffer trigger detection, trace buffer control, etc. Emulator control logic circuits are also used for
implementing logic analyzer functions in advanced emulator devices. The 'Emulator POD' is connected to
the target board through a 'Device adaptor' and signal cable.
Device Adaptors: act as an interface between the target board and emulator POD. Device adaptors are
normally pin-to-pin compatible sockets which can be inserted/ plugged into the target board for routing the
various signals from pins assigned for the target processor. The device adaptor is usually connected to the
emulator POD using ribbon cables.
On Chip Firmware Debugging (OCD): Advances in semiconductor technology have brought new dimensions to
target firmware debugging. Today almost all processors/ controllers incorporate built in debug modules called On
Chip Debug (OCD) support. Though OCD adds silicon complexity and a cost factor, from a developer perspective it
is a very good feature supporting fast and efficient firmware debugging. The On Chip Debug facilities integrated
to the processor/ controller are chip vendor dependent and most of them are proprietary technologies like
Background Debug Mode (BDM), OnCE, etc.
components, signal corruption due to noise, etc. The only way to sort out these issues and figure out the real
problem creator is debugging the target board.
» Hardware debugging is not similar to firmware debugging. Hardware debugging involves the monitoring of various
signals of the target board (address/ data lines, port pins, etc.), checking the inter connection among various
components, circuit continuity checking, etc.
The various hardware debugging tools used in Embedded Product Development are explained below.
Multimeter:
» A multimeter is used for measuring various electrical quantities like voltage (Both AC and DC), current (DC as well
as AC), resistance, capacitance, continuity checking, transistor checking, cathode and anode identification of
diode, etc.
» Any multimeter will work over a specific range for each measurement. A multimeter is the most valuable tool in the
tool kit of an embedded hardware developer. It is the primary debugging tool for physical contact based hardware
debugging and almost all developers start debugging the hardware with it.
Digital CRO:
» Cathode Ray Oscilloscope (CRO) is a little more sophisticated tool compared to a multimeter. CRO is used for
waveform capturing and analysis, measurement of signal strength, etc.
» CRO is a very good tool in analyzing interference noise in the power supply line and other signal lines. Monitoring
the crystal oscillator signal from the target board is a typical example of the usage of CRO.
» CROs are available in both analog and digital versions.
» Various measurements like phase, amplitude, etc. are also possible with CROs. Tektronix, Agilent, Philips, etc. are
the manufacturers of high precision good quality digital CROs.
Logic Analyzer:
A logic analyzer is the big brother of digital CRO. Logic analyzer is used for capturing digital data (logic 1 and 0) from
a digital circuitry whereas CRO is employed in capturing all kinds of waves including logic signals.
» A logic analyzer contains special connectors and clips which can be attached to the target board for capturing digital
data. In target board debugging applications, a logic analyzer captures the states of various port pins, address bus
and data bus of the target processor/ controller, etc.
» Logic analyzers give an exact reflection of what happens when a particular line of firmware is running. This is
achieved by capturing the address line logic and data line logic of the target hardware. Most modern logic analyzers
contain provisions for storing captured data, selecting a desired region of the captured waveform, zooming
selected region of the captured waveform, etc.
Function Generator:
» Function generator is not a debugging tool. It is an input signal simulator tool. A function generator is capable of
producing various periodic waveforms like sine wave, square wave, saw-tooth wave, etc. with different
frequencies and amplitudes.
» Sometimes the target board may require some kind of periodic waveform with a particular frequency as input to some
part of the board. Thus, in a debugging environment, the function generator serves the purpose of generating and
supplying required signals.
BOUNDARY SCAN:
As the complexity of the hardware increases, the number of chips present in the board and the interconnections among
them also increase.
» The device packages used in the PCB become miniature to reduce the total board space occupied by them and
multiple layers may be required to route the interconnections among the chips.
» With miniature device packages and multiple layers for the PCB it will be very difficult to debug the hardware using
magnifying glass, multimeter, etc. to check the interconnection among the various chips.
Boundary scan is a technique used for testing the interconnection among the
various chips, which support JTAG interface, present in the board.
» Chips which support boundary scan associate a boundary scan cell with each pin of the device.
» A JTAG port contains five signal lines, namely TDI, TDO, TCK, TRST
and TMS, which form the Test Access Port (TAP) for a JTAG supported chip.
» Each device will have its own TAP.
» The PCB also contains a TAP for connecting the JTAG signal lines to the external world.
» A boundary scan path is formed inside the board by interconnecting the
devices through JTAG signal lines.
» The TDI pin of the TAP of the PCB is connected to the TDI pin of the first device.
» The TDO pin of the first device is connected to the TDI pin of the second
device.
» In this way all devices are interconnected and the TDO pin of the last JTAG device is connected to the TDO pin of
the TAP of the PCB.
» The clock line TCK and the Test Mode Select (TMS) line of the devices are connected to the clock line and Test
mode select line of the Test Access Port of the PCB respectively. This forms a boundary scan path.
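The TDI-to-TDO daisy chain above can be modeled as shift registers in series: a bit clocked into the board's TDI ripples through every device's boundary scan cells before emerging at the board's TDO. This is a simplified model (one register per device, no TAP state machine), purely to illustrate the chain topology.

```python
# Simulated boundary scan chain: each device is a list of scan cells.
def shift_chain(devices, tdi_bits):
    """devices: list of per-device scan-cell lists; tdi_bits: bits fed at
    the board TAP's TDI, one per TCK. Returns the bits seen at TDO."""
    tdo_bits = []
    for bit in tdi_bits:
        for dev in devices:
            # TDO of this device feeds the TDI of the next; the incoming
            # bit is registered into the device's first scan cell.
            bit, dev[:] = dev[-1], [bit] + dev[:-1]
        tdo_bits.append(bit)
    return tdo_bits

# Two devices with 2 scan cells each: a bit fed at TDI crosses all
# 4 cells and appears at the board's TDO on the 5th clock.
chain = [[0, 0], [0, 0]]
out = shift_chain(chain, [1, 0, 0, 0, 0])
print(out)   # [0, 0, 0, 0, 1]
```

Comparing the bit pattern shifted in against the pattern that emerges at TDO is how boundary scan detects broken or shorted interconnections between the chips.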