
IT2102

Embedded System Optimization


Optimization plays a central role in the design of embedded systems. Optimization is the process of improving the efficiency of a system in time (speed), space (size), or resources. The first step in optimization is to identify the existing problems. Most optimizations performed on code involve a tradeoff between execution speed and code size.

High-Level Optimizations
Software optimization is the process of modifying a software system to make some aspects work more efficiently or use
fewer resources.
• Simple Loop Transformations
o Loop Permutations – These swap the order of two loops to increase parallelism, improve
spatial locality, or enable other transformations.
o A permutation may have a positive effect on the reuse of array elements in the cache, since the
next iteration of the innermost loop will then access an adjacent location in memory.
▪ Caches are normally organized such that adjacent locations can be accessed significantly
faster than locations further away from the previously accessed one; the transformation
thus exploits spatial locality.
o Loop Unrolling – a standard transformation that creates several instances of the loop body.
▪ The number of copies of the body is called the unrolling factor. Unrolling factors larger
than two are possible. Unrolling reduces the loop overhead (fewer branches per execution)
and therefore typically improves speed.
o Loop Fusion and Fission – the combining or splitting of loop nests.
▪ There may be cases in which two separate loops can be merged (fusion), and cases in
which a single loop is split into two (fission).
▪ Either version may lead to improved cache behavior and increase the potential for
parallel computation within the loop body.
• Loop Tiling/Blocking – the chunking of nested loops into smaller blocks. This optimization benefits from
memory hierarchies, which include caches and scratchpad memories. A significant reuse factor for the
information in those memories is required; otherwise, the memory hierarchy cannot be exploited.
• Loop Splitting – Performing loop splitting manually is difficult and error-prone. Published algorithms are
based on a sophisticated analysis of accesses to array elements in loops, and optimized solutions are
generated using genetic algorithms. Loop splitting can reduce run-times for various applications and
architectures.
• Array Folding – Overlapping different array elements in memory to avoid wasting space. At any time,
only a subset of array elements is needed.
o The maximum number of elements needed at any time is called the address reference window.
Without folding, each array is allocated the maximum of the space it requires during the entire
execution time; folding reduces this to the size of the window.

Embedded System Concurrency (Lee, 2017)


The physical world is concurrent, with many things happening at once. Bridging this mismatch in semantics is one of
the major challenges that an embedded system designer faces.
• Threads are imperative programs that run concurrently and share a memory space in which each program can
access the other’s variables.
o Creating Threads - The mechanism is provided in the form of a collection of procedures that a
programmer can use, such as a standardized API (application program interface).
o Implementing Threads - The core of the implementation of threads is a scheduler that decides which
thread to execute next when a processor is available to execute a thread.
o Mutual Exclusion (mutex) - A thread may be suspended between any two atomic operations to execute
another thread and/or an interrupt service routine; a mutex ensures that only one thread at a time
executes a critical section that accesses shared data.
o Deadlock - As mutex locks proliferate in programs, the risk of deadlock increases. A deadlock occurs
when some threads become permanently blocked trying to acquire locks.

06 Handout 1 *Property of STI


[email protected] Page 1 of 5

• Processes are imperative programs with their own memory spaces. These programs cannot refer to each other’s
variables and do not exhibit the same difficulties as threads.
o To achieve concurrency, processes need to be able to communicate.
▪ Operating systems provide a variety of mechanisms, such as creating shared memory spaces,
which reintroduces the potential difficulties of multithreaded programming.
▪ A file system is simply a way to create a body of data that is persistent in the sense that it
outlives the process that creates it.
• One process can create data and write it to a file, and another process can read data
from the same file.
▪ Message Passing – One process creates a chunk of data, deposits it in a carefully controlled
section of memory that is shared, and then notifies other processes that the message is ready.
Those other processes can block waiting for the data to become ready.
• Semaphores are named after mechanical signals traditionally used on railroad tracks
to signal that a section of track has a train on it.
o It is possible to use a single section of track for trains to travel in both
directions (the semaphore implements mutex, preventing two trains from
simultaneously being on the same section of track).

Basics of Scheduling (Lee & Seshia, 2017)


• Multitasking is where multiple imperative tasks execute concurrently, either interleaved on a single processor
or in parallel on multiple processors.
o A scheduler decides what to do next at certain points in time, such as the time when a processor
becomes available.
• Real-time systems are collections of tasks where, in addition to any ordering constraints imposed by
precedences between the tasks, there are timing constraints.
o These constraints relate the execution of a task to real time, the physical time in the environment of
the computer executing the task.
o Tasks have deadlines, which are values of physical time by which the task must be completed.
▪ Hard real-time deadline is a real physical constraint imposed by the application, where missing
the deadline is considered an error.
▪ Soft real-time deadline reflects a design decision that need not be enforced strictly. It is better
to meet the deadline, but missing the deadline is not an error.
• Scheduling Decisions - A scheduler decides what task to execute next when faced with a choice in the execution
of a concurrent program or set of programs. A scheduling decision is a decision to execute a task.
o Assignment: which processor should execute the task (the choice of processor is called processor
assignment);
o Ordering: in what order each processor should execute its tasks; and
o Timing: the time at which each task executes.

Types of Schedulers (Lee & Ha, 1989)


• A fully-static scheduler makes all decisions at design time; it needs no semaphores or locks and can
use timing instead to enforce mutual exclusion and precedence constraints.
• A static order scheduler performs the task assignment and ordering at design time but defers until the run time
the decision of when in physical time to execute a task.
• A static assignment scheduler performs the assignment at design time and everything else at run time. Each
processor is given a set of tasks to execute, and a run-time scheduler decides during execution what task to
execute next.
• A fully-dynamic scheduler performs all decisions at run time. When a processor becomes available (e.g., it
finishes executing a task, or a task blocks acquiring a mutex), the scheduler decides at that point what task
to execute next on that processor.


• A preemptive scheduler may make a scheduling decision during the execution of a task, assigning a new task to
the same processor. That is, a task may be in the middle of executing when the scheduler decides to stop that
execution and begin the execution of another task.
o The interruption of the first task is called preemption.
• A non-preemptive scheduler always lets tasks run to completion before assigning another task to execute on
the same processor.
• A priority-based scheduler assumes each task is assigned a number called a priority, and the scheduler will
always choose to execute the task with the highest priority.
▪ A fixed priority is a priority that remains constant over all executions of a task.
▪ A dynamic priority is allowed to change during execution.
o A preemptive priority-based scheduler is a scheduler that supports the arrival of tasks and at all times
executes the enabled task with the highest priority.
o A non-preemptive priority-based scheduler is a scheduler that uses priorities to determine which task
to execute next after the current task execution completes, but never interrupts a task during execution
to schedule another task.

Scheduler Implementation (Lee & Seshia, 2017)


• A scheduler may be part of a compiler or code generator, part of an operating system or microkernel, or both.
A run-time scheduler will typically implement tasks as threads (or as processes, but the distinction is not
important here). Scheduling decisions are typically made when any of the following occurs:
o A timer interrupt occurs.
o An I/O interrupt occurs.
o An operating system service is invoked.
o A task attempts to acquire a mutex.
o A task tests a semaphore.
• Priority inversion is a scheduling anomaly where a high-priority task is blocked while unrelated lower-priority
tasks are executing.
o Priority inheritance - when a task blocks attempting to acquire a lock, then the task that holds the lock
inherits the priority of the blocked task. Thus, the task that holds the lock cannot be preempted by a
task with lower priority than the one attempting to acquire the lock.

Power Management (Peckol, 2019)


Embedded systems are now integrated into smartphones, smartwatches, wireless data modems, video cameras, web
browsers, wearable systems, and so on. All these devices use batteries as their source of energy.
• Categories of Power Consumption (CMOS-based systems)
o The dynamic power consumption relates to the charging and discharging of the load capacitance and
the short circuit currents.
o The static power (or leakage power) relates to leakage currents that flow even when a transistor is
switched off.
• Importance of Power Management
1. Power consumption determines how long an embedded system can operate, as well as the sizes of power
supplies, the heat-dissipation overhead, cost, weight, and area.
2. Power dissipation leads to heating, and as temperature increases, device failure rates increase too.
3. There are hard time constraints in embedded systems, so designers use pessimistic estimations of worst-case
execution time. This leads to redundancy in system resources (wastage) to meet worst-case performance
requirements.
4. Embedded systems often execute resource-intensive applications (for example, multimedia processing).
This pushes the optimization of embedded systems in the direction of performance.
5. The development of CMOS technology leads to increasing both on-chip transistor density and speed. It
results in the chip utilization wall, which limits the fraction of the chip that can be simultaneously used at
full speed within the power budget.


6. More and more embedded systems are used in mobile convergence applications. For example, they are key
platforms for web browsing, video streaming, and others. The sheer number of such devices makes their
total power consumption very high.
7. Data processing leads to power consumption. Many corporate IT departments now follow the trend of
green computing, meaning that they try to reduce the environmental effect of their activity. Efficient
power management is very important for achieving these goals.

Energy Saving Approaches (Marwedel, 2018)


• Dynamic Voltage and Frequency Scaling (DVFS) - a technique that aims at reducing dynamic power
consumption by dynamically adjusting the voltage and frequency of a CPU. Typically, such an optimization step
follows code generation by the compiler.
o One option is to shut down the processor during slack times (idle intervals).
o Another option is to initially run the processor at full speed and then reduce the voltage when the
remaining cycles can be completed at the lowest voltage.
o A third option is to run the processor at a clock rate just large enough to complete the cycles within
the available time.
o If a variable voltage processor completes a task before the deadline, the energy consumption can be
reduced.
o If a processor uses a single supply voltage and completes a task just at its deadline, then the supply
voltage is the unique supply voltage that minimizes the energy consumption of the task.
o If a processor can only use some discrete voltage levels, then a voltage schedule using the two voltages,
which are the two immediate neighbors of the ideal voltage can be chosen. These two voltages lead to
the minimum energy consumption except if the need to use an integer number of cycles results in a
small deviation from the minimum.
• Dynamic Power Management (DPM) – allows a system or system's blocks to be placed in low-power sleep
modes when the systems are inactive. Normally, not all blocks of a system participate in performing different
functions, and it is useful to shut down inactive blocks to reduce power consumption.
o Straightforward approaches just use a simple timer to transition into a power-saving state after a fixed
idle period.
o More sophisticated approaches model the idle times by stochastic processes and use these models to
predict upcoming idle periods.
• Thermal Management - requires run-time monitoring of temperatures, so thermal sensors must be available
in systems that could potentially get too hot. This information is then used to control the generation of
additional heat, and possibly the cooling mechanisms as well.
o Controlling fans (when available) can be considered.
o Systems may be shut down completely if temperatures exceed maximum thresholds.
o The configuration can be adjusted to reduce the performance, and hence the heat generation, of the
system.
o Issuing fewer instructions per clock cycle lessens the usage of some processor pipelines.

References:
Barkalov, A., Titarenko L. & Mazurkiewicz, M. (2019). Foundations of embedded systems. Springer International.
Barr, M., & Massa, A. (2006). Programming embedded systems (2nd ed.). Chap. 14 - optimization techniques. O'Reilly.
Cardoso, J., Coutinho, J., & Diniz, P. (2017). Embedded computing for high performance - Efficient mapping of
computations using customization, code transformations and compilation. Elsevier.
Colorado State University – Department of Computer Science. (n.d.). Legality of loop interchange [PDF]. Retrieved on
2021, July 9, from https://www.cs.colostate.edu/~mstrout/CS553Fall07/Slides/lecture23-looptransform.pdf
Darwish, T. & Bayoumi, M. (2005). The electrical engineering handbook (1st ed.). Academic Press.
Emertxe Information Technologies Pvt Ltd. (2014). Embedded C - optimization techniques [Slides]. SlideShare. Retrieved
on 2021, August 11, from https://www.slideshare.net/EmertxeSlides/embedded-c-optimization-techniques
Lee, E., & Seshia, S. (2017). Introduction to embedded systems: a cyber-physical systems approach (2nd ed.). MIT Press.
Lee, E., Ha, S. (1989). Scheduling strategies for multiprocessor real-time DSP. In Global Telecommunications Conference
(GLOBECOM). doi:10.1109/GLOCOM.1989.64160.


Marwedel, P. (2018). Embedded system design: Embedded systems, foundations of cyber-physical systems, and the
internet of things (3rd ed.). Springer International.
Pan, T., Zhu, Y. (2018). Designing embedded systems with Arduino: a fundamental technology for makers. Springer.
Peckol, J. (2019). Embedded systems – A contemporary design tool. (2nd ed.). Wiley.
Technische Universität Hamburg. (n.d.). Optimization in embedded systems [Article]. Retrieved on 2021, August 11, from
https://www.tuhh.de/es/embedded-systems-design/teaching/seminars/optimization-in-embedded-systems.html

