Updated__arm - Unit 2
THE ARCHITECTURE FOR THE DIGITAL WORLD™
Agenda
• Introduction to ARM Ltd
• Programmer's Model
• Instruction Set
ARM Ltd
• Founded in November 1990
• Spun out of Acorn Computers
High Performance
• Cortex-M3 delivering 1.25 DMIPS/MHz
• Separate data and instruction bus
• High code density and performance with Thumb-2 instruction set
• Excellent clock per instruction ratio
• Nested Vectored Interrupt Controller (NVIC) for outstanding interrupt handling.
The NVIC supports up to 240 external interrupts with programmable priority
levels (16 levels in many implementations), which allows efficient handling of
real-time events and reduces latency.
• Superior math capability
Thumb-2 Instruction Set Architecture (ISA)
Cortex-M3 supports 16- and 32-bit instructions available in the Thumb-2
instruction set. Both can be mixed without extra complexity and without
reducing the Cortex-M3 performance. Hardware divide instructions and a
number of multiply instructions give users high data-crunching throughput.
3-stage Pipeline Core Based on Harvard Architecture
The ARM Cortex-M3 3-stage pipeline includes instruction fetch, instruction
decode and instruction execution. Cortex-M3 also has separate buses for
instructions and data.
Overview of ARM Cortex M3
• The ARM Cortex-M3 is a 32-bit processor core designed to provide
high-performance, low-power embedded solutions. It implements the
ARMv7-M architecture, which targets microcontroller applications that
require a balance of performance, power efficiency, and ease of use.
• It includes a high-performance 32-bit RISC core, a comprehensive set of
peripheral interfaces, and a flexible memory architecture.
• Harvard architecture with separate instruction and data buses. This
reduces bottlenecks common to shared data and instruction buses. This
allows for simultaneous access to both instruction and data memory,
which can improve performance and reduce power consumption.
• It includes a 3-stage pipeline that enables the core to execute most
instructions in a single clock cycle.
• Memory sizes and peripherals are chosen by the microcontroller vendor rather
than fixed by the core. A typical Cortex-M3 device might offer, for example,
128KB of flash memory and 32KB of RAM, along with peripherals such as ADCs,
DACs, UARTs, and timers.
• The Cortex-M3 includes an optional Memory Protection Unit (MPU), which
allows for memory protection and isolation. The MPU can be used to define
regions of memory with different access permissions and can help to prevent
unauthorized access or modification of critical data.
• The System Control Block (SCB) is another key feature of the Cortex-M3
architecture, which provides system-level control and configuration. The SCB
includes registers for controlling sleep and power management, as well as
system exceptions and fault handling.
• The Cortex-M3 supports low-power modes, which can be used to reduce power
consumption in battery-powered applications. It can enter a sleep mode or
a deep-sleep mode when the system is not actively processing data, which
can help to extend battery life.
Data Sizes and Instruction Sets
• The ARM is a 32-bit architecture.
Operation modes
• Handler mode: When executing an exception handler such as an Interrupt
Service Routine (ISR). When in handler mode, the processor always has
privileged access level.
• Thread mode: When executing normal application code, the processor can
be either in privileged access level or unprivileged access level. This is
controlled by a special register called “CONTROL.”
• Software can switch the processor in privileged Thread mode to
unprivileged Thread mode. However, it cannot switch itself back from
unprivileged to privileged.
• If this is needed, the processor has to use the exception mechanism to
handle the switch.
• The separation of privileged and unprivileged access levels allows system
designers to develop robust embedded systems by providing a mechanism
to safeguard memory accesses to critical regions and by providing a basic
security model.
• Thread mode and Handler mode have very similar programmer’s models.
• Thread mode can switch to using a separate shadowed Stack Pointer (SP).
This allows the stack memory for application tasks to be separated from
the stack used by the OS kernel, thus allowing better system reliability.
• By default, the Cortex-M processors start in privileged Thread mode and in
Thumb state.
• The debug state is used for debugging operations only. This state is entered
by a halt request from the debugger, or by debug events generated from
debug components in the processor.
• This state allows the debugger to access or change the processor register
values. The system memory, including peripherals inside and outside the
processor, can be accessed by the debugger in either Thumb state or debug
state.
Registers
• The Cortex-M3 and Cortex-M4 processors have a number of registers inside
the processor core to perform data processing and control.
• Most of these registers are grouped in a unit called the register bank. Each
data processing instruction specifies the operation required, the source
register(s), and the destination register(s) if applicable.
Exceptions and Interrupts
The next three slides provide a quick summary of the above aspects.
• The register bank in the Cortex-M3 and Cortex-M4 processors has 16 registers.
• Thirteen of them are general purpose 32-bit registers, and the other three have special
uses.
• Registers R0 to R12 are general purpose registers. The first eight (R0 - R7) are also called
low registers. Due to the limited available space in the instruction set, many 16-bit
instructions can only access the low registers.
• The high registers (R8 - R12) can be used with 32-bit instructions, and a few with 16-bit
instructions, like MOV (move).
• The initial values of R0 to R12 are undefined.
• R13 is the Stack Pointer. It is used for accessing the stack memory via PUSH and POP
operations. Physically there are two different Stack Pointers. The Main Stack Pointer
(MSP) is the default Stack Pointer. It is selected after reset, or when the processor is in
Handler Mode.
• The other Stack Pointer is called the Process Stack Pointer (PSP). The PSP can only be
used in Thread Mode. The selection of Stack Pointer is determined by a special register
called CONTROL.
• Both MSP and PSP are 32-bit, but the lowest two bits of the Stack Pointers (either MSP or
PSP) are always zero, and writes to these two bits are ignored.
• Only one stack pointer is active at a time. In a high-reliability operating system, we could activate
the PSP for user software and the MSP for operating system software. This way the user program
could crash without disturbing the operating system.
• In ARM Cortex-M processors, PUSH and POP are always 32-bit, and the addresses of the
transfers in stack operations must be aligned to 32-bit word boundaries.
• R14 is also called the Link Register (LR) and is used for holding the return address when
calling a function or subroutine. At the end of the function or subroutine, the program
control can return to the calling program and resume by loading the value of LR into the
Program Counter (PC). When a function or subroutine call is made, the value of LR is
updated automatically. If a function needs to call another function or subroutine, it needs
to save the value of LR in the stack first. Otherwise, the current value in LR will be lost
when the function call is made.
• R15 is the Program Counter (PC). It is readable and writeable: a read returns the current
instruction address plus 4 (this is due to the pipeline nature of the design, and
compatibility requirement with the ARM7TDMI processor). Writing to PC (e.g., using data
transfer/processing instructions) causes a branch operation.
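The stack rules above (32-bit PUSH and POP, word-aligned addresses, the two lowest SP bits always reading as zero) can be tried out on a host machine. The sketch below is only a model, not hardware access: the names `stack_model`, `sp_write`, `push` and `pop` are hypothetical, the stack size is arbitrary, and it assumes the full-descending stack used by Cortex-M processors (SP is decremented before a word is stored).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u /* arbitrary size chosen for the illustration */

/* Hypothetical host-side model of one Cortex-M stack (MSP or PSP). */
typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, one entry per 32-bit word */
    uint32_t sp;               /* byte offset into mem; starts at the top */
} stack_model;

/* Writes to SP ignore the two lowest bits, so SP stays word aligned. */
static void sp_write(stack_model *s, uint32_t value)
{
    s->sp = value & ~UINT32_C(3);
}

static void stack_init(stack_model *s)
{
    sp_write(s, STACK_WORDS * 4u); /* empty stack: SP just above the memory */
}

/* PUSH: decrement SP by 4 first, then store the 32-bit word. */
static void push(stack_model *s, uint32_t word)
{
    assert(s->sp >= 4u); /* model-only overflow check */
    s->sp -= 4u;
    s->mem[s->sp / 4u] = word;
}

/* POP: load the 32-bit word at SP, then increment SP by 4. */
static uint32_t pop(stack_model *s)
{
    assert(s->sp <= (STACK_WORDS - 1u) * 4u); /* model-only underflow check */
    uint32_t word = s->mem[s->sp / 4u];
    s->sp += 4u;
    return word;
}
```

A function that calls another function would `push` the LR value on entry and `pop` it back before returning, which is the save-and-restore step described for R14.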
The Cortex-M4 processor has an optional floating point unit. This provides
additional registers for floating point data processing, as well as a Floating
Point Status and Control Register (FPSCR).
CONTROL register
The CONTROL register defines:
• The selection of stack pointer (Main Stack Pointer/Process Stack Pointer)
• Access level in Thread mode (Privileged/Unprivileged)
In addition, for Cortex-M4 processor with a floating point unit, one bit of the
CONTROL register indicates if the current context (currently executed code)
uses the floating point unit or not.
The CONTROL register can only be modified in the privileged access level and
can be read in both privileged and unprivileged access levels.
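As a rough illustration of these rules, the following host-side model decodes the CONTROL bits and shows why unprivileged Thread-mode code cannot restore its own privilege. The bit positions (bit 0 nPRIV, bit 1 SPSEL, bit 2 FPCA) follow the ARMv7-M definition; the `cpu_model` type and the function names are hypothetical, and the model is simplified (for example, it does not model the additional hardware rules for SPSEL in Handler mode).

```c
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_NPRIV (UINT32_C(1) << 0) /* 1 = Thread mode is unprivileged      */
#define CONTROL_SPSEL (UINT32_C(1) << 1) /* 1 = Thread mode uses the PSP         */
#define CONTROL_FPCA  (UINT32_C(1) << 2) /* Cortex-M4 with FPU: FP context in use */

/* Hypothetical, simplified model of the processor state. */
typedef struct {
    uint32_t control;    /* modelled CONTROL register value           */
    bool     in_handler; /* true while an exception handler is running */
} cpu_model;

/* Handler mode is always privileged; Thread mode depends on nPRIV. */
static bool is_privileged(const cpu_model *cpu)
{
    return cpu->in_handler || (cpu->control & CONTROL_NPRIV) == 0u;
}

/* The PSP can only be selected in Thread mode; Handler mode uses the MSP. */
static bool uses_psp(const cpu_model *cpu)
{
    return !cpu->in_handler && (cpu->control & CONTROL_SPSEL) != 0u;
}

/* Writes are ignored in the unprivileged access level.
 * Returns true if the write took effect. */
static bool control_write(cpu_model *cpu, uint32_t value)
{
    if (!is_privileged(cpu)) {
        return false;
    }
    cpu->control = value & (CONTROL_NPRIV | CONTROL_SPSEL | CONTROL_FPCA);
    return true;
}
```

In this model, once `CONTROL_NPRIV` is set from Thread mode, a later `control_write` fails until `in_handler` is true, which mirrors the need to go through an exception handler to regain privileged access.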
ARM-M3
Examples (Slides 52 to 67)
DATA PROCESSING INSTRUCTIONS
Summary
DATA TRANSFER INSTRUCTIONS
CONTROL FLOW INSTRUCTIONS
Memory system
Memory system features
The Cortex-M3 and Cortex-M4 processors have the following memory system features:
• 4GB linear address space - With 32-bit addressing, the ARM processors can access up
to 4GB of memory space. While many embedded systems do not need more than 1MB
of memory, the 32-bit addressing capability ensures future upgrade and expansion
possibilities. The Cortex-M3 and Cortex-M4 processors provide 32-bit buses using a
generic bus protocol called AHB-Lite. The bus allows connections to 32/16/8-bit
memory devices with suitable memory interface controllers.
• Architecturally defined memory map - The 4GB memory space is divided into a
number of regions for various predefined memory and peripheral uses. This allows the
processor design to be optimized for performance. For example, the Cortex-M3 and
Cortex-M4 processors have multiple bus interfaces, so program code can be fetched
from the CODE region at the same time as data is accessed in the SRAM or Peripheral
regions.
• Support for little endian and big endian memory systems - The Cortex-M3 and
Cortex-M4 processors can work with either little endian or big endian memory systems.
In practice, a microcontroller product is normally designed with just one endian
configuration.
• Bit-band accesses (optional) - When the bit-band feature is included
(determined by microcontroller/System-on-Chip vendors), two 1MB regions in
the memory map are bit addressable via two bit-band alias regions. This allows
atomic access to individual bits in SRAM or peripheral address space.
• Write buffer - When a write transfer to a bufferable memory region will take
multiple cycles, the transfer can be buffered by the internal write buffer in the
Cortex-M3 or Cortex-M4 processor so that the processor can continue to
execute the next instruction, if possible. This allows higher program execution
speed.
• Memory Protection Unit (optional) - The MPU is a programmable unit which
defines access permissions for various memory regions. The MPU in the
Cortex-M3 and Cortex-M4 processors supports eight programmable regions,
and can be used with an embedded OS to provide a robust system.
• Unaligned transfer support - All processors supporting the ARMv7-M
architecture (including Cortex-M3 and Cortex-M4 processors) support
unaligned data transfers.
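The bit-band feature maps each bit of a 1MB bit-band region to one 32-bit word in a 32MB alias region. A minimal sketch of the address calculation is shown below; the function name `bitband_alias` is hypothetical, and the base addresses are the standard Cortex-M3/M4 values (bit-band regions at 0x20000000 and 0x40000000, alias regions at 0x22000000 and 0x42000000). It only computes addresses and does not touch memory, so it can be run on a host.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BITBAND_BASE   UINT32_C(0x20000000) /* first 1MB of the SRAM region       */
#define SRAM_ALIAS_BASE     UINT32_C(0x22000000) /* its 32MB alias region              */
#define PERIPH_BITBAND_BASE UINT32_C(0x40000000) /* first 1MB of the Peripheral region */
#define PERIPH_ALIAS_BASE   UINT32_C(0x42000000) /* its 32MB alias region              */
#define BITBAND_SIZE        UINT32_C(0x00100000) /* 1MB */

/* Return the alias word address for one bit of a byte or word in a
 * bit-band region: alias = alias_base + byte_offset * 32 + bit * 4. */
static uint32_t bitband_alias(uint32_t addr, unsigned bit)
{
    uint32_t region_base;
    uint32_t alias_base;

    assert(bit < 32u);
    if (addr - SRAM_BITBAND_BASE < BITBAND_SIZE) {
        region_base = SRAM_BITBAND_BASE;
        alias_base  = SRAM_ALIAS_BASE;
    } else {
        /* Anything else must lie in the peripheral bit-band region. */
        assert(addr - PERIPH_BITBAND_BASE < BITBAND_SIZE);
        region_base = PERIPH_BITBAND_BASE;
        alias_base  = PERIPH_ALIAS_BASE;
    }
    return alias_base + (addr - region_base) * 32u + (uint32_t)bit * 4u;
}
```

On a real device, writing 1 or 0 to the returned alias address sets or clears just that bit, which is what makes the access atomic from the software point of view.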
Memory map
The 4GB address space of the Cortex-M processors is partitioned into a number of memory
regions.
The partitioning is based on typical usages so that different areas are designed to be used
primarily for:
• Program code accesses (e.g., CODE region)
• Data accesses (e.g., SRAM region)
• Peripherals (e.g., Peripheral region)
• Processor’s internal control and debug components (e.g., Private Peripheral Bus)
● The architecture is also flexible enough to allow memory regions to be used for
other purposes. For example, programs can be executed from the CODE as well as the
SRAM region, and a microcontroller can also integrate SRAM blocks in the CODE region.
● The memory map arrangement is consistent between all of the Cortex-M processors.
● For example, the PPB address space hosts the registers for the Nested Vectored
Interrupt Controller (NVIC), processor’s configuration registers, as well as registers for
debug components. This is the same across all Cortex-M devices. This makes it easier to
port software from one Cortex-M device to another, and allows better software
reusability. It also makes it easier for tool vendors, as the debug control for the Cortex-
Code, 0x00000000 to 0x1FFFFFFF: A 512MB memory space primarily for program code,
including the default vector table that is a part of the program memory. This region also
allows data accesses.
SRAM, 0x20000000 to 0x3FFFFFFF: The SRAM region is located in the next 512MB of
memory space. It is primarily for connecting SRAM, mostly on-chip SRAM, but there is no
limitation of exact memory type. The first 1MB of the SRAM region is bit addressable if the
optional bit-band feature is included. You can also execute program code from this region.
Peripherals, 0x40000000 to 0x5FFFFFFF: The Peripheral memory region also has the size
of 512MB, and is used mostly for on-chip peripherals. Similar to the SRAM region, the first
1MB of the Peripheral region is bit addressable if the optional bit-band feature is included.
RAM, 0x60000000 to 0x9FFFFFFF: The RAM region contains two slots of 512MB memory
space (total 1GB) for other RAM such as off-chip memories. The RAM region can be used
for program code as well as data.
Devices, 0xA0000000 to 0xDFFFFFFF: The Device region contains two slots of 512MB
memory space (total 1GB) for other peripherals such as off-chip peripherals.
System, 0xE0000000 to 0xFFFFFFFF: The System region contains several parts:
• Internal Private Peripheral Bus (PPB), 0xE0000000 to 0xE003FFFF: The internal Private
Peripheral Bus (PPB) is used to access system components such as the NVIC, SysTick, MPU,
as well as debug components inside the Cortex-M3/M4 processors. In most cases this
memory space can only be accessed by program code running in privileged state.
• External Private Peripheral Bus (PPB), 0xE0040000 to 0xE00FFFFF: An additional PPB
region is available for additional optional debug components, and so allows silicon vendors
to add their own debug or vendor-specific components. This memory space can only be
accessed by program code running in privileged state. Note that the base address of debug
components on this bus can potentially be changed by silicon designers.
• Vendor-specific area, 0xE0100000 to 0xFFFFFFFF: The remaining memory space is
reserved for vendor-
The Cortex-M3 Memory Map
The Memory Map
The Cortex-M3 has a predefined memory map. This allows the built-in
peripherals, such as the interrupt controller and the debug components,
to be accessed by simple memory access instructions.
The predefined memory map also allows the Cortex-M3 processor to be
highly optimized for speed and ease of integration in system-on-a-chip
(SoC) designs.
Overall, the 4 GB memory space can be divided into ranges as shown in
the figure.
The Cortex-M3 design has an internal bus infrastructure optimized for this
memory usage. In addition, the design allows these regions to be used
differently.
For example, data memory can still be put into the CODE region, and
program code can be executed from an external Random Access Memory
(RAM) region.
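Because every region in the predefined memory map is a multiple of 512MB, the top three address bits are enough to tell which region an address belongs to. The sketch below illustrates this with the architectural ranges (Code from 0x00000000, SRAM from 0x20000000, Peripheral from 0x40000000, RAM from 0x60000000, Device from 0xA0000000, System from 0xE0000000); the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names for the six architectural regions. */
typedef enum {
    REGION_CODE,       /* 0x00000000 to 0x1FFFFFFF */
    REGION_SRAM,       /* 0x20000000 to 0x3FFFFFFF */
    REGION_PERIPHERAL, /* 0x40000000 to 0x5FFFFFFF */
    REGION_RAM,        /* 0x60000000 to 0x9FFFFFFF */
    REGION_DEVICE,     /* 0xA0000000 to 0xDFFFFFFF */
    REGION_SYSTEM      /* 0xE0000000 to 0xFFFFFFFF */
} mem_region;

/* Classify an address by its 512MB slot (address bits 31:29). */
static mem_region region_of(uint32_t addr)
{
    switch (addr >> 29) {
    case 0:  return REGION_CODE;
    case 1:  return REGION_SRAM;
    case 2:  return REGION_PERIPHERAL;
    case 3:
    case 4:  return REGION_RAM;    /* two 512MB slots */
    case 5:
    case 6:  return REGION_DEVICE; /* two 512MB slots */
    default: return REGION_SYSTEM;
    }
}
```

For example, `region_of(0xE000E100u)` yields `REGION_SYSTEM`, which is where the NVIC and other built-in components are reached with ordinary memory access instructions.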
Multiple bus interface for different memory regions
Connecting the processor to memory and peripherals
● The number of the currently running exception is indicated by the special
register Interrupt Program Status Register (IPSR), or by the VECTACTIVE field
of the Nested Vectored Interrupt Controller's (NVIC's) Interrupt Control State
register.
● When an enabled exception occurs but cannot be carried out immediately (for
instance, if a higher-priority interrupt service routine is running or if the interrupt
mask register is set), it will be pended (except for some fault exceptions).
● This means that a register (pending status) will hold the exception request
until the exception can be carried out.
● This is different from traditional ARM processors.
● Previously, the devices that generated interrupts, such as interrupt request
(IRQ)/fast interrupt request (FIQ), had to hold the request until it was served.
Definitions of Priority
● In the Cortex-M3, whether and when an exception can be carried out can be
affected by the priority of the exception.
● A higher-priority (smaller number in priority level) exception can preempt a
lower priority (larger number in priority level) exception; this is the nested
exception / interrupt scenario.
● Some of the exceptions (reset, NMI, and hard fault) have fixed priority levels.
They are negative numbers to indicate that they are of higher priority than
other exceptions.
● Other exceptions have programmable priority levels.
● The Cortex-M3 supports three fixed highest-priority levels and up to 256
levels of programmable priority (a maximum of 128 levels of preemption).
● If the priority level configuration registers are 8 bits wide, why are there only
128 preemption levels?
● This is because the 8-bit register is further divided into two parts: preempt
priority and subpriority.
● Using a configuration register in the NVIC called Priority Group (a part of the
Application Interrupt and Reset Control register in the NVIC), the priority-
level configuration register for each exception with programmable priority
levels is divided into two halves.
● The upper half (left bits) is the preempt priority, and the lower half (right bits)
is the subpriority.
● The preempt priority level defines whether an interrupt can take place when
the processor is already running another interrupt handler.
● The subpriority level value is used only when two exceptions with the same
preempt priority level occur at the same time. In this case, the exception
with higher subpriority (lower value) will be handled first.
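The split into preempt priority and subpriority can be written out as a small calculation. The sketch below assumes the ARMv7-M encoding of the Priority Group (PRIGROUP) field, where a value n from 0 to 7 places the boundary so that bits [n:0] form the subpriority and the bits above them form the preempt priority. The type and function names are hypothetical, and the sketch assumes all 8 priority bits are implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the two halves of one 8-bit priority value. */
typedef struct {
    uint8_t preempt; /* upper (left) bits: decides preemption            */
    uint8_t sub;     /* lower (right) bits: tie-break among equal preempt */
} split_priority;

/* Split an 8-bit priority according to the Priority Group setting. */
static split_priority split_by_prigroup(uint8_t priority, unsigned prigroup)
{
    split_priority p;
    unsigned sub_bits;
    unsigned sub_mask;

    assert(prigroup <= 7u);
    sub_bits = prigroup + 1u;            /* at least 1 subpriority bit */
    sub_mask = (1u << sub_bits) - 1u;

    p.sub     = (uint8_t)(priority & sub_mask);
    p.preempt = (uint8_t)(priority >> sub_bits);
    return p;
}
```

Since at least one bit always belongs to the subpriority, the preempt priority has at most 7 bits, which gives the maximum of 128 preemption levels mentioned above.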
Nested vectored interrupt controller (NVIC)
The NVIC is a part of the Cortex-M processor.
It is programmable and its registers are located in the System Control Space (SCS) of
the memory map.
The NVIC handles the exceptions and interrupt configurations, prioritization, and
interrupt masking.
The NVIC has the following features:
• Flexible exception and interrupt management
• Nested exception/interrupt support
• Vectored exception/interrupt entry
• Interrupt masking
Flexible exception and interrupt management
● Each interrupt can be enabled or disabled and can have its pending status set or
cleared by software.
● The NVIC can handle various types of interrupt sources:
• Pulsed interrupt request - the interrupt request is at least one clock cycle long.
When the NVIC receives a pulse at its interrupt input, the pending status is set and
held until the interrupt gets serviced.
• Level triggered interrupt request - the interrupt source holds the request high until
the interrupt is serviced.
● The signal level at the NVIC input is active high.
● However, the actual external interrupt input on the microcontroller could be
designed differently and is converted to an active high signal level by on-chip
logic.
Nested exception/interrupt support
● Each exception has a priority level.
● Some exceptions, such as interrupts, have programmable priority levels and
some others have a fixed priority level.
● When an exception occurs, the NVIC will compare the priority level of this
exception to the current level. If the new exception has a higher priority, the
current running task will be suspended.
● Some of the registers will be stored on the stack memory, and the processor will
start executing the exception handler of the new exception. This process is
called “preemption.” When the higher priority exception handler is complete, it is
terminated with an exception return operation and the processor automatically
restores the registers from stack and resumes the task that was running
previously. This mechanism allows nesting of exception services without any
Vectored exception/interrupt entry
● When an exception occurs, the processor will need to locate the starting point of
the corresponding exception handler.
● Traditionally, in ARM processors such as the ARM7TDMI, software handles this
step. The Cortex-M processors automatically locate the starting point of the
exception handler from a vector table in the memory.
● As a result, the delays from the start of the exception to the execution of the
exception handlers are reduced.
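The vector lookup itself is a simple address calculation: each vector table entry is one 32-bit word, entry 0 holds the initial Main Stack Pointer value, and the entry for exception number N sits at table base + 4 × N. The sketch below assumes the standard Cortex-M numbering (Reset = 1, NMI = 2, HardFault = 3, SysTick = 15, external interrupt n = 16 + n); the function names are hypothetical.

```c
#include <stdint.h>

#define EXC_RESET     1u
#define EXC_NMI       2u
#define EXC_HARDFAULT 3u
#define EXC_SYSTICK   15u
#define EXC_IRQ0      16u /* first external interrupt */

/* Exception number of external interrupt (IRQ) number irq. */
static uint32_t irq_exception_number(uint32_t irq)
{
    return EXC_IRQ0 + irq;
}

/* Address of the vector table word that holds the handler's start address. */
static uint32_t vector_address(uint32_t table_base, uint32_t exception_number)
{
    return table_base + 4u * exception_number;
}
```

For a table at address 0, the vector for IRQ 0 is at `vector_address(0u, irq_exception_number(0u))`, that is 0x40; the processor reads the handler address from that word without any software involvement.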
Interrupt masking
● The NVIC in the Cortex-M3 and Cortex-M4 processors provides several interrupt
masking registers such as the PRIMASK special register.
● Using the PRIMASK register you can disable all exceptions, excluding HardFault
and NMI.
● This masking is useful for operations that should not be interrupted, like
time-critical control tasks or real-time multimedia codecs.
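One way to picture the effect of PRIMASK is that setting it raises the current execution priority to 0, so only exceptions with a negative (fixed) priority can still get through. The sketch below models the preemption decision under that assumption, using the fixed priority values -3, -2 and -1 for reset, NMI and HardFault. The function name, the `PRIO_THREAD` constant, and the use of plain integers for preempt priority are all hypothetical simplifications.

```c
#include <stdbool.h>

#define PRIO_RESET     (-3)
#define PRIO_NMI       (-2)
#define PRIO_HARDFAULT (-1)
#define PRIO_THREAD    256  /* hypothetical: no exception active, lowest urgency */

/* Decide whether a new exception may preempt the currently running code.
 * Smaller numbers mean higher priority; equal priority does not preempt. */
static bool can_preempt(int new_prio, int current_prio, bool primask)
{
    int effective = current_prio;

    /* PRIMASK set: treat the running code as priority 0, which blocks
     * every exception with a programmable (non-negative) priority. */
    if (primask && effective > 0) {
        effective = 0;
    }
    return new_prio < effective;
}
```

With `primask` true, `can_preempt(5, PRIO_THREAD, true)` is false, while `can_preempt(PRIO_NMI, PRIO_THREAD, true)` is still true, matching the statement that PRIMASK disables all exceptions except HardFault and NMI.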
Cortex-M3 Processor Applications