Compiler User Guide 100748 0611 00 en
Version 6.11
User Guide
Arm® Compiler
Copyright © 2016–2018 Arm Limited or its affiliates. All rights reserved.
Release Information
Document History
Your access to the information in this document is conditional upon your acceptance that you will not use or permit others to use
the information for the purposes of determining whether implementations infringe any third party patents.
THIS DOCUMENT IS PROVIDED “AS IS”. ARM PROVIDES NO REPRESENTATIONS AND NO WARRANTIES,
EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE
WITH RESPECT TO THE DOCUMENT. For the avoidance of doubt, Arm makes no representation with respect to, and has
undertaken no analysis to identify or understand the scope and content of, third party patents, copyrights, trade secrets, or other
rights.
TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL ARM BE LIABLE FOR ANY DAMAGES,
INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR
CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING
OUT OF ANY USE OF THIS DOCUMENT, EVEN IF ARM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.
This document consists solely of commercial items. You shall be responsible for ensuring that any use, duplication or disclosure of
this document complies fully with any relevant export laws and regulations to assure that this document or any portion thereof is
not exported, directly or indirectly, in violation of such export laws. Use of the word “partner” in reference to Arm’s customers is
not intended to create or refer to any partnership relationship with any other company. Arm may make changes to this document at
any time and without notice.
If any of the provisions contained in these terms conflict with any of the provisions of any click through or signed written
agreement covering this document with Arm, then the click through or signed written agreement prevails over and supersedes the
conflicting provisions of these terms. This document may be translated into other languages for convenience, and you agree that if
there is any conflict between the English version of this document and any translation, the terms of the English version of the
Agreement shall prevail.
The Arm corporate logo and words marked with ® or ™ are registered trademarks or trademarks of Arm Limited (or its
subsidiaries) in the US and/or elsewhere. All rights reserved. Other brands and names mentioned in this document may be the
trademarks of their respective owners. Please follow Arm’s trademark usage guidelines at http://www.arm.com/company/policies/trademarks.
Copyright © 2016–2018 Arm Limited (or its affiliates). All rights reserved.
LES-PRE-20349
Confidentiality Status
This document is Non-Confidential. The right to use, copy and disclose this document may be subject to license restrictions in
accordance with the terms of the agreement entered into by Arm and the party that Arm delivered this document to.
Preface
About this book
Glossary
The Arm® Glossary is a list of terms used in Arm documentation, together with definitions for those
terms. The Arm Glossary does not contain terms that are industry standard unless the Arm meaning
differs from the generally accepted meaning.
See the Arm® Glossary for more information.
Typographic conventions
italic
Introduces special terminology, denotes cross-references, and citations.
bold
Highlights interface elements, such as menu names. Denotes signal names. Also used for terms
in descriptive lists, where appropriate.
monospace
Denotes text that you can enter at the keyboard, such as commands, file and program names,
and source code.
monospace (underlined)
Denotes a permitted abbreviation for a command or option. You can enter the underlined text
instead of the full command or option name.
monospace italic
Denotes arguments to monospace text where the argument is to be replaced by a specific value.
monospace bold
Denotes language keywords when used outside example code.
< and >
Encloses replaceable terms for assembler syntax where they appear in code or code fragments.
For example:
MRC p15, 0, <Rd>, <CRn>, <CRm>, <Opcode_2>
SMALL CAPITALS
Used in body text for a few terms that have specific technical meanings, that are defined in the
Arm® Glossary. For example, IMPLEMENTATION DEFINED, IMPLEMENTATION SPECIFIC, UNKNOWN, and
UNPREDICTABLE.
Feedback
Feedback on content
If you have comments on content then send an e-mail to [email protected]. Give:
• The title Arm Compiler User Guide.
• The number 100748_0611_00_en.
• If applicable, the page number(s) to which your comments refer.
• A concise explanation of your comments.
Arm also welcomes general suggestions for additions and improvements.
Note
Arm tests the PDF only in Adobe Acrobat and Acrobat Reader, and cannot guarantee the quality of the
represented document when used with any other PDF reader.
Other information
• Arm® Developer.
• Arm® Information Center.
• Arm® Technical Support Knowledge Articles.
• Technical Support.
• Arm® Glossary.
This chapter introduces Arm Compiler 6 and helps you to start working with Arm Compiler 6 quickly.
You can use Arm Compiler 6 from Arm Development Studio, Arm DS-5 Development Studio, Arm
Keil® MDK, or as a standalone product.
100748_0611_00_en Copyright © 2016–2018 Arm Limited or its affiliates. All rights 1-13
reserved.
Non-Confidential
1 Getting Started
1.1 Introduction to Arm® Compiler 6
armlink
The linker combines the contents of one or more object files with selected parts of one or more
object libraries to produce an executable program.
armar
The archiver enables sets of ELF object files to be collected together and maintained in archives
or libraries. You can pass such a library or archive to the linker in place of several ELF files.
You can also use the archive for distribution to a third party application developer.
fromelf
The image conversion utility can convert Arm ELF images to binary formats and can also
generate textual information about the input image, such as its disassembly and its code and data
size.
Arm C++ libraries
The Arm C++ libraries are based on the LLVM libc++ project:
• The libc++abi library is a runtime library providing implementations of low-level language
features.
• The libc++ library provides an implementation of the ISO C++ library standard. It depends
on the functions that are provided by libc++abi.
Arm C libraries
The Arm C libraries provide:
• An implementation of the library features as defined in the C standards.
• Nonstandard extensions common to many C libraries.
• POSIX extended functionality.
• Functions standardized by POSIX.
Application development
A typical application development flow might involve the following:
[Figure: typical application development flow. Recoverable labels: C/C++ source (.c), object files (.o), A32 and T32 code, data, debug.]
1.2 Installing Arm® Compiler
System Requirements
Arm Compiler 6 is available for the following operating systems:
• Windows 64-bit.
• Windows 32-bit.
• Linux 64-bit.
For more information on system requirements see the Arm® Compiler release note.
If you need to set any other environment variable, such as ARM_TOOL_VARIANT, see Toolchain
environment variables on page Appx-A-148 for more information.
Related tasks
1.3 Accessing Arm® Compiler from Arm® Development Studio or Arm® DS-5 Development Studio
on page 1-18
1.4 Accessing Arm® Compiler from the Arm® Keil® µVision® IDE on page 1-20
1.3 Accessing Arm® Compiler from Arm® Development Studio or Arm® DS-5
Development Studio
Arm Development Studio and DS-5 Development Studio are development suites that provide Arm
Compiler 6 as a built-in toolchain.
This task describes how to access and configure Arm Compiler from DS-5 Development Studio.
Note
See the Arm Development Studio documentation for how to access and configure Arm Compiler from
Arm Development Studio.
Prerequisites
Ensure you have DS-5 Development Studio installed. Create a new C or C++ project in DS-5
Development Studio. For information on creating new projects in DS-5 Development Studio, see
Creating a new C or C++ project.
Procedure
1. Select the project in DS-5 Development Studio.
2. Select Project > Properties.
3. From the left-hand side menu, select C/C++ Build > Tool Chain Editor.
4. In the Current toolchain options, select ARM Compiler 6 if this is not already selected.
5. From the left-hand side menu, select C/C++ Build > Settings.
Figure 1-2 Accessing Arm Compiler settings from DS-5 Development Studio
For information about using DS-5, see the Arm® DS-5 Getting Started Guide and Arm® DS-5
Debugger Guide.
6. After setting the compiler options, right-click on the project and select Build Project.
Related reference
1.2 Installing Arm® Compiler on page 1-16
1.4 Accessing Arm® Compiler from the Arm® Keil® µVision® IDE
MDK is a microprocessor development suite that provides the µVision® IDE, and Arm Compiler 6 as a
built-in toolchain.
This task describes how to access and configure Arm Compiler 6 from the µVision IDE.
Prerequisites
Ensure you have µVision installed. Create a new project in µVision.
Procedure
1. Select the project in µVision.
2. Select Project > Manage > Project 'project_name' Project Items.
1.5 Compiling a Hello World example
There is no default target for AArch32 state. You must specify either -march to target an
architecture or -mcpu to target a processor. This example uses -mcpu to target the Cortex‑A53
processor. The compiler generates code that is optimized specifically for the Cortex‑A53, but
might not run on other processors.
Use -mcpu=list or -march=list to see all available processor or architecture options.
size against code speed, whereas -Omax uses aggressive optimizations to target performance
optimization.
• Instruction set. AArch32 targets support two instruction sets that you specify with the -m option. The
-marm option specifies A32, that is 32-bit instructions, to emphasize performance. The -mthumb
option specifies T32, that is mixed 32-bit and 16-bit instructions, to emphasize code density.
...
main
0x000081a0: e92d4800 .H-. PUSH {r11,lr}
0x000081a4: e1a0b00d .... MOV r11,sp
0x000081a8: e24dd010 ..M. SUB sp,sp,#0x10
0x000081ac: e3a00000 .... MOV r0,#0
0x000081b0: e50b0004 .... STR r0,[r11,#-4]
0x000081b4: e30a19cc .... MOV r1,#0xa9cc
...
• Examine the size of code and data in the executable:
fromelf --text -z a.out
See fromelf Command-line Options for details of the options that the fromelf tool accepts.
This example compiles the two source files file1.c and file2.c for an AArch64 state target. The -o
option specifies that the filename of the generated executable is image.axf.
More complex projects might have many more source files. It is not efficient to compile every source file
at every compilation, because most source files are unlikely to change. To avoid compiling unchanged
source files, you can compile and link as separate steps. In this way, you can then use a build system
(such as make) to compile only those source files that have changed, then link the object code together.
The armclang -c option tells the compiler to compile to object code and stop before calling the linker:
armclang -c --target=aarch64-arm-none-eabi file1.c
armclang -c --target=aarch64-arm-none-eabi file2.c
armlink file1.o file2.o -o image.axf
Related information
armclang --target option
armclang -march option
armclang -mcpu option
Summary of armclang command-line options
1.6 Using the integrated assembler
.global mystrcopy
.type mystrcopy, "function"
mystrcopy:
ldrb r2, [r1], #1
strb r2, [r0], #1
cmp r2, #0
bne mystrcopy
bx lr
The .section directive creates a new section in the object file named StringCopy. The characters in the
string following the section name are the flags for this section. The a flag marks this section as
allocatable. The x flag marks this section as executable.
The .balign directive aligns the subsequent code to a 4-byte boundary. The alignment is required for
compliance with the Arm® Application Procedure Call Standard (AAPCS).
The .global directive marks the symbol mystrcopy as a global symbol. This enables the symbol to be
referenced by external files.
The .type directive sets the type of the symbol mystrcopy to function. This helps the linker use the
proper linkage when the symbol is branched to from A32 or T32 code.
...
** Section #3 'StringCopy' (SHT_PROGBITS) [SHF_ALLOC + SHF_EXECINSTR]
Size : 14 bytes (alignment 4)
Address: 0x00000000
$t.0
mystrcopy
0x00000000: f8112b01 ...+ LDRB r2,[r1],#1
0x00000004: f8002b01 ...+ STRB r2,[r0],#1
0x00000008: 2a00 .* CMP r2,#0
0x0000000a: d1f9 .. BNE mystrcopy ; 0x0
0x0000000c: 4770 pG BX lr
...
The example shows the disassembly for the section StringCopy as created in the source file.
Note
The code is marked as T32 by default because Armv8‑M Mainline does not support A32 code. For
processors that support A32 and T32 code, you can explicitly mark the code as A32 or T32 by adding the
GNU assembly .arm or .thumb directive, respectively, at the start of the source file.
int main(void) {
mystrcopy(dest, source);
return 0;
}
An extern function declaration has been added for the mystrcopy function. The return type and function
parameters must be checked manually.
If you want to call the assembly function from a C++ source file, you must disable C++ name mangling
by using extern "C" instead of extern. For the above example, use:
extern "C" void mystrcopy(char *dest, const char *source);
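One way to keep a single declaration usable from both C and C++ is the conventional __cplusplus guard. The sketch below shows that pattern, together with a plain C routine that mirrors the byte-by-byte loop of the assembly example and can help when checking the return type and parameters by hand. The name mystrcopy_ref is a hypothetical helper introduced for illustration; it is not part of the toolchain or of the original example.

```c
/* Declaration shared by C and C++ callers. The guard disables C++ name
   mangling so that the symbol matches the label in the assembly file. */
#ifdef __cplusplus
extern "C" {
#endif

void mystrcopy(char *dest, const char *source);

#ifdef __cplusplus
}
#endif

/* Hypothetical C reference with the same contract as the assembly
   routine: copy bytes up to and including the terminating NUL. */
void mystrcopy_ref(char *dest, const char *source)
{
    char c;
    do {
        c = *source++;   /* corresponds to: ldrb r2, [r1], #1 */
        *dest++ = c;     /* corresponds to: strb r2, [r0], #1 */
    } while (c != '\0'); /* corresponds to: cmp r2, #0 ; bne  */
}
```

As with the assembly version, the caller is responsible for making sure that dest is large enough for the source string and its terminator.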
1.7 Running bare-metal images
1.8 Architectures supported by Arm® Compiler
arm-arm-none-eabi
Generates A32 and T32 instructions for AArch32 state. Must be used in conjunction with -march
(to target an architecture) or -mcpu (to target a processor).
To generate generic code that runs on any processor with a particular architecture, use the -march option.
Use the -march=list option to see all supported architectures.
To optimize your code for a particular processor, use the -mcpu option. Use the -mcpu=list option to see
all supported processors.
Note
The --target, -march, and -mcpu options are armclang options. For all of the other tools, such as
armasm and armlink, use the --cpu option to specify target processors and architectures.
Related information
armclang --target option
armclang -march option
armclang -mcpu option
armlink --cpu option
Arm Glossary
Chapter 2
Using Common Compiler Options
There are many options that you can use to control how Arm Compiler 6 generates code for your
application. This section lists the mandatory and commonly used optional command-line arguments,
such as those that control target selection, optimization, and debug view.
2.1 Mandatory armclang options
Specifying a target
To specify a target, use the --target option. The following targets are available:
• To generate A64 instructions for AArch64 state, specify --target=aarch64-arm-none-eabi.
Note
For AArch64, the default architecture is Armv8‑A.
• To generate A32 and T32 instructions for AArch32 state, specify --target=arm-arm-none-eabi. To
specify generation of either A32 or T32 instructions, use -marm or -mthumb respectively.
Note
AArch32 has no defaults. You must always specify an architecture or processor.
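The effect of the target and instruction set options can be observed from source code through predefined macros. The sketch below assumes the __aarch64__, __arm__, and __thumb__ macros that Clang-based compilers predefine; the function name target_state is hypothetical and is used here only for illustration.

```c
/* Report the instruction set state that this translation unit was
   compiled for, based on compiler-predefined macros. */
const char *target_state(void)
{
#if defined(__aarch64__)
    return "AArch64 (A64)";
#elif defined(__arm__) && defined(__thumb__)
    return "AArch32 (T32)";
#elif defined(__arm__)
    return "AArch32 (A32)";
#else
    return "not an Arm target";
#endif
}
```

Building the same file once with -marm and once with -mthumb for an AArch32 target would be expected to select different branches, which makes this a quick check that the intended options reached the compiler.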
Specifying an architecture
To generate code for a specific architecture, use the -march option. The supported architectures vary
according to the selected target.
To see a list of all the supported architectures for the selected target, use -march=list.
Specifying a processor
To generate code for a specific processor, use the -mcpu option. The supported processors vary according
to the selected target.
To see a list of all the supported processors for the selected target, use -mcpu=list.
It is also possible to enable or disable optional architecture features, by using the +[no]feature notation.
For a list of the architecture features that your processor supports, see the processor product
documentation. See the armclang Reference Guide for a list of architecture features that Arm Compiler
supports.
Use +feature or +nofeature to explicitly enable or disable an optional architecture feature.
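Source code can adapt to an optional feature through the corresponding ACLE feature macro. The sketch below is an illustration under stated assumptions: it assumes that the CRC extension is enabled with a feature such as +crc, that the compiler then defines __ARM_FEATURE_CRC32, and that arm_acle.h provides the __crc32b intrinsic. When the macro is not defined, the function falls back to a bitwise software loop. The function name crc32_bytes is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* declares __crc32b when the CRC feature is enabled */
#endif

/* Standard CRC-32 (reflected polynomial 0xEDB88320) over a byte buffer. */
uint32_t crc32_bytes(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    for (i = 0; i < len; i++) {
#if defined(__ARM_FEATURE_CRC32)
        /* Hardware path: one CRC instruction per input byte. */
        crc = __crc32b(crc, data[i]);
#else
        /* Software path: process the byte one bit at a time. */
        int bit;
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial when the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
#endif
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Keeping both paths behind one function means that the calling code does not change when the feature is enabled or disabled on the command line.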
Note
You do not need to specify both the architecture and processor. The compiler infers the architecture from
the processor. If you only want to run code on one particular processor, you can specify the specific
processor. Performance is optimized, but code is only guaranteed to run on that processor. If you want
your code to run on a range of processors from a particular architecture, you can specify the architecture.
The code runs on any processor implementation of the target architecture, but performance might be
impacted.
Examples
These examples compile and link the input file helloworld.c:
• To compile for the Armv8‑A architecture in AArch64 state, use:
armclang --target=aarch64-arm-none-eabi -march=armv8-a helloworld.c
• To compile for the Armv8‑R architecture in AArch32 state, use:
armclang --target=arm-arm-none-eabi -march=armv8-r helloworld.c
• To compile for the Armv8‑M architecture mainline profile, use:
armclang --target=arm-arm-none-eabi -march=armv8-m.main helloworld.c
Related information
armclang --target option
armclang -march option
armclang -mcpu option
armclang -marm option
armclang -mthumb option
Summary of armclang command-line options
2.2 Selecting source language options
Note
This topic includes descriptions of [COMMUNITY] features. See Support level definitions
on page Appx-A-139.
Source language
By default Arm Compiler treats files with .c extension as C source files. If you want to compile a .c
file, for example file.c, as a C++ source file, use the -xc++ option:
armclang --target=aarch64-arm-none-eabi -march=armv8-a -xc++ file.c
By default Arm Compiler treats files with .cpp extension as C++ source files. If you want to compile
a .cpp file, for example file.cpp, as a C source file, use the -xc option:
armclang --target=aarch64-arm-none-eabi -march=armv8-a -xc file.cpp
The -x option only applies to input files that follow it on the command line.
Supported C++ language standards include c++14 and gnu++14.
The default language standard for C code is gnu11 [COMMUNITY]. The default language standard for
C++ code is gnu++14. To specify a different source language standard, use the -std=name option.
Arm Compiler supports various language extensions, including GCC extensions, which you can use in
your source code. The GCC extensions are only available when you specify one of the GCC C or C++
language variants. For more information on language extensions, see the Arm® C Language Extensions in
Arm Compiler.
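The selected language variant is visible to the source code through standard and compiler-defined macros. The sketch below assumes the ISO C macro __STDC_VERSION__ and the __STRICT_ANSI__ macro that GCC-compatible compilers define for the non-GNU variants; the two function names are hypothetical.

```c
/* Value of __STDC_VERSION__, or 0 when it is not defined (as in C90). */
long c_standard_version(void)
{
#if defined(__STDC_VERSION__)
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* 1 when a GNU variant such as gnu11 is selected, 0 for a strict
   variant such as c11. */
int gnu_variant_selected(void)
{
#if defined(__STRICT_ANSI__)
    return 0;
#else
    return 1;
#endif
}
```

For example, compiling with -std=c99 would be expected to make c_standard_version return 199901 and gnu_variant_selected return 0.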
Since Arm Compiler uses the available language extensions by default, it does not adhere to the strict
ISO Standard. To compile to strict ISO standard for the source language, use the -Wpedantic option.
This shows warnings where the source code violates the ISO Standard. Arm Compiler does not support
strict adherence to C++98 or C++03.
If you do not use -Wpedantic, Arm Compiler uses the available language extensions without warning.
However, where language variants produce different behavior, the behavior of the language variant
specified by -std will apply.
Note
Certain compiler optimizations can violate strict adherence to the ISO Standard for the language. To
identify when these violations happen, use the -Wpedantic option.
The following example shows the use of a variable length array, which is a C99 feature. In this example,
the function declares an array i, with variable length n.
#include <stdlib.h>
void function(int n) {
int i[n];
}
Arm Compiler does not warn when compiling the example for C99 with -Wpedantic:
armclang --target=aarch64-arm-none-eabi -march=armv8-a -c -std=c99 -Wpedantic file.c
Arm Compiler does warn about variable length arrays when compiling the example for C90 with -Wpedantic:
armclang --target=aarch64-arm-none-eabi -march=armv8-a -c -std=c90 -Wpedantic file.c
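Where code must also build cleanly as C90, one option is to replace the variable length array with heap allocation. The following is a minimal sketch of that approach; the function name sum_first_n and its error convention are assumptions made for illustration.

```c
#include <stdlib.h>

/* Sum 0..n-1 through a heap buffer instead of a variable length array.
   Returns -1 if n is negative or the allocation fails. */
int sum_first_n(int n)
{
    int *values;
    int k;
    int total = 0;

    if (n < 0) return -1;
    if (n == 0) return 0;

    values = (int *)malloc((size_t)n * sizeof *values);
    if (values == NULL) return -1;

    for (k = 0; k < n; k++) values[k] = k;       /* fill the buffer */
    for (k = 0; k < n; k++) total += values[k];  /* then sum it */

    free(values);
    return total;
}
```

All declarations appear at the start of the block, so the function stays within what C90 allows.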
Related information
Standard C++ library implementation definition
2.3 Selecting optimization options
The example shows the optimization performed with the -O1 optimization option. To perform this
optimization, compile your source file using:
armclang --target=arm-arm-none-eabi -march=armv7-a -O1 -c -S file.c
The example shows the optimization performed with the -O0 optimization option. To perform this
optimization, compile your source file using:
armclang --target=arm-arm-none-eabi -march=armv7-a -O0 -c -S file.c
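A small function is enough to compare the assembly that the -O0 and -O1 commands produce. The sketch below is a hypothetical file.c written for this comparison, not a listing from the toolchain documentation.

```c
/* Hypothetical contents of file.c: sum the integers 1..n with a loop. */
int sum_to(int n)
{
    int total = 0;
    int k;
    for (k = 1; k <= n; k++) {
        total += k;
    }
    return total;
}
```

At -O0 the generated assembly typically keeps total and k in stack slots and reloads them on each iteration, whereas at -O1 the compiler generally keeps them in registers. The exact output depends on the compiler version.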
2.4 Building to aid debugging
When linking, there are several armlink options available to help improve the debug view:
• --debug. This option is the default.
• --no_remove to retain all input sections in the final image even if they are unused.
• --bestdebug. When different input objects are compiled with different optimization levels, this
option enables linking for the best debug illusion.
2.5 Linker options for mapping code and data to target memory
For an image to run correctly on a target, you must place the various parts of the image at the correct
locations in memory. Linker command-line options are available to map the various parts of an image to
target memory.
The options implement the scatter-loading mechanism that describes the memory layout for the image.
The options that you use depend on the complexity of your image:
• For simple images, use the following memory map related options:
— --ro_base to specify the address of both the load and execution region containing the RO output
section.
— --rw_base to specify the address of the execution region containing the RW output section.
— --zi_base to specify the address of the execution region containing the ZI output section.
Note
For objects that include execute-only (XO) sections, the linker provides the --xo_base option to
locate the XO sections. These are objects that are targeted at Armv7‑M or Armv8‑M
architectures, or objects that are built with the armclang -mthumb option.
• For complex images, use a text format scatter-loading description file. This file is known as a scatter
file, and you specify it with the --scatter option.
Note
You cannot use the memory map related options with the --scatter option.
Examples
The following example shows how to place code and data using the memory map related options:
armlink --ro_base=0x0 --rw_base=0x400000 --zi_base=0x405000 --first="init.o(init)" init.o main.o
Note
In this example, --first is also included to make sure that the initialization routine is executed first.
The following example shows a scatter file, scatter.scat, that defines an equivalent memory map:
LR1 0x0000 0x20000
{
    ER_RO 0x0
    {
        init.o (INIT, +FIRST)
        *(+RO)
    }
    ER_RW 0x400000
    {
        *(+RW)
    }
    ER_ZI 0x405000
    {
        *(+ZI)
    }
}
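In this memory map the RW region starts at 0x400000 and the ZI region at 0x405000, so the RW data must fit in 0x5000 bytes. The sketch below illustrates that arithmetic as a simple overlap check that could be run on a host before committing to a set of base addresses. The struct, the function name, and the region sizes used with it are assumptions for illustration; the linker does not provide them.

```c
#include <stdint.h>

/* One execution region: base address and size in bytes. */
struct region {
    uint32_t base;
    uint32_t size;
};

/* Return 1 if two non-empty regions share at least one address, else 0.
   64-bit arithmetic avoids wrap-around at the top of the address space. */
int regions_overlap(struct region a, struct region b)
{
    uint64_t a_end = (uint64_t)a.base + a.size;  /* one past the last byte */
    uint64_t b_end = (uint64_t)b.base + b.size;
    return a.base < b_end && b.base < a_end;
}
```

With an RW region of 0x5000 bytes at 0x400000 and a ZI region at 0x405000, the check reports no overlap. Growing the RW region to 0x6000 bytes makes it report an overlap.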
2.6 Controlling diagnostic messages
Option Description
-Werror Turn all warnings into errors.
-Werror=foo Turn warning flag foo into an error.
See Controlling Errors and Warnings in the Clang Compiler User's Manual for full details about
controlling diagnostics with armclang.
int y=i+x;
printf("Result of %d plus %d is %d\n", i, x); /* Missing an input argument for the third %d */
call(); /* This function has not been declared and is therefore an implicit declaration */
return;
}
By default armclang checks the format of printf() statements to ensure that the number of % format
specifiers matches the number of data arguments. Therefore Arm Compiler generates this diagnostic
message:
file.c:9:36: warning: more '%' conversions than data arguments [-Wformat]
printf("Result of %d plus %d is %d\n", i, x);
^
By default armclang compiles for the gnu11 standard for .c files. This language standard does not allow
implicit function declarations. Therefore Arm Compiler generates this diagnostic message:
file.c:11:3: warning: implicit declaration of function 'call' is invalid C99 [-Wimplicit-function-declaration]
call();
^
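Both diagnostics can be resolved in the source rather than suppressed. The sketch below shows one possible corrected form, in which every %d conversion has a matching argument and call is declared before it is used. The empty body of call and the function name report_sum are hypothetical and stand in for whatever the real code does.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, defined before use so that no implicit
   declaration is needed. */
static void call(void)
{
}

/* Format the message with one argument per %d conversion.
   Returns the value reported by snprintf. */
int report_sum(char *out, size_t out_size, int i, int x)
{
    int y = i + x;
    call();
    return snprintf(out, out_size, "Result of %d plus %d is %d\n", i, x, y);
}
```

Compiled with the default options, a file written this way should not trigger either the -Wformat or the -Wimplicit-function-declaration warning.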
Some diagnostic messages are suppressed by default. To see all diagnostic messages use -Weverything:
armclang --target=aarch64-arm-none-eabi -march=armv8-a -c file.c -Weverything
2.7 Selecting floating-point options
Option                Description
armclang -mfpu        Specify the floating-point architecture to the compiler.
armclang -mfloat-abi  Specify the floating-point linkage to the compiler.
armclang -march       Specify the target architecture to the compiler. This automatically selects the default floating-point architecture.
armclang -mcpu        Specify the target processor to the compiler. This automatically selects the default floating-point architecture.
armlink --fpu         Specify the floating-point architecture to the linker.
See the armclang Reference Guide for more information on the -march option.
See the armclang Reference Guide for more information on the -mfpu option.
Floating-point linkage
Floating-point linkage refers to how the floating-point arguments are passed to and returned from
function calls.
For AArch64, Arm Compiler always uses hardware floating-point registers to pass and return floating-
point values. This is called hardware linkage.
For AArch32, Arm Compiler can use hardware linkage or software linkage. When using software
linkage, floating-point values are passed and returned using the general purpose registers. By default,
Arm Compiler uses software linkage. You can use the -mfloat-abi option to force hardware linkage or
software linkage.
softfp (this is the default): Software linkage, using general-purpose registers. Uses hardware
floating-point instructions, but if -mfpu=none is specified for AArch32, then uses software libraries.
Code with hardware linkage can be faster than the same code with software linkage. However, code with
software linkage can be more portable because it does not require the hardware floating-point registers.
Hardware floating-point is not available on some architectures such as Armv6‑M, or on processors where
the floating-point hardware might be powered down for energy efficiency reasons.
Note
In AArch32 state, if you specify -mfloat-abi=soft, then specifying the -mfpu option does not have an
effect.
See the armclang Reference Guide for more information on the -mfloat-abi option.
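The linkage that a translation unit was built with can be checked from source code. The sketch below assumes the ACLE macro __ARM_PCS_VFP, which compilers define when the hardware (VFP) calling convention is in use in AArch32 state; the function name float_linkage is hypothetical.

```c
/* Describe the floating-point linkage this translation unit was built
   with, based on compiler-predefined macros. */
const char *float_linkage(void)
{
#if defined(__aarch64__)
    return "hardware";   /* AArch64 always uses hardware linkage */
#elif defined(__arm__) && defined(__ARM_PCS_VFP)
    return "hardware";   /* for example, built with -mfloat-abi=hard */
#elif defined(__arm__)
    return "software";   /* for example, -mfloat-abi=soft or softfp */
#else
    return "unknown (not an Arm target)";
#endif
}
```

A check like this can help when tracking down objects that were built with mismatched linkage before they reach the linker.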
Note
All objects to be linked together must have the same type of linkage. If you link object files that have
hardware linkage with object files that have software linkage, then the image might have unpredictable
behavior. When linking objects, specify the armlink option --fpu=name where name specifies the
correct linkage type and floating-point hardware. This enables the linker to provide diagnostic
information if it detects different linkage types.
See the armlink User Guide for more information on how the --fpu option specifies the linkage type and
floating-point hardware.
2.8 Compilation tools command-line option rules
armclang follows the same syntax rules as GCC. Some options are preceded by a single dash -, others
by a double dash --. Some options require an = character between the option and the argument, others
require a space character.
Keyword options
All keyword options, including keyword options with arguments, are preceded by a double dash
--. An = or space character is required between the option and the argument. For example:
armlink -- -ifile_1
In some Unix shells, you might have to include quotes when using arguments to some command-line
options, for example:
armlink obj1.o --keep='s.o(vect)'
Chapter 3
Writing Optimized Code
To make best use of the optimization capabilities of Arm Compiler, there are various options, pragmas,
attributes, and coding techniques that you can use.
3.1 Optimizing loops
Loop unrolling
You can reduce the overhead of checking the loop condition by unrolling some of the iterations, which in
turn reduces the number of times the condition is checked. Use #pragma unroll (n) to unroll time-critical
loops in your source code. However, unrolling loops has the disadvantage of increasing the code size.
These pragmas are only effective at optimization levels -O2, -O3, -Ofast, and -Omax.
The examples below show code with loop unrolling and code without loop unrolling.
Bit counting loop without unrolling Bit counting loop with unrolling
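The source listings for the two functions are not reproduced here. As a minimal sketch, assuming a simple bit-counting loop, they might look like the following. The function names countSetBits1 and countSetBits2 match the generated code discussed in this section; the parameter type and the unroll factor of 4 are assumptions.

```c
/* Bit counting loop without unrolling. */
int countSetBits1(unsigned int n)
{
    int bits = 0;

    while (n != 0)
    {
        if (n & 1) bits++;
        n >>= 1;
    }
    return bits;
}

/* Bit counting loop with unrolling. The unroll factor 4 is an assumption. */
int countSetBits2(unsigned int n)
{
    int bits = 0;

    #pragma unroll (4)
    while (n != 0)
    {
        if (n & 1) bits++;
        n >>= 1;
    }
    return bits;
}
```

Both functions return the same result. The pragma only asks the compiler to unroll the loop, so it affects the generated code but not the behavior.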
The code below shows the code that Arm Compiler generates for the above examples. Copy the
examples above into file.c and compile using:
armclang --target=arm-arm-none-eabi -march=armv8-a file.c -O2 -c -S -o file.s
For the function with loop unrolling, countSetBits2, the generated code is faster but larger in size.
Bit counting loop without unrolling:
countSetBits1:
    mov r1, r0
    mov r0, #0
    cmp r1, #0
    bxeq lr
    mov r2, #0
    mov r0, #0
.LBB0_1:
    and r3, r1, #1
    cmp r2, r1, asr #1
    add r0, r0, r3
    lsr r3, r1, #1
    mov r1, r3
    bne .LBB0_1
    bx lr
Bit counting loop with unrolling:
countSetBits2:
    mov r1, r0
    mov r0, #0
    cmp r1, #0
    bxeq lr
    mov r2, #0
    mov r0, #0
.LBB0_1:
    and r3, r1, #1
    cmp r2, r1, asr #1
    add r0, r0, r3
    beq .LBB0_4
@ BB#2:
    asr r3, r1, #1
    cmp r2, r1, asr #2
    and r3, r3, #1
    add r0, r0, r3
    asrne r3, r1, #2
    andne r3, r3, #1
    addne r0, r0, r3
    cmpne r2, r1, asr #3
    beq .LBB0_4
@ BB#3:
    asr r3, r1, #3
    cmp r2, r1, asr #4
    and r3, r3, #1
    add r0, r0, r3
    asr r3, r1, #4
    mov r1, r3
    bne .LBB0_1
.LBB0_4:
    bx lr
Arm Compiler can unroll loops completely only if the number of iterations is known at compile time.
Loop vectorization
If your target has the Advanced SIMD unit, then Arm Compiler can use the vectorizing engine to
optimize vectorizable sections of the code. At optimization level -O1, you can enable vectorization using
-fvectorize. At higher optimizations, -fvectorize is enabled by default and you can disable it using
-fno-vectorize. See -fvectorize in the armclang Reference Guide for more information. When using
-fvectorize with -O1, vectorization might be inhibited in the absence of other optimizations which
might be present at -O2 or higher.
For example, loops that access structures can be vectorized if all parts of the structure are accessed
within the same loop rather than in separate loops. The following examples show code with a loop that
can be vectorized by Advanced SIMD, and a loop that cannot be vectorized easily.
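The source listings for the two loops are not reproduced here. As a sketch, assuming a structure with three int members, they might look like the following. The function names DoubleBuffer1 and DoubleBuffer2 and the array name buffer match the generated code discussed in this section; the type name tBuffer and the member names are assumptions.

```c
typedef struct tBuffer {
    int a;
    int b;
    int c;
} tBuffer;

tBuffer buffer[8];

/* Vectorizable: all members of each element are accessed in the same loop. */
void DoubleBuffer1(void)
{
    int i;
    for (i = 0; i < 8; i++)
    {
        buffer[i].a *= 2;
        buffer[i].b *= 2;
        buffer[i].c *= 2;
    }
}

/* Not easily vectorizable: each loop touches only one member per element. */
void DoubleBuffer2(void)
{
    int i;
    for (i = 0; i < 8; i++)
        buffer[i].a *= 2;
    for (i = 0; i < 8; i++)
        buffer[i].b *= 2;
    for (i = 0; i < 8; i++)
        buffer[i].c *= 2;
}
```

In the first function the loop walks contiguous memory, so whole vectors can be loaded, doubled, and stored. In the second, each loop accesses memory with a stride of one structure, which is harder to vectorize.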
For each example above, copy the code into file.c and compile at optimization level O2 to enable auto-
vectorization:
armclang --target=arm-arm-none-eabi -march=armv8-a -O2 file.c -c -S -o file.s
The vectorized assembly code contains the Advanced SIMD instructions, for example vld1, vshl, and
vst1. These Advanced SIMD instructions are not generated when compiling the example with the non-
vectorizable loop.
Vectorizable loop:
DoubleBuffer1:
    .fnstart
@ BB#0:
    movw r0, :lower16:buffer
    movt r0, :upper16:buffer
    vld1.64 {d16, d17}, [r0:128]
    mov r1, r0
    vshl.i32 q8, q8, #1
    vst1.32 {d16, d17}, [r1:128]!
    vld1.64 {d16, d17}, [r1:128]
    vshl.i32 q8, q8, #1
    vst1.64 {d16, d17}, [r1:128]
    add r1, r0, #32
    vld1.64 {d16, d17}, [r1:128]
    vshl.i32 q8, q8, #1
    vst1.64 {d16, d17}, [r1:128]
    add r1, r0, #48
    vld1.64 {d16, d17}, [r1:128]
    vshl.i32 q8, q8, #1
    vst1.64 {d16, d17}, [r1:128]
    add r1, r0, #64
    add r0, r0, #80
    vld1.64 {d16, d17}, [r1:128]
    vshl.i32 q8, q8, #1
    vst1.64 {d16, d17}, [r1:128]
    vld1.64 {d16, d17}, [r0:128]
    vshl.i32 q8, q8, #1
    vst1.64 {d16, d17}, [r0:128]
    bx lr
Non-vectorizable loop:
DoubleBuffer2:
    .fnstart
@ BB#0:
    movw r0, :lower16:buffer
    movt r0, :upper16:buffer
    ldr r1, [r0]
    lsl r1, r1, #1
    str r1, [r0]
    ldr r1, [r0, #12]
    lsl r1, r1, #1
    str r1, [r0, #12]
    ldr r1, [r0, #24]
    lsl r1, r1, #1
    str r1, [r0, #24]
    ldr r1, [r0, #36]
    lsl r1, r1, #1
    str r1, [r0, #36]
    ldr r1, [r0, #48]
    lsl r1, r1, #1
    str r1, [r0, #48]
    ldr r1, [r0, #60]
    lsl r1, r1, #1
    str r1, [r0, #60]
    ldr r1, [r0, #72]
    lsl r1, r1, #1
    str r1, [r0, #72]
    ldr r1, [r0, #84]
    lsl r1, r1, #1
    str r1, [r0, #84]
    ldr r1, [r0, #4]
    lsl r1, r1, #1
    str r1, [r0, #4]
    ldr r1, [r0, #16]
    lsl r1, r1, #1
    ...
    bx lr
Note
Advanced SIMD (Single Instruction Multiple Data), also known as Arm NEON™ technology, is a
powerful vectorizing unit on Armv7‑A and later Application profile architectures. It enables you to write
highly optimized code. You can use intrinsics to directly use the Advanced SIMD capabilities from C or
C++ code. The intrinsics and their data types are defined in arm_neon.h. For more information on
Advanced SIMD, see the Arm® C Language Extensions, Cortex®‑A Series Programmer's Guide, and
Arm® NEON™ Programmer's Guide.
Using -fno-vectorize does not necessarily prevent the compiler from emitting Advanced SIMD
instructions. The compiler or linker might still introduce Advanced SIMD instructions, such as when
linking libraries that contain these instructions.
To prevent the compiler from emitting Advanced SIMD instructions for AArch64 targets, specify
+nosimd using -march or -mcpu, for example -march=armv8-a+nosimd.
To prevent the compiler from emitting Advanced SIMD instructions for AArch32 targets, set the -mfpu
option to a value that does not include Advanced SIMD, for example fp-armv8.
Related information
armclang -O option
pragma unroll
armclang -fvectorize option
3.2 Inlining functions
__attribute__((always_inline))
Specify this function attribute on a function definition or declaration to tell the compiler
to always inline this function, with certain exceptions such as for recursive functions.
This overrides the -fno-inline-functions option.
__attribute__((noinline))
Specify this function attribute on a function definition or declaration to tell the compiler
to not inline the function. This is equivalent to __declspec(noinline).
-fno-inline-functions
This is a compiler command-line option. Specify this option to the compiler to disable
inlining. This option overrides the __inline__ hint.
Note
• Arm Compiler only inlines functions within the same compilation unit, unless you use Link Time
Optimization. For more information, see Optimizing across modules with link time optimization
on page 4-59 in the Software Development Guide.
• C++ and C99 provide the inline language keyword. The effect of this inline language keyword is
identical to the effect of using the __inline__ compiler keyword. However, the effect in C99 mode
is different from the effect in C++ or other C that does not adhere to the C99 standard. For more
information, see Inline functions in the armclang Reference Guide.
• Function inlining normally happens at higher optimization levels, such as -O2, except when you
specify __attribute__((always_inline)).
In the example code, functions bar and row are identical but function row is always inlined. Use the
following compiler commands to compile for -O2 with -fno-inline-functions and without
-fno-inline-functions:
When compiling with -fno-inline-functions, the compiler does not inline the function bar. When
compiling without -fno-inline-functions, the compiler inlines the function bar. However, the
compiler always inlines the function row even though it is identical to function bar.
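The example code itself is not reproduced here. As a sketch of the behavior described above, the functions might look like the following. The names bar and row come from the text; the function bodies and the caller foo are assumptions chosen for illustration.

```c
/* Ordinary function: inlined at -O2 unless -fno-inline-functions is used. */
int bar(int a)
{
    a = a * (a + 1);
    return a;
}

/* Identical body, but always inlined regardless of -fno-inline-functions. */
static inline __attribute__((always_inline)) int row(int a)
{
    a = a * (a + 1);
    return a;
}

/* Hypothetical caller that uses both functions. */
int foo(int i)
{
    i = bar(i);
    i = i - 2;
    i = bar(i);
    i++;
    i = row(i);
    i++;
    return i;
}
```

Comparing the assembly output of the two builds should show branches to bar in foo only when -fno-inline-functions is specified, and no branch to row in either build.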
Related information
armclang -fno-inline-functions option
__inline keyword
__attribute__((always_inline)) function attribute
__attribute__((noinline)) function attribute
3.3 Examining stack usage
Copy the code example to file.c and compile it using the following command:
armclang --target=arm-arm-none-eabi -march=armv8-a -c -g file.c -o file.o
Compiling with the -g option generates the DWARF frame information that armlink requires for
estimating the stack use. Run armlink on the object file using --info=stack:
armlink file.o --info=stack
For the example code, armlink shows the amount of stack used by the various functions. Function
foo_mor has more arguments than function foo, and therefore uses more stack.
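The code example itself is not reproduced here. A sketch that is consistent with the description above and with the call graph in this section might look like the following. Only the function names fact, foo, and foo_mor come from the text; the parameter lists and function bodies are assumptions. A main function, not shown, is assumed to call both foo and foo_mor.

```c
/* Kept out of line so that it appears as a separate node in the call graph. */
__attribute__((noinline)) int fact(int n)
{
    int f = 1;

    while (n > 0)
    {
        f *= n--;
    }
    return f;
}

/* One argument. */
int foo(int n)
{
    return fact(n);
}

/* Four arguments: without optimization these are typically saved on the
   stack, so this function uses more stack than foo. */
int foo_mor(int a, int b, int c, int d)
{
    (void)b; (void)c; (void)d;   /* unused in this sketch */
    return fact(a);
}
```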
You can also examine stack usage using the linker option --callgraph:
armlink file.o --callgraph -o FileImage.axf
This outputs a file called FileImage.htm which contains the stack usage information for the various
functions in the application.
fact (ARM, 84 bytes, Stack size 12 bytes, file.o(.text))
[Stack]
Max Depth = 12
Call Chain = fact
[Called By]
>> foo_mor
>> foo
foo (ARM, 36 bytes, Stack size 8 bytes, file.o(.text))
[Stack]
Max Depth = 20
Call Chain = foo >> fact
[Calls]
>> fact
[Called By]
>> main
foo_mor (ARM, 76 bytes, Stack size 16 bytes, file.o(.text))
[Stack]
Max Depth = 28
Call Chain = foo_mor >> fact
[Calls]
>> fact
[Called By]
>> main
main (ARM, 76 bytes, Stack size 8 bytes, file.o(.text))
[Stack]
Max Depth = 36
Call Chain = main >> foo_mor >> fact
[Calls]
>> foo_mor
>> foo
[Called By]
>> __rt_entry_main (via BLX)
See --info and --callgraph in the armlink User Guide for more information on these options.
3.4 Packing data structures
For each example use linker option --info=sizes to examine the memory used in file.o.
armlink file.o --info=sizes
The linker output shows the total memory used by the two objects c and d. For example:
Code (inc. data)   RO Data    RW Data    ZI Data      Debug   Object Name
  36          0          0          0         24          0   str.o
---------------------------------------------------------------------------
  36          0         16          0         24          0   Object Totals
Dereferencing such a pointer can be unsafe even when unaligned accesses are supported by the target,
because certain instructions always require word-aligned addresses.
Note
If you take the address of a packed member, in most cases, the compiler generates a warning.
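As a hedged illustration of why packed members need care, the sketch below compares a packed and an unpacked structure. The type and function names are hypothetical, and the layout comments assume a 4-byte int with 4-byte natural alignment, as on Arm targets.

```c
#include <stddef.h>

/* Unpacked: the compiler inserts padding so that value is aligned. */
struct plain_rec {
    char tag;
    int value;
};

/* Packed: no padding, so value sits at offset 1 and can be unaligned. */
struct __attribute__((packed)) packed_rec {
    char tag;
    int value;
};

/* Accessing the member through the structure lets the compiler generate
   an access sequence that is safe for an unaligned member. */
int read_value(const struct packed_rec *r)
{
    return r->value;
}

/* By contrast, code such as
 *     int *p = &r->value;
 *     return *p;
 * creates an int pointer that might be unaligned, which is the unsafe
 * pattern that the compiler warning is intended to catch. */
```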
Related information
pragma pack
__attribute__((packed)) type attribute
__attribute__((packed)) variable attribute
Chapter 4
Optimization Techniques
Describes how to use armclang to optimize for either code size or performance, and the impact of the
optimization level when debugging.
It contains the following sections:
• 4.1 Optimizing for code size or performance.
• 4.2 Optimizing across modules with link time optimization.
• 4.3 How optimization affects the debug experience.
4.1 Optimizing for code size or performance
The following armclang option helps you optimize for both code size and code performance:
-flto
Enables link time optimization, which lets the linker make additional optimizations across
multiple source files.
In addition, choices you make during coding can affect optimization. For example:
• Optimizing loop termination conditions can improve both code size and performance. In particular,
loops with counters that decrement to zero usually produce smaller, faster code than loops with
incrementing counters.
• Manually unrolling loops by reducing the number of loop iterations, but increasing the amount of
work done in each iteration can improve performance at the expense of code size.
• Reducing debug information in objects and libraries reduces the size of your image.
• Using inline functions offers a trade-off between code size and performance.
• Using intrinsics can improve performance.
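The first two coding choices in the list above can be sketched as follows. All function names are hypothetical, and whether the alternative forms are actually smaller or faster depends on the target and the optimization level, so treat this as an illustration rather than a guaranteed improvement.

```c
#include <stddef.h>

/* Incrementing counter: i is compared against n on every iteration. */
unsigned int sum_up(const unsigned char *data, size_t n)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Decrementing counter: the loop ends when the counter reaches zero, so
   the termination test can often be folded into the decrement. */
unsigned int sum_down(const unsigned char *data, size_t n)
{
    unsigned int sum = 0;
    for (size_t i = n; i != 0; i--)
        sum += data[i - 1];
    return sum;
}

/* Manually unrolled by four: fewer iterations, more work per iteration. */
unsigned int sum_unrolled(const unsigned char *data, size_t n)
{
    unsigned int sum = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4)
        sum += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    for (; i < n; i++)          /* remaining 0 to 3 elements */
        sum += data[i];
    return sum;
}
```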
4.2 Optimizing across modules with link time optimization
[Figure: link time optimization. Labels: ELF Object containing Bitcode (.o), ELF Object (.o), Libraries, Link time optimizer (libLTO).]
Note
In this figure, ELF Object containing Bitcode is an ELF file that does not contain normal code and data.
Instead, it contains a section called .llvmbc that holds LLVM bitcode.
Section .llvmbc is reserved. You must not create an .llvmbc section with, for example
__attribute__((section(".llvmbc"))).
Caution
Link Time Optimization performs aggressive optimizations. Sometimes this can result in large chunks of
code being removed.
Scatter-loading
The output of the link time optimizer is a single ELF object file that by default is given a
temporary filename. This ELF object file contains sections and symbols just like any other ELF
object file, and these are matched by input section selectors as normal.
Use the armlink option --lto_intermediate_filename to name the ELF object file output.
You can reference this ELF file name in the scatter file.
Arm recommends that link time optimization is only performed on code and data that does not
require precise placement in the scatter file, with general input section selectors such as *(+RO)
and .ANY(+RO) used to select sections generated by link time optimization.
It is not possible to match bitcode in .llvmbc sections by name in a scatter file.
Note
The scatter-loading interface is subject to change in future versions of Arm Compiler 6.
// main.c
int foo(int a);

int main(void)
{
    return foo(0);
}

// foo.c
#include <stdio.h>

void bar(void);

int foo(int a)
{
    if (a != 0)
    {
        bar();    /* called only when a is non-zero */
    }
    return 0;
}

void bar(void)
{
    printf("a is non-zero.\n");
}
Procedure
1. Build the example code with LTO disabled:
armclang --target=arm-arm-none-eabi -march=armv7-a -O2 -c main.c -o main.o
armclang --target=arm-arm-none-eabi -march=armv7-a -O2 -c foo.c -o foo.o
armlink main.o foo.o -o image_without_lto.axf
fromelf --text -c -z image_without_lto.axf
Results:
The compiler cannot inline the call to foo() because it is in a different object from main().
Therefore, the compiler must keep the conditional call to bar() within foo(), because the compiler
does not have any information about the value of the parameter a while foo.c is being compiled:
$a.0
foo
0x00008bd8: e3500000 ..P. CMP r0,#0
0x00008bdc: 0a000004 .... BEQ 0x8bf4 ; foo + 28
0x00008be0: e92d4800 .H-. PUSH {r11,lr}
0x00008be4: e3080c44 D... MOV r0,#0x8c44
0x00008be8: e3400000 ..@. MOVT r0,#0
0x00008bec: fafffd28 (... BLX puts ; 0x8094
0x00008bf0: e8bd4800 .H.. POP {r11,lr}
0x00008bf4: e3a00000 .... MOV r0,#0
0x00008bf8: e12fff1e ../. BX lr
main
0x00008bfc: e3a00000 .... MOV r0,#0
0x00008c00: eafffff4 .... B foo ; 0x8bd8
Additionally, bar() uses the Arm C library function printf(). In this example, the call to printf() is
optimized to puts(), and bar() is inlined into foo(). Therefore, the linker must include the relevant C library
code to allow the puts() function to be used. Including the C library code results in a large amount
of uncalled code being included in the image. The output from the fromelf utility shows the resulting
overall image size:
** Object/Image Component Sizes
Results:
Although the compiler does not have any information about the call to foo() from main() when
compiling foo.c, at link time, it is known that:
• foo() is only ever called once, with the parameter a == 0.
• bar() is never called.
• The Arm C library function puts() is never called.
Because LTO is enabled, this extra information is used to make the following optimizations:
• Inlining the call to foo() into main().
• Removing the code to conditionally call bar() from foo() entirely.
• Removing the C library code that allows use of the puts() function.
$a.0
main
0x00008128: e3a00000 .... MOV r0,#0
0x0000812c: e12fff1e ../. BX lr
Also, this optimization means that the overall image size is much lower. The output from the fromelf
utility shows the reduced image size:
** Object/Image Component Sizes
Code (inc. data) RO Data RW Data ZI Data Debug Object Name
Related reference
4.1 Optimizing for code size or performance
4.2 Optimizing across modules with link time optimization
4.3 How optimization affects the debug experience
Related information
armclang -O option
4.3 How optimization affects the debug experience
Chapter 5
Using Assembly and Intrinsics in C or C++ Code
All code for a single application can be written in the same source language. This is usually a high-level
language such as C or C++ that is compiled to instructions for Arm architectures. However, in some
situations you might need lower-level control than what C and C++ provide.
For example:
• To access features which are not available from C or C++, such as interfacing directly with device
hardware.
• To generate highly optimized code by manually writing sections using intrinsics or inline assembly.
There are a number of different ways to have low-level control over the generated code:
• Intrinsics are functions provided by the compiler. An intrinsic function has the appearance of a
function call in C or C++, but is replaced during compilation by a specific sequence of low-level
instructions.
• Inline assembly lets you write assembly instructions directly in your C/C++ code, without the
overhead of a function call.
• Calling assembly functions from C/C++ lets you write standalone assembly code in a separate source
file. This code is assembled separately to the C/C++ code, and then integrated at link time.
It contains the following sections:
• 5.1 Using intrinsics.
• 5.2 Writing inline assembly code.
• 5.3 Calling assembly functions from C and C++.
5.1 Using intrinsics
Using compiler intrinsics, you can achieve more complete coverage of target architecture instructions
than you would from the instruction selection of the compiler.
An intrinsic function has the appearance of a function call in C or C++, but is replaced during
compilation by a specific sequence of low-level instructions. The following example shows how to
access the __qadd saturated add intrinsic:
#include <arm_acle.h> /* Include ACLE intrinsics */
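The rest of the listing is not reproduced here. As a sketch, the intrinsic might be used as follows. The wrapper name sat_add and the portable fallback for targets without the DSP extension are assumptions, added so that the example builds on any host.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ARM_FEATURE_DSP)
#include <arm_acle.h>   /* ACLE intrinsics, including __qadd */

/* Saturating 32-bit add using the ACLE intrinsic. */
int32_t sat_add(int32_t a, int32_t b)
{
    return __qadd(a, b);
}
#else
/* Fallback with the same saturating behavior, written in plain C. */
int32_t sat_add(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
#endif
```

Unlike an ordinary addition, the result is clamped to the range of a 32-bit signed integer instead of overflowing.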
5.2 Writing inline assembly code
int add(int i, int j);   /* adds i and j using inline assembly */

int main(void)
{
    int a = 1;
    int b = 2;
    int c = 0;
    c = add(a,b);
    return 0;
}
Note
The inline assembler does not support legacy assembly code written in armasm assembler syntax. See the
Migration and Compatibility Guide for more information about migrating armasm syntax assembly code
to GNU syntax.
code is the assembly instruction, for example "ADD R0, R1, R2". code_template is a template for an
assembly instruction, for example "ADD %[result], %[input_i], %[input_j]".
If you specify a code_template rather than code then you must specify the output_operand_list
before specifying the optional input_operand_list and clobbered_register_list.
output_operand_list is a list of output operands, separated by commas. Each operand consists of a
symbolic name in square brackets, a constraint string, and a C expression in parentheses. In this example,
there is a single output operand: [result] "=r" (res). The list can be empty. For example:
__asm ("ADD R0, %[input_i], %[input_j]"
: /* This is an empty output operand list */
: [input_i] "r" (i), [input_j] "r" (j)
);
input_operand_list is an optional list of input operands, separated by commas. Input operands use the
same syntax as output operands. In this example, there are two input operands: [input_i] "r" (i),
[input_j] "r" (j). The list can be empty.
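Putting these pieces together, a function that uses a code_template with one output operand and two input operands might look like the sketch below. The function name add and the variable names follow the operands quoted above; the plain C branch for targets that are not AArch32 is an assumption, added so that the sketch builds on any host.

```c
int add(int i, int j)
{
    int res = 0;

#if defined(__arm__)
    /* AArch32: result = input_i + input_j */
    __asm ("ADD %[result], %[input_i], %[input_j]"
        : [result] "=r" (res)                    /* output operand list */
        : [input_i] "r" (i), [input_j] "r" (j)   /* input operand list */
    );
#else
    res = i + j;   /* fallback for other targets */
#endif
    return res;
}
```

The compiler chooses the registers for the three operands and substitutes them into the template, so the instruction does not need to name fixed registers.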
clobbered_register_list is an optional list of clobbered registers whose contents are not preserved.
The list can be empty. In addition to registers, the list can also contain special arguments:
"cc"
The instruction affects the condition code flags.
"memory"
The instruction accesses unknown memory addresses.
The registers in clobbered_register_list must use lowercase letters rather than uppercase letters. An
example instruction with a clobbered_register_list is:
__asm ("ADD R0, %[input_i], %[input_j]"
: /* This is an empty output operand list */
: [input_i] "r" (i), [input_j] "r" (j)
: "r5","r6","cc","memory" /*Use "r5" instead of "R5" */
);
Use the volatile qualifier for assembler instructions that have processor side-effects, which the
compiler might be unaware of. The volatile qualifier disables certain compiler optimizations. The
volatile qualifier is optional.
Multiple instructions
You can write multiple instructions within the same __asm statement. This example shows an interrupt
handler written in one __asm statement for an Armv8‑M mainline architecture.
void HardFault_Handler(void)
{
asm (
"TST LR, #0x40\n\t"
"BEQ from_nonsecure\n\t"
"from_secure:\n\t"
"TST LR, #0x04\n\t"
"ITE EQ\n\t"
"MRSEQ R0, MSP\n\t"
"MRSNE R0, PSP\n\t"
"B hard_fault_handler_c\n\t"
"from_nonsecure:\n\t"
"MRS R0, CONTROL_NS\n\t"
"TST R0, #2\n\t"
"ITE EQ\n\t"
"MRSEQ R0, MSP_NS\n\t"
"MRSNE R0, PSP_NS\n\t"
"B hard_fault_handler_c\n\t"
);
}
Copy the above handler code to file.c and then you can compile it using:
armclang --target=arm-arm-none-eabi -march=armv8-m.main -c -S file.c -o file.s
Embedded assembly
You can write embedded assembly using __attribute__((naked)). For more information, see
__attribute__((naked)) in the armclang Reference Guide.
5.3 Calling assembly functions from C and C++
Note
For code portability, it is better to use intrinsics or inline assembly rather than writing and calling
assembly functions.
Note
armclang requires that you explicitly specify the types of exported symbols using the .type
directive. If the .type directive is not specified in the above example, the linker outputs warnings of
the form:
Warning: L6437W: Relocation #RELA:1 in test.o(.text) with respect to myadd...
#include <stdio.h>

int myadd(int a, int b);   /* implemented in assembly in myadd.s */

int main(void)
{
    int a = 4;
    int b = 5;
    printf("Adding %d and %d results in %d\n", a, b, myadd(a, b));
    return (0);
}
The AAPCS describes a contract between caller functions and callee functions. For example, for
integer or pointer types, it specifies that:
• Registers R0-R3 pass argument values to the callee function, with subsequent arguments passed
on the stack.
• Register R0 passes the result value back to the caller function.
• Caller functions must preserve R0-R3 and R12, because these registers are allowed to be
corrupted by the callee function.
• Callee functions must preserve R4-R11 and LR, because these registers are not allowed to be
corrupted by the callee function.
For more information, see the Procedure Call Standard for the Arm® Architecture (AAPCS).
4. Compile both source files:
armclang --target=arm-arm-none-eabi -march=armv8-a main.c myadd.s
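The AAPCS rules for integer arguments listed in this section can be read against a C prototype. The function below is hypothetical and is written in C only to show how the arguments of a call map onto registers and the stack in AArch32 state.

```c
/*
 * For a call such as sum6(1, 2, 3, 4, 5, 6):
 *   a, b, c, d   are passed in R0, R1, R2, and R3
 *   e, f         are passed on the stack
 *   the result   is returned in R0
 */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
```

An assembly implementation of the same function would therefore read its first four arguments from R0-R3, load the remaining two from the stack, and leave the sum in R0 before returning.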
Related information
Procedure Call Standard for the Arm Architecture
Procedure Call Standard for the Arm 64-bit Architecture
Chapter 6
Mapping Code and Data to the Target
There are various options in Arm Compiler to control how code, data and other sections of the image are
mapped to specific locations on the target.
6.1 What the linker does to create an image
Note
XO sections are supported only for images that are targeted at Armv7‑M or Armv8‑M architectures.
If the location of some code or data lies outside all the regions that are specified in your scatter file, the
linker attempts to create a load and execution region to contain that code or data.
Note
Multiple code and data sections cannot occupy the same area of memory, unless you place them in
separate overlay regions.
6.2 Placing data items for target peripherals with a scatter file
To access the peripherals on your target, you must locate the data items that access them at the addresses
of those peripherals.
To make sure that the data items are placed at the correct address for the peripherals, use the
__attribute__((section(".ARM.__at_address"))) variable attribute together with a scatter file.
Procedure
1. Create peripheral.c to place the my_peripheral variable at address 0x10000000.
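#include "stdio.h"

/* Place my_peripheral in a section that the scatter file locates at 0x10000000. */
int my_peripheral __attribute__((section(".ARM.__at_0x10000000"))) = 0;

int main(void)
{
    printf("%d\n",my_peripheral);
    return 0;
}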
LR_2 0x01000000
{
ER_ZI +0 UNINIT
{
*(.bss)
}
}
LR_3 0x10000000
{
ER_PERIPHERAL 0x10000000 UNINIT
{
*(.ARM.__at_0x10000000)
}
}
6.3 Placing the stack and heap with a scatter file
Note
• If you re-implement __user_setup_stackheap() then your version does not get invoked when stack
and heap are defined in a scatter file.
• You might have to update your startup code to use the correct initial stack pointer. Some processors,
such as the Cortex-M3 processor, require that you place the initial stack pointer in the vector table.
See Stack and heap configuration in AN179 - Cortex®-M3 Embedded Software Development for more
details.
Procedure
1. Define two special execution regions in your scatter file that are named ARM_LIB_HEAP and
ARM_LIB_STACK.
2. Assign the EMPTY attribute to both regions.
Because the stack and heap are in separate regions, the library selects the non-default implementation
of __user_setup_stackheap() that uses the value of the symbols:
• Image$$ARM_LIB_STACK$$ZI$$Base.
• Image$$ARM_LIB_STACK$$ZI$$Limit.
• Image$$ARM_LIB_HEAP$$ZI$$Base.
• Image$$ARM_LIB_HEAP$$ZI$$Limit.
You can specify only one ARM_LIB_STACK or ARM_LIB_HEAP region, and you must allocate a size.
Example:
LOAD_FLASH …
{
…
ARM_LIB_STACK 0x40000 EMPTY -0x20000 ; Stack region growing down
{ }
ARM_LIB_HEAP 0x28000000 EMPTY 0x80000 ; Heap region growing up
{ }
…
}
3. Alternatively, define a single execution region that is named ARM_LIB_STACKHEAP to use a combined
stack and heap region. Assign the EMPTY attribute to the region.
Because the stack and heap are in the same region, __user_setup_stackheap() uses the value of the
symbols Image$$ARM_LIB_STACKHEAP$$ZI$$Base and Image$$ARM_LIB_STACKHEAP$$ZI$$Limit.
6.4 Root region
Example
Root region with the same load and execution address.
LR_1 0x040000 ; load region starts at 0x40000
{ ; start of execution region descriptions
ER_RO 0x040000 ; load address = execution address
{
* (+RO) ; all RO sections (must include section with
; initial entry point)
}
… ; rest of scatter-loading description
}
Example
The following example shows an implicitly defined root region:
LR_1 0x040000 ; load region starts at 0x40000
{ ; start of execution region descriptions
ER_RO 0x040000 ABSOLUTE ; load address = execution address
{
* (+RO) ; all RO sections (must include the section
; containing the initial entry point)
}
… ; rest of scatter-loading description
}
[Figure: memory map with a FIXED execution region. Labels: init.o at 0x80000 (FIXED), single load region, empty (movable) area, *(RO) at 0x4000.]
You can use this to place a function or a block of data, such as a constant table or a checksum, at a fixed
address in ROM so that it can be accessed easily through pointers.
If you specify, for example, that some initialization code is to be placed at start of ROM and a checksum
at the end of ROM, some of the memory contents might be unused. Use the * or .ANY module selector to
flood fill the region between the end of the initialization block and the start of the data block.
To make your code easier to maintain and debug, use as few placement specifications as possible in
scatter files and leave the detailed placement of functions and data to the linker.
Note
There are some situations where using FIXED and a single load region is not appropriate. Other
techniques for specifying fixed locations are:
• If your loader can handle multiple load regions, place the RO code or data in its own load region.
• If you do not require the function or data to be at a fixed location in ROM, use ABSOLUTE instead of
FIXED. The loader then copies the data from the load region to the specified address in RAM.
ABSOLUTE is the default attribute.
• To place a data structure at the location of memory-mapped I/O, use two load regions and specify
UNINIT. UNINIT ensures that the memory locations are not initialized to zero.
6.5 Placing functions and data in a named section
Procedure
1. Create a C source file file.c to specify a section name foo for a variable and a section
name .bss.mybss for a zero-initialized variable z, for example:
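#include "stdio.h"

/* place variable in a section named foo */
int variable __attribute__((section("foo"))) = 10;
/* place the zero-initialized variable z in a section named .bss.mybss */
__attribute__((section(".bss.mybss"))) int z;

int main(void)
{
    int x = 4;
    int y = 7;
    z = x + y;
    printf("%d\n",variable);
    printf("%d\n",z);
    return 0;
}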
2. Create a scatter file to place the named section, scatter.scat, for example:
LR_1 0x0
{
ER_RO 0x0 0x4000
{
*(+RO)
}
ER_RW 0x4000 0x2000
{
*(+RW)
}
ER_ZI 0x6000 0x2000
{
*(+ZI)
}
ER_MYBSS 0x8000 0x2000
{
*(.bss.mybss)
}
ADDER 0x08000000
{
file.o (foo) ; select section foo from file.o
}
The ARM_LIB_STACK and ARM_LIB_HEAP regions are required because the program is being linked
with the semihosting libraries.
Note
If you omit file.o (foo) from the scatter file, the linker places the section in the region of the same
type. That is, ER_RW in this example.
Execution Region ADDER (Base: 0x08000000, Size: 0x00000004, Max: 0xffffffff, ABSOLUTE)
Note
• If scatter-loading is not used, the linker places the section foo in the default ER_RW execution
region of the LR_1 load region. It also places the section .bss.mybss in the default execution
region ER_ZI.
• If you have a scatter file that does not include the foo selector, then the linker places the section in
the defined RW execution region.
You can also place a function at a specific address using .ARM.__at_address as the section name.
For example, to place the function sqr at 0x20000, specify:
int sqr(int n1) __attribute__((section(".ARM.__at_0x20000")));
For more information, see 6.6 Placing functions and data at specific addresses on page 6-81.
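The naming convention can be illustrated with a small host-side helper. This is only a sketch: the function name at_section_address is hypothetical, and it assumes that the address is written in hexadecimal with a 0x prefix, as in the examples in this chapter.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Extract the address encoded in a section name of the form
 * ".ARM.__at_0x<hex digits>". Returns true and stores the address on
 * success, false if the name does not follow that pattern. */
bool at_section_address(const char *name, uint32_t *address)
{
    static const char prefix[] = ".ARM.__at_0x";
    const size_t prefix_len = sizeof prefix - 1;
    char *end = NULL;
    unsigned long value;

    if (strncmp(name, prefix, prefix_len) != 0)
        return false;                       /* not an __at section name */
    if (!isxdigit((unsigned char)name[prefix_len]))
        return false;                       /* no address digits present */

    value = strtoul(name + prefix_len, &end, 16);
    if (*end != '\0' || value > 0xFFFFFFFFul)
        return false;                       /* trailing text or too large */

    *address = (uint32_t)value;
    return true;
}
```

For the section name used for sqr above, the helper would report the address 0x20000.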
6.6 Placing functions and data at specific addresses
Note
The name of the section is only significant if you are trying to match the section by name in a scatter file.
Without overlays, the linker automatically assigns __at sections when you use the --autoat command-
line option. This option is the default. If you are using overlays, then you cannot use --autoat to place
__at sections.
The automatic placement of __at sections is enabled by default. Use the linker command-line option,
--no_autoat to disable this feature.
Note
You cannot use __at section placement with position-independent execution regions.
When linking with the --autoat option, the linker does not place __at sections with scatter-loading
selectors. Instead, the linker places the __at section in a compatible region. If no compatible region is
found, the linker creates a load and execution region for the __at section.
All linker execution regions created by --autoat have the UNINIT scatter-loading attribute. If you
require a ZI __at section to be zero-initialized, then it must be placed within a compatible region. A
linker execution region created by --autoat must have a base address that is at least 4-byte aligned. If
any region is incorrectly aligned, the linker produces an error message.
A compatible region is one where:
• The __at address lies within the execution region base and limit, where limit is the base address +
maximum size of execution region. If no maximum size is set, the linker sets the limit for placing
__at sections as the current size of the execution region without __at sections plus a constant. The
default value of this constant is 10240 bytes, but you can change the value using the
--max_er_extension command-line option.
• The execution region meets at least one of the following conditions:
— It has a selector that matches the __at section by the standard scatter-loading rules.
— It has at least one section of the same type (RO or RW) as the __at section.
— It does not have the EMPTY attribute.
Note
The linker considers an __at section with type RW compatible with RO.
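The address test in the first condition can be sketched in C. The structure and function names are hypothetical, and two points are assumptions rather than statements from this guide: the limit for a region without a maximum size is taken as base + current size + constant, and the limit itself is treated as lying outside the region.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_MAX_ER_EXTENSION 10240u  /* default constant given in the text */

struct exec_region {
    uint64_t base;          /* execution region base address */
    bool     has_max_size;  /* true if the scatter file sets a maximum size */
    uint64_t max_size;      /* used only when has_max_size is true */
    uint64_t current_size;  /* size of the region without __at sections */
};

/* Limit used when deciding whether an __at address lies in the region. */
uint64_t at_placement_limit(const struct exec_region *er, uint64_t extension)
{
    if (er->has_max_size)
        return er->base + er->max_size;
    return er->base + er->current_size + extension;
}

/* True if at_address lies in [base, limit). */
bool at_address_in_bounds(const struct exec_region *er, uint64_t at_address,
                          uint64_t extension)
{
    return at_address >= er->base &&
           at_address < at_placement_limit(er, extension);
}
```

With a region of base 0x6000 and maximum size 0x2000, the address 0x8000 falls outside the bounds under these assumptions, so an __at section at that address would need a region of its own.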
The following example shows the sections .ARM.__at_0x0000 type RO, .ARM.__at_0x4000 type RW,
and .ARM.__at_0x8000 type RW:
// place the RO variable in a section called .ARM.__at_0x0000
const int foo __attribute__((section(".ARM.__at_0x0000"))) = 10;
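The listing above shows only the RO section. As a sketch, the two RW sections might be declared as follows. The variable names and initial values are assumptions, and the IN_SECTION macro is a hypothetical convenience that applies the attribute only when building with Arm Compiler, so that the fragment also compiles with other toolchains.

```c
/* Apply the section attribute only when building with Arm Compiler 6. */
#if defined(__ARMCC_VERSION)
#define IN_SECTION(name) __attribute__((section(name)))
#else
#define IN_SECTION(name)
#endif

/* place an RW variable in a section called .ARM.__at_0x4000
 * (variable name and value are assumptions) */
int bar IN_SECTION(".ARM.__at_0x4000") = 100;

/* place an RW variable in a section called .ARM.__at_0x8000
 * (variable name and value are assumptions) */
int variable IN_SECTION(".ARM.__at_0x8000") = 1;
```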
The following scatter file shows how to place these __at sections automatically:
LR1 0x0
{
ER_RO 0x0 0x4000
{
*(+RO) ; .ARM.__at_0x0000 lies within the bounds of ER_RO
}
ER_RW 0x4000 0x2000
{
*(+RW) ; .ARM.__at_0x4000 lies within the bounds of ER_RW
}
ER_ZI 0x6000 0x2000
{
*(+ZI)
}
}
; The linker creates a load and execution region for the __at section
; .ARM.__at_0x8000 because it lies outside all candidate regions.
You can use the standard section-placement rules to place __at sections when using the --no_autoat
command-line option.
Note
You cannot use __at section placement with position-independent execution regions.
The following example shows the placement of the read-only section .ARM.__at_0x2000 and the
read-write section .ARM.__at_0x4000. Load and execution regions are not created automatically in manual
mode. An error is produced if an __at section cannot be placed in an execution region.
The following example shows the placement of the variables in C or C++ code:
// place the RO variable in a section called .ARM.__at_0x2000
const int foo __attribute__((section(".ARM.__at_0x2000"))) = 100;
// place the RW variable in a section called .ARM.__at_0x4000
int bar __attribute__((section(".ARM.__at_0x4000")));
The following scatter file shows how to place __at sections manually:
LR1 0x0
{
ER_RO 0x0 0x2000
{
*(+RO) ; .ARM.__at_0x0000 is selected by +RO
}
ER_RO2 0x2000
{
*(.ARM.__at_0x2000) ; .ARM.__at_0x2000 is selected by the section named
; .ARM.__at_0x2000
}
ER2 0x4000
{
*(+RW, +ZI) ; .ARM.__at_0x4000 is selected by +RW
}
}
Procedure
1. Create a C file abs_address.c to define an integer and a string constant.
unsigned int const number = 0x12345678;
char* const string = "Hello World";
2. Create a scatter file, scatter.scat, to place the constants in separate sections ER_RONUMBERS and
ER_ROSTRINGS.
LR_1 0x040000 ; load region starts at 0x40000
{ ; start of execution region descriptions
ER_RO 0x040000 ; load address = execution address
{
*(+RO +RW) ; all RO sections (must include section with
; initial entry point)
}
ER_RONUMBERS +0
{
*(.rodata.number, +RO-DATA)
}
ER_ROSTRINGS +0
{
*(.rodata.string, .rodata.str1.1, +RO-DATA)
}
; rest of scatter-loading description
4. Run fromelf on the image to view the contents of the output sections.
fromelf -c -d abs_address.axf
0x040000: 78 56 34 12 xV4.
5. Replace the ER_RONUMBERS and ER_ROSTRINGS sections in the scatter file with the following
ER_RODATA section:
ER_RODATA +0
{
abs_address.o(.rodata.number, .rodata.string, .rodata.str1.1, +RO-DATA)
}
The following procedure describes how to place the jump table in a ROM .rodata section.
Procedure
1. Create a C file jump.c.
Make the PFUNC type a pointer to a void function that has no parameters. You can then use PFUNC to
create an array of constant function pointers.
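typedef void (*PFUNC)(void);   /* pointer to a void function with no parameters */
extern void func0(void);
extern void func1(void);
extern void func2(void);
/* array of constant function pointers */
PFUNC const table[] = { func0, func1, func2 };
void jump(unsigned i)
{
    if (i<=2)
        table[i]();
}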
3. Run fromelf on the image to view the contents of the output sections.
fromelf -c -d jump.o
Results: The table is placed in the read-only section .rodata that you can place in ROM as required:
...
** Section #3 '.text.jump' (SHT_PROGBITS) [SHF_ALLOC + SHF_EXECINSTR]
Size : 64 bytes (alignment 4)
Address: 0x00000000
$a.0
[Anonymous symbol #24]
jump
0x00000000: e92d4800 .H-. PUSH {r11,lr}
0x00000004: e24dd008 ..M. SUB sp,sp,#8
0x00000008: e1a01000 .... MOV r1,r0
0x0000000c: e58d0004 .... STR r0,[sp,#4]
0x00000010: e3500002 ..P. CMP r0,#2
0x00000014: e58d1000 .... STR r1,[sp,#0]
0x00000018: 8a000006 .... BHI {pc}+0x20 ; 0x38
0x0000001c: eaffffff .... B {pc}+0x4 ; 0x20
0x00000020: e59d0004 .... LDR r0,[sp,#4]
0x00000024: e3001000 .... MOVW r1,#:LOWER16: table
0x00000028: e3401000 ..@. MOVT r1,#:UPPER16: table
0x0000002c: e7910100 .... LDR r0,[r1,r0,LSL #2]
0x00000030: e12fff30 0./. BLX r0
0x00000034: eaffffff .... B {pc}+0x4 ; 0x38
0x00000038: e28dd008 .... ADD sp,sp,#8
0x0000003c: e8bd8800 .... POP {r11,pc}
...
** Section #7 '.rodata.table' (SHT_PROGBITS) [SHF_ALLOC]
Size : 12 bytes (alignment 4)
Address: 0x00000000
0x000000: 00 00 00 00 00 00 00 00 00 00 00 00 ............
...
The --map option displays the memory map of the image. Also, --autoat is the default.
In this example, __attribute__((section(".ARM.__AT_0x5000"))) specifies that the global variable
gValue is to be placed at the absolute address 0x5000. gValue is placed in the execution region
ER$$.ARM.__AT_0x5000 and load region LR$$.ARM.__AT_0x5000.
The ARM_LIB_STACK and ARM_LIB_HEAP regions are required because the program is being linked
with the semihosting libraries.
4. Compile and link the sources:
armclang --target=arm-arm-none-eabi -march=armv8-a -c function.c
armclang --target=arm-arm-none-eabi -march=armv8-a -c main.c
armlink --no_autoat --scatter=scatter.scat --map function.o main.o -o squared.axf
In this example, the size of ER1 is unknown. Therefore, gValue might be placed in ER1 or ER2. To make
sure that gValue is placed in ER2, you must include the corresponding selector in ER2 and link with the
--no_autoat command-line option. If you omit --no_autoat, gValue is placed in a separate load
region LR$$.ARM.__at_0x10000 that contains the execution region ER$$.ARM.__at_0x10000.
6.7 Placement of Arm® C and C++ library code
RAM1 0x3000
{
*armlib* (+RO) ; all other Arm-supplied library code
; for example, floating-point libraries
}
RAM2 0x4000
{
* (+RW, +ZI)
}
}
The name armlib indicates the Arm C library files that are located in the directory
install_directory\lib\armlib.
Procedure
1. Create the following C++ program, foo.cpp:
#include <iostream>
2. To place the C++ library code, define the following scatter file, scatter.scat:
LR 0x8000
{
ER1 +0
{
*armlib*(+RO)
}
ER2 +0
{
*libcxx*(+RO)
}
ER3 +0
{
*(+RO)
The name *armlib* matches install_directory\lib\armlib, indicating the Arm C library files
that are located in the armlib directory.
The name *libcxx* matches install_directory\lib\libcxx, indicating the C++ library files that
are located in the libcxx directory.
3. Compile and link the sources:
armclang --target=arm-arm-none-eabi -march=armv8-a -c foo.cpp
armclang --target=arm-arm-none-eabi -march=armv8-a -c main.c
armlink --scatter=scatter.scat --map main.o foo.o -o foo.axf
6.8 Placement of unassigned sections
The --any_contingency option prevents the linker from filling the region up to its maximum. It
reserves a portion of the region's size for linker-generated content and fills this contingency area only if
no other regions have space. It is enabled by default for the first_fit and best_fit algorithms,
because they are most likely to exhibit this behavior.
6.8.4 Specify the maximum region size permitted for placing unassigned sections
You can specify the maximum size in a region that armlink can fill with unassigned sections.
Use the execution region attribute ANY_SIZE max_size to specify the maximum size in a region that
armlink can fill with unassigned sections.
When ANY_SIZE is present, armlink does not attempt to calculate contingency and strictly follows
the .ANY priorities.
When ANY_SIZE is not present for an execution region containing a .ANY selector, and you specify the
--any_contingency command-line option, then armlink attempts to adjust the contingency for that
execution region. The aims are to:
• Never overflow a .ANY region.
• Make sure there is a contingency reserved space left in the given execution region. This space is
reserved for veneers and section padding.
If you specify --any_contingency on the command line, it is ignored for regions that have ANY_SIZE
specified. It is used as normal for regions that do not have ANY_SIZE specified.
Example
The following example shows how to use ANY_SIZE:
LOAD_REGION 0x0 0x3000
{
ER_1 0x0 ANY_SIZE 0xF00 0x1000
{
.ANY
}
ER_2 0x0 ANY_SIZE 0xFB0 0x1000
{
.ANY
}
ER_3 0x0 ANY_SIZE 0x1000 0x1000
{
.ANY
}
}
In this example:
• ER_1 has 0x100 reserved for linker-generated content.
• ER_2 has 0x50 reserved for linker-generated content. That is about the same as the automatic
contingency of --any_contingency.
• ER_3 has no reserved space. Therefore, 100% of the region is filled, with no contingency for veneers.
Omitting the ANY_SIZE parameter causes 98% of the region to be filled, with a two percent
contingency for veneers.
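The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical, and rounding the two percent contingency down to a whole byte is an assumption.

```c
#include <stdint.h>

/* Space left for linker-generated content when ANY_SIZE is given:
 * the region's maximum size minus the ANY_SIZE value. */
uint32_t any_size_reserved(uint32_t max_size, uint32_t any_size)
{
    return max_size - any_size;
}

/* Two percent contingency used when ANY_SIZE is omitted
 * (rounded down; the rounding is an assumption). */
uint32_t default_contingency(uint32_t max_size)
{
    return (uint32_t)(((uint64_t)max_size * 2u) / 100u);
}
```

For a region of 0x1000 bytes, the two percent contingency is 81 bytes, which is close to the 0x50 (80) bytes reserved in ER_2.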
Name Size
sec1 0x4
sec2 0x4
sec3 0x4
sec4 0x4
sec5 0x4
sec6 0x4
.ANY
}
ER_2 0x200 0x10
{
.ANY
}
}
Note
These examples have --any_contingency disabled.
Execution Region ER_2 (Base: 0x00000200, Size: 0x00000008, Max: 0x00000010, ABSOLUTE)
In this example:
• For first_fit the linker first assigns all the sections it can to ER_1, then moves on to ER_2 because
that is the next available region.
• For next_fit the linker does the same as first_fit. However, when ER_1 is full it is marked as
FULL and is not considered again. In this example, ER_1 is completely full. ER_2 is then considered.
• For best_fit the linker assigns sec1 to ER_1. It then has two regions of equal priority and
specificity, but ER_1 has less space remaining. Therefore, the linker assigns sec2 to ER_1, and
continues assigning sections until ER_1 is full.
Execution Region ER_2 (Base: 0x00000200, Size: 0x0000000c, Max: 0x00000010, ABSOLUTE)
The linker first assigns sec1 to ER_1. It then has two equally specific and priority regions. It assigns sec2
to the one with the most free space, ER_2 in this example. The regions now have the same amount of
space remaining, so the linker assigns sec3 to the first one that appears in the scatter file, that is ER_1.
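The first_fit, best_fit, and worst_fit behavior described for this example can be modeled with a simplified simulation. This is a sketch, not the linker's implementation: it assumes that all regions have equal priority and specificity, that contingency is disabled, and that ER_1 has the same maximum size (0x10) as ER_2. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum any_policy { FIRST_FIT, BEST_FIT, WORST_FIT };

struct region {
    uint32_t max_size;  /* maximum size of the execution region */
    uint32_t used;      /* bytes assigned so far */
};

/* Assign each section, in order, to one of the regions.
 * Regions are listed in scatter-file order.
 * Returns the number of sections placed before one fails to fit. */
size_t place_any(enum any_policy policy, struct region *regions,
                 size_t n_regions, const uint32_t *sizes, size_t n_sections)
{
    size_t placed = 0;

    for (size_t s = 0; s < n_sections; s++) {
        size_t chosen = n_regions;              /* n_regions means "none yet" */

        for (size_t r = 0; r < n_regions; r++) {
            uint32_t free_r = regions[r].max_size - regions[r].used;

            if (free_r < sizes[s])
                continue;                       /* section does not fit here */
            if (chosen == n_regions) {
                chosen = r;                     /* first region that fits */
                if (policy == FIRST_FIT)
                    break;
                continue;
            }
            uint32_t free_c = regions[chosen].max_size - regions[chosen].used;
            /* Strict comparisons keep the earlier region on a tie. */
            if ((policy == BEST_FIT && free_r < free_c) ||
                (policy == WORST_FIT && free_r > free_c))
                chosen = r;
        }
        if (chosen == n_regions)
            break;                              /* nothing fits */
        regions[chosen].used += sizes[s];
        placed++;
    }
    return placed;
}
```

With six sections of 0x4 bytes and two regions of 0x10 bytes, this model leaves ER_2 with 0x8 bytes for first_fit and best_fit and 0xc bytes for worst_fit, matching the ER_2 sizes shown in the map output above.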
Note
The behavior of worst_fit is the default behavior in this version of the linker, and it is the only
algorithm available in earlier linker versions.
6.8.6 Example of next_fit algorithm showing behavior of full regions, selectors, and priority
This example shows the operation of the next_fit placement algorithm for RO-CODE sections in
sections.o.
The input section properties and ordering are shown in the following table:
Table 6-2 Input section properties for placement of sections with next_fit
Name Size
sec1 0x14
sec2 0x14
sec3 0x10
sec4 0x4
sec5 0x4
sec6 0x4
Note
This example has --any_contingency disabled.
The next_fit algorithm is different to the others in that it never revisits a region that is considered to be
full. This example also shows the interaction between priority and specificity of selectors. This is the
same for all the algorithms.
Execution Region ER_1 (Base: 0x00000100, Size: 0x00000014, Max: 0x00000020, ABSOLUTE)
Execution Region ER_2 (Base: 0x00000200, Size: 0x0000001c, Max: 0x00000020, ABSOLUTE)
Execution Region ER_3 (Base: 0x00000300, Size: 0x00000014, Max: 0x00000020, ABSOLUTE)
In this example:
• The linker places sec1 in ER_1 because ER_1 has the most specific selector. ER_1 now has 0xC bytes
remaining.
• The linker then tries to place sec2 in ER_1, because it has the most specific selector, but there is not
enough space. Therefore, ER_1 is marked as full and is not considered in subsequent placement steps.
The linker chooses ER_3 for sec2 because it has higher priority than ER_2.
• The linker then tries to place sec3 in ER_3. It does not fit, so ER_3 is marked as full and the linker
places sec3 in ER_2.
• The linker now processes sec4. This is 0x4 bytes so it can fit in either ER_1 or ER_3. Because both of
these regions have previously been marked as full, they are not considered. The linker places all
remaining sections in ER_2.
• If another section sec7 of size 0x8 exists, and is processed after sec6, the example fails to link. The
algorithm does not attempt to place the section in ER_1 or ER_3 because they have previously been
marked as full.
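The steps above can be modeled with a simplified simulation. This is a sketch under stated assumptions, not the linker's implementation: the effect of selector specificity and priority is reduced to a fixed preference order (ER_1, then ER_3, then ER_2), contingency is disabled, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct nf_region {
    uint32_t max_size;  /* maximum size of the execution region */
    uint32_t used;      /* bytes assigned so far */
    bool     full;      /* once set, the region is never considered again */
};

/* order[] lists region indices from most to least preferred.
 * Returns the number of sections placed before one fails to fit. */
size_t place_next_fit(struct nf_region *regions, const size_t *order,
                      size_t n_regions, const uint32_t *sizes,
                      size_t n_sections)
{
    size_t placed = 0;

    for (size_t s = 0; s < n_sections; s++) {
        bool done = false;

        for (size_t k = 0; k < n_regions && !done; k++) {
            struct nf_region *r = &regions[order[k]];

            if (r->full)
                continue;                 /* never revisit a full region */
            if (r->max_size - r->used >= sizes[s]) {
                r->used += sizes[s];
                done = true;
            } else {
                r->full = true;           /* does not fit: mark as full */
            }
        }
        if (!done)
            return placed;                /* this section cannot be placed */
        placed++;
    }
    return placed;
}
```

Running the model with sec1 to sec6 followed by a hypothetical sec7 of 0x8 bytes places the first six sections and then fails, even though ER_1 and ER_3 each still have 0xC bytes free.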
The input section properties and ordering are shown in the following table:
Table 6-3 Input section properties and ordering for sections_a.o and sections_b.o
sections_a.o sections_b.o
The following table shows the order that the sections are processed by the .ANY assignment algorithm.
Name Size
seca_4 0x14
secb_4 0x14
seca_3 0x10
secb_3 0x10
seca_1 0x4
seca_2 0x4
secb_1 0x4
secb_2 0x4
With --any_sort_order=descending_size, sections of the same size use the creation index as a
tiebreaker.
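The ordering in the table above can be reproduced with a comparison function that sorts by descending size and then by ascending creation index. This is a sketch: the structure and function names are hypothetical, and it assumes that the creation index follows the order in which the sections appear in the input objects (seca_1 to seca_4, then secb_1 to secb_4).

```c
#include <stdint.h>
#include <stdlib.h>

struct any_section {
    const char *name;
    uint32_t    size;
    unsigned    creation_index;  /* assumed: order of appearance in the input */
};

/* Larger sections first; equal sizes ordered by creation index. */
static int by_descending_size(const void *lhs, const void *rhs)
{
    const struct any_section *a = lhs;
    const struct any_section *b = rhs;

    if (a->size != b->size)
        return (a->size < b->size) ? 1 : -1;
    if (a->creation_index != b->creation_index)
        return (a->creation_index < b->creation_index) ? -1 : 1;
    return 0;
}

void sort_descending_size(struct any_section *sections, size_t count)
{
    qsort(sections, count, sizeof sections[0], by_descending_size);
}
```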
Command-line example
The following linker command-line options are used for this example:
--any_sort_order=cmdline sections_a.o sections_b.o --scatter scatter.txt
The following table shows the order that the sections are processed by the .ANY assignment algorithm.
Name Size
seca_1 0x4
seca_2 0x4
seca_3 0x10
seca_4 0x14
secb_1 0x4
secb_2 0x4
secb_3 0x10
secb_4 0x14
The following diagram represents the notional image layout during .ANY placement:
[Figure: an execution region from its base to its limit, showing image content, .ANY sections, free space, and the 2% contingency.]
The downward arrows for prospective padding show that the prospective padding continues to grow as
more sections are added to the .ANY selector.
Prospective padding is dealt with before the two percent veneer contingency.
When the prospective padding is cleared the priority is set to zero. When the two percent is cleared the
priority is decremented again.
You can also use the ANY_SIZE keyword on an execution region to specify the maximum amount of
space in the region to set aside for .ANY section assignments.
You can use the armlink command-line option --info=any to get extra information on where the linker
has placed sections. This can be useful when trying to debug problems.
Example
1. Create the following foo.c program:
#include "stdio.h"
int array[10] __attribute__ ((section ("ARRAY")));
struct S {
    char A[8];
    char B[4];
};
struct S s;
struct S* get()
{
    return &s;
}
/* gSquared and sqr() are used by main() but their definitions are not
   shown; minimal definitions are assumed here. */
int gSquared;
int sqr(int n1)
{
    return n1*n1;
}
int main(void) {
    int i;
    for (i=0; i<10; i++) {
        array[i]=i*i;
        printf("%d\n", array[i]);
    }
    gSquared=sqr(i);
    return sizeof(array);
}
2. Create the following scatter.scat file:
LOAD_REGION 0x0 0x3000
{
ER_1 0x0 0x1000 {
.ANY
}
ER_2 (ImageLimit(ER_1)) 0x1500 {
.ANY
}
ER_3 (ImageLimit(ER_2)) 0x500
{
.ANY
}
ER_4 (ImageLimit(ER_3)) 0x1000
{
*(+RW,+ZI)
}
ARM_LIB_STACK 0x800000 EMPTY -0x10000
{
}
ARM_LIB_HEAP +0 EMPTY 0x10000
{
}
}
3. Compile and link the program as follows:
armclang -c --target=arm-arm-none-eabi -mcpu=cortex-m4 -o foo.o foo.c
armlink --cpu=cortex-m4 --any_contingency --scatter=scatter.scat --info=any -o foo.axf foo.o
6.9 Placing veneers with a scatter file
Procedure
1. To place veneers at a specific location, include the linker-generated symbol Veneer$$Code in a
scatter file. At most, one execution region in the scatter file can have the *(Veneer$$Code) section
selector.
If it is safe to do so, the linker places veneer input sections into the region identified by the
*(Veneer$$Code) section selector. It might not be possible for a veneer input section to be assigned
to the region because of address range problems or execution region size limitations. If the veneer
cannot be added to the specified region, it is added to the execution region containing the relocated
input section that generated the veneer.
Note
Instances of *(IWV$$Code) in scatter files from earlier versions of Arm tools are automatically
translated into *(Veneer$$Code). Use *(Veneer$$Code) in new descriptions.
*(Veneer$$Code) is ignored when the amount of code in an execution region exceeds 4MB of 16-bit
T32 code, 16MB of 32-bit T32 code, and 32MB of A32 code.
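The size thresholds can be written as a small check. This is only a sketch of the limits stated above: the names are hypothetical, each kind of code is tested separately, and 1MB is assumed to mean 2^20 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

enum code_kind { T32_16BIT, T32_32BIT, A32 };

#define MB (1024ull * 1024ull)  /* assumed: 1MB = 2^20 bytes */

/* True if the amount of code of the given kind exceeds the threshold
 * beyond which *(Veneer$$Code) is ignored. */
bool veneer_selector_ignored(enum code_kind kind, uint64_t code_bytes)
{
    switch (kind) {
    case T32_16BIT: return code_bytes > 4 * MB;
    case T32_32BIT: return code_bytes > 16 * MB;
    case A32:       return code_bytes > 32 * MB;
    }
    return false;
}
```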
Note
There are no state-change veneers in A64.
6.10 Preprocessing a scatter file
You can:
• Add preprocessing directives to the top of the scatter file.
• Use simple expression evaluation in the scatter file.
For example, a scatter file, file.scat, might contain:
#! armclang --target=arm-arm-none-eabi -march=armv8-a -E -x c
#define ADDRESS 0x20000000
#include "include_file_1.h"
LR1 ADDRESS
{
…
}
The linker parses the preprocessed scatter file and treats the directives as comments.
You can also use the --predefine command-line option to assign values to constants. For this example:
1. Modify file.scat to delete the directive #define ADDRESS 0x20000000.
2. Specify the command:
armlink --predefine="-DADDRESS=0x20000000" --scatter=file.scat
This section contains the following subsections:
• 6.10.1 Default behavior for armclang -E in a scatter file.
• 6.10.2 Using other preprocessors in a scatter file.
On Windows, .exe suffixes are handled, so armclang.exe is considered the same as armclang.
Executable names are case insensitive, so ARMCLANG is considered the same as armclang. The portable
way to write scatter file preprocessing lines is to use correct capitalization and omit the .exe suffix.
This means:
• The string must be correctly quoted for the host system. The portable way to do this is to use double-
quotes.
• Single quotes and escaped characters are not supported and might not function correctly.
• The use of a double-quote character in a path name is not supported and might not work.
These rules also apply to any strings passed with the --predefine option.
All preprocessor executables must accept the -o file option to mean output to file and accept the input
as a filename argument on the command line. These options are automatically added to the user
command line by armlink. Any options to redirect preprocessing output in the user-specified command
line are not supported.
6.11 Reserving an empty block of memory
Note
The dummy ZI region that is created for an EMPTY execution region is not initialized to zero at runtime.
If the address is in relative (+offset) form and the length is negative, the linker generates an error.
The following figure shows a diagrammatic representation for this example.
[Figure: the heap region has base 0x800000 and limit 0x810000. The stack region has base 0x7F0000 and limit 0x800000.]
Note
The EMPTY attribute applies only to an execution region. The linker generates a warning and ignores an
EMPTY attribute that is used in a load region definition.
The linker checks that the address space used for the EMPTY region does not coincide with any other
execution region.
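The base and limit of an EMPTY region follow from its address and signed length. The sketch below assumes that a negative length means the given address is the end of the region, which is consistent with the stack region in the figure. The structure and function names are hypothetical.

```c
#include <stdint.h>

struct empty_region {
    uint32_t base;   /* lowest address of the region */
    uint32_t limit;  /* address one past the end of the region */
};

/* Compute the bounds of an EMPTY region from an absolute address and
 * a signed length. A negative length makes the region grow downward
 * from the address. */
struct empty_region empty_region_bounds(uint32_t address, int32_t length)
{
    struct empty_region r;

    if (length < 0) {
        r.limit = address;
        r.base  = address - (uint32_t)(-(int64_t)length);
    } else {
        r.base  = address;
        r.limit = address + (uint32_t)length;
    }
    return r;
}
```

For example, an address of 0x800000 with a length of -0x10000 gives a base of 0x7F0000 and a limit of 0x800000, and the same address with a length of 0x10000 gives a base of 0x800000 and a limit of 0x810000.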
6.12 Aligning regions to page boundaries
Note
• Alignment on an execution region causes both the load address and execution address to be aligned.
• The default page size is 0x8000. To change the page size, specify the --pagesize linker command-
line option.
To produce an ELF file with each execution region starting on a new page, and with code starting on the
next page boundary after the header information:
LR1 0x0 + SizeOfHeaders()
{
ER_RO +0
{
*(+RO)
}
ER_RW AlignExpr(+0, GetPageSize())
{
*(+RW)
}
ER_ZI AlignExpr(+0, GetPageSize())
{
*(+ZI)
}
}
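The effect of aligning an address to the next page boundary can be sketched in C. The function name align_up is hypothetical, and the sketch assumes that the alignment is a power of two, as is the case for the default page size of 0x8000.

```c
#include <stdint.h>

/* Round address up to the next multiple of alignment.
 * alignment must be a power of two. An address that is already
 * aligned is returned unchanged. */
uint32_t align_up(uint32_t address, uint32_t alignment)
{
    return (address + alignment - 1u) & ~(alignment - 1u);
}
```

For example, with a page size of 0x8000, an RO region that ends at 0x8001 would be followed by an RW region starting at 0x10000.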
If you set up your ELF file in this way, then you can memory-map it onto an operating system in such a
way that:
• RO and RW data can be given different memory protections, because they are placed in separate
pages.
• The load address that the image contents expect to run at is related to their offset in the ELF file,
because SizeOfHeaders() is specified for the first load region.
6.13 Aligning execution regions and input sections
Increases the section alignment of all the sections in an execution region, for example:
ER_DATA … ALIGNALL 8
{
… ;selectors
}
OVERALIGN
Chapter 7
Embedded Software Development
Describes how to develop embedded applications with Arm Compiler, with or without a target system
present.
It contains the following sections:
• 7.1 About embedded software development.
• 7.2 Default compilation tool behavior.
• 7.3 C library structure.
• 7.4 Default memory map.
• 7.5 Application startup.
• 7.6 Tailoring the C library to your target hardware.
• 7.7 Reimplementing C library functions.
• 7.8 Tailoring the image memory map to your target hardware.
• 7.9 About the scatter-loading description syntax.
• 7.10 Root regions.
• 7.11 Placing the stack and heap.
• 7.12 Run-time memory models.
• 7.13 Reset and initialization.
• 7.14 The vector table.
• 7.15 ROM and RAM remapping.
• 7.16 Local memory setup considerations.
• 7.17 Stack pointer initialization.
• 7.18 Hardware initialization.
• 7.19 Execution mode considerations.
• 7.20 Target hardware and the memory map.
• 7.21 Execute-only memory.
• 7.22 Building applications for execute-only memory.
• 7.23 Vector table for ARMv6 and earlier, ARMv7-A and ARMv7-R profiles.
• 7.24 Vector table for M-profile architectures.
• 7.25 Vector Table Offset Register.
7.1 About embedded software development
7.2 Default compilation tool behavior
7.3 C library structure
For example, the following figure shows the C library implementing the function printf() by writing to
the debugger console window. This implementation is provided by calling _sys_write(), a support
function that executes a semihosting call, resulting in the default behavior using the debugger instead of
target peripherals.
[Figure: C library structure. ISO C: functions called by your application, for example, printf(). C Library: input/output, error handling, stack and heap setup, other. Device driver level: use semihosting, for example, _sys_write(). Debug Agent: semihosting support, implemented by the debugging environment.]
7.4 Default memory map
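[Figure: memory map showing, from address 0x8000 upward, the RO, RW, and ZI regions and the heap, with the annotation "Calculated by the linker".]
Figure 7-2 Default memory map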
Note
The processors based on Armv6‑M and Armv7‑M architectures have fixed memory maps. This makes
porting software easier between different systems based on these processors.
Figure: linker placement of input sections (labels: RO with CODE and DATA, RW, ZI; sections A and B;
file1.o and file2.o)
Generally, the linker sorts the input sections by attribute (RO, RW, ZI), by name, and then by position in
the input list.
To fully control the placement of code and data you must use the scatter-loading mechanism.
Related concepts
7.6 Tailoring the C library to your target hardware
Related information
The image structure
Section placement with the linker
About scatter-loading
Scatter file syntax
Cortex-M1 Technical Reference Manual
Cortex-M3 Technical Reference Manual
7.5 Application startup
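Figure: default initialization sequence. The image entry point is __main, which copies code, copies or
decompresses RW data, and initializes ZI data to zeros. __rt_entry then sets up the application stack
and heap, initializes library functions, and calls top-level constructors (C++), before calling main().
The presence of main() causes the linker to link in library initialization code.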
__main is responsible for setting up the memory and __rt_entry is responsible for setting up the run-
time environment.
__main performs code and data copying, decompression, and zero initialization of the ZI data. It then
branches to __rt_entry to set up the stack and heap, initialize the library functions and static data, and
call any top level C++ constructors. __rt_entry then branches to main(), the entry to your application.
When the main application has finished executing, __rt_entry shuts down the library, then hands
control back to the debugger.
The function label main() has a special significance. The presence of a main() function forces the linker
to link in the initialization code in __main and __rt_entry. Without a function labeled main() the
initialization sequence is not linked in, and as a result, some standard C library functionality is not
supported.
Related information
--startup=symbol, --no_startup linker options
Arm Compiler C Library Startup and Initialization
7.6 Tailoring the C library to your target hardware
Figure: retargeting the C library. The target-independent part of the C library is unchanged, and user
code replaces the target-dependent part.
For example, you might have a peripheral I/O device such as an LCD screen, and you might want to
override the library implementation of fputc(), which writes to the debugger console, with one that
outputs to the LCD. Because this implementation of fputc() is linked into the final image, the entire
printf() family of functions prints out to the LCD.
In a standalone application, you are unlikely to support semihosting operations. Therefore, you must
remove all calls to target-dependent C library functions or re-implement them with non-semihosting
functions.
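The LCD example above can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the library's implementation: lcd_putchar(), lcd_buffer, and lcd_fputc() are hypothetical names, and a RAM buffer stands in for the display. On the target, the retargeted function would be named fputc() so that it is linked in place of the semihosting version. It has a different name here only so that the sketch also builds alongside a hosted C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical LCD driver: a RAM buffer stands in for the display. */
#define LCD_COLUMNS 16
static char lcd_buffer[LCD_COLUMNS + 1];
static size_t lcd_pos;

static void lcd_putchar(char c)
{
    assert(lcd_pos <= LCD_COLUMNS);     /* buffer invariant */
    if (lcd_pos < LCD_COLUMNS) {
        lcd_buffer[lcd_pos++] = c;
        lcd_buffer[lcd_pos] = '\0';     /* keep the buffer a valid string */
    }
}

/*
 * Assumption: on the target this function is named fputc(), so that the
 * printf() family reaches the LCD through it. It is named lcd_fputc() here
 * so that the sketch does not clash with a hosted C library.
 */
int lcd_fputc(int ch, FILE *stream)
{
    (void)stream;                       /* every stream goes to the LCD in this sketch */
    lcd_putchar((char)ch);
    return (unsigned char)ch;           /* like fputc(), return the character written */
}
```

Because only the lowest-level output function changes, code that calls the printf() family needs no modification.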
Related information
Using the libraries in a nonsemihosting environment
7.7 Reimplementing C library functions
Use armclang and armar to create a library from your reimplemented printf() function:
armclang --target=arm-arm-none-eabi -c -O2 -march=armv7-a -mfpu=none mylib.c -o mylib.o
armar --create mylib.a mylib.o
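/* foo.c: example application that calls the reimplemented printf(). */
/* Prototype assumed to match the standard one; -nostdlibinc hides <stdio.h>. */
extern int printf(const char *format, ...);
void foo(void)
{
    printf("Hello, world!\n");
}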
Use armclang to build the example application source file using the -nostdlib, -nostdlibinc, and
-fno-builtin options. Then use armlink to link the example reimplemented library using the
--no_scanlib option.
If you do not use the -fno-builtin option, then the compiler replaces the call to printf() with a call to
puts(), and the linker generates an error because it cannot find the puts() function in the
reimplemented library.
armclang --target=arm-arm-none-eabi -c -O2 -march=armv7-a -mfpu=none -nostdlib -nostdlibinc -fno-builtin foo.c -o foo.o
armlink foo.o mylib.a -o image.axf --no_scanlib
Note
If the linker sees a definition of main(), it automatically creates a reference to a startup symbol called
__main. The Arm standard C library defines __main to provide startup code. If you use your own library
instead of the Arm standard C library, then you must provide your implementation of __main or change
the startup symbol using the linker --startup option.
Related concepts
7.3 C library structure
Related information
--startup
Run-time ABI for the Arm Architecture
C Library ABI for the Arm Architecture
7.8 Tailoring the image memory map to your target hardware
Related information
Information about scatter files
--scatter=filename linker option
Armv7‑M Architecture Reference Manual
Armv6‑M Architecture Reference Manual
7.9 About the scatter-loading description syntax
7.10 Root regions
7.11 Placing the stack and heap
7.12 Run-time memory models
One-region model
The application stack and heap grow towards each other in the same region of memory, as shown in the
following figure. In this run-time memory model, the heap is checked against the value of the stack
pointer when new heap space is allocated, for example, when malloc() is called.
Figure: one-region model (STACK, with its stack base at 0x40000, above HEAP in the same region)
Two-region model
The stack and heap are placed in separate regions of memory, as shown in the following figure. For
example, you might have a small block of fast RAM that you want to reserve for stack use only. For a
two-region model, you must import __use_two_region_memory.
In this run-time memory model, the heap is checked against the heap limit when new heap space is
allocated.
Figure: two-region model (HEAP between the heap base at 0x28000000 and the heap limit at 0x28080000;
STACK with its stack base at 0x40000)
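The difference between the two heap checks can be sketched in C. This is an illustration of the checks described above, not the allocator used by the Arm C library: the function names and parameters are hypothetical, and the addresses in the usage note are the ones shown in the figures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One-region model: new heap space must stay below the current stack pointer. */
bool heap_fits_one_region(uint32_t heap_top, uint32_t request, uint32_t stack_pointer)
{
    assert(heap_top <= stack_pointer);      /* heap and stack must not already overlap */
    return request <= stack_pointer - heap_top;
}

/* Two-region model: new heap space must stay below the fixed heap limit. */
bool heap_fits_two_region(uint32_t heap_top, uint32_t request, uint32_t heap_limit)
{
    assert(heap_top <= heap_limit);         /* heap must lie inside its own region */
    return request <= heap_limit - heap_top;
}
```

For example, with a heap region that runs from 0x28000000 to 0x28080000, a request for 0x80000 bytes from an empty heap fits in the two-region model, and a request for one more byte does not. In the one-region model the result also depends on how far the stack has grown at the time of the call.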
7.13 Reset and initialization
Figure: initialization sequence. After reset, __main copies code, copies or decompresses RW data, and
initializes ZI data to zeros. __user_setup_stackheap() sets up the application stack and heap.
__rt_entry initializes library functions and calls top-level constructors (C++). $Sub$$main() enables
caches and interrupts, and then main() runs until exit from the application. The presence of main()
causes the linker to link in library initialization code.
If you use a scatter file to tailor stack and heap placement, the linker includes a version of the library
heap and stack setup code that uses the linker-defined symbols, ARM_LIB_*, for these region names.
Alternatively, you can create your own implementation.
The reset handler is normally a short module coded in assembler that executes immediately on system
startup. As a minimum, your reset handler initializes stack pointers for the modes that your application is
running in. For processors with local memory systems, such as caches, TCMs, MMUs, and MPUs, some
configuration must be done at this stage in the initialization process. After executing, the reset handler
typically branches to __main to begin the C library initialization sequence.
There are some components of system initialization, for example, the enabling of interrupts, that are
generally performed after the C library initialization code has finished executing. The block of code
labeled $Sub$$main() performs these tasks immediately before the main application begins executing.
Related information
About using $Super$$ and $Sub$$ to patch symbol definitions
Specifying stack and heap using the scatter file
7.14 The vector table
The vector table for the microcontroller profiles differs significantly from the vector table for most other Arm architectures.
Related concepts
7.23 Vector table for ARMv6 and earlier, ARMv7-A and ARMv7-R profiles
7.24 Vector table for M-profile architectures
Related information
Information about scatter files
Scatter-loading images with a simple memory map
7.15 ROM and RAM remapping
Note
This information does not apply to Armv6‑M, Armv7‑M, and Armv8‑M profiles.
Note
This information assumes that an Arm processor begins fetching instructions at 0x0. This is the standard
behavior for systems based on Arm processors. However, some Arm processors, for example the
processors based on the Armv7‑A architecture, can be configured to begin fetching instructions from
0xFFFF0000.
There has to be a valid instruction at 0x0 at startup, so you must have nonvolatile memory located at 0x0
at the moment of power-on reset. One way to achieve this is to have ROM located at 0x0. However, there
are some drawbacks to this configuration.
7.16 Local memory setup considerations
Tightly Coupled Memories (TCM) must also be enabled before branching to __main, normally before
MMU/MPU setup, because you generally want to scatter-load code and data into TCMs. You must be
careful that you do not have to access memory that is masked by the TCMs when they are enabled.
You might also encounter problems with cache coherency if caches are enabled before branching to
__main. Code in __main copies code regions from their load address to their execution address,
essentially treating instructions as data. As a result, some instructions can be cached in the data cache, in
which case they are not visible to the instruction path.
To avoid these coherency problems, enable caches after the C library initialization sequence finishes
executing.
Related information
Cortex-A Series Programmer's Guide for Armv8-A
Cortex-A Series Programmer's Guide for Armv7-A
Cortex-R Series Programmer's Guide for Armv7-R
7.17 Stack pointer initialization
The stack_base symbol can be a hard-coded address, or it can be defined in a separate assembler source
file and located by a scatter file.
The example allocates 256 bytes of stack for Fast Interrupt Request (FIQ) and Interrupt Request (IRQ)
mode, but you can do the same for any other execution mode. To set up the stack pointers, enter each
mode with interrupts disabled, and assign the appropriate value to the stack pointer.
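The stack layout described above can be worked through as plain arithmetic. The sketch below assumes full descending stacks that are carved out downwards from stack_base, 256 bytes each. The constant and function names are hypothetical, and a real reset handler would normally do this in assembly language while switching between modes.

```c
#include <assert.h>
#include <stdint.h>

enum {
    LEN_FIQ_STACK     = 256,                                /* bytes reserved for FIQ mode */
    LEN_IRQ_STACK     = 256,                                /* bytes reserved for IRQ mode */
    OFFSET_FIQ_STACK  = 0,                                  /* FIQ stack starts at stack_base */
    OFFSET_IRQ_STACK  = OFFSET_FIQ_STACK + LEN_FIQ_STACK,   /* IRQ stack starts below FIQ */
    OFFSET_NEXT_STACK = OFFSET_IRQ_STACK + LEN_IRQ_STACK    /* first free byte for another mode */
};

/* Initial stack pointer for a mode whose stack starts `offset` bytes below stack_base. */
uint32_t initial_sp(uint32_t stack_base, uint32_t offset)
{
    assert(offset <= stack_base);   /* the stack must not wrap below address 0 */
    return stack_base - offset;
}
```

With a stack base of 0x40000, the FIQ stack pointer starts at 0x40000, the IRQ stack pointer at 0x3FF00, and a further mode could start at 0x3FE00.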
The stack pointer value set up in the reset handler is automatically passed as a parameter to
__user_initial_stackheap() by C library initialization code. Therefore, this value must not be
modified by __user_initial_stackheap().
Related information
Specifying stack and heap using the scatter file
Cortex-M3 Embedded Software Development
7.18 Hardware initialization
The linker replaces the function call to main() with a call to $Sub$$main(). From there you can call a
routine that enables caches and another to enable interrupts.
The code branches to the real main() by calling $Super$$main().
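The call order described above can be sketched in portable C. This is a host-buildable stand-in under stated assumptions, not target code: enable_caches(), enable_interrupts(), application_main(), and pre_main() are hypothetical names. On the target, the wrapper would be defined as $Sub$$main() and would call $Super$$main() instead of application_main(), so that the linker performs the redirection.

```c
#include <assert.h>
#include <stdbool.h>

static bool caches_enabled;
static bool interrupts_enabled;

/* Hypothetical hardware setup routines; real ones are target-specific. */
static void enable_caches(void)     { caches_enabled = true; }
static void enable_interrupts(void) { interrupts_enabled = true; }

/* Stands in for the real main(), which is reached through $Super$$main() on the target. */
static int application_main(void)
{
    assert(caches_enabled && interrupts_enabled);   /* setup has already run */
    return 0;
}

/* Stands in for $Sub$$main(): runs after C library initialization, before the application. */
int pre_main(void)
{
    enable_caches();
    enable_interrupts();
    return application_main();
}
```

Keeping the hardware setup in the wrapper means that the application's main() stays free of target-specific initialization.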
Related information
About using $Super$$ and $Sub$$ to patch symbol definitions
7.19 Execution mode considerations
Note
This does not apply to Armv6‑M, Armv7‑M, and Armv8‑M profiles.
Much of the functionality that you are likely to implement at startup, both in the reset handler and
$Sub$$main, can only be done while executing in privileged modes, for example, on-chip memory
manipulation, and enabling interrupts.
If you want to run your application in a privileged mode, this is not an issue. Ensure that you change to
the appropriate mode before exiting your reset handler.
If you want to run your application in User mode, however, you can only change to User mode after
completing the necessary tasks in a privileged mode. The most likely place to do this is in
$Sub$$main().
Note
The C library initialization code must use the same stack as the application. If you need to use a non-
User mode in $Sub$$main and User mode in the application, you must exit your reset handler in System
mode, which uses the User mode stack pointer.
7.20 Target hardware and the memory map
Note
You can also use __attribute__((section(".ARM.__at_address"))) to specify the absolute address
of a variable.
It is important that the contents of these registers are not zero-initialized during application startup,
because this is likely to change the state of your system. Marking an execution region with the UNINIT
attribute prevents ZI data in that region from being zero-initialized by __main.
Related tasks
6.6 Placing functions and data at specific addresses
Related information
__attribute__((section("name"))) variable attribute
7.21 Execute-only memory
Related tasks
7.22 Building applications for execute-only memory
7.22 Building applications for execute-only memory
Note
Link Time Optimization does not honor the armclang -mexecute-only option. If you use the armclang
-flto or -Omax options, then the compiler cannot generate execute-only code and produces a warning.
Procedure
1. Compile your C or C++ code using the -mexecute-only option.
Example: armclang --target=arm-arm-none-eabi -march=armv7-m -mexecute-only -c test.c -o test.o
The -mexecute-only option prevents the compiler from generating any data accesses to the code
sections.
To keep code and data in separate sections, the compiler disables the placement of literal pools inline
with code.
Compiled execute-only code sections in the ELF object file are marked with the SHF_ARM_NOREAD
flag.
2. Specify the memory map to the linker using either of the following:
• The +XO selector in a scatter file.
• The armlink --xo_base option on the command-line.
Example: armlink --xo_base=0x8000 test.o -o test.axf
Results:
The XO execution region is placed in a separate load region from the RO, RW, and ZI execution
regions.
Note
If you do not specify --xo_base, then by default:
• The XO execution region is placed immediately before the RO execution region, at address
0x8000.
• All execution regions are in the same load region.
Related concepts
7.21 Execute-only memory
Related information
-mexecute-only compiler option
--execute_only assembler option
--xo_base=address linker option
AREA
7.23 Vector table for ARMv6 and earlier, ARMv7-A and ARMv7-R profiles
The vector table for Armv6 and earlier, Armv7‑A and Armv7‑R profiles consists of branch or load PC
instructions to the relevant handlers.
If required, you can include the FIQ handler at the end of the vector table to ensure it is handled as
efficiently as possible, see the following example. Using a literal pool means that addresses can easily be
modified later if necessary.
This example assumes that you have ROM at location 0x0 on reset. Alternatively, you can use the
scatter-loading mechanism to define the load and execution address of the vector table. In that case, the C
library copies the vector table for you.
Note
The vector table for Armv6 and earlier architectures supports A32 instructions only. Armv6T2 and later
architectures support both T32 instructions and A32 instructions in the vector table. This does not apply
to the Armv6‑M, Armv7‑M, and Armv8‑M profiles.
7.24 Vector table for M-profile architectures
7.25 Vector Table Offset Register
Appendix A
Supporting reference information
The various features in Arm Compiler might have different levels of support, ranging from fully
supported product features to community features.
A.1 Support level definitions
Product features
Product features are suitable for use in a production environment. The functionality is well-tested, and is
expected to be stable across feature and update releases.
• Arm endeavors to give advance notice of significant functionality changes to product features.
• If you have a support and maintenance contract, Arm provides full support for use of all product
features.
• Arm welcomes feedback on product features.
• Any issues with product features that Arm encounters or is made aware of are considered for fixing in
future versions of Arm Compiler.
In addition to fully supported product features, some product features are only alpha or beta quality.
Beta product features
Beta product features are implementation complete, but have not been sufficiently tested to be
regarded as suitable for use in production environments.
Beta product features are indicated with [BETA].
• Arm endeavors to document known limitations on beta product features.
• Beta product features are expected to eventually become product features in a future release
of Arm Compiler 6.
• Arm encourages the use of beta product features, and welcomes feedback on them.
• Any issues with beta product features that Arm encounters or is made aware of are
considered for fixing in future versions of Arm Compiler.
Alpha product features
Alpha product features are not implementation complete, and are subject to change in future
releases, therefore the stability level is lower than in beta product features.
Alpha product features are indicated with [ALPHA].
• Arm endeavors to document known limitations of alpha product features.
• Arm encourages the use of alpha product features, and welcomes feedback on them.
• Any issues with alpha product features that Arm encounters or is made aware of are
considered for fixing in future versions of Arm Compiler.
Community features
Arm Compiler 6 is built on LLVM technology and preserves the functionality of that technology where
possible. This means that there are additional features available in Arm Compiler that are not listed in the
documentation. These additional features are known as community features. For information on these
community features, see the documentation for the Clang/LLVM project.
Where community features are referenced in the documentation, they are indicated with
[COMMUNITY].
• Arm makes no claims about the quality level or the degree of functionality of these features, except
when explicitly stated in this documentation.
• Functionality might change significantly between feature releases.
• Arm makes no guarantees that community features will remain functional across update releases,
although changes are expected to be unlikely.
Some community features might become product features in the future, but Arm provides no roadmap
for this. Arm is interested in understanding your use of these features, and welcomes feedback on them.
Arm supports customers using these features on a best-effort basis, unless the features are unsupported.
Arm accepts defect reports on these features, but does not guarantee that these issues will be fixed in
future releases.
Figure: Arm Compiler 6 toolchain components. armasm takes armasm syntax assembly source code.
armclang, which is built on LLVM Project clang, takes C/C++ source code and GNU syntax assembly
source code, together with the Arm C library and LLVM Project libc++ headers. Both produce objects
that armlink combines, under the control of a scatter, steering, or symdefs file, into an image.
The dashed boxes are toolchain components, and any interaction between these components is an
integration boundary. Community features that span an integration boundary might have significant
limitations in functionality. The exception to this is if the interaction is codified in one of the
standards supported by Arm Compiler 6. See Application Binary Interface (ABI) for the Arm®
Architecture. Community features that do not span integration boundaries are more likely to work as
expected.
• Features primarily used when targeting hosted environments such as Linux or BSD might have
significant limitations, or might not be applicable, when targeting bare-metal environments.
• The Clang implementations of compiler features, particularly those that have been present for a long
time in other toolchains, are likely to be mature. The functionality of new features, such as support
for new language features, is likely to be less mature and therefore more likely to have limited
functionality.
Unsupported features
Within both the product and community feature categories, specific features and use-cases are known not
to function correctly, or are not intended for use with Arm Compiler 6.
Limitations of product features are stated in the documentation. Arm cannot provide an exhaustive list of
unsupported features or use-cases for community features. The known limitations on community features
are listed in Community features.
A.2 Standards compliance in Arm® Compiler
A.3 Compliance with the ABI for the Arm® Architecture (Base Standard)
The ABI for the Arm Architecture (Base Standard) is a collection of standards. Some of these standards
are open. Some are specific to the Arm architecture.
The Application Binary Interface (ABI) for the Arm® Architecture (Base Standard) (BSABI) regulates the
inter-operation of binary code and development tools in Arm architecture-based execution environments,
ranging from bare metal to major operating systems such as Arm Linux.
By conforming to this standard, objects produced by the toolchain can work together with object libraries
from different producers.
The BSABI consists of a family of specifications including:
AADWARF64
DWARF for the Arm® 64-bit Architecture (AArch64). This ABI uses the DWARF 3 standard to
govern the exchange of debugging data between object producers and debuggers. It also gives
additional rules on how to use DWARF 3, and how it is extended in ways specific to the 64-bit
Arm architecture.
AADWARF
DWARF for the Arm® Architecture. This ABI uses the DWARF 3 standard to govern the
exchange of debugging data between object producers and debuggers.
AAELF64
ELF for the Arm® 64-bit Architecture (AArch64). This specification provides the processor-
specific definitions required by ELF for AArch64-based systems. It builds on the generic ELF
standard to govern the exchange of linkable and executable files between producers and
consumers.
AAELF
ELF for the Arm® Architecture. Builds on the generic ELF standard to govern the exchange of
linkable and executable files between producers and consumers.
AAPCS64
Procedure Call Standard for the Arm® 64-bit Architecture (AArch64). Governs the exchange of
control and data between functions at runtime. There is a variant of the AAPCS for each of the
major execution environment types supported by the toolchain.
AAPCS64 describes a number of different supported data models. Arm Compiler 6 implements
the LP64 data model for AArch64 state.
AAPCS
Procedure Call Standard for the Arm® Architecture. Governs the exchange of control and data
between functions at runtime. There is a variant of the AAPCS for each of the major execution
environment types supported by the toolchain.
BPABI
Base Platform ABI for the Arm® Architecture. Governs the format and content of executable and
shared object files generated by static linkers. Supports platform-specific executable files using
post linking. Provides a base standard for deriving a platform ABI.
CLIBABI
C Library ABI for the Arm® Architecture. Defines an ABI to the C library.
CPPABI64
C++ ABI for the Arm® 64-bit Architecture (AArch64). This specification builds on the generic C++ ABI
(originally developed for IA-64) to govern interworking between independent C++ compilers.
DBGOVL
Support for Debugging Overlaid Programs. Defines an extension to the ABI for the Arm
Architecture to support debugging overlaid programs.
EHABI
Exception Handling ABI for the Arm® Architecture. Defines both the language-independent and
C++-specific aspects of how exceptions are thrown and handled.
RTABI
Run-time ABI for the Arm® Architecture. Governs what independently produced objects can
assume of their execution environments by way of floating-point and compiler helper-function
support.
If you are upgrading from a previous toolchain release, ensure that you are using the most recent versions
of the Arm specifications.
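As an illustration of the LP64 data model mentioned under AAPCS64, the following sketch checks the type sizes that LP64 implies: a 32-bit int, and a 64-bit long and pointer. The function name is hypothetical and the check assumes 8-bit bytes. On targets that use a different data model, it simply returns false.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The size checks below are expressed in bytes, so require 8-bit bytes. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Returns true when the compilation target uses the LP64 data model. */
bool is_lp64(void)
{
    return sizeof(int) == 4 &&
           sizeof(long) == 8 &&
           sizeof(void *) == 8;
}
```

Code that stores pointers in a long, or that assumes long is 32 bits wide, is sensitive to this choice when it is moved between AArch32 and AArch64 state.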
A.4 GCC compatibility provided by Arm® Compiler 6
A.5 Locale support in Arm® Compiler
Note
There is no support for Shift-Japanese Industrial Standard (Shift-JIS) encoded files.
A.6 Toolchain environment variables
ARM_TOOL_VARIANT
Required only if you have a DS-5 Development Studio toolkit license and you are running the Arm
Compiler tools outside of the DS-5 Development Studio environment.
If you have an ultimate license, set this environment variable to ult to enable the Ultimate features.
See Product and toolkit configuration for more information.
ARM_PRODUCT_DEF
Required only if you have an Arm Development Studio toolkit license and you are running the Arm
Compiler tools outside of the Arm Development Studio environment.
Use this environment variable to specify the location of the product definition file.
ARMCOMPILER6_ASMOPT
An optional environment variable to define additional assembler options that are to be used outside
your regular makefile.
The options listed appear before any options specified for the armasm command in the makefile.
Therefore, any options specified in the makefile might override the options listed in this environment
variable.
ARMCOMPILER6_CLANGOPT
An optional environment variable to define additional armclang options that are to be used outside
your regular makefile.
The options listed appear before any options specified for the armclang command in the makefile.
Therefore, any options specified in the makefile might override the options listed in this environment
variable.
ARMCOMPILER6_LINKOPT
An optional environment variable to define additional linker options that are to be used outside your
regular makefile.
The options listed appear before any options specified for the armlink command in the makefile.
Therefore, any options specified in the makefile might override the options listed in this environment
variable.
ARMLMD_LICENSE_FILE
This environment variable must be set, and specifies the location of your Arm license file. See the
Arm® DS-5 License Management Guide for information on this environment variable.
Note
On Windows, the length of ARMLMD_LICENSE_FILE must not exceed 260 characters.
C_INCLUDE_PATH
GCC-compatible environment variable. Adds the specified directories to the list of places that are
searched to find included C files.
COMPILER_PATH
GCC-compatible environment variable. Adds the specified directories to the list of places that are
searched to find subprograms.
CPATH
GCC-compatible environment variable. Adds the specified directories to the list of places that are
searched to find included files regardless of the source language.
CPLUS_INCLUDE_PATH
GCC-compatible environment variable. Adds the specified directories to the list of places that are
searched to find included C++ files.
TMP
Used on Windows platforms to specify the directory to be used for temporary files.
TMPDIR
Used on Red Hat Linux platforms to specify the directory to be used for temporary files.
Related information
Product and toolkit configuration
A.7 Clang and LLVM documentation
A.8 Further reading
Arm® publications
Arm periodically provides updates and corrections to its documentation. See Arm® Infocenter for current
errata sheets and addenda, and the Arm Frequently Asked Questions (FAQs).
For full information about the base standard, software interfaces, and standards supported by Arm, see
Application Binary Interface (ABI) for the Arm® Architecture.
In addition, see the following documentation for specific information relating to Arm products:
• Arm® Architecture Reference Manuals.
• Cortex®‑A series processors.
• Cortex®‑R series processors.
• Cortex®‑M series processors.
Other publications
This Arm Compiler tools documentation is not intended to be an introduction to the C or C++
programming languages. It does not try to teach programming in C or C++, and it is not a reference
manual for the C or C++ standards. Other publications provide general information about programming.
The following publications describe the C++ language:
• ISO/IEC 14882:2014, C++ Standard.
• Stroustrup, B., The C++ Programming Language (4th edition, 2013). Addison-Wesley Publishing
Company, Reading, Massachusetts. ISBN 978-0321563842.
The following publications provide general C++ programming information:
• Stroustrup, B., The Design and Evolution of C++ (1994). Addison-Wesley Publishing Company,
Reading, Massachusetts. ISBN 0-201-54330-3.
This book explains how C++ evolved from its first design to the language in use today.
• Vandevoorde, D. and Josuttis, N.M., C++ Templates: The Complete Guide (2003). Addison-Wesley
Publishing Company, Reading, Massachusetts. ISBN 0-201-73484-2.
• Meyers, S., Effective C++ (3rd edition, 2005). Addison-Wesley Publishing Company, Reading,
Massachusetts. ISBN 978-0321334879.
This provides short, specific guidelines for effective C++ development.
• Meyers, S., More Effective C++ (2nd edition, 1997). Addison-Wesley Publishing Company, Reading,
Massachusetts. ISBN 0-201-92488-9.
The following publications provide general C programming information:
• ISO/IEC 9899:2011, C Standard.
The standard is available from national standards bodies (for example, AFNOR in France, ANSI in
the USA).
• Kernighan, B.W. and Ritchie, D.M., The C Programming Language (2nd edition, 1988). Prentice-
Hall, Englewood Cliffs, NJ, USA. ISBN 0-13-110362-8.
This book is co-authored by the original designer and implementer of the C language, and is updated
to cover the essentials of ANSI C.
• Harbison, S.P. and Steele, G.L., C: A Reference Manual (5th edition, 2002). Prentice-Hall, Englewood
Cliffs, NJ, USA. ISBN 0-13-089592-X.
This is a very thorough reference guide to C, including useful information on ANSI C.
• Plauger, P., The Standard C Library (1991). Prentice-Hall, Englewood Cliffs, NJ, USA. ISBN
0-13-131509-9.
This is a comprehensive treatment of ANSI and ISO standards for the C Library.
• Koenig, A., C Traps and Pitfalls (1989). Addison-Wesley Publishing Company, Reading,
Massachusetts. ISBN 0-201-17928-8.
This explains how to avoid the most common traps in C programming. It provides informative
reading at all levels of competence in C.
See The DWARF Debugging Standard web site for the latest information about the Debugging With
Attributed Record Formats (DWARF) debug table standards and ELF specifications.