Chapter_01_See_Program_Running


Embedded Systems with ARM Cortex-M Microcontrollers in Assembly Language and C

Chapter 1
Computer and Assembly Language

Dr. Yifeng Zhu


Electrical and Computer Engineering
University of Maine

Spring 2018

Embedded Systems

Amazon Warehouse

Kiva Robot

Assembly Programs

http://www.andysinger.com/

Why do we learn Assembly?
• Assembly isn’t “just another language”.
• It helps you understand how the processor works.
• An assembly program can run faster than one written in a high-level language. Performance-critical code is often written in assembly.
  • Use profiling tools to find the performance bottleneck and rewrite that code section in assembly.
  • Latency-sensitive applications, such as aircraft controllers
  • Standard C compilers may not use some operations available on ARM processors, such as ROR (Rotate Right) and RRX (Rotate Right Extended).
• Hardware/processor-specific code
  • Processor booting code
  • Device drivers
  • A test-and-set atomic assembly instruction can be used to implement locks and semaphores.
• Cost-sensitive applications
  • Embedded devices where the size of code is limited, such as washing machine controllers and automobile controllers
• The best applications are written by those who've mastered assembly language or

Why ARM processors?
• As of 2005, 98% of the more than one billion mobile phones sold each year used ARM processors.
• As of 2009, ARM processors accounted for approximately 90% of all embedded 32-bit RISC processors.
• In 2010 alone, 6.1 billion ARM-based processors were shipped, representing 95% of smartphones, 35% of digital televisions and set-top boxes, and 10% of mobile computers.
• As of 2014, over 50 billion ARM processors have been produced.

[Figure labels: ADC, DAC, USB 2.0, touch sensing, USART/SPI/I2C, advanced timers, motor control, LCD driver]

iPhone 7
Teardown

A10 processor:
• 64-bit system on chip (SoC)
• ARMv8-A core

Apple Watch
• Apple S1 Processor
• 32-bit ARMv7-A compatible
• # of Cores: 1
• CMOS Technology: 28 nm
• L1 cache 32 KB data
• L2 cache 256 KB
• GPU PowerVR SGX543

Kindle Fire HD
• Texas Instruments OMAP 4460 dual-core processor
source: http://www.ifixit.com

Fitbit Flex Teardown
• STMicroelectronics 32L151C6 Ultra Low Power ARM Cortex-M3 Microcontroller
• Nordic Semiconductor nRF8001 Bluetooth Low Energy Connectivity IC
source: www.ifixit.com

Samsung Galaxy Gear
• STMicroelectronics STM32F401B ARM Cortex-M4 MCU with 128KB Flash
source: ifixit.com

Pebble Smartwatch
• STMicroelectronics STM32F205RE ARM Cortex-M3 MCU, with a maximum speed of 120 MHz
source: ifixit.com

Oculus VR
• Facebook’s $2 Billion Acquisition Of Oculus in 2014
• STMicroelectronics STM32F072VB ARM Cortex-M0 32-bit RISC Core Microcontroller
source: ifixit.com

HTC Vive
• STMicroelectronics 32F072R8 ARM Cortex-M0 Microcontroller
source: ifixit.com

Nest Learning Thermostat
• STMicroelectronics STM32L151VB ultra-low-power 32 MHz ARM Cortex-M3 MCU
source: ifixit.com

Samsung Gear Fit Fitness Tracker
• STMicroelectronics STM32F439ZI 180 MHz, 32-bit ARM Cortex-M4 CPU
source: ifixit.com

Memory
• Memory is arranged as a series of “locations”
• Each location has a unique “address”
• Each location holds a byte (byte-addressable)
  • e.g. the memory location at address 0x080001B0 contains the byte value 0x70, i.e., 112
• The number of locations in memory is limited
  • e.g. 4 GB of RAM
  • 1 Gigabyte (GB) = 2^30 bytes
  • 2^32 locations = 4,294,967,296 locations!
• Values stored at each location can represent either program data or program instructions
  • e.g. the value 0x70 might be the code used to tell the processor to add two values together

[Figure: Memory, drawn from address 0xFFFFFFFF (top) down to 0x00000000 (bottom)]
    Data (8 bits)    Address (32 bits)
    70               0x080001B0
    BC               0x080001AF
    18               0x080001AE
    01               0x080001AD
    A0               0x080001AC

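The byte-addressable model on this slide can be sketched in C. This is only an illustration, not part of the original slides: `BASE_ADDR`, `figure_bytes`, `read_byte`, and `address_count` are hypothetical names, and the five byte values are read off the memory figure (the text itself only states that 0x080001B0 holds 0x70).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: lowest address of the five bytes drawn in the figure. */
#define BASE_ADDR 0x080001ACu

/* Assumed figure contents, lowest address first (0x080001AC..0x080001B0). */
static const uint8_t figure_bytes[5] = {0xA0, 0x01, 0x18, 0xBC, 0x70};

/* Return the byte stored at addr, or -1 if addr is outside the modeled range. */
int read_byte(uint32_t addr) {
    if (addr < BASE_ADDR || addr - BASE_ADDR >= sizeof figure_bytes) {
        return -1;
    }
    return figure_bytes[addr - BASE_ADDR];
}

/* Number of distinct locations reachable with an address of `bits` bits
   (valid for bits < 64). */
uint64_t address_count(unsigned bits) {
    return (uint64_t)1 << bits;
}
```

With these assumptions, `read_byte(0x080001B0u)` gives 0x70 (decimal 112), and `address_count(32)` gives 4,294,967,296, matching the count on the slide.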
Computer Architecture
    Von-Neumann                      Harvard
    Instructions and data are        Data and instructions are
    stored in the same memory.       stored in separate memories.

ARM Cortex-M Series Family
• Von-Neumann: instructions and data are stored in the same memory.
• Harvard: data and instructions are stored in separate memories.

    Memory architecture    Core              Architecture
    Von-Neumann            ARM Cortex-M0     ARMv6-M
    Von-Neumann            ARM Cortex-M0+    ARMv6-M
    Von-Neumann            ARM Cortex-M1     ARMv6-M
    Von-Neumann            ARM Cortex-M23    ARMv8-M
    Harvard                ARM Cortex-M3     ARMv7-M
    Harvard                ARM Cortex-M4     ARMv7E-M
    Harvard                ARM Cortex-M7     ARMv7E-M
    Harvard                ARM Cortex-M33    ARMv8-M

Levels of Program Code

C Program -> (Compile) -> Assembly Program -> (Assemble) -> Machine Program

C Program
    int main(void){
        int i;
        int total = 0;
        for (i = 0; i < 10; i++) {
            total += i;
        }
        while(1); // Dead loop
    }
• High-level language
• Level of abstraction closer to problem domain
• Provides for productivity and portability

Assembly Program
• Assembly language
• Textual representation of instructions

Machine Program
    0010000100000000
    0010000000000000
    1110000000000001
    0100010000000001
    0001110001000000
    0010100000001010
    1101110011111011
    1011111100000000
    1110011111111110
• Hardware representation
• Binary digits (bits)
• Encoded instructions and data

See a Program Run

C Code
    int main(void)
    {
        int a = 0;
        int b = 1;
        int c;
        c = a + b;
        return 0;
    }

Assembly Code (produced by the compiler)
    MOVS r1, #0x00     ; int a = 0
    MOVS r2, #0x01     ; int b = 1
    ADDS r3, r1, r2    ; c = a + b
    MOVS r0, #0x00     ; set return value
    BX   lr            ; return

Machine Code (produced by the assembler)
    In Binary           In Hex
    0010000100000000    2100    ; MOVS r1, #0x00
    0010001000000001    2201    ; MOVS r2, #0x01
    0001100010001011    188B    ; ADDS r3, r1, r2
    0010000000000000    2000    ; MOVS r0, #0x00
    0100011101110000    4770    ; BX lr

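The link between the hex machine code and the assembly text on this slide can be checked by pulling the bit fields apart. The sketch below is not from the slides; it assumes the standard 16-bit Thumb encodings for `MOVS Rd, #imm8` (top five bits 00100) and `ADDS Rd, Rn, Rm` (top seven bits 0001100), and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded fields of a 16-bit Thumb "MOVS Rd, #imm8". */
typedef struct {
    int valid;       /* 1 if the halfword matches this encoding */
    unsigned rd;     /* destination register number */
    unsigned imm8;   /* 8-bit immediate */
} MovsImm;

/* Decoded fields of a 16-bit Thumb "ADDS Rd, Rn, Rm". */
typedef struct {
    int valid;
    unsigned rd, rn, rm;
} AddsReg;

MovsImm decode_movs_imm(uint16_t hw) {
    MovsImm d = {0, 0, 0};
    if ((hw >> 11) == 0x04u) {          /* bits 15..11 == 00100 */
        d.valid = 1;
        d.rd = (hw >> 8) & 0x7u;        /* bits 10..8 */
        d.imm8 = hw & 0xFFu;            /* bits 7..0 */
    }
    return d;
}

AddsReg decode_adds_reg(uint16_t hw) {
    AddsReg d = {0, 0, 0, 0};
    if ((hw >> 9) == 0x0Cu) {           /* bits 15..9 == 0001100 */
        d.valid = 1;
        d.rm = (hw >> 6) & 0x7u;        /* bits 8..6 */
        d.rn = (hw >> 3) & 0x7u;        /* bits 5..3 */
        d.rd = hw & 0x7u;               /* bits 2..0 */
    }
    return d;
}
```

For example, `decode_movs_imm(0x2201)` yields rd = 2 and imm8 = 1, and `decode_adds_reg(0x188B)` yields rd = 3, rn = 1, rm = 2, which agrees with `MOVS r2, #0x01` and `ADDS r3, r1, r2`.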
Processor Registers
• Fastest way to read and write
• Registers are within the processor chip
• A register stores a 32-bit value
• STM32L has
  • R0-R12: 13 general-purpose registers
  • R13: Stack pointer (shadow of MSP or PSP)
  • R14: Link register (LR)
  • R15: Program counter (PC)
  • Special registers (xPSR, BASEPRI, PRIMASK, etc.)

[Figure: register bank, each register 32 bits wide]
    General-purpose registers
        Low registers:     R0, R1, R2, R3, R4, R5, R6, R7
        High registers:    R8, R9, R10, R11, R12
    R13 (SP):              R13 (MSP), R13 (PSP)
    R14 (LR)
    R15 (PC)
    Special-purpose registers: xPSR, BASEPRI, PRIMASK, FAULTMASK, CONTROL

Program Execution
• Program Counter (PC) is a register that holds the memory address of the next instruction to be fetched from the memory.

1. Fetch the instruction at the PC address
2. Decode the instruction
3. Execute the instruction

Memory
    Data    Address
    4770    0x080001B4
    2000    0x080001B2
    188B    0x080001B0
    2201    0x080001AE
    2100    0x080001AC

PC = 0x080001B0
Instruction = 188B or 2000188B or 8B180020

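One way to reason about the three candidate values on this slide is to model memory byte by byte. The sketch below is an illustration under the assumption that halfwords are stored low byte first (little-endian, the usual Cortex-M configuration); `FLASH_BASE`, `flash`, `fetch_halfword`, and `read_word` are hypothetical names, and no bounds checking is done.

```c
#include <stdint.h>

/* Hypothetical: address of the first byte of the five-instruction program. */
#define FLASH_BASE 0x080001ACu

/* Bytes of the halfwords 2100 2201 188B 2000 4770, low byte first. */
static const uint8_t flash[10] = {
    0x00, 0x21,   /* 0x080001AC: 2100 */
    0x01, 0x22,   /* 0x080001AE: 2201 */
    0x8B, 0x18,   /* 0x080001B0: 188B */
    0x00, 0x20,   /* 0x080001B2: 2000 */
    0x70, 0x47    /* 0x080001B4: 4770 */
};

/* Fetch the 16-bit value at byte address addr (caller keeps addr in range). */
uint16_t fetch_halfword(uint32_t addr) {
    uint32_t off = addr - FLASH_BASE;
    return (uint16_t)(flash[off] | (flash[off + 1] << 8));
}

/* Read a 32-bit little-endian word starting at byte address addr. */
uint32_t read_word(uint32_t addr) {
    return (uint32_t)fetch_halfword(addr)
         | ((uint32_t)fetch_halfword(addr + 2) << 16);
}
```

Under these assumptions, the 16-bit instruction at PC = 0x080001B0 is 188B, a 32-bit read from the same address gives 2000188B, and the bytes in increasing address order are 8B 18 00 20.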
Three-stage pipeline:
Fetch, Decode, Execution
• Pipelining allows hardware resources to be fully utilized
• One 32-bit instruction or two 16-bit instructions can be fetched at a time.

[Figure: Pipeline of 32-bit instructions]

[Figure: Pipeline of 16-bit instructions. Against the clock, instructions i, i + 1, i + 2, and i + 3 each pass through Instruction Fetch, Instruction Decode, and Instruction Execution.]

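The benefit of overlapping Fetch, Decode, and Execution can be estimated with a small calculation. This is a simplified sketch, not from the slides: it assumes an ideal pipeline in which one instruction enters per clock and there are no stalls or branches, and the function names are hypothetical.

```c
/* Clock cycles to complete n instructions on an ideal pipeline with the
   given number of stages: the first instruction needs `stages` cycles,
   and each later one finishes one cycle after its predecessor. */
unsigned long pipelined_cycles(unsigned long n, unsigned long stages) {
    if (n == 0) {
        return 0;
    }
    return stages + (n - 1);
}

/* Clock cycles if each instruction runs through all stages before the
   next one starts (no overlap). */
unsigned long unpipelined_cycles(unsigned long n, unsigned long stages) {
    return n * stages;
}
```

For the four instructions drawn in the figure and three stages, this model gives 6 cycles with pipelining versus 12 without.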
Machine codes are stored in memory

[Figure: CPU (registers r0-r12, r13 sp, r14 lr, r15 pc, and ALU) connected to memory spanning 0x00000000 to 0xFFFFFFFF]
Memory
    Data    Address
    4770    0x080001B4
    2000    0x080001B2
    188B    0x080001B0
    2201    0x080001AE
    2100    0x080001AC

Fetch Instruction: pc = 0x080001AC
Decode Instruction: 2100 = MOVS r1, #0x00

[Figure: CPU and memory; memory contents unchanged]
Registers shown
    pc (r15) = 0x080001AC

Execute Instruction:
MOVS r1, #0x00

[Figure: CPU and memory; memory contents unchanged]
Registers shown
    pc (r15) = 0x080001AC
    r1       = 0x00000000

Fetch Next Instruction: pc = pc + 2
Decode & Execute: 2201 = MOVS r2, #0x01

[Figure: CPU and memory; memory contents unchanged]
Registers shown
    pc (r15) = 0x080001AE
    r2       = 0x00000001
    r1       = 0x00000000

Fetch Next Instruction: pc = pc + 2
Decode & Execute: 188B = ADDS r3, r1, r2

[Figure: CPU and memory; memory contents unchanged]
Registers shown
    pc (r15) = 0x080001B0
    r3       = 0x00000001
    r2       = 0x00000001
    r1       = 0x00000000

Fetch Next Instruction: pc = pc + 2
Decode & Execute: 2000 = MOVS r0, #0x00

[Figure: CPU and memory; memory contents unchanged]
Registers shown
    pc (r15) = 0x080001B2
    r2       = 0x00000001
    r1       = 0x00000000
    r0       = 0x00000000

Fetch Next Instruction: pc = pc + 2
Decode & Execute: 4770 = BX lr

[Figure: CPU and memory; memory contents unchanged]
Registers shown
    pc (r15) = 0x080001B4
    r2       = 0x00000001
    r1       = 0x00000000
    r0       = 0x00000000

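The fetch, decode, and execute steps walked through on these slides can be replayed with a tiny interpreter. This is a teaching sketch under stated assumptions, not a model of real hardware: it handles only the three encodings used here, ignores condition flags and the real pipeline's PC offset, and stops at `BX lr` instead of branching. `Cpu`, `CODE_BASE`, `code`, and `run` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical: address of the first instruction, as on the slides. */
#define CODE_BASE 0x080001ACu

/* Register file r0..r15; r[15] plays the role of pc. */
typedef struct {
    uint32_t r[16];
} Cpu;

/* The five halfwords from the slides, lowest address first. */
static const uint16_t code[5] = {0x2100, 0x2201, 0x188B, 0x2000, 0x4770};

/* Run from CODE_BASE until BX lr is executed. Returns the number of
   instructions executed, or -1 on an unknown opcode or a bad pc. */
int run(Cpu *cpu) {
    int executed = 0;
    cpu->r[15] = CODE_BASE;
    for (;;) {
        uint32_t index = (cpu->r[15] - CODE_BASE) / 2u;
        if (index >= 5u) {
            return -1;
        }
        uint16_t insn = code[index];          /* fetch */
        cpu->r[15] += 2u;                     /* pc = pc + 2 */
        executed++;
        if ((insn >> 11) == 0x04u) {          /* decode: MOVS Rd, #imm8 */
            cpu->r[(insn >> 8) & 0x7u] = insn & 0xFFu;
        } else if ((insn >> 9) == 0x0Cu) {    /* decode: ADDS Rd, Rn, Rm */
            cpu->r[insn & 0x7u] =
                cpu->r[(insn >> 3) & 0x7u] + cpu->r[(insn >> 6) & 0x7u];
        } else if (insn == 0x4770u) {         /* decode: BX lr */
            return executed;                  /* treated as "program done" */
        } else {
            return -1;
        }
    }
}
```

Starting from a zeroed `Cpu`, `run` executes five instructions and leaves r1 = 0, r2 = 1, r3 = 1, and r0 = 0, in line with the register values shown in the walk-through.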
Example:
Calculate the Sum of an Array

    int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int total;

    int main(void){
        int i;
        total = 0;
        for (i = 0; i < 10; i++) {
            total += a[i];
        }
        while(1);
    }

Example:
Calculate the Sum of an Array

[Figure: CPU connected to Instruction Memory (Flash), Data Memory (RAM), and I/O Devices]

Instruction Memory (Flash), starting memory address 0x08000000
    int main(void){
        int i;
        total = 0;
        for (i = 0; i < 10; i++) {
            total += a[i];
        }
        while(1);
    }

Data Memory (RAM), starting memory address 0x20000000
    int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int total;

Example:
Calculate the Sum of an Array

Instruction Memory (Flash), starting memory address 0x08000000
(machine code and assembly for the C program above)

    Machine code (binary)                               Assembly
    0010 0001 0000 0000                                 MOVS r1, #0x00
    0100 1010 0000 1000                                 LDR  r2, =total_addr
    0110 0000 0001 0001                                 STR  r1, [r2, #0x00]
    0010 0000 0000 0000                                 MOVS r0, #0x00
    1110 0000 0000 1000                                 B    Check
    0100 1001 0000 0111                         Loop:   LDR  r1, =a_addr
    1111 1000 0101 0001 0001 0000 0010 0000             LDR  r1, [r1, r0, LSL #2]
    0100 1010 0000 0100                                 LDR  r2, =total_addr
    0110 1000 0001 0010                                 LDR  r2, [r2, #0x00]
    0100 0100 0001 0001                                 ADD  r1, r1, r2
    0100 1010 0000 0011                                 LDR  r2, =total_addr
    0110 0000 0001 0001                                 STR  r1, [r2, #0x00]
    (not shown)                                         ADDS r0, r0, #1
    (not shown)                                 Check:  CMP  r0, #0x0A
    (not shown)                                         BLT  Loop
    (not shown)                                         NOP

Example:
Calculate the Sum of an Array

Data Memory (RAM)
    int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int total;

Assume the starting memory address of the data memory is 0x20000000.

    Memory address    Memory content    Array element
    (in bytes)
    0x20000000        0x0001            a[0] = 0x00000001
    0x20000002        0x0000
    0x20000004        0x0002            a[1] = 0x00000002
    0x20000006        0x0000
    0x20000008        0x0003            a[2] = 0x00000003
    0x2000000A        0x0000
    0x2000000C        0x0004            a[3] = 0x00000004
    0x2000000E        0x0000
    0x20000010        0x0005            a[4] = 0x00000005
    0x20000012        0x0000
    0x20000014        0x0006            a[5] = 0x00000006
    0x20000016        0x0000
    0x20000018        0x0007            a[6] = 0x00000007
    0x2000001A        0x0000
    0x2000001C        0x0008            a[7] = 0x00000008

a[8] = 0x00000009 and a[9] = 0x0000000A follow in the same pattern.

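The address pattern in this table, and the `LSL #2` in the load instruction of the assembly listing, both come from each `int` occupying four bytes. The following sketch is an illustration only: it assumes 4-byte little-endian ints and a[i] = i + 1 as in the example, and `DATA_BASE`, `element_address`, and `halfword_at` are hypothetical names.

```c
#include <stdint.h>

/* Starting address of the data memory assumed on the slide. */
#define DATA_BASE 0x20000000u

/* Byte address of a[i] when each element is 4 bytes: base + (i << 2). */
uint32_t element_address(uint32_t base, uint32_t i) {
    return base + (i << 2);
}

/* 16-bit content at an even byte address inside a[], assuming
   a[i] = i + 1 stored little-endian (low halfword at the lower address). */
uint16_t halfword_at(uint32_t addr) {
    uint32_t off = addr - DATA_BASE;
    uint32_t value = off / 4u + 1u;            /* value of the element */
    if (off & 2u) {
        return (uint16_t)(value >> 16);        /* high halfword */
    }
    return (uint16_t)(value & 0xFFFFu);        /* low halfword */
}
```

For instance, `element_address(DATA_BASE, 7)` is 0x2000001C, and `halfword_at(0x20000004u)` is 0x0002 while `halfword_at(0x20000006u)` is 0x0000.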
Loading Code and Data into Memory

• Stack is mandatory
• Heap is used only if dynamic allocation (e.g. malloc, calloc) is used.

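The difference between the two regions can be seen in a short C sketch. It is an illustration only, with hypothetical function names: the first function keeps its array in a local variable (typically placed on the stack), while the second obtains memory from the heap with `malloc` and must release it with `free`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sum 1..10 using a local array; no heap is needed. */
int sum_with_stack(void) {
    int a[10];
    int total = 0;
    for (int i = 0; i < 10; i++) {
        a[i] = i + 1;
    }
    for (int i = 0; i < 10; i++) {
        total += a[i];
    }
    return total;
}

/* Sum 1..n using a dynamically allocated array; this is the case
   that requires a heap. Returns -1 if the allocation fails. */
int sum_with_heap(size_t n) {
    int *a = malloc(n * sizeof *a);
    int total = 0;
    if (a == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)i + 1;
    }
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    free(a);
    return total;
}
```

A program built only from functions like `sum_with_stack` can run without any heap configured, which is why the heap is optional on small embedded targets.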
View of a Binary Program

[Three figures, from st.com]

STM32L4
[Figure, from st.com]

Memory Map