02-Instruction Sets

The document provides an overview of instruction sets in embedded systems, focusing on ARM architecture and its characteristics, including RISC vs. CISC, memory architectures, and instruction execution methods. It details the ARM instruction set, including data processing instructions, conditional execution, and the use of a barrel shifter for efficient operations. Additionally, it covers the ARM register set, program counter behavior, and the handling of immediate values in instructions.

Instruction Sets

Chapter 2
COE 306: Introduction to Embedded Systems
Dr. Aiman El-Maleh
Computer Engineering Department
College of Computer Sciences and Engineering
King Fahd University of Petroleum and Minerals

Next . . .
• Computer Architecture Taxonomy
• ARM Instruction Set
• PIC16F Instruction Set
• TI C55x DSP
• TI C64x DSP

Instruction Sets COE 306– Introduction to Embedded System– KFUPM


slide 2
Von Neumann Architecture
• Memory holds data, instructions
• Central processing unit (CPU) fetches instructions from memory
• Allows self-modifying code

Harvard Architecture
• Has separate memories for data and program, which allows two simultaneous memory fetches
• Harvard architectures are widely used today
  ◦ The separation of program and data memories gives higher memory bandwidth, and therefore higher performance for digital signal processing (DSP)

Instruction Set Characteristics
• The instruction set of the computer defines the interface between software modules and the underlying hardware.
• Instruction set characteristics:
  ◦ Fixed vs. variable length
  ◦ Addressing modes
  ◦ Number of operands and type
  ◦ Types of operations supported
• We often characterize architectures by their word length: 4-bit, 8-bit, 16-bit, 32-bit, 64-bit, etc.
• Instruction set architecture: instruction set, memory, programmer-accessible registers

RISC vs. CISC
• Complex instruction set computer (CISC):
  ◦ a variety of instructions that may perform very complex tasks;
  ◦ many addressing modes.
• Reduced instruction set computer (RISC):
  ◦ fewer and simpler instructions;
  ◦ load/store;
  ◦ pipelined instructions.
• Early RISC designs substantially outperformed CISC designs.
• The performance gap between RISC and CISC has narrowed as CISC designs adopted RISC techniques to execute CISC instructions efficiently.

Little-Endian vs. Big-Endian
• Little-endian: least-significant byte stored first (at the lowest address)
• Big-endian: most-significant byte stored first (at the lowest address)
[Figure: byte layout of a word in big-endian and little-endian memory]
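The two byte orders can be illustrated with a short Python sketch. The 32-bit value 0x12345678 is an arbitrary example chosen here, not a value from the slides.

```python
# Lay out one 32-bit value in both byte orders.
value = 0x12345678

little = value.to_bytes(4, byteorder="little")  # least-significant byte first
big = value.to_bytes(4, byteorder="big")        # most-significant byte first

print(little.hex())  # 78563412
print(big.hex())     # 12345678
```

Reading the bytes back with the wrong byte order yields a different number, which is why endianness has to be agreed on whenever memory or data is shared between systems.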

Instruction Execution
• Single-issue: a single-issue processor executes one instruction at a time.
• Multiple-issue
  ◦ A superscalar processor uses specialized logic to identify, at run time, instructions that can be executed simultaneously
  ◦ A VLIW processor relies on the compiler to determine which combinations of instructions can legally be executed together
• VLIW is generally more energy-efficient than superscalar, because instruction scheduling is done by the compiler rather than by run-time hardware
• VLIW is widely used in signal processing
  ◦ Multimedia processing
  ◦ Processing multiple channels of signals

Development of the ARM Architecture

ARM Architecture Version Summary

ARM Architecture Version Summary

ARM Instruction Set
• RISC architecture: load/store
• 32-bit words and instructions
• ARM7: Von Neumann architecture (our focus)
• ARM9: Harvard architecture
• Cortex-M: either architecture, depending on the model
• ARM7: 32-bit addresses, byte-addressable
• Configurable endianness on power-up
• Most ARM processors implement two instruction sets
  ◦ 32-bit ARM Instruction Set
  ◦ 16-bit Thumb Instruction Set

ARM Processor Modes
• ARM has seven basic operating modes:
  ◦ User: unprivileged mode under which most tasks run
  ◦ FIQ: entered when a high-priority (fast) interrupt is raised
  ◦ IRQ: used for general-purpose interrupt handling
  ◦ Supervisor: a protected mode for the operating system, entered when the CPU is reset or a software interrupt instruction is executed
  ◦ Abort: used to handle memory access violations
  ◦ Undefined: entered when an undefined instruction is executed
  ◦ System: privileged mode for the operating system; it is not entered due to an exception

The ARM Register Set
• ARM has 37 registers

Note: System mode uses the User mode register set


The ARM Register Set
• ARM has 37 registers, all of which are 32 bits long.
  ◦ 1 dedicated program counter (pc)
  ◦ 1 dedicated current program status register (cpsr)
  ◦ 5 dedicated saved program status registers (spsr)
  ◦ 30 general purpose registers
• The current processor mode governs which of several banks is accessible. Each mode can access
  ◦ a particular set of r0-r12 registers
  ◦ a particular r13 (stack pointer, sp) and r14 (link register, lr)
  ◦ the program counter, r15 (pc)
  ◦ the current program status register, cpsr
• Privileged modes (except System) can also access
  ◦ a particular spsr (saved program status register)

Program Counter (r15)
• When the processor is executing in ARM state:
  ◦ All instructions are 32 bits wide
  ◦ All instructions must be word-aligned
  ◦ Therefore the PC value is stored in bits [31:2], with bits [1:0] undefined (an instruction cannot be halfword- or byte-aligned)
• When the processor is executing in Thumb state:
  ◦ All instructions are 16 bits wide
  ◦ All instructions must be halfword-aligned
  ◦ Therefore the PC value is stored in bits [31:1], with bit [0] undefined (an instruction cannot be byte-aligned)

Program Status Registers
• Condition code flags
  ◦ N = Negative result from ALU
  ◦ Z = Zero result from ALU
  ◦ C = ALU operation Carried out
  ◦ V = ALU operation oVerflowed
• Sticky Overflow flag (Q flag)
  ◦ Architecture 5TE/J only
  ◦ Indicates if saturation has occurred
• J bit
  ◦ Architecture 5TEJ only
  ◦ J = 1: Processor in Jazelle state
• Interrupt Disable bits
  ◦ I = 1: Disables the IRQ.
  ◦ F = 1: Disables the FIQ.
• T bit
  ◦ Architecture xT only
  ◦ T = 0: Processor in ARM state
  ◦ T = 1: Processor in Thumb state
• Mode bits
  ◦ Specify the processor mode
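As a rough illustration of how the four condition code flags arise, the following Python sketch models a 32-bit addition that sets flags. The function name `add_with_flags` is hypothetical and the model covers addition only; it is not a description of the actual ALU hardware.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b):
    """Model a 32-bit add that updates N, Z, C, V (hypothetical helper)."""
    a &= MASK32
    b &= MASK32
    full = a + b                # unbounded Python integer sum
    result = full & MASK32      # what fits in a 32-bit register
    n = result >> 31            # N: bit 31 of the result
    z = int(result == 0)        # Z: result is zero
    c = int(full > MASK32)      # C: carry out of bit 31
    # V: both operands have the same sign and the result's sign differs
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, {"N": n, "Z": z, "C": c, "V": v}

print(add_with_flags(0x7FFFFFFF, 1))  # signed overflow: N and V set
print(add_with_flags(0xFFFFFFFF, 1))  # wraps to zero: Z and C set
```

C describes the unsigned interpretation of the operands and V the signed one, which is why both flags are kept.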

ARM Instruction Set Format

Conditional Execution
• Most instruction sets only allow branches to be executed conditionally.
• However, by reusing the condition evaluation hardware, ARM effectively increases the number of available instructions.
• All instructions contain a condition field which determines whether the CPU will execute them.
• This removes the need for many branches, which stall the pipeline.
• Allows very dense in-line code, without branches.

The Condition Field

Data Processing Instructions
• Largest family of ARM instructions, all sharing the same instruction format.
• Contains:
  ◦ Arithmetic operations
  ◦ Comparisons (no results, just set condition codes)
  ◦ Logical operations
  ◦ Data movement between registers
• They each perform a specific operation on one or two operands.
  ◦ First operand is always a register (Rn)
  ◦ Second operand is sent to the ALU via the barrel shifter.

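Data Processing Instructions
Instruction word layout (bit positions):
  31-28   27-26   25   24-21    20   19-16   15-12   11-0
  cond    00      #    opcode   S    Rn      Rd      operand 2

  opcode: arithmetic/logic function
  S: set condition codes
  Rn: first operand register
  Rd: destination register

Operand 2 when bit 25 = 1 (immediate):
  11-8    7-0
  #rot    8-bit immediate
  #rot: immediate alignment

Operand 2 when bit 25 = 0, immediate shift length:
  11-7     6-5   4   3-0
  #shift   Sh    0   Rm
  #shift: immediate shift length; Sh: shift type; Rm: second operand register

Operand 2 when bit 25 = 0, register shift length:
  11-8   7   6-5   4   3-0
  Rs     0   Sh    1   Rm
  Rs: register shift length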

Arithmetic Operations
• Operations are:
  ◦ ADD: operand1 + operand2
  ◦ ADC: operand1 + operand2 + carry
  ◦ SUB: operand1 - operand2
  ◦ SBC: operand1 - operand2 + carry - 1
  ◦ RSB: operand2 - operand1
  ◦ RSC: operand2 - operand1 + carry - 1
• Syntax: <Operation>{<cond>}{S} Rd, Rn, Operand2
• Examples:
  ◦ ADD r0, r1, r2
  ◦ SUBGT r3, r3, #1
  ◦ RSBLES r4, r5, #5

Logical Operations
• Operations are:
  ◦ AND: operand1 AND operand2
  ◦ EOR: operand1 EOR operand2
  ◦ ORR: operand1 OR operand2
  ◦ BIC: operand1 AND NOT operand2 [i.e. bit clear]
• Syntax: <Operation>{<cond>}{S} Rd, Rn, Operand2
• Examples:
  ◦ AND r0, r1, r2
  ◦ BICEQ r2, r3, #7
  ◦ EORS r1, r3, r0

Comparisons
• The only effect of the comparisons is to update the condition flags, so there is no need to set the S bit.
• Operations are:
  ◦ CMP: operand1 - operand2, but result not written
  ◦ CMN: operand1 + operand2, but result not written
  ◦ TST: operand1 AND operand2, but result not written
  ◦ TEQ: operand1 EOR operand2, but result not written
• Syntax: <Operation>{<cond>} Rn, Operand2
• Examples:
  ◦ CMP r0, r1
  ◦ TSTEQ r2, #5

Data Movement
• Operations are:
  ◦ MOV: operand2
  ◦ MVN: NOT operand2
  Note that these make no use of operand1.
• Syntax: <Operation>{<cond>}{S} Rd, Operand2
• Examples:
  ◦ MOV r0, r1
  ◦ MOVS r2, #10
  ◦ MVNEQ r1, #0

The Barrel Shifter
• ARM doesn’t have actual shift instructions.
• A barrel shifter provides a mechanism to carry out shifts as part of other instructions.

Using the Barrel Shifter: The Second Operand

Second Operand: Using a Shifted Register
• Multiplication by constants can often be done by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.
  ◦ Multiplication by a constant equal to (power of 2) ± 1 can be done in one cycle.
• Example: r0 = r1 * 5
  r0 = r1 + (r1 * 4)
  => ADD r0, r1, r1, LSL #2
• Example: r2 = r3 * 105
  r2 = r3 * 15 * 7
  r2 = r3 * (16 - 1) * (8 - 1)
  => RSB r2, r3, r3, LSL #4 ; r2 = r3 * 15
  => RSB r2, r2, r2, LSL #3 ; r2 = r2 * 7
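The shift-and-add decompositions above can be checked with a small Python model. The helper names `add_shifted` and `rsb_shifted` are made up for this sketch; they mimic ADD and RSB with an LSL-shifted second operand on 32-bit values.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left within 32 bits."""
    return (x << n) & MASK32

def add_shifted(rn, rm, shift):
    """Models ADD rd, rn, rm, LSL #shift  ->  rn + (rm << shift)."""
    return (rn + lsl(rm, shift)) & MASK32

def rsb_shifted(rn, rm, shift):
    """Models RSB rd, rn, rm, LSL #shift  ->  (rm << shift) - rn."""
    return (lsl(rm, shift) - rn) & MASK32

r1 = 7
r0 = add_shifted(r1, r1, 2)   # r1 * 5
print(r0)                     # 35

r3 = 3
r2 = rsb_shifted(r3, r3, 4)   # r3 * 15
r2 = rsb_shifted(r2, r2, 3)   # (r3 * 15) * 7 = r3 * 105
print(r2)                     # 315
```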

Second Operand: Immediate Value
• No ARM instruction can contain a 32-bit immediate constant
  ◦ All ARM instructions are fixed as 32 bits long
• The data processing instruction format has 12 bits available for operand2
  ◦ These hold an 8-bit immediate value and a 4-bit rotate value
  ◦ The 4-bit rotate value (0-15) is multiplied by two to give a rotate-right amount in the range 0-30 in steps of 2
  ◦ Values that cannot be generated in this way will cause an error.
https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/

Second Operand: Immediate Value
• This gives us:
  ◦ 0 - 255 [0 - 0xff]
  ◦ 256, 260, 264, .., 1020 [0x100-0x3fc, step 4, 0x40-0xff ror 30]
  ◦ 1024, 1040, 1056, .., 4080 [0x400-0xff0, step 16, 0x40-0xff ror 28]
  ◦ 4096, 4160, 4224, .., 16320 [0x1000-0x3fc0, step 64, 0x40-0xff ror 26]
• These can be loaded using, for example:
  ◦ MOV r0, #0x40, 26 ; => MOV r0, #0x1000 (i.e. 4096)
• To make this easier, the assembler will convert to this form for us if simply given the required constant:
  ◦ MOV r0, #4096 ; => MOV r0, #0x1000 (i.e. 0x40 ror 26)
• The bitwise complements can also be formed using MVN:
  ◦ MOV r0, #0xFFFFFFFF ; assembles to MVN r0, #0
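The "8-bit value rotated right by an even amount" rule can be tried out with the Python sketch below. `encode_immediate` is a hypothetical helper that searches for a valid (imm8, rot) pair; a real assembler may pick a different but equivalent pair, since some constants have more than one encoding.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def encode_immediate(value):
    """Return (imm8, rot) with value == ror32(imm8, 2 * rot), or None."""
    value &= MASK32
    for rot in range(16):
        # Undo a rotate-right of 2*rot by rotating left the same amount.
        imm8 = ror32(value, (32 - 2 * rot) % 32)
        if imm8 <= 0xFF:
            return imm8, rot
    return None

print(encode_immediate(0xFF))        # (255, 0)
print(encode_immediate(4096))        # (1, 10), i.e. 0x01 ror 20
print(ror32(0x40, 26))               # 4096, the slide's 0x40 ror 26 form
print(encode_immediate(0x55555555))  # None: needs a literal pool load
print(encode_immediate(~0xFFFFFFFF)) # (0, 0): the complement is encodable, so MVN works
```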
Loading 32-bit Constants
• To allow larger constants to be loaded, the assembler offers a pseudo-instruction: LDR rd, =const
• This will either:
  ◦ Produce a MOV or MVN instruction to generate the value (if possible).
  ◦ Generate an LDR instruction with a PC-relative address to read the constant from a literal pool (constant data area embedded in the code).
• For example
  ◦ LDR r0,=0xFF => MOV r0,#0xFF
  ◦ LDR r0,=0x55555555 => LDR r0,[PC,#Imm12]
                          ...
                          DCD 0x55555555

Multiplication Instructions
• The basic ARM provides two multiplication instructions.
• Multiply
  ◦ MUL{<cond>}{S} Rd, Rm, Rs ; Rd = Rm * Rs
• Multiply Accumulate (does the addition for free)
  ◦ MLA{<cond>}{S} Rd, Rm, Rs, Rn ; Rd = (Rm * Rs) + Rn
• Restrictions on use:
  ◦ Rd and Rm cannot be the same register
  ◦ Cannot use PC.
• Operands can be considered signed or unsigned
  ◦ It is up to the user to interpret the result correctly.

Load / Store Instructions
  Load     Store    Size
  LDR      STR      Word
  LDRB     STRB     Byte
  LDRH     STRH     Halfword
  LDRSB             Signed byte load
  LDRSH             Signed halfword load
• Syntax:
  ◦ LDR{<size>}{<cond>} Rd, <address>
  ◦ STR{<size>}{<cond>} Rd, <address>
• All of these instructions can be conditionally executed by adding the appropriate condition code to the mnemonic.
  ◦ e.g. LDRBEQ

Encoding Format of LDR & STR

Address Accessed
• Address accessed by LDR/STR is specified by a base register and an offset
• For word and unsigned byte accesses, offset can be
  ◦ An unsigned 12-bit immediate value (i.e. 0 - 4095 bytes).
      LDR r0,[r1,#8]
  ◦ A register, optionally shifted by an immediate value
      LDR r0,[r1,r2]
      LDR r0,[r1,r2,LSL#2]
• Offset can be either added to or subtracted from the base register:
      LDR r0,[r1,#-8]
      LDR r0,[r1,-r2]
      LDR r0,[r1,-r2,LSL#2]

Address Accessed
• For halfword and signed halfword / byte, offset can be:
  ◦ An unsigned 8-bit immediate value (i.e. 0-255 bytes).
  ◦ A register (unshifted).
• For all accesses, offset can be applied
  ◦ before the transfer is made: pre-indexed addressing
    ▪ optionally auto-incrementing the base register, by postfixing the instruction with an ‘!’.
  ◦ after the transfer is made: post-indexed addressing
    ▪ causing the base register to be auto-incremented.
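The difference between the indexing modes can be sketched in Python with a register dictionary and a word-addressed memory dictionary. The helpers `ldr_pre` and `ldr_post` are illustrative names only, and the memory contents are made-up values.

```python
def ldr_pre(regs, mem, rd, rn, offset, writeback=False):
    """Models LDR rd,[rn,#offset] and, with writeback, LDR rd,[rn,#offset]!"""
    address = regs[rn] + offset   # offset applied before the transfer
    regs[rd] = mem[address]
    if writeback:                 # the '!' suffix updates the base register
        regs[rn] = address

def ldr_post(regs, mem, rd, rn, offset):
    """Models LDR rd,[rn],#offset (post-indexed)."""
    regs[rd] = mem[regs[rn]]      # transfer uses the unmodified base
    regs[rn] += offset            # base is always updated afterwards

mem = {0x100: 11, 0x108: 22}

regs = {"r0": 0, "r1": 0x100}
ldr_pre(regs, mem, "r0", "r1", 8)
print(regs)   # r0 = 22, r1 unchanged at 0x100

regs = {"r0": 0, "r1": 0x100}
ldr_pre(regs, mem, "r0", "r1", 8, writeback=True)
print(regs)   # r0 = 22, r1 = 0x108

regs = {"r0": 0, "r1": 0x100}
ldr_post(regs, mem, "r0", "r1", 8)
print(regs)   # r0 = 11, r1 = 0x108
```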

Pre- or Post-Indexed Addressing

ARM ADR Pseudo-Op
• An instruction cannot refer to an address directly.
• The ADR pseudo-op generates the instruction required to calculate the address:
    ADR r1,FOO
• It generates the value by performing arithmetic on the PC.

Arithmetic Operations Example
• C: x = (a + b) - c;
• ARM Assembly:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b, reusing r4
LDR r1,[r4] ; get value of b
ADD r3,r0,r1 ; compute a+b
ADR r4,c ; get address for c
LDR r2,[r4] ; get value of c
SUB r3,r3,r2 ; complete computation of x
ADR r4,x ; get address for x
STR r3,[r4] ; store value of x

Arithmetic Operations Example
• C: y = a * (b + c);
• ARM Assembly:
ADR r4,b ; get address for b
LDR r0,[r4] ; get value of b
ADR r4,c ; get address for c
LDR r1,[r4] ; get value of c
ADD r2,r0,r1 ; compute partial result
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
MUL r2,r0,r2 ; compute final value for y
ADR r4,y ; get address for y
STR r2,[r4] ; store y

Logical Operations Example
• C: z = (a << 2) | (b & 15);
• ARM Assembly:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
MOV r0,r0,LSL #2 ; perform shift
ADR r4,b ; get address for b
LDR r1,[r4] ; get value of b
AND r1,r1,#15 ; perform AND
ORR r1,r0,r1 ; perform OR
ADR r4,z ; get address for z
STR r1,[r4] ; store value for z

ARM Branches and Subroutines
• Branch: B{<cond>} label
  ◦ PC-relative: ±32 Mbyte range.
• Branch with Link: BL{<cond>} sub_routine_label
  ◦ Stores return address in LR
  ◦ Returning implemented by restoring the PC from LR
    ▪ MOV pc, lr

Control Flow: If Statement Example
• C:
    if (a > b) { x = 5; y = c + d; } else x = c - d;
• ARM Assembly:
; compute and test condition
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b
LDR r1,[r4] ; get value of b
CMP r0,r1 ; compare a with b
BLE fblock ; if a <= b, branch to false block

Control Flow: If Statement Example
; true block
MOV r0,#5 ; generate value for x
ADR r4,x ; get address for x
STR r0,[r4] ; store x
ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value of d
ADD r0,r0,r1 ; compute y
ADR r4,y ; get address for y
STR r0,[r4] ; store y
B after ; branch around false block

Control Flow: If Statement Example
; false block
fblock ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value for d
SUB r0,r0,r1 ; compute c-d
ADR r4,x ; get address for x
STR r0,[r4] ; store value of x
after ...

Control Flow: Switch Statement
• C:
    switch (test) { case 0: … break; case 1: … }
• ARM Assembly:
ADR r2,test ; get address for test
LDR r0,[r2] ; load value for test
ADR r1,switchtab ; load address for switch table
LDR r1,[r1,r0,LSL #2] ; index switch table
MOV pc, r1
switchtab DCD case0
DCD case1
...

Control Flow: For Loop
• C (FIR filter):
    for (i=0, f=0; i<N; i++)
        f = f + c[i]*x[i];
• ARM Assembly:
; loop initiation code
MOV r0,#0 ; use r0 for i
MOV r8,#0 ; use separate index for arrays
ADR r2,N ; get address for N
LDR r1,[r2] ; get value of N
MOV r2,#0 ; use r2 for f

Control Flow: For Loop
ADR r3,c ; load r3 with base of c
ADR r5,x ; load r5 with base of x
; loop body
loop LDR r4,[r3,r8] ; get c[i]
LDR r6,[r5,r8] ; get x[i]
MUL r4,r6,r4 ; compute c[i]*x[i]
ADD r2,r2,r4 ; add into running sum
ADD r8,r8,#4 ; add one word offset to array index
ADD r0,r0,#1 ; add 1 to i
CMP r0,r1 ; exit?
BLT loop ; if i < N, continue

Examples of Conditional Execution
• Use a sequence of several conditional instructions
    if (a==0) func(1);   [Assume a & parameter are stored in r0]
        CMP r0,#0
        MOVEQ r0,#1
        BLEQ func
• Set the flags, then use various condition codes
    if (a==0) x=0;   [Assume a is stored in r0 and x in r1]
    if (a>0) x=1;
        CMP r0,#0
        MOVEQ r1,#0
        MOVGT r1,#1
• Use conditional compare instructions
    if (a==4||a==10) x=0;   [Assume a is stored in r0 and x in r1]
        CMP r0,#4
        CMPNE r0,#10
        MOVEQ r1,#0

Block Data Transfer
• The Load and Store Multiple instructions (LDM / STM) allow between 1 and 16 registers to be transferred to or from memory.
• Base register used to determine where memory access should occur.
  ◦ 4 different addressing modes allow increment and decrement of the base register before or after the memory access.
  ◦ The base register can be incremented or decremented by one word for each register in the operation.
  ◦ Base register can be optionally updated following the transfer by appending an ‘!’ to it.
• Lowest register number is always transferred to/from lowest memory location accessed.

Block Data Transfer
• These instructions are efficient for:
  ◦ Saving and restoring context (stack)
  ◦ Moving large blocks of data around memory

Stack
• A stack is an area of memory which grows as new data is “pushed” onto the “top” of it, and shrinks as data is “popped” off the top.
• Two pointers define the current limits of the stack.
  ◦ A base pointer
    ▪ used to point to the “bottom” of the stack (the first location).
  ◦ A stack pointer
    ▪ used to point to the current “top” of the stack.

Stack Operation
• Traditionally, a stack grows down in memory, with the last “pushed” value at the lowest address.
• ARM also supports ascending stacks, where the stack structure grows up through memory.
• The value of the stack pointer can either:
  ◦ Point to the last occupied address (Full Stack)
    ▪ and so needs pre-decrementing (i.e. before the push)
  ◦ Point to the next address to be occupied, i.e. the first free location (Empty Stack)
    ▪ and so needs post-decrementing (i.e. after the push)

Stack Operation
• The stack type to be used is given by the postfix to the instruction:
  ◦ STMFD / LDMFD: Full Descending stack
  ◦ STMFA / LDMFA: Full Ascending stack
  ◦ STMED / LDMED: Empty Descending stack
  ◦ STMEA / LDMEA: Empty Ascending stack

Stack Examples

Function Calls
• Branch-and-link instruction: BL
  ◦ Uses r14 (lr) to store the return address (the address of the instruction following the BL)
• Procedure call stack
  ◦ Allows nested function calls
  ◦ Allows passing of parameters and return values
  ◦ Structure is left to the programmer
• Compilers use frames
  ◦ Also store local variables
  ◦ Stack pointer (sp) and frame pointer (fp)

ARM Procedure Call Standard (APCS)
• r0 – r3: first 4 parameters
  ◦ Additional parameters use the stack frame
• r0: return value
• r4 – r7: register variables
• r11: frame pointer (fp)
• r12: intra-procedure (ip) scratch register (used by the linker and volatile across function calls)
• r13: stack pointer (sp)
• r10: stack limit (sl)

Stacks and Subroutines
• Any registers needed can be pushed onto the stack at the start of a subroutine and popped off again at the end, so as to restore them before returning to the caller:
STMFD sp!,{r0-r12, lr} ; stack all registers
........ ; and the return address
LDMFD sp!,{r0-r12, pc} ; load all the registers
; and return automatically
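The push/pop pair above can be modelled in a few lines of Python. This is a sketch of a full descending stack only; `stmfd`, `ldmfd`, and the register values are illustrative, and loading into `pc` is simply treated as a register write.

```python
REG_ORDER = [f"r{i}" for i in range(13)] + ["sp", "lr", "pc"]

def stmfd(regs, mem, reglist):
    """Models STMFD sp!, {reglist}: pre-decrement, lowest register at lowest address."""
    sp = regs["sp"]
    for name in sorted(reglist, key=REG_ORDER.index, reverse=True):
        sp -= 4                   # full stack: decrement before storing
        mem[sp] = regs[name]
    regs["sp"] = sp               # '!' writes the new stack pointer back

def ldmfd(regs, mem, reglist):
    """Models LDMFD sp!, {reglist}: load from lowest address, post-increment."""
    sp = regs["sp"]
    for name in sorted(reglist, key=REG_ORDER.index):
        regs[name] = mem[sp]
        sp += 4
    regs["sp"] = sp

regs = {"r0": 1, "r1": 2, "sp": 0x1000, "lr": 0x8000, "pc": 0}
mem = {}
stmfd(regs, mem, ["r0", "r1", "lr"])   # subroutine entry
regs["r0"], regs["r1"] = 99, 98        # subroutine body clobbers registers
ldmfd(regs, mem, ["r0", "r1", "pc"])   # restore and return in one instruction
print(regs)  # r0 = 1, r1 = 2, sp = 0x1000 again, pc = 0x8000 (the saved lr)
```

Because the saved lr value is popped straight into pc, the restore instruction doubles as the return.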

PICmicro PIC16F Instruction Set
• 8-bit word size, 14-bit instructions
• Harvard architecture
• Instruction flash memory up to 8192 (i.e., 8K) words
• 13-bit PC
• Data memory includes up to
  ◦ 368 words SRAM
  ◦ 256 bytes EEPROM
• Low-power features:
  ◦ Sleep mode
  ◦ Support for multiple clocks
[Figure: Harvard architecture. The CPU connects to data memory over an 8-bit data bus and to program memory over a 14-bit bus.]

Instruction Memory Organization
• Program memory is divided into four 2K×14 pages
• Interrupt vectors in bottom of memory
• PC can be loaded from 8-level stack.
  ◦ Separate from program and data memory.
  ◦ Stack operates as a circular buffer

Register File Concept
• All of data memory is part of the register file, so any location in data memory may be operated on directly
• All peripherals are mapped into data memory as a series of registers
• W is the working register

Architecture Block Diagram

Data Memory Organization

Data Memory Organization

Status Register
IRP: Register Bank Select (used for indirect addressing)
    0 = Bank 0, 1; 1 = Bank 2, 3
RP1:RP0: Register Bank Select bits (used for direct addressing)
    00 = Bank 0, 01 = Bank 1, 10 = Bank 2, 11 = Bank 3
TO: Time-out bit
    0 = A WDT time-out occurred
PD: Power-down bit
    0 = SLEEP instruction executed
Z: Zero bit
    1 = Result of arithmetic operation is zero
DC: Digit carry / borrow bit
    1 = Carry out of 4th low-order bit occurred / No borrow occurred
C: Carry / borrow bit
    1 = Carry out of MSb occurred / No borrow occurred

Program Counter
• 13-bit PC can access up to 2^13 = 8192 words
• Contains address of NEXT instruction
• Lower byte accessible in data memory as PCL
• Upper byte indirectly accessible via PCLATH
• Runs freely across page boundaries
• PCL register can be used as an operand in instructions

PIC Instruction Set – Description Convention

PIC16 Instruction Set

Instruction Set Overview
Byte Oriented Operations
  Bits:   13-8     7   6-0
          Opcode   d   f f f f f f f
  Without a destination bit: Opcode followed by f f f f f f f (bits 6-0)
  d: Destination (W or F)
  f: File Register Address

Example: ADDWF 0x25, W
  0x25: File Register Address
  W: Destination

Instruction Set Overview
Bit Oriented Operations
  Bits:   13-10    9-7     6-0
          Opcode   b b b   f f f f f f f
  b: Bit Position (0-7)
  f: File Register Address

Example: BSF 0x25, 3
  0x25: File Register Address
  3: Bit Position

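Instruction Set Overview
Literal and Control Operations
  No operand:       bits 13-0   Opcode
  8-bit literal:    bits 13-8   Opcode,  bits 7-0    k k k k k k k k
  11-bit literal:   bits 13-11  Opcode,  bits 10-0   k k k k k k k k k k k
  k: Literal Value

Example: MOVLW 0x55
  0x55: Literal Value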
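The three field layouts can be exercised with a small Python encoder. The opcode bit patterns used for ADDWF, BSF, and MOVLW are not given on the slides; they are taken here as assumptions from the mid-range PIC16 instruction set summary and should be checked against the device datasheet. The function names are made up for this sketch.

```python
def encode_byte_op(opcode6, f, d):
    """Byte-oriented: 6-bit opcode | destination bit d | 7-bit file address f."""
    if not (0 <= f < 128 and d in (0, 1)):
        raise ValueError("f must fit in 7 bits and d must be 0 or 1")
    return (opcode6 << 8) | (d << 7) | f

def encode_bit_op(opcode4, f, b):
    """Bit-oriented: 4-bit opcode | 3-bit bit position b | 7-bit file address f."""
    if not (0 <= f < 128 and 0 <= b < 8):
        raise ValueError("f must fit in 7 bits and b in 3 bits")
    return (opcode4 << 10) | (b << 7) | f

def encode_literal_op(opcode6, k):
    """Literal: 6-bit opcode | 8-bit literal k."""
    if not 0 <= k < 256:
        raise ValueError("k must fit in 8 bits")
    return (opcode6 << 8) | k

# Assumed opcode values (see lead-in).
ADDWF, BSF, MOVLW = 0b000111, 0b0101, 0b110000
W, F = 0, 1  # destination bit: 0 = W register, 1 = file register

print(hex(encode_byte_op(ADDWF, 0x25, W)))   # 0x725
print(hex(encode_bit_op(BSF, 0x25, 3)))      # 0x15a5
print(hex(encode_literal_op(MOVLW, 0x55)))   # 0x3055
```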

PIC16 Addressing Modes
• Data Memory Access
  ◦ Direct: addwf <data_address>, <d>
  ◦ Indirect: addwf INDF, <d>
  ◦ Immediate (Literal): movlw <constant>
• Indirect addressing is controlled by the INDF and FSR (File Select Register) registers
  ◦ Access to INDF causes indirect load through FSR
• Program Memory Access
  ◦ Absolute: goto <program_address>
  ◦ Relative: addwf PCL,f

Data Memory: Direct Addressing

Data Memory: Indirect Addressing

8 bits from FSR + 1 bit from IRP = 9 bits
Effective 2^9 = 512 memory locations
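The 9-bit address formation can be written out as a short Python sketch. `indirect_address` follows the FSR + IRP rule stated above; `direct_address` is included for comparison and assumes the usual arrangement in which RP1:RP0 select one of four 128-byte banks and the instruction supplies the 7-bit file address. Both function names are hypothetical.

```python
def indirect_address(irp, fsr):
    """9-bit data address for indirect access: IRP supplies bit 8, FSR bits 7-0."""
    return ((irp & 1) << 8) | (fsr & 0xFF)

def direct_address(rp, f):
    """9-bit data address for direct access (assumed): RP1:RP0 supply bits 8-7,
    the instruction's 7-bit file register address supplies bits 6-0."""
    return ((rp & 0b11) << 7) | (f & 0x7F)

print(hex(indirect_address(1, 0x25)))   # 0x125
print(hex(direct_address(0b10, 0x25)))  # 0x125, same location reached via bank 2
print(indirect_address(1, 0xFF) + 1)    # 512 addressable locations
```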

Register Indirect Addressing

Control Flow Instructions
• GOTO: Unconditional branch
• INCFSZ: Increment f, skip if 0
• DECFSZ: Decrement f, skip if 0
• BTFSC: Bit test f, skip if clear
• BTFSS: Bit test f, skip if set
• CALL: Call subroutine
• RETURN: Return from subroutine
• RETLW: Return with literal in W
• RETFIE: Return from interrupt
• SLEEP: Go into standby mode

PC Absolute & Relative Addressing
• PC Absolute Addressing (Program Memory)
  ◦ Jump to another program memory location out of PC sequence
  ◦ Call a subroutine
  ◦ Used by the CALL and GOTO instructions
  ◦ 11 bits of the required 13 address bits are encoded in the instruction
  ◦ 2 additional bits will come from the PCLATH register
• PC-Relative Addressing is used when performing a Computed Goto operation
  ◦ Address to jump to is calculated by the program
  ◦ Computed address is written directly into the Program Counter
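How the 13-bit target is assembled can be sketched in Python. The slides only say that two bits come from PCLATH; the sketch assumes these are PCLATH<4:3>, and that a write to PCL during a computed goto takes the upper five PC bits from PCLATH<4:0>, as in the mid-range PIC16 family. The function names are illustrative.

```python
def absolute_target(pclath, k11):
    """GOTO/CALL target: PC<12:11> from PCLATH<4:3> (assumed), PC<10:0> from the opcode."""
    return (((pclath >> 3) & 0b11) << 11) | (k11 & 0x7FF)

def computed_goto_target(pclath, new_pcl):
    """Target after writing PCL: PC<12:8> from PCLATH<4:0> (assumed), PC<7:0> = value written."""
    return ((pclath & 0x1F) << 8) | (new_pcl & 0xFF)

print(hex(absolute_target(0b01000, 0x123)))   # 0x923: page 1, offset 0x123
print(hex(absolute_target(0b11000, 0x7FF)))   # 0x1fff: last word of page 3
print(hex(computed_goto_target(0x03, 0x10)))  # 0x310
```

Each 2-bit page value selects one of the four 2K-word pages, so PCLATH must be set correctly before a GOTO or CALL that crosses a page boundary.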

PC Absolute Addressing

PC Absolute Addressing

PC Relative Addressing

PC Relative Addressing: Lookup Table

Note: After fetching an instruction PCL gets incremented by 1.


TI C55x Organization
• Relatively high-performance digital signal processor
• Accumulator architecture
• Word: 16-bit, Longword: 32-bit
• Some instructions are bit-addressable
• Most registers are special-purpose
• Most registers are memory-mapped
  ◦ Registers can be accessed by name or address
• Can execute two instructions in parallel
• Has a variable-length instruction set (8, 16, 24, 32, 40, or 48 bits)

TI C55x Microarchitecture

Instruction Buffer Unit (I Unit)
• Receives program code into its instruction buffer queue and decodes instructions
• Passes data to the P unit, the A unit, and the D unit for the execution of instructions
• The CPU fetches 32 bits at a time from program memory
• 8 bytes are transferred from the queue to the instruction decoder

Program Flow Unit (P Unit)
• The program-address generation logic is responsible for generating 24-bit addresses for fetches from program memory
• The program control logic performs the following actions:
  ◦ Tests whether a condition is true for a conditional instruction
  ◦ Initiates interrupt servicing
  ◦ Controls execution of repetition instructions
  ◦ Manages instructions that are executed in parallel.

P-Unit Registers

Address-Data Flow Unit (A Unit)
• Data-Address Generation Unit (DAGEN)
  ◦ Generates all addresses for reads from or writes to data space and I/O space
• A-Unit Arithmetic Logic Unit (A-Unit ALU)
  ◦ Performs additions, subtractions, comparisons, Boolean logic operations, signed shifts, logical shifts, and absolute value calculations

A-Unit Registers
Data Computation Unit (D Unit)
 40-bit shifter
 40-bit ALU
 Two Multiply-and-Accumulate
Units (MACs)
 each MAC can perform a 17-bit
×17-bit multiplication (fractional
or integer) and a 40-bit addition
or subtraction with optional
32-/40-bit saturation
 D-Unit Bit Manipulation Unit
(D-Unit BIT)
 Extracts and expands bit fields,
and performs bit counting
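The multiply-and-accumulate behavior described above can be illustrated with a small software model. This is only a sketch: the function names `saturate` and `mac` are hypothetical, operands are treated as signed 17-bit integers, and fractional mode, rounding, and status flags are ignored.

```python
def saturate(value: int, bits: int) -> int:
    """Clamp a signed integer to the range of a two's-complement word of `bits` bits."""
    highest = (1 << (bits - 1)) - 1
    lowest = -(1 << (bits - 1))
    return max(lowest, min(highest, value))


def mac(acc: int, x: int, y: int, sat_bits: int = 40) -> int:
    """Return acc + x * y, saturated to `sat_bits` bits (32 or 40).

    x and y must fit in 17 signed bits, matching the 17-bit x 17-bit multiplier.
    """
    limit = 1 << 16
    if not (-limit <= x < limit and -limit <= y < limit):
        raise ValueError("operands must fit in 17 signed bits")
    return saturate(acc + x * y, sat_bits)
```

With 40-bit saturation, a sum that would exceed the accumulator range sticks at the largest representable value instead of wrapping around to a negative number.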
D-Unit Registers
 Accumulators
 AC0–AC3 40-bit Accumulators 0, 1, 2, and 3
 AC0L: low-order bits 0 – 15
 AC0H: high-order bits 16 – 31
 AC0G: guard bits 32-39
 Increase the range for intermediate calculations
 Transition Registers
 TRN0, TRN1 16-bit Transition registers 0 and 1
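The accumulator layout above can be made concrete with a short sketch that splits a 40-bit value into its low, high, and guard fields. The helper name `split_accumulator` is hypothetical and the accumulator is modeled as a plain unsigned 40-bit pattern.

```python
def split_accumulator(acc: int) -> dict:
    """Split a 40-bit accumulator pattern into L (bits 0-15), H (16-31), G (32-39)."""
    acc &= (1 << 40) - 1  # keep only the 40 accumulator bits
    return {
        "L": acc & 0xFFFF,
        "H": (acc >> 16) & 0xFFFF,
        "G": (acc >> 32) & 0xFF,
    }
```

For example, adding the largest positive 32-bit value four times gives 0x1FFFFFFFC, which no longer fits in 32 bits; the overflow lands in the guard field, which is how the guard bits extend the range of intermediate results.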
Memory Organization
 24-bit address space
 Data, program, and I/O spaces are mapped to the same
physical memory
 Program space: byte-addressable, 24-bit addresses
 Data space: word-addressable
 I/O space: 64K words
 Data space: 128 pages of 64K words each
 Part of the address space is implemented as on-chip memory
 Memory-mapped registers occupy the first 96 words of data page 0,
which correspond to the first 192 bytes of program space
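The relationship between word-addressed data space and byte-addressed program space can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the data word at word address w occupies byte addresses 2w and 2w + 1 of the same physical memory.

```python
WORDS_PER_PAGE = 1 << 16  # each data page holds 64K words
NUM_DATA_PAGES = 128


def word_to_byte_address(word_addr: int) -> int:
    """Byte address (program-space view) of the first byte of a data word."""
    return word_addr << 1


def data_page(word_addr: int) -> int:
    """Data page number (0-127) that contains a 23-bit word address."""
    return word_addr >> 16


def page_offset(word_addr: int) -> int:
    """Offset of a word address within its 64K-word data page."""
    return word_addr & 0xFFFF
```

Under this assumption, 128 pages of 64K words cover exactly the 2^24 bytes reachable with a 24-bit byte address, and the last memory-mapped register word (word 95) ends at byte address 191.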
C55x Addressing Modes
 Three addressing modes:
 Absolute addressing supplies an address in an instruction.
 Direct addressing supplies an offset.
 Indirect addressing uses a register as a pointer.
 Absolute Addressing
 k16: a 16-bit value concatenated with the 7-bit DPH register to form a 23-bit absolute address
 k23: a 23-bit unsigned value
 I/O absolute address: a 16-bit unsigned value, e.g. port(#1234)
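As a rough illustration of k16 absolute addressing, the sketch below forms the 23-bit address by placing DPH above the 16-bit constant. The function name `k16_absolute` is hypothetical, and DPH is assumed to be a 7-bit field that supplies the upper address bits.

```python
def k16_absolute(dph: int, k16: int) -> int:
    """Form a 23-bit data address from the 7-bit DPH and a 16-bit constant."""
    return ((dph & 0x7F) << 16) | (k16 & 0xFFFF)
```

For instance, with DPH = 0x03 and k16 = 0x1234 the resulting address is 0x031234, that is, offset 0x1234 within data page 3.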
Direct Addressing Modes
 DP direct
 Uses an offset relative to the data page register (DP) to access a
memory location; A_DP = DPH[22:16] : (DP + Doffset), where ":" denotes bit concatenation
 SP direct
 Uses an offset relative to the data stack pointer (SP) to access
stack values in data memory; A_SP = SPH[22:16] : (SP + Soffset)
 Register-bit direct
 Uses an offset from the LSB of a register to specify a bit address; used to
access one register bit or two adjacent register bits
 PDP direct
 Uses an offset relative to the peripheral data page register (PDP) to
specify an I/O address; A_PDP = PDP[15:7] : PDPoffset
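The address formation for the DP direct and PDP direct modes can be sketched in a few lines. The function names are hypothetical. The sketch assumes that DPH is a 7-bit field supplying address bits 22 to 16, that the sum DP + Doffset wraps within 16 bits, and that the I/O address is a 9-bit PDP field followed by a 7-bit offset.

```python
def dp_direct(dph: int, dp: int, doffset: int) -> int:
    """23-bit data address: DPH concatenated with the 16-bit sum DP + Doffset."""
    low = (dp + doffset) & 0xFFFF  # assumed to wrap inside the data page
    return ((dph & 0x7F) << 16) | low


def pdp_direct(pdp: int, poffset: int) -> int:
    """16-bit I/O address: 9-bit PDP concatenated with a 7-bit offset."""
    return ((pdp & 0x1FF) << 7) | (poffset & 0x7F)
```

SP direct addressing follows the same pattern as `dp_direct`, with SPH and SP in place of DPH and DP.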
Indirect Addressing Modes
 AR indirect addressing: uses one of eight auxiliary
registers (AR0–AR7) to point to data
 Dual AR indirect addressing: similar to AR indirect mode;
used with an instruction that accesses two or more data-
memory locations at the same time
 CDP indirect addressing: uses the coefficient data
pointer (CDP) to point to coefficient data
 Coefficient indirect addressing: similar to CDP indirect
mode; used for instructions that can access a coefficient
in data memory at the same time they access two other
data-memory values using the dual AR indirect
addressing mode
C55x Data Instruction Examples
 MOV src, dst: copies src to dst; each operand may be a register or a memory location
 ADD src, dst: dst = dst + src
 ADD dual(Lmem), ACx, ACy
 LO(ACy) = LO(Lmem) + LO(ACx)
 HI(ACy) = HI(Lmem) + HI(ACx)
 MPY src, dst: integer multiplication of 16-bit values
 MAC ACx, Tx, ACy
 ACy = ACy + (ACx * Tx)
 CMP src == dst, TC1
 Compares src with dst and sets the test control flag TC1 in the ST0_55 register accordingly
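The dual addition listed above treats a 32-bit operand as two independent 16-bit halves. The sketch below models that idea only; the function name `dual_add` is hypothetical, each half is assumed to wrap around on overflow, and guard bits, saturation, and status flags are ignored.

```python
def dual_add(lmem: int, acx: int) -> int:
    """Add the low and high 16-bit halves of two 32-bit values independently."""
    lo = ((lmem & 0xFFFF) + (acx & 0xFFFF)) & 0xFFFF
    hi = (((lmem >> 16) & 0xFFFF) + ((acx >> 16) & 0xFFFF)) & 0xFFFF
    return (hi << 16) | lo  # no carry travels from the low half into the high half
```

For example, `dual_add(0x0001FFFF, 0x00010001)` yields 0x00020000, whereas an ordinary 32-bit addition of the same operands would carry into the upper half and give 0x00030000.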
Control Flow Instructions
 Unconditional branch
 B ACx
 B label
 Conditional branch
 BCC label, cond
 Examples of cond
 ACx <= #0: content of ACx is less than or equal to 0
 !overflow(ACx): ACOVx bit is cleared to 0
 ARx != #0: content of ARx is not equal to 0
 CARRY: CARRY bit is set to 1
 TC1 ^ TC2: TC1 XOR TC2 is equal to 1
Loops and Procedure Calls
 Loops
 Single-instruction repeat registers: RPTC, CSR
 Block-repeat registers: BRC0, RSA0, REA0, BRC1, . . .
 2 levels of block-repeat
 Procedures:
 CALL target: Unconditional procedure call. Uses the stack
 CALLCC addr, cond: Conditional procedure call
 Fast-return vs. slow-return
TI C64x DSP
 High-performance VLIW DSP.
 Up to eight instructions per cycle
 Divided into two data paths A and B
 Fixed-point (integer) arithmetic; floating point is provided by the related C67x family
 Harvard architecture
 Load/store architecture
 Separate data and program on-chip memories
 Combined external memory
 Instruction grouping: fetch packet and execute packet
C64x Block Diagram
 Execution divided between two data paths
 Each data path has its own register file and 4 different
functional units
Functional Units
 .L1 and .L2 perform 32/40-bit arithmetic, 32-bit logical,
data packing/unpacking.
 .S1 and .S2 perform 32-bit arithmetic, 32/40-bit shift and
bit-field ops, 32-bit logical ops, branches, etc.
 .M1 and .M2 perform multiplications, bit interleaving,
rotation, Galois field multiplication, etc.
 .D1 and .D2 perform address calculations, loads/stores,
etc.
TI C64x VLIW Organization
 Fetch packet:
 A group of instructions fetched together
 8 words, aligned on 256-bit boundaries
 Up to 14 instructions per fetch packet
 Execute packet:
 A group of instructions that execute together
 Up to 8 instructions, if they use different functional units
 A fetch packet may result in multiple execute packets
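How one fetch packet yields several execute packets can be sketched with a small model. It assumes the C6000-family convention that the least significant bit of each 32-bit instruction word (the p-bit) is 1 when the next instruction executes in parallel with it. The function name `split_execute_packets` and the sample words are made up for illustration, and compact 16-bit instructions are not modeled.

```python
def split_execute_packets(words: list) -> list:
    """Group the instruction words of a fetch packet into execute packets.

    A word whose p-bit (bit 0) is 0 ends the current execute packet.
    """
    packets, current = [], []
    for word in words:
        current.append(word)
        if word & 1 == 0:  # p-bit clear: the next word starts a new packet
            packets.append(current)
            current = []
    if current:  # chain left open at the end of the fetch packet
        packets.append(current)
    return packets
```

With the sample words `[0x11, 0x21, 0x30, 0x40, 0x51, 0x61, 0x71, 0x80]`, the model produces three execute packets of 3, 1, and 4 instructions, so this one fetch packet takes three issue steps.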
Summary
 Both the von Neumann and Harvard architectures are in
common use today
 ARM is a load-store architecture. It provides a few
relatively complex instructions, such as saving and
restoring multiple registers
 The PIC16F is a very small, efficient microcontroller
 The C55x provides a number of architectural features to
support the arithmetic loops that are common in digital
signal processing code
 The C64x organizes instructions into execute packets
to enable parallel execution