Chapter 6
ASSEMBLY LANGUAGE
[Figure: a C language program and GCC, with example processors Intel Pentium 4, Intel Core i7, and AMD Ryzen (x86-64)]
02/03/2019
[Figure: CPU (PC, registers) and memory]
Instructions
What instructions are available? What do they do?
How are they encoded?
Registers
How many registers are there?
How wide are they?
Memory
How do you specify a memory location?
Mainstream ISAs
[Figure: CPU (PC, registers, condition codes) and memory (code, data, stack), linked by addresses, data, and instructions]
Programmer-visible state
  PC: the Program Counter (rip in x86-64)
    Address of next instruction
  Named registers
    Together in “register file”
    Heavily used program data
  Condition codes
    Store status information about most recent arithmetic operation
    Used for conditional branching
  Memory
    Byte-addressable array
    Code and user data
    Includes the Stack (for supporting procedures)
General-purpose registers
32-bit  16-bit  high 8  low 8  role
eax     ax      ah      al     accumulator
ecx     cx      ch      cl     counter
edx     dx      dh      dl     data
ebx     bx      bh      bl     base
What is an Assembler?
Our platform
Character Set
Letters a..z A..Z
Digits 0..9
Special characters ? _ @ $ . ~
NASM (unlike most assemblers) is case-sensitive
with respect to labels and variables
It is not case-sensitive with respect to keywords,
mnemonics, register names, directives, etc.
Literals
Integers
Numeric digits (including A..F) with no decimal point
may include radix specifier at end:
b or y binary
d decimal
h hexadecimal
q octal
Examples
200 decimal (default)
200d decimal
200h hex
200q octal
10110111b binary
NASM Syntax
In order to refer to the contents of a memory location, use square
brackets.
In order to refer to the address of a variable, leave them out, e.g.,
mov eax, bar ;Refers to the address of bar
mov eax, [bar] ;Refers to the contents of bar
No need for the OFFSET directive.
NASM does not support the hybrid syntaxes such as:
mov eax,table[ebx] ;ERROR
mov eax,[table+ebx] ;O.K
mov eax,[es:edi] ;O.K
NASM does NOT remember variable types:
data dw 0 ;Data type defined as word (16 bits).
mov [data], 2 ;Doesn’t work: operand size not specified.
mov word [data], 2 ;O.K
Statements
Syntax:
[label[:]] [mnemonic] [operands] [;comment]
[ ] indicates optionality
Note that all parts are optional, so blank lines are legal
[label] can also be [name]
Variable names are used in data definitions
Labels are used to identify locations in code
Statements are free form; they need not be formed
into columns
Statement must be on a single line, max 128 chars
Example:
L100: add eax, edx ; add subtotal to total
Labels often appear on a separate line for code
clarity:
L100:
add eax, edx ; add subtotal to total
Type of statements
1. Directives
limit EQU 100 ; defines a symbol limit
%define limit 100 ; like C #define
2. Data Definitions
msg db 'Welcome to Assembler!'
db 0Dh, 0Ah
count dd 0
mydat dd 1,2,3,4,5
resd 100 ; reserves 400 bytes
3. Instructions
mov eax, ebx
add ecx, 10
Directives
Defines a symbol
Including files
%include "some_file"
If you know the C preprocessor, these are the same ideas as
#define SIZE 100 or #include "stdio.h"
Data formats
Example : L8 db 0, 1, 2, 3
Examples
mov al , [L2] ;move a byte at L2 to al
mov eax, L2 ;move the address of L2 to eax
mov [L1], ah ;move ah to the byte pointed to by L1
mov eax, dword 5
add [L2], eax ;double word at L2 = [L2] + eax
mov [L2], 1 ;does not work, why?
mov dword [L2], 1 ;works, why?
NASM directives
Examples using $
Example
Uninitialized Data
Program structure
SECTION .data ;data section
msg: db "Hello World",10 ;the string to print 10=newline
len: equ $-msg ;len is value, not an addr.
SECTION .text ;code section
global main ;for linker
main: ;standard gcc entry point
mov edx, len ;arg3, len of str. to print
mov ecx, msg ;arg2, pointer to string
mov ebx, 1 ;arg1, write to screen
mov eax, 4 ;system call number 4 = sys_write
int 0x80 ;interrupt 80 hex, call kernel
mov ebx, 0 ;exit code, 0=normal
mov eax, 1 ;exit command to kernel
int 0x80 ;interrupt 80 hex, call kernel
Program layout
Consists of 3 parts:
Text
Data
Bss
; include directives
segment .data
; DX directives
segment .bss
; RESX directives
segment .text
global asm_main
asm_main:
; instructions
segment .text
global asm_main
asm_main:
enter 0,0 ;setup
pusha ;save all registers
;put your code here
popa ;restore all registers
mov eax, 0 ;return value
leave
ret
; include directives
segment .data
; DX directives
segment .bss
; RESX directives
segment .text
global asm_main
asm_main:
enter 0,0
pusha
; Your program here
popa
mov eax, 0
leave
ret
Example
segment .data
integer1 dd 15 ; first int
integer2 dd 6 ; second int
segment .bss
result resd 1 ; result
segment .text
global asm_main
asm_main:
enter 0,0
pusha
I/O?
This is all well and good, but it’s not very interesting if we can’t
“see” anything
We would like to:
Be able to provide input to the program
Be able to get output from the program
Also, debugging will be difficult, so it would be nice if we could
tell the program to print out all register values, or to print out the
contents of some regions of memory
Doing all this requires quite a bit of assembly code and
requires techniques that we will not see for a while
The author of our textbook provides a nice I/O package that
we can just use, without understanding how it works for now
I/O
Examples
Modified example
First program
;
; file: first.asm
; First assembly program. This program asks for two integers as
; input and prints out their sum.
;
; To create executable:
;
; Using Linux and gcc:
; nasm -f elf first.asm
; gcc -o first first.o driver.c asm_io.o
%include "asm_io.inc" ;
; initialized data is put in the .data segment
segment .data
;
; These labels refer to strings used for output
prompt1 db "Enter a number: ", 0 ; don’t forget null
prompt2 db "Enter another number: ", 0
outmsg1 db "You entered ", 0
outmsg2 db " and ", 0
outmsg3 db ", the sum of these is ", 0
; uninitialized data is put in the .bss segment
;
segment .bss
;
; These labels refer to double words used to store the inputs;
;
input1 resd 1
input2 resd 1
; code is put in the .text segment
segment .text
global asm_main
asm_main:
enter 0,0 ; setup routine
pusha
mov eax, prompt1 ; print out prompt
call print_string
call read_int ; read integer
mov [input1], eax ; store into input1
mov eax, prompt2 ; print out prompt
call print_string
call read_int ; read integer
mov [input2], eax ; store into input2
mov eax, [input1] ; eax = dword at input1
add eax, [input2] ; eax += dword at input2
mov ebx, eax ; ebx = eax
dump_regs 1 ; dump out register values
dump_mem 2, outmsg1, 1 ; dump out memory
; next print out result message as series of steps
mov eax, outmsg1
call print_string ; print out first message
mov eax, [input1]
call print_int ; print out input1
mov eax, outmsg2
call print_string ; print out second message
mov eax, [input2]
C driver
#include "cdecl.h"
int PRE_CDECL asm_main( void ) POST_CDECL;
int main() {
int ret_status;
ret_status = asm_main();
return ret_status;
}
All segments and registers are initialized by the C system
I/O is done through the C standard library
Initialized data in .data
Uninitialized data in .bss
Code in .text
Stack later
Compiling
Linking
Assembling/Linking Process
The macro dump_regs prints out the bytes stored in all the
registers (in hex), as well as the bits in the FLAGS register
(only if they are set to 1)
dump_regs 13
‘13’ above is an arbitrary integer, that can be used to distinguish outputs
from multiple calls to dump_regs
The macro dump_mem prints out the bytes stored in memory
(in hex). It takes three arguments:
An arbitrary integer for output identification purposes
The address at which memory should be displayed
The number of 16-byte segments that should be displayed, minus one
For instance,
dump_mem 29, integer1, 3
prints out “29”, and then (3+1)*16 = 64 bytes
Example