0% found this document useful (0 votes)
18 views37 pages

Coal 3

Uploaded by

Muhammad Taaha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
18 views37 pages

Coal 3

Uploaded by

Muhammad Taaha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 37

Assembly Language

Fundamentals
Muhammad Kariz Kamal
OUTLINES
•Basic Language Elements

•Assembling, Linking, and Running Programs

•Defining Data

•Symbolic Constants
3.1 BASIC LANGUAGE
ELEMENTS

main PROC

mov eax, 5 ; move 5 to the EAX register


add eax, 6 ; add 6 to the EAX register

main ENDP

AddTwo program
.data
sum DWORD 0

.code
xyz PROC
mov eax, 15+9
mov var1, ‘A’
add eax,16
mov sum,eax
ret
xyz ENDP
Integer Literals
•An integer literal (also known as an integer constant) is made up of
an optional leading sign, one or more digits, and an optional radix
character that indicates the number’s base:
[{+ | - }] digits [ radix ]
•Real Number Literals
[sign] integer.[integer][exponent]
Examples
• 2., +3.0, -44.2E+05, 26.E5

•A character literal is a single character enclosed in single or double


quotes. The assembler stores the value in memory as the character’s
binary ASCII code.

• E.g. ‘A’, “x”.


•A string literal is a sequence of characters (including spaces)
enclosed in single or double quotes.
• E.g. 'ABC‘, 'X‘, “Good night, Gracie“

•Reserved words have special meaning and can only be used in their
correct context (not case sensitive).
• Instruction mnemonics, Register names, Directives
• Attributes, which provide size and usage information for variables and operands.
Examples are BYTE and WORD
• Operators, used in constant expressions
• Predefined symbols, such as @data, which return constant integer values at assembly
time.
•An identifier is a programmer-chosen name. It might identify a
variable, a constant, a procedure, or a code label.
•Rules:
• They may contain between 1 and 247 characters.
• They are not case sensitive.
• The first character must be a letter (A..Z, a..z), underscore (_), @ , ?, or $.
Subsequent
characters may also be digits.
• An identifier cannot be the same as an assembler reserved word.

•An instruction is a statement that becomes executable when a


program is assembled.
[label:] mnemonic [operands] [; comment]
I1: MOV AL, 0EFh ; copies 0EFh into AL
INC AL
RET

JMP I1
•Labels act as place markers
• marks the address (offset) of code and data

Data label
count DWORD 100
Code label
target of jump and loop instructions

target:
mov ax, bx
add ax, bx

jmp target
•Instruction Mnemonics:

•Operands: a value that is used for input or output for an instruction:


MOV AL, 15+6 ; comment
MOV X, AL
MOV Y, 29
•A directive is a command embedded in the source code that is
recognized and acted upon by the assembler.

• .data

• .code

• .stack
DIRECTIVE VS
INSTRUCTION
myVar DWORD 26 ;DWORD directive tells the assembler to reserve
space in the program for a doubleword variable.

mov eax,myVar ; The MOV instruction, on the other hand, executes


at runtime, copying the contents of myVar to the EAX
register
3.2 EXAMPLE
1: ; AddTwo.asm - adds two 32-bit integers
2: ; Chapter 3 example
3:
4: .386 ; take advantage of the extended capabilities of the Intel
;80386 processor and later processors that support
;these features.
5: .model flat,stdcall
6: .stack 4096
7: ExitProcess PROTO
9: .code
10: main PROC
11: mov eax,5 ; move 5 to the eax register
12: add eax,6 ; add 6 to the eax register
13:
14: INVOKE ExitProcess,0
15: main ENDP
16: END main
•.386 directive identifies the program as a 32-bit program that can
access 32-bit registers and addresses.

•.model flat, stdcall selects the programs memory model,


and identifies the calling convention.

•The stdcall keyword tells the assembler how to manage the


runtime stack when procedures are called.
• It is a calling convention, that is a scheme for how subroutines receive parameters
from their caller and how they return a result.
•.stack 4096 sets aside 4096 bytes of storage for the runtime stack.
•It tells the assembler how many bytes of memory to reserve for the
program’s runtime stack.
•Stack are used :
1. to hold passed parameters
2. to hold the address of the code that called the function. The CPU uses
this address to return when the function call finishes, back to the spot
where the function was called.
3. Stack can hold local variables.

•.code marks the beginning of the code area of a program.


•When your program is ready to finish, it calls ExitProcess that
returns an integer that tells the operating system that your program
worked just fine.

•The ENDP directive marks the end of a procedure.


• Our program had a procedure named main, so the endp must use the same name.

•Line 16 uses the END directive to mark the last line to be assembled
(end of program), and it identifies the program entry point (main).
• If you add any more lines to a program after the END directive, they will be ignored
by the
assembler.
ASSEMBLING, LINKING,
AND RUNNING
PROGRAMS
DEFINING DATA
•The assembler recognizes a basic set of intrinsic data types, which
describe types in terms of their size.

[name] directive initializer [, initializer]...

•Initializer: At least one initializer is required in a data definition,


even if it is zero.
.data
sum DWORD 0

•If you prefer to leave the variable uninitialized (assigned a random


value), the ? symbol can be used as the initializer
sum DWORD ?
Defining BYTE and SBYTE Data
• Each initializer must fit into 8 bits of storage

.data
value1 BYTE 'A'; character constant
value2 BYTE 0 ; smallest unsigned byte
value3 BYTE 255; largest unsigned byte
value4 SBYTE -128 ; smallest signed byte
value5 SBYTE +127 ; largest signed byte
value6 BYTE ? ; uninitialized byte
Offset Value
Multiple Initializers list 0000 10
0001 20
•If multiple initializers are used in the same data 0002 30
definition, its label refers only to the offset of the first 0003 40
initializer.
list BYTE 10,20,30,40
Within a single data definition, its initializers can use different radixes.
Character and string literals can be freely mixed. In the following example,
list1 and list2 have the same contents:

list1 BYTE 10, 32, 41h, 00100010b


list2 BYTE 0Ah, 20h, 'A', 22h
Offset Value
array1 0000 10
0001 20
0002
0003 40
0004 50
0005 60
Examples that use multiple initializers:
0006 70
0007 80
.data 0008 81
Array1 BYTE 10h,20h,?,40h 0009 00
BYTE 50h,60h,70h,80h 000A 82
000B 00
WORD 81h,82h,83h 000C 83
list3 BYTE ?,32,41h, 000D 00
list3 000E
000F 32
.code 0010 41
Mov al, array1 + 7 ; al = 80h
•Defining Strings:
•The most common type of string ends with a null byte (containing 0),
called a null-terminated string.
greeting1 BYTE "Good afternoon",0
greeting2 BYTE 'Good night',0
•Each character uses a byte of storage.
•The rule that byte values must be separated by commas does not
apply on strings.
•The DUP operator allocates storage for multiple data items, using a
integer expression as a counter

x BYTE 20 DUP(0) ;20 bytes, all equal to zero


y BYTE 20 DUP(?) ;20 bytes, uninitialized
S BYTE 4 DUP("STACK") ;20 bytes: "STACKSTACKSTACKSTACK“

Z BYTE 7 Dup (2 Dup(17,19)); 17 19 17 19

var4 10
0
var4 BYTE 10,3 DUP(0), 20 0
0
20
•Defining WORD and SWORD Data
•The WORD (define word) and SWORD (define signed word)
directives create storage for one or more 16-bit integers:

word1 WORD 65535 ; largest unsigned value


word2 SWORD –32768 ; smallest signed value
word3 WORD ? ; uninitialized, unsigned
word4 WORD "AB" ; double characters
myList WORD 1,2,3,4,5 ; array of words
array WORD 5 DUP(?) ; uninitialized array
•The legacy DW directive can also be used:
val1 DW 65535 ; unsigned
val2 DW -32768 ; signed

myList WORD 1,2,3,4,5


Defining DWORD and SDWORD Data
•Storage definitions for signed and unsigned 32-bit integers:
val1 DWORD 12345678h ; unsigned
val2 SDWORD –2147483648 ; signed
val3 DWORD 20 DUP(?) ; unsigned array
val4 SDWORD –3,–2,–1,0,1 ; signed array

myList DWORD 1,2,3,4,5


LITTLE ENDIAN ORDER

•All data types larger than a byte store their individual bytes in reverse
order. The least significant byte occurs at the first (lowest) memory
address.

val1 DWORD 12345678h


DECLARING UNITIALIZED
DATA
Declaring Uninitialized Data
•Use the .data? directive to declare an uninitialized data segment:
.data?

•When defining a large block of uninitialized data, the .DATA?


directive reduces the size of a compiled program
•Within the segment, declare variables with "?" initializers:
smallArray DWORD 10 DUP(?)

Advantage: the program's EXE file size is reduced.


.data
smallArray DWORD 10 DUP(0) ; 40 bytes
.data?
bigArray DWORD 5000 DUP(?) ; 20,000 bytes, not
initialized

The following code, on the other hand, produces a compiled program


20,000 bytes larger:

.data
smallArray DWORD 10 DUP(0) ; 40 bytes
bigArray DWORD 5000 DUP(?) ; 20,000 bytes
3.5 SYMBOLIC
CONSTANTS
•A symbolic constant (or symbol definition) is created by associating an
identifier (a symbol) with an integer expression or some text.
•Symbols do not reserve storage.

•Equal-Sign Directive
name = expression
expression is a 32-bit integer (expression or constant)
may be redefined
name is called a symbolic constant
•Current Location Counter : the symbol $, is the current location counter,
that returns offset of current location:
selfPtr DWORD $
• This statement declares a variable named selfPtr and initializes it with the variable’s offset
value.

•Calculating the Sizes of Arrays and Strings


•As $ operator returns the offset associated with the current program
statement, it can be used to calculate the size of array/string:
list BYTE 10,20,30,40
ListSize = ($ - list)
ListSize must follow immediately after list.
SUMMARY
•Integer expression, character constant

•Directive – interpreted by the assembler; Instruction – executes at runtime

•Code, Data, and Stack Segments

•Source, Object, Executable files

•Data definition directives:


• BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, QWORD, TBYTE

•DUP operator, location counter ($)

•Symbolic constant

You might also like