L4 The Elements of The Assembly Language and The Format of The Executable Programs
4. The elements of the assembly language and the format of the executable programs
INTRODUCTION
The purpose of this paper is to present the format of instructions in
assembly language, the most important pseudo-instructions for working with
segments and for reserving data, and the structure of the .COM and .EXE
executable programs.
The elements of the TASM assembly language
The format of the instructions
An instruction is written on one line of at most 128 characters, the
general form being:
[<label>:] [<opcode> [<operands>]] [;<comments>]
where:
<label> is a name of at most 31 characters (letters, digits or the special characters
_,?,@,..), the first character being a letter or one of the special characters. Each label
has an attached value and a relative address within the segment to which it belongs.
<opcode> is the mnemonic of the instruction.
<operands> is the operand (or operands) of the instruction, written according
to the syntax that the instruction requires. An operand may be a constant, a symbol or
an expression containing these.
<comments> is arbitrary text preceded by the character ; .
Blank lines and any number of spaces may be inserted. These
facilities are used to keep the program legible.
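The general form above can be illustrated with a short sketch. It is only an illustration: the names CODE, START and DONE are hypothetical, and the segment directives and the DOS exit call (INT 21H, function 4CH) are included only so that the fragment assembles as a complete program.

```asm
; Illustrative sketch; CODE, START and DONE are hypothetical names.
CODE    SEGMENT PARA PUBLIC 'CODE'
        ASSUME CS:CODE
START:  MOV  AX, 5        ; label, opcode, two operands, comment
        ADD  AX, 3        ; no label: opcode, operands, comment
        NOP               ; opcode without operands

                          ; blank and comment-only lines are allowed
DONE:   MOV  AH, 4CH      ; DOS function 4CH: terminate the program
        INT  21H
CODE    ENDS
        END  START
```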
The specification of constants
Numerical constants are written as a sequence of digits, the first of which must be
between 0 and 9 (for example, if a hexadecimal number starts with a
letter, a 0 is placed in front of it). The base of the number is specified by
a letter at the end of the number (B for binary, Q for octal, D for decimal, H for
hexadecimal; without an explicit specification, the number is considered decimal).
Examples: 010010100B (binary), 26157Q (octal), 7362D (or 7362), 0AB3H.
Character constants and character strings are written between quotation marks (" ") or
apostrophes (' ').
Examples: "row of characters", 'row of characters'
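As a sketch of how these notations look in a data definition, the fragment below stores the same value (42) in several bases, followed by a character and a string constant. The segment name DATA and all variable names are hypothetical.

```asm
; Illustrative sketch; the segment and variable names are hypothetical.
DATA    SEGMENT PARA PUBLIC 'DATA'
BIN_C   DB  00101010B             ; binary
OCT_C   DB  52Q                   ; octal
DEC_C   DB  42D                   ; decimal, explicit suffix
DEF_C   DB  42                    ; decimal by default
HEX_C   DB  2AH                   ; hexadecimal
HEX_W   DW  0AB3H                 ; leading 0, because the number starts with a letter
CH_C    DB  'A'                   ; character constant
STR_C   DB  "row of characters"   ; string constant
DATA    ENDS
        END
```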
Symbols
Symbols represent memory locations. They can be labels or variables.
Every symbol has the following attributes:
- the segment in which it is defined
- the offset (the relative address within the segment)
- the type of the symbol (determined by its definition)
Labels
Labels may be defined only in the code area and may then be used as
operands of the CALL or JMP instructions.
The attributes of labels are:
- the segment (generally CS) is the address of the paragraph at which the
segment containing the label begins. When the label is referenced, this
value is found in CS (the actual value is known only at run time)
- the offset is the distance in octets from the beginning of the
segment in which the label is defined to the label itself
- the type determines how the label is referenced; there are two types:
NEAR and FAR. A NEAR reference stays within the segment (only the offset is
specified), while a FAR reference also specifies the segment (segment:offset).
Labels are defined at the beginning of the source line. If the label is
followed by the : character, it is of NEAR type.
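The two label types can be sketched as follows. This is a minimal illustration rather than part of the original example: the names are hypothetical, and the FAR label is declared with the LABEL pseudo-instruction, which is an assumption about how one would typically obtain a FAR label inside a single segment.

```asm
; Illustrative sketch; CODE, START, NEAR_LBL and FAR_LBL are hypothetical names.
CODE    SEGMENT PARA PUBLIC 'CODE'
        ASSUME CS:CODE
START:  JMP  NEAR_LBL            ; NEAR reference: only the offset is used
NEAR_LBL:                        ; the ':' makes this label NEAR
        JMP  FAR PTR FAR_LBL     ; FAR reference: segment:offset
FAR_LBL LABEL FAR                ; label declared explicitly as FAR
        MOV  AH, 4CH             ; DOS function 4CH: terminate the program
        INT  21H
CODE    ENDS
        END  START
```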
Variables
Variables (data labels) are defined with storage-reservation
pseudo-instructions.
The attributes of variables are:
- the segment and the offset, as for labels, with the distinction that other
segment registers may be involved
- the type is a constant that gives the length (in octets) of the reserved area:
BYTE (1), WORD (2), DWORD (4), QWORD (8), TBYTE (10), STRUC (defined
by the user), RECORD (2).
Examples:
DATW  LABEL WORD          ; label for type conversion; placed before DAT so that both names share one address
DAT   DB    0FH, 07H      ; occupies one octet each, 2 in total
      MOV   AL,DAT        ; AL<-0FH
      MOV   AX,DATW       ; AL<-0FH, AH<-07H
      MOV   AX,DAT        ; type error
Expressions
Expressions are built from constants, symbols, pseudo-operators
and operators (for variables only the address is considered, not the content,
because at assembly time only the address is known).
Operators (in order of priority)
1. Brackets () []
   . (dot) - structure_name.variable binds the name of a structure
   to its elements
   LENGTH - the number of elements in the area
   SIZE - the length of the area in octets
   WIDTH - the width of a field of a RECORD
Example: if the following is declared
EXP DW 100 DUP (1)
then:
LENGTH EXP has the value 100
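A minimal sketch of these operators used as immediate operands is given below. By the definitions above, SIZE EXP should evaluate to 200 (100 elements of 2 octets each). The segment names and the surrounding program skeleton are hypothetical.

```asm
; Illustrative sketch; DATA, CODE and START are hypothetical names.
DATA    SEGMENT PARA PUBLIC 'DATA'
EXP     DW  100 DUP (1)          ; 100 words, each initialised to 1
DATA    ENDS

CODE    SEGMENT PARA PUBLIC 'CODE'
        ASSUME CS:CODE, DS:DATA
START:  MOV  AX, DATA
        MOV  DS, AX
        MOV  CX, LENGTH EXP      ; CX <- 100 (number of elements)
        MOV  BX, SIZE EXP        ; BX <- 200 (length of the area in octets)
        MOV  AH, 4CH             ; DOS function 4CH: terminate the program
        INT  21H
CODE    ENDS
        END  START
```

Both values are computed at assembly time, so they appear in the program as ordinary constants.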
Any segment is identified by a name and a class, both specified by the user.
When defined, segments receive a series of attributes that describe, for the
assembler and for the link editor, the relations between segments.
Segments are defined with:
segment_name SEGMENT [align_type] [combine_type] [class]
    ...
segment_name ENDS
where:
segment_name is the segment's name, chosen by the user (the name is associated
with a value corresponding to the segment's position in memory).
align_type is the segment's alignment type (in memory). It may take the
values:
PARA (paragraph alignment, multiple of 16 octets)
BYTE (octet alignment)
WORD (word alignment)
PAGE (page alignment, multiple of 256 octets)
combine_type is in effect the segment's type; it tells the
link editor how to combine segments that have the same name. It may be:
PUBLIC - specifies concatenation
COMMON - specifies overlapping
AT expression - specifies that the segment is located at the address
expression*16
STACK - shows that the current segment is part of the stack segment
MEMORY - specifies that the segment is placed as the last segment of the
program
class is the segment's class; the link editor places segments
of the same class contiguously, in the order in which they appear. It is recommended
to use the classes code, data, constant, memory and stack.
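The attributes above might be combined as in the following sketch. All segment and symbol names are hypothetical. The value 0B800H used with AT is an assumption chosen for illustration (on a PC it is the segment of colour text-mode video memory), and a segment defined with AT may hold only labels, not initialised data.

```asm
; Illustrative sketch; all names are hypothetical.
DATA    SEGMENT PARA PUBLIC 'DATA'     ; paragraph aligned, concatenated with other DATA segments
MSG     DB  'text'
DATA    ENDS

STK     SEGMENT PARA STACK 'STACK'     ; part of the stack segment
        DW  64 DUP (?)                 ; 128 octets of stack
STK     ENDS

VIDEO   SEGMENT AT 0B800H              ; located at address 0B800H*16
SCREEN  LABEL WORD                     ; a name for the first word of that area
VIDEO   ENDS

CODE    SEGMENT PARA PUBLIC 'CODE'
        ASSUME CS:CODE
START:  MOV  AH, 4CH                   ; DOS function 4CH: terminate the program
        INT  21H
CODE    ENDS
        END  START
```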
The designation of the active segment
A program may define several segments (code and data). The assembler
checks whether the data or instructions being addressed can be reached through the
segment register with a given content. For this to work properly, the
assembler must be told which segment is active, that is, which segment's
address each segment register will contain.
ASSUME <reg-seg>:<name-seg>, <reg-seg>:<name-seg> ...
reg-seg - the segment register
name-seg - the segment that will be active through that segment register
Example:
ASSUME CS:prg, DS:date1, ES:date2
Observations:
- the pseudo-instruction does not load the segment register; it only tells
the assembler where symbols must be looked for
- it is recommended to load DS at the beginning of the program with the typical
sequence:
ASSUME DS:name_seg_date
MOV AX, name_seg_date
MOV DS, AX
- CS need not be initialized, but it must be made active with ASSUME before the first label
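The observations above can be sketched in a small program that reuses the segment names prg, date1 and date2 from the ASSUME example. The variables V1 and V2 and the rest of the skeleton are hypothetical.

```asm
; Illustrative sketch; V1, V2 and START are hypothetical names.
DATE1   SEGMENT PARA PUBLIC 'DATA'
V1      DB  1
DATE1   ENDS

DATE2   SEGMENT PARA PUBLIC 'DATA'
V2      DB  2
DATE2   ENDS

PRG     SEGMENT PARA PUBLIC 'CODE'
        ASSUME CS:PRG, DS:DATE1, ES:DATE2
START:  MOV  AX, DATE1
        MOV  DS, AX          ; ASSUME loads nothing: DS is set here
        MOV  AX, DATE2
        MOV  ES, AX          ; likewise for ES
        MOV  AL, V1          ; V1 is in DATE1, reached through DS
        ADD  AL, V2          ; V2 is in DATE2: the assembler adds an ES: override
        MOV  AH, 4CH         ; DOS function 4CH: terminate the program
        INT  21H
PRG     ENDS
        END  START
```

Because of the ASSUME line, the assembler knows that V2 must be reached through ES and generates the segment override by itself; without ES:DATE2 in the ASSUME, the reference to V2 would be reported as an error.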
;----------------------------------------------------------------------------------------
; PROCEDURES
; other procedures from the main program
;----------------------------------------------------------------------------------------
CODE ENDS
NEXT:
ADD AL,SIR[SI]
INC SI
LOOP NEXT
MOV SUM,AL
; end of program
MOV AH,4Ch
INT 21H
CODE ENDS
END START
Laboratory tasks
Study the presented example.
Write the program that calculates the sum of the elements of a row in
.COM format; assemble it, link it and trace it with Turbo
Debugger, following the content of the registers and of the memory (the SUM location).
Rewrite the program in .EXE format and trace it.
Modify the program so that it adds numbers stored as
words (2 octets, DW), and study the case in which the sum of the numbers
does not fit in the same length as the numbers in the row.