Ele447 (Part 1)
A microprocessor understands only binary instructions, but binary instruction encodings are difficult for
human beings to read or remember. Assembly language bridges the gap between low-level binary machine
instructions and high-level language instructions. The former (binary machine instructions) are required
to execute tasks on a digital computer, while the latter (high-level instructions) are convenient tools of
expression for programmers. Instructions written in assembly language must be converted to machine code
before the processor can execute them. This conversion is done by an assembler.
-An assembler is a utility program that translates assembly language code into machine language.
Common examples of assemblers are:
-MASM (Microsoft Macro Assembler)
-TASM (Turbo Assembler)
-NASM (Netwide Assembler)
-GAS (GNU Assembler)
-as86
-A86
-MASM, TASM, and NASM are the most common assemblers used for Intel processors. MASM is compatible
with all versions of Microsoft Windows, beginning with Windows 95. A few of the advanced programs relating to
direct hardware access and disk sector programming will only run under MS-DOS, Windows 95, or Windows 98,
because of tight security restrictions imposed by later versions of Windows.
-TASM has the syntax most similar to that of MASM.
-NASM is the next closest in syntax to MASM, while GAS has a completely different syntax.
Some companion programs of an assembler are:
(1) Linker: A linker is a utility program that converts an object program into an executable program.
(2) Debugger: A debugger is a utility program that provides a way for a programmer to trace the execution of a
program and examine the contents of memory.
Conversion of assembly language code into executable format involves two stages:
(1) An assembler translates the source program into what is called an object program.
(2) A linker transforms the object program into an executable program; if there is more than one object file, the
linker combines them into a single executable program.
The conversion process is illustrated in the block diagram shown in Fig. 1.
[Fig. 1 (block diagram): Assembly language source code file n → Assembler → Object file n → Linker → Executable
file. The linker also takes object files from other sources (e.g., a C++ compiler) and a link library as inputs.]
Note that an object program contains the machine language equivalent of an assembly language program. Fig. 2
illustrates the relationship between a high level language (in this case, C++), assembly language, and machine
language.
C++ statement: cout<<(A*B+C);

Assembly language (MASM)     Intel machine language
mov eax,A                    A1 00000000
mul B                        F7 25 00000004
add eax,C                    03 05 00000008
call WriteInt                E8 00500000
Fig. 2 Example of program translation from high level language (C++) to machine language
Assembly language is a low-level machine specific language. It is machine specific in the sense that programs
written for a particular processor family will not run on other machines belonging to a different processor family.
To learn assembly programming, we need to pick a processor family with a given ISA (Instruction Set
Architecture). A description of some selected ISAs can be found in Table 1.
Table 1 Description of some ISAs
Of all the ISAs listed in Table 1, the Intel x86 is the most common in today's computers. Intel Pentium and AMD
processors are good examples of x86-based processors.
Advantages of assembly language
(1) Economical use of memory: Assembly language is an ideal tool for writing embedded programs that must fit in
a small amount of memory; for example, in single-purpose devices such as telephones, automobile fuel and
ignition systems, air-conditioning control systems, security systems, video cards, sound cards, modems,
printers, etc. No other programming language can match assembly language in terms of the small memory space
occupied by program code.
(2) High speed of program execution: If we craft assembly language programs carefully, they tend to run faster
than their high-level language counterparts. In time-critical applications, where tasks have to be completed
within a specified time period, assembly language is the programming language of choice. These applications,
also called real-time applications, include aircraft navigation systems, process control systems, robot control
software, communications software, and target acquisition (e.g., missile tracking) software.
(3) Accessibility to system hardware: Assembly language provides direct control over system hardware. For
example, writing a device driver for a new scanner in the market almost certainly requires programming in
assembly language. High-level languages have a restricted (abstract) view of the underlying hardware. Because
of this, it is almost impossible for a high level language to perform certain tasks that require access to the
system hardware.
Shortcomings of assembly language
(1) It is not portable: A language whose source program can be compiled and run on a wide variety of computer
systems is said to be portable. Assembly language makes no attempt to be portable. It is tied to a specific
processor family. For example, a program written in the x86 assembly language cannot be executed on a
PowerPC processor.
(2) It is difficult to write, debug, and maintain: Assembly language has a one-to-one correspondence with
machine language; i.e., each assembly language instruction corresponds to a single machine-language
instruction. The implication is that too many lines of code may be necessary for the execution of a seemingly
simple task. High level language counterparts have one-to-many correspondence with machine language,
making them compact and therefore less cumbersome than assembly language.
It is therefore rare to see large application programs coded completely in assembly language because they would
take too much time to write and maintain. Instead, assembly language is mostly used to optimize certain sections
of application programs for speed and also to access computer hardware.
Levels of Abstraction
Abstraction provides a way of managing complexity by separating relevant details from irrelevant details. Even
though assembly language is considered a low-level language, programming in assembly language will not
expose a programmer to all the nuts and bolts of the system. As illustrated in Fig. 3, there are basically 6 levels of
abstraction, namely:
Level 5: Application program level (e.g., spreadsheet, word processor)
Level 4: High-level language level (e.g., C, Java)
Level 3: Assembly language level (e.g., x86 ISA)
Level 2: Operating system routine level
Level 0: Hardware level
(The figure labels the upper levels as machine independent.)
Fig. 3 Language levels of abstraction
Elements of Assembly Language
These are the various components used in developing assembly language programs. They are as follows:
-Integer constant
An integer constant (or integer literal) is made up of an optional leading sign, one or more digits, and an optional
suffix character (called a radix) which indicates the number's base. It has the following format:
[sign]digits[radix]
If no radix is given, the integer constant is assumed to be a decimal number. Either upper case or lower case may
be used for the radix. Some examples are
h: Hexadecimal e.g., 3Bh, 17H, 0Eh
q/o: Octal e.g., 43o, 27q
d: Decimal e.g., 29d
b: Binary e.g., 0101101b
Note that a hexadecimal number that begins with a letter must be preceded by a zero to prevent the assembler
from confusing it for an identifier.
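The [sign]digits[radix] format can be illustrated with a short Python sketch. This is only an illustration of the
rules stated above, not a description of how MASM itself parses constants; the names RADIX and
parse_integer_constant are assumptions introduced for this example.

```python
# Radix suffixes from the list above; a constant with no suffix is decimal.
RADIX = {"h": 16, "q": 8, "o": 8, "d": 10, "b": 2}


def parse_integer_constant(text):
    """Return the value of a constant written as [sign]digits[radix]."""
    s = text.strip().lower()
    sign = 1
    if s and s[0] in "+-":
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if not s or not s[0].isdigit():
        # A constant must start with a digit; otherwise it looks like an identifier.
        raise ValueError(f"not an integer constant: {text!r}")
    base = 10
    if s[-1] in RADIX:
        base = RADIX[s[-1]]
        s = s[:-1]
    # int() rejects digits that do not belong to the base (e.g. 8 in an octal constant).
    return sign * int(s, base)


for constant in ["3Bh", "17H", "0Eh", "43o", "29d", "0101101b", "-12"]:
    print(constant, "=", parse_integer_constant(constant))
```

A hexadecimal constant written without its leading zero, such as Eh, is rejected by this sketch because its first
character is a letter.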
-Integer expression
An integer expression is a mathematical expression involving integer values and arithmetic operators. The
expression must evaluate to an integer that can be stored in 32 bits (0 to 4,294,967,295). Some arithmetic
operators, listed in order of precedence, are shown in Table 2.
Table 2 Arithmetic operators and their precedence levels
Operator    Name                 Precedence level
( )         Parentheses          1
+, -        Unary plus, minus    2
*, /        Multiply, divide     3
MOD         Modulus              3
+, -        Add, subtract        4
For example, the expression 7 - 3 MOD 2 is evaluated by first computing 3 MOD 2 = 1 and then subtracting the
result from 7, giving 6.
Further examples of expressions and their results are:
6 / 4 = 1,
7 + 4 - 3 * 2 = 5,
8 MOD 3 = 2
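These results can be checked with a few lines of Python, whose integer operators follow the same relative
precedence as the operators in Table 2. The mapping of / to Python's // and of MOD to % is an assumption of this
sketch, and it is only meant for the non-negative operands used in the examples; evaluate_examples is a
hypothetical helper name.

```python
def evaluate_examples():
    """Evaluate the example expressions using Python's integer operators."""
    return {
        "7 - 3 MOD 2": 7 - 3 % 2,        # MOD binds tighter than binary minus
        "6 / 4": 6 // 4,                 # integer division discards the fraction
        "7 + 4 - 3 * 2": 7 + 4 - 3 * 2,  # multiply before add and subtract
        "8 MOD 3": 8 % 3,
    }


for expression, value in evaluate_examples().items():
    print(expression, "=", value)
```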
-1.0
-71.6E+02
Note that at least one digit and a decimal point are required
-Character Constants
A character constant is a single character enclosed in single or double quotes.
Examples are
'T'
"p"
-String Constants
A string constant is a sequence of characters (including spaces) enclosed in single or double quotes:
'ABCDE'
"Assembly language is interesting"
Embedded quotes are permitted when used in the manner shown by the following examples:
"I don't like being idle"
'The librarian said, "There are many assembly language books in the library" '
-Reserved Words
Reserved words have special meaning in MASM and can only be used in their correct context. There are different
types of reserved words:
• Instruction mnemonics, such as MOV, ADD, MUL, etc.
• Register names, such as EAX, EBX, SI, DI, etc.
• Directives, which tell the assembler how to assemble programs, e.g., .code, .386, EQU, etc.
• Attributes, which provide size and usage information for variables and operands. Examples are BYTE,
WORD, etc.
• Operators, used in constant expressions, e.g., +, -, *, etc.
• Predefined symbols, such as @data, which return constant integer values at assembly time
-Identifiers
An identifier is a programmer-chosen name. It might identify a variable, a constant, a procedure, or a code label.
The following rules must be followed when creating identifiers:
• They may contain between 1 and 247 characters.
• They are not case sensitive.
• The first character must be a letter (that is, any of A through Z or a through z), underscore ( _ ), @, ?, or $.
Subsequent characters may also be digits.
• An identifier cannot be the same as an assembler reserved word.
• An identifier must not contain a space.
Some examples of legitimate identifiers are: discount, sales, $Var, Count, first, _someFiles, etc.
-Directives
A directive is a command embedded in the source code that is recognized and acted upon by the assembler.
Directives do not execute at runtime. Directives can define variables, macros, and procedures. They can assign
names to memory segments and perform many other housekeeping tasks related to the assembler. In MASM,
directives are case insensitive. For example, it recognizes .data, .DATA, and .Data as the same. Although all
assemblers for Intel processors share the same instruction set, they do not necessarily share the same set of
directives.
Segment directives
One important function of assembler directives is to define program sections, or segments. For example,
-The .DATA directive identifies the area of a program containing variables.
-The .CODE directive identifies the area of a program containing executable instructions.
-The .STACK directive identifies the area of a program holding the runtime stack, and sets its size.
Equal-sign directive: it is used to define symbols that have integer (or single character) quantities
associated with them. This directive does not allow real and string operands. The format is:
symbol = value
e.g.,
voltage = 100
MASM allows redefinition of symbols declared with the “=” directive. Therefore, a manifest
constant's scope extends from the point where it is defined to the point where it is redefined.
e.g.,
voltage = 110
.
.
current = 10
.
.
voltage = 230
EQU directive: It allows declaration of symbols whose operands are numeric, text, or string literal constants.
Note that you cannot redefine symbols you declare with the EQU directive. There are 3 formats:
name EQU expression
name EQU symbol
name EQU <text>
e.g. Format 2:
voltage1 = 230
voltage2 equ voltage1 ; voltage2 is 230
e.g. Format 3:
voltage1 equ <23 * 10> ; voltage1 is not 230 but the text “23 * 10”
TEXTEQU directive: similar to EQU, TEXTEQU creates what is known as a text macro. There are three
different formats: the first assigns text, the second assigns the contents of an existing text macro, and the
third assigns a constant integer expression:
name TEXTEQU <text>
name TEXTEQU textmacro
name TEXTEQU %constExpr
e.g.,
num TEXTEQU %(2 * 2) ; num is 4
move TEXTEQU <mov> ; move is same as mov
executeNow TEXTEQU <move al , num> ; executeNow is same as the instruction
; “ mov al , 4 ”
Note that MASM allows redefinition of symbols declared with the TEXTEQU directive.
variablename type value [, value]…
A type can be byte, word, dword, qword, or any of the types listed in Table 3.1. In addition, it can be any of
the legacy directives shown in Table 3.2, which are supported also by NASM and TASM.
Table 3.1 Intrinsic type directives
Directive   Usage
BYTE        8-bit unsigned integer. B stands for byte
SBYTE       8-bit signed integer. S stands for signed
WORD        16-bit unsigned integer (can also be a Near pointer in real-address mode)
SWORD       16-bit signed integer
DWORD       32-bit unsigned integer (can also be a Near pointer in protected mode). D stands for double
SDWORD      32-bit signed integer. SD stands for signed double
FWORD       48-bit integer (Far pointer in protected mode)
QWORD       64-bit integer. Q stands for quad
TBYTE       80-bit (10-byte) integer. T stands for Ten-byte
REAL4       32-bit (4-byte) IEEE short real
REAL8       64-bit (8-byte) IEEE long real
REAL10      80-bit (10-byte) IEEE extended real

Table 3.2 Legacy type directives
Directive   Usage
DB          8-bit integer
DW          16-bit integer
DD          32-bit integer or real
DQ          64-bit integer or real
DT          80-bit (10-byte) integer
For example, the following variable declarations will create symbols of type byte or sbyte:
Note that the '0' after the string implies null termination. Also note that if the value of a symbol is to be left
uninitialized, a question mark (?) must be used. This is observed in the uservalue symbol in the last
example.
Processor directives
By default, MASM assembles only the instructions that are available on the 8086 processor. To assemble
instructions that were introduced on later processors, the processor directives must be used. Some
processor directives are:
.8086 .387
.8087 .386P
.186 .486
.286 .486P
.287 .586
.286P .586P
.386
Since the 80x86 family is backwards compatible, specifying a particular processor directive enables all
instructions on that processor and all earlier processors as well. It is also possible to selectively enable
or disable instruction sets within a program. For example, the 80386 instructions can be turned on for
several lines of code and then turned off again so that only 8086 instructions are accepted. The following
code sequence demonstrates this:
Directive            Description
.BREAK               Generates code to terminate a .WHILE or .REPEAT block
.CONTINUE            Generates code to jump to the top of a .WHILE or .REPEAT block
.ELSE                Begins block of statements to execute when the .IF condition is false
.ELSEIF condition    Generates code that tests condition and executes statements that follow, until an .ENDIF
                     directive or another .ELSEIF directive is found
.ENDIF               Terminates a block of statements following an .IF, .ELSE, or .ELSEIF directive
.ENDW                Terminates a block of statements following a .WHILE directive
.IF condition        Generates code that executes the block of statements if condition is true
.REPEAT              Generates code that repeats execution of the block of statements until condition becomes true
.UNTIL condition     Generates code that repeats the block of statements between .REPEAT and .UNTIL until
                     condition becomes true
.UNTILCXZ            Generates code that repeats the block of statements between .REPEAT and .UNTILCXZ until CX
                     equals zero
.WHILE condition     Generates code that executes the block of statements between .WHILE and .ENDW as long as
                     condition is true
A condition is a Boolean expression that evaluates to either true or false. Relational and logical operators are used
in developing conditional expressions. A list of relational and logical operators is illustrated in Table 5
.IF condition1
    statements
[.ELSEIF condition2
    statements ]
[.ELSE
    statements ]
.ENDIF
.MODEL Directive
MASM uses the .MODEL directive to determine several important characteristics of a program, e.g., its
memory model type, procedure naming scheme, and parameter passing convention. The last two are
particularly important when assembly language is called by programs written in other programming
languages. The syntax of the .MODEL directive is
.MODEL memorymodel [, modeloptions]
-MemoryModel
The memorymodel field can be one of the models described in Table 6. All of the models, with the
exception of flat, are used when programming in 16-bit real-address mode.
-ModelOptions
The modeloptions field in the .MODEL directive can contain both a language specifier and a stack
distance. The language specifier determines calling and naming conventions for procedures and public
symbols. The stack distance can be NEARSTACK (the default) or FARSTACK.
Label
A label is an identifier that acts as a place marker for instructions and data. Labels are in two categories.
(1) Data label: A data label identifies the location of a variable, providing a convenient way to reference
the variable in code. A label placed just before a variable implies the variable's address, since the
assembler assigns a numeric address value to the label. It is possible to assign a single label to multiple
data items as follows:
Voltage BYTE 230
BYTE 240
BYTE 250
(2) Code label: A label placed just before an instruction implies the instruction's address. A label in the
code area (where instructions are located) of a program must end with a colon (:) character. Code
labels are used as targets of jump and loop instructions. For example, the following JMP (jump)
instruction transfers control to the location marked by the label named “target”, creating a loop:
target:
mov ax, bx
...
jmp target
Label names are created using the rules governing the creation of identifiers. The same code label can be
used more than once in a program as long as each label is unique within its enclosing procedure (a
procedure is like a function).
Instruction Mnemonic
An instruction mnemonic is a short word that identifies an instruction. In English, a mnemonic is a device
that assists memory. Similarly, assembly language instruction mnemonics such as mov, add, sub, etc
provide hints about the type of operation they perform. Examples of some instruction mnemonics and
their functions are:
Mnemonic Function
mov : Move (assign) one value to another
add : Add two values
sub : Subtract one value from another
mul : Multiply two values
jmp : Jump to a new location
call : Call a procedure
Operands
An operand can be a register, memory operand, constant expression, or input-output port. A memory
operand is specified by the name of a variable or by one or more registers containing the address of a
variable. Examples of each category of operand are:
Assembly language instructions can have zero, one, two, or three operands, as shown in the
following examples:
Instruction          Number of operands and meaning of instruction
STC                : set carry flag (no operand)
INC cx             : add one to cx register (one operand)
MOV ax,10          : assign 10 to ax register (two operands)
IMUL eax, ebx, 2   : multiply register ebx by 2 and store the product in register eax (three operands)
Comments
Comments are used to describe various parts of the source code. They usually contain technical notes about
the program's implementation. Comments can be specified in two ways:
(1) Single-line comment: it begins with a semicolon character (;). All characters following the
semicolon on the same line are ignored by the assembler.
(2) Multi-line (or block) comment: it begins with the COMMENT directive and a user-specified
symbol. All subsequent lines of text after the user-specified symbol are ignored by the assembler until
the same user-specified symbol appears, e.g., the following example uses the symbol “!” to indicate a
multi-line comment:
COMMENT !
This program initially assumes that voltage is constant at 230 V. It should however
be noted that there will be a slight variation in voltage which will be accounted for
in due course.
!
Number System
The most commonly used number systems are:
(1) Decimal number system (base 10)
(2) Binary number system (base 2)
(3) Octal number system (base 8)
(4) Hexadecimal number system (base 16)
A number written in positional notation can be expanded in a power series in its radix (base) $R$. For example,
the number $N$ written as
$$N = (a_{n-1}\, a_{n-2} \ldots a_2\, a_1\, a_0)_R \qquad (1)$$
can be expanded as
$$N = a_{n-1} \times R^{n-1} + a_{n-2} \times R^{n-2} + \cdots + a_2 \times R^2 + a_1 \times R^1 + a_0 \times R^0 \qquad (2)$$
$$= \sum_{i=0}^{n-1} a_i R^i \qquad (3)$$
where $n$ is the number of digits in the number, $a_i$ is the coefficient of $R^i$, and $0 \le a_i \le R - 1$. If the
arithmetic indicated in the power series expansion is done in base 10, then the result is the decimal equivalent
of $N$.
e.g., $\mathrm{A2F}_{16} = 10 \times 16^2 + 2 \times 16^1 + 15 \times 16^0 = 2560 + 32 + 15 = 2607_{10}$
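The power series expansion can be sketched in Python as follows. The function name positional_value and the
DIGITS string are assumptions made for this illustration; the code simply sums each coefficient multiplied by
the radix raised to the power of its position.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for radices up to 16


def positional_value(digits, radix):
    """Return the decimal value of a digit string written in the given radix."""
    n = len(digits)
    total = 0
    for position, symbol in enumerate(digits.upper()):
        coefficient = DIGITS.index(symbol)
        if coefficient >= radix:
            raise ValueError(f"digit {symbol!r} is not valid in base {radix}")
        # The leftmost digit has weight radix**(n-1); the rightmost has weight radix**0.
        total += coefficient * radix ** (n - 1 - position)
    return total


print(positional_value("A2F", 16))
```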
Table 7 contains a list of the first 20 numbers of some selected number systems.
Table 7 The first 20 numbers in some selected number systems
Decimal (base 10) Binary (base 2) Octal (base 8) Hexadecimal (base 16)
00 00000 00 00
01 00001 01 01
02 00010 02 02
03 00011 03 03
04 00100 04 04
05 00101 05 05
06 00110 06 06
07 00111 07 07
08 01000 10 08
09 01001 11 09
10 01010 12 0A
11 01011 13 0B
12 01100 14 0C
13 01101 15 0D
14 01110 16 0E
15 01111 17 0F
16 10000 20 10
17 10001 21 11
18 10010 22 12
19 10011 23 13
20 10100 24 14
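Decimal → binary: divide the decimal value by 2 (the base) repeatedly until the quotient is 0; the remainders,
read from the last one (MSB) to the first one (LSB), form the binary number
Example: convert the number $36_{10}$ to binary
36/2 = 18 r=0 ← LSB
18/2 = 9 r=0
9/2 = 4 r=1
4/2 = 2 r=0
2/2 = 1 r=0
1/2 = 0 r=1 ← MSB
Therefore, $36_{10} = 100100_2$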
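The repeated-division procedure can be sketched in Python as shown below. The function name decimal_to_binary is
an assumption for this example, and the sketch handles non-negative integers only.

```python
def decimal_to_binary(value):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if value < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 2)
        remainders.append(str(remainder))  # the first remainder produced is the LSB
    return "".join(reversed(remainders))   # read the remainders from MSB to LSB


print(decimal_to_binary(36))
```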
Hexadecimal → binary
write down the 4 bit binary code for each hexadecimal digit
Example: convert the hexadecimal number 39C8 to binary
3 9 C 8 (hexadecimal) = 0011 1001 1100 1000 (binary)
Binary → Octal
1. group the digits into 3's starting at the least significant bit (if the number of bits is not evenly divisible
by 3, then add 0's at the most significant end)
2. write one octal digit for each group
Example:
100 010 111 (binary) = 4 2 7 (octal)
Hexadecimal → octal
do it in 2 steps: hexadecimal → binary → octal
Decimal → hexadecimal
do it in 2 steps: decimal → binary → hexadecimal
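The digit-grouping conversions above can be sketched in Python. The helper names hex_to_binary, binary_to_octal,
and hex_to_octal are assumptions introduced for this illustration; leading zeros produced by the padding step are
dropped from the octal result.

```python
def hex_to_binary(hex_digits):
    """Write the 4-bit binary code for each hexadecimal digit."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_digits)


def binary_to_octal(bits):
    """Group bits in 3's from the least significant end and write one octal digit per group."""
    bits = bits.replace(" ", "")
    padding = (-len(bits)) % 3            # 0's added at the most significant end
    bits = "0" * padding + bits
    groups = [bits[i:i + 3] for i in range(0, len(bits), 3)]
    octal = "".join(str(int(group, 2)) for group in groups)
    return octal.lstrip("0") or "0"


def hex_to_octal(hex_digits):
    """Two steps: hexadecimal -> binary -> octal."""
    return binary_to_octal(hex_to_binary(hex_digits))


print(hex_to_binary("39C8"))
print(binary_to_octal("100 010 111"))
print(hex_to_octal("39C8"))
```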
Data Representation
Computers operate on binary numbers. Numbers are either integers or real.
Integer number
There are three common representations for binary integers
1. unsigned
2. sign-magnitude
3. signed complement
(1) Unsigned
It comprises only positive values.
In unsigned numbers, the leftmost bit is the most significant bit of the number.
(2) Sign-magnitude
In ordinary arithmetic, a negative number is indicated by a minus sign and a positive number by a plus sign.
Because of hardware limitations, computers must represent everything with binary digits. It is customary to
represent the sign with a bit placed in the leftmost position of the number. The convention is to make the sign bit
0 for positive and 1 for negative.
Range: $-(2^{n-1} - 1)$ to $+(2^{n-1} - 1)$ (including positive and negative zero)
For example, 4-bit numbers (including the sign bit) have a range of values from -7 to +7.
If the binary number is signed, then the leftmost bit represents the sign and the rest of the bits represent the
number's magnitude.
Note that
-because of the sign bit, there are 2 representations for 0; this is a problem for hardware
0000 is 0 and 1000 is also 0
1's complement
EXAMPLE: convert the following decimal numbers to 4-bit 1's complement numbers
(i) +7 (ii) -7
Solution
(i) Since the number is positive, just write down the value the way it is done for sign-magnitude:
therefore $+7_{10}$ = 0111
(ii) Since the number is negative:
-first write the positive equivalent: i.e., 0111
-then take the bitwise complement (or inverse): i.e., 1000
therefore $-7_{10}$ = 1000
EXAMPLE: Convert the 1's complement number 11100 to its decimal equivalent
Solution:
-The 1's complement number 11100 must be a negative number because the leftmost bit is 1.
-To find out which number it denotes, find the additive inverse!
00011 is +3 by observation,
So, 11100 must be -3
Things to notice: 1. any negative number will have a 1 as its leftmost bit.
2. there are 2 representations for 0, i.e., 00000 and 11111.
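The two examples can be reproduced with a small Python sketch. The function names to_ones_complement and
from_ones_complement are assumptions made for this illustration; bit patterns are handled as strings so that the
bitwise inversion is easy to follow.

```python
def invert(pattern):
    """Bitwise complement of a string of 0's and 1's."""
    return "".join("1" if bit == "0" else "0" for bit in pattern)


def to_ones_complement(value, bits):
    """Encode a decimal integer as a 1's complement bit pattern of the given width."""
    limit = 2 ** (bits - 1) - 1
    if not -limit <= value <= limit:
        raise ValueError(f"{value} does not fit in {bits}-bit 1's complement")
    pattern = format(abs(value), f"0{bits}b")   # positive equivalent
    return invert(pattern) if value < 0 else pattern


def from_ones_complement(pattern):
    """Decode a 1's complement bit pattern to a decimal integer."""
    if pattern[0] == "0":
        return int(pattern, 2)
    # Negative: the additive inverse is the bitwise complement.
    return -int(invert(pattern), 2)


print(to_ones_complement(7, 4), to_ones_complement(-7, 4))
print(from_ones_complement("11100"))
```

Both 00000 and 11111 decode to 0 in this sketch, which reflects the two representations of zero noted above.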
2's COMPLEMENT
It is a variation on 1's complement that does NOT have two representations for 0. This makes the hardware that
does arithmetic faster than for the other representations. Positive values in 2's complement representation are the
same as for sign-magnitude.
Solution:
take the positive value: 0111 (+7)
EXAMPLE:
What decimal value does the two's complement number 110011 represent?
Solution 1:
It must be a negative number, since the leftmost bit is 1.
The 2's complement can be taken to determine its positive number equivalent.
  001100 (1's complement)
+      1 (add 1)
---------
  001101 (the 2's complement of the given number is +13, so the given number is -13)
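Alternatively, the required decimal number can be determined as follows:
$$-a_{n-1} \times 2^{n-1} + \sum_{i=0}^{n-2} a_i R^i = -(1 \times 2^5) + 1 \times 2^4 + 0 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$$
$$= -32 + 16 + 0 + 0 + 2 + 1 = -13$$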
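The weighted-sum method can be sketched in Python: the leftmost bit carries a negative weight and all other bits
carry their usual positive weights. The function names from_twos_complement and to_twos_complement are assumptions
for this illustration.

```python
def from_twos_complement(pattern):
    """Decode a 2's complement bit pattern using the weighted sum described above."""
    n = len(pattern)
    total = -int(pattern[0]) * 2 ** (n - 1)       # the MSB has weight -2**(n-1)
    for i, bit in enumerate(reversed(pattern[1:])):
        total += int(bit) * 2 ** i                # remaining bits have weight 2**i
    return total


def to_twos_complement(value, bits):
    """Encode a decimal integer as a 2's complement bit pattern of the given width."""
    if not -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
        raise ValueError(f"{value} does not fit in {bits}-bit 2's complement")
    # Reducing modulo 2**bits is equivalent to inverting the bits and adding 1.
    return format(value % (2 ** bits), f"0{bits}b")


print(from_twos_complement("110011"))
print(to_twos_complement(-13, 6))
```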
Real number
Real numbers are numbers that have fractional parts. In the decimal number system, the decimal point is used to
identify the fractional part of a number. The position immediately to the right of the decimal point has the weight
$10^{-1}$, the next position $10^{-2}$, and so on. In general, a number $N$ with radix $R$ having a fractional part
represented as
$$N = (a_{n-1}\, a_{n-2} \ldots a_2\, a_1\, a_0 \,.\, a_{-1}\, a_{-2} \ldots a_{-m})_R \qquad (4)$$
where the point between $a_0$ and $a_{-1}$ is the radix point, can be expanded as
$$N = \sum_{i=-m}^{n-1} a_i R^i$$
e.g., $\mathrm{B6.7F2}_{16} = (11 \times 16^1) + (6 \times 16^0) + (7 \times 16^{-1}) + (15 \times 16^{-2}) + (2 \times 16^{-3}) \approx 182.4966_{10}$
Example:
Convert $0.6875_{10}$ to binary.
Solution
First, 0.6875 is multiplied by 2 to give an integer and a fraction. Then the new fraction is multiplied by 2 to give a
new integer and a new fraction. The process is continued until the fraction becomes 0 or until the number of digits
has sufficient accuracy. The coefficients of the binary number are obtained from the integers as follows:
0.6875 * 2 = 1.375
0.375 * 2 = 0.75
0.75 * 2 = 1.5
0.5 * 2 = 1.0
$(0.6875)_{10} = (0.1011)_2$
Example
Convert $(0.513)_{10}$ to octal.
Solution
0.513 * 8 = 4.104
0.104 * 8 = 0.832
0.832 * 8 = 6.656
0.656 * 8 = 5.248
0.248 * 8 = 1.984
0.984 * 8 = 7.872
The answer, to six significant figures, is obtained from the integer parts of the products:
$(0.513)_{10} = (0.406517\ldots)_8$
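The repeated-multiplication procedure for fractions can be sketched in Python. Exact rational arithmetic
(fractions.Fraction) is used here so that binary floating-point rounding does not disturb the digits; the
function name fraction_to_base and the max_digits parameter are assumptions made for this illustration.

```python
from fractions import Fraction


def fraction_to_base(fraction, radix, max_digits):
    """Return the digits after the radix point of a fraction (0 <= fraction < 1)."""
    digits = []
    while fraction != 0 and len(digits) < max_digits:
        fraction *= radix
        integer_part = int(fraction)   # the integer part is the next digit
        digits.append(integer_part)
        fraction -= integer_part       # continue with the new fraction
    return digits


print(fraction_to_base(Fraction("0.6875"), 2, 10))
print(fraction_to_base(Fraction("0.513"), 8, 6))
```

The first call stops as soon as the fraction becomes 0; the second stops after the requested six digits.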
Example
Convert $(306.\mathrm{D})_{16}$ to binary
Solution
$(306.\mathrm{D})_{16} = (0011\ 0000\ 0110\ .\ 1101)_2$
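[Figure: IEEE short real (32-bit) format. Sign (S): bit position 31; Exponent (E): bit positions 30 to 23;
Mantissa (M): bit positions 22 to 0.]
[Figure: IEEE long real (64-bit) format. Sign (S): 1 bit, bit position 63; Exponent (E): 11 bits, bit positions
62 to 52; Mantissa (M): 52 bits, bit positions 51 to 0.]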
S, E, and M represent fields within the representation. Each is just a bunch of bits.
→ E is an exponent field. The E field is a biased-127 (or biased-1,023 for long real, biased-16,383
for extended real) representation. So, the true exponent represented is (E - bias). The radix for
the number is ALWAYS 2.
→ M is the mantissa. It is in a somewhat modified form. There are 23 bits (long real: 52 bits,
extended real: 63 bits) available for the mantissa. It turns out that if floating point numbers are
always stored in a normalized form, then the leading bit (the one on the left, or MSB) is always a
1. So, why store it at all? It gets put back into the number (giving 24 bits of precision for the
mantissa of short real) for any calculation, but we only have to store 23 bits. This MSB is called
the HIDDEN BIT.
Procedure: The procedure for converting a real number into floating point equivalent consists of four steps:
Step 1: Convert the real number to binary.
1a: Convert the integer part to binary
1b: Convert the fractional part to binary
1c: Put them together with a binary point.
Step 2: Normalize the binary number.
Move the binary point left or right until there is only a single 1 to the left of the
binary point while adjusting the exponent appropriately. You should increase the
exponent value by 1 if the binary point is moved to the left by one bit position;
decrement by 1 if moving to the right. Note that 0.0 is treated as a special case
Step 3: Convert the exponent to excess or biased form.
For short real, use 127 as the bias;
For long real, use 1023 as the bias.
Step 4: Combine the three components.
Combine mantissa, exponent, and sign to get the desired format.
0 10000101 00111011010000000000000
or 429DA000r
Example: Determine the single precision representation of -64.2.
Step 1: Convert $64.2_{10}$ to the binary form.
1a: Convert 64 to binary.
$64_{10} = 1000000_2$
. . .
so, the binary representation for 0.2 is .001100110011. . .
Step 2: Normalize the binary number (make it look like scientific notation)
1000000.0011001100110011. . . = 1.000000 00110011. . . × $2^6$
S E M
1 10000101 00000000110011001100110
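The S, E, and M fields of a short real can be checked with Python's standard struct module, which packs a number
into the 32-bit IEEE format. This is only a verification aid, not the manual procedure itself; the function name
short_real_fields is an assumption for this example. Note that struct rounds the mantissa to the nearest value,
which for -64.2 happens to give the same bits as cutting the expansion off after 23 bits.

```python
import struct


def short_real_fields(value):
    """Return the (sign, exponent, mantissa) bit strings of a 32-bit IEEE short real."""
    packed = struct.pack(">f", value)          # big-endian single precision
    (raw,) = struct.unpack(">I", packed)       # reinterpret the 4 bytes as an unsigned integer
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]        # 1 sign bit, 8 exponent bits, 23 mantissa bits


sign, exponent, mantissa = short_real_fields(-64.2)
print(sign, exponent, mantissa)
print("unbiased exponent:", int(exponent, 2) - 127)
```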
More examples of real numbers expressed in single precision format are presented in Table 8
Example
Convert the IEEE short real number 0 10000010 01011000000000000000000 to a decimal real number.
Solution
1. The number is positive because the sign bit is 0.
2. The biased exponent is $10000010_2 = 130_{10}$.
Therefore, the unbiased exponent is $130_{10} - 127_{10} = 3_{10}$.
3. Combining the sign, exponent, and mantissa, the binary number is +1.01011 × $2^3$.
4. The denormalized binary number is +1010.11.
5. The decimal value is $1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 0 \times 2^0 + 1 \times 2^{-1} + 1 \times 2^{-2} = 10.75$.
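The decoding steps can be sketched in Python as follows. The function name decode_short_real is an assumption for
this illustration, and the sketch handles normalized numbers only (it does not treat 0.0, infinities, or other
special cases).

```python
def decode_short_real(sign, exponent, mantissa):
    """Decode the S, E, and M bit strings of a normalized IEEE short real."""
    unbiased_exponent = int(exponent, 2) - 127            # remove the bias of 127
    significand = 1 + int(mantissa, 2) / 2 ** 23          # put the hidden bit back
    value = significand * 2 ** unbiased_exponent
    return -value if sign == "1" else value


print(decode_short_real("0", "10000010", "01011000000000000000000"))
```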
Binary Operations
Some binary operations that are of interest are:
(1) addition
(2) subtraction
(3) multiplication
(4) division
(5) logical operations (NOT, AND, OR, NAND, NOR, XOR, XNOR)
(6) shifting
The result of an arithmetic operation can set the condition of some bits in the flag register of the microprocessor.
The conditions are: N, Z, V, C
N →The result is negative
Z →The result is zero
V → The operation caused an overflow
C → The operation caused a carry out of (or a borrow into) the MSB
It is important to note that the processor does not know whether we, as programmers, are choosing to interpret
stored values as signed or unsigned values.
(1) Addition
(2) Subtraction
General rules:
1-1=0
0-0=0
1-0=1
0 - 1 = 1 borrow 1
Note: In subtraction, the first operand is called the minuend while the second operand is called the subtrahend.
Whenever the result of an arithmetic operation cannot be expressed correctly because of an insufficient number of
available bits, an overflow occurs. For example, the unsigned decimal number 300 requires a minimum of 9 bits to
express it in binary. If an attempt is made to express it using only 8 bits, it will be expressed as 00101100
(i.e., 44) instead of 100101100 (300), and we say that an overflow has occurred.
UNSIGNED ADDITION
Unsigned addition rule:
(i) Add all the bits and discard any carry out of the MSB
Examples:
(i)    0000 0100 ($4_{10}$)
      +1111 1010 ($250_{10}$)
      -------------
       1111 1110 ($254_{10}$)
Final result “254” is correct since it falls within the range 0 to 255 (unsigned 8-bit range).
Overflow = no (confirmation: there is no carry out of the MSB)
(ii)   0000 0111 ($7_{10}$)
      +1111 1010 ($250_{10}$)
      -------------
       0000 0001 ($1_{10}$)
Final result “1” is incorrect. The expected result is 257, but it does not fall within the range 0 to 255
(unsigned 8-bit range). Note that the bit carried out of the MSB has been discarded.
Overflow = yes (confirmation: carry out of MSB)
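The unsigned addition rule can be sketched in Python. The function name add_unsigned and its bits parameter are
assumptions for this illustration; the second value returned plays the role of the carry out of the MSB, which
signals overflow for unsigned numbers.

```python
def add_unsigned(a, b, bits=8):
    """Add two unsigned integers, discarding any carry out of the MSB."""
    total = a + b
    result = total % (2 ** bits)      # keep only the lowest `bits` bits
    carry_out = total >= 2 ** bits    # True when a bit was carried out of the MSB
    return result, carry_out


for a, b in [(4, 250), (7, 250)]:
    result, carry_out = add_unsigned(a, b)
    print(f"{a} + {b} = {format(result, '08b')} ({result}), overflow = {carry_out}")
```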
UNSIGNED SUBTRACTION
Unsigned subtraction rule:
(i) Subtract subtrahend bits from minuend bits. If the subtrahend is greater than the minuend, then borrow into
the MSB of the minuend.
Examples
(iii)   0 000 1010 ($+10_{10}$)
      + 1 111 1010 ($-122_{10}$)
      -------------
Do not add because the signs of the two numbers are different. Subtract instead!
SIGN-MAGNITUDE SUBTRACTION
- When the integers are of different signs (i.e., the sign of a is different from the sign of b), then:
a - b becomes a + (-b), i.e., rewrite the expression so that the minuend and subtrahend have the same sign
e.g., (-4) - (+2) [different signs] becomes (-4) + (-2) [same signs]
a + b becomes a - (-b), i.e., rewrite the expression so that the addends have the same sign
e.g., (-4) + (+2) [different signs] becomes (-4) - (-2) [same signs]
(i)     0 111 1010 (+122)
      - 0 000 0100 (+4)
      -----------------
        0 111 0110 (+118)
Result is correct (since +118 falls within the range -127 to +127)
Overflow = no (confirmation: there is no carry out of the magnitude's MSB)
(ii)    0 000 0100 (+4)
      - 0 111 1010 (+122)
      --------------
Rule (i): Both operands are +ve (i.e., same sign) and subtrahend > minuend
Rule (iii): |+4| < |+122| (therefore, switch order)
        0 111 1010 (+122)
      - 0 000 0100 (+4)
      --------------
        0 111 0110 (+118)
        ↓ invert sign
        1 111 0110 (-118)
Rule (iv): reverse the sign of the result because the order of the operands was switched
Result is correct (since -118 falls within the range -127 to +127)
Overflow = no (confirmation: there is no carry out of the magnitude's MSB)
Carry = no
(iii)  0 111 1010 (+122)
     - 1 000 0100 (-4)
     ------------------
Rule (ii): Signs are different; rewrite problem (+122) - (-4) as (+122) + (+4)
       0 111 1010 (+122)
     + 0 000 0100 (+4)
     --------------
       0 111 1110 (+126₁₀)
Result is correct (since +126 falls within the range -127 to +127)
Overflow = no (confirmation: there is no carry out of the magnitude’s MSB)
Carry = no
(iv)   1 111 1010 (-122)
     - 0 000 0111 (+7)
     ------------------
Rule (ii): Signs are different; rewrite problem (-122) - (+7) as (-122) + (-7)
       1 111 1010 (-122)
     + 1 000 0111 (-7)
     --------------
       1 000 0001 (-1₁₀)
Final result “-1” is incorrect since the expected result “-129” does not fall within the range -127 to +127 (8-bit sign-magnitude range).
Overflow = yes (confirmation: there is a carry out of the magnitude’s MSB)
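The four sign-magnitude subtraction examples above can be reproduced with a short Python sketch. The function name sm_sub is an assumption made for illustration. The code follows the steps shown in the examples: if the signs differ, add the magnitudes and keep the sign of the minuend; if the signs match, subtract the smaller magnitude from the larger and reverse the sign when the order had to be switched.

```python
def sm_sub(a, b):
    """Compute a - b on 8-bit sign-magnitude patterns; return (result, overflow)."""
    sign_a, mag_a = a & 0x80, a & 0x7F
    sign_b, mag_b = b & 0x80, b & 0x7F
    if sign_a != sign_b:
        # Different signs: a - b becomes a + (-b), so add magnitudes, keep sign of a.
        total = mag_a + mag_b
        overflow = total > 0x7F               # carry out of the magnitude's MSB
        return sign_a | (total & 0x7F), overflow
    if mag_a >= mag_b:
        # Same signs and |a| >= |b|: subtract the magnitudes directly.
        # (Equal negative operands give negative zero, 1 000 0000.)
        return sign_a | (mag_a - mag_b), False
    # Same signs and |a| < |b|: switch the order, then reverse the sign.
    return (sign_a ^ 0x80) | (mag_b - mag_a), False


for a, b in [(0b01111010, 0b00000100),   # (+122) - (+4)
             (0b00000100, 0b01111010),   # (+4) - (+122)
             (0b01111010, 0b10000100),   # (+122) - (-4)
             (0b11111010, 0b00000111)]:  # (-122) - (+7)
    result, overflow = sm_sub(a, b)
    print(format(result, "08b"), overflow)
```

The loop prints 01110110, 11110110, 01111110 and 10000001, with overflow reported only for the last case.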
Examples
(i)      0 111 1010 (+122)
       + 1 111 1011 (-4)
       --------------
       1 0 111 0101
                  1   (end-around carry)
       --------------
         0 111 0110 (+118)
Result is correct (since +118 falls within the range -127 to +127)
Overflow = no (confirmation: carry into MSB (i.e., ‘1’) = carry out of MSB (i.e., ‘1’))
(ii)     0 111 1010 (+122)
       + 0 000 0111 (+7)
       --------------
         1 000 0001 (-126)
Final result “-126” is incorrect since the expected result “129” does not fall within the range -127 to +127 (8-bit 1’s complement range).
Overflow = yes
Confirmation #1: the signs of the operands are the same (i.e., positive) but different from the sign of the result (i.e., negative)
Confirmation #2: carry into MSB (i.e., ‘1’) ≠ carry out of MSB (i.e., ‘0’)
(iii)    1 000 0101 (-122)
       + 1 111 1000 (-7)
       --------------
       1 0 111 1101
                  1   (end-around carry)
       --------------
         0 111 1110 (+126)
Final result “+126” is incorrect since the expected result “-129” does not fall within the range -127 to +127 (8-bit 1’s complement range).
Overflow = yes
Confirmation #1: the signs of the operands are the same but different from the sign of the result
Confirmation #2: carry into MSB (i.e., ‘0’) ≠ carry out of MSB (i.e., ‘1’)
Carry = no
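The end-around carry used in the 1’s complement additions above can be sketched in Python as follows. The function name add_ones8 is an illustrative assumption. Overflow is detected with the first confirmation used in the examples: both operands share a sign that differs from the sign of the result.

```python
def add_ones8(a, b):
    """Add two 8-bit 1's complement patterns; return (result, overflow)."""
    total = a + b
    if total > 0xFF:
        total = (total & 0xFF) + 1    # end-around carry: add the carry-out back in
    result = total & 0xFF
    # Overflow: the operands have the same sign, and the result's sign differs from it.
    overflow = ((a ^ result) & (b ^ result) & 0x80) != 0
    return result, overflow


print(add_ones8(0b01111010, 0b11111011))  # (+122) + (-4)  -> (118, False)
print(add_ones8(0b01111010, 0b00000111))  # (+122) + (+7)  -> (129, True), pattern 1 000 0001
print(add_ones8(0b10000101, 0b11111000))  # (-122) + (-7)  -> (126, True)
```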
Examples
(i)      1 000 0101 (-122)
       - 1 111 1000 (-7)
       --------------
Don’t subtract! Rewrite (-122) - (-7) as (-122) + (+7)
         1 000 0101 (-122)
       + 0 000 0111 (+7)
       --------------
         1 000 1100 (-115)
Result is correct (since -115 falls within the range -127 to +127).
Overflow = no (confirmation: carry into MSB (i.e., ‘0’) = carry out of MSB (i.e., ‘0’))
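The rewrite step in the 1’s complement subtraction above (negate the subtrahend, then add) can be sketched in Python. In 1’s complement, negation is a plain bitwise inversion. All function names here are assumptions chosen for illustration.

```python
def negate_ones8(x):
    """Negate an 8-bit 1's complement pattern by inverting every bit."""
    return x ^ 0xFF


def sub_ones8(a, b):
    """Compute a - b as a + (-b) in 8-bit 1's complement; return the result pattern."""
    total = a + negate_ones8(b)
    if total > 0xFF:
        total = (total & 0xFF) + 1    # end-around carry
    return total & 0xFF


def ones_to_int(x):
    """Decode an 8-bit 1's complement pattern to a Python int."""
    return -(x ^ 0xFF) if x & 0x80 else x


result = sub_ones8(0b10000101, 0b11111000)       # (-122) - (-7)
print(format(result, "08b"), ones_to_int(result))  # 10001100 -115
```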
Two's complement Addition:
Examples
(i)      1 000 0110 (-122)
       + 1 111 1001 (-7)
       --------------
         0 111 1111 (+127)
Final result “+127” is incorrect since the expected result “-129” does not fall within the range -128 to +127 (2’s complement 8-bit range).
Overflow = yes
Confirmation #1: carry into MSB (i.e., ‘0’) ≠ carry out of MSB (i.e., ‘1’)
Confirmation #2: the signs of the operands are the same but different from the sign of the result
(ii)     1 000 0110 (-122)
       + 0 000 0111 (+7)
       --------------
         1 000 1101 (-115)
Final result “-115” is correct since it falls within the range -128 to +127 (2’s complement 8-bit range).
Overflow = no (confirmation: carry into MSB (i.e., ‘0’) = carry out of MSB (i.e., ‘0’))
(iii)    1 111 1001 (-7)
       + 0 111 1010 (+122)
       ----------------
         0 111 0011 (+115)
Final result “+115” is correct since it falls within the range -128 to +127 (2’s complement 8-bit range).
Overflow = no
Confirmation: carry into MSB (i.e., ‘1’) = carry out of MSB (i.e., ‘1’)
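The carry-in versus carry-out test used in the 2’s complement examples above can be checked with a small Python sketch. The names add_twos8 and to_signed8 are illustrative assumptions. The carry into the MSB is obtained by adding only the low seven bits of each operand.

```python
def add_twos8(a, b):
    """Add two 8-bit 2's complement patterns; return (result, overflow)."""
    carry_in = ((a & 0x7F) + (b & 0x7F)) >> 7   # carry into the MSB
    total = a + b
    carry_out = total >> 8                      # carry out of the MSB
    return total & 0xFF, carry_in != carry_out  # overflow when the two carries differ


def to_signed8(x):
    """Decode an 8-bit 2's complement pattern to a Python int."""
    return x - 256 if x & 0x80 else x


for a, b in [(0b10000110, 0b11111001),   # (-122) + (-7)
             (0b10000110, 0b00000111),   # (-122) + (+7)
             (0b11111001, 0b01111010)]:  # (-7) + (+122)
    result, overflow = add_twos8(a, b)
    print(format(result, "08b"), to_signed8(result), overflow)
```

The three lines printed are 01111111 127 True, 10001101 -115 False and 01110011 115 False, matching examples (i) to (iii).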
Examples
(i)      1 000 0110 (-122)
       - 1 111 1001 (-7)
       --------------
Don’t subtract! Rewrite (-122) - (-7) as (-122) + (+7)
         1 000 0110 (-122)
       + 0 000 0111 (+7)
       --------------
         1 000 1101 (-115)
Result is correct since -115 falls within the range -128 to +127 (8-bit 2’s complement range)
Overflow = no
Confirmation: carry into MSB (i.e., ‘0’) = carry out of MSB (i.e., ‘0’)
(ii)     0 111 1010 (+122)
       - 0 000 0111 (+7)
       --------------
         0 111 1010 (+122)
       + 1 111 1001 (-7)
       --------------
         0 111 0011 (+115)
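The 2’s complement subtraction examples above replace a - b with a + (-b). A minimal Python sketch of that idea follows; the function names are assumptions. The overflow test used here is the standard sign-based rule for subtraction (the operands have different signs and the result’s sign differs from the minuend’s), which is swapped in for the carry comparison shown in the examples.

```python
def negate_twos8(x):
    """Negate an 8-bit 2's complement pattern: invert the bits and add 1."""
    return ((x ^ 0xFF) + 1) & 0xFF


def sub_twos8(a, b):
    """Compute a - b as a + (-b) in 8-bit 2's complement; return (result, overflow)."""
    result = (a + negate_twos8(b)) & 0xFF
    # Overflow: a and b have different signs, and the result's sign differs from a's.
    overflow = ((a ^ b) & (a ^ result) & 0x80) != 0
    return result, overflow


print(sub_twos8(0b10000110, 0b11111001))  # (-122) - (-7) -> (141, False), pattern 1 000 1101
print(sub_twos8(0b01111010, 0b00000111))  # (+122) - (+7) -> (115, False), pattern 0 111 0011
```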
Problem
If A = 1001 0110 and B = 0101 1011, find A - B and B - A if
(a) A and B are unsigned numbers
(b) A and B are sign-magnitude numbers
(c) A and B are 2’s complement numbers
For each case, determine whether the expression results in an overflow and state your reason.
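Before working the problem, it may help to see what the same two bit patterns mean under each representation. The Python sketch below decodes A and B in the three ways the problem asks about; the helper names are illustrative assumptions.

```python
def as_unsigned(x):
    """Interpret an 8-bit pattern as an unsigned number."""
    return x


def as_sign_magnitude(x):
    """Interpret an 8-bit pattern as sign-magnitude (MSB is the sign)."""
    magnitude = x & 0x7F
    return -magnitude if x & 0x80 else magnitude


def as_twos_complement(x):
    """Interpret an 8-bit pattern as 2's complement."""
    return x - 256 if x & 0x80 else x


A = 0b10010110
B = 0b01011011
for name, value in (("A", A), ("B", B)):
    print(name, as_unsigned(value), as_sign_magnitude(value), as_twos_complement(value))
# A 150 -22 -106
# B 91 91 91
```

Because B has a 0 in its MSB, it decodes to the same value in all three representations; only A changes.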
Solution
(a) A and B are unsigned
A - B:
    1001 0110
  - 0101 1011
  --------------
    0011 1011
There is no overflow since there is neither a carry out of nor a borrow into the MSB.
Hence A - B = 0011 1011
B - A:
    0101 1011
  - 1001 0110
  --------------
    1100 0101
Overflow occurred since there is a borrow into the MSB of B.
Hence B - A ≠ 1100 0101
(b) A and B are sign-magnitude numbers
A - B:
    1001 0110
  - 0101 1011
  --------------
Rewrite: A - (+B) = A + (-B)
    1001 0110
  + 1101 1011
  --------------
    1111 0001
There is no overflow since there is no carry out of the magnitude’s MSB.
Hence, A - B = 1111 0001