Intel x86 Assembly GCC
Table of Contents
Introduction 0
C to ASM Logic 1
DDD Tips and Tricks 2
Bitwise Operators 3
Assembly 4
The Binary Numeric System 4.1
x86 Architecture 4.2
x86 Architecture pt.2 4.3
Getting Started 4.4
Inline Assembly 4.5
Midterm1 Review 5
Midterm 2 Review 6
Final Review 7
Intel x86 GNU Assembler AT&T Syntax
My Awesome Book
This file serves as your book's preface, a great place to describe your book's content and
ideas.
C to ASM Logic
Boiler
Global
# comments
# comments
.text
.globl _start
_start:
done:
movl %eax, %eax # placeholder
Stack
helper_function:
#prologue
push %ebp
movl %esp, %ebp
subl $3*wordsize, %esp #make space for locals
push %ebx
push %edi
# arguments
.equ arg1, (2*wordsize) #(%ebp)
.equ arg2, (3*wordsize) #(%ebp)
.equ arg3, (4*wordsize) #(%ebp)
# locals
.equ loc1, (-1*wordsize) #(%ebp)
.equ loc2, (-2*wordsize) #(%ebp)
# code
#epilogue
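pop %edi # restore in reverse order of the pushes
pop %ebx
movl %ebp, %esp # release locals
pop %ebp # restore the caller's base pointer
ret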
main_function:
#prologue
push %ebp
movl %esp, %ebp
subl $3*wordsize, %esp #make space for locals
push %ebx
push %edi
# arguments
.equ arg1, (2*wordsize) #(%ebp)
.equ arg2, (3*wordsize) #(%ebp)
.equ arg3, (4*wordsize) #(%ebp)
# locals
.equ loc1, (-1*wordsize) #(%ebp)
.equ loc2, (-2*wordsize) #(%ebp)
# code
# push arguments
call helper_function
movl %eax, result(%ebp) # save the returned value
addl $num_args*wordsize, %esp #clear arguments
#epilogue
movl result(%ebp), %eax # return saved value
pop %edi
pop %ebx
movl %ebp, %esp # release locals
pop %ebp # restore the caller's base pointer
ret
# global declaration
str:
.string "I love CS, sometimes."
x:
.long 5
.equ wordsize, 4
x:
.long 1
.long 5
.long 2
.long 18
# alternatively
x:
.rept 1000
.long 8
.endr
c:
.space 3*4*wordsize
# 1 'space' is 1 byte
# so this kind of allocation in particular can be used to allocate space
Pointers
# set pointer
# ecx = &x;
movl $x, %ecx
# dereference pointer
# ebx = *(ecx) or ebx = x[0]
movl (%ecx), %ebx
# pointer incrementation
# ecx = (ecx + 4) or ecx = &x[1]
addl $wordsize, %ecx
Advanced Indexing
Globals
# eax is i
# ebx is j
Stack
#ecx = i, #edx = j
movl a(%ebp), %ebx #ebx = a
movl (%ebx, %ecx, wordsize), %ebx #ebx = a[i]
movl (%ebx, %edx, wordsize), %ebx #ebx is a[i][j]
----
#main[*counter][i] = share[i]
#share[i]
movl share(%ebp), %ecx
movl (%ecx,%edi,wordsize), %ecx #ecx = share[i]
#main[*counter]
movl counter(%ebp), %esi #esi = &counter
movl (%esi), %esi #dereference position @ counter
movl main(%ebp), %eax # eax = main
movl (%eax, %esi, wordsize), %eax #eax = main[*counter]
movl %ecx, (%eax, %edi, wordsize) #main[*counter][i] = share[i]
----
#share[select] = items2[i];
movl items2(%ebp), %ebx
movl (%ebx, %edi,wordsize), %ecx #ecx = items2[i]
movl share(%ebp), %eax
movl select(%ebp), %esi
movl %ecx, (%eax,%esi,wordsize)
Leal
#edi = i
movl values(%ebp), %eax #eax = values
leal (%eax,%edi,wordsize), %ebx #ebx has &values[i]
addl (%ebx),%ecx #dereference memory location, add to ecx
----
----
int counter = 0;
int *p_counter = &counter;
leal counter(%ebp) , %eax
...
#(*counter)++;
movl counter(%ebp), %esi #esi = &counter
movl (%esi), %esi #dereference position @ counter
incl %esi #(*counter)++ in the register
movl counter(%ebp), %eax #move original counter pointer to an open register
movl %esi, (%eax) #restore back original position (we already incremented it internal
While Loops
# while(str[len] != 0)
while_not_end:
cmpb $0, str(%ebx) # str[len] compared with 0
jz end_while # if == 0, end
# do your stuff
incl %ebx # len++
jmp while_not_end # go back
end_while:
For Loops
Globals
...
incl %eax
jmp inner_loop
end_inner_loop:
Stack
...
Function Mimicking
# Method 1 (bad)
jmp function1
back:
...
function1:
# do something
jmp back
# Method 2 (better)
returnMin:
# do your stuff
ret
call returnMin
# returns here
Comparison Logic
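Note that you want the complement of whatever comparison you would write in C, because
the jump decides when to leave the block rather than when to enter it. The cmp instruction
subtracts its operands and sets the flags, so every comparison is effectively against zero.
A boolean expression like i < num_rows therefore becomes a test of i - num_rows >= 0,
which exits the block using the jge instruction.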
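The inversion described above can be sketched in C, with goto standing in for the jumps. This is only an illustration under assumed names: count_rows and the labels loop_top and loop_end are hypothetical and do not come from the notes.

```c
/* Hypothetical helper: counts loop iterations the way the assembly would. */
int count_rows(int num_rows)
{
    int i = 0;
    int visited = 0;
loop_top:
    /* C condition: i < num_rows.
       The assembly tests the complement and jumps out:
       compare i with num_rows, then jge loop_end (i - num_rows >= 0). */
    if (i - num_rows >= 0)
        goto loop_end;
    visited++;              /* loop body */
    i++;
    goto loop_top;
loop_end:
    return visited;
}
```

Calling count_rows(5) walks the body five times, the same as for (i = 0; i < num_rows; i++).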
Saving Registers
When: a call or later code may overwrite the registers holding your variables, so save their current state.
# save c and i
movl %eax, c(%ebp)
movl %ecx, i(%ebp)
Restoring Registers
After use we can go back to our original values.
#restore c and i
movl c(%ebp), %eax
movl i(%ebp), %ecx
Malloc
Inline
Boiler
/*
__asm__(
assembly code :
outputs :
inputs :
clobbered
);
*/
Variation Favorites
DDD Tips and Tricks
p ((int**)$eax)[i][j]
Where i and j are the coordinates of the element you are interested in. i and j can be
constants or they can be registers.
Bitwise Operators
Great Reference
Note that C/C++ typically uses an arithmetic shift on signed values, and a logical shift on unsigned
numbers. For negative signed values this is determined by the compiler, but is usually the case.
Converts to 2's complement, and maintains the precision of the number, as well as the
sign (internally, that is).
1 byte = 8 bits. Float_int will then contain bits 0 through 31 (start counting from 0). We can use this
information to determine the number of bits we are working with, so we don't cause
any overflow.
Bit shift operators work on the bits themselves directly, which are expressed in binary.
The reference above is of great help in explaining what they do.
An arithmetic shift is used for a negative value, and preserves the sign by copying the most
significant bit.
A logical shift is not sign preserving, and explicitly shifts in 0's by the appropriate
amount.
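A small C sketch of the two kinds of right shift follows. Because shifting a negative signed value is left to the compiler, the arithmetic version here is written out by hand on the raw bit pattern. The function names are made up for this example.

```c
#include <stdint.h>

/* Logical shift right: zeros enter from the left. count must be 0..31. */
uint32_t logical_shift_right(uint32_t bits, unsigned count)
{
    return bits >> count;
}

/* Arithmetic shift right on a raw 32-bit pattern: the sign bit (bit 31)
   is copied into every position vacated by the shift. count must be 0..31. */
uint32_t arithmetic_shift_right(uint32_t bits, unsigned count)
{
    uint32_t shifted = bits >> count;
    if ((bits & 0x80000000u) != 0 && count > 0)
        shifted |= ~(UINT32_MAX >> count);   /* set the top 'count' bits */
    return shifted;
}
```

For the pattern FFFFFFF8 (which is -8 in 2's complement), a logical shift by 1 gives 7FFFFFFC, while the arithmetic shift gives FFFFFFFC, which is -4.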
float f = 1;
bitset<32> floatBit (f);
cout << floatBit << endl; //00000000000000000000000000000001
float f = -1;
cout << floatBit << endl; //111111111....1
The most significant bit represents whether the number is a negative number.
So like the arithmetic shift, any shift will maintain '1' as a bit.
But wait! bitset<> coerces the number to an integer form. We can prove this: float f =
1.5f; will return the same result we saw first.
int i_f = 1;
bitset<32> floatBit(*(unsigned*)(&i_f)); // 0000....1
float f = 1;
bitset<32> floatBit(*(unsigned*)(&f)); // 00111111100000000000000000000000
// with f = 1.5f the same cast gives:
00111111110000000000000000000000
f = -27;
bitset<32> floatBit(*(unsigned*)(&f)); // 11000001110110000000000000000000
11000001110110000000000000000000
01111111100000000000000000000000 // equivalent representation of 0x7f800000
Interpreted: if we AND the binary representation with a mask, every position where the mask
holds a 1 returns exactly what is within that bit, and every other position becomes 0.
In our case, what is going to be returned is the following:
//return
01000001100000000000000000000000
Because we understand that the mantissa is represented by the lowest 23 bits, and that the
total size of our unsigned integer is 32 bits, we can shift the masked value down by 23 bits and treat
the exponent as an actual binary number.
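The masking and shifting steps can be sketched in C as below. This assumes float is a 32-bit IEEE 754 value, and it uses memcpy rather than the pointer cast shown earlier to read the bits. All function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bytes of a float into an unsigned integer (no numeric conversion). */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Bit 31: the sign. */
uint32_t float_sign(float value)
{
    return float_bits(value) >> 31;
}

/* Bits 30..23: AND with 0x7f800000, then shift down by 23. */
uint32_t float_exponent(float value)
{
    return (float_bits(value) & 0x7F800000u) >> 23;
}

/* Bits 22..0: the mantissa (fraction) field. */
uint32_t float_mantissa(float value)
{
    return float_bits(value) & 0x007FFFFFu;
}
```

For f = -27 the exponent field comes out as 131, which is 4 after removing the bias of 127, matching 27 = 1.1011 (binary) times 2^4.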
Assembly
Will be reviewing
Linux Intel x86 assembly language.
Hex Editor:
Used to examine a variety of files:
jpg, png, exe.
Digits represented in hexadecimal.
Attempts to represent the numerical values as text (not always successful).
Offset: location in file.
Can change and modify values, just like any other text editor, but on a lower level.
The Binary Numeric System
Addition
Same as conventional base 10 addition, except that the carry happens at 2: 1 + 1 = 10 in base 2,
just like 9 + 1 = 10 in base 10.
e.g. 11+1=100 base 2
Subtraction
Same idea, but with borrowing: we borrow from the nearest 1 to the left, change it to a zero, and
then the zero we are subtracting from becomes 10 (any zeros in between become 1). Borrowing from 10
gives us one (10 - 1 = 1), since 2 - 1 = 1.
e.g. 100 - 1 = 011
x86 Architecture
Basic History
Processors:
Read and write from memory
Tasks are interpreted line by line
x86 = family of processors
Developed from embedded systems (single function computers e.g. routers, cars)
16 bit support
Then 32 bit, and then 64.
Companies other than Intel produce x86-compatible processors.
Linux, Mac and Windows
Old programs all the way from 1978 work on new processors. Backwards compatible.
New processors have more operation modes
Real mode (old): limited memory access, 1 MB.
Default mode when the processor starts. The OS then moves on to newer modes.
Protected mode (newer): more memory, up to 4 GB with 32 bit addresses.
Helps manage multitasking in the OS.
Open more than one app.
Long mode:
Handles 64 bit data
Structure
Programs are loaded into RAM.
byte = 8 bits
hexadecimal digit or nibble = 4 bits
e.g. 89 C1 01 C9 01 C1
These are bytes written in base 16, also known as opcodes or machine code.
They are interpreted as instructions.
89 C1 = mov ecx, eax
01 C9 = add ecx, ecx
01 C1 = add ecx, eax
We either add, subtract, or move data
Instruction sizes vary, from 1 byte to 15 bytes
The CPU can only read the numerical representation of the instructions.
Same thing, except the subcomponents are split relative to the initial 64 bit
size.
e.g. 64/32/16/8
long mode gives us access to this
More registers: (32 bit)
esi = source index register
edi = destination index register
subcomponents are limited to 16 bits max (one subsystem)
Instruction pointer = eip
Flag registers
Stack pointers
esp = stack pointer
ebp = base pointer
Note that initially the registers have some garbage/unknown value inside them.
First Instructions
we use a mnemonic, or shortcut, for an instruction's name (easier than numeric)
structure:
mnemonic arg1, arg2...
arguments are encoded into the numeric representation (the actual machine code)
the text is translated by a program called the assembler
the assembler is what interprets the textual format of assembly
various assemblers have different tastes in syntax
that's why we have different assemblers like fasm (flat assembler) or nasm
Instructions:
move = COPY data
mov destination, src
mov eax, 8CBh
h = read as base 16
so we copy the number to the eax 32 bit register
mov ecx, edx
copy edx to ecx register
mov si, cx
copy cx subregister (16 bit) to si subregister (16 bit)
operations vary in size
error:
mov 13h, ecx
because we are copying a 32 bit register to a number. A number is not a
destination.
error:
mov ecx, dh
because the operands must also have the same size
dh = 8 bits
ecx = 32 bits
Analysis:
start with some existing garbage value
copy and move to existing locations in the cpu
notice the overwrites
notice that the entire 32 bits of information are overwritten.
Analysis:
It is important to recognize that ax and 9Ch are both 16 bits, so they will take the
lower half of ecx 32 bit information of the previous value.
Next, moving to eax completely overwrites the existing value, both consist of 32
bits.
cl is the lower half of cx, so it is 8 bits.
the last one is the most tricky: cl is taken, and moved to another 8 bit location within
another register. We know that it must be 8 bits because it must be the same size.
ah is part of ax (the higher portion), so it overwrites the previous ah within eax.
Two things: FFFFFFFF is the largest 32 bit number, which consists of all 1's in binary.
When we add 3 to this, we should expect some wrap around, because we
would otherwise require more than 32 bits.
We start from 0 again: 0, 1, 2, giving us the result we now expect, 2
FFFFFFFF += 0000000A
Same reasoning as above: start from 0 and count ten steps (0 through 9), giving 9.
Analysis:
add ch = cx higher 8 bits = 07 to al or the lower 8 bits of ax = FF
FF+=07 = wrap around = 06
pheeewh
di = 16 bits of edi, or FFFF+= to cx or second half of ecx = 0703
FFFF+=0703; another wrap around 0703 = 0703 - 1 = 0702
AB290702 += 0AB29FFFFh; = AB29FFFF
Hex addition is pretty easy when you know that F = max, so it overwrite
everything
AB29FFFF+=00000703; = AB2A0702
Wraparounds work based on the size of the arguments.
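The wraparounds worked through above can be checked with a short C sketch, using fixed-width unsigned types to stand in for 8, 16 and 32 bit registers. The helper names add8, add16 and add32 are invented for illustration.

```c
#include <stdint.h>

/* Each helper mimics "add dst, src" at one operand size: the sum is
   truncated to the width of the destination, so any carry out is lost. */
uint8_t add8(uint8_t dst, uint8_t src)
{
    return (uint8_t)(dst + src);
}

uint16_t add16(uint16_t dst, uint16_t src)
{
    return (uint16_t)(dst + src);
}

uint32_t add32(uint32_t dst, uint32_t src)
{
    return dst + src;
}
```

With these, add8(0xFF, 0x07) gives 06, add16(0xFFFF, 0x0703) gives 0702, and add32(0xAB29FFFF, 0x00000703) gives AB2A0702, matching the analysis.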
Subtraction
sub destination, src
destination-=src
-src is found using 2's complement.
Examples
sub eax, edx
eax -=edx
sub cl,dl
cl-=dl (8 bit operation)
sub eax, dl
error: not same size
sub 1Ah, dl
Like before, we need a place to store our information. A number does not
suffice.
Basic Arithmetic:
inc, dec - increase and decrease
by 1
inc destination
dec destination
wrap arounds occur on overflow or underflow, based on the size of the operand
examples
inc eax
eax+=1
dec si
si -=1
inc 1C5h
error: a number is not a destination
1C5+=1 seems to be able to work intuitively, but we need to be able to
store the result. A literal is not a place.
Instruction   eax
(start)       FFFFFFFE
inc eax       FFFFFFFF
inc al        FFFFFF00
dec al        FFFFFFFF
inc ax        FFFF0000
dec ax        FFFFFFFF
Analysis:
FFFFFFFE += 1; = 11111111111111111111111111111110 + 1 =
11111111111111111111111111111111 or FFFF FFFF
al = lower FF += 1; overflow = FFFFFF00
al = lower 00 -= 1; underflow = FFFFFFFF
ax = FFFF+= 1; overflow = FFFF0000
ax = 0000-= 1; underflow = FFFFFFFF
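The table can be reproduced with a small C model in which eax is a 32-bit value and al and ax are its low 8 and low 16 bits. This is a sketch under that assumption. The helper names are hypothetical.

```c
#include <stdint.h>

/* Each helper takes the whole eax value and returns the new eax.
   Only the named sub-register changes; the upper bits are preserved. */
uint32_t inc_al(uint32_t eax)
{
    return (eax & 0xFFFFFF00u) | ((eax + 1u) & 0x000000FFu);
}

uint32_t dec_al(uint32_t eax)
{
    return (eax & 0xFFFFFF00u) | ((eax - 1u) & 0x000000FFu);
}

uint32_t inc_ax(uint32_t eax)
{
    return (eax & 0xFFFF0000u) | ((eax + 1u) & 0x0000FFFFu);
}

uint32_t dec_ax(uint32_t eax)
{
    return (eax & 0xFFFF0000u) | ((eax - 1u) & 0x0000FFFFu);
}
```

Starting from FFFFFFFF, inc_al gives FFFFFF00 and dec_al brings it back, because the overflow never carries into the upper 24 bits.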
mul - multiplication
unsigned (non-negative) numbers
syntax:
mul argument
single argument
Example:
ax argument
Forms:
ax = al * argument (8 bits)
dx:ax = ax * argument (16 bits)
edx:eax = eax * argument (32 bit)
What this means:
the result stored is twice the size of the argument
why this is the case:
It ensures no wrap around, and we get precisely the result we
want.
For example dx:ax = ax * argument
both ax and argument are 16 bits. Regardless of what their
values are, the product needs at most 32 bits.
The product cannot exceed the size of the result.
: means bit concatenation
we can extend a register with this notation.
dx:ax = 32 bits (16 bits each)
edx:eax = 64 bits
Example:
mul ecx
edx:eax = eax * ecx
mul si
dx:ax = ax * si
mul al
ax = al * al
mul 2Ah
error: 2Ah is a value, but mul has no form that takes an immediate operand.
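A C sketch of the 32 bit form, edx:eax = eax * argument, is shown below. The 64-bit product is split into two 32-bit halves to mimic the register pair. The struct and function names are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical holder for the register pair edx:eax. */
typedef struct {
    uint32_t edx;   /* upper 32 bits of the product */
    uint32_t eax;   /* lower 32 bits of the product */
} mul_result;

/* Mimics "mul argument" with a 32 bit operand: unsigned, never wraps. */
mul_result mul32(uint32_t eax, uint32_t argument)
{
    uint64_t product = (uint64_t)eax * argument;
    mul_result result;
    result.edx = (uint32_t)(product >> 32);
    result.eax = (uint32_t)product;
    return result;
}
```

Even the largest case, FFFFFFFF times FFFFFFFF, fits: edx holds FFFFFFFE and eax holds 00000001.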
div - division
First Program
gas will be the main assembler I will use.
Let's write our first program. I will be using an asm file called temp.s for practice, along with
a makefile for convenience.
The Makefile:
temp.out: temp.o
	ld -melf_i386 -o temp.out temp.o
temp.o: temp.s
	as --32 --gstabs -o temp.o temp.s
clean:
	rm -f temp.o temp.out
ddd temp.out
We make sure to break at the label 'done' so that our program stops before it runs past its last
instruction and seg faults.
To see the result, we break at done and view the values of the registers.
The general process of writing asm is to first write C, and then translate the logic into
asm.
.data
x:
.long 1
.long 5
.long 2
.long 18
sum:
.long 0
.text
.globl _start
_start:
movl $4, %eax # EAX will serve as a counter for
# the number of words left to be summed
movl $0, %ebx # EBX will store the sum
movl $x, %ecx # ECX will point to the current
# element to be summed
top: addl (%ecx), %ebx
addl $4, %ecx # move pointer to next element
decl %eax # decrement counter
jnz top # if counter not 0, then loop again
done: movl %ebx, sum # done, store result in "sum"
Analysis
C style code will be referenced before the asm for clarity when appropriate.
// this is a comment
# Comment
.data
First component of asm file.
'.' means directive. Here, .data informs the assembler that the following section will contain
data, or variables.
x:
.long 1
.long 5
.long 2
.long 18
labels:
x:
naming convention for your own reference.
Whatever is placed below this label will be represented by this label.
This entire sequence of longs in memory will be referenced by 'x'.
We will typically use labels if we want to refer to them later in the program.
Note that labels are only useful conventions for humans. They are NOT
actually variables, but it can be helpful to interpret them this way.
.long:
Storage in memory.
long means 32 bit size, so whatever follows long is the actual 32 bit number.
Can be placed consecutively and will form consecutive blocks of memory.
This effectively creates an array.
Protip: what if we wanted to represent an array of 1000 elements?
.rept
Example:
int x[1000];
for (int i =0; i< 1000;i++) x[i] = 8;
x:
.rept 1000
.long 8
.endr
Anything between .rept and .endr will be repeated inline 1000 times.
For something of smaller size, we can use .space to perform something similar.
char y[6];
y:
.space 6
y:
.string "hello"
Which allocates 6 bytes (one for every character, including the null char), all of which is
performed automatically by the .string directive.
sum = 0;
sum:
.long 0
Same thing as before, but because we want to represent the sum as an independent
variable, we set it under its own label.
.text is the second component of asm, which signals that anything that comes after it will
be part of the actual code.
.globl _start
_start:
.globl puts _start under the global namespace, so your linker knows where to begin.
_start is a label that signals where your program actually begins execution.
eax = 4;
movl copies the value 4, and places it into the eax register.
$ denotes constant values.
Otherwise, it would denote an address.
'l' denotes 'long' or 32 bit, which also matches the size of the register.
Note the registers used are generally arbitrary, but they are used to operate arithmetic
ecx = &x;
$ here still denotes a constant like before, except the constant is the address of x. ecx
then effectively becomes a pointer.
top:
addl (%ecx), %ebx
The next line adds whatever is stored at the address held in ecx into ebx (our main
storage register). Ecx previously pointed to the address of x, which was the first location
in our array x.
Addl takes this 32 bit value from memory and adds it into the register.
ecx = (ecx + 4)
ecx holds the address of our array, as a pointer. Therefore, when we increment this
pointer, we are mimicking pointer arithmetic, and move to the next 32 bit location. 4
means move 4 bytes (in memory), which translates to 4*8=32 bits. Therefore, ecx will
then point to the start of the next 'number' in the consecutive memory block.
eax--;
The next operation, jnz, works together with the operation above it.
This is the EFLAGS register
When the CPU executes an arithmetic instruction, it records whether the result of the
instruction is 0. When it is, it sets the zero flag in the EFLAGS register to 1, to
signal that the result of the operation was zero.
jnz can be interpreted as the command 'jump if not zero', which says 'hey, if the
result of your last operation was not zero, then go back to 'top'.
There is also the complementary instruction 'JZ', which says 'if the result of the last operation was
zero, then go to _.
Together with our decrement operation, we will be able to effectively loop 4 times (the
size of the array), and store our sum within the register.
sum = ebx;
This final label is not required, and is perhaps redundant because we don't reference it
anywhere in the program, but it's a design choice.
Our final operation stores our result into the memory labeled sum.
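Putting the C-style fragments together, the whole program corresponds roughly to the C sketch below. The local variables are named after the registers they mirror. The function name sum_words is hypothetical, and like the assembly it assumes at least one element.

```c
/* Hypothetical C mirror of the assembly loop; count must be at least 1. */
int sum_words(const int *x, int count)
{
    int eax = count;        /* movl $4, %eax   : words left to be summed */
    int ebx = 0;            /* movl $0, %ebx   : running sum             */
    const int *ecx = x;     /* movl $x, %ecx   : pointer to current word */

    do {
        ebx += *ecx;        /* addl (%ecx), %ebx                         */
        ecx++;              /* addl $4, %ecx   : next 4-byte element     */
        eax--;              /* decl %eax                                 */
    } while (eax != 0);     /* jnz top                                   */

    return ebx;             /* movl %ebx, sum                            */
}
```

For the array 1, 5, 2, 18 used above, the result is 26.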
Inline Assembly
//source: http://asm.sourceforge.net/articles/rmiyagi-inline-asm.txt
------------------------------------------------------------------------
Introduction to GCC Inline Asm
By Robin Miyagi
http://www.geocities.com/SiliconValley/Ridge/2544/
--------------------------------------------------------------------
Notice that as uses C comment syntax. As can also use `#' that
works the same way as `;' in most other intel assemblers.
Notice that in the above example, the __ prefixing and suffixing asm
are not necessary, but may prevent name conflicts in your program.
You can read more about this in [C extensions|extended asm] under
the info documentation for gcc.
Also notice the '\n\t' at the end of each line except the last, and
that each line is enclosed in quotes. This is because gcc sends
each as instruction to as as a string. The newline/tab combination
is required so that the lines are fed to as according to the correct
format (recall that each line in assembler is indented one tab
stop, generally 8 characters).
You can also use labels from your C code (variable names and such).
In Linux, underscores prefixing C variables are not Necessary in
your code; e.g.
Notice that in the documentation for DJGPP, it will say that the
underscore is necessary. The difference is due to the differences
between djgpp RDOFF format and Linux's ELF format. I am not
certain, but I think that the old Linux a.out object files also use
underscores (please contact me if you have comments on this).
* Extended Asm
------------------------------------------------------------------------
The code in the above example will most probably cause conflicts
with the rest of your C code, especially with compiler optimizations
(recall that gcc is an optimizing compiler). Any registers used in
your code may be used to hold C variable data from the rest of your
program. You would not want to inadvertently modify the register
without telling gcc to take this into account when compiling. This
is where extended asm comes into play.
Example code
--------------------------------------------------------------------
#include <stdlib.h>
accumulator = sum;
The first line that begins with ':' specifies the output
operands, the second indicates the input operands, and the last
indicates the clobbered operands. The "r", "g", and "0" are
examples of constraints. Output constraints must be prefixed with
an '=', as in "=r" (= is a constraint modifier, indicating write
only). Input and output constraints must have their corresponding C
argument included, enclosed in parentheses (this must not be
done with the clobbered line, I figured this out after an hour of
frustration). "r" means assign a general register for the
argument, "g" means to assign any register, memory or immediate
integer for this.
Notice the use of "0", "1", "2" etc. These are used to ensure that
when the same variable is indicated in more than one place in the
extended asm, that variable is only `mapped' to one register. If
you had merely used another "r" for example, the compiler may or may
not assign this variable to the same register as before. You can
surmise from this that "0" refers to the first register assigned to
a variable, "1" the second etc. When these registers are used in
the asm code, they are referred to as "%0", "%1" etc.
`o'
`V'
`<'
`>'
`r'
`i'
`n'
`E'
`F'
`G', `H'
`s'
`g'
`X'
addl #35,r12
`p'
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r")
(plus:SI (match_dup 0)
(match_operand:SI 1 "general_operand" "r")))]
""
"...")
which has two operands, one of which must appear in two places,
and
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r")
(plus:SI (match_operand:SI 1 "general_operand" "0")
(match_operand:SI 2 "general_operand" "r")))]
""
"...")
the form
the first pattern would not apply at all, because this insn does
not contain two identical subexpressions in the right place.
The pattern would say, "That does not look like an add
instruction; try other patterns." The second pattern would say,
"Yes, that's an add instruction, but there is something wrong
with it." It would direct the reload pass of the compiler to
generate additional insns to make the constraint true. The
results might look like this:
(insn N2 PREV N
(set (reg:SI 3) (reg:SI 6))
...)
(insn N N2 NEXT
(set (reg:SI 3)
(plus:SI (reg:SI 3) (reg:SI 109)))
...)
The '=' in the "=r" is a constraint modifier, you can find more
information about constraint modifiers, in the gcc info under
Machine Descriptions : Constraints : Modifiers.
The gcc info documentation also explains how to use a specific CPU
register for a constraint for various hardware including the i386.
You can find this information under [gcc : Machine Desc :
Constraints : Machine Constraints] in the info documentation.
* __asm__ __volatile__
------------------------------------------------------------------------
========================================================================
comments and suggestions <[email protected]>
My Notes
Integrating into C, inline
__asm__(
)
or
asm(
)
Extended ASM
Allows us to specify input registers, output registers, and clobbered registers - so that
we don't inadvertently modify other registers.
Indicated by colons:
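As a minimal sketch of the colon-separated sections, the C function below adds two integers with one extended asm statement. It assumes gcc (or a compatible compiler) on an x86 target and falls back to plain C elsewhere. The name add_with_asm is made up for this example.

```c
/* Hypothetical example: result = a + b using extended asm. */
int add_with_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int result;
    __asm__ ("addl %2, %0"          /* assembly code: %0 += %2             */
             : "=r" (result)        /* outputs: %0, any general register   */
             : "0" (a), "r" (b)     /* inputs: a shares %0's register, b is %2 */
             : "cc");               /* clobbered: the condition flags      */
    return result;
#else
    return a + b;                   /* fallback when not building for x86  */
#endif
}
```

The "0" constraint is what places a in the same register as the output, so the single addl leaves a + b in result.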
Midterm1 Review
1. Everything in the computer is represented using
i. bits
2. There are no (?) at assembly level.
i. types
3. If you have N unique states, what is minimum number of bits needed to represent all of
them?
i. num(bits) = ceiling(log_2(states))
4. If you have B bits how many unique states can you represent?
i. num(states) = 2^(bits)
5. How does typing in a high level language affect what assembly code will be generated?
i. Higher level languages will generally have more restrictions than assembly. For
example, it is possible to add two variables of different types in assembly.
6. Be able to convert from one base to any other base:
decimal. Finally, use the most significant bit to represent the sign.
ii. 1101 1110
i. Unsigned: 222
ii. Signed magnitude: -94
iii. 2's complement: 00100001 + 1 = 00100010, so the value is -34
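The three readings of the same byte can be checked with a short C sketch. The function names are hypothetical. The 2's complement version follows the invert-and-add-1 recipe rather than relying on a cast.

```c
#include <stdint.h>

/* Read all 8 bits as a magnitude. */
int as_unsigned(uint8_t bits)
{
    return bits;
}

/* Most significant bit is the sign; the low 7 bits are the magnitude. */
int as_sign_magnitude(uint8_t bits)
{
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* If the MSB is set: invert the bits, add 1, and read the result as negative. */
int as_twos_complement(uint8_t bits)
{
    if (bits & 0x80) {
        int magnitude = (uint8_t)(~bits) + 1;
        return -magnitude;
    }
    return bits;
}
```

For 1101 1110 (0xDE) these return 222, -94 and -34 respectively.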
8. Know how a real number is stored using IEEE floating point format.
Byte 87 65 43 21
Address 0 1 2 3
1. If the value 0x 45 89 55 67 is stored in bytes 1000 – 1003 what is the value of the
number being stored if the machine is little endian? Big endian?
2. Big endian: (read as 45895567)
Big endian:
Byte    45   89   55   67
Address 1000 1001 1002 1003
Little endian:
Byte    45   89   55   67
Address 1003 1002 1001 1000
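The two byte orders can be sketched in C by writing a 32-bit value into a 4-byte buffer, where index 0 stands for the lowest address (1000 in the example). The function names are assumptions for illustration.

```c
#include <stdint.h>

/* Big endian: most significant byte at the lowest address. */
void store_big_endian(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Little endian: least significant byte at the lowest address. */
void store_little_endian(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)value;
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}
```

Storing 0x45895567 big endian puts 45 at address 1000, while little endian (the x86 order) puts 67 there.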
i. In reality, memory is only byte addressable, where data can be
accessed 8 bits at a time.
ii. Every address refers to an individual byte. For example, address 1002
contains a single byte.
iii. This is in contrast to 'word addressable', where an address in memory refers to, for
example, a sequence of several bytes.
2. What are the two options for storing multidimensional arrays.
i. Because it is some binary string, we cannot be entirely sure of its representation without
more context.
ii. We know for sure that it is something contained within 8 bits of information, but this
could very well be a float, int, 2's complement int, signed or unsigned ... etc.
5. What are the 4 major steps executed by the CPU?
i. 1) Because of different assemblers. While the general concepts are the same,
there are variations in syntactical taste toward how assemblers read information.
More importantly, different operating systems have different function calls, that
every program relies on using and communicating with.
ii. 2) Different hardware. The way the CPU structures its architecture can vary by
machine. For example, the intel x86 processor vs the MIPS CPU.
8. Be able to correctly use the bitwise operators and bit shifting.
i.
ii. Construct a mask that once anded with a 32 bit number would allow you to
examine bits 11, 5, and 3.
i. In order to examine individual bits, our AND operation must zero out all bits
except for 11, 5, and 3. To do this we AND every bit with 0 except
for bit numbers 11, 5, and 3, which we AND with 1. Counting bits from 1 at the least
significant end, it should look something like this:
ii. & 0000 0000 0000 0000 0000 0100 0001 0100
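A C sketch of building and applying such a mask follows. The helper names are hypothetical, and the positions passed in count from 0 at the least significant bit, so the choice of numbering convention is explicit in the call.

```c
#include <stdint.h>

/* Build a mask with three bits set; positions count from 0 (0..31). */
uint32_t bit_mask3(unsigned a, unsigned b, unsigned c)
{
    return (UINT32_C(1) << a) | (UINT32_C(1) << b) | (UINT32_C(1) << c);
}

/* AND with the mask: selected bits pass through, all others become 0. */
uint32_t examine_bits(uint32_t value, uint32_t mask)
{
    return value & mask;
}
```

The mask written out above is 0x414, which is bit_mask3(10, 4, 2), that is, the 11th, 5th and 3rd bits when counting from 1. If bit 0 is taken as the least significant bit instead, bit_mask3(11, 5, 3) gives 0x828.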
9. Negative numbers in hex:
i. Convert original hex to 2's complement (binary, invert, and add 1), then convert the
solution back to hex.
Midterm 2 Review
Understand how the location of a variable's declaration affects where it will be stored. Ie if a
variable is local, global, or static where will space be made for it.
Understand how typing in C affects the assembly instructions that are generated by the
compiler.
Even though no types exist in assembly level code, this does not mean that
different types assigned in higher level code will produce the same kind of instructions.
Remember, the context of the problem will determine the kinds of instructions we will
use. Pointers, for example, regardless of whether they point to an int or a char, will be 4
bytes, or 32 bits. In addition, any kind of arithmetic with pointers will be interpreted as
pointer arithmetic.
Translate the following code to assembly. Assume that we
want to store x in eax and c in ecx.
int* x = 100;
char* c = 70;
x += 10;
c += 10;
Translated
movl $100, %eax # store the memory address 100 into register
movl $70, %ecx # store memory address of 70 to register. We still use 'l' since all pointers are 32 bits
addl $10*wordsize, %eax # here we're doing pointer arithmetic, so moving 10 locations of size 4
addl $10, %ecx # move a total of 10 bytes, since a char is 1 byte
What are the gcc C calling conventions? What happens if a function breaks the calling
conventions?
1) EAX, ECX, and EDX should not have live values when a function is called. This is
because the function may use these registers, and possibly change their values in the
process. So, before calling a function, it is recommended to save these values onto
the stack. This responsibility is placed on the caller of the function. It is highly
encouraged to follow these conventions, even though you don't really have to. Following
the rules gives consistency and helps avoid bugs.
2) Functions return their value in the EAX register. Again, you have control over this, but
code compiled by gcc follows this convention, so it is better to follow it as a
precaution.
3) Before calling a function, the arguments are pushed onto the stack, from right to left.
Understand and be able to use the advanced indexing mode. Assume that eax = 100 and
ecx = 5. Which addresses in memory are accessed by the following instructions?
#eax = 100
#ecx = 5
1. movl (%eax), %ebx
2. movb (%ecx,%eax), %bl
3. movl %eax, %ebx
4. movw (%ecx, %eax, 4), %bx
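The address arithmetic behind these operands, displacement(base, index, scale), can be sketched in C as below. The function name effective_address is hypothetical. It only computes the address and does not touch memory.

```c
#include <stdint.h>

/* AT&T operand  disp(base, index, scale)  refers to address
   disp + base + index * scale. Missing parts count as 0 (scale as 1). */
uint32_t effective_address(int32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return (uint32_t)disp + base + index * scale;
}
```

With eax = 100 and ecx = 5: (%eax) is address 100 (4 bytes are read), (%ecx,%eax) is 5 + 100 = 105 (1 byte), and (%ecx,%eax,4) is 5 + 100*4 = 405 (2 bytes). Instruction 3 copies register to register and accesses no memory.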
The stack begins at the location %esp points to (top of stack), and includes all the
addresses above it.
A stack frame starts at %ebp and extends all the way to %esp. Stack frames are “chained”. What
does this mean? How is it achieved?
This means we can walk back to previous stack frames within our program if we wanted
to. We achieve this through the prologue: push %ebp, movl %esp, %ebp. How can you
use this to, say, access the third argument of the function 4 calls prior to you?
subl $4, %esp # move pointer down 4 bytes (all pointers are 4 bytes)
movl src, (%esp) # moves src on the set up esp location (top of stack frame)
What this means is that whenever we push a value onto the stack, we decrease the
address held in the stack pointer %esp by 4 bytes. %esp always points to the 'top of
the stack'; depending on which way you draw memory, this can look like the 'bottom
of the stack'. Either way, the stack always grows towards lower memory addresses
(towards address 0).
Write an assembly function that is callable from C that emulates the following C code.
Final Review
Be able to convert hex to binary and vice versa.
Binary to Hex:
Group the bits into 4 digits, starting from the right
Convert each group to decimal
For each decimal group, represent it in its hex form
0 - 9 stay as digits, and 10 - 15 become a - f respectively
If (MSB == 1): to read the 2's complement number, negate the bits and add 1.
Then read the result as a negative (-) value.
When we add 1 in binary, one way of doing it is to simply convert the binary to
decimal first and add 1 that way.
If (MSB == 0), then its signed, unsigned, and 2's complement interpretations will all be
the same.
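The grouping and negate-and-add-1 rules above can be sketched in C as follows. The function names are hypothetical, and the 2's complement reader is limited to 8-bit patterns to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hex digit for one group of 4 bits (value 0..15). */
static char nibble_to_hex(unsigned group)
{
    assert(group < 16);
    return "0123456789abcdef"[group];
}

/* Write the 8 hex digits of a 32-bit value into out (needs 9 bytes). */
static void to_hex32(uint32_t bits, char out[9])
{
    for (int i = 0; i < 8; i++) {
        /* take groups of 4 bits, leftmost group first */
        unsigned group = (bits >> (28 - 4 * i)) & 0xFu;
        out[i] = nibble_to_hex(group);
    }
    out[8] = '\0';
}

/* Read an 8-bit pattern as a 2's complement number. */
static int read_twos_complement8(uint8_t bits)
{
    if (bits & 0x80u)                        /* MSB == 1: negative */
        return -(int)(uint8_t)(~bits + 1u);  /* negate the bits, add 1 */
    return bits;                             /* MSB == 0: same as unsigned */
}
```

For example, 1111 0100 has MSB == 1; negating the bits gives 0000 1011, adding 1 gives 0000 1100 = 12, so the pattern reads as -12.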
<31 sign><30----exp-----23><22------------mantissa--------------0>
1. Convert the number to binary, handling the part before and the part after the
decimal point independently.
2. Convert the binary expression into its normalized form, that is:
i. 1.something
ii. Multiplied by 2^exp (exp is how far the point was shifted)
3. Exp += 127
4. The part after your decimal spot will be your mantissa.
5. Sign = 0, then +, otherwise -.
6. Convert exponent to binary
7. Pad your mantissa on the right with 0's, if need be, so that it fills all 23 bits (bits 22 down to 0).
i. The amount of zeros necessary to fill in == 23 - numBits(mantissa)
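To check a hand conversion, the three fields can be pulled out of a real float in C. This sketch assumes that float is the 32-bit single-precision format laid out as shown above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision float. */
struct float_fields {
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 30..23, stored with the +127 bias */
    uint32_t mantissa;  /* bits 22..0, the part after "1." */
};

static struct float_fields split_float(float f)
{
    uint32_t bits;
    struct float_fields out;
    assert(sizeof bits == sizeof f);   /* assumes a 32-bit float */
    memcpy(&bits, &f, sizeof bits);    /* copy the raw bit pattern */
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

Worked by hand, 5.75 is 101.11 in binary, which is 1.0111 x 2^2. The stored exponent is 2 + 127 = 129, and the mantissa is 0111 followed by 19 zeros (0x380000), which is what split_float(5.75f) reports.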
// Defined globally:
# eax is i
# ebx is j
# ecx = i, edx = j
movl a(%ebp), %ebx #ebx = a
movl (%ebx, %ecx, wordsize), %ebx #ebx = a[i]
movl (%ebx, %edx, wordsize), %ebx #ebx = a[i][j]
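The two indexed loads correspond to two pointer dereferences in C. The sketch below assumes a is an array of row pointers (an int **), which is what the assembly above implies; element_at is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* a is an array of row pointers (int **), as in the assembly above. */
static int element_at(int **a, size_t i, size_t j)
{
    assert(a != NULL);
    int *row = a[i];   /* movl (%ebx, %ecx, wordsize), %ebx */
    return row[j];     /* movl (%ebx, %edx, wordsize), %ebx */
}
```

Note that this is not how a true two-dimensional array such as int a[2][3] is accessed: that is one contiguous block, so a single address computation replaces the first load.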
Sets up the interrupt vector table, and provides the code for the interrupt service
routines.
The CPU goes through the process of fetch/decode/execute/store/check for interrupt. The
'check for interrupt' is triggered by the I/O device, which sets the interrupt
line (between the CPU and I/O device) to true. Then the CPU will activate
'interrupt acknowledged' in order to service the I/O device. The device itself
replies with its identification number (IRQ). Execution then goes to the interrupt
service routine (ISR) at this address, which is calculated by:
The interrupt descriptor table contains the list of I/O devices, and their
identifications.
What does the hardware do?
Provides the interrupt capability to the software. Hardware interrupts by the user can
happen at any time, during any executing instruction.
registers of the current program, restoring the registers for the program to be
run, and then jumping to the first instruction of the program to be run. Sets up
and maintains the process table. Provides the code to do the actual switching of
the processes.
What does the hardware do?
Provides the interrupt capabilities in the CPU, which make it possible to switch to
another task.
+ : The linked variable is both an input and output. E.g. "+b"(bar) means bar is held in ebx
& : Early clobber variable. The value will be written to before all the inputs are used up,
forcing gcc to place it within its own register. This can cost an extra register, since it
cannot be placed in the same register as an input.
a : Used to assign to a specific register -> eax (I prefer just using r)
r : Let gcc choose the register for you
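As a small illustration of the + and & modifiers, here is a sketch using gcc extended asm. The function names are hypothetical, and the plain C fallback is only there so the file still builds on a compiler or CPU without x86 gcc-style asm.

```c
#include <assert.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X86_GCC_ASM 1
#else
#define HAVE_X86_GCC_ASM 0
#endif

/* "+r": bar is both read and written by the asm; gcc picks the register. */
static int add_one(int bar)
{
#if HAVE_X86_GCC_ASM
    __asm__ ("addl $1, %0" : "+r"(bar));
#else
    bar += 1;                  /* plain C fallback on other targets */
#endif
    return bar;
}

/* "=&r": out is written before input %2 is read, so it must not
   share a register with either input (early clobber). */
static int add_pair(int x, int y)
{
    int out;
#if HAVE_X86_GCC_ASM
    __asm__ ("movl %1, %0\n\t"
             "addl %2, %0"
             : "=&r"(out)
             : "r"(x), "r"(y));
#else
    out = x + y;
#endif
    assert(out == x + y);      /* sanity check for small inputs */
    return out;
}
```

Without the & in add_pair, gcc would be allowed to put out and y in the same register, and the movl would overwrite y before the addl reads it.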
Midterm 1 Extras:
Can represent 2^B objects with B number of bits.
Equivalently, you can represent N objects with ceiling(log_2(N)) bits.
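The ceiling(log_2(N)) rule can be computed without floating point by finding the smallest B with 2^B >= N. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>

/* Smallest B such that 2^B >= n, i.e. ceiling(log2(n)). */
static unsigned bits_needed(unsigned long n)
{
    assert(n >= 1);
    unsigned bits = 0;
    /* the width check keeps the shift in range for very large n */
    while (bits < 8 * sizeof n && (1ul << bits) < n)
        bits++;
    return bits;
}
```

For example, 5 objects need 3 bits (2^2 = 4 is too few, 2^3 = 8 is enough), and 8 objects also need exactly 3 bits.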
A binary string cannot mean anything without the context of the problem. In assembly,
that context comes from the instructions used on it, e.g. jg, jl, etc. treat the
values as signed.
CPU Cycle:
Fetch, decode, execute, store.
OS is used for:
memory control/allocation, file systems, system calls, and program communication
with the OS.
I/O device management
Process management (starting programs, and time sharing)
- opposite to above
- 12 >> 1 = 12 / 2^1
- does not wrap
- sign preserving
- -12 >> 2 = -12 / 2^2 = -3
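The sign-preserving shift can be sketched in C as follows. In C, >> on a negative int is implementation-defined, so this hypothetical helper only ever shifts non-negative values and is meant to mirror what an arithmetic right shift does.

```c
#include <assert.h>

/* Sign-preserving right shift, written so it does not depend on how
   the compiler treats >> on negative values. */
static int shift_right_arith(int x, unsigned n)
{
    assert(n < 8 * sizeof x);
    if (x < 0)
        return ~(~x >> n);   /* ~x is non-negative, so >> is well defined */
    return x >> n;
}
```

One detail worth knowing: when the value does not divide evenly, the shift rounds down, so -13 shifted right by 2 gives -4, while the C expression -13 / 4 rounds toward zero and gives -3.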
Differing hardware. Remember that different CPUs might have completely different
instruction sets — for example, what was an add instruction on an Intel CPU might
mean something completely different on a MIPS CPU.
Different OS. Nearly all (if not all) programs rely on making system calls to the OS,
but these system calls will differ across different operating systems. For example,
the open() call for opening a file on a Unix system is not the same as the call for
opening a file on a Windows system. If the operating systems involved are different,
the program won’t work on both machines even if the hardware is identical.
Midterm 2 Extras:
Understand how the location of a variable's declaration affects where it will be stored.
I.e., if a variable is local, global, or static, where will space be made for it?
* Code, or .text: where the instructions that make up your program are stored.
* Data, or .data: where both global and static variables are stored.
* Stack: holds local variables, function arguments, and return addresses.
* Heap: dynamic memory (i.e. allocated with new in C++ or malloc in C).
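The following C sketch puts one variable in each region, following the list above. All names are hypothetical and the comments only restate that list.

```c
#include <assert.h>
#include <stdlib.h>

int global_count = 0;              /* global: data section */

/* Returns how many times it has been called. */
static int next_id(void)
{
    static int calls = 0;          /* static: data section, keeps its value */
    int local = 1;                 /* local: on the stack, gone on return */
    calls += local;
    global_count = calls;
    return calls;
}

/* Returns a heap-allocated copy of value; the caller must free() it. */
static int *boxed_int(int value)
{
    int *p = malloc(sizeof *p);    /* heap: lives until free() */
    assert(p != NULL);
    *p = value;
    return p;
}
```

The static variable calls survives between calls to next_id because it is not in the function's stack frame, while local is recreated on every call.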
Final Extras:
What is the use of input section in inline asm?
Before the program is run, the registers within the input section will be assigned to
Least Recently Used: Remove the line in the set that was the least recently
accessed
Makes the most sense but is too expensive to implement exactly for sets that
have more than 2 ways
Most Recently Used: Remove the line in the set that has been most recently
accessed
This would be good for streaming
First in First out: Remove the line in the set that has been there the longest
Easy to implement
Random: randomly remove one
Simulations show that it only performs slightly worse than techniques based on
usage
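The usage-based policies can be sketched by keeping a last-access time for every line in a set. This is an illustrative model only: the 4-way set, the timestamp field, and the function names are assumptions, not how real hardware stores this information.

```c
#include <assert.h>

enum { WAYS = 4 };   /* assumed associativity for this sketch */

/* One set: for each line, the time it was last accessed. */
struct cache_set {
    unsigned long last_used[WAYS];
};

/* Least Recently Used: the line with the oldest access time. */
static int lru_victim(const struct cache_set *set)
{
    assert(set != 0);
    int victim = 0;
    for (int way = 1; way < WAYS; way++)
        if (set->last_used[way] < set->last_used[victim])
            victim = way;
    return victim;
}

/* Most Recently Used: the line with the newest access time. */
static int mru_victim(const struct cache_set *set)
{
    assert(set != 0);
    int victim = 0;
    for (int way = 1; way < WAYS; way++)
        if (set->last_used[way] > set->last_used[victim])
            victim = way;
    return victim;
}
```

Keeping and comparing a timestamp for every line on every access is the cost that makes exact LRU expensive as the number of ways grows.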
What is the memory hierarchy?
Sorted from fastest to slowest: CPU (registers), cache (exploits locality), memory (RAM),
disk (persistent storage). Toward the top of the hierarchy, storage is more expensive,
more power hungry, and smaller (less physical space to store bytes).
Advantages and disadvantages of write through or write back?
Write through:
Writing to cache always means writing to memory as well.
Advantages:
Simple, and memory always stays up to date.
a small number writes means
Disadvantages:
Slow, since every write has to go out to memory.
Write back:
Advantages:
Not every write to the cache is written to memory, so there is less memory traffic.
Disadvantages:
Needs an extra dirty bit per line
How does reading address A from the cache work?
Locate the set.
Search through the set by looking at the tag bits in every line.
If there is a match, use the line offset to locate the byte we want and return it to the CPU.
If the search fails:
Remove a line from the set, chosen by your replacement policy.
If dirty: write out to memory
Copy line containing address A into cache
Clear dirty bit and set the valid bit
Use the line offset to determine which byte in the line we want and return it to
the CPU
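Locating the set, the tag, and the line offset all come from splitting the address. The sketch below assumes a cache with 16-byte lines and 8 sets, and the common mapping set = (address / line size) mod number of sets; the geometry and the names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry for this sketch: 16-byte lines, 8 sets. */
enum { LINE_SIZE = 16, NUM_SETS = 8 };

struct cache_addr {
    uint32_t tag;     /* compared against the tag bits of each line */
    uint32_t set;     /* which set to search */
    uint32_t offset;  /* which byte within the line */
};

static struct cache_addr split_address(uint32_t addr)
{
    struct cache_addr out;
    uint32_t line_number = addr / LINE_SIZE;
    out.offset = addr % LINE_SIZE;
    out.set    = line_number % NUM_SETS;
    out.tag    = line_number / NUM_SETS;
    /* the three fields together must rebuild the original address */
    assert((out.tag * NUM_SETS + out.set) * LINE_SIZE + out.offset == addr);
    return out;
}
```

For address 0x1234 (4660), the line number is 291, so the offset is 4, the set is 291 mod 8 = 3, and the tag is 36.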
Writing to a Cache
Given that we want to write to address A
First use mapping equation to locate the set that could contain A
Search through this set comparing the tag bits of each line with the tag bits of the
address
If there is a match with the tag bits then this line contains the value we are
searching for
Use the line offset to determine which byte in the line we want to write to and
do the write
If there is no match then that means the line is not in the cache
First choose a line in the set to remove based on the replacement policy
If that line is dirty write it out to memory
Copy the line that contains address A into the cache
Do the write to the cache
Set the valid bit
If using write back set the dirty bit to 1
If using write through write the value to memory as well
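The write steps above can be sketched with a deliberately tiny model: a cache that holds a single line, which is enough to see how the two policies differ in memory traffic. The struct, the 16-byte line size, and the mem_writes counter are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { LINE_SIZE = 16 };   /* assumed line size for this sketch */

/* A cache holding a single line, enough to compare the two policies. */
struct tiny_cache {
    bool     valid;
    bool     dirty;
    uint32_t line_number;   /* which memory line is cached */
    unsigned mem_writes;    /* how many times we wrote out to memory */
};

static void cache_write(struct tiny_cache *c, uint32_t addr, bool write_back)
{
    assert(c != NULL);
    uint32_t line = addr / LINE_SIZE;
    if (!c->valid || c->line_number != line) {   /* miss */
        if (c->valid && c->dirty)
            c->mem_writes++;        /* evicted line is dirty: write it out */
        c->line_number = line;      /* copy the new line into the cache */
        c->valid = true;
        c->dirty = false;
    }
    /* do the write to the cache, then apply the policy */
    if (write_back)
        c->dirty = true;            /* memory is updated later, on eviction */
    else
        c->mem_writes++;            /* write through: update memory now */
}
```

Writing to addresses 0, 4, 8, 64, 68 in that order costs five memory writes with write through, but only one with write back (when the dirty first line is evicted), with the second line still dirty in the cache at the end.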
Types of Misses:
Compulsory Miss:
First time accessing a line will be a miss
Increase line size to reduce this
Capacity miss:
The program's working set is larger than the cache, so lines are repeatedly
evicted to memory and later brought back into the cache.
Conflict miss:
Several lines in memory map to the same set and keep evicting each other.
Increase number of ways per set to reduce this.