Crash Course in C and Assembly: Zeljko Vrba
(v.2006-08-04)
Željko Vrba
These notes are intended to serve as coding guidelines for the Operating Systems course at the
University of Oslo. The text focuses on subjects that students have most trouble with while coding
their solutions.
1 Introduction
2 Variables
3 Calling convention
4 Pointers
5 Bit-fields
6 Arrays
7 Ring buffer
8 Linked lists
9 Bit-vector
10 Inline assembler
11 Memory operands
12 Literature

D. E. Knuth: “Premature optimization is the root of all evil (or at least most of it) in
programming.” You are graded for correctness, NOT performance! Leave all optimizations for
the competition.

• Do not do more than is required by the assignment. Always try to find out the minimum
  needed to correctly accomplish the assignment task. Less code, less debugging. Code for
  fun only after you have completely implemented the assignment.
... subset of standard C headers. These are <float.h>, <limits.h>, <stdarg.h>, <stddef.h>,
and (available only in C99) <stdint.h>.

The <stddef.h> header defines the following constants and macros.

1. The null pointer constant NULL,

2. the size_t type, which is an unsigned integer type large enough to contain any size on
   the given architecture (usually 32 bits on 32-bit architectures), and

3. the offsetof macro, which is used to calculate the offset of a particular member in a
   structure. Exercise: Study in detail what the offsetof macro does, and implement your own.

2 Variables

2.1 Static variables. The static keyword is used to declare function-scope variables whose
value persists across calls. [1] [2]

2.2 Automatic variables. Function-scope variables (also sometimes called local) declared
without the static keyword are called automatic variables.

Example: static vs. automatic. In function f, the local variable x is automatic, and the
variable y is static.

    void f(void)
    {
        int x = 1;
        static int y = 3;
        /* ... */
    }

x is initialized on each entry to f(), while y is initialized only once, before the program
starts up. y is accessible only within the function f, and all changes to it persist across
calls to f.

2.3 Storage allocation.

• The storage for automatic variables is automatically allocated and initialized on each
  function entry, and deallocated on function exit. Automatic variables are usually stored
  on the processor’s stack. [3]

• The storage for static variables is allocated only once, at compilation time. They are
  also initialized only once, before the main function starts to run.

2.4 Recursive functions. Each invocation of a recursive function gets a freshly initialized
copy of its automatic variables. Note that all recursive invocations of the function share
the same (only!) copy of static variables.

2.5 Lifetime. Static variables exist as long as the program is running. Automatic variables
exist only as long as the function they are defined in has not returned. The latter point can
be a source of bugs that are nearly impossible to find, which arise when a function returns a
pointer to an automatic variable.

Example: unsafe function. When the unsafe function returns, the x variable is deallocated,
so the caller receives a pointer pointing to invalid data. Exercise: Why is the call to g
safe?
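The difference between automatic and static storage, including the behaviour under recursion, can be observed directly. The following is a minimal sketch and not part of the original notes; the function names count_calls and depth_reached are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: returns how many times it has been called so far. */
static int count_calls(void)
{
    int fresh = 0;          /* automatic: re-initialized on every entry */
    static int calls = 0;   /* static: initialized once, keeps its value */

    ++fresh;
    ++calls;
    assert(fresh == 1);     /* earlier calls left no trace in fresh */
    return calls;
}

/* Hypothetical recursive helper: every invocation has its own copy of
 * mine, while all invocations share the single static variable shared. */
static int depth_reached(int n)
{
    static int shared = 0;  /* one copy for all invocations */
    int mine = n;           /* one copy per invocation */

    ++shared;
    if (n > 0)
        depth_reached(n - 1);
    assert(mine == n);      /* the recursive call did not disturb our copy */
    return shared;
}
```

Calling count_calls three times yields 1, 2 and 3, because calls persists across calls. A first call depth_reached(3) returns 4, one increment of shared for each of the four nested invocations.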
[1] Variables can have function- or file-scope. This usage (the only one described here) affects the variable’s storage class.
[2] Another use of static is to influence the symbol linkage.
[3] The C standard does not mention the stack explicitly. It might not even exist on certain processor architectures. The standard just specifies the semantics of automatic variables.
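The listing for the unsafe function is not part of this excerpt. A minimal sketch of the pattern the text warns about, together with one safe alternative, might look as follows. The body of unsafe is a guess at the pattern rather than the author's listing, and the names store_answer and use_caller_storage are hypothetical.

```c
#include <assert.h>

/*
 * UNSAFE, shown only as a comment so that it cannot be called by accident:
 *
 *     int *unsafe(void)
 *     {
 *         int x = 12;
 *         return &x;     x stops existing as soon as unsafe returns
 *     }
 */

/* Safe alternative (hypothetical helper): the caller supplies the storage,
 * so the pointer handed back refers to an object that is still alive. */
static int *store_answer(int *out)
{
    *out = 12;
    return out;
}

/* The automatic variable local outlives the call to store_answer, because
 * it belongs to this function, which has not returned yet. */
static int use_caller_storage(void)
{
    int local = 0;
    int *p = store_answer(&local);

    assert(p == &local);
    return *p;
}
```

The design point is the direction in which the pointer travels: passing the address of an automatic variable down into a callee is fine, while handing it back up to a caller is not.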
Example: safe function. The code listed for the safe function is valid since the variable x
is static. This is a way to make a local static variable visible outside of the function.

    int *safe(void)
    {
        static int x = 12;
        return &x;
    }

3 Calling convention

This term refers to the semantics and mechanism of passing arguments to and returning values
from functions. [4]

3.1 Function-call semantics. In C there are two basic rules:

1. All arguments are passed by value. This means that a copy of the argument is pushed onto
   the stack. Any changes made to arguments within the function will not be visible to its
   caller. Care should be taken to distinguish between changing the pointer and changing the
   value pointed to.

2. An array decays into a pointer to its first element instead of being copied. [5]

Example: argument-passing. Study the following code and explanation carefully, for it is
essential to understanding the C language.

    void f1(int x, int *y)
    {
        ++x; ++y; ++*y;
    }

    void f2(int **z)
    {
        ++*z;
    }

    void g(void)
    {
        int a[3] = { 1, 2, 3 };
        int x = 10, y = 11, *z = a+1;

        f1(x, a);
        f1(x, z);
        f2(&z);
        --*z;
    }

1. After the first call to f1 we have x == 10 and a[1] == 3. Notice how the array has
   effectively decayed into a pointer. Had the function been declared as void f1(int x,
   int y[]), the effect would have been the same. These two function declarations are
   equivalent.

2. After the second call to f1 we have x == 10 and a[2] == 4, while z == a+1, i.e. it still
   points to the second element of array a. Notice how an element of the array is indirectly
   changed through the pointer, while the value of the pointer itself is unchanged on return.

3. After the call to f2, z is changed and equals a+2. Therefore, after the --*z statement is
   executed, we have a[2] == 3.

3.2 Function-call mechanism. Consider a function with the prototype int f(int x, int *y)
having two integer local variables a and b. Suppose that it is called as x = f(z, &c). Once
the frame pointer is set up, arguments and local variables are at fixed offsets with respect
to the EBP register. Figure 1 shows the stack layout only for the default case. The layout
can be different, depending on the compiler options; -fomit-frame-pointer is particularly
often used as it frees the EBP register for other uses. This option makes the code faster,
but also harder to debug.
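The three numbered claims made about g in the argument-passing example can be checked mechanically. The sketch below repeats f1 and f2 unchanged and adds g_checked, a hypothetical variant of g that asserts each claim in turn and returns the final value of a[2]. The unused variable y from the original listing is left out.

```c
#include <assert.h>

static void f1(int x, int *y)
{
    ++x;    /* changes only the local copy of x */
    ++y;    /* changes only the local copy of the pointer */
    ++*y;   /* changes the int that the advanced pointer refers to */
}

static void f2(int **z)
{
    ++*z;   /* advances the caller's pointer itself */
}

/* Hypothetical variant of g that verifies every claim from the text. */
static int g_checked(void)
{
    int a[3] = { 1, 2, 3 };
    int x = 10, *z = a + 1;

    f1(x, a);                           /* claim 1 */
    assert(x == 10 && a[1] == 3);

    f1(x, z);                           /* claim 2 */
    assert(x == 10 && a[2] == 4 && z == a + 1);

    f2(&z);                             /* claim 3 */
    assert(z == a + 2);

    --*z;
    assert(a[2] == 3);
    return a[2];
}
```

Note that f2 needs a pointer to the pointer: only by receiving the address of z can it change where z points in the caller.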
[4] The reader should distinguish functions from preprocessor macros, which don’t really pass arguments but perform simple textual substitution.
[5] This is somewhat imprecise when multi-dimensional arrays are considered.
        ···
        &c
        z       EBP+8
        ret                 <- ESP after call
        ebp     EBP+0
        a       EBP-4
        b                   <- ESP after alloc

        (lower addresses are toward the bottom)

Figure 1 Stack diagram after executing the function prologue. Each “cell” is exactly 4 bytes
(32 bits).

The caller pushes arguments in right to left order, and must clean them up after the function
returns. The return address is automatically pushed by the call instruction and popped by
the ret instruction. Also, the called function must not modify certain registers.

... guarantees that sizeof(char) == sizeof(unsigned char) == 1, so an expression like
16*sizeof(char) unnecessarily clutters the code as it always equals 16.

4.4 Integer types. When you must resort to conversion between pointers and integers, always
use unsigned integer types. Otherwise, strange bugs can happen due to arithmetic sign
extension. The recommended type to use is uintptr_t, defined in <stdint.h> when available.
[7] Otherwise, the size_t type should be used. Both types are unsigned.

4.5 Initialization to fixed address. Sometimes, a pointer has to be initialized to a specific
memory location. This is often the case when a program needs access to memory-mapped hardware
registers.
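Sections 4.4 and 4.5 can be illustrated together. This is a minimal sketch under stated assumptions: the address 0xB8000 (the VGA text buffer on PCs) is chosen purely for illustration, the names VGA_TEXT_BASE, mmio16, widen_signed and widen_unsigned are hypothetical, and the pointer is only constructed, never dereferenced, since the address is not valid inside an ordinary user program.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example address: the PC VGA text buffer. */
#define VGA_TEXT_BASE ((uintptr_t)0xB8000u)

/* Build a pointer to a 16-bit memory-mapped location at a fixed address.
 * volatile tells the compiler that every access has a side effect and
 * must not be optimized away or reordered with other volatile accesses. */
static volatile uint16_t *mmio16(uintptr_t addr)
{
    return (volatile uint16_t *)addr;
}

/* Widening a signed value copies its sign bit into all the new high bits. */
static uint64_t widen_signed(int32_t v)
{
    return (uint64_t)(int64_t)v;
}

/* Widening an unsigned value fills the new high bits with zeros. */
static uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

The two widening functions receive the same 32-bit pattern 0x80000000 when called with INT32_MIN and 0x80000000u, yet the signed one produces 0xFFFFFFFF80000000. This is the kind of surprise that unsigned types avoid when an address is held in an integer.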
... multiple of 2^n. If the pointer is already aligned, it is left unchanged.

    void *ptr_align_down(void *p, unsigned n)
    {
        uintptr_t pi = (uintptr_t)p;
        /* shift a uintptr_t, not an int, so that a large n cannot overflow */
        uintptr_t mask = ((uintptr_t)1 << n) - 1;

        return (void*)(pi & ~mask);
    }

5 Bit-fields

Bit-fields are a feature of C which seems quite convenient for interfacing to hardware.
Their main disadvantage is that they cannot be reliably used to write portable code, or to
access hardware.

Example: pitfalls of bit-fields. In order to access individual fields within an x86 page
table entry, one may be tempted to declare a structure similar to the following:

    struct pte {
        unsigned pba:20;
        unsigned avl:3;
        /* etc... */
    };

This code might not work, depending on the compiler. Namely, the C standard does not mandate
how the bits within a bit-field are allocated. The pba field might get assigned to the
highest 20 bits, or to the lowest 20 bits of an unsigned int. The latter case does not
conform to the PTE format expected by the CPU. When a strict bit layout and cross-platform
compatibility are needed, it is recommended not to use this feature and to manipulate the
bits within a word manually.

6 Arrays

Arrays store their elements consecutively in memory. An array holding N elements of type T is
declared as T arr[N]. Array indices start at 0 and extend up to and including N-1. Accessing
an array outside of its bounds is an unchecked error, and more often than not it leads to
problems that are extremely difficult to debug.

6.1 Arrays and pointers. In most expressions, the array name itself acts as a pointer to the
first element of the array. [9] Pointers themselves can be indexed. In fact, the indexing
operator is just syntactic sugar, and the expression arr[i] is equivalent to *(arr+i).
However, the code in function f1 is invalid because the pointer p is not initialized to
valid memory.

[9] The array is said to decay into a pointer.

    void f1(void)
    {
        unsigned int *p;
        p[3] = 0;
    }

6.2 Automatic arrays. Care has to be taken when declaring arrays within a function without
the static storage specifier, as in function f2.

    void f2(void)
    {
        unsigned int arr[512];

        /* some code */
    }

Such a declaration uses stack space that is automatically allocated on function entry and
deallocated on function exit. In this example, it amounts to 512 * sizeof(int) bytes, or 2 kB
given the usual size of 4 bytes for int. When the available stack space is very limited, it
is easily overflowed if large automatic arrays are used. There are no checks, and in the case
of overflow some other data will be overwritten. Again, this leads to problems that are very
hard to find and debug. Exercise: design an efficient way to detect stack overflows.

7 Ring buffer

This is a data structure that supports storage and retrieval of bytes in FIFO manner. The
total amount of data that can be stored is predetermined. The implementation presented here
uses a circular buffer.

7.1 Types. The ringbuf_t structure includes the basic fields needed for a functional ring
buffer. The ring buffer is empty when rb->head == rb->tail. Therefore, one slot must always
stay unused so that a full buffer can be told apart from an empty one, and the ring buffer
can hold at most MAX_SIZE - 1 bytes.

    struct ringbuf_t {
        unsigned int head, tail;
        unsigned char buffer[MAX_SIZE];
    };

Elements are consumed from the head, and added to the tail of the buffer.

7.2 Storing/retrieving bytes. rb_getchar reads a single byte from the ring buffer rb. It
returns -1 if the ring buffer is empty, otherwise an integer in the range 0-255 is returned.
rb_putchar stores a single byte b in the ring buffer rb. It returns -1 if the ring buffer is
full, and 0 otherwise.

    int rb_getchar(struct ringbuf_t *rb)
    {
        if(rb->head == rb->tail)
            return -1;
        rb->head = (rb->head+1) % MAX_SIZE;
        return rb->buffer[rb->head];
    }

    int rb_putchar(
        struct ringbuf_t *rb, unsigned char b);

Exercise: Implement the rb_putchar function according to the given specification and
prototype. Note that this is an “inverse” of rb_getchar, so use that function as a hint.

7.3 Larger objects. Larger objects can be stored and retrieved with the following functions:

    int rb_write(
        struct ringbuf_t *rb, void *obj, size_t len);
    int rb_read(
        struct ringbuf_t *rb, void *obj, size_t len);

where obj points to the object and len is the length of the buffer. The rb_write function
tries to write len bytes in the ring buffer; it returns 0 on success and -1 if there is not
enough space. The rb_read function tries to read up to len bytes from the ring buffer and
returns the actual number of bytes read, which can be smaller than len. It should return -1
if the ring buffer is empty. Exercise: implement these functions using rb_getchar and
rb_putchar.

8 Linked lists

There are many variants of linked lists. It is most convenient to use a circular, doubly-linked
list with a dummy node. The dummy node doesn’t contain any useful data; its only purpose is
to prevent the list from ever becoming empty. This greatly simplifies the code since it
eliminates many special cases in insertion and removal code. The macros are given below. [10]

    #define LINK_NEXT(node, newnode) \
        do { \
            (newnode)->prev = (node); \
            (newnode)->next = (node)->next; \
            (node)->next->prev = (newnode); \
            (node)->next = (newnode); \
        } while(0)

    #define LINK_PREV(node, newnode) \
        do { \
            (newnode)->next = (node); \
            (newnode)->prev = (node)->prev; \
            (node)->prev->next = (newnode); \
            (node)->prev = (newnode); \
        } while(0)

    #define QUE_IS_EMPTY(head) \
        ((head) == (head)->next)

    #define QUE_INIT(head, dummy) \
        do { \
            (head) = (dummy); \
            (head)->next = (head)->prev = (head); \
        } while(0)

An advantage of using macros is that they are untyped: they can be used on any structure
which defines prev and next fields as pointers.

8.1 Empty list. The list is empty when it contains only the dummy node. This situation is
depicted in Figure 2, and justifies the implementation of the QUE_IS_EMPTY and QUE_INIT
macros.

Figure 2 Empty list (a single node, 0, whose n and p pointers both point back to itself)

Example: populating a list. The queue of tasks can be represented by a static array of task
structures:

    struct task {
        /* task data */
        struct task *next, *prev;
    } tasks[16];

Note the addition of link fields in the structure. The following sequence of operations
produces the list shown in Figure 3:

    /* initialize dummy node */
    struct task *head;
    QUE_INIT(head, &tasks[0]);

    /* insert some nodes */
    LINK_PREV(head, &tasks[1]);
    LINK_PREV(head, &tasks[2]);
    LINK_NEXT(head, &tasks[3]);

Figure 3 Populated list (a ring of four nodes; following the n pointers from the dummy node
visits 0, 3, 1, 2 and then returns to 0)

8.2 Removal from a list. To remove a node, invoke the LINK_REMOVE macro on it. The node is
not deallocated; it is just removed from the list. Exercise: what happens when you remove
the dummy node when the list is not empty? And when it is empty?
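The definition of LINK_REMOVE does not appear in this excerpt. A minimal sketch, assuming that it simply unlinks the node from its two neighbours and leaves the node itself untouched, could look like this. The int id field, que_length and remove_demo are hypothetical additions used only to exercise the macros.

```c
#include <assert.h>

struct task {
    int id;                         /* stand-in for the task data */
    struct task *next, *prev;
};

#define QUE_INIT(head, dummy) \
    do { \
        (head) = (dummy); \
        (head)->next = (head)->prev = (head); \
    } while (0)

#define QUE_IS_EMPTY(head) ((head) == (head)->next)

#define LINK_PREV(node, newnode) \
    do { \
        (newnode)->next = (node); \
        (newnode)->prev = (node)->prev; \
        (node)->prev->next = (newnode); \
        (node)->prev = (newnode); \
    } while (0)

/* Assumed definition: bypass node in both directions. The node keeps its
 * own next and prev values and is not deallocated. */
#define LINK_REMOVE(node) \
    do { \
        (node)->prev->next = (node)->next; \
        (node)->next->prev = (node)->prev; \
    } while (0)

/* Count the data nodes by walking from head->next back to the dummy. */
static int que_length(struct task *head)
{
    int n = 0;
    struct task *t;

    for (t = head->next; t != head; t = t->next)
        ++n;
    return n;
}

/* Build a list with two data nodes, remove them one by one, and encode
 * the two lengths observed as a two-digit number. */
static int remove_demo(void)
{
    static struct task tasks[3];
    struct task *head;
    int before, middle;

    QUE_INIT(head, &tasks[0]);
    LINK_PREV(head, &tasks[1]);
    LINK_PREV(head, &tasks[2]);
    before = que_length(head);

    LINK_REMOVE(&tasks[1]);
    middle = que_length(head);
    assert(head->next == &tasks[2] && tasks[2].prev == head);

    LINK_REMOVE(&tasks[2]);
    assert(QUE_IS_EMPTY(head));
    return before * 10 + middle;
}
```

Because of the dummy node, LINK_REMOVE needs no special case for the first or last data node: every data node always has two valid neighbours.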
[10] The do { ... } while(0) idiom enables macros to be used (almost) as functions.
8.3 Traversing a list. The dummy node is not used to store information; otherwise it
wouldn’t be possible to distinguish between an empty list and a list with one data element.
Therefore the traversal starts from head->next, and is done when the dummy node is
encountered again. The loop should not be executed at all if the list is empty (i.e. contains
only the dummy node). This code illustrates a possible way to accomplish the task:

[...]

These macros modify their first argument in place. The key to understanding them is to
notice that the BITMASK(n) macro evaluates to an unsigned integer having just the n-th bit
set.

9.2 Addressing bits in a bit vector. The goal is to make macros TESTBIT(v, n), etc., which
work for the general case, where v is an array of integers, and n is a bit index within the
bounds of the array. n is allowed to be larger than the number of bits in an integer.
Exercise: Using the macros TESTBITw, etc., code the macros which work for the general case.
Hint: you will need to use the / and % operators.

[...]

    unsigned clearbit(unsigned n, int b)
    {
        return n & ~BITMASK(b);
    }
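The clearbit function relies on BITMASK, whose definition falls outside this excerpt. A minimal sketch, assuming that BITMASK(n) is simply an unsigned 1 shifted left by n positions, is given below. setbit and testbit are hypothetical companions written in the same style, and all three operate on a single word only, so the general-case exercise is left untouched.

```c
#include <assert.h>

/* Assumed definition: an unsigned integer with only bit n set.
 * n must be smaller than the number of bits in unsigned int. */
#define BITMASK(n) (1u << (n))

/* Return n with bit b set. */
static unsigned setbit(unsigned n, int b)
{
    return n | BITMASK(b);
}

/* Return n with bit b cleared: AND with a mask that has every bit
 * set except bit b. */
static unsigned clearbit(unsigned n, int b)
{
    return n & ~BITMASK(b);
}

/* Return 1 if bit b of n is set, and 0 otherwise. */
static int testbit(unsigned n, int b)
{
    return (n & BITMASK(b)) != 0;
}
```

Unlike macros that modify their first argument in place, these functions return the new value, so the caller writes n = clearbit(n, b).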
[11] Here we quietly assume that unsigned char has 8 bits. This need not be the case; the actual size is given by the CHAR_BIT constant defined in <limits.h>.
[12] char is also an integer type.
11 Memory operands.

Almost all x86 instructions accept memory operands. Exploiting these instructions can make
the code much cleaner and easier to read, as illustrated in the following code snippets.

11.1 Read-modify-write instructions. The following problem arises in the context-switching
code in one of the assignments. The task is to exchange the %esp register with a memory
location named stored_esp. No other registers may be changed. The %esp register is chosen on
purpose so that the stack itself can’t be used as temporary storage.

The following code fragment uses only memory load and store instructions.

    movl %eax, temp_eax
    movl %esp, %eax
    movl stored_esp, %esp
    movl %eax, stored_esp
    movl temp_eax, %eax

[...] instructions on an SMP system, the lock prefix must be used; for example lock; xchgl
%esp, stored_esp. Another consideration is the CLI/STI instructions. They disable/enable
interrupts only on the CPU which executes them; they have no effect on other CPUs. Thus,
they cannot be used to implement critical sections on SMP systems.

11.3 Saving memory operands to the stack. The following is a possible solution to save and
restore the contents of memory location var_a to the stack:

    /* save var_a on the stack */
    movl var_a, %eax
    pushl %eax

    /* restore var_a from the stack */
    popl %eax
    movl %eax, var_a

The simpler and recommended way is to do it directly:
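The direct form itself is not part of this excerpt. Presumably it pushes and pops the memory operand without going through %eax. A minimal sketch under that assumption follows, written as GCC-style inline assembly inside C so that it can be compiled and checked. The inline assembly is only used on 32-bit x86; on every other target the function falls back to plain C that merely mimics the effect. The function name save_and_restore is hypothetical.

```c
#include <assert.h>

static unsigned var_a = 42;     /* stands in for the memory location var_a */

/* Save var_a on the stack, overwrite it, then restore it.
 * Returns the value of var_a after the restore. */
static unsigned save_and_restore(void)
{
#if defined(__GNUC__) && defined(__i386__)
    __asm__ volatile(
        "pushl %0\n\t"          /* save var_a directly, no register needed */
        "movl $0, %0\n\t"       /* overwrite var_a to prove the restore works */
        "popl %0"               /* restore var_a directly */
        : "+m"(var_a));
#else
    unsigned saved = var_a;     /* plain-C stand-in on other targets */
    var_a = 0;
    var_a = saved;
#endif
    return var_a;
}
```

Compared with the four-instruction version, the direct form needs two instructions and leaves %eax untouched, which matters in code such as a context switch where every register is live.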
12 Literature

1. C Frequently Asked Questions.
   http://www.c-faq.com

2. Brian W. Kernighan and Dennis M. Ritchie: The C programming language. Prentice Hall, Inc.,
   1988. ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
   http://cm.bell-labs.com/cm/cs/cbook

3. Peter van der Linden: Expert C Programming, Deep C Secrets. Pearson Education, 1994.
   ISBN 0131774298.
   http://www.taclug.org/booklist/development/C/Deep_C_Secrets.html

4. John R. Levine: Linkers and Loaders. Morgan Kaufmann, 1999. ISBN 1-55860-496-0.
   http://www.iecc.com/linker