Advanced C - Workshop: by Madheswaran D
By
Madheswaran D
Version: <1.3>
Date: < 23/10/2006>
Introduction
Pre-requisites
• Basic C knowledge (Not covered here)
• Familiarity with the Unix environment (basic Unix commands, working knowledge of the vi or Emacs editor). (Not covered here)
• Alignment and padding issues.
• Good understanding of pointers.
• Good understanding of bitwise operators.
Advanced Topics
• Handy expressions involving bitwise operations
• Stack frames: what happens during function calls
• Variable arguments: how are they implemented?
• Dynamic memory allocation: a sample design of malloc/free
Wipro confidential 2
Alignment & Padding
For today's session, assume the following sizes and study the program below.
• char: 1 byte
• int: 4 bytes
• short: 2 bytes
#include <stdio.h>

typedef struct
{
    char name[30];
    int empno;
    int salary;
} EmpRec, *EmpRecPtr;

int main(void)
{
    int x = 1;
    char y = 2;
    int z = 3;
    EmpRec abc;
    EmpRecPtr empr = &abc;

    printf("Sizes: int=%zu, char=%zu, EmpRec=%zu, EmpRecPtr=%zu\n",
           sizeof(x), sizeof(y), sizeof(abc), sizeof(empr));
    printf("Address: &x=%p, &y=%p, &z=%p, &abc=%p, &empr=%p\n",
           (void *)&x, (void *)&y, (void *)&z, (void *)&abc, (void *)&empr);
    printf("Address: &abc.name=%p, &abc.empno=%p, &abc.salary=%p\n",
           (void *)&abc.name, (void *)&abc.empno, (void *)&abc.salary);
    return 0;
}
Output:
Sizes: int=4, char=1, EmpRec=40, EmpRecPtr=4
Address: &x=0xbffffb3c, &y=0xbffffb3b, &z=0xbffffb34, &abc=0xbffffb00, &empr=0xbffffafc
Address: &abc.name=0xbffffb00, &abc.empno=0xbffffb20, &abc.salary=0xbffffb24
Memory layout (addresses are the last three hex digits of 0xbffffxxx; p = padding byte; lower memory is at the top and the stack grows toward lower memory):
Address      Contents
b00 - b1d    abc.name (30 bytes)
b1e - b1f    p p (2 padding bytes, so that empno is 4-byte aligned)
b20 - b23    abc.empno
b24 - b27    abc.salary
b28 - b33    12 padding bytes
b34 - b37    z = 3 (bytes 3 0 0 0)
b38 - b3a    p p p (3 padding bytes)
b3b          y = 2
b3c - b3f    x = 1 (bytes 1 0 0 0)
SIGBUS or SPLIT ACCESS EXAMPLE
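#include <stdio.h>

int main(void)
{
    char x = 10;
    char y = 20;
    int *p = (int *)&y;   /* the address of a char need not be int-aligned */
    printf("&y=%p &p=%p\n", (void *)&y, (void *)&p);
    printf("%d\n", (*p) & 0xff); /* SIGBUS or unaligned split access */
    return 0;
}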
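One common way to avoid the misaligned read is to copy the bytes into a properly aligned variable instead of dereferencing a cast pointer. The following is a minimal sketch, not part of the workshop material; the function name `load_u32_unaligned` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value that starts at any byte address, aligned or not. */
uint32_t load_u32_unaligned(const unsigned char *src)
{
    uint32_t value;

    assert(src != NULL);
    memcpy(&value, src, sizeof value);   /* byte-wise copy: no alignment requirement */
    return value;
}
```

Because `memcpy` works on bytes, the same call is valid whether `src` is aligned or not; the value read still depends on the byte order of the machine.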
Example – Where padding causes trouble
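Register sequence:
Two 32-bit registers (CNTL1, CNTL2, in that order)
Three 8-bit registers (STAT1, STAT2, STAT3, in that order)
One 32-bit register, CNTL3
One 8-bit register, ERR
A bad structure that can cause trouble. With the sizes assumed earlier, the compiler inserts a padding byte after stat3 so that cntl3 is 4-byte aligned, and cntl3 then no longer lines up with the CNTL3 register: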
typedef struct hw_device {
unsigned int cntl1;
unsigned int cntl2;
unsigned char stat1;
unsigned char stat2;
unsigned char stat3;
unsigned int cntl3;
unsigned char err;
} MYDEVICE, *MYDEVICE_PTR;
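The mismatch can be measured with `offsetof`. This is a minimal sketch under the assumption that the registers are packed back to back, so CNTL3 sits 11 bytes from the start of the device; the struct name `hw_device_sketch`, the constant `HW_CNTL3_OFFSET`, and the helper function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hw_device_sketch {
    uint32_t cntl1;
    uint32_t cntl2;
    uint8_t  stat1;
    uint8_t  stat2;
    uint8_t  stat3;
    uint32_t cntl3;
    uint8_t  err;
};

/* Assumed offset of CNTL3 in the register sequence: 4 + 4 + 1 + 1 + 1. */
enum { HW_CNTL3_OFFSET = 11 };

/* Returns how many padding bytes the compiler placed in front of cntl3. */
size_t padding_before_cntl3(void)
{
    size_t off = offsetof(struct hw_device_sketch, cntl3);

    assert(off >= HW_CNTL3_OFFSET);
    return off - HW_CNTL3_OFFSET;
}
```

On a typical ABI where a 32-bit integer is 4-byte aligned, the function returns 1. A build-time check such as `_Static_assert(offsetof(struct hw_device_sketch, cntl3) == 11, "layout")` would turn the silent mismatch into a compile error.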
Pointers – Basics
const int *p = &x;   /* pointer to const int: x cannot be modified through p */
*p = 10; /* illegal statement */
Arrays and Pointers
char a[10] = "Hello";
char *str = a;
int b[10];
int *intp = b;
printf("%s %s\n", a, str);
printf("%c %c %c %c\n", a[1], str[1], 1[str], 1[a]);
printf("%p %p %p %p\n", (void *)a, (void *)str, (void *)&a[1], (void *)(str+1));
printf("%p %p %p %p\n", (void *)b, (void *)intp, (void *)&b[1], (void *)(intp+1));
Output:
Hello Hello
e e e e
0x22ccd0 0x22ccd0 0x22ccd1 0x22ccd1
So can pointers and arrays be used interchangeably?
• Yes and no.
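A minimal sketch of the "yes and no": indexing works the same way through either name, but `sizeof` and assignment treat the two differently. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct view_sizes { size_t array_bytes; size_t pointer_bytes; size_t text_len; };

struct view_sizes compare_array_and_pointer(void)
{
    char a[10] = "Hello";
    char *str = a;                  /* the array name decays to &a[0] */
    struct view_sizes s;

    assert(str == &a[0]);
    s.array_bytes = sizeof a;       /* the whole array: 10 bytes */
    s.pointer_bytes = sizeof str;   /* only the pointer: platform dependent */
    str = a + 1;                    /* legal: a pointer can be re-pointed */
    /* a = str;  would not compile: an array name is not assignable */
    s.text_len = strlen(str);       /* length of "ello" */
    return s;
}
```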
Pointers – Basics
Two-dimensional array (a) versus array of pointers (b):
        0 1 2 3 4 5 6 7 8 9
a[0]    O n e \0                 b[0] -> One\0
a[1]    T w o \0                 b[1] -> Two\0
a[2]    T h r e e \0             b[2] -> Three\0
a[3]    F o u r \0               b[3] -> Four\0
a[4]    F i v e \0               b[4] -> Five\0
Labels in the figure: &str[0][0] and &str1[0][0].
str1[1] = *(str1 + 4 bytes), and str1[1][0] = *(*(str1 + 4 bytes) + 0) = T.
&str[1][1] translates to &(*(*(str + 4 bytes) + 1)), i.e. &(*(*(0x22cca4) + 1)). This is equal to &(*(0 + 1)), which is 1. So str[1][1] will be a junk value.
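The difference between the two tables can be checked with `sizeof`. This is a minimal sketch, assuming `a` is a 5 x 10 character array and `b` an array of five string pointers as the figure suggests; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct table_sizes { size_t a_total; size_t a_row; size_t b_total; };

struct table_sizes compare_tables(void)
{
    char a[5][10] = { "One", "Two", "Three", "Four", "Five" };
    const char *b[5] = { "One", "Two", "Three", "Four", "Five" };
    struct table_sizes s;

    /* Both spellings reach the same character, by different routes. */
    assert(a[1][0] == 'T' && b[1][0] == 'T');

    s.a_total = sizeof a;      /* 5 rows * 10 chars = 50 bytes of characters */
    s.a_row   = sizeof a[0];   /* each row is a char[10] */
    s.b_total = sizeof b;      /* 5 pointers; the strings live elsewhere */
    return s;
}
```

`a[1][0]` is found by adding one row (10 bytes) to the start of `a`, while `b[1][0]` first loads the pointer stored in `b[1]` and then follows it. Treating one kind of table as the other therefore reads character data as if it were an address.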
Pointers – Arithmetic & Casting
typedef struct
{
    int day;
    int month;
    int year;
} Date;

int abc = 0xdeadbeef;
int def = 0xc0ffee;
char ghi[6] = "abcde";
int jkl = 0xdeafca1f;
short mno = 0xcade;
Date pqr = { 1, 1, 2006 };

main()
{
    int *ip = &abc;
    short *sp = (short *)&abc;
    char *cp = (char *)&abc;
    Date *dp = (Date *)&abc;
    ........
}
Memory contents starting at &abc (one byte per entry):
abc: ef be ad de
def: ee ff c0 00
ghi: 'a' 'b' 'c' 'd' 'e' 0, followed by 0 0 (padding)
jkl: 1f ca af de
mno: de ca, followed by 0 0 (padding)
pqr: 1 0 0 0 ...
Pointer positions marked in the figure: cp+1, sp+1, ip+1, cp+5, sp+3, ip+2, dp+1, dp+2.
dp+1 is the same address as ip+3, sp+6 and cp+12; dp+2 is the same address as ip+6, sp+12 and cp+24.
• When a pointer is incremented by 1, the number of bytes moved differs depending on the pointed-to type.
• When casting from one type to another:
  • Take care of alignment issues (SIGBUS).
  • Keep in mind that the number of bytes moved for every increment/decrement will change.
Note: Little-endian architecture assumed.
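The step sizes can be measured directly. This is a minimal sketch, assuming a union is used so that the same memory really is an array of each element type; `DateSketch`, `pointer_steps` and `measure_pointer_steps` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int day; int month; int year; } DateSketch;

struct pointer_steps { size_t cp, sp, ip, dp; };

struct pointer_steps measure_pointer_steps(void)
{
    /* One block of memory viewed as four different element types. */
    static union {
        char       c[2 * sizeof(DateSketch)];
        short      s[2];
        int        i[6];
        DateSketch d[2];
    } u;
    char *base = u.c;
    struct pointer_steps st;

    /* Each "+ 1" moves by the size of the element type, measured here in bytes. */
    st.cp = (size_t)((char *)(u.c + 1) - base);
    st.sp = (size_t)((char *)(u.s + 1) - base);
    st.ip = (size_t)((char *)(u.i + 1) - base);
    st.dp = (size_t)((char *)(u.d + 1) - base);

    /* Three int steps cover one DateSketch, matching dp+1 == ip+3 in the figure. */
    assert((char *)(u.d + 1) == (char *)(u.i + 3));
    return st;
}
```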
Pointers – Function pointers
void* (**TempMyAlloc)(int);
TempMyAlloc = MemAllocAlgo;
MyAlloc = MemAllocAlgo[user_choice];
/* Alternate1: MyAlloc= TempMyAlloc[user_choice]
Alternate 2: MyAlloc = *(TempMyAlloc + user_choice) */
}
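The fragment above can be fleshed out as follows. This is a minimal sketch: `MemAllocAlgo`, `TempMyAlloc`, `MyAlloc` and `user_choice` come from the slide, while the two strategy functions `AllocExact` and `AllocRounded` and the wrapper `AllocWithChoice` are hypothetical stand-ins for whatever the original table contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Two hypothetical allocation strategies with the same signature. */
static void *AllocExact(int size)   { return malloc((size_t)size); }
static void *AllocRounded(int size) { return malloc(((size_t)size + 15u) & ~(size_t)15u); }

/* Table of function pointers, indexed by the user's choice. */
static void *(*MemAllocAlgo[])(int) = { AllocExact, AllocRounded };

void *AllocWithChoice(int user_choice, int size)
{
    void *(**TempMyAlloc)(int) = MemAllocAlgo;   /* points at the first table entry */
    void *(*MyAlloc)(int);

    assert(user_choice >= 0 && user_choice < 2 && size > 0);
    MyAlloc = *(TempMyAlloc + user_choice);      /* same as MemAllocAlgo[user_choice] */
    assert(MyAlloc == MemAllocAlgo[user_choice]);
    return MyAlloc(size);                        /* call through the pointer */
}
```

All three spellings on the slide (`MemAllocAlgo[user_choice]`, `TempMyAlloc[user_choice]` and `*(TempMyAlloc + user_choice)`) select the same table entry.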
Volatile pointers
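The volatile keyword tells the compiler that a variable may change, or that accessing it may have an effect, outside the code the compiler can see, so reads and writes of that variable must not be optimised away.
The following code has a problem: the compiler will not generate code for disabling the interrupt.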
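The listing this slide refers to is not part of the text above. As a hypothetical illustration of the problem it names, consider the sketch below: if `int_ctrl` were a plain `uint32_t *`, the compiler would be free to drop the first store, because the same location is overwritten again before anything in the visible code reads it. The function name, the register pointer and the meaning of bit 0 are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* int_ctrl stands in for a memory-mapped interrupt control register;
   bit 0 set is assumed to mean "interrupts enabled". */
uint32_t run_critical_section(volatile uint32_t *int_ctrl)
{
    uint32_t saved;

    assert(int_ctrl != NULL);
    saved = *int_ctrl;
    *int_ctrl = saved & ~(uint32_t)1;   /* disable: volatile forces this store to be emitted */
    /* ... critical section ... */
    *int_ctrl = saved;                  /* restore the previous state */
    return *int_ctrl;
}
```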
Think about this – Volatile & Constant together
Bit Manipulations
/* Program 1 */
int main(void)
{
    int x = -10;
    printf("%d\n", ~x+1);
    return 0;
}
/* Program 2 */
int main(void)
{
    unsigned int x = 5;
    while(--x >= 0)
    {
        printf("Hello World\n");
    }
    return 0;
}
Answers:
• First program: the output is 10. ~x is the one's complement of x, and ~x+1 is the two's complement of x (i.e. the negative of x).
• Second program: "Hello World" is printed endlessly. x is unsigned, so when it is decremented while holding 0 it wraps around to UINT_MAX, and --x >= 0 is always true.
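Both points can be checked in code. This is a minimal sketch; the function names are hypothetical, and the countdown shown is only one of several ways to write the loop so that it terminates with an unsigned counter.

```c
#include <assert.h>

/* Negates x using only bitwise NOT and addition (two's complement). */
int negate_with_complement(int x)
{
    return ~x + 1;
}

/* Counts the iterations of a countdown loop that is safe for an unsigned counter. */
unsigned int count_down_iterations(unsigned int start)
{
    unsigned int x = start;
    unsigned int iterations = 0;

    while (x-- > 0) {      /* compare first, then decrement */
        iterations++;
    }
    assert(iterations == start);
    return iterations;
}
```

Here the comparison uses the value before the decrement, so the loop stops when `x` is 0 instead of relying on a negative value that an unsigned variable can never hold.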
Arithmetic and Logical Shifts
int main(void)
{
    unsigned int x = 0x80000000;
    printf("%x\n", x >> 1);
    return 0;
}
Output: 40000000
For an unsigned number, "sign extension" does not happen on a right shift (logical shift).
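For comparison, here is a minimal sketch of both kinds of shift written only with unsigned operations, so the result does not depend on how a particular compiler shifts negative signed values (which the C standard leaves implementation-defined). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift: zeros enter from the left, whatever the top bit was. */
uint32_t logical_shift_right(uint32_t value, unsigned int count)
{
    assert(count < 32u);
    return value >> count;
}

/* Arithmetic right shift on a 32-bit pattern: copies of the top bit enter from the left. */
uint32_t arithmetic_shift_right_bits(uint32_t value, unsigned int count)
{
    uint32_t shifted, fill;

    assert(count < 32u);
    shifted = value >> count;
    if ((value & 0x80000000u) == 0u || count == 0u)
        return shifted;                 /* top bit clear: same as a logical shift */
    fill = ~(0xFFFFFFFFu >> count);     /* the top `count` bits set */
    return shifted | fill;
}
```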
Advanced Topics
Handy Bitwise Expressions
No.  Expression                             Purpose                                 Remarks
3    X >> 1                                 Divide by 2
8    Mask = (X-Y) >> 31                     Min(X, Y) without using comparisons     Mask contains all zeros if Y is less than or equal to X and all ones if X is less than Y.
     Result = (Mask & X) | (~Mask & Y)
9    Mask = (X-Y) >> 31                     Max(X, Y) without using comparisons     Mask contains all zeros if X is greater than or equal to Y and all ones if Y is greater than X.
     Result = (Mask & Y) | (~Mask & X)
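Rows 8 and 9 can be sketched as functions. The sketch assumes 32-bit two's complement integers, a compiler that performs an arithmetic (sign-propagating) right shift on negative signed values, and inputs small enough that X-Y cannot overflow. The function names are hypothetical, and the assertions are only precondition checks, not part of the technique.

```c
#include <assert.h>
#include <stdint.h>

/* Row 8: minimum of x and y selected with a mask instead of a branch. */
int32_t min_no_compare(int32_t x, int32_t y)
{
    int32_t mask;

    assert(x >= -0x3FFFFFFF && x <= 0x3FFFFFFF);   /* keeps x - y in range */
    assert(y >= -0x3FFFFFFF && y <= 0x3FFFFFFF);
    mask = (x - y) >> 31;              /* all ones if x < y, else all zeros (assumed arithmetic shift) */
    return (mask & x) | (~mask & y);
}

/* Row 9: maximum of x and y, using the same mask with the operands swapped. */
int32_t max_no_compare(int32_t x, int32_t y)
{
    int32_t mask;

    assert(x >= -0x3FFFFFFF && x <= 0x3FFFFFFF);
    assert(y >= -0x3FFFFFFF && y <= 0x3FFFFFFF);
    mask = (x - y) >> 31;
    return (mask & y) | (~mask & x);
}
```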
Exercise 1: Power of 2.
X86 Stack view during function calls in C
Sample program:
/* assuming EBP is an integer pointer */
int f2(int x1, int y1)   /* x1 is *(EBP + 2), y1 is *(EBP + 3) */
{
    int l5 = 110;        /* l5 is *(EBP - 1), l6 is *(EBP - 2) */
    int l6 = 120;
    return (0);
}
Stack view (lower memory at the top):
ESP ->  Saved regs in f2
        l6
        l5
EBP ->  Prev Frame Ptr
        Ret addr in f1
        x1
        y1
Scope – Check your understanding now
printf("%x\n", abc);
{
    int abc = 0xc0ffee;
    int x = 0x100;
    printf("%x, %x\n", abc, x);
}
}
(Clue: Local variables can get into stack or register)
Function Arguments – Check your understanding now
Exercise 2: Stack tracing
Security issue – Buffer Overflow on stack
if (argc != 2)
{
    fprintf(stderr, "Usage: %s filename\n", argv[0]);
    exit(1);
}
....
flag = check_permission();
strcpy(filename, argv[1]); /* Depending upon argv[1], the return address could get corrupted */
....
if (flag == 0xdeadbeef)
{
    /* execute as root, or deposit a million dollars in a bank account */
}
else
{
    /* execute the program as a normal user; deduct $10 from an account */
}
}
/* A clever attacker will craft argv[1] so that the return address is changed to a desired location, or so that the value of flag is changed. */
/* Typically the entire binary of the "undesired program" is also passed in the argument, along with the return address change. */
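One way to close this hole is to check the length before copying. This is a minimal sketch of that idea, not the workshop's own fix; the function name `copy_filename_checked` and its return convention are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating '\0'.
   Returns 0 on success and -1 if src is too long (dst is then left empty). */
int copy_filename_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dst_size > 0);
    len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);   /* len + 1 also copies the terminator */
    return 0;
}
```

A caller would replace `strcpy(filename, argv[1])` with `copy_filename_checked(filename, sizeof filename, argv[1])` and refuse to continue on -1, so an over-long argument can no longer reach `flag` or the return address.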
Variable arguments in C
Exercise 3: MyPrintf Implementation
Dynamic Memory allocation -- Internals
Dynamic Memory Allocation
The same program is traced at four points. "Control" marks the statement that has just executed. Each heap entry is written as "MCB n: flag, size", where the size includes the MCB itself (8 bytes here); in the figures the flag is 0 for a block in use and 1 for a freed block.
main()
{
    char *ptr1 = malloc(100);
    char *ptr2 = malloc(200);
    .......
    free(ptr1);
    ....
    ptr1 = malloc(300);
    ....
    free(ptr1);
    ....
    ptr1 = malloc(50);
    ....
    free(ptr1);
    free(ptr2);
}
Step 1: Control is just after ptr2 = malloc(200).
Heap Start
MCB 1: 0, 108    Allocation 1    <- ptr1
MCB 2: 0, 208    Allocation 2    <- ptr2, Last allocation
Free Memory
Heap End
Note: When 100 bytes are requested, 100 bytes + sizeof(MCB) are actually used. The MCB is the overhead.
Step 2: Control is just after ptr1 = malloc(300).
Heap Start
MCB 1: 1, 108    Allocation 1
MCB 2: 0, 208    Allocation 2    <- ptr2
MCB 3: 0, 308    Allocation 3    <- ptr1, Last allocation
Free Memory
Heap End
Note: free doesn't take the size as an argument. It is calculated by accessing *(ptr1 – sizeof(MCB) + sizeof(int)). MCB 1 and allocation 1 remain intact even though the block is no longer allotted.
Step 3: Control is just after ptr1 = malloc(50).
Heap Start
MCB 1: 0, 108    Allocation 1    <- ptr1
MCB 2: 0, 208    Allocation 2    <- ptr2
MCB 3: 1, 308    Allocation 3    <- Last allocation
Free Memory
Heap End
Note: The first allocation is re-utilised. That is, 100 bytes are allocated when 50 bytes are requested.
Step 4: Control is just after the final free(ptr2).
Heap Start
MCB 1: 1, 108    Allocation 1
MCB 2: 1, 208    Allocation 2
MCB 3: 1, 308    Allocation 3    <- Last allocation
Free Memory
Heap End
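The trace above can be reproduced with a very small first-fit allocator. This is a minimal sketch under stated assumptions, not the workshop's reference design: the names `sketch_alloc`, `sketch_free` and `sketch_block_size` are hypothetical, the MCB is stored as two consecutive `int` values (flag, then size in bytes including the MCB), `sizeof(int)` is assumed to be 4 so that the sizes match the figures, and there is no block splitting or joining of free blocks.

```c
#include <assert.h>
#include <stddef.h>

enum { HEAP_INTS = 1024, MCB_INTS = 2 };   /* MCB = { flag, size in bytes } */

static int heap[HEAP_INTS];
static size_t heap_used;   /* ints handed out so far (the "last allocation" mark) */

void *sketch_alloc(size_t nbytes)
{
    size_t payload_ints = (nbytes + sizeof(int) - 1) / sizeof(int);
    size_t need = (MCB_INTS + payload_ints) * sizeof(int);   /* bytes, MCB included */
    size_t i = 0;

    /* First fit: walk the existing blocks looking for a freed one that is big enough. */
    while (i < heap_used) {
        if (heap[i] == 1 && (size_t)heap[i + 1] >= need) {
            heap[i] = 0;                      /* mark in use; the old size is kept */
            return &heap[i + MCB_INTS];
        }
        i += (size_t)heap[i + 1] / sizeof(int);
    }
    /* Nothing reusable: carve a new block out of the free memory at the end. */
    if (heap_used + MCB_INTS + payload_ints > HEAP_INTS)
        return NULL;
    heap[i] = 0;
    heap[i + 1] = (int)need;
    heap_used += MCB_INTS + payload_ints;
    return &heap[i + MCB_INTS];
}

void sketch_free(void *p)
{
    int *mcb;

    if (p == NULL)
        return;
    mcb = (int *)p - MCB_INTS;   /* the MCB sits just in front of the payload */
    assert(mcb[0] == 0);         /* catches a double free in this sketch */
    mcb[0] = 1;                  /* only the flag changes; the block stays in place */
}

/* Size recorded in the MCB, found from the payload pointer alone. */
int sketch_block_size(const void *p)
{
    return ((const int *)p)[-1];
}
```

Running the slide's sequence (100, 200, free, 300, free, 50) through these functions gives blocks of 108, 208 and 308 bytes, and the final 50-byte request lands in the first block again, as in Step 3.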
Pros & cons of MCB approach
Rainy day scenarios – What happens now?
1. Lost memory
   ptr1 = malloc(100);
   ptr1 = malloc(200);
2. Double free
   ptr1 = malloc(100);
   free(ptr1);
   free(ptr1);
3. Accessing memory after free
   ptr1 = malloc(10);
   free(ptr1);
   *(ptr1+9) = '\0';
4. Out of range access
   ptr1 = malloc(10);
   *(ptr1+10) = '\0';
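Scenarios 1 to 3 can be made less likely with two small habits: release the old block before a pointer is reused, and clear the pointer after freeing it. This is a minimal sketch of those habits with hypothetical helper names; it does not address scenario 4, which needs bounds checking at the point of access.

```c
#include <assert.h>
#include <stdlib.h>

/* Replaces *slot with a fresh buffer, releasing the old one so it is not lost.
   Returns 0 on success, -1 if the new allocation fails. */
int replace_buffer(char **slot, size_t new_size)
{
    char *fresh;

    assert(slot != NULL && new_size > 0);
    fresh = malloc(new_size);
    if (fresh == NULL)
        return -1;          /* the old buffer is untouched and still owned */
    free(*slot);            /* scenario 1: the old block is released, not leaked */
    *slot = fresh;
    return 0;
}

/* Frees *slot and clears it, so a repeated call is harmless: free(NULL) does nothing. */
void release_buffer(char **slot)
{
    assert(slot != NULL);
    free(*slot);
    *slot = NULL;           /* scenarios 2 and 3: no stale pointer is left behind */
}
```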
Exercise 4: Implement myalloc & myfree functions
Requirements:
• You should not use any of the existing C or C++ library facilities for memory allocation (malloc, calloc, realloc, new, etc.).
• Implement void *myalloc(int elem_size) and void myfree(void *p).
• You also have to implement void InitMem(char *ptr, int size_in_bytes).
  • Through this interface, one big chunk of memory will be provided to you.
  • Use parts of this memory in subsequent calls to myalloc and myfree.
• You also have to implement int MemEfficiency(), which returns the number of MCBs (chunks) present.
• If you are adventurous, address the following aspects:
  • Minimize fragmentation by joining adjacent free blocks.
  • Put a time limit on the myalloc and myfree functions.
  • Make it thread safe.
  • Allocated memory should align with the size requested. Example: for myalloc(10), the pointer returned must be divisible by 10.
Security issue – buffer overflow on heap
char *filename_p;
if (argc != 2)
{
    fprintf(stderr, "Usage: %s filename\n", argv[0]);
    exit(1);
}
filename_p = malloc(1024);
strcpy(filename_p, argv[1]); /* Depending upon argv[1], MCBs or even other areas can get corrupted */
....
}
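On the heap, one way to avoid the overflow is to size the buffer from the input rather than from a fixed guess. This is a minimal sketch of that idea; the function name `copy_to_heap` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of src sized to fit it exactly, or NULL on failure.
   The caller releases the result with free(). */
char *copy_to_heap(const char *src)
{
    size_t len;
    char *copy;

    assert(src != NULL);
    len = strlen(src);
    copy = malloc(len + 1);      /* + 1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len + 1);
    return copy;
}
```

With `filename_p = copy_to_heap(argv[1])`, the copy can never run past the end of the block, whatever the length of the argument; the program may still want to reject unreasonably long names.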
Security issues – Be aware
Thank you.