PM-notes
Anupama Potluri
School of Computer and Information Sciences
University of Hyderabad
2 Specification
3 Design of Algorithms
3.1 Algorithmic Primitives
8 C language discussion
8.1 Declaration versus Definition
8.2 Global Variables and Side Effects
8.3 Scope and Extent of Variables
8.4 Parameter Passing Mechanisms
8.5 File Operations
8.6 Pointers
8.7 Bit Manipulations
8.8 Pre-processor directives
8.9 Command Line Arguments
8.10 typedef and union
8.10.1 typedef
8.10.2 union
9 Some Common Compiler Errors and What They Mean
9.1 warning: implicit declaration of function printf
9.2 warning: val may be used uninitialized in this function
9.3 sqrt_max.c:7:1: error: expected , or ; before int
9.4 error: i undeclared
9.5 sqrt_max.c:(.text+0x6a): undefined reference to ‘sqrt’
10 Coding Standards
10.1 Meaningful Names
10.2 Constants/Macros
10.3 Column width of 75-78
10.4 Indentation
10.5 Comments
10.6 Good Parenthesisation
10.7 Braces for Functions vs Other Compound statements
10.8 Block Coding
1 Introduction
Programming is primarily about solving problems. It is about capturing the process of how
the human brain solves the problem and writing it in terms that the computer can understand.
Unlike human beings, computers have to be told every small detail. Not even a small child
needs to be told in as much detail as a computer does. That is what makes programming
challenging: humans make many hidden assumptions that are not easy to identify and
capture in explicit statements. But, when programming is approached in this fashion,
i.e., as a process of our own brain examining itself, it can be a lot of fun.
Programming starts with somebody asking us to make the computer do a task. An
example would be “Find the maximum of the given set of numbers”. Sounds simple to a
human and most people would do it, including pretty small children. The usual response of
humans is to start listening for numbers or see if somebody writes the numbers on the board.
They know when the person has stopped giving them numbers by the cues of conversation
such as a long enough pause in speech or the person stops writing. Now, the computer,
in typical scenarios (if we ignore all the new-fangled AI personal assistants), cannot see or
hear. The input is given to it quite differently. And, the problem starts here. How do we
give the input to the computer? So, the statement above is not enough to start instructing
the computer how to solve the problem. Much more information is needed. The process
of extracting all the relevant information is therefore the first step in solving problems by
computer. The output of this extraction is a Specification.
Once the relevant information is all available, the human captures the logic or functioning
of their brain in a series of logical steps. These steps should result in the correct answer under
all normal and abnormal conditions. The steps should also terminate, i.e., the
computer must halt its computation at some point when following these steps. Such a series
of steps is called an Algorithm. An algorithm is independent of the computer hardware,
the operating system and the computer language in which it may finally be executed.
This abstraction of details specific to particular environments makes it a powerful
tool to solve the problem without getting lost in nitty-gritty details. This process is called
Design and is at the heart of programming.
When designing solutions to problems, users have to think through all possible conditions,
especially error conditions and what are called boundary conditions. Most solutions fail under
these conditions. The design process needs to give emphasis to these conditions.
Once design is done, the programmer needs to come up with test cases that will be used
to verify if the algorithm is correct or not. Based on these test cases, the algorithm is traced
with different inputs. If the algorithm gives correct results in all the test cases, then, we can
proceed to implement the algorithm in a specific language on a specific system. Of course,
the tracing may not be possible for complex systems but we will see how even for reasonable
systems, these steps can be done without undue strain.
2 Specification
Let us go back to the problem posed in the previous section: “Find the maximum of the given
set of numbers”. This is called a Requirement.
What is missing in this statement for a computer? How do we go about finding the missing
information? Are there ambiguities in the question? If so, what are they?
The first step, as we said earlier, is to disambiguate the requirement and to obtain all the
necessary information without any hidden assumptions. This is written down in clear
terms and is called a Specification. In other words, a specification unambiguously
identifies all the input to the computer, and how and in what format it is given, and
similarly for the output: all the output for all possible conditions and the format in
which it is given.
Let us see what information is missing in the above requirement. As we noted earlier, a
human knows how the numbers are being given and when no more numbers are forthcoming
from other cues. For the computer, the numbers can be given through a keyboard or they
may be read from a file which is on a specific device such as a hard disk, a CD or a pen drive
or it may come over the network etc. We need to inform the computer where the data
is to be read from. If it is a file or a network, there may be ways for the computer to determine
that it has reached the end of the input. But, if we are typing the data with the keyboard,
there needs to be some signal to the computer that the data has all been given.
In our example, we may say that the input is given from the keyboard, that when a value
of -1 is entered, it means end of input. Here, there is a hidden assumption that no negative
values (or at least -1) are part of the input data whose maximum we have to find. On the
other hand, we may say that first we will give an input of how many numbers will be entered
and then enter that many numbers. So, we first enter a value N followed by N values. At
the end, we return the maximum of the numbers entered.
Examining the problem further, we can ask ourselves if the data contains duplicates and
whether it matters if it does. It matters if the user is expecting, but does
not specify, that the output will print the location of the maximum. If the location needs to
be printed, then the question is whether to print all locations, only the first occurrence or
only the last occurrence. Based on the answer to these questions, the way we write our logic
changes. Thus, you can see that a specification has an impact on the design of the algorithm.
3 Design of Algorithms
This is, of course, the heart of problem solving or programming – how to solve the problem.
There are five essential qualities for a good program. They are:
1. Generality: Make the solution to the problem as general as possible.
2. Modularity: This is the process of breaking the given problem into small sub-problems,
each of which appears almost trivial but put together, lead to the overall solution to
the problem.
3. Portability: This relates more to the implementation in a specific computer language
than to design. The program should be written such that it can be run on any hardware
or operating system without modifications.
4. Readability: The program should be easily understood when it is read. Therefore, the
names used for variables, functions etc. should be meaningful and convey immediately
their purpose.
5. Maintainability: Readable code with clear comments wherever the logic is complex
helps in maintenance of the code. Maintenance is needed whenever bugs are
encountered or new features need to be added to the existing software.
The most important lesson in programming is “Think First, Code Later”. There are two
methods of problem solving: bottom-up and top-down. Normally, we use a top-down approach,
where we take the given problem and break it down into small pieces. Then, we take the
small pieces and break them down into even smaller pieces until each piece is easily doable.
An example would be finding the mode of a given set of numbers. We can say that this consists
of two steps: find the frequency of each value and then find the value with the maximum
frequency. Then, each of these sub-problems can be solved separately. An example of how to
work through the entire problem may be seen in Prof. Chakravarthy’s notes on “How to Program” [4].
2. Arrays: Arrays are contiguous memory spaces where each element of the array has
the same data type and can be accessed using the location or index operator []. For
example, if we have a variable for a table of integers which is an array, say Table, each
element of the table can be addressed as follows: Table[1], Table[2] and so on until the
last element. The total size of an array is usually declared using a constant as follows:
Table[MAXVAL].
3. Structures: Structures can also be called Records. The elements of a structure, unlike
the elements of an array, can have different data types. For example, if we want to store
student information, we need to store the Name which is a string, Reg.No. which can be a
string or an integer, and CGPA which is a float. We can then have arrays of such structures
to maintain information of multiple students. Such data types are called complex data
types.
Print “Error: N must be greater than or equal to 0”
end if
if N = 0 then
Print “No input is given”
end if
6. Logical Operators: These allow one to combine conditions such as the following:
7. Arithmetic Operators: These are the standard arithmetic operators for addition,
subtraction, multiplication and division, represented by the standard symbols. The
remainder operation is represented using the “%” symbol. Thus a % b would mean
the remainder of a divided by b.
8. Looping Statements: There are situations in programs where we want to repeat the
same operation or set of statements on multiple data points. We use looping statements
for such purposes. An example would be to find the sum of the first N integers. We need
to repeat the addition of the current value to a variable called sum, which starts at 0.
This is done as follows:
sum ← 0
for i := 1 to N do
    sum ← sum + i
end for
When we need to repeat as long as a condition is TRUE, we use the while loop construct.
Thus if we need to read values and add them up until a negative value is entered, we
write it as follows:
Read num
sum ← 0
while num ≥ 0 do
    sum ← sum + num
    Read num
end while
4.1 Specification 1
We find maximum of N numbers.
If any value other than a number is given as input, it should print an error string “Error in
giving input...must be a number”.
If no values are given (N = 0), it should print a message “No values have been given”.
4.2 Design 1
The algorithm for the specification in 4.1 is given in Algorithm 1.
Read N
if N < 0 then
    Print “Error: N must be greater than or equal to 0”
    return
end if
if N = 0 then
    Print “No values have been given”
    return
end if
Read max
if max not an integer then
    Print “Error in giving input...must be a number”
    return
end if
for i ← 1 to N − 1 do
    Read val
    if val not an integer then
        Print “Error in giving input...must be a number”
        return
    end if
    if val > max then
        max ← val
    end if
end for
Print “Maximum of values input = ” max
return
Algorithm 1: Algorithm for “Finding Maximum” with Specification 1
4.3 Specification 2
We find maximum of numbers input where the input ends if a negative value is encountered.
The maximum value and the last location it occurs in are to be given as output.
If any value other than a number is given as input, it should print an error string “Error in
giving input...must be a number” and exit.
If no values are given (i.e., if the first number is negative), it should print a message “No
values have been given”.
4.4 Design 2
The algorithm for the specification in 4.3 is given in Algorithm 2.
Read val
if val not an integer then
    Print “Error in giving input...must be a number”
    return
end if
if val < 0 then
    Print “No values have been given”
    return
end if
max ← val
loc ← 1
i ← 1
while val ≥ 0 do
    Read val
    if val not an integer then
        Print “Error in giving input...must be a number”
        return
    end if
    i ← i + 1
    if val ≥ max then
        max ← val
        loc ← i
    end if
end while
Print “Maximum of values input = ” max
Print “and its last occurrence is at ” loc
return
Algorithm 2: Algorithm for “Finding Maximum” with Specification 2
4.5 Specification 3
We find maximum of numbers input where the input ends if a negative value is encountered.
There are duplicates in the given values.
The maximum value and all the locations it occurs in are to be given as output.
If any value other than a number is given as input, it should print an error string “Error in
giving input...must be a number”.
If no values are given (first value input is negative), it should print a message “No values
have been given”.
4.6 Design 3
This is left as an exercise for the user.
5 “Finding the Mode” problem
This is another illustration of how the specification impacts design. Four different
specifications are given with hints on how the design changes. The actual design of the
problems is left as an exercise for the reader.
is the mode as the requirement is that the last occurring value in the file is the mode, not
the maximum of equal frequency values. All error and boundary conditions need to be taken
care of as usual.
Testing consists of many levels. At the beginner's stage that we are discussing, there are
at least two stages of testing, namely, unit testing and integration testing. Modularity
plays a big role in reducing the complexity and time taken for testing.
for highly complex software, it is not easy to come up with nor run all possible test cases.
Hence, in many cases there remain hidden bugs in software. The duty of a good programmer
is to design test cases such that the code coverage is as close to 100% as possible.
1. A boundary condition would be that only one value is entered. Did we design our
code correctly to handle this? Many times such boundary conditions are missed in the
design. Such errors are called off-by-one errors and are some of the most common errors
in software.
2. If we have more than one number, another boundary condition for this problem would
be two or more values are input and all of them have the same value since duplicates
are allowed. The output in this case should print the last location for the occurrence of
the value.
3. Another test case would be more than one number and all values are unique. In this
case, there are sub-testcases – the maximum is the first value or some middle value or
last value.
4. Another test case is where more than one value is input and there are duplicates of the
maximum value. The proper location is printed for the maximum in this case.
As can be seen above, there are many test cases for as simple a problem as “finding the
maximum value”. For each such test case, we need to give the proper input which satisfies
the conditions specified in the test case. We know what is the expected output and verify it
against the output from the program. If the output from the program does not match the
expected output then, there is a bug in the program corresponding to that test case.
Designing test cases such that there is maximum code coverage is very important for the
reliability of software. Once again, as pointed out above, modularity comes to the rescue of the
programmer. Unit testing is done per module and test case design per module will ensure a
much better code coverage than if someone were to test complex software without doing unit
testing. In fact, that is the reason why there are many levels of testing in software – to ensure
that all test cases are taken care of in a methodical and scientific manner.
8 C language discussion
In this section, we will discuss some of the interesting aspects of the C programming
language. For a comprehensive and detailed exposition, the reader is directed to read “The C
Programming Language” by Kernighan and Ritchie.
8.1 Declaration versus Definition
Function Declaration versus Definition Let us first discuss the difference between dec-
laration and definition of functions. A function declaration or prototype specifies only the
name of the function, the parameters passed to it in terms of data types and the return value
data type. It does NOT include the actual logic of the function. The function definition
is where the logic of the function is specified. Declarations may be needed if the definition
follows a call by another function or the function is defined in some .c file and called by a
function in another .c file. Declarations of functions which are used across .c files are given
in .h files. Such .h files are included in all .c files where these functions are called or defined.
Global versus Local Variables A variable which is declared within a function definition
is called a local variable. A variable which is defined outside the definitions of functions is
called a global variable. In the example code in Program 1, max is a global variable and
i, N, val are local variables.
int max = -1;
int main(void)
{
    int i, val, N;
    /* Read N from the user */
    for (i = 0; i < N; i++) {
        /* Read value from user */
        if (val > max)
            max = val;
    }
    return 0;
}
#define MAX 1000
#define MAXSTR 128
int i = 0;
int Table[MAX];
int main(void)
{
    char filename[MAXSTR];
extern int i;
int findmax(int N)
{
    int max = -MAXINT;
    while (i < N) {
        if (Table[i] > max)
            max = Table[i];
        i++;
    }
This illustrates one of the major issues with using global variables – the problem of side
effects. Modifying the value of a variable in one location has an impact on the logic in another
location because the changes are carried over to multiple locations.
Therefore, a good programming principle is to minimize the use of global
variables and limit them to only those cases where it makes absolute sense to use
them.
int main(void)
{
    int val[N], i;
    scanf("%d", &N);
    read_val(val);
    sum_odd(val, N);
    printf("Sum of odd values = %d\n", sum);
    exit(0);
}
    for (i = 0; i < N; i++)
        scanf("%d", &val[i]);
}
Program 5: Program containing Module logic for Illustrating Scope and Extent
#include <stdio.h>
sum_odd(int *table, int N)
{
    int i;
    static int sum1 = 0;
    for (i = 0; i < N; i++) {
        if (table[i] % 3 == 0)
            sum += table[i];
        else
            sum1 += table[i];
    }
Figure 1: Parameter Passing: Call by Value [1]
8.6 Pointers
Pointers are addresses of memory locations where data is located. We can understand pointers
through a simple real-life example. Let us say that there is a room called visiting faculty room
in a building. The address of this can be S101 within the particular building. The person
who is currently sitting in the room may be Prof. A. How does this map to C concepts?
When we declare a variable such as int i, i is equivalent to visiting faculty room. The
address of i, i.e., &i is equivalent to S101 and the value stored in i when we initialise it as in
i = 1 is equivalent to saying Prof. A is currently in visiting faculty room. If Prof. A leaves
and Prof. B starts using the room S101, then it is equivalent to saying i = 2, i.e., the value
has changed in that memory location. The address has not changed however. Now, supposing
we build an extension to the building and decide that the visiting faculty room will in future
be the room N105, then, this is saying that the variable has been moved to a new location in
memory. In other words, it is like saying
vfac_room_addr = &vfac_room1;
/* A is currently in the room */
*vfac_room_addr = A;
...
/* B is currently in the room */
*vfac_room_addr = B;
/* Now, change the address visiting_faculty_room is pointing to */
vfac_room_addr = &vfac_room2;
/* where vfac_room1_addr=S101 and vfac_room2_addr=N105 for the */
/* room example above */
Given a pointer, the value stored in the address pointed to can be obtained by dereferencing
the pointer. Given that ptr is a pointer, the value stored in that location is obtained by the
expression ∗ptr. Any variable in C consists of four parameters that define it: the variable
name, data type, address and value. The data type will determine the amount of memory
occupied by that variable in bytes. So, typically, an int variable occupies 4B whereas a char
occupies 1B and so on.
Why do we need pointers? C, and C++ which is derived from C, are among the few widely
used languages that expose raw memory addresses so directly. We should remember that C is
a systems programming language. Operating systems are written in C. Typically, an OS needs
to access specific memory locations to store data in those locations. So C, which was
developed to write the Unix operating system, comes with this powerful mechanism.
Variable    When swap is called    After swap is executed
x           a                      a
y           b                      b
u           a                      b
v           b                      a
Figure 2: Memory snapshot when swap function is called and after swap is executed. x, y
are actual parameters and u, v are formal parameters.
Let us look at the first version of swap function as given in K&R’s book on C [5].
void swap(char u, char v)
{
    char temp;
    temp = u;
    u = v;
    v = temp;
    printf("swap: U = %c, V = %c\n", u, v);
    return;
}
Now, the function swap is called by the main function with actual parameters x, y as
shown in Program 7.
Program 7: Main function calling swap function
int main(void)
{
    char x = 'a', y = 'b';
    swap(x, y);
    printf("main: X = %c, Y = %c\n", x, y);
    exit(0);
}
We will find that the values are NOT swapped in the main function. However, the values
are found to be swapped in the swap function. This is because the parameters are passed
using Call-by-Value as discussed earlier in Section 8.4. The memory in the system can be
represented as shown in Fig. 2. Each slot shown represents one byte. The memory
locations x and y are different from the locations u and v. So, when swap returns to main
the values in x, y do not change. At the same time since the extent of the variables u, v is the
function swap, these memory locations are no longer available to the program.
Program 8: Program to swap two variables whose pointers are passed to the swap function
#include <stdio.h>
#include <stdlib.h>
void swap(char *u, char *v)
{
    char temp;
    temp = *u;
    *u = *v;
    *v = temp;
    printf("swap: U = %c, V = %c\n", *u, *v);
    return;
}
int main(void)
{
    char x = 'a', y = 'b';
    swap(&x, &y);
    printf("X = %c, Y = %c\n", x, y);
    exit(0);
}
To achieve swapping of variables, we will have to pass the pointers of the actual variables
to the swap() function as shown in Program 8. When we pass pointers, the memory of the
system is as shown in Fig. 3. Here, the actual parameters are the addresses of variables x, y,
i.e., &x, &y as seen in main() function in Program 8. In other words, the parameters u and
v contain the addresses of variables x and y as shown in Fig. 3. Therefore, ∗u refers to the
value in location pa or in other words ∗u is a and similarly ∗v is b. When we swap ∗u and
∗v as shown in Program 8, we are swapping values in locations pa and pb. Therefore, at the
end of the swap function, the values of x, y are swapped but the values of u, v do not change
in the function. Of course, as stated earlier, u, v are no longer accessible once we exit the
function.
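Location    Variable    When swap is called    After swap is executed
pa          x           a                      b
pb          y           b                      a
            u           pa                     pa
            v           pb                     pb
(*u refers to the contents of location pa and *v to the contents of location pb.)
Figure 3: Memory snapshot when swap function is called with pointers and after swap is
executed.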
Some Important Dos and Don’ts with Pointers Whenever pointers are used, the most
important thing to keep in mind is to initialize them to NULL. An uninitialized pointer,
like any other uninitialized variable, occupies a memory location which may contain any
value. The problem is that the program treats this arbitrary value as an address when it
uses the pointer. If one is lucky, the program crashes with a segmentation fault and exits.
Otherwise, the address may happen to be valid for the program, and writing through the
pointer corrupts data or even instructions. This is the memory corruption problem, one of
the worst bugs anyone can be called upon to debug. One can make life much easier by
initializing pointers to NULL, since dereferencing a NULL pointer typically fails
immediately and is easy to detect.
extensively used in Computer Networks where different bits represent different functionality.
Bits need to be set or checked if they are set and so on.
The standard operations with bits are AND (&), OR (|), XOR (^), NOT (~), left shift
(<<) and right shift (>>), with the symbols used by C for these operations in parentheses.
1 1 0 0 1 1 0 1   Original data item that is to be shifted
0 0 1 1 0 1 0 0   Data after shifting it left by two bits
0 0 1 1 0 0 1 1   Data after shifting it right by two bits
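1. AND: When 1 and 0 are ANDed, or 0 is ANDed with 0, a 0 is the result whereas 1 is the
result when 1 is ANDed with 1.
2. OR: When 0 is ORed with 0, a 0 is the result whereas 1 is the result when at least one
of the two bits is 1.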
3. XOR: When 0 is XORed with 0 or 1 with 1, the result is 0 whereas 0 XORed with 1
results in 1.
4. NOT: NOT is the complement of the bit – so, 1 becomes 0 and vice versa.
5. LEFT/RIGHT SHIFT: When a data item is left shifted, the rightmost bits become
zero. It is the reverse for right shift of unsigned data; the leftmost bits become zero.
Thus, if the data item is 11001101 and it is left shifted by 2 bits, the value becomes
00110100. The two leftmost bits are shifted out and the two rightmost bits become 0. If we
do a right shift of the same data by 2 bits, the result will be 00110011. This is shown in Fig. 4.
Is a bit set? Now, if the problem is to determine if a bit is set, we need to do the following:
let us say that we are dealing only with a single byte. So, there are only 8 bits (or locations).
We want to verify if bit 3 is set or not where we are starting bit positions from 0. The easiest
way to do this is to shift the data so that bit 3 now becomes bit 0 (or the rightmost/least
significant bit) and AND it with 1. If the bit is 1, the result will be 1; otherwise it will be 0.
This is shown in Fig. 5.
Bit positions  7 6 5 4 3 2 1 0
(a)            1 1 0 0 1 1 0 1   Original Data to test if bit 3 is set
(b)            0 0 0 1 1 0 0 1   Data right shifted by 3 bits
(c)            0 0 0 0 0 0 0 1   Bit 1 with which the byte in (b) will be ANDed
(d)            0 0 0 0 0 0 0 1   Result of the AND operation of (b) and (c)
Set a bit To set a bit, operation OR is used. So, if we need to set bit 3 as in the previous
example, we take bit 1, left shift it by 3 bits and OR it with the original data. Since all the
other bits in the value 1 are all 0s, the original contents will not be modified. When we OR
with 1, irrespective of whether the original data had 0 or 1, the result will be 1. Thus, the
bit is set.
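Clear a bit Clearing a bit means setting it to 0 always. To clear a bit, operation AND is
used. We take the value 1, left shift it to the required position and take its NOT, so that
the mask has a 0 in that position and 1s everywhere else. When we AND the original data
with this mask, the required bit becomes 0 irrespective of its original value and all the
other bits are left unmodified.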
Bit positions  7 6 5 4 3 2 1 0
(a)            1 1 0 0 1 1 0 1
(b)            0 0 0 0 1 1 0 0   Data right shifted by 4 bits
(c)            1 1 1 1 1 0 0 0   NOT of 0 left shifted by 3 bits since we need to extract 3 bits
(d)            0 0 0 0 0 1 1 1   NOT of (c)
(e)            0 0 0 0 0 1 0 0
Figure 6: Illustration of steps for extracting bits 4-6 from the given data byte
typedef struct node_s {
    int value;
    struct node_s *next;
} node_t;
#endif
cp a.c b.c
argc gives the total number of arguments including the name of the executable file. The
array of character pointers argv contains the addresses of the strings which are the actual
arguments including the name of the executable file. Thus, from the cp command example
above, argv[0] = cp, argv[1] = a.c, argv[2] = b.c and argc =3.
The first thing we need to do when writing programs with command line arguments is to
verify that the number of arguments is equal to what we are expecting. The cp command in
this example expects exactly two arguments: the source and destination files. So, argc has
to be 3. Any value other than that would be wrong. We need to check this before we proceed
with the program. If we do not do it and proceed to use argv, we may dereference a NULL
pointer and crash with a segmentation fault.
So, the cp program would start as follows:
Program 10: Program illustrating use of command line arguments
#include <stdio.h>
#include <stdlib.h>
/* copy() is defined elsewhere; this prototype is an assumed signature */
void copy(const char *src, const char *dest);
int main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "Usage: cp <src> <dest>\n");
        exit(1);
    }
    /* call the function that copies from file1 to file2 */
    copy(argv[1], argv[2]);
    exit(0);
}
Program 10 illustrates many good coding practices. When we find that the argc value
does not match the expected value, we need to print an error. We use fprintf with stderr
instead of printf because the error message will be printed on the terminal even if stdout is
redirected to a file. It is always good to differentiate between the normal print messages and
error messages. Error messages should always be printed to stderr.
Secondly, look at the use of the function/system call exit. The parameter passed to this
function is the return status of the program. By convention, Unix/Linux use 0 to indicate
success and any other value to indicate failure. So, whenever there is a failure to execute
and the program is exiting, it is useful to give a status value. In this case, we gave the value
1. If there are multiple errors, for each of which the program exits, then it helps to use
a different status value for each of them. We can check the status of the program by running
the command “echo $?” in the Linux shell immediately after the program exits; the status
tells us where the program failed.
struct std_record {
    char name[MAX_NAME];
    char programme:4;
    char year:4;
    float cgpa;
};
The declarations for programme and year are stating that the number of bits allocated for
these fields is 4 each so that, together, they occupy one byte. (This is one of the interesting
facilities provided by C to reduce memory footprint. We have to remember that C is used
primarily for systems programming such as operating systems and every byte saved is useful
in that context. This is even more so in the context of embedded operating systems.)
8.10.1 typedef
Using the keyword typedef, a new data type can be defined in C. We can define a new data
type called std_rec_t as follows:
typedef struct std_record std_rec_t;
All future references can declare variables of this structure by using this new data type
instead of using the struct:
std_rec_t rec;
instead of
struct std_record rec;
We can also define new data types using basic data types themselves, e.g., we can define
a new data type called uint16 t which is an unsigned short in almost all the systems.
uint16_t ip_protocol;
Since the storage size of the basic data types is not fixed by the C language and varies
from system to system, such a data type is useful to ensure that exactly the required
storage is allocated. This is extremely important in computer networking, where systems of
different architectures and operating systems are connected together and all of them have
to interpret the messages exactly as per the protocol specifications.
8.10.2 union
The declaration of a union looks very similar to a structure declaration. Given below is one
such declaration:
union {
    char data[4];
    float f;
};
If the above were a structure, the storage space allocated to it would be the sum of the
space needed for both the members, typically 8B. In a union, however, the storage allocated
is equal to the storage of the largest member of the union. In the above example, both the
members have the same size of 4B and so the union will have a storage space of 4B. If we had
only a char along with a float, it would still be 4B, as float normally occupies 4B and char
occupies 1B.
(Note: union can be used to define data types in such a way that it can lead to polymor-
phism in the context of object-oriented design.)
9 Some Common Compiler Errors and What They Mean
It is recommended to use gcc for compilation, which is the default compiler on most Linux
systems. Compilation consists of two steps that are of interest to a beginner: compiling and
linking. The first step converts C language statements into machine language, which is
nothing but a series of 0s and 1s. Linking combines the code we write with the C library
code for the library functions we use. Without this step, the computer will not know what to
do when a C library function is called.
When compiling, it is good to enable all warnings and take care of them. We will discuss
why this is important when we come to the specific warning. The command to compile a C
program is given below:
gcc -Wall sqrt_max.c
The option “-Wall” enables most of the commonly useful warnings. By default, the C compiler
creates an executable file with the name a.out. However, you are strongly discouraged from
relying on this! If you have many programs, you can have exactly one(!) executable program
at any point of time, because every time you recompile some source code, it overwrites the
existing executable file!
Instead, use the option “-o” to name the executable file, e.g., gcc -Wall -o sqrt_max
sqrt_max.c. Please note that good programming practice includes naming files properly. A
typical naming strategy is to give the executable file the same name as the source file
containing the main function, but without the .c extension. Please also note that the
extension “.o” is for object files, i.e., files which contain the machine language
equivalent of the C code but which are not executable. An executable file is built from such
object files linked with the library files which provide the code for the C library
functions we use.
We now look at some of the common C compilation errors and how to fix them:
because I just had the comment “read val” but no valid statement initializing val. The same
warning will be given for N too. This is a very important warning because an uninitialized
variable has an unpredictable value and can lead to wrong output from the program. Hence,
make sure all variables are properly initialized. Fixing this warning at compile time can
save hours of debugging spent trying to figure out why an error in the output occurs at run
time. This is especially true if we assume val = 0 and, most of the time, the variable
happens to be given a memory location with 0 in it. Once in a while, however, it may be
given a memory location which does not have 0 in it. Only under such conditions will the
program fail. So, the program will run fine for some runs, maybe even for years, and then
suddenly fail one day. Such errors can be a nightmare to debug. And all that was needed to
avoid this nightmare was to ensure there was no warning during compilation!
Program 11: Program illustrating the linking error “Undefined Reference”
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int max = -1;

int main(void)
{
    int i, val, N;

    /* Read N from the user */
    scanf("%d", &N);
    for (i = 0; i < N; i++) {
        /* Read value from user */
        scanf("%d", &val);
        if (val > max)
            max = val;
    }
for (i = 0; i < N; i++) {
^
sqrt_max.c:13:8: note: each undeclared identifier is reported only once
for each function it appears in
It shows that the error occurs in Line 13. In fact, every time i is referred to in the program,
this error occurs. However, as given in the error message, the compiler gives it only for the
first occurrence. Once the proper declaration of int i is given, this error disappears from all
occurrences.
The “-lm” option tells the linker to link with the library “m”, i.e., the math library
which provides the code for sqrt. Similarly, other programs we write may need specialized
libraries, e.g., if we write multithreaded programs, we need to include the option
“-lpthread” when compiling the code to access the pthread library functions.
10 Coding Standards
Some good resources for coding standards are [2] and [3].
10.2 Constants/Macros
Constants or Macros in programming are by convention always all in capital letters. Thus,
we define PI and not “pi” as a constant. As soon as any term is seen that is all in capital
letters, it should be obvious that this is either a constant or a macro. An example of a macro
is:
MAX can then be used in the program. Both constants and macros are expanded in place by the
pre-processor before compilation, as they are defined using pre-processor directives.
10.4 Indentation
Indentation is a very important part of programming for readability. While there are tools
that help to indent automatically and/or after the fact, it is more useful to develop the instinct
to program with indentation as almost a reflex action. Many a time, people try to indent the
code after finishing the testing, which is a superb waste of time. If we learn to indent as we
code, there is no need to revisit the code for indentation.
Indentation with tabs is easy but can create issues with column limits if there are more
than two levels of indentation. It is better to use a 4-space indentation. However, whether
you choose spaces or tabs as your indentation strategy, be consistent! Do NOT mix spaces and
tabs, since the tab width differs from system to system and the indentation will then be
ruined.
10.5 Comments
Whenever we write complicated programs, it is good to comment the parts of the code that
are difficult to understand. Explain the logic in plain English, in easily understood terms;
this enhances the readability and maintainability of the program. In fact, well-commented
code can then be used to generate a design document using tools.
10.7 Braces for Functions vs Other Compound statements
Braces can be used either in K&R style or they can be on standalone lines when used with
looping or conditional statements. However, as cautioned in other cases, stick to one conven-
tion throughout and do NOT mix up the styles.
However, when starting and ending functions, it is advised to have the braces on separate
lines. This helps to navigate functions very fast in vi by using the “[[” and “]]” commands.
References
[1] http://www.equestionanswers.com/c/parameters-are-passed-call-by-value.php
[5] Brian W Kernighan and Dennis M Ritchie, The C Programming Language 2nd edition,
Prentice Hall, 1988.
Acknowledgements
I wish to thank Dr. Anjeneya Swami for his comments and for catching the errors in the
earlier drafts. Any remaining errors are obviously my responsibility.