OS_merged
The OS provides convenient abstractions over the hardware resources, and standard services
that application programmers can use in their programs. In fact, the major focus of this course is
to help you learn these abstractions and services, and how to use them in your programs. Due
to this focus on learning, explaining and programming using these abstractions and services,
courses with similar content use titles like "System Programming," or "Principles of Computer
Systems" or "Introduction to Computer Systems" at some universities.
Goals of an OS
As mentioned above, a major goal of an OS is to provide a convenient software interface to
hardware resources. The OS takes physical resources such as the processor, memory and disk,
and provides software abstractions over these resources. The OS also works to maximize
utilization of the hardware resources and to manage contention of resources among different
users of the OS. The OS provides standardized software libraries that allow programs to interact with
the provided abstractions. Additional goals of the OS include providing security and supporting
software development.
OS Services
Standard services provided by OS include:
Process management
An OS is responsible for starting a new program and for ending programs.
The OS provides this functionality via the process abstraction, which we can think of as a running
program.
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-introduction-to-operating-systems?module_item_id=24111727 1/3
6/8/24, 1:53 PM Exploration: Introduction to Operating Systems: OPERATING SYSTEMS I (CS_374_400_S2024)
An OS also provides interfaces for reading and writing files, as well as interfaces to
communicate with external devices.
Process coordination
Different processes running on an OS may need to access shared resources, e.g., files,
devices, etc. Access to these resources may need to be coordinated, and the OS provides
support for managing concurrent access to shared resources.
OS Kernel
In addition to providing the services listed above, an OS typically provides additional services for
application developers and users. These can include software to interact with the user, such as
shells and graphical user interfaces, as well as libraries for application developers.
A distinction is generally made between the OS Kernel, and the additional software for user
interaction and application developers that is bundled with an OS but is not part of the kernel.
The kernel is responsible for providing the standard services that we listed above and is the
program that is always running on the computer. The kernel is allocated its own memory space
which is called kernel memory. To keep an OS running even when applications fail or act
maliciously, kernel memory must be protected from direct access by the applications running on
the OS. Access to kernel services and memory is controlled by providing system calls, which
form an application programming interface (API) provided by an OS to applications for interacting
with the kernel.
Interacting with OS
At a higher level, we can consider two different types of interactions with an OS: interaction by
users and interaction by programs.
Interaction by Users
There are two primary modes by which users interact with the OS.
Command line shells require the user to type commands to access OS services. Early OSs only
supported command line shells.
Learning the commands to type at a command line shell imposed a steep learning curve on the end
user. This prompted the development of graphical user interfaces (GUIs), where users can interact
with the OS using graphical elements.
Interaction by Programs
The OS exposes its services via system calls, which are the programmatic way a program requests
services from an OS. When writing an application program, you may or may not be able to make
system calls directly. However, most programming languages provide APIs that
make system calls behind the scenes. We will study system calls and the APIs related to system
calls in great detail in this course.
Exercises
Think about all the electronic devices you use on a daily basis.
6/8/24, 1:53 PM Exploration: Unix, Linux and POSIX: OPERATING SYSTEMS I (CS_374_400_S2024)
Multics
The roots of Unix are in a project called MULTiplexed Information and Computing Service, or
Multics, which was a new time-sharing OS developed by AT&T Bell Labs, MIT and General
Electric. The Multics project was started in 1964. It was a large project that was beset by delays.
Unix
Some Bell Labs researchers who had been working on Multics decided to start a new
project around 1970. They named their project Uniplexed Information and Computing Service,
or Unics, as a play on the name Multics. The name of the project later became Unix. Key people in
the Unix project included Ken Thompson and Dennis Ritchie, who were both later awarded the ACM
(Association for Computing Machinery) Turing Award "for their development of generic operating
systems theory and specifically for the implementation of the UNIX operating system." The
Turing Award is widely considered the highest award in computer science.
C Programming Language
Unix was originally written in assembly language. In 1973, Unix Version 4 was
rewritten in the C programming language, which was originally developed by Dennis
Ritchie. C and C++, an object-oriented programming language derived from C, are still the most
widely used languages for implementing OSs.
Unix Derivatives
Linux
Unix became widely popular starting in the late 1970s and early 1980s. Many commercial software
companies started providing Unix and Unix-like systems. In the early 1990s, Linus Torvalds, a
Finnish-American software engineer, developed the initial version of Linux. Linux was a
Unix-like OS that was open source and was not beholden to expensive license agreements
with AT&T. Today there are many distributions of Linux. Linux is very widely used for computer
servers and you can also run it at home.
Mobile OS
Linux in turn has spawned many derivative OSs. One of them is Android, the most popular
mobile OS since 2013. Other derivative OSs of Linux and Android run on smart televisions,
tablets, and wearables. Currently iOS from Apple is the second most popular mobile OS. iOS is
derived from Apple's macOS, which in turn is derived from Unix!
POSIX
To maintain compatibility between OSs, the IEEE Computer Society specified a family of standards
called Portable Operating System Interface, or POSIX. POSIX defines standards for software
compatibility across Unix variants as well as other OSs.
Of particular interest to us in this course is the POSIX standard related to system calls. Recall
that system calls are the API provided by an OS that a program uses to request services from
the OS. In addition to system calls, POSIX also includes standards for command line shells and OS
utility interfaces.
6/8/24, 1:54 PM Exploration: Unix Shell: OPERATING SYSTEMS I (CS_374_400_S2024)
Start programs
Kill programs
Connect processes together (using pipes)
Manage I/O to and from processes
Create, delete and manage files and directories
There are many different shells that are typically installed on Unix/Linux distributions. These
shells (and the standard path to the shell command on most installations), include the following:
In this class, we will use the BASH shell, which is commonly written as bash.
man
Unix systems provide access to an online reference manual via the man command. This
command can be used to read the reference manual for shell commands, system calls, and
many functions in the standard C library. To see the detailed description of a command use the
man command with the name of the command. For example, you can view the reference for the
cp command using the following command:
$ man cp
Directory/file management
pwd
Print working directory
Which directory am I in?
cd
Change directory
Moves your current working directory to a different one
ls
Display the files in a given directory
mkdir
Create directory
rmdir
Remove directory
rm
Remove files (and directories if used recursively)
mv
Move or rename files and directories
cp
Copy files and directories
chmod
Change mode, i.e., change the permissions of files or directories
cat
Concatenate character data stored in a file with another file
Primary use is to dump data to the terminal
more
Take character data and display one screen-full at a time
less
Similar to more
head
Display the beginning of a text file
tail
Display the (tail) end of a text file
grep
Search a text file
Shell Scripting
All the commands that are accessible from the shell can be placed in a shell "script." Shell
scripts are executed line by line, as if the lines were being typed in one by one. Shell scripts
provide the features of high-level programming languages, such as variables, conditional
expressions, loops, etc.
Shell scripts are commonly used to automate frequent tasks and to simplify complex
commands. For example, scripts may be written to run nightly backups of data, or to start up
servers. Sometimes scripts are written for small programs that you need to create quickly or
need to change frequently. Scripts are also used as glue to connect other programs together.
Exercise
The wc command prints various counts about data in a text file. Read the man page for wc
(http://man7.org/linux/man-pages/man1/wc.1.html) for details. What option can you use with wc
to print the length of the longest line in the text file wc_test.txt
(https://canvas.oregonstate.edu/courses/1971495/files/103693470/download)?
6/8/24, 1:54 PM Exploration: System Calls, and Reading and Writing Files in C: OPERATING SYSTEMS I (CS_374_400_S2024)
System Calls
System calls are the mechanism by which a user program asks the OS to perform services for
it. In fact, the shell commands also make system calls to invoke OS services. One popular
categorization of system calls into 6 categories is as follows:
Process control
  create process
  terminate process
  load, execute
  get/set process attributes
  wait for time, wait event, signal event
  allocate and free memory
File management
  create file, delete file
  open, close
  read, write, reposition
  get/set file attributes
Device management
  request device, release device
  read, write, reposition
  get/set device attributes
  logically attach or detach devices
Information maintenance
  get/set time or date
  get/set system data
Communication
  create, delete communication connection
  send, receive messages
  transfer status information
  attach or detach remote devices
Protection
  get/set file permissions
"Hello World" in C
By convention, the first program when you are learning a language is the "Hello World" program.
Here is a "Hello World" program in C.
In C, as in many other languages, the main() function is special: when you run a C program the
main() function is where the program execution starts. Functions have a return type, in this
case int , and may take arguments. In the case of our program, the function returns an int
and doesn't take any arguments.
The body of the function is wrapped in curly braces and in this example has just 2 lines. The first
line prints "Hello, world!" to the terminal when you run the program. The second line returns the
integer 0. The first line of the program has an include statement to include stdio.h , the header
for the standard C library for input and output.
1_4_hello_world_put @cs344
Note: In this exploration, we are not going to look into the details of the write() system
call. However, over the next few explorations we will have learned every concept employed
in this program.
In Unix, files are simply a linear array of bytes. Let us now see how we can carry out basic
operations related to file management, such as creating a file, reading from a file, and writing to a file.
Note: At this point in the course, we are not going to look into the details of the following
program. However, over the next few explorations we will have learned every concept
employed in this program.
Creating a File
We can create a file by using the open system call and passing it the O_CREAT flag. In this
program, we are also passing the flag O_TRUNC which will truncate the file if it already exists. If
you run the following program, it will create a file named grades.txt . You can use the ls
command from the shell to confirm that the program created the file.
1_4_createfile.c @cs344
Uses the open system call to create a new file named "newFile.txt" or, if the file already exists,
truncates it and opens it, then
Uses the write system call to write the text "THE BUSINESS" to this file, then
Uses the read system call to read the contents of this file and prints them to the terminal.
Exercise
Study the man page for the open system call (http://man7.org/linux/man-pages/man2/open.2.html)
and modify the flags passed to open in the above program so that if a
file already exists, open does not overwrite the contents of the file.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
The categorization of system calls given above is from Operating System Concepts, by
Abraham Silberschatz, Greg Gagne, and Peter B Galvin, 2018
Chapter 4 of The Linux Programming Interface has a detailed discussion on file I/O in Linux.
We are going to study files in more detail in subsequent modules.
The Linux man-pages project (https://www.kernel.org/doc/man-pages/) provides
extensive documentation on Linux system calls and the C standard library. This
documentation is an essential reference for you to use during this course.
6/8/24, 1:54 PM Exploration: Variables & Data Types, Input & Output in C: OPERATING SYSTEMS I (CS_374_400_S2024)
1_5_multiples @cs344
int denominator;
int boundary;
In C, the data type of each variable must be explicitly declared at compile-time. This is in
contrast with other languages such as Python, where the type of a variable does not need to be
explicitly specified and can be inferred at run-time.
A variable of type int can only store an integer value. In addition to int , historically C has
provided three additional basic data types. These are as follows:
float : can be used to store floating-point numbers, i.e., numbers containing decimal points
double : similar to float, it can be used to store floating-point numbers, but it has double the
precision of float
char : can be used to store a single character. The character value is specified within single
quotes. E.g., 'A' or '1' or ','
#include directive
In our program, we call two functions multiple times. These functions are printf() and
scanf() . But these functions are not defined in our program. In C we can use functions defined
elsewhere by including the header files that contain the descriptions of these functions. This is
being done in the following line of code, where we are including the header file stdio.h:
#include <stdio.h>
As our C program is transformed into an executable program, the contents of this header file are
copied into our program. Header files typically contain declarations of functions and
variables.
printf("Enter another positive integer up to which you want to see multiples printed: ");
The first two calls are very similar. The function is being called with one argument, which is a
string. As we have learned, C doesn't support strings as a basic data type. We are going to look at
strings in greater detail later, but here is a quick introduction.
A string in C is an array of characters which is terminated by the null character. The null
character is represented by the escape sequence \0 . Unlike a single character, a string is
enclosed in double quotes. Thus, the following two variables have different values and data
types:
The data type of the variable oneChar is char, whereas the data type of the variable
stringWithCharA is an array of characters. The [] at the end of stringWithCharA indicates that it
is an array. The number of characters in stringWithCharA is 2: the first character is A and the
second character is the null character '\0' , which terminates the string and is automatically
added when we enclose the characters in double quotes to create a string.
Here printf() is called with three arguments. The first argument is a string (or equivalently an
array of characters), while the other two arguments are integer values. The string provided as
the first argument to printf() contains two occurrences of %d . printf() interprets each
occurrence of %d as saying that a corresponding integer value will be provided as an additional
argument to printf() . The d in %d stands for decimal. printf() replaces the first %d with the
value of the variable denominator and the second %d with the value of the variable boundary . To
print variables of some other common types using printf() , you can use the following
characters after % (for a complete list see the man pages
(https://man7.org/linux/man-pages/man3/printf.3.html) for printf() ).
printf() parameters
Exercise
Run the program and input a number greater than 15 when asked for the second integer.
Observe the output. Now change %d to %x in the 3rd statement that uses printf() and run
the program again, entering the same numbers as before. Observe the output
and explain the reason for any differences in the output after the change.
scanf("%d", &denominator);
scanf("%d", &boundary);
The first argument to scanf() is a string that contains formatting instructions to read input. Just
as %d in printf() means print a decimal integer, %d in scanf() means read a decimal
integer. The second argument to scanf() contains the name of the variable to which scanf()
assigns the value of the integer it read in (for now ignore the &, we will discuss it later).
In our example, the first call to scanf() assigns the integer value entered by the user to the
variable denominator . The second call to scanf() assigns the second value entered by the user
to the variable boundary .
The formatting characters available for use in scanf() are similar to the characters available in
printf() . A few of these are listed below. For a complete list see the man pages
scanf() parameters
6/8/24, 1:54 PM Exploration: From C Programs to Machine Code: OPERATING SYSTEMS I (CS_374_400_S2024)
Pre-processing
The C pre-processor is responsible for analyzing and processing some special statements that
are identified with a #. We have already seen one example of such statements, namely, the
#include directive which is used to include header files in a program. Header files typically
contain the description of functions and variables needed by the program.
Another type of special statement that is processed by the C pre-processor is called a macro.
Macros are defined with #define . Here is an example:
#define PI 3.14
If a program includes the above macro, the C pre-processor will replace all instances of PI in
the program with the value 3.14. A common use of macros is to define a symbolic constant once
and use it in multiple places in the program. Note that there is no semi-colon at the end of
the #define directive.
Additional directives that the pre-processor is responsible for are #ifdef , #endif , #else , and
#ifndef which are used for conditional compilation. You can find details of that in the
Compilation
The C compiler is responsible for parsing your code, checking it for errors and generating
assembly language code. The compiler then calls the assembler which converts the assembly
code into machine binary code. A file with machine binary code is commonly referred to as an
object file and has the extension .o .
We will use the gcc compiler in this course. gcc provides many compilation options; some of the
most useful ones are listed below.
Compiler Warnings
If the compiler finds an error in your program, the compilation will fail. E.g., if you misspell the
name of the function main as mian, building the program will fail and an error will be reported.
In addition to errors, the compiler can also report warnings which are diagnostic messages
indicating possible problems with your program. By default warnings do not cause compilation
to fail and your program may even run correctly in the presence of warnings. However, warnings
indicate possible issues with your program that can cause program execution to fail or have
unspecified behavior.
Example
In the following program change %d to %f in line 20. If you now press "Run" you will see a
warning but the program will successfully compile. However, when you run the program the
behavior is incorrect.
2_1_multiplier.c @cs344
gcc supports many options related to warnings. The most commonly used of these is -Wall ,
which, despite its name, does not cover all possible warnings! You can find more information in
the GCC manual's section on Warning Options
(https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcc/Warning-Options.html).
Tip: Always review and fix all warnings reported by the compiler.
Linking
Virtually all C programs depend on multiple files. As an example, even a simple hello_world.c
program that prints a Hello World message to the screen needs to include stdio.h in order to
use the printf function. stdio.h includes the description of printf , but does not include its
code. The linker, or link editor, is the part of the compilation chain that stitches or links together
the various object files and creates one executable file.
In addition to linking object files, the linker will also link library archives which are collections of
object files (.o) gathered into a single large file. Library archive files have indexes that make
accessing them fast and usually a lot faster than having to read every .o file. This is especially
useful when the object files in the library seldom change. The standard C library libc contains
the object code for standard C functions, such as printf and is automatically linked in by gcc
without you needing to specify it with the gcc command.
When you click the "Run" icon on a repl, the program is compiled and executed. For a C repl,
the following commands are run by default:
As you can see, repl.it uses the clang compiler to compile the file main.c to an executable file
named main and then runs this executable. For course assignments you are required to use the
gcc compiler on os1. Should you optionally want to use the gcc compiler with a repl, here are 2
ways to do this
1_6_multiplier_replit.c @cs344
Tip: To view a list of all the files in a repl, click the Files icon on the left side.
6/8/24, 1:54 PM Exploration: Getting in the Mindset of a C Programmer: OPERATING SYSTEMS I (CS_374_400_S2024)
Next, look over the functions available for File input/output (https://en.cppreference.com/w/c/io)
and for manipulating Null-terminated byte strings (https://en.cppreference.com/w/c/string/byte).
Reading and writing files and manipulating strings are two fundamental skills for C
programming, and a significant amount of your assignments' code will be devoted to these tasks.
You don't need to memorize all of this, but you should have a general idea of what kinds of
functionality the standard library provides to you, and where you could find the information to
actually use these functions.
You should constantly refer to documentation to ensure you are properly calling the functions,
checking return values and handling errors.
Another major difference between C and other languages is that safe C programming requires a
lot of explicit error/return value checking and careful memory management. Programs should be
broken down into very small functions, each with clearly defined arguments, return values, and
effects they might have on program state (such as modifying global variables). A general rule of
thumb is that an individual function should be short enough to fit in a terminal window.
Throughout this course, the content will highlight useful functions and commands, and you should
regularly refer to the manual pages for each one to understand how to use them properly.
6/8/24, 1:55 PM Exploration: Conditionals, Loops and Variable Scope: OPERATING SYSTEMS I (CS_374_400_S2024)
if statements
C’s if statements are very similar to if statements in other programming languages. Note the
parentheses around the Boolean expression and the use of braces for the code blocks. We
have the basic if statement which will execute if the provided expression evaluates to true.
We can also have the familiar if-then-else statement, where we can add additional else if
branches with expressions and optionally an else branch without an expression which will be
executed if none of the expressions provided to the other branches evaluate to true.
if (expr) {
//...
} else if (expr2) {
// ...
} else {
// ...
}
The parentheses around the Boolean expression are mandatory. But if a code block contains
only one statement, we can omit the braces.
switch statements
In many programs, we end up using an if-then-else statement that compares the same variable
to different values.
Example: Consider the following program in which an integer value entered by the user is
compared to many different values:
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-conditionals-loops-and-variable-scope?module_item_id=24111737 1/11
[Embedded program: e2_2_if_then_else.c]
Such if-then-else statements are frequent enough that C provides a switch statement
construct to code them. Here is the same program using the switch statement:
[Embedded program: e2_2_switch.c]
Note that the expression is evaluated once. Then the matching case branch is executed. The
code to be executed for the branch ends with a break statement; without it, execution falls
through into the next branch. If none of the case branches match, the default branch will be executed.
Conditional operator
A conditional operator is a very succinct way of coding a simple decision. The general format of
the conditional operator is as follows:
condition ? expression1 : expression2
In this case, the condition will be evaluated. If the condition is true, then expression1 is
evaluated and its value is returned. Otherwise, expression2 is evaluated and its value is
returned. A common usage of the conditional operator is to set the value of a variable to one of
two expressions based on a condition.
[Embedded program: e2_2_conditionalop.c]
Loops
Looping with for
We now complete the program from a previous exploration that asks the user for two integers
and then prints multiples of the first integer between 1 and the second integer. To do this we
will use the for loop.
[Embedded program: 2_1_multiplier.c]
The initialization statement is executed once. The loop condition is evaluated for each execution
of the loop, including the first execution. If the condition evaluates to true, the body of the for
loop is executed. After the body has been executed, the repeating statement is executed. Next
the loop condition is evaluated again. The for loop continues until the loop condition evaluates to
false.
In this example, the initialization statement sets the value of the variable i to 1. The loop
condition compares the value of i with the variable boundary . As long as i is less than or equal
to the value of boundary , the loop will continue to execute. After the body of the for loop is
executed, the repeating statement is executed which increments i by 1. Therefore, this for
loop will execute the same number of times as the value of boundary .
Note: In the above example of the for loop, the variable i is declared in the initialization
statement. Allowing variables to be declared within the loop's initialization statement is an
enhancement added by the C99 standard. When compiling the code with gcc, make sure to
use the C99 or GNU99 standard by using the flag -std=gnu99 or -std=c99 .
Looping with while
The basic syntax is:
while(expr) {
statement or statements to execute
}
The expression expr is evaluated. If it evaluates to true then the body of the while loop is
executed; otherwise the loop ends. The while loop will keep executing the body as long as
expr evaluates to true. If the body of the while loop has only one statement, braces aren’t
required.
Exercise
Modify the program shown below to use a while loop instead of the for loop.
[Embedded program: 2_1_multiplier.c]
Looping with do
The do statement is similar to the while statement with one crucial difference: the expression
is evaluated after the body has been executed. The basic syntax is:
do {
statement or statements to execute
} while(expr);
This means that a do loop will execute at least once. This will happen even if expr is false
when the do loop is to be executed the first time.
[Embedded program: 2_2_multiplier5.c]
In this program, the for loop can continue to execute even after we have printed 5 multipliers.
The break statement provides us a way to break out of a loop as soon as we want rather than
waiting for the loop condition to evaluate to false.
In the following program, we modify the previous program to use the break statement to break
out of the for loop as soon as the desired number of multipliers have been printed.
[Embedded program: 2_2_mult_break.c]
This program will break out of the for loop as soon as the desired number of multipliers has
been printed.
[Embedded program: 2_2_scope.c]
Two variables are defined in this program. The variable numerator is defined within the function
main and is termed a local variable. This means that numerator can only be used inside main.
In general, a local variable is local to the block, defined by { } , in which it is defined. E.g., in
the following example, the variable k is local to the for loop and cannot be accessed anywhere
else even in the function in which this for loop is used:
The variable denominator is defined outside any function. Such a variable is called a global
variable. A global variable can be used anywhere in the program. In this program, we see it is
being used in both of the functions main and isDivisible .
Static Variables
Static variables in C are initialized just once. These variables remain in memory until the end of
the program. If a static variable is defined inside a function, it retains its value across multiple
invocations of the function, as in the following example:
[Embedded program: 2_2_static.c]
where data_type is the type of the variable, e.g., int , char , float , etc.
Exercise
Write a program that prompts the user for an integer between 1 and 20 and then prints the
factorial of that integer.
6/8/24, 1:56 PM Exploration: Arrays & Structures: OPERATING SYSTEMS I (CS_374_400_S2024)
Arrays
We can define arrays of any of the basic data types (in addition to other user defined types as
we will see later). You can declare an array by specifying the name of the array, the data type of
its elements and the number of elements the array can store. Let’s look at a program that uses
an array of integers
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-arrays-and-structures?module_item_id=24111738 1/9
[Embedded program: 2_3_int_array.c]
In this program, we have declared an array of size 5. Note that individual elements of an array
are read and written to just like an ordinary variable of that type. Elements in the array are
referenced via a 0-based index whose value goes up to the size of the array minus 1. In our
example, the element index goes from 0 to 4 as shown in the following figure.
The elements of an array are allocated together in memory. The index of the first element
in the array is 0.
[Embedded program: 2_3_string.c]
In the program we declare two arrays of characters. Let’s first look at the array of characters
named message . We didn’t specify the size of the array. However, we initialize this array to the
string "Hi!" which occupies 4 characters, including the terminating null character, as shown in the table below.
A string is terminated with a null character: You must make sure that the array is large
enough to hold the null character in addition to the other characters in the string.
Instead of using
char message[] = "Hi!";
we could equivalently have explicitly written the size of the array as follows:
char message[4] = "Hi!";
Yet another way of declaring an equivalent array would be to individually specify the characters
as follows:
char message[] = {'H', 'i', '!', '\0'};
In our program, we also have a character array named state of size 3. We use this array to
store the input entered by the user.
scanf("%s", state);
As we see in this program, both printf and scanf work with strings. The format character
used for strings is %s . scanf will read the user input until whitespace is encountered and store
the value in state , terminating the value with a null character. Note that we didn’t write an &
before state . We will look at the reason for this in the next exploration. A second thing to keep
in mind is that the string entered by the user must fit in the array state , including the
terminating null character.
./main
If the program required input from the user, we used scanf to get that input. However, it is also
possible to pass arguments to our C program from the command line. To do this, we use a
version of main that takes two parameters:
int main(int argc, char *argv[])
When a program with this version of main is executed, the integer argument argc is set to the
number of space delimited strings entered on the command line and the array argument
argv holds each of the strings entered on the command line. For example, if the program is
executed as follows:
./main
then argc will be 1 and the array argv will have one element with the value "./main". If it is
executed as follows:
./main file.txt 8
then argc will be 3 and the array argv will have 3 string elements with the values "./main",
"file.txt" and "8".
Example
The following program is a variation of the factorial program from the previous exploration. This
program must be called with an integer value and it will compute and print the factorial of this
integer value.
[Embedded program: 2_factorial_args.c]
To run this program, on the command line specify an integer value between 1 and 20 after the
command for the program. For example,
./main 10
Structures
Structures provide a way to group members together which can have different data types.
Structures in C are somewhat like a class in Java or C# but without the methods. We refer to the
members (or elements) of a structure by using a dot after the name of a variable of struct data
type.
Example
In the following program, we define a structure student with the elements name , studentId and
major . We then create two variables with this struct as their data type.
[Embedded program: 3_struct.c]
Linked lists in C are frequently created by defining a structure that includes a pointer (pointers are
discussed in the next exploration) to its own type, e.g.,
struct student {
char* name;
int studentId;
char* major;
struct student *next;
};
6/8/24, 1:56 PM Exploration: Pointers: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Pointers
Introduction
We now come to pointers - one of the most powerful and unique features of C.
Pointers hold addresses of variables in memory and allow us to indirectly access these
variables. Proper use of pointers facilitates writing very performant programs in C. On the other
hand, improper use of pointers is the cause of many bugs and vulnerabilities in C programs!
Pointers in C
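A pointer in C is a pointer to a specific data type. We can thus have a pointer to an int, char,
float, etc. We can use pointers as variables and in expressions. There are two fundamental
operators related to pointers:
The address-of operator & gives the address in memory of a variable.
The dereference operator * gives access to the value stored at the address held by a pointer.
A pointer variable can be declared in either of the following styles:
type* var;
type *var;
Here type is the type of the variable whose address the pointer will hold, e.g., int , char ,
etc.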
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-pointers?module_item_id=24111739 1/9
[Embedded program: 2_4_pointer_intro]
We set the value of the pointer p to the address in memory where the variable i is stored via
the following statement
p = &i;
Thus, when you run the program you will see that &i and p have the same value.
Aside: printf prints values of pointers via %p . printf also requires that the pointer to be
printed be cast to (void*) . We will study type casting in a later
exploration.
When p holds the value of the address of i , we can access the value of i by using *p , as in
the following statement:
Using p , not only can we access the value in i , but additionally by using *p on the left-hand
side of an assignment statement, we can also update the value of i as in the following
statement:
If you run the following program, you will see that the values of i and j remain the same
in main even after swap is called. This is an example of what is sometimes termed pass-by-value.
Here is what is happening:
In the function main , memory is allocated for the two integers i and j
When the function swap is called, memory is allocated for its parameters val1 and val2
The value of i is copied into the memory allocated for val1 and value of j is copied into
the memory allocated for val2
Within the function swap the values of val1 and val2 are indeed swapped with each other,
but this has no effect on i and j in main .
[Embedded program: 2_4_swap_unsuccess.c]
Now let’s look at another version of swap where swap is passed pointers to i and j .
Within the body of swap we dereference these pointers to swap the values of the variables
i and j , whose addresses are stored in these pointers.
This style of providing access to a variable allocated outside a function by passing a pointer to
the variable is sometimes called pass-by-reference or call-by-reference.
[Embedded program: 2_4_swap_success.c]
int numbers[10];
int* ptr = numbers;
C supports arithmetic operations on pointers. For example, if we set the pointer to the start of an
array and then add one to the value of the pointer, we get the address of the next element of the
array. This means that given the declaration of numbers and ptr above, the expressions in
each of the following lines are equivalent:
When we pass an array as an argument to a function, the array is passed by reference, i.e., the
parameter value is the address of the first element of the array. Let’s look at a program that
illustrates these concepts:
[Embedded program: 2_4_array_arg]
main calls the function findMax as findMax(numbers) . Thus, the argument to findMax is a
pointer to the first element of the array numbers .
In findMax we process the array by adding an integer value to this pointer and then
dereferencing it to get the value of the element at that index. Specifically:
The expression ptr + i adds the integer i to the pointer ptr and evaluates to the
address of the element at index i in the array numbers .
The expression *(ptr + i) dereferences this pointer to get the value of the element at
index i in the array numbers .
Behind the scenes, C automatically computes the address of the element based on the size of the
data type. In this example, the value of ptr + 1 will be the value of ptr plus the size of an int. If
we had a pointer fPtr which was a pointer to a float then the value of fPtr + 1 will be value of
fPtr plus the size of a float.
Exercise
The following program defines a function stringLength that returns the length of a string. The
code of the function is incomplete. At the point indicated in the program comments, add a line to
complete the function.
[Embedded program: 2_4_strlen_ex.c]
6/8/24, 1:56 PM Exploration: Memory Allocation: OPERATING SYSTEMS I (CS_374_400_S2024)
Memory Layout of a C Program: From the lowest address to the highest address the
segments are Code, Data, Heap and Stack. Code and Data segment have fixed size. Heap
grows towards higher address. Stack grows towards lower address.
Text Segment
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-memory-allocation?module_item_id=24111741 1/8
The text segment, which is also called the code segment, contains the object code for the program.
The size and contents of this segment don’t change over the execution of the process.
Data Segment
The data segment contains the memory allocated for global and static variables. This memory is
allocated at the start of the program. Note that at a more granular level, separate segments
exist for uninitialized global and static variables, and initialized global and static variables. But
we are going to ignore this level of granularity.
Stack Segment
The stack segment contains memory for non-static and non-global variables. The contents of
the stack segment are maintained in a LIFO (last in first out) order, just like a stack data
structure that you have studied in your data structure class. When a function is called, the stack
grows. The arguments of the function are placed on the stack segment, followed by the return
address of the calling function and then the local variables of the function itself. This is referred
to as the stack frame (or activation frame) of the function. When a function returns, the stack
frame of the function is removed from the stack segment, and the stack segment shrinks in size.
Thus, the stack segment grows and shrinks as the process execution continues.
Heap Segment
The heap segment contains memory that is dynamically allocated. The C functions malloc and
calloc are used to allocate memory on the heap, while the function free is used to deallocate
or free-up previously allocated memory.
malloc is declared as follows:
void *malloc(size_t size);
malloc takes one argument which is the size in bytes of the block of memory being requested
( size_t is an unsigned integer type). If malloc is successful it returns a pointer to the
memory it allocated. If malloc fails to allocate a block of memory of the requested size, it
returns NULL . This can happen, e.g., if there isn’t any block of memory available in the heap
segment that can satisfy the request. malloc doesn’t initialize the memory it returns.
calloc is declared as follows:
void *calloc(size_t numItems, size_t size);
The first argument to calloc specifies how many elements to allocate while the second
argument specifies size of each element. calloc will try to allocate a block of numItems * size
bytes. calloc initializes the memory to 0. Just like malloc , calloc will return a pointer to the
memory it allocated on success, and a NULL on failure.
sizeof Operator
C provides a sizeof operator that returns the size of its argument in bytes. The argument of
sizeof can be a data type, a variable, or an expression. This operator is very commonly used
when calling malloc or calloc , as in the following example:
[Embedded program: 3_malloc.c]
free is declared as follows:
void free(void *ptr);
This function takes one argument, a pointer to the start of the memory block to be returned to
the heap segment. This would be the value that is returned by calloc or malloc when the
memory block is allocated.
Example
In the following program, we create a linked list with elements of type struct customer by using
malloc to allocate memory for each element of the linked list. Note that the members of a
[Embedded program: 3_2_linked_list.c]
Pitfalls
The power of C features such as pointers and dynamic memory allocation can lead to bugs if
these aren’t used properly. We now discuss some of the most common memory-related bugs in
C programming.
Memory leaks
A memory leak occurs when a program allocates memory on the heap, but doesn’t free this
memory even when it is no longer needed, as in the following program:
3_mem_leak.c
#include <stdio.h>
#include <stdlib.h>

void memoryLeak(){
    // Memory is dynamically allocated on the heap but ptr goes out of scope when the function returns.
    // As long as the program runs, this memory is not available for allocation, but it will never be
    // used because the only reference to it is gone.
    int* ptr = (int*) malloc(sizeof(int) * 10);
    return;
}

int main(void) {
    memoryLeak();
    return 0;
}
Memory leaks reduce the amount of heap segment that is available for memory allocation. In
long running programs, such as servers and daemons, this can be a severe problem that can
result in program termination when requests for memory allocation can no longer be serviced
because of a large number of memory leaks.
Buffer Overflow
A buffer overflow happens when a program writing data to a buffer in memory overruns the
boundary of the buffer. Let’s look at the following program to see an example:
[Embedded program: 2_3_string.c]
Here the user is asked to enter the 2 letter code for the state they live in. If instead they enter a
very long string, our program can run into problems. For example, run the program and enter
the string Thisnameisveryverylong at the prompt. The program will crash due to the following
reason:
6/8/24, 1:57 PM Exploration: Strings: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Strings
Introduction
A string in C is simply an array of characters terminated by the null character. It is
not a basic data type, unlike in many newer languages, such as Java or C++. C includes
many functions for working with strings. We will first look at various ways of declaring strings
and their implication on how memory is allocated for the string. We will then study some basic C
library functions for manipulating strings.
While both these declarations will create a string with the same characters, there are important
differences in how memory is allocated in each case. Let’s look at each of these in turn.
In the following program, we initialize the variable myString using this declaration and later try
to change the value of a character in myString .
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-strings?module_item_id=24111743 1/9
[Embedded program: 3_4_stringdec.c]
If you run the program, it will crash. Here is what causes this crash:
The string literal that myString points to is stored in the read-only portion of the Data
Segment.
When we try to change a character using the pointer myString , we are trying to change
memory in the read-only portion of the Data Segment, which causes the program to crash.
The memory allocated on the stack is editable, so the following program runs successfully.
[Embedded program: 3_4_str_stack.c]
[Embedded program: 3_4_str_functions.c]
strcmp
int strcmp(const char *s1, const char *s2);
This function compares the strings s1 and s2. If both strings are equal it returns 0, otherwise it
returns a non-zero value. Note that you mustn’t use the == operator to compare two strings for
equality because that will compare the values of the two pointers rather than the contents of the
strings.
The related function strncmp takes a 3rd argument n and compares at most the first n characters of the
two strings.
strlen
size_t strlen(const char *s);
The function strlen returns the length of the string pointed to by the variable s . This length
doesn’t include the null character at the end of the string.
strcpy
char *strcpy(char *dest, const char *src);
The function strcpy copies the string pointed to by src to the buffer pointed to by dest . It
returns the pointer dest . The copied string includes the null character at the end. If the
destination buffer isn’t large enough to hold the string, this will cause buffer overflow.
The related function strncpy takes a 3rd argument n and copies at most n characters of src to
dest . If src is shorter than n bytes, then strncpy sets the remaining bytes to the null
character. If the first n bytes of the source string don’t contain the null character, then dest will
not be null terminated.
strcat
char *strcat(char *dest, const char *src);
The function strcat appends the string pointed to by src to the string pointed to by dest . The
function overwrites the terminating null character in dest and adds a terminating null character
at the end of the concatenated string. It returns the pointer dest . The array of characters
pointed to by dest must have enough space for the concatenated string, otherwise buffer
overflow will occur.
The related function strncat takes a 3rd argument n and concatenates at most n characters of
src to dest .
strdup
char *strdup(const char *s);
The function strdup creates a duplicate of the string s using malloc and returns a pointer to
this new string. The caller is responsible for releasing it with free when it is no longer needed.
strtok
char *strtok(char *str, const char *delim);
strtok can break a string into a sequence of nonempty tokens using the characters provided in
the delim string as delimiters. The token returned by strtok is a null-terminated string that
doesn't include the delimiting character used to tokenize the
input string. If no more tokens are found then strtok returns NULL.
The string to be parsed is provided as the first argument in the first call to strtok.
In subsequent calls to parse the current string, the first argument should be NULL.
If a call to strtok passes a non-null string as the first argument, then strtok will start parsing
this string, regardless of whether the previous string had been completely parsed or not.
The characters in delim can be different in successive calls that are parsing the same
string.
When parsing a string, strtok overwrites the delimiter character at the end of the current token
with the null character. This means that strtok changes the input string during the parsing.
Therefore, if we pass a string that is stored in the read-only data segment, such as a string literal,
to strtok for tokenizing, the program will typically crash. strtok treats a sequence of two or more
contiguous delimiter characters as a single delimiter, so it never returns an empty token.
strtok_r
char *strtok_r(char *str, const char *delim, char **saveptr);
The strtok_r function is a reentrant version of strtok . We will study the concept of reentrant
functions in a later module. But for tokenizing strings, this means that with strtok_r we can
parse different strings concurrently, whereas with strtok we cannot parse different strings
concurrently. With strtok_r , the 3rd argument saveptr is used to maintain context between
successive calls to parse a string.
Example
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
6/8/24, 1:57 PM Exploration: Debugging C: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Debugging C
Introduction
Modern IDEs (Integrated Development Environments) such as Visual Studio, Eclipse, and
Visual Studio Code come with excellent tools for code generation, compilation, debugging
and stepping through live code. However, most Unix systems come with the debugger gdb
installed on them. Additionally, the tool valgrind is helpful in finding memory leaks in C
programs on Unix. We take a look at gdb and valgrind next.
gdb
gdb is a debugger we can use with gcc . We need to compile our program with the -g flag.
This will cause gcc to add debugging information in the executable which is used by gdb . We
can then start the debugger on the executable program. For example, we can compile a file
testit.c and run the debugger on the executable file testit by running the following 2
commands:
$ gcc -g -o testit testit.c
$ gdb ./testit
When running gdb, here are some key commands you can run:
run :: (re)starts the program running; will stop at breakpoint (can add args, e.g.: run 6 myfile)
break :: sets a breakpoint where the debugger will stop and allow you to examine variables or single step
step :: executes a single line of C code; will enter a function call
next :: executes a single line of C code; will not enter a function call
continue :: continues execution again until another breakpoint is hit or the program
completes
print :: prints out a variable
quit :: stop debugging (exit gdb)
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-debugging-c?module_item_id=24111745 1/3
valgrind
valgrind is a tool that helps find memory leaks in C programs. We need to compile the program
with the -g flag to get better diagnostics and can then run the program under valgrind using
appropriate flags.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration:
6/8/24, 1:58 PM Exploration: Data Types, Modifiers, Qualifiers & Conversion: OPERATING SYSTEMS I (CS_374_400_S2024)
Booleans
C didn’t have a basic Boolean data type for a long time. C99 added a basic data type
called _Bool which stores a value 0 or 1; it has the funny name to avoid name conflicts with
legacy code that often had custom type aliases (e.g. typedef char bool ). The header file
stdbool.h defines the type alias bool for _Bool . This header file also defines the
preprocessor macros true (1) and false (0). Typically, you should include this header and use
the "friendly" names provided, rather than the _Bool type. This gives legacy programs the
option to avoid name conflicts by not including stdbool.h .
In C, the value 0 is considered "false", while any non-zero value is considered "true"--for
example, as the controlling expression of an if, while, or for statement. The same is true with
bool : when assigning a non-zero value to a bool object, it is assigned the value 1 (true),
otherwise, 0 (false).
This value conversion step introduces an additional, often unnecessary, run-time overhead.
With most data types, type conversion rules are designed to mimic the behavior of a single mov
instruction in assembly. In contrast, assigning a value to a bool usually takes at least two
instructions: one to test whether the value is zero and one to store the resulting 0 or 1.
Additionally, beware of type conversions in comparisons: 0 == false , but 2 != true for the
same reason 2 != 1 .
Example
In the following program we can see that the values 2.0 and 'a' both evaluate to true.
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-data-types-modifiers-qualifiers-and-conversion?module_item_id=24111748 1/15
#include <stdio.h>
#include <stdbool.h>
int main(void) {
_Bool b = 1;
bool aBool = true;
char c = 'a';
float f = 2.0;
if(b){
printf("variable b evaluates to true\n");
}
if(aBool){
printf("aBool is true\n");
}
if(c){
printf("variable c evaluates to true\n");
}
if(f){
printf("variable f evaluates to true\n");
}
return 0;
}
Unions
Union data types in C are similar to structures in that we define unions with members that have
different names and types. However, an actual object of a union data type can hold only one of
the members at a time--all of the members share overlapping storage. Essentially unions
provide a means for the same object to hold a value of a different data type based on which
union member was last assigned a value.
Example
#include <stdio.h>
int main(void) {
/* A `union token` type can hold either a value of type `int` or `char` */
union token{
int num;
char oper;
};
return 0;
}
Unions are commonly used in embedded programming contexts where memory is limited,
because they allow the same storage to be reused for different purposes.
Additionally, unions are used in low-level programming for type-punning: data may be written in
as one data type using one member, and read as another data type using another member. The
language guarantees that the data bytes of the type used to write in a value will be directly
reinterpreted as data bytes of the type used to read out a value. This can allow the same
physical data to be viewed according to different interpretations, but requires an understanding
of the actual representation of objects, which is implementation-defined. For example the
following real-life code example shows how unions and structures are used to represent the
registers of an AMD64 CPU,
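#include <stdint.h>
struct registers { /* struct name assumed; the start of the listing is missing */
  /* Register A (reconstructed by analogy with Register C below) */
  union {
    struct {
      uint8_t al;
      uint8_t ah;
    };
    uint16_t ax;
    uint32_t eax;
    uint64_t rax;
  };
  /* Register C */
  union {
    struct {
      uint8_t cl;
      uint8_t ch;
    };
    uint16_t cx;
    uint32_t ecx;
    uint64_t rcx;
  };
  /* Register D */
  union {
    struct {
      uint8_t dl;
      uint8_t dh;
    };
    uint16_t dx;
    uint32_t edx;
    uint64_t rdx;
  };
  /* ... the listing is cut off here ... */
};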
The signed keyword can also be used to explicitly denote a signed integer type, although this is
redundant-- signed int and int are the same exact type. However, there is one exception: the
types char , unsigned char , and signed char are distinct. The signed char and unsigned char
types are generic integer types with a size of one byte. In contrast, char is a special type which
is intended to be used only for representing ASCII characters. Since ASCII only uses values 0-127,
either a signed or unsigned single-byte value can accommodate that range, and the
decision is left up to the implementation to use whichever is most efficient.
short
The type short int , which can also be written more concisely as short , is an integer type with
a minimum range equivalent to a 16-bit integer.
long
The type long int , which can also be written more concisely as long , is an integer type with a
minimum range equivalent to a 32-bit integer. On 64-bit Unix systems, it is typically a 64-bit integer
with the same size as long long .
long long
The type long long int , which can also be written more concisely as long long , is an integer
type with a minimum range equivalent to a 64-bit integer.
int
The plain int type is an integer type intended to have the natural size suggested by the target
architecture. Its minimum size is 16 bits, but it is typically 32 bits on
modern systems, including those with 64-bit registers.
Example
The following program prints out the size of the basic integer types. Note that the value returned
may be different on different machines and compiler settings, depending on the data model
(https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models) used.
#include <stdio.h>
#define print_size(t) \
printf("sizeof (" #t ") == %zu\n", sizeof (t))
int main(void) {
print_size(char);
print_size(short);
print_size(long);
print_size(long long);
print_size(int);
return 0;
}
Type Qualifiers
C provides a few type qualifiers, const , volatile , and restrict , of which const is the most
(mis)used.
const
Optimizing compilers examine the usage of objects in a program--if they can determine
conclusively that an object is never modified, they can make a lot of important optimizations. For
example, they can store such an object in read-only memory, and if the value of such an object
is known at compile-time, it can be directly substituted into any expressions where that object is
evaluated. Additionally, if an object is never modified, the compiler can reorder accesses to it in
order to optimize the overall program. All of these tricks can dramatically improve the
performance and resource usage of a program.
However, there are certain boundaries that are opaque to a compiler--for example, it cannot
possibly determine if externally linked objects are ever modified outside of the current source
file. Likewise, if the address of an object is ever passed to an external function defined in
another source file, the compiler can no longer determine that such an object is never modified.
For example, carefully compare how the compiler is able or unable to optimize the following
three functions,
/* z has internal linkage, but its address is still exposed externally. The
* compiler cannot guarantee that z is never modified. No optimizations are
* possible, just as with x */
static int z = 37;
int *zp = &z; /* expose the address of z externally */
The C language gives the programmer the ability to explicitly assure the compiler that an object
is never modified, by qualifying its type with the const qualifier. This allows the compiler to
make optimizations it otherwise would not have been able to, but requires the programmer to
ensure that the constraint is observed--it is undefined behavior for a const-qualified object to be
modified. Notice how const-qualifying the declarations in the example above allows the compiler
to make the same optimizations in each example:
/* z has internal linkage, and its address is still exposed externally, but it
* is const-qualified. The compiler has been assured that z is never modified.
* Just like with x, z can be optimized, but unlike y, space must still be
* allocated for it since its address might need to be known */
static int const z = 37;
int const *zp = &z; /* expose the address of z externally */
Most compilers attempt to warn on attempted modification of const-qualified objects, but they
aren't required to. In fact, in situations where const-qualification helps the compiler make
optimizations it wouldn't have been able to otherwise, it's also not capable of identifying if the
constraint is being violated by an illegal modification. In essence, don't mistake
const-qualification for a safety net. Actually, it is the opposite--it is a promise made to the compiler that
the programmer will ensure that the object in question is never modified. A great many bugs
are the result of programmers failing to understand this.
Type Conversion
Type conversion, also called type casting, is a way of changing the data type of one expression
into another data type. In C, type conversion may be implicit or explicit. When a value of one
type is converted to another type, any component of the value, such as a fractional component
on a float to int conversion, that cannot be represented in the target type may be lost.
Additionally, out-of-range conversions need care: converting a floating-point value whose integer
part does not fit in the target integer type is undefined behavior, and converting an out-of-range
integer value to a signed integer type gives an implementation-defined result.
Implicit type conversion is automatically performed by the compiler. A frequent case is when the
compiler performs implicit type conversion on arithmetic expressions that include objects of
different data types, as in the following example:
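int main() {
int i = 1;
float f = 10.2;
/* The rest of the listing is missing; this assignment is an assumed completion. */
i = f; /* implicit conversion from float to int: i becomes 10 */
return 0;
}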
In the example, the value of a variable is converted from float to int . This causes truncation
of the value as the fractional part is lost.
#include <stdio.h>
int main() {
/* In this example, the addition operation is carried out with types `float`.
* Next, the result, 3.8f, is converted to an `int`, value 3, on assignment.
*/
int a = 1.9f + 1.9f;
printf("a = %d\n", a);
return 0;
}
Tip: Be careful when mixing arithmetic data types to avoid issues caused by unexpected
type conversion.
Pointers to Functions
In C we can define pointers to functions, just as we can define pointers to data types. To declare a
pointer to a function, we need to specify the name of the pointer, the return type of the function
and the types of its arguments.
Example
The following statement declares fptr as a pointer to a function which takes 2 integers as its
arguments and returns an integer.
int (*fptr)(int, int);
Notice that this is different from int *fptr(int, int); which instead would declare a function with
return type int * .
Example
In the following program, we declare a function pointer fptr and then initialize this function
pointer to a function mult .
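#include <stdio.h>
/* The definitions of mult and add are missing from the listing; these are assumed. */
int mult(int a, int b) { return a * b; }
int add(int a, int b) { return a + b; }
int main(void) {
  /* fptr is a pointer to a function that returns an int and takes 2 arguments,
   * both of which are int */
  int (*fptr)(int, int);
  fptr = mult; /* point fptr at mult */
  /* We can then call the function `mult` through the pointer `fptr` */
  printf("fptr(10, 20) = %d\n", fptr(10, 20));
  fptr = add; /* point fptr at add */
  /* We can then call the function add() through the pointer `fptr` */
  printf("fptr(10, 20) = %d\n", fptr(10, 20));
  return 0;
}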
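void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));
The qsort() function, declared in stdlib.h , sorts an array with nmemb elements of size size . The base argument
points to the start of the array.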
The contents of the array are sorted in ascending order according to a comparison function
pointed to by compar , which is called with two arguments that point to the objects being
compared.
The function that must be provided as compar takes two pointers to array elements and returns an
integer. It is required that this function should return
an integer less than zero if the first argument is less than the second argument
0 if the arguments are equal
an integer greater than 0 if the first argument is greater than the second argument.
Exercise
Fill in the code for the function comparator so that it can be passed to qsort .
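#include <stdio.h>
#include <stdlib.h>
int comparator(const void *a, const void *b) { return 0; /* TODO: compare the ints a and b point to */ }
int main(void) {
  int values[] = {10, 6, 8, 2, 12, 13};
  qsort(values, sizeof values / sizeof values[0], sizeof(int), comparator);
  for (int i = 0; i < sizeof values / sizeof values[0]; ++i) {
    printf("%d ", values[i]);
  }
  return 0;
}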
Solution
#include <stdio.h>
#include <stdlib.h>

/* void* is used as a "generic" pointer to data of any type. */
int comparator(const void *a, const void *b) {
  /* Convert the void pointers to int pointers
   * Note: It is the responsibility of the caller to ensure that a and b point
   * to integers and not something else, otherwise this would be undefined
   * behavior
   */
  int const *_a = a;
  int const *_b = b;
  /* TODO: Return the correct value that represents the comparison of the values
   * _pointed at_ by _a and _b */
  return *_a - *_b;
}

int main(void) {
  int values[] = {10, 6, 8, 2, 12, 13};

  qsort(&values, sizeof values / sizeof values[0], sizeof(int), comparator);

  for (int i = 0; i < sizeof values / sizeof values[0]; ++i) {
    printf("%d ", values[i]);
  }
  printf("\n");
  return 0;
}
Example
In the following example, we declare two type names, string for a pointer to a character and
student for struct student .
Now instead of char* we can use string (don't actually do this, it's just for demonstration).
Similarly, instead of struct student we can use student .
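#include <stdio.h>
typedef char *string;           /* assumed: the typedef lines are missing from the listing */
typedef struct student student; /* assumed, as above */
struct student {
  string name;
  int id;
  string major;
};
void print_student(student s)
{
  printf("%s (id: %d) is a %s student at OSU!\n", s.name, s.id, s.major);
}
int main(void) {
  student s = {"Benny", 1234, "CS"}; /* sample values, not from the original */
  print_student(s);
  return 0;
}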
Additional Resources
C Data Types Including Details of Modifiers (https://en.wikipedia.org/wiki/C_data_types)
Type Conversion in C Like Languages (https://en.wikipedia.org/wiki/Type_conversion#C-like_languages)
For a discussion of typedef see Chapter 6.4 in the book Modern C
6/8/24, 1:58 PM Exploration: Files: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Files
Introduction
Files are key abstractions provided by an OS. For a system programmer, a file is a
stream of bytes that can be accessed as a linear array of bytes. In Unix, all I/O devices are
modeled as files. A very visible aspect of this is using files to provide access to persistent data
stored on secondary storage. But additionally, other I/O devices, such as networks, terminals
and printers are also modeled as files.
Path
A file or a directory can be identified by its path, which specifies a unique location for it. A path
can be absolute or relative.
Absolute Path
An absolute path always starts with a / and specifies the location of the file or directory relative
to the root of the directory tree structure.
Example: If we run the shell command pwd , it will display the current working directory as an
absolute path as the following example shows:
$ pwd
/nfs/stak/users/chaudhrn
Relative Path
A relative path starts at the current working directory. It does not start with a /. Here are some
examples of specifying relative paths:
The path to the file foo.txt in the current working directory
foo.txt
The path to the file foo.txt in the current working directory using a dot, which is a shortcut for the
current working directory
./foo.txt
The path to the file bar.txt in the directory textFiles in the current working directory
textFiles/bar.txt
The path to the file bar.txt in the directory textFiles in the current working directory using a
dot
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-files?module_item_id=24111749 1/7
./textFiles/bar.txt
The path to the file baz.txt in the parent directory using the double dot short-cut which goes
one level up
../baz.txt
int open(const char *pathname, int flags, mode_t mode);
The pathname argument is the path to the file, which can be relative or absolute.
The next argument is flags
This argument must include one of the three access modes that specify how the
process wants to access the file
O_RDONLY for reading only,
O_WRONLY for writing only, or
O_RDWR for both read and write.
In addition it can also include zero or more file creation and file status flags which are
OR-ed together.
The file creation flags impact the behavior of the open operation
The file status flags impact the I/O operations carried out on the file after open
The mode argument specifies the access permission to be set when a new file is created
and is relevant only when a file is to be created and is ignored otherwise.
Example
In the example
We are opening the file for reading and writing using the flag O_RDWR.
We are specifying two file creation flags O_CREAT and O_TRUNC . The flag O_CREAT asks for
creating an empty file if one does not exist and O_TRUNC asks for truncating the file if it
already exists. Another useful flag is O_APPEND . If we want every write operation to append to
the end of the file, we will use the flag O_APPEND .
The values of the mode argument create the file so that it is readable and writeable by the
file owner. We discuss file permissions in a later exploration in this module.
If the open system call succeeds it returns a file descriptor, which is a small non-negative
integer. The file descriptor can be used in subsequent calls to read from and write to the file. If
the open system call fails, it returns the value -1 and the reason for the error is stored in errno ,
which can be printed by using the perror function.
When a process is done with a file, it can close it by calling the close function:
int close(int fd);
Exercise
The following C program tries to open a non-existent file. In the code, set the flags argument
so that the open system call fails. What is the error message printed by perror ?
ssize_t read(int fd, void *buf, size_t count);
The read system call reads from the file descriptor fd , starting at the current file offset, into
the buffer pointed to by buf . The argument count specifies how many bytes read should
attempt to read. read returns the number of bytes it transferred into the buffer. If there is an
error, the value of -1 is returned. If read encounters end-of-file, it returns 0.
The read system call does not allocate memory for the buffer. Thus, the buffer must be large
enough to accommodate count bytes. Read also does not put in a null character at the end of
the buffer. If needed, we must do that ourselves and the buffer must be large enough to add that
character. There are cases where read will transfer fewer than count bytes. This will happen
when, e.g., we are near the end of a file and read encounters the end of the file before count
bytes. At the end of read, the file offset will be incremented by the number of bytes read by read.
ssize_t write(int fd, const void *buf, size_t count);
The write system call writes bytes from the buffer buf to the file corresponding to the file
descriptor fd . The argument count specifies the number of bytes to write. write returns the
number of bytes it writes to the file. This number may be less than count if, e.g., there is
insufficient space on the disk where the file is stored. If there is an error, the value of -1 is
returned. The file offset is incremented by the number of bytes that were written. Recall that we
can open a file with the O_APPEND flag. In that case, the file offset will first be set to the end of
the file and then write will write the bytes to the file.
Exercise
Run the following example of writing to a file and reading from it:
The following example is only slightly different from the above program. However, the read
system call does not read anything from the file. Why is that?
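Answer: After the write, the file offset is at the end of the file, so read starts there,
encounters end-of-file immediately, and returns 0.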
off_t lseek(int fd, off_t offset, int whence);
The lseek system call repositions the file offset associated with the file descriptor fd according to
the argument offset and the directive whence , whose possible values include the following:
SEEK_SET set the file pointer to the byte specified in the argument offset
E.g., move to byte #16
lseek(fd, 16, SEEK_SET)
SEEK_CUR set the file pointer to the current value plus offset bytes
E.g., move forward 4 bytes
lseek(fd, 4, SEEK_CUR)
SEEK_END set the file pointer to the end of file plus offset bytes
E.g., move to 8 bytes from the end
lseek(fd, -8, SEEK_END)
Exercise
Modify the following program to read from the start of the file by using lseek system call:
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration:
The man page for the open system call is available here (https://man7.org/linux/man-pages/man2/open.2.html)
File manipulation is discussed in Chapter 4 of the book The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
A good discussion of files is provided in the chapter Files and Directories
(http://pages.cs.wisc.edu/~remzi/OSTEP/file-intro.pdf) in the book Operating Systems: Three
Easy Pieces
Arpaci-Dusseau, Remzi H., and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, 2018.
6/8/24, 1:58 PM Exploration: stdin, stdout, stderr & C I/O library: OPERATING SYSTEMS I (CS_374_400_S2024)
Schema of POSIX and C standard input, output and error: the image shows that a process's
standard input is associated with the keyboard, and its standard output and standard error with
the text terminal.
Standard input has the file descriptor value of 0. A program reads its input data from standard
input. Instead of using the magic number 0, we can use the constant STDIN_FILENO for the file
descriptor which is defined in unistd.h . When we run a program from a shell, by default standard
input is associated with the keyboard.
Example: The echo command displays a line of text. By default, it will display that line of text on
the terminal.
We can redirect this output to a file named echo.txt by using the output redirect operator as
follows:
Writing to a Stream
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream)
Repositioning a Stream
stdio also provides functions (https://man7.org/linux/man-pages/man3/fseek.3.html) to reposition
a stream. One such function is fseek, which is the stdio version of lseek and sets the file position
indicator for the specified stream.
int fileno(FILE *stream) to get the file descriptor associated with a stream
void clearerr(FILE *stream) to clear the end-of-file and error indicators for a stream
As an alternative, we can use fgets or getline to read user input. In certain cases, this can
require us to do some parsing of the user input that fscanf can automatically do based on the
formatting string provided to it.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration:
6/8/24, 1:58 PM Exploration: Directories: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Directories
Introduction
In this exploration, we study directories. We first look at directories and the API for
manipulating them. We then discuss how we can get meta-data about files and directories.
Directories
A directory is in many ways a file. In Unix, however, it is a special kind of file rather than an
ordinary text file: its contents are maintained by the file system, and programs read it through a
directory API instead of reading its bytes directly. Internally, a directory connects each
human-readable filename to an inode. An inode is a structure that is maintained by the OS file
system. Each entry in a directory refers either to a file or to another directory.
We can create and remove directories using the mkdir and rmdir system calls:
mkdir
rmdir
opendir
To open a directory, we use the function opendir (https://man7.org/linux/man-
pages/man3/opendir.3.html)
#include <sys/types.h>
#include <dirent.h>
DIR *opendir(const char *name);
The opendir function opens the directory corresponding to the specified name and returns a
pointer to the directory stream. This stream is positioned at the first entry in the directory. Note
that there is a file descriptor associated with the directory stream, but the directory stream
provides us a higher level abstraction with the file descriptor automatically associated with the
stream.
closedir
To close a directory, we use the function closedir (https://man7.org/linux/man-
pages/man3/closedir.3.html)
#include <sys/types.h>
#include <dirent.h>
int closedir(DIR *dirp);
The closedir function closes the directory stream corresponding to the argument dirp . This
call will also close the underlying file descriptor.
readdir
To read the directory entries, we use the function readdir (https://www.man7.org/linux/man-
pages/man3/readdir.3.html)
#include <dirent.h>
struct dirent *readdir(DIR *dirp);
The readdir function returns a pointer to a structure dirent which corresponds to the next
directory entry in the directory stream associated with dirp . When we have reached the end of
the directory stream, readdir will return a NULL value.
The dirent structure has several fields, but POSIX.1 mandates only 2 of them: the inode number
and the filename. The structure definition is shown below with those 2
fields:
struct dirent {
ino_t d_ino; /* Inode number */
…
char d_name[256]; /* Null-terminated filename */
};
Example
Let us look at an example of using these functions to read a directory:
3_4_directory.c @cs344
stat
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
int stat(const char *pathname, struct stat *statbuf);
Example
In the following program, we use the stat function to find out the name of the file or directory
whose name starts with the prefix student and which was modified last in the current directory:
3_5_stat_example.c @cs344
In this program
We open the current directory and loop through all the entries in the directory by calling
readdir in a loop
We get the meta-data for the entry, which may be a file or a directory, by calling stat
The last modification time of the entry is specified in the element st_mtime
We use the C standard library function difftime (https://man7.org/linux/man-
pages/man3/difftime.3.html) to compare st_mtime with the latest modification time we have
currently seen
Whenever we find an entry with a last modification time later than any entry previously seen,
we call the function memset (https://www.man7.org/linux/man-pages/man3/memset.3.html)
to clear out the buffer entryName and use strcpy to copy the name of this entry into
entryName
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration:
For more details on the system calls and functions discussed in this exploration see the
Linux manual page web site (https://www.kernel.org/doc/man-pages/) .
Chapter 18 of the book The Linux Programming Interface discusses directories.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
A good discussion of directories is provided in the chapter Files and Directories
(http://pages.cs.wisc.edu/~remzi/OSTEP/file-intro.pdf) in the book Operating Systems: Three
Easy Pieces
Arpaci-Dusseau, Remzi H., and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, 2018.
6/8/24, 1:58 PM Exploration: Permissions: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Permissions
Introduction
In this exploration we will discuss file and directory permissions in Unix. We will also understand
how to specify permissions when creating a file or directory.
Read: Read permission on a file allows reading the file. Read permission on a directory allows
reading the names of files in the directory.
Write: Write permission on a file allows modifying the file. Write permission on a directory
allows creating, deleting and renaming the files, but only if the execute permission is also
granted on the directory.
Execute: Execute permission on a file allows an executable program to be executed. Without
this permission the OS will not allow execution of a file. Execute permission on a directory allows
access to file content and meta-data if the file name is known. Reading the names of the files
requires read permission.
This means that in all there are 9 permissions associated with a file or directory:
You can see the permissions by using the command ls -l which displays these permissions in a
symbolic notation consisting of 3 sets of 3 characters, of which:
In each set:
Example
Let us look at an example:
ls -l
total 20
-rw-------. 1 chaudhrn upg11000 1147 Apr 19 11:39 catch-signals.c
-rwxr-x---. 1 chaudhrn upg11000 228 Apr 19 11:39 sigtest.sh
For catch-signals.c (permissions rw-------):
The owner is chaudhrn and the owner permissions are set to rw- . This means the file can be
read and written to by the owner chaudhrn , but the owner cannot execute it.
The group is upg11000 and the group permissions are set to --- . This means no permissions
have been granted to the group.
The permissions for others are also set to --- . This means that anyone who is neither the owner
nor a member of the group upg11000 doesn't have any permissions on the file.
For sigtest.sh (permissions rwxr-x---):
The owner chaudhrn has permissions rwx and is therefore allowed to read, write and execute
the file.
All members of the group upg11000 have the permissions r-x and are allowed to read and
execute the file, but cannot write the file.
Others don't have any permissions on the file.
You may have noticed that there is one character with the value - that appears before the
owner's permissions. This character shows the type of the file and - means that it is a regular file.
For a directory, this character will have the value d .
The value of the octal digit for a scope corresponds to the symbolic notation as follows:
Thus, for a particular scope the value of the octal digit corresponds to the permissions for the
scope as follows:
Now let's look at examples of the full 9 permissions in the symbolic notation along with
the corresponding octal notation (recall that octal integer literals have a 0 at the start, just as
hexadecimal literals have a 0x at the start).
Example
3_5_open_close.c @cs344
The value of the mode argument is 0600. This corresponds to the symbolic notation rw------- .
This means that the owner can read and write to the file, but cannot execute it. No
permissions are granted to the group or to others on this file.
of the file permissions. However, while the mode argument specifies the permissions that should
be granted to the new file or directory, the umask value of the process specifies the permissions
that must be denied to the new file or directory. umask thus helps to guard against cases where a
user might inadvertently grant dangerous permissions to a group or to others.
Example
Suppose we call mkdir with the mode argument set to 0777, i.e., grant all permissions on
the directory to the owner, the group and others. Suppose also that the umask of the process is
set to 0007, i.e., deny all permissions to others. Then the directory will be created with the
permissions 0770.
We can view the current value of umask by using the shell command umask
We can change the value of umask in a process by calling the umask system call
(https://man7.org/linux/man-pages/man1/umask.1p.html) .
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration:
The example table showing symbolic and octal representation is from the Wikipedia page on
File-system permissions (https://en.wikipedia.org/wiki/File-
system_permissions#Numeric_notation) (last accessed Oct 10, 2020).
Chapter 15 of the book The Linux Programming Interface discusses file attributes.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 1:59 PM Exploration: Process Concept & States: OPERATING SYSTEMS I (CS_374_400_S2024)
Process ID
In Unix, each process has a unique identity, called the process ID or pid. The process ID is a
non-negative integer. At any given time, all processes are guaranteed to have a unique ID.
However, when a process terminates, its ID can be reused. The value of the process ID can be
obtained by using the getpid() function.
#include <sys/types.h>
#include <unistd.h>
pid_t getpid(void);
A related function getppid() returns the process id of a process’s parent process (we will look
at parent and child processes later in this module). Let’s see an example of calling these
functions:
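#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    // getpid() and getppid() return pid_t, so we cast to int for %d
    printf("My pid is %d\n", (int)getpid());
    printf("My parent's pid is %d\n", (int)getppid());
    return 0;
}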
The process with ID 1 is generally the init process. When the OS is finishing its startup, it
starts the init process.
Process API
Unix, as well as other modern OSs, provides an API for processes, which typically includes
system calls related to the following areas:
Creating Processes
When we open a new shell, the OS creates a new process. When we run a command in a shell,
the OS creates another new process.
Each process has its own memory space which is allocated for it when the OS creates this
process. The OS abstracts this memory so that to the process this memory space starts at byte
0 even though in the physical memory this allocated memory might start at a different address.
The OS loads this memory with the code of the program that the process is executing. The OS
also allocates memory for the stack and the heap. Recall that C programs use the stack for
function arguments, local variables and return addresses, and the heap for dynamically
allocated memory.
The OS provides an API to create new processes and to run specific programs in a process.
Relevant system calls for this include fork() and the exec() family of system calls.
Terminating Processes
Many processes will stop by themselves when they are complete.
Example: When we run the ls command in a shell, a new process is created to run the
program corresponding to the ls command. This process exits when the command is
completed.
Example: Suppose we start an editor to edit a file. Like all programs, this editor will run in a
process. When we are done updating the file, we ask for this process to be terminated, perhaps
by clicking on an exit button.
Example: When a shell runs the ls command, it wants to wait for ls to finish before again
showing the prompt to the user.
Functions wait() and waitpid() allow a process to wait for another process to finish.
In addition to wait() and waitpid(), there are additional system calls related to controlling and
monitoring processes. These include the following:
We can get or change the priority of a process by using getpriority() and setpriority()
We can get statistics about the time a process has spent on the CPU by using times()
Process States
Over its lifetime, a program may do many different types of operations. E.g., a browser will get
data from a website across the network at some points, and at other times it might write a
downloaded file to the disk. This means that in general a single program cannot utilize the CPU
or the I/O devices all the time.
Modern operating systems provide multi-programming, where they keep multiple processes
ready to run on the CPU, thus giving the user the impression that multiple programs are running
simultaneously. At the level of individual processes this means that a process will be in different
states during its lifetime. The three primary states of a process are as follows:
Ready
In the ready (or runnable) state, a process is ready to be put on the CPU. However, currently
the OS has allocated the CPU to some other process. Or in the case of a system with multiple
CPUs, the OS has allocated all the CPUs to some other processes.
Running
In the running state, the process is running on the CPU, i.e., the instructions of the process are
being executed.
Waiting
In the waiting (or blocked) state, a process is waiting for some event to happen before it can
be ready to run again. An example is when a process initiates an I/O operation and then has to
wait until the I/O operation completes. The process goes to the waiting state and the CPU can
be allocated to another process.
Additional Resources
For more discussion see the chapter The Abstraction: The Process
(http://pages.cs.wisc.edu/~remzi/OSTEP/cpu-intro.pdf) in the book Operating Systems: Three
Easy Pieces by Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau.
Arpaci-Dusseau, R. H., & Arpaci-Dusseau, A. C. (2018). Operating systems: Three easy pieces. Arpaci-Dusseau
Books LLC.
6/8/24, 1:59 PM Exploration: Process API – Creating and Terminating Processes: OPERATING SYSTEMS I (CS_374_400_S2024)
#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);
The new process is referred to as the child process, while the original process that called
fork() is referred to as the parent process. The child process is almost an exact duplicate of
the parent process. The child process has its own memory space which is a copy of the parent's
memory space. The program counter is also copied and will have the same value for the child
process. But there are some differences. The most crucial difference is that fork() returns the
value 0 in the child process, while the pid of the child process is returned by fork() in the
parent process. This allows the child process and the parent process to diverge in their
behavior. Typically, the statement executed immediately after fork() is a branching statement
( if-else or switch ) which uses the value returned by fork() to pick a different branch to
execute in the child process vs. the branch picked for execution in the parent process.
Example
The following program has an integer variable intVal with initial value 10. When we run the
program, the process running the program calls fork() . This creates a new child process. The
switch statement uses the value returned by fork() to run different code in the child process
vs. the parent process. In the child process, 1 is added to the variable intVal , whereas in the
parent process 1 is subtracted from the variable intVal . When you run the program, you will
see that in the parent the value of the variable intVal is 9 and in the child it is 11.
If you run the program multiple times, it is very likely that the order in which the parent output
and the child output are printed will vary across multiple runs of this program. This is because
the order in which statements in the child process and the statements in the parent process get
executed depends on when the OS schedules these two processes and thus can vary over
multiple runs of the program. There may also be runs in which the child process output is not
printed. We will study the reason for that later in the course.
4_fork_example.c @cs344
So why does the value of the variable differ in the parent process vs. the child process? This is
because the child process runs in its own memory space. At the time fork is executed this
memory space is a duplicate of the parent process’s memory space. This means that both the
processes have the same variables and these variables have the same value. But as the
example demonstrates, the memory writes done by one process do not affect the memory of the
other process. Hence, when the child updates the value of a variable, it is done in its own
memory space and does not impact the memory space of the parent process. Similarly, when
the parent process updates the value of a variable, it is done in the parent process's memory
space and does not impact the memory space of the child process.
Tree of Processes
When a process calls fork() to spawn a child, the child process may in turn call fork() to
create its own child, and so on. Thus, over time the parent-child relationship between different
processes leads to a tree structure.
process with PID 648 is created. Later, the process with PID 538 again calls fork and another child
process with PID 932 is created. The process with PID 932 then calls fork and its child process with
PID 1008 is created.
Failure of fork()
If fork fails, it returns the value -1. The most likely reasons are either that too many
processes are already running in the system or that creating the new process would exceed a
resource limit.
Terminating Processes
A process can terminate for two reasons:
Normal Termination
The process completes its execution and exits on its own. This is considered normal termination.
There are two cases to consider:
Case 1:
The process completed what it was supposed to do. This happens, e.g., when a process
executes a return statement from its main function. Note that when the main function
executes a return statement, internally this calls the exit() function with the value specified in
the return statement. E.g., when main executes return 0; this results in the function call
exit(0) .
Case 2:
The process encountered an error condition, recognized it, and exited by calling the exit
function. This is still considered normal termination because the process is terminating itself,
rather than the process being terminated by some other process.
#include <stdlib.h>
void exit(int status);
By convention, when the process successfully completes what it was supposed to do, i.e., Case
1 above, the status value should be 0 and for Case 2, the status value should be non-zero. The
value of the exit status is transmitted to the parent process.
It is possible to register functions, termed exit handlers, by calling the function atexit() . The
exit function calls all these exit handler functions. It flushes all the standard IO streams. It then
calls the _exit() function. The _exit() function closes all the files and cleans up the process.
Abnormal Termination
The termination of a process is considered abnormal when the process receives a signal which
causes it to terminate. We will study signals in the next module.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration:
6/8/24, 1:59 PM Exploration: Process API - Monitoring Child Processes: OPERATING SYSTEMS I (CS_374_400_S2024)
wait
#include <sys/wait.h>
pid_t wait(int *wstatus);
The wait system call blocks the calling process until any one of its child processes terminates.
It returns the process ID of the terminated child. If a child process has already terminated, wait
will return immediately. The termination status of that child is put in the memory location
pointed to by wstatus .
Example
In the following example, the parent process blocks when it calls wait and resumes its
execution only after the child process terminates:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(){
    pid_t spawnpid = -5;
    int childStatus;
    int childPid;
    // If fork is successful, the value of spawnpid will be 0 in the child,
    // the child's pid in the parent
    spawnpid = fork();
    switch (spawnpid){
    case -1:
        perror("fork() failed!");
        exit(1);
        break;
    case 0:
        // spawnpid is 0 in the child
        printf("I am the child. My pid = %d\n", getpid());
        break;
    default:
        // spawnpid is the pid of the child
        printf("I am the parent. My pid = %d\n", getpid());
        childPid = wait(&childStatus);
        printf("Parent's waiting is done as the child with pid %d exited\n", childPid);
        break;
    }
    printf("The process with pid %d is returning from main\n", getpid());
    return 0;
}
waitpid
The wait system call has a few limitations. If the parent process has created several children,
wait does not support waiting for the completion of a specific child; it waits for whichever child
terminates next. The wait system call also cannot do a non-blocking wait, i.e., if no child has
terminated, the parent process will block when it calls wait . The parent process cannot just
check the status of a child and continue to do something else if the child has not terminated.
The waitpid system call addresses some of these limitations of wait .
#include <sys/wait.h>
pid_t waitpid(pid_t pid, int *wstatus, int options);
The value of the argument pid determines which child process or child processes the waitpid
system call will wait for. The possible values include (among others) the following:
If the value of pid is greater than 0, then waitpid will wait for the child whose process ID
equals pid .
If the value of pid is -1, then waitpid will wait for any child process, similar to the wait
system call.
Example
In the following example, the parent process creates two child processes and then blocks until a
specific child (the second child) terminates. We have added a call to the function sleep()
(https://man7.org/linux/man-pages/man3/sleep.3.html) in the child processes which will cause
each of these processes to sleep for 10 seconds. This makes sure that the child processes have
not terminated before the parent process calls waitpid and we can verify that the parent process
is blocked until the second child has terminated.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(){
    int childStatus;
    printf("Parent process's pid = %d\n", getpid());
    pid_t firstChild = fork();
    if(firstChild == -1){
        perror("fork() failed!");
        exit(1);
    } else if(firstChild == 0){
        // The first child process executes this
        printf("First child's pid = %d\n", getpid());
        sleep(10);
    } else{
        // Parent process executes this
        // Fork another child
        pid_t secondChild = fork();
        if(secondChild == -1){
            perror("fork() failed!");
            exit(1);
        } else if(secondChild == 0){
            // The second child process executes this
            printf("Second child's pid = %d\n", getpid());
            sleep(10);
        } else{
            // Parent process executes this to
            // wait for the second child
            pid_t childPid = waitpid(secondChild, &childStatus, 0);
            printf("The parent is done waiting. The pid of child that terminated is %d\n", childPid);
        }
    }
    printf("The process with pid %d is returning from main\n", getpid());
    return 0;
}
Example
In the following example, the parent process creates a child process and then calls waitpid
with the option WNOHANG . However, the child process sleeps for 10 seconds. This means that
when the parent calls waitpid the child has not terminated. But because of the use of WNOHANG
the parent process is not blocked and it continues its execution.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
    printf("Parent process's pid = %d\n", getpid());
    int childStatus;
    pid_t childPid = fork();
    if(childPid == -1){
        perror("fork() failed!");
        exit(1);
    } else if(childPid == 0){
        // Child process executes this branch
        sleep(10);
    } else{
        // The parent process executes this branch
        printf("Child's pid = %d\n", childPid);
        // WNOHANG specified. If the child hasn't terminated, waitpid will
        // immediately return with value 0
        childPid = waitpid(childPid, &childStatus, WNOHANG);
        printf("In the parent process waitpid returned value %d\n", childPid);
    }
    printf("The process with pid %d is returning from main\n", getpid());
    return 0;
}
Did the child terminate normally, by calling exit() ? Or did the child terminate abnormally?
In case of normal termination, we can find the value of the exit status
In case of abnormal termination, we can find the signal that caused the child to be
terminated.
However, the value of wstatus does not directly provide this information. To get this information,
we must decode wstatus by calling the following macros with the value placed in wstatus
variable:
Macro: Description
WIFEXITED(wstatus): This macro returns true if the child terminated normally. Note that
exactly one of WIFEXITED and WIFSIGNALED will return true.
WEXITSTATUS(wstatus): If WIFEXITED returned true, WEXITSTATUS will return the status value the
child passed to exit() . Note that if WIFSIGNALED returned true, WEXITSTATUS may return a
garbage value.
WIFSIGNALED(wstatus): This macro returns true if the child terminated abnormally. Note that
exactly one of WIFEXITED and WIFSIGNALED will return true.
WTERMSIG(wstatus): If WIFSIGNALED returned true, WTERMSIG will return the signal number that
caused the child to terminate. Note that if WIFEXITED returned true, WTERMSIG may return a
garbage value.
Example
In the following example, the parent process calls waitpid to wait for its child process to terminate.
Once the waitpid call returns, the parent process uses the WIFEXITED macro in the condition of an
if statement to check whether the child terminated normally. Since the child process terminated
normally, the condition evaluates to true. The parent process then gets the exit status of
the child by using the macro WEXITSTATUS .
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
    printf("Parent process's pid = %d\n", getpid());
    int childStatus;
    pid_t childPid = fork();
    if(childPid == -1){
        perror("fork() failed!");
        exit(1);
    } else if(childPid == 0){
        // Child process
        sleep(10);
    } else{
        printf("Child's pid = %d\n", childPid);
        childPid = waitpid(childPid, &childStatus, 0);
        printf("waitpid returned value %d\n", childPid);
        if(WIFEXITED(childStatus)){
            printf("Child %d exited normally with status %d\n", childPid, WEXITSTATUS(childStatus));
        } else{
            printf("Child %d exited abnormally due to signal %d\n", childPid, WTERMSIG(childStatus));
        }
    }
    return 0;
}
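The example above only exercises the normal-termination branch. As a complementary sketch, the helpers below create a child that ends in a chosen way and then decode its status with the four macros. The names spawn_exiting_child, spawn_signaled_child, wait_and_decode and DECODE_ERROR are hypothetical, introduced here only for illustration; the convention of returning a negated signal number is likewise an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DECODE_ERROR (-1000)     /* hypothetical sentinel for "could not decode" */

/* Fork a child that terminates normally with the given exit status. */
pid_t spawn_exiting_child(int code) {
    pid_t pid = fork();
    if (pid == 0) {
        _exit(code);
    }
    return pid;                  /* -1 if fork() failed */
}

/* Fork a child that terminates abnormally by sending itself a signal. */
pid_t spawn_signaled_child(int signo) {
    pid_t pid = fork();
    if (pid == 0) {
        raise(signo);
        _exit(0);                /* reached only if the signal did not kill the child */
    }
    return pid;                  /* -1 if fork() failed */
}

/*
 * Wait for the child and decode its status:
 *   normal termination   -> the exit status (0..255)
 *   abnormal termination -> the negated signal number
 */
int wait_and_decode(pid_t pid) {
    int wstatus;
    if (pid == -1 || waitpid(pid, &wstatus, 0) == -1) {
        return DECODE_ERROR;
    }
    if (WIFEXITED(wstatus)) {
        return WEXITSTATUS(wstatus);
    }
    if (WIFSIGNALED(wstatus)) {
        return -WTERMSIG(wstatus);
    }
    return DECODE_ERROR;
}
```

For instance, wait_and_decode(spawn_exiting_child(7)) yields 7, while wait_and_decode(spawn_signaled_child(SIGKILL)) yields -SIGKILL.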
Zombie Processes
If the parent of a process does not wait for a child to terminate, then the child becomes a
“zombie” process on termination. Most of the resources allocated to a zombie process are
recycled, e.g., its memory, open file descriptors, etc. However, an entry for a zombie process is
retained in the process table. If the parent process of a zombie later uses waitpid to check the
termination status of this child, then the zombie process is removed from the process table.
If a parent process terminates without cleaning up its zombie child processes, then the zombies
become children of the init process. The init process periodically waits for all its terminated
children, so these zombies are eventually cleaned up by the init process.
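A long-running parent can avoid accumulating zombies by periodically reaping whichever children have already terminated. The following is a minimal sketch of that idea, assuming the parent is free to collect any of its children; the function names spawn_short_lived_child and reap_terminated_children are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits right away; until it is reaped it is a zombie. */
pid_t spawn_short_lived_child(void) {
    pid_t pid = fork();
    if (pid == 0) {
        _exit(0);
    }
    return pid;                  /* -1 if fork() failed */
}

/*
 * Reap every child that has already terminated, without blocking.
 * waitpid(-1, ..., WNOHANG) returns the pid of a reaped child, 0 if
 * children exist but none has terminated yet, and -1 if there are no
 * children at all. Returns the number of children reaped.
 */
int reap_terminated_children(void) {
    int count = 0;
    int wstatus;
    while (waitpid(-1, &wstatus, WNOHANG) > 0) {
        count++;
    }
    return count;
}
```

A parent could call reap_terminated_children() at convenient points in its main loop, so that terminated children do not linger in the process table.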
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
The Linux man pages give a good overview of wait and waitpid
(https://man7.org/linux/man-pages/man2/wait.2.html) .
Chapter 26 of the book The Linux Programming Interface discusses process monitoring.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 1:59 PM Exploration: Process API - Executing a New Program: OPERATING SYSTEMS I (CS_374_400_S2024)
Note: There is no function with the specific name exec . Instead, the name exec is used to
refer to the family of exec functions.
#include <unistd.h>
int execl(const char *pathname, const char *arg, ... /* (char *) NULL */);
int execlp(const char *filename, const char *arg, ... /* (char *) NULL */);
int execle(const char *pathname, const char *arg, ... /*, (char *) NULL, char * const envp[] */);
int execv(const char *pathname, char *const argv[]);
int execvp(const char *filename, char *const argv[]);
int execve(const char *pathname, char *const argv[], char *const envp[]);
execv
We first look at execv, before looking at the other functions and the differences between them.
The function execv executes the file provided as the argument pathname . The file corresponding
to pathname must be either a binary executable, or an executable script. The value of pathname
can be absolute (e.g., /bin/myfile ) or relative (e.g., mydir/myfile ).
The array argv is an array of strings that is passed as the argument list to the new program. By
convention, the first element of this array, i.e., argv[0] , is the name of the file being executed,
e.g., the same string as pathname . The last element of argv must be a null pointer.
Example
Suppose we want to run the command ls -al in a program without creating a new process. In
the following example, we use execv to do that. In the call to execv , we pass "/bin/ls" as the
pathname . We pass the following array as argv :
{ "/bin/ls", "-al", NULL }
The first element of the array is again the path to the program that should replace the existing
program. The second element is "-al" , which is the argument we want to pass to the ls
command. The last element is NULL.
When execv is called, the program in the original process is replaced by the executable file
/bin/ls , and the program /bin/ls is passed the argument -al .
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(){
    // Run /bin/ls and pass it the argument -al
    char *newargv[] = { "/bin/ls", "-al", NULL };
    execv(newargv[0], newargv);
    /* exec returns only on error */
    perror("execv");
    exit(EXIT_FAILURE);
}
Suppose the following program is compiled into an executable file called my_shell :
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(){
    // Run /bin/ls and pass it the argument -al
    char *newargv[] = { "/bin/ls", "-al", NULL };
    execv(newargv[0], newargv);
    /* exec returns only on error */
    // This and the following statement will only be executed if execv returns an error
    perror("execv");
    exit(EXIT_FAILURE);
}
Let's suppose a process with PID 87 is running my_shell . This means the code segment of
process 87 is loaded with the executable file my_shell . When my_shell executes the statement
that calls execv() , one of two things happens:
If execv() succeeds, then the code segment of the process 87 is updated so that the
executable my_shell is replaced in memory by the executable for the program /bin/ls and
the process 87 executes the instructions of /bin/ls until that program exits and the process
terminates.
If execv() fails, the code segment of process 87 is still loaded with the executable for
my_shell . In this case, execv() returns a value back to my_shell . If this happens, then the
statements following the call to execv() in the my_shell code are executed. These statements
print the error message, and then the last statement calls the exit() function, at which point
the process terminates.
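To see the failure case in isolation, we can ask execv to run a file that does not exist. The sketch below is only an illustration: the helper name exec_or_errno is hypothetical, and the path used in the usage note is assumed not to exist on the machine.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to replace the current program with the file named by argv[0].
 * If execv() succeeds, this function never returns, because the calling
 * program is no longer loaded. If execv() fails, the calling program
 * keeps running and we return the errno value that execv() set.
 */
int exec_or_errno(char *const argv[]) {
    execv(argv[0], argv);
    /* Reached only when execv() failed */
    int saved_errno = errno;     /* save it: perror() may change errno */
    perror("execv");
    return saved_errno;
}
```

Calling exec_or_errno with the array { "/nonexistent/program", NULL } returns ENOENT ("No such file or directory"), and the statements after the call continue to execute in the original program.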
The parent process corresponds to the shell and displays a prompt to the user.
When the user enters a command, the parent process uses fork to create a child process.
The parent process then waits for the child to terminate.
This child process, which is a copy of the parent and thus is loaded with the shell program,
uses execv to replace the shell program with the program corresponding to the command
entered by the user.
When the command ends, the child process terminates.
When the child process terminates, the parent now loops back and again shows the prompt to
the user for entering the next command.
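The steps above can be sketched as a small loop. This is a simplified illustration rather than a real shell: the function name shell_loop, the ": " prompt, the 16-argument limit, and splitting the command on whitespace are all assumptions of the sketch, and commands must be given as pathnames because execv does not search PATH.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 16              /* hypothetical limit for this sketch */

/*
 * Read commands from `in`, one per line. For each command, fork a child
 * that uses execv to run it, and wait for that child before prompting
 * again. Returns the number of commands that were started.
 */
int shell_loop(FILE *in) {
    char line[256];
    int commands_run = 0;

    for (;;) {
        printf(": ");            /* show the prompt */
        fflush(stdout);
        if (fgets(line, sizeof line, in) == NULL) {
            break;               /* end of input */
        }

        /* Split the line into whitespace-separated words */
        char *argv[MAX_ARGS + 1];
        int argc = 0;
        char *word = strtok(line, " \t\n");
        while (word != NULL && argc < MAX_ARGS) {
            argv[argc++] = word;
            word = strtok(NULL, " \t\n");
        }
        argv[argc] = NULL;
        if (argc == 0) {
            continue;            /* blank line: just prompt again */
        }

        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            break;
        }
        if (pid == 0) {
            /* Child: replace the shell program with the command */
            execv(argv[0], argv);
            perror("execv");     /* reached only if execv failed */
            _exit(127);
        }

        /* Parent: wait for the child, then loop back to the prompt */
        int status;
        waitpid(pid, &status, 0);
        commands_run++;
    }
    return commands_run;
}
```

Calling shell_loop(stdin) from main gives an interactive loop that ends when the input reaches end-of-file.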
Example
In the following program, the parent process forks a child process and then waits for the
termination of this child. The child process uses execv to run the ls -al command. When the
child process terminates, the parent process resumes its execution.
/*
The following program forks a child process. The child process then replaces
the program using execv to run "/bin/ls". The parent process waits for the
child process to terminate.
*/
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
    char *newargv[] = { "/bin/ls", "-al", NULL };
    int childStatus;

    // Fork a new process
    pid_t spawnPid = fork();

    switch(spawnPid){
    case -1:
        perror("fork()");
        exit(1);
        break;
    case 0:
        // In the child process
        printf("CHILD(%d) running ls command\n", getpid());
        // Replace the current program with "/bin/ls"
        execv(newargv[0], newargv);
        // exec only returns if there is an error
        perror("execv");
        exit(2);
        break;
    default:
        // In the parent process
        // Wait for child's termination
        spawnPid = waitpid(spawnPid, &childStatus, 0);
        printf("PARENT(%d): child(%d) terminated. Exiting\n", getpid(), spawnPid);
        exit(0);
        break;
    }
}
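The differences between the functions are summarized in the following table:
Function  Program specified by    Arguments passed as  Environment for new program
execl     pathname                list                 caller's environment
execlp    filename (PATH search)  list                 caller's environment
execle    pathname                list                 envp argument
execv     pathname                array                caller's environment
execvp    filename (PATH search)  array                caller's environment
execve    pathname                array                envp argument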
The first difference is that the functions whose name ends with a p , i.e., execvp and execlp ,
take a filename as an argument and search the PATH environment variable for an executable with
this filename. As opposed to this, the other functions, i.e., execl, execv, execle and execve , are
provided with the pathname of the executable.
The second difference between the functions is how arguments for the new program are passed
to the function. This is indicated by the letter l for list or v for vector immediately after exec
in the names of the functions:
The l in the name indicates a list of arguments. execl functions require that each
argument that we want to pass to the new program should be provided as a separate
argument to the execl function. This list of arguments must be terminated by an argument
with NULL value. The functions execl, execlp and execle thus require a list of arguments.
The v in the name indicates a vector , i.e., an array, of arguments. execv functions require
that the arguments to be passed to the new program should be provided as an array of
strings terminated with a NULL element. The functions execv, execvp and execve thus
require an array of arguments.
The third difference between the functions is that the functions whose name ends in e , i.e.,
execle and execve , allow passing in a new list of environment variables to the new program. This
is provided via the last argument envp which is an array of strings which by convention have the
form key=value . Just like argv , the last element of envp must be a null pointer. The other four
functions, whose name does not end in e, run the new program with a copy of the existing
environment. (Environments are discussed in the next exploration).
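As an illustration of the envp argument, the following sketch runs a program in a child process with an environment that contains only the strings we supply. The helper name run_with_env is hypothetical, and the use of /bin/sh in the usage note is just a convenient way to observe the variable.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program argv[0] in a child process whose environment consists
 * of exactly the key=value strings in envp (a NULL-terminated array).
 * Returns the child's exit status, or -1 if something went wrong.
 */
int run_with_env(char *const argv[], char *const envp[]) {
    int status;
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        /* Child: the new program sees envp, not a copy of our environment */
        execve(argv[0], argv, envp);
        perror("execve");        /* reached only if execve failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with argv set to { "/bin/sh", "-c", "test \"$MYVAR\" = foo", NULL } and envp set to { "MYVAR=foo", NULL }, the shell finds MYVAR in its environment and the function returns 0. With envp set to { "MYVAR=bar", NULL } the comparison fails and the function returns 1.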
Examples
execl
int execl(const char *pathname, const char *arg, ... /* (char *) NULL */);
The execl function takes the new program to run as the first argument. This is followed by the list
of command line arguments to be passed to the new program. This list has the new program to
run as the first argument and the last argument in the list is NULL.
The following example uses execl to run the ls program passing it the argument -al :
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(){
    // Run /bin/ls and pass it the argument -al
    execl("/bin/ls", "/bin/ls", "-al", NULL);
    /* exec returns only on error */
    perror("execl");
    exit(EXIT_FAILURE);
}
The following example also uses execl to run ls -al but forks a child process to run this
command while the parent process waits for the termination of the child process:
/*
The following program forks a child process. The child process then replaces
the program using execl to run "/bin/ls". The parent process waits for the
child process to terminate.
*/
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
    int childStatus;

    // Fork a new process
    pid_t spawnPid = fork();

    switch(spawnPid){
    case -1:
        perror("fork()");
        exit(1);
        break;
    case 0:
        // In the child process
        printf("CHILD(%d) running ls command\n", getpid());
        // Replace the current program with "/bin/ls"
        execl("/bin/ls", "/bin/ls", "-al", NULL);
        // exec only returns if there is an error
        perror("execl");
        exit(2);
        break;
    default:
        // In the parent process
        // Wait for child's termination
        spawnPid = waitpid(spawnPid, &childStatus, 0);
        printf("PARENT(%d): child(%d) terminated. Exiting\n", getpid(), spawnPid);
        exit(0);
        break;
    }
}
execlp
int execlp(const char *filename, const char *arg, ... /* (char *) NULL */);
The execlp function differs from execl function in that the first argument is the name of the file
and execlp will search the PATH environment variable for the executable with this file name.
The following example uses execlp to run the ls program passing it the argument -al :
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(){
    // Run ls and pass it the argument -al
    execlp("ls", "ls", "-al", NULL);
    /* exec returns only on error */
    perror("execlp");
    exit(EXIT_FAILURE);
}
Note that the first argument to execlp is simply "ls" instead of "/bin/ls" which in the example
for execl was the first argument. This is because execlp finds ls by searching the path and we
do not need to specify the complete path as we had to do for execl .
execv
In all the previous exec examples, the new program we ran is the ls program. Now we look at
an example where a program uses exec to run an arbitrary program that is provided to it as an
argument.
Note: to view all the files in a repl.it, click the "Files" icon on the left-side of the repl.it.
In the following example, adapted from the Linux man pages
(https://man7.org/linux/man-pages/man2/execve.2.html) , the program execv_example.c uses execv to run a new program that
is provided to it as an argument. It passes the array newargv as arguments for the new program
to run.
If you hit the “run” button, the commands in the file .replit will be executed. The .replit file
syntax is specific to the Repl.it (https://repl.it/) website, but here is an explanation of what the
commands in this file are doing:
When execv_example runs, it uses exec to replace itself with the argument passed to it, i.e.,
the program ./myecho .
This new program is passed the arguments in the array newargv .
When any program is executed, its main() function gets called
In this case, the main function of myecho will get called.
As we see from the code in myecho.c , this main function simply prints out the arguments
passed to it.
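/* myecho.c */
#include <stdio.h>
#include <stdlib.h>
/*
* Just print out the arguments passed in argv
*/
int main(int argc, char *argv[]){
    int j;
    for (j = 0; j < argc; j++)
        printf("argv[%d]: %s\n", j, argv[j]);
    exit(EXIT_SUCCESS);
}

/* execv_example.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]){
    // Arguments for the new program. The values "hello" and "world" are an
    // assumption, following the man page example this program is adapted from
    char *newargv[] = { NULL, "hello", "world", NULL };
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <file-to-exec>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    // Put the name of the program being executed as element 0 of the arguments
    // passed to the program to be run by execv
    newargv[0] = argv[1];
    execv(argv[1], newargv);
    /* execv returns only on error */
    perror("execv");
    exit(EXIT_FAILURE);
}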
Exercise
Using execv_example.c and myecho.c as models, write the following 2 programs:
hello_world_driver.c that
Prompts the user for their name
Uses execv to call hello_world.c and passes it the name entered by the user
hello_world.c that
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
Chapter 27 of the book The Linux Programming Interface discusses the exec family of
functions.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
Exploration: Environment
Introduction
Each process has an associated array of strings which is called the process’s
environment. In this exploration, we study the process environment.
DISPLAY=:0
HOME=/Users/dmcgrath
LESS=-R
LOGNAME=dmcgrath
LSCOLORS=Gxfxcxdxbxegedabagacad
OLDPWD=/Users/dmcgrath/Documents/OSU Documents
PAGER=less
PATH=/opt/local/bin:/opt/local/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Applicati
ons/VMware Fusion.app/Contents/Public:/opt/X11/bin:/Library/Apple/usr/bin:/Users/dmcgrath/.
antigen/bundles/robbyrussell/oh-my-zsh/lib:/Users/dmcgrath/.antigen/bundles/robbyrussell/oh
-my-zsh/plugins/sudo:/Users/dmcgrath/.antigen/bundles/robbyrussell/oh-my-zsh/plugins/zsh-in
teractive-cd:/Users/dmcgrath/.antigen/bundles/zdharma/fast-syntax-highlighting:/Users/dmcgr
ath/.antigen/bundles/zsh-users/zsh-autosuggestions:/Users/dmcgrath/.antigen/bundles/greymd/
docker-zsh-completion:/Users/dmcgrath/.antigen/bundles/srijanshetty/zsh-pip-completion:/Use
rs/dmcgrath/.antigen/bundles/robbyrussell/oh-my-zsh/plugins/autojump:/Users/dmcgrath/.antig
en/bundles/robbyrussell/oh-my-zsh/plugins/git:/Users/dmcgrath/.antigen/bundles/robbyrussel
l/oh-my-zsh/plugins/docker:/Users/dmcgrath/.antigen/bundles/robbyrussell/oh-my-zsh/plugins/
tmuxinator:/Users/dmcgrath/.antigen/bundles/robbyrussell/oh-my-zsh/plugins/mosh:/Users/dmcg
rath/.antigen/bundles/robbyrussell/oh-my-zsh/plugins/extract:/Users/dmcgrath/.antigen/bundl
es/robbyrussell/oh-my-zsh/plugins/zsh_reload:/Users/dmcgrath/.antigen/bundles/psprint/zsnap
shot:/Users/dmcgrath/.antigen/bundles/ael-code/zsh-colored-man-pages:/Users/dmcgrath/.antig
en/bundles/unixorn/autoupdate-antigen.zshplugin:/Users/dmcgrath/.antigen/bundles/psprint/zs
napshot:/Users/dmcgrath/bin:/opt/010editor/
PWD=/Users/dmcgrath/Documents/OSU Documents/cs344
SHELL=/bin/zsh
SHLVL=1
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.63nuZ4c4q9/Listeners
TERM=screen-256color
TMPDIR=/var/folders/pj/9z5x9d812dl9p649xhdmyz_h0000gn/T/
TMUX=/tmp//tmux-501/default,1984,20
TMUX_PANE=%30
TMUX_PLUGIN_MANAGER_PATH=/Users/dmcgrath/.tmux/plugins/
USER=dmcgrath
ZSH=/Users/dmcgrath/.oh-my-zsh
LANG=en_US.UTF-8
_=/usr/bin/printenv
Here are some interesting environment variables shown in the above output:
PATH
The PATH variable contains a list of directories with the entries separated by a colon. When a
user enters a command on the shell, the shell searches these directories, in their order of
appearance in this variable, to find a program corresponding to this command. A PATH variable
entry with just a . means that the current directory will be searched for the command that the
user enters on the shell.
HOME
The HOME variable contains the pathname of the user’s login directory.
PWD
The PWD variable contains the pathname of the current working directory.
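The PATH search described above can be sketched in C. This is a simplified illustration of the idea, not how any particular shell implements it: the function name find_in_path is hypothetical, empty PATH entries are skipped, and any entry for which access() reports execute permission is accepted.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Look for an executable called cmd in the directories listed in PATH,
 * in their order of appearance. On success, writes the full pathname
 * into out and returns 0. Returns -1 if no directory contains cmd.
 */
int find_in_path(const char *cmd, char *out, size_t outsize) {
    const char *path = getenv("PATH");
    if (path == NULL) {
        return -1;
    }

    /* strtok modifies its input, so work on a private copy of PATH */
    char *copy = malloc(strlen(path) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, path);

    int result = -1;
    for (char *dir = strtok(copy, ":"); dir != NULL; dir = strtok(NULL, ":")) {
        int n = snprintf(out, outsize, "%s/%s", dir, cmd);
        if (n < 0 || (size_t)n >= outsize) {
            continue;            /* candidate pathname does not fit in out */
        }
        if (access(out, X_OK) == 0) {
            result = 0;          /* first match wins */
            break;
        }
    }
    free(copy);
    return result;
}
```

With PATH set to /nonexistent-dir:/bin:/usr/bin, looking up sh skips the first directory and stops at /bin/sh. When the function returns -1, the contents of out should not be used.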
Examples
1. We can set the value of a new or existing variable by using the export command. E.g., we
set the value of MYVAR to foo as follows:
export MYVAR=foo
2. We can remove a variable by using the unset command. E.g., we remove MYVAR as follows:
unset MYVAR
3. We can access the value of an environment variable by using bash's echo command and
adding a $ before the name of the variable. E.g., to print the current value of the PATH variable,
we can use the following command:
echo $PATH
4. We can append to the current value of a variable by writing $ followed by the name of the
variable to get the current value, following this with the text we want to append, and then
using export to update the value. E.g., we can append . to the end of
the PATH variable as follows:
export PATH=$PATH:.
getenv
#include <stdlib.h>
char *getenv(const char *name);
The function getenv looks for the environment variable name . If this variable is defined, the
function returns a pointer to its value. If this variable is not in the environment list, then NULL is
returned.
setenv
#include <stdlib.h>
int setenv(const char *name, const char *value, int overwrite);
The function setenv adds the variable name to the environment list with the value value . If the
environment variable name already exists, the behavior is determined by the argument
overwrite .
If overwrite is 0 and the variable name already exists, then the value is not changed.
If overwrite is non-zero and the variable name already exists, then the value is
changed to value .
The function setenv creates copies of the strings name and value when adding them to the
environment.
unsetenv
#include <stdlib.h>
int unsetenv(const char *name);
The function unsetenv deletes the variable name from the environment list. If no variable name
exists in the environment, the function leaves the environment unchanged and still succeeds.
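The behavior of the three functions, and in particular the effect of the overwrite argument, can be checked with a short sketch. The helper names env_equals and overwrite_demo are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if the environment variable `name` currently has the value
 * `expected`. Pass expected == NULL to check that the variable is not set.
 */
int env_equals(const char *name, const char *expected) {
    const char *actual = getenv(name);
    if (actual == NULL || expected == NULL) {
        return actual == NULL && expected == NULL;
    }
    return strcmp(actual, expected) == 0;
}

/*
 * Walk through the cases described above for the variable `name`.
 * Returns 1 if every step behaved as described, 0 otherwise.
 */
int overwrite_demo(const char *name) {
    int ok = 1;

    unsetenv(name);                      /* start without the variable */
    ok = ok && env_equals(name, NULL);

    setenv(name, "foo", 0);              /* not present yet: it is added */
    ok = ok && env_equals(name, "foo");

    setenv(name, "bar", 0);              /* overwrite == 0: old value kept */
    ok = ok && env_equals(name, "foo");

    setenv(name, "bar", 1);              /* overwrite != 0: value replaced */
    ok = ok && env_equals(name, "bar");

    unsetenv(name);                      /* variable removed */
    ok = ok && env_equals(name, NULL);

    ok = ok && unsetenv(name) == 0;      /* removing it again still succeeds */
    return ok;
}
```

Calling overwrite_demo("MYVAR") returns 1 and leaves MYVAR unset.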
Example
The following program shows an example of using setenv and getenv functions to add and
read environment variables, and highlights the fact that changing the environment variable in a
child process does not change that variable in its parent process.
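#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
    int childStatus;
    char* varName = "MYVAR";
    // We set the value of MYVAR to foo in the parent process
    setenv(varName, "foo", 1);
    pid_t spawnPid = fork(); // the rest is a minimal completion of the truncated listing
    if(spawnPid == 0){ // child: changes MYVAR, but only in its own copy of the environment
        setenv(varName, "bar", 1);
        printf("CHILD: %s = %s\n", varName, getenv(varName));
        exit(0);
    }
    // The parent waits for the child; its own value of MYVAR is still foo
    waitpid(spawnPid, &childStatus, 0);
    printf("PARENT: %s = %s\n", varName, getenv(varName));
}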
Additional Resources
Chapter 7 of the book The Linux Programming Interface discusses process environment
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 1:59 PM Exploration: Shell Commands Related to Processes: OPERATING SYSTEMS I (CS_374_400_S2024)
Basic ps
The basic ps command, without any options or arguments, displays the processes that are
running from the current terminal for the user who is logged in. The information displayed
includes the process ID in the PID column, and the command corresponding to the process in
the CMD column, as shown in the following example:
$ ps
PID TTY TIME CMD
19766 pts/18 00:00:00 bash
20096 pts/18 00:00:00 ps
We will not go into understanding the details of this command, but here are a few high-level
comments about the command:
The ; character separates commands. When the first command ends, the second will be run.
The | character is the pipe operator. It sends the output of the command before the pipe to
the input of the command after the pipe. Or in other words, it redirects the standard output of
the command before the pipe to the standard input of the command after the pipe (we will
discuss Unix pipes in greater detail in a later module).
The augmented command is thus composed of the following two commands
ps -o ppid,pid,pgid,sid,euser,stat,%cpu,rss,args | head -n 1
ps -eH -o ppid,pid,pgid,sid,euser,stat,%cpu,rss,args | grep chaudhrn
The first command
Runs ps and
The output is piped to the head -n 1 command which takes only the first row of ps
output, which is displayed as the header of augmented command
The second command
Runs ps with flags that show the parent-child hierarchy and other info and
The output is piped to the grep command to only show those rows that contain the
pattern chaudhrn
For reference, the column headers in the output of the augmented ps command are described
below:
The following video shows a detailed example of running this command and how to interpret the
output:
Additional characters in the state column give more detail about the process. We list the
important values from the man pages of ps
The following video shows an example of running our augmented ps command and interpreting
the state column of the output of the command.
In the video, we used the following program to create the zombie process. In this program, the
child process exits immediately. However, the parent process sleeps for 10 seconds before
waiting for the child. During these 10 seconds, the child process is in the Zombie state because
it has been terminated but its parent has not yet reaped it.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
    pid_t spawnpid = -5;
    int childExitStatus;
    spawnpid = fork();
    switch (spawnpid){
    case -1:
        perror("fork() failed!");
        exit(1);
        break;
    case 0:
        printf("CHILD termination\n");
        break;
    default:
        printf("PARENT: making child a zombie for 10 seconds");
        fflush(stdout);
        sleep(10);
        spawnpid = waitpid(spawnpid, &childExitStatus, 0);
        break;
    }
    return 0;
}
Job Control
A job is a group of processes that share the same process group ID. Job Control is a feature of
Unix shells that allows us to start multiple jobs from a single terminal and control the access of
these jobs to the terminal. If we use the pipe operator to create a pipeline of processes, then
each of these processes has the same process group ID and hence belongs to the same job.
Example
will create a job with two processes, one running the ps command and the other running the
grep command. However, if we run a command such as vi to edit a file, then the job consists
of a single process.
When we are at the command prompt, the foreground job consists of one process – the shell
itself. When the user enters a command at a shell that is intended to run in the foreground, i.e.,
a normal command, a process is started to run this command. This process runs to completion
before the user is prompted again. As opposed to this, when a user enters a command that is
intended to be run in the background, the user is immediately prompted again after the process
(or processes) in the command is started. In other words, a background job does not interrupt
the input to the terminal.
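From the point of view of a program that launches jobs, the difference between the two cases is whether the launcher waits for the job before continuing. The following is a rough sketch under simplifying assumptions: the function name start_job is hypothetical, each job consists of a single process placed in its own process group, and the terminal handling a real shell performs for foreground jobs is left out.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start the program argv[0] as a new job in its own process group.
 *   background == 0: wait for the job to finish, like a normal command.
 *   background != 0: return immediately, like a background command.
 * Returns the pid of the child (which is also the process group ID of
 * the job), or -1 if fork() failed.
 */
pid_t start_job(char *const argv[], int background) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        setpgid(0, 0);           /* child becomes leader of a new process group */
        execv(argv[0], argv);
        perror("execv");         /* reached only if execv failed */
        _exit(127);
    }

    /* Also set the group from the parent, so that it exists no matter
       which of the two processes runs first. This call can fail if the
       child has already called execv; that is harmless here. */
    setpgid(pid, pid);

    if (!background) {
        int status;
        waitpid(pid, &status, 0);    /* foreground: block until the job ends */
    }
    return pid;
}
```

After start_job(argv, 1) returns, the caller is free to do other work and can later collect the job with waitpid. After start_job(argv, 0) returns, the job has already been waited for.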
We start a job in the background by specifying an ampersand as the last character in the
command. E.g., the following command will run ping in the background:
Standard output and standard error of background jobs still go to the terminal, i.e., if we run
the ping command in the background, its output will keep printing to the terminal.
To see which jobs are running in a terminal, we can use the jobs command. This command
lists the jobs with a job number for each job. Using the -l option with the jobs command
additionally displays the process ID for the processes in the jobs.
The commands fg and bg allow us to manipulate jobs using the job number provided by the
jobs command. The fg command can bring a job from the background to the foreground. The
bg command can restart a specific background job that is currently stopped while keeping it in
the background.
Stopping a job
We can stop the foreground job by entering Control-z on the command prompt. This stops the
job and puts it into the background. Control-z sends a TSTP signal to the job. We can also send
this signal using the kill command. E.g., the following command will stop the job with job number
1.
$ kill -TSTP %1
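A program can stop and resume another process in a similar way by using the kill() system call. The sketch below is an illustration only, with hypothetical helper names spawn_idle_child and stop_and_resume. It sends SIGSTOP rather than SIGTSTP: both stop the process by default, but SIGSTOP cannot be caught or ignored, so the sketch behaves the same no matter how the target process or the terminal is set up.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that does nothing except wait for signals. */
pid_t spawn_idle_child(void) {
    pid_t pid = fork();
    if (pid == 0) {
        for (;;) {
            pause();
        }
    }
    return pid;                  /* -1 if fork() failed */
}

/*
 * Stop the child `pid`, confirm that it is stopped, and resume it.
 * Returns the number of the signal that stopped the child, or -1 on error.
 */
int stop_and_resume(pid_t pid) {
    int status;
    if (kill(pid, SIGSTOP) == -1) {
        return -1;
    }
    /* WUNTRACED makes waitpid also report children that have stopped */
    if (waitpid(pid, &status, WUNTRACED) == -1) {
        return -1;
    }
    if (!WIFSTOPPED(status)) {
        return -1;
    }
    /* SIGCONT lets the stopped process continue running */
    if (kill(pid, SIGCONT) == -1) {
        return -1;
    }
    return WSTOPSIG(status);
}
```

The child created by spawn_idle_child keeps running after it is resumed, so the caller should eventually terminate it, e.g., with kill(pid, SIGKILL), and then wait for it.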
Additional Resources
Job control is discussed in Chapter 34 of The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 1:59 PM Exploration: Signals – Concepts and Types: OPERATING SYSTEMS I (CS_374_400_S2024)
Basic Concepts
Signals are often termed software interrupts, as they interrupt the normal flow of control, i.e.,
the order in which individual statements are executed in the program of a process. Signals
cause the process to stop the current program and jump to execute the signal handler, which
is a function that is invoked whenever the signal corresponding to this handler is received by the
process. The OS provides default signal handlers for all signals. However, for most signals the
OS supports the capability to register custom signal handlers.
Example: Suppose we are running the ping command from a shell. If we enter Control-Z on
this shell, a signal called SIGTSTP is sent to the process running the ping command. The
process will stop executing the ping program and instead will start executing the signal handler
function corresponding to SIGTSTP. If no custom signal handler has been registered for
SIGTSTP, the default signal handler will be executed. The default signal handler for SIGTSTP
stops the program currently running, i.e., ping , and returns control to the shell.
There is a fixed set of signals. This means that we cannot create our own signals. However,
we can control the programmatic response to, as well as the meaning of, most signals. As we
will see there are two signals that have no inherent meaning at all. These are provided so that
we can assign meaning to these signals.
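As a small illustration of assigning our own meaning to a signal, the following sketch registers a custom handler for SIGUSR1, one of the two user-defined signals, and simply counts how many times it has been received. The names usr1_count, handle_usr1 and install_usr1_handler are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations */
#include <signal.h>
#include <string.h>

/* Updated by the handler. volatile sig_atomic_t is the type that can be
   safely written from inside a signal handler. */
volatile sig_atomic_t usr1_count = 0;

/* Custom signal handler: invoked each time SIGUSR1 is delivered */
static void handle_usr1(int signo) {
    (void)signo;                 /* unused */
    usr1_count++;
}

/* Register handle_usr1 for SIGUSR1. Returns 0 on success, -1 on error. */
int install_usr1_handler(void) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = handle_usr1;
    sigemptyset(&action.sa_mask);    /* block no extra signals in the handler */
    action.sa_flags = 0;
    return sigaction(SIGUSR1, &action, NULL);
}
```

After install_usr1_handler() succeeds, each SIGUSR1 delivered to the process, for example one the process sends itself with raise(SIGUSR1), interrupts the normal flow of control, runs handle_usr1, and then lets the program continue where it left off.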
The process has done something wrong, e.g., made an invalid memory reference.
The process had set a timer which has now expired.
A child process has completed execution.
An event associated with the terminal has occurred, e.g., the user entered Control-Z.
The process was communicating with another process which has died.
Common uses of signals from one user process to another may include:
A process wants the other process to change its communication method, e.g., switch to a
different port from the one currently being used for communication.
where
WXYZ is the name of the signal. If the name of the signal is omitted, the TERM signal will be
sent which terminates the process.
PID is the process ID of the process to which we want to send the signal. Note that if the
value of PID is 0 or negative, then it is interpreted differently and the details can be found in
the man pages (https://man7.org/linux/man-pages/man1/kill.1.html) .
We now look at some of the most common or useful signals. We have grouped the signals
based on common functionality. Information about the signals is summarized in tables with more
detailed description given after the table. Here is a description of the table headers:
Signal
Name of the signal.
#
The POSIX standard specifies the signal number for certain signals. However, keep in mind
that for most signals the signal number is implementation dependent.
Easy name
An easy name to describe the signal.
Catchable
A signal is catchable if we can register a custom signal handler for it. Most signals are
catchable. The exceptions are SIGKILL and SIGSTOP.
Default action if not caught
The default action that will be taken if we don’t register a custom signal handler for this signal.
Core Dump
When certain signals are received by a process, they cause the process to terminate
abnormally. An example is when a process makes an illegal memory reference that results in a
segmentation fault. When this happens, a file is created with a dump of the process's memory,
which contains the contents of all variables, the hardware registers and the kernel's process
information at the time the termination occurred. This file is called the core dump and can be
used after the fact to identify what went wrong. Depending on the configuration of a machine,
core dump files can be difficult to locate.
Signal | # | Easy name | Catchable | Default action if not caught | Core dump
SIGABRT | 6 | Abort | Yes | Terminate | Yes
SIGINT | 2 | Interrupt | Yes | Terminate | No
SIGKILL | 9 | Kill | No | Terminate, not catchable | No
SIGQUIT | 3 | Quit | Yes | Terminate | Yes
SIGTERM | 15 | Terminate | Yes | Terminate | No
SIGABRT
SIGABRT is sent to a process telling it to abort, i.e., terminate, itself. It is usually sent by a
process itself when it calls abort() which performs no cleanup unlike exit() .
SIGINT
SIGINT is sent by a terminal to the foreground process group when the user types the interrupt
character, which is usually Control-C.
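SIGKILL
SIGKILL terminates a process immediately. It cannot be caught, blocked or ignored, so the
process gets no chance to perform cleanup.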
SIGQUIT
SIGQUIT is sent by a terminal to the foreground process group when the user types the quit
character, which is usually Control-\. This signal results in creating a dump file, unlike SIGINT.
SIGTERM
SIGTERM is sent to terminate a process, but it is catchable unlike SIGKILL. This means that
SIGTERM allows the process to perform cleanup. It is thus preferable to terminate a process
using SIGTERM as opposed to SIGKILL.
Signal | # | Easy name | Catchable | Default action if not caught | Core dump
SIGBUS | - | Bus Error | Yes | Terminate | Yes
SIGFPE | 8 | Floating Point Error | Yes | Terminate | Yes
SIGILL | 4 | Illegal Instruction | Yes | Terminate | Yes
SIGPIPE | 13 | Pipe | Yes | Terminate | No
SIGSEGV | 11 | Segmentation Fault | Yes | Terminate | Yes
SIGSYS | - | System Call | Yes | Terminate | Yes
SIGBUS
SIGBUS is used to indicate some memory-access errors, such as a process trying to access a
non-existent physical address.
SIGFPE
SIGFPE is sent when a process executes an erroneous arithmetic operation such as divide-by-
zero. This signal can be sent for floating-point operations as well as for integer operations, even
though the letters FPE stand for Floating Point Error.
SIGILL
SIGILL is sent when a process attempts an illegal hardware instruction, e.g., malformed or
unknown.
SIGPIPE
SIGPIPE is sent when a process writes to a pipe but there is no reader at the other end of the
pipe.
SIGSEGV
SIGSEGV is sent when a process makes an illegal memory reference. This is usually the result of
a programming bug, such as dereferencing a pointer that has not been initialized.
SIGSYS
SIGSYS is sent when a process passes an incompatible argument to a system call. It is rare to
encounter this because we typically use libraries that make system calls.
Signal | # | Easy name | Catchable | Default action if not caught | Core dump
SIGALRM | 14 | Alarm | Yes | Terminate | No
SIGCONT | - | Continue | Yes | Continue | -
SIGHUP | 1 | Hang up | Yes | Terminate | No
SIGSTOP | - | Stop | No | Stop, not catchable | -
SIGTSTP | - | Terminal Stop | Yes | Stop | -
SIGTRAP | 5 | Trap | Yes | Terminate | Yes
SIGALRM
SIGALRM is sent to a process when a timer that the process had set by calling the alarm()
function has expired. It is usually sent and caught to execute actions at a specific time.
SIGCONT
If SIGCONT is sent to a process that is currently stopped, the process will move to the runnable
state. If the process is not currently stopped, the signal is ignored.
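SIGHUP
SIGHUP is sent to a process when its controlling terminal is closed, i.e., the terminal has
"hung up".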
SIGSTOP
SIGSTOP stops a process. It is not catchable by a handler and cannot be blocked or ignored.
SIGTSTP
SIGTSTP is issued at a terminal to stop the process group currently running in the foreground. It
is commonly sent by typing Control-Z on the terminal.
SIGTRAP
SIGTRAP is sent when a trap occurs for debugging, e.g., the value of a variable changes, or a
function starts. It is used by debuggers to implement breakpoints.
SIGCHLD
Signal | # | Easy name | Catchable | Default action if not caught | Core dump
SIGCHLD | - | Child Terminated | Yes | None | -
SIGCHLD is sent by the kernel to a process when a child process of this process has terminated
or stopped or resumed its execution. Normally, wait() and waitpid() will suspend a process
until one of its child processes has terminated. Using the signal SIGCHLD allows a parent
process to do other work instead of going to sleep and be notified via signal when a child
terminates. Then, when SIGCHLD is received, the process can (immediately or later) call
wait() or waitpid() when ready, perhaps leaving the child a zombie for just a little while.
Signal | # | Easy name | Catchable | Default action if not caught | Core dump
SIGUSR1 | - | User 1 | Yes | Terminate | No
SIGUSR2 | - | User 2 | Yes | Terminate | No
Both signals have no meaning to the kernel. These signals are provided for custom use by the
programmer. The interpretation of these 2 signals is up to the programmers who code the
sending and receiving processes. The kernel never sends these signals.
Additional Resources
A very good overview of signals is provided in the Linux manpages
(https://man7.org/linux/man-pages/man7/signal.7.html)
Basic concepts about signals are discussed in Chapter 20 of The Linux Programming
Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:00 PM Exploration: Signal Handling API: OPERATING SYSTEMS I (CS_374_400_S2024)
Signal Sets
Many times, we want to customize signal-related functionality for a group of signals, e.g., we
may want to ignore a set of signals. A signal set is a list of signal types which is defined using
the special type sigset_t defined in signal.h . Several utility functions are also provided to
manipulate signal sets, which we describe below:
sigset_t my_signal_set;
int sigemptyset(&my_signal_set);
Initialize or reset the signal set my_signal_set to empty, i.e., contain no signal types.
Returns 0 on success and -1 on error.
int sigfillset(&my_signal_set);
Initialize or reset the signal set my_signal_set to have all signal types.
Returns 0 on success and -1 on error.
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-signal-handling-api?module_item_id=24111766 1/9
We can register signal handling functions using the sigaction() function, some of whose
parameters are pointers to variables of the structure sigaction . Yes, it is somewhat confusing
that the function and the structure share the same name!
sigaction()
#include <signal.h>
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);
The function sigaction() registers a signal handling function that a programmer has created for
a specified signal. It has three parameters:
The first parameter signum is the signal type for which we are registering the handler. It can
be any valid signal except SIGKILL and SIGSTOP, which as you may recall cannot be
caught.
The second parameter is a pointer to a data-filled sigaction struct which describes the
action to be taken upon receipt of the signal given in the first parameter.
The third parameter can be null or it can be a pointer to another sigaction struct. If it is not
null, then the sigaction() function will write to this structure the handling settings for this
signal before this change was requested.
struct sigaction {
void (*sa_handler)(int);
sigset_t sa_mask;
int sa_flags;
void (*sa_sigaction)(int, siginfo_t*, void*);
};
sa_handler
The first attribute is named sa_handler . It is a pointer to the function which we want to be
invoked to handle this signal, i.e., the signal handler function. We had discussed pointers to
functions in an earlier module, but let us examine the type of this attribute:
The * in front of the name sa_handler , and the parentheses around it, indicate that this is a
pointer to a function.
The void indicates that the signal handler function does not return anything.
The int indicates that the signal handler function must take exactly one parameter whose
type is integer.
This integer parameter will hold the signal number when the signal handler is called. This is
important because the same handler may be registered for multiple signals. The only way to
tell the signal handler function which signal caused it to start is via this integer parameter.
Instead of passing a pointer to a function defined by us, we can also pass one of the
following two constants as the value of sa_handler :
SIG_DFL – specifying this value means we want the default action to be taken for the signal
type.
SIG_IGN – specifying this value means that the signal type should be ignored.
sa_mask
The sa_mask attribute is of type sigset_t . It contains a set of signals which should be blocked
while the signal handler is executing. Blocking means that the signals arriving during the
execution of sa_handler are held until the signal handler is done executing. At this point the
signals will be delivered in order to the process. Note that multiple signals of the same type
arriving while the signal type is blocked may be combined, so we cannot use this to count how
many occurrences of a signal type occurred during this period.
sa_flags
The third attribute of the sigaction struct provides additional instructions (flags):
SA_RESETHAND – this flag resets the signal handler to SIG_DFL (default action) as soon as
the handler is invoked, so only the first occurrence of the signal is handled by our function.
SA_SIGINFO – this flag tells the kernel to call the function specified in the fourth attribute,
i.e., sa_sigaction , instead of the function specified in the first attribute, i.e., sa_handler .
More detailed information can be passed to the function specified in sa_sigaction , as you
can see from its additional parameters.
0 – set sa_flags to 0 if we aren't planning to set any flags.
sa_sigaction
The fourth attribute, sa_sigaction , specifies an alternative signal handler function to be called.
Most of the time you will use sa_handler and not sa_sigaction , so we will not discuss this
further.
#include <unistd.h>
int pause(void);
When a process is suspended after calling pause() and it receives a signal, one of the following
actions takes place
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main(){
  printf("Send the signal SIGINT to this process by entering Control-C\n");
  fflush(stdout);
  pause();
  // When Control-C is entered, the process will terminate due to the default
  // signal handler. The following line will not be printed
  printf("pause() ended. The process will now end.\n");
  return 0;
}
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(){
  // Initialize SIGINT_action struct to be empty
  struct sigaction SIGINT_action = {0};
  printf("Send the signal SIGINT to this process by entering Control-C. That will cause the signal handler to be invoked\n");
  fflush(stdout);
Example: Custom Handlers for SIGINT, SIGUSR2, and Ignoring SIGTERM, etc.
In the next version of the program we do the following:
We register a signal handler for SIGINT that sleeps for 10 seconds and then raises
SIGUSR2
We register a signal handler for SIGUSR2 that terminates the program
We register the SIG_IGN constant as the handler for SIGTERM, SIGHUP and SIGQUIT.
This means that these 3 signals will be ignored by this program.
Note: the code is in the file signalexample.c. You can view the list of files by clicking the "Files"
icon on the left-hand side of the replit. Then click on signalexample.c to view the code.
#include <sys/types.h>
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main(){
struct sigaction SIGINT_action = {0}, SIGUSR2_action = {0}, ignore_action = {0};
while(1)
pause();
return 0;
}
If you click the “run” button, the program is compiled to an executable named signalexample.
Now execute this program in the background by issuing the following command at the
command prompt in repl.it:
$ ./signalexample &
You can now observe the behavior of the program by sending it various signals by following the
directions printed by the program.
Reentrant Functions
In the above programs where we registered custom signal handlers, you may have noticed that
in the code of the signal handler we used the write function instead of using printf . The
reason for this is that write is a reentrant function while printf is non-reentrant. Let us
explain this concept.
A signal interrupts the sequence of instructions that the process is executing. The instructions in
the signal handler will be executed. Then, if the signal handler does not terminate the process,
the sequence of instructions for the process will be resumed.
But what if the process was interrupted while it was executing a function that uses a global data
structure and the signal handler calls the same function as well? The call by the signal handler
can change the state of the global data structure in such a way that when the process’s
instructions are resumed, the function that had been interrupted fails on resumption due to the
change in that global data structure.
For this reason, we have to be careful about which functions we call in a signal handler.
Functions that are safe to use in a signal handler are called reentrant. A reentrant function
still works correctly even if its execution is interrupted and a signal handler executes the
same function. A function that uses only local variables, and calls only reentrant functions, is
reentrant.
printf and other members of the standard I/O library use some global data structures in such a
way that they are not reentrant. Therefore, in signal handlers we use the write function
because it is reentrant.
write()
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
write() writes up to count bytes from the buffer buf to the file referred to by the file descriptor
fd . On success this function returns the number of bytes it has written. On error, it returns -1.
One scenario in which write may write fewer than count bytes is when it is interrupted by a
signal handler. If it is interrupted before it has written any bytes, it fails with the error EINTR.
However, if it has written at least one byte, it succeeds and returns the number of bytes it has
written.
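A common way to deal with both outcomes is to call write() in a loop. The following is a minimal sketch of that idea; the helper name write_all is hypothetical and not part of any library.

```c
#include <errno.h>
#include <unistd.h>

// Hypothetical helper: keep calling write() until all `count` bytes have
// been written. Returns 0 on success and -1 on error (errno is set by write()).
int write_all(int fd, const void *buf, size_t count){
  const char *next = buf;
  while (count > 0) {
    ssize_t written = write(fd, next, count);
    if (written == -1) {
      if (errno == EINTR) continue;  // interrupted before writing anything: retry
      return -1;                     // a real error
    }
    // Partial write: skip past the bytes that were written and go again
    next += written;
    count -= (size_t)written;
  }
  return 0;
}
```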
In our example program, you may have noticed that the call to write gave the value of count as
an integer rather than using strlen to get the length of the string we want to write. For
example, the code is:
rather than
The reason? Older editions of POSIX did not list strlen among the functions that are safe to
call from a signal handler (newer editions add it), so code that has to be portable avoids
calling it in a signal handler.
A signal interrupts the sequence of instructions that the process is executing. The instructions in
the signal handler will be executed. Then, if the signal handler does not terminate the process,
the sequence of instructions for the process will be resumed.
When certain system calls, or library functions that use these system calls, are interrupted by a
signal, they return an error. An example of this is the function getline() which returns an error
if it is interrupted by a signal. Here are two ways to address this issue.
SIGTSTP_action.sa_flags = SA_RESTART;
Setting this flag will cause an automatic restart of the interrupted system call or library function
after the signal handler gets done.
Example
The following code checks for errors from getline() when it is reading from stdin and resets
the status of stdin in case of errors.
For more details see the man pages for clearerr (https://man7.org/linux/man-
pages/man3/clearerr.3.html) .
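A minimal sketch of this second approach is shown below. It is an illustration and not the course's code: the helper name getline_retry is hypothetical, and it simply retries the read whenever getline() reports that it was interrupted.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

// Hypothetical helper: read one line with getline(), retrying if the call
// was interrupted by a signal. Returns what getline() returns.
ssize_t getline_retry(char **line, size_t *capacity, FILE *stream){
  while (1) {
    errno = 0;
    ssize_t numChars = getline(line, capacity, stream);
    if (numChars == -1 && errno == EINTR) {
      // Interrupted by a signal: reset the error status of the stream
      // so that the next attempt to read can proceed
      clearerr(stream);
      continue;
    }
    // Either a line was read, or end-of-file or a real error occurred
    return numChars;
  }
}
```

As with getline() itself, the caller starts with a NULL line pointer and a capacity of 0, and frees the buffer with free() when done.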
However, calling any of the exec() family of functions in a process removes any special signal
handler function assigned to sa_handler or sa_sigaction previously! The exception is
that SIG_DFL and SIG_IGN are preserved through an exec() .
This means that if we use exec() to run programs we didn't write (e.g., ls , or other bash
commands), we cannot set up arbitrary signal handlers for these programs. The only way to
customize signal handling for such programs is to use SIG_IGN to tell these programs to ignore
particular signals.
Additional Resources
The signal handling API is discussed in Chapter 21 of The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:00 PM Exploration: Processes and I/O: OPERATING SYSTEMS I (CS_374_400_S2024)
Example
In the following program, the parent process opens a file and writes to it. It then spawns a child
process and waits for it. The child process inherits the open file descriptor and the file pointer.
As the child process writes to the file the file pointer moves. When the child terminates, the
position of the file pointer in the parent reflects the updated value due to the write done by the
child process.
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
int main(){
  // Parent opens a file
  char *newFilePath = "./newFile.txt";
  printf("PARENT: Opening file %s.\n", newFilePath);
  int fileDescriptor = open(newFilePath, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
  if (fileDescriptor == -1) {
    printf("open() failed on \"%s\"\n", newFilePath);
    exit(1);
  }
  // Parent writes to the file, which advances the file pointer
  // (the bytes written here are illustrative)
  write(fileDescriptor, "01", 2);
  // Flush before fork() so buffered output is not duplicated in the child
  fflush(stdout);
  // Spawn the child; it inherits the open file descriptor
  int childExitMethod;
  pid_t spawnPID = fork();
  switch (spawnPID){
    case -1:
      perror("fork() failed");
      exit(1);
      break;
    case 0:
      printf("CHILD: started. FP position: %ld\n", (long)lseek(fileDescriptor, 0, SEEK_CUR));
      printf("CHILD: Writing AB to file.\n");
      fflush(stdout);
      write(fileDescriptor, "AB", 2);
      printf("CHILD: After write, new FP position: %ld\n", (long)lseek(fileDescriptor, 0, SEEK_CUR));
      fflush(stdout);
      break;
    default:
      // Wait for the child
      waitpid(spawnPID, &childExitMethod, 0);
      printf("PARENT: child terminated, FP position is: %ld\n", (long)lseek(fileDescriptor, 0, SEEK_CUR));
      break;
  }
  return 0;
}
In many cases this sharing is desirable. For example, suppose we enter a command, such as
ls , on the shell. The shell spawns a child process to run ls . The output of ls gets written to
the standard output and it advances the file pointer. When the child process ends, the shell
which is waiting for this child resumes and writes something to standard output. Since the child
process running the ls program and the parent process running the shell program share the
same file pointer, this means that when the child process ends, the file pointer for standard
output in the parent process has also been advanced. The writes by the parent process to the
standard output do not intermix with what had been written by the child process.
If we want to prevent this sharing, then we can have one of the parent or the child close the file
and then re-open it to allocate a new file descriptor. Now reads and writes by the parent and
child processes will move different file pointers.
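The reason re-opening helps is that every call to open() creates its own file pointer. The following minimal sketch shows this within a single process; the function name offset_after_separate_opens is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: open `path` twice, write 2 bytes through the first
// descriptor, and return the file pointer position of the second one
// (or -1 on error).
long offset_after_separate_opens(const char *path){
  int firstFD = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
  if (firstFD == -1) return -1;
  // A separate open() of the same file gets its own file pointer
  int secondFD = open(path, O_RDWR);
  if (secondFD == -1) { close(firstFD); return -1; }

  long secondPosition = -1;
  // This write moves only the file pointer that belongs to firstFD
  if (write(firstFD, "AB", 2) == 2) {
    secondPosition = (long)lseek(secondFD, 0, SEEK_CUR);
  }
  close(firstFD);
  close(secondFD);
  return secondPosition;
}
```

The function returns 0: the write through the first descriptor did not move the file pointer of the second.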
In a previous module, we discussed how to do input and output redirection when using the shell.
Let us now look at how we can do such redirection programmatically using the dup2 function.
dup2()
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-processes-and-i-slash-o?module_item_id=24111767 2/5
#include <unistd.h>
int dup2(int oldfd, int newfd);
The dup2() system call duplicates a file descriptor. If it executes successfully, then newfd will
point to a copy of oldfd . The two file descriptors can be used interchangeably and share the
same file pointer. If newfd was previously open, it will be automatically closed. If it is successful,
dup2() returns the new file descriptor; otherwise it returns -1.
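The shared file pointer can be observed directly. In this minimal sketch (the function name offset_seen_through_dup2 is hypothetical, and /dev/null is opened only to obtain a second descriptor that is already open), a write through oldFD moves the position seen through newFD.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: duplicate a descriptor with dup2(), write 2 bytes
// through the original, and return the file pointer position seen through
// the duplicate (or -1 on error).
long offset_seen_through_dup2(const char *path){
  int oldFD = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
  if (oldFD == -1) return -1;
  // Open something else just to have a second, already-open descriptor
  int newFD = open("/dev/null", O_WRONLY);
  if (newFD == -1) { close(oldFD); return -1; }

  long position = -1;
  // dup2() closes /dev/null and makes newFD a copy of oldFD
  if (dup2(oldFD, newFD) == newFD && write(oldFD, "AB", 2) == 2) {
    // Shared file pointer: the write through oldFD is visible via newFD
    position = (long)lseek(newFD, 0, SEEK_CUR);
  }
  close(oldFD);
  close(newFD);
  return position;
}
```

The function returns 2, the position after the two bytes written through oldFD.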
Let us see an example of how we can redirect standard output using dup2 . In the following
program, we first open the file whose name is provided as an argument to our program and
store the open file descriptor for this file in the variable targetFD . We then use dup2 to point
file descriptor 1, i.e., standard out, to targetFD . After we have done this, whatever the program
writes to standard out, will get written to the file.
If we run the program as
./main foo.txt
then the printf statement executed after the call to dup2 will write its output to the file foo.txt.
You can view the contents of this file by running the cat command, e.g.,
cat foo.txt
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char *argv[]){
  if (argc == 1){
    printf("Usage: ./main <filename to redirect stdout to>\n");
    exit(1);
  }
  // Open the file named on the command line
  int targetFD = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0640);
  if (targetFD == -1) { perror("open()"); exit(1); }
  // Point file descriptor 1, i.e., standard out, to targetFD
  if (dup2(targetFD, 1) == -1) { perror("dup2()"); exit(2); }
  printf("This output is written to the file instead of the terminal\n");
  return 0;
}
In the following example, we redirect standard input to read from a file and standard out to write
to a file. The program itself uses execlp() to run the shell command sort
(https://www.man7.org/linux/man-pages/man1/sort.1.html) . The program will thus sort the input
file and write the sorted contents to the output file.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
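A minimal sketch of this kind of redirection is shown below. It differs from the program described above in one respect that should be stated plainly: it forks first and runs sort in the child, so that the caller can check the result afterwards. The function name sort_file is hypothetical, and the sort command is assumed to be on the PATH.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run "sort" with standard input read from inputPath
// and standard output written to outputPath.
// Returns the exit status of sort, or -1 on error.
int sort_file(const char *inputPath, const char *outputPath){
  pid_t childPID = fork();
  if (childPID == -1) return -1;
  if (childPID == 0) {
    int sourceFD = open(inputPath, O_RDONLY);
    if (sourceFD == -1) _exit(1);
    int targetFD = open(outputPath, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (targetFD == -1) _exit(1);
    // Redirect standard input (0) and standard output (1)
    if (dup2(sourceFD, STDIN_FILENO) == -1) _exit(2);
    if (dup2(targetFD, STDOUT_FILENO) == -1) _exit(2);
    // The original descriptors are no longer needed
    close(sourceFD);
    close(targetFD);
    // sort reads stdin and writes stdout, unaware of the redirection
    execlp("sort", "sort", (char *)NULL);
    _exit(127);  // only reached if exec fails
  }
  int childStatus;
  if (waitpid(childPID, &childStatus, 0) == -1) return -1;
  return WIFEXITED(childStatus) ? WEXITSTATUS(childStatus) : -1;
}
```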
On os1, you can test the program by running the following commands:
The first command creates a file inputfile.txt and writes each of the 3 words on a line by itself.
Running our program reads the contents of the input file, passes it to the sort command, and
writes the sorted contents to the output file.
Our C program thus runs the equivalent of the following shell command:
#include <fcntl.h>
int fcntl(int fd, int cmd, ... /* int arg */ );
Using fcntl we can set a flag FD_CLOEXEC on a file-by-file basis. This flag is inherited through
fork. So if a parent uses fcntl to set this flag on a file, a child that later calls an exec function
will trigger the close on exec flag and the corresponding file descriptor will be closed. Sample
code that uses this feature might look like the following:
#include <fcntl.h>
...
int fd;
fd = open("file", O_RDONLY);
…
fcntl(fd, F_SETFD, FD_CLOEXEC);
// Now whenever this process or one of its child processes calls exec, the file descriptor
// fd will be closed in that process
Additional Resources
The system call dup2() and its use are discussed in Chapters 4 and 5, while close-on-exec
is discussed in Chapter 18 of The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:00 PM Review - Module 3: OPERATING SYSTEMS I (CS_374_400_S2024)
Review - Module 3
Key Take-Aways
At this point, you should be able to answer all of the following questions.
What is a process and what are the different states it can go through?
What is a parent process and a child process?
What does the fork system call do? How can you use it to create new processes?
What happens when a process calls the exit function?
How can a parent process wait for a child process? What do wait and waitpid functions do?
How can a new program be executed?
What do the functions in the exec family of functions accomplish?
What is the environment of a process?
How can the environment variables be accessed or modified when using the bash shell?
How can the environment variables be accessed or modified in C programs?
What is the relationship between the environment of a parent process and a child process?
How to use the ps command to get information on currently running processes?
How to manipulate foreground and background jobs from the shell?
What is a signal and what is a signal handler?
What are the different uses of signals?
How can you send a signal from the command line?
How can you write custom signal handlers?
How can you block signals in your C programs?
How are file descriptors inherited by a child process?
What is the use of the dup2() system call?
How can you redirect standard input and/or standard output?
How can you force an open file to automatically close whenever a process uses an exec
function to run a new program?
https://canvas.oregonstate.edu/courses/1971495/pages/review-module-3-2?module_item_id=24111761 1/1
6/8/24, 2:00 PM Module 4 - Overview: OPERATING SYSTEMS I (CS_374_400_S2024)
Module 4 - Overview
Introduction
In this module, we look at multi-programming and possible issues related to concurrent
execution of multiple processes. We then introduce the thread abstraction and contrast it with
the process abstraction. We then study Unix APIs for safe execution of concurrent threads.
https://canvas.oregonstate.edu/courses/1971495/pages/module-4-overview?module_item_id=24445917 1/2
6/8/24, 2:00 PM Exploration: Concurrency: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Concurrency
Multiprogramming and Multiprocessing
When studying process states, we learnt that an OS can switch the CPU between
multiple processes that are all ready to run. This gives the impression that multiple programs are
being run concurrently and is called multiprogramming. The part of an OS that makes
decisions about which process should be run on the CPU is called the scheduler.
The scheduler may give the CPU to another process because the process that was running
on the CPU is now in the blocked state, e.g., because it is waiting for an I/O request to
complete.
The scheduler may give the CPU to another process even if the currently running process
can still run instructions on the CPU. This may be done by the scheduler if the currently
running process has been on the CPU for a long time, and just letting this process run on the
CPU will no longer give the impression that multiple programs are being run concurrently.
On a system with multiple CPUs, many processes can be on the CPU at the same time. This is
called multiprocessing. However, even on such systems, the scheduler will make scheduling
decisions about which processes should be run on each of the CPUs and for how long they
should be allowed to run on the CPU.
When one process is moved off the CPU and another process is run on the CPU, a context
switch is said to have occurred. During a context switch, the OS needs to store the state of the
process removed from the CPU and restore the state of the process that will now be run on the
CPU. Specifically:
The OS stores the contents of the CPU’s registers and the program counter for the process
that has been removed from the CPU. This is done so that when this process is again given
a run on the CPU, the registers and the program counter can be restored to these values
and the process will continue exactly from where it was stopped.
The OS then sets the CPU registers and the program counter corresponding to the process
which will now be run on the CPU.
A context switch is an expensive operation. It is also an overhead in the sense that during the
context switch the CPU is essentially doing bookkeeping rather than doing the useful work of
running the programs.
Example
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-concurrency?module_item_id=24111770 1/4
Suppose a program to withdraw money from an ATM has been coded as follows:
1. Read the balance of an account from a file into a local variable named balance .
2. Subtract the withdrawal amount from the local variable balance .
3. Write back the updated value of the local variable balance to the file.
Consider further that this file with the balance information is shared by multiple ATMs over the
network and the account balance of a particular user is $100. If $20 is withdrawn from this
account at about the same time from two different ATMs, the actions of the programs running on
each of these ATMs can interleave in the following order (time increases as we go down the
following table).
Account Balance | Process 1 | Process 2
--------------- | --------- | ---------
$100 | |
$100 | Reads value $100 from the file into the local variable balance |
$100 | | Reads value $100 from the file into the local variable balance
$100 | Subtracts $20 from the local variable balance. The value of the local variable balance is now $80 |
$100 | | Subtracts $20 from the local variable balance. The value of the local variable balance is now $80
$80 | Writes $80 back to the file |
$80 | | Writes $80 back to the file
So even though $40 was withdrawn from the account, the balance is $80 instead of $60!
The part of code that accesses or modifies a shared resource is termed a critical section. If
multiple processes are accessing the same resource and execute their critical sections at the
same time, a race condition will occur. What we want is that when one process is in its critical
section, no other processes accessing this resource should be executing their critical section.
This property is called mutual exclusion.
Note that since the scheduler can remove a running process from the CPU to schedule another
process, it is possible that the scheduler removes a process while it is in the middle of executing
its critical section, and schedules another process that accesses the same resource and will
start executing its critical section. To ensure mutual exclusion, we need some sort of “lock”
where only one process can own the lock at any given time, and only that process is allowed in
the critical section. Once the process is no longer in the critical section, it should give up the
lock so that other processes can run the critical section.
Example: Consider a process that is running a browser. Many tabs can be simultaneously open in
the browser and a user may switch between these tabs.
One way to achieve concurrency in a browser could be to spawn multiple processes for the
browser. Then these processes will cooperatively work to concurrently do the tasks we
described above. This way when one process is waiting for an image to be downloaded from the
network, the CPU could schedule another process for this browser which could render an image
that had been previously downloaded. These cooperating processes will need some way to
communicate with each other. The OS provides various mechanisms to support Inter-Process
Communication (IPC), such as shared memory, pipes, etc., which we will study in a later
module in the course.
However, OSs provide a simpler and less resource intensive mechanism to implement
concurrency within a program called threads. We will study threads in detail in the coming
explorations in this module. In particular, we will study the POSIX standard Pthreads API for
creating and managing threads, and for implementing mutual exclusion on Unix based OSs.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
6/8/24, 2:00 PM Exploration: Threads - Concepts & API: OPERATING SYSTEMS I (CS_374_400_S2024)
Creation of a thread is less expensive than the creation of a process because a new thread
will share the address space already allocated to the process in which this thread is created.
If one thread of a process is blocked, e.g., waiting for an I/O request to complete, the
process does not necessarily have to be blocked, because another thread of the process
may still be runnable and can be scheduled to run on the CPU.
A context switch to move one thread off the CPU and put another thread on the CPU is
much faster than the context switch from one process to another.
Communication between threads in a process can be simpler and faster than
communication between different processes.
Due to threads being very similar to processes, yet being a lot less resource intensive,
sometimes threads are referred to as lightweight processes in contrast to a heavyweight
process running just one thread.
Memory layout of a C program with multiple threads: all threads share the same Heap, Data
and Code segments. The Stack segment is broken up into pieces for the stack area of each
thread.
However, another possibility is that a library provides user-level threads. In this case the OS
does not know about the threads but only knows about and schedules processes. The threading
library manages the threads in a process and is responsible for switching between them. User-
level threads are sometimes called green threads.
Pthreads API
The Pthreads API is a standard API for threads in Unix. It is a POSIX standard and is available on
most Unix systems. The header file for the Pthreads library is pthread.h and the library includes
a large number of functions to, e.g., create, join, and destroy threads.
The main() function creates a new thread using the function pthread_create() and then
waits for the thread to complete by calling pthread_join() .
The thread created by main runs the function helloWorld which prints “Hello World” and
returns.
Once this thread returns, the main function resumes execution and executes its own return
statement.
Program: 6_2_thread_hello.c
Creating a Thread
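#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);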
The function pthread_create() starts a new thread in the process that has called this function.
Its arguments are as follows:
thread points to a variable into which the ID of the new thread is written. Its type is
pthread_t which is platform dependent, but is likely to be an unsigned long.
attr points to a pthread_attr_t struct that contains option flags. This argument can be
NULL if we don’t want to pass any flags.
start_routine points to a function that will be the starting point of execution for the new
thread. This function takes a void pointer as its argument and returns a void pointer.
arg points to the sole argument that is passed into start_routine . If multiple arguments
need to be passed, we can pass a struct. If no arguments need to be passed, we pass a
NULL.
In this case, the start_routine is the function helloWorld . We are not passing any flags or
arguments to the call to pthread_create .
Note that to compile and link a program that uses the Pthreads API, we need to pass the flag
-pthread to the gcc command. E.g., for a hypothetical source file hello.c:
gcc -pthread -o hello hello.c
Program: 6_2_thread_many_args.c
Ending a Thread
A thread can end in one of the following ways:
pthread_exit()
pthread_cancel()
This function sends a cancellation request to the target thread that is provided as an argument.
The target thread has control over whether or when it decides to cancel. The details can be
found in the man pages for this function (https://man7.org/linux/man-pages/man3/pthread_cancel.3.html).
When we were discussing the process API, we learnt how a parent process can wait for the
termination of one or more of its child processes. Similar to this, a thread in a process can wait for
the termination of one or more other threads in that process by calling pthread_join .
#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);
A thread that calls this function waits for the thread thread to terminate. If the target thread has
already terminated, the function returns immediately. If multiple threads try to join the same
thread, the result is undefined.
However, unlike the process API where a parent can wait for its children, but not the other way
round, there is no such hierarchy among the various threads in a process. Threads in a process
are peers of each other and any thread in the process can wait for the termination of another
thread in that process by calling pthread_join on that thread.
function. This thread running main() then waits for the termination of all the threads it has
created by calling pthread_join on each of these threads. After all these threads have
terminated, the main() function returns, thus terminating the thread running it, and hence the
process.
Program: 6_2_thread_example.c
Program: 6_2_thread_ids.c
Note: As explained in the next section, definition of pthread_t is platform dependent. The
above program prints the value of pthread_t by assuming it is an integer. This program will
work on Linux, but is not guaranteed to work on other flavors of Unix. Printing pthread_t
by assuming it is an integer can still be useful for debugging on Linux and other platforms
where this works.
The function pthread_equal() compares two thread IDs. It returns a nonzero value if the two
thread IDs are identical; otherwise it returns 0.
Exercise
Our goal in the following program is that the main() function will create two threads which will
run concurrently. Does the program achieve its goal? If not, why not and how can you fix it?
Program: 6_2_threads_gone_wrong.c
Answer
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
The Pthreads API is discussed at length in Chapters 29, 30, 31 and 32 of The Linux
Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
An introduction to concurrency and threads is given in Chapter 26 Concurrency: An
Introduction (http://pages.cs.wisc.edu/~remzi/OSTEP/threads-intro.pdf) in the book
Operating Systems: Three Easy Pieces.
Arpaci-Dusseau, Remzi H., and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, 2018.
6/8/24, 2:01 PM Exploration: Synchronization for Concurrent Execution: OPERATING SYSTEMS I (CS_374_400_S2024)
Program: 6_3_bad_threads.c
In the above program, we spawn NUM_THREADS threads. Each of these threads increments the
same shared variable counter . Each thread increments this variable COUNT_TO times. With
NUM_THREADS set to 3 and COUNT_TO set to 10,000,000, a correct execution of the program
should end with the value of counter set to 10,000,000 * 3, i.e., to 30,000,000. However, if you
run the program even once, you are likely to see a different value. If you run the program multiple
times, almost certainly the result will be different each time. So what is causing this issue?
The issue is caused by unsafe concurrent access and modification of the variable counter . In
the program, the value of counter is incremented in the following line:
counter += 1;
The critical section of the program comprises this line because this is the only part of the
program where a shared resource is being modified. When we compile the program, this single
line of C code will be compiled to more than one instruction in machine language. The
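compilation may, for example, comprise the following 3 instructions:
Instruction 1: Load the value of counter from memory into a register.
Instruction 2: Add 1 to the value in the register.
Instruction 3: Store the value in the register back into the variable counter.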
Recall the OS scheduler can at any time remove one thread from the CPU and instead run
another thread. Because of this, it is possible that two threads running concurrently may end up
executing the above instructions to increment this variable in the following interleaved order:
1. Thread 1 executes Instruction 1 and loads the value of counter into a register. Let’s say this
value was 100.
2. Thread 1 executes Instruction 2 and sets the value in the register to 101.
3. The OS removes Thread 1 from the CPU and instead gives the CPU to Thread 2. As part of
the context switch, the OS will save the value of all the registers so that it can restore
them when Thread 1 resumes execution.
4. Thread 2 executes Instruction 1 and loads the value of counter into a register. Since Thread
1 had only updated the register, but had not yet updated the value of the variable counter ,
the value of the variable is still 100.
5. Thread 2 executes Instruction 2 and sets the value in the register to 101.
6. Thread 2 executes Instruction 3 and stores the value 101 from the register into the variable
counter .
7. The OS removes Thread 2 from the CPU and schedules Thread 1 again. As part of the
context switch, the OS restores the value 101 to the register that it had saved in Step 3.
8. Thread 1 executes Instruction 3 and stores the value 101 from the register into the variable
counter .
As a result of the interleaved execution, at the end of the two threads executing the increment
statement, the value of the variable has only been incremented once, instead of twice. This is
why the program gives an incorrect result.
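#include <pthread.h>
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);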
Calling pthread_mutex_lock() on a mutex will give the mutex to the calling thread. However,
if this mutex is already locked, the calling thread will block. A thread that has locked the
mutex is also sometimes referred to as holding the mutex.
As an alternative to this, the function pthread_mutex_trylock() can be used to make a non-
blocking attempt to lock the mutex. This function will return 0 if it is able to lock the mutex,
otherwise it will return a non-zero value.
A thread unlocks a mutex it has previously locked by calling pthread_mutex_unlock() .
In the following program, we fix the race condition from the previous program by using a mutex
to synchronize access to the critical section where the value of the shared variable counter is
incremented.
Program: 6_3_granular_sync.c
With the change, each thread calls pthread_mutex_lock before it can increment the value of
counter by executing counter += 1 .
pthread_mutex_lock(&counterMutex);
counter += 1;
pthread_mutex_unlock(&counterMutex);
If multiple threads try to execute the increment statement at the same time, the thread which
first called pthread_mutex_lock will lock the mutex and go through to this statement, while the
other threads will be blocked. Thus the race condition has been eliminated, and this program will
always give the same correct result.
Note: When one thread has locked the mutex, all other threads that want the mutex are
blocked. For this reason a mutex must be locked for the shortest time that is necessary to
synchronize access to the shared resource. Any statements that are irrelevant to the
shared resource must be executed either before requesting the lock on the mutex or after
unlocking the mutex. Doing otherwise reduces the concurrency of the program because a
thread that has locked the mutex while executing statements that don't require the mutex is
unnecessarily blocking the other threads.
Additional Resources
The Pthreads API is discussed at length in Chapters 29, 30, 31 and 32 of The Linux
Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
The Pthreads API is also discussed in Chapter 27 Interlude: Thread API
(http://pages.cs.wisc.edu/~remzi/OSTEP/threads-api.pdf) in the book Operating Systems:
Three Easy Pieces.
Arpaci-Dusseau, Remzi H., and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, 2018.
6/8/24, 2:01 PM Exploration: Synchronization Mechanisms Beyond Mutex: OPERATING SYSTEMS I (CS_374_400_S2024)
Producer-Consumer Problem
The Producer-Consumer problem, also called the bounded buffer problem, is a classic
example of multiple threads (or multiple processes) needing to synchronize their actions. In its
simplest form the problem consists of two threads, one of which is called the producer and the
other is called the consumer. The producer and the consumer share a common buffer of a
bounded size. The producer generates data and puts it in the common buffer. The consumer picks
data from the buffer and consumes it. If the buffer is full, the producer must wait until there is
space in the buffer to put data. Similarly, if the buffer is empty, then the consumer must wait until
there is data in the buffer to consume.
We can summarize the requirements for the producer and consumer as follows:
Producer should put data only when the buffer is not full
Consumer should get data only when the buffer is not empty
The Unix pipe | command can be used in a shell to send the output of one command to the
input of another command. To connect the output of one command to the input of another
command, the Unix kernel creates a bounded buffer. E.g., the bash command cat foo.txt |
more displays the file foo.txt one screen at a time. cat foo.txt puts the contents of foo.txt
in this buffer, while more picks up this data from the buffer and displays it one screen at a
time. Thus, the command cat foo.txt is the producer and the command more is the
consumer in this example.
In many cases, there can be multiple producers and/or multiple consumers. Multi-threaded web
servers that process HTTP requests are an example of real systems with such an architecture.
A producer thread puts the HTTP requests received by the web server in a bounded
buffer. Multiple consumer threads pick up these requests from the buffer and process them to
send back HTTP responses.
Program: 6_4_prodcons_mutex.c
The above program successfully enforces the requirements that the producer should not put data
when the buffer is full and the consumer should not get data when the buffer is empty. However,
the solution is very inefficient. When the buffer is full, the producer sleeps for a second and then
again checks the buffer, wasting CPU time if the buffer is still full. Similarly, when the buffer is
empty, the consumer may repeatedly check the buffer again wasting CPU time. We could
decrease this inefficient use of the CPU by increasing the time the producer and the consumer
sleep when they cannot use the buffer. However, this will mean that the response time of the
application degrades because the producer or the consumer might be sleeping when they could
be using the buffer.
However, typically in Producer-Consumer problems the number of items is not known beforehand.
Instead, the producer finds out that there are no more items to produce when it encounters some
sort of marker.
E.g., consider a scenario where the producer is processing a file and putting the file line-by-
line in the shared buffer for the consumer. In this case, end-of-file is the marker that tells the
producer that there is no more data for it to put in the buffer.
In such situations, the producer thread must not terminate as soon as it sees the marker. Before
terminating it must place a marker in the shared buffer so that the consumer can also know that
no more data will be put in the buffer. Otherwise, after consuming all the items in the buffer, the
consumer thread will forever keep checking the buffer to see if there are any more items to
consume.
In the above program, we use a special value -1 as a marker to indicate that the producer will
not put any more data in the shared buffer. The producer first puts this marker in the shared
buffer and then terminates the producer thread. When the consumer gets this marker from the
shared buffer, it terminates its thread.
In the program the number of items to produce/consume is available in the variable
num_iterations . So we could have instead coded the consumer thread to terminate when it
had consumed num_iterations items. We instead used the marker approach to illustrate that
idea.
Also note that it will be incorrect for the program to terminate as soon as the producer thread
terminates. Doing that doesn't allow the consumer thread to consume any items that may still be
in the shared buffer. Instead, the consumer thread should terminate when it knows there are no
more items to consume. Only after that can the program terminate.
Summary
In this exploration we looked at the Producer-Consumer problem. We found that while using a
mutex can provide the synchronization needed by the producer and consumer threads, the use
of a mutex alone leads to an inefficient solution to the problem.
Additional Resources
A very good discussion of the producer-consumer problem is given in Chapter 30 Condition
Variables (http://pages.cs.wisc.edu/~remzi/OSTEP/threads-cv.pdf) in the book Operating
Systems: Three Easy Pieces.
Arpaci-Dusseau, Remzi H., and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, 2018.
6/8/24, 2:01 PM Exploration: Condition Variables: OPERATING SYSTEMS I (CS_374_400_S2024)
Condition variables can be used to implement exactly this behavior. A condition variable allows a
thread to sleep until another thread indicates to this thread that a condition is now true and the
first thread should wake up. In the producer-consumer problem:
When the buffer is empty, the consumer thread will sleep. When the producer puts an item in
the buffer, the buffer is no longer empty. The producer thread will indicate to the consumer
thread to resume execution.
Similarly, when the buffer is full, the producer thread will sleep. When the consumer consumes
data from the buffer, the buffer is no longer full. The consumer thread will indicate to the
producer thread to resume execution.
Note that the modification of the shared resource still requires a mutex to be shared between the
producer and the consumer. However, instead of calling sleep() to sleep for a fixed amount of
time, and then checking the buffer again, the producer and the consumer can instead use
condition variables.
#include <pthread.h>
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *m);
int pthread_cond_signal(pthread_cond_t *cond);
To call the function pthread_cond_wait , the thread calling the function must hold the mutex m .
As a result of calling this function, the thread gives up the mutex m and blocks on the
condition variable cond until another thread calls pthread_cond_signal on cond .
When a thread calls pthread_cond_signal on a condition variable, a thread that has called
pthread_cond_wait on that condition variable is unblocked and tries to acquire the mutex that it
had released when it called pthread_cond_wait .
Calling pthread_cond_wait is called waiting for the condition variable and calling
pthread_cond_signal is called signalling the condition variable. Note that the term signal here is
not related to the signal mechanism in Unix, but "signal" here is used in the sense of "indicate."
The main() function creates a producer thread and a consumer thread, and waits for these
threads to finish similar to the previous example program.
The producer thread runs the function producer() . This function locks the mutex and then
checks whether there is space in the buffer to place an item.
If there is space in the buffer, it places an item in the buffer and increments the count of
items in the buffer. It then signals the condition variable full to indicate that the buffer has
data and unlocks the mutex.
If there is no space in the buffer, the producer thread calls pthread_cond_wait on the
condition variable empty . This will result in the producer thread unlocking the mutex and
blocking until the consumer thread signals this condition variable empty .
The consumer thread runs the function consumer() . This function locks the mutex and then
checks whether there is data in buffer to consume.
If there is data in the buffer, it picks up the next item in the buffer and decrements the count
of items in the buffer. It then signals the condition variable empty to indicate that the buffer
has space and unlocks the mutex.
If there is no data in the buffer, the consumer thread calls pthread_cond_wait on the
condition variable full . This will result in the consumer thread unlocking the mutex and
blocking until the producer thread signals this condition variable full .
Program: 64_prod_cons_cv.c
Thus, by using the two condition variables empty and full , the producer and consumer threads
can coordinate the production and consumption of data in the shared buffer such that
The producer only puts data in the buffer when there is space in the buffer
The consumer only gets data from the buffer when there is data in the buffer, and
Neither thread does a busy wait by unnecessarily using CPU time to check if it can use the
buffer.
Modify the program so that synchronization is achieved using only one condition variable.
Program: 6_4_prod_cons_unbound.c
Answer
Summary
Condition variables provide another synchronization mechanism for use by multiple threads in a
process. We illustrated the use of condition variables for the Producer-Consumer problem with a
bounded buffer as well as an unbounded buffer. We only looked at scenarios where there was one
producer and one consumer. There are variations of the Producer-Consumer problem where there
can be multiple producers and/or multiple consumers. In addition to the Producer-Consumer
problem, condition variables are also widely used in multi-threaded programs for other
synchronization problems.
Additional Resources
The Pthreads API's support for condition variables is discussed in Ch 30.2 of The Linux
Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
A very good discussion of the producer-consumer problem and the use of condition variables
is given in Chapter 30 Condition Variables
(http://pages.cs.wisc.edu/~remzi/OSTEP/threads-cv.pdf) in the book Operating Systems: Three
Easy Pieces.
Arpaci-Dusseau, Remzi H., and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, 2018.
The Dining Philosophers problem (Wikipedia link
(https://en.wikipedia.org/wiki/Dining_philosophers_problem) ) is another well-known example
problem that is used to illustrate synchronization issues. A solution to this problem using
condition variables is given here
(http://www.cs.fsu.edu/~baker/realtime/restricted/notes/philos.html) .
6/8/24, 2:01 PM Review - Module 4: OPERATING SYSTEMS I (CS_374_400_S2024)
Review - Module 4
Key Take-Aways
At this point, you should be able to answer all of the following questions.
What is multiprogramming?
What is multiprocessing?
What function does an OS scheduler perform and how is this role relevant to
multiprogramming and multiprocessing?
What is a race condition and what problems can be caused by it?
What is the critical section of a program?
What is mutual exclusion and why is it an important property to maintain when developing
programs that may concurrently access shared resources?
What is a thread and how do threads compare to processes?
What is the Pthreads API to create and destroy threads?
How can you use the Pthreads API to write multi-threaded programs with correct concurrent
behavior?
What are condition variables?
What is the Producer-Consumer problem and how are condition variables useful for
implementing solutions for this problem?
6/8/24, 2:02 PM Exploration: Inter-Process Communication: OPERATING SYSTEMS I (CS_374_400_S2024)
Introduction
In this exploration, we will take a tour of the mechanisms that Linux provides for
processes to communicate with each other, i.e., functionality for Inter-Process Communication
or IPC. In an earlier module, we studied signals, which one process uses to notify another
process and which thus provide a (limited) form of IPC. In another module, we explored
mutex locks in the Pthreads API, a facility that multiple threads in one process can use
to synchronize their actions. In this exploration, we will look at the breadth of IPC facilities, and
then, in later explorations in this module, we will study some of these facilities in greater depth.
Categories of IPC
At a high level, we can categorize IPC facilities into communication facilities which help
processes exchange data, and synchronization facilities which help processes (or threads)
coordinate their actions.
Pipes
A pipe is a unidirectional data channel between processes on one machine: one process
writes to the pipe and another process reads from it. A pipe can only be used between
related processes, i.e., processes that inherited the pipe's file descriptors from a common
ancestor that created the pipe. A pipe can be used, e.g., for a parent process to write to
its child process, or vice versa. Pipes exchange data as byte streams. We will study
pipes in the next exploration.
FIFO
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-inter-process-communication?module_item_id=24111778 1/4
FIFO stands for First-in, First-out. FIFOs are similar to pipes in that one process writes to a
FIFO and another process on the same machine reads from it. However, a FIFO can be used for
communication between two processes even if these processes are not related to each other.
Unlike pipes, FIFOs have a name, and they are also called named pipes. FIFOs exchange data
as byte streams. We will study FIFOs in the next exploration.
Message Queues
Message queues are used to exchange data in the form of messages. Messages in the queue
have a type field, a length field and the actual message bytes. Messages can be fetched in first-
in, first-out order. But it is also possible to fetch messages in a non-FIFO order using the type
field.
Sockets
Sockets are different from the other IPC facilities that we have discussed in that they can be
used for IPC by processes running on the same machine as well as by processes that are on
different machines. Datagram sockets provide message-based communication, whereas stream
sockets support byte streams. We will study sockets in depth in another module.
With data transfer, reading the data removes it from the communication facility. This means
that each piece of data is consumed by only one reader. In contrast, multiple processes can
read the same data when shared memory is used.
Exercises
Communication facilities, such as pipes and FIFOs, allow communication between processes.
Do you think it would be important to provide similar communication facilities for multiple threads
in the same process? If yes, why? If no, why not?
Answer
Synchronization Facilities
Processes and threads use synchronization facilities to coordinate their actions, especially when
modifying or accessing shared resources, such as memory or files. Let us look at different types
of synchronization facilities.
Semaphores
A semaphore is an integer whose value is always 0 or more. Processes, or threads in a
process, can use semaphores to coordinate their actions. There are two fundamental operations
that a process can perform on a semaphore: increment its value and decrement its value. If the
value is already 0, then the decrement operation blocks until another process increments the
value of the semaphore.
To enforce mutual exclusion, processes can use a binary semaphore whose value is either 0
or 1. A process wanting to gain exclusive access to a shared resource will try to decrement the
value of the binary semaphore. It will succeed if the value is 1; otherwise it will be blocked.
However, if processes want to share a pool that contains multiple instances of a resource, they
can use a counting semaphore, whose value can go up to the number of resources in the
pool. We will study semaphores in a later exploration in this module.
File Locks
File locks are a specialized synchronization facility to coordinate operations on a file by multiple
processes. File locks can be created for complete files or at the finer granularity of portions of a
file.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
6/8/24, 2:02 PM Exploration: Pipes & FIFO: OPERATING SYSTEMS I (CS_374_400_S2024)
Pipes
Pipes are used for Inter-process communication (IPC) between two processes that are forked
by a common ancestor process. A pipe is a unidirectional data channel that connects a write-
only file descriptor in one process to a read-only file descriptor in another process.
Creating a Pipe
Pipes are possible because open file descriptors are inherited by the child across fork() and
remain open across the exec() family of functions. Here is how pipes are created and used:
A parent process creates a pipe by calling the pipe() function. This results in two new open
file descriptors: pipefd[0] for reading from the pipe (input) and pipefd[1] for writing to the
pipe (output).
The parent process then calls fork() and possibly exec() . Now both the parent and the
child have the two file descriptors that were created by the pipe() function.
If a pipe is to be used for the parent process to send data to the child process, then the parent
process writes to the output file descriptor and the child process reads from the input file
descriptor.
Conversely, if a pipe is to be used for the child process to send data to the parent process,
then the child process writes to the output file descriptor and the parent process reads from
the input file descriptor.
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-pipes-and-fifo?module_item_id=24111779 1/5
Using a pipe to send data from the writer process to the reader process
#include <unistd.h>
int pipe(int pipefd[2]);
7_1_pipes.c @cs344
Note that a pipe has a limited capacity. On Linux this capacity is typically 65,536 bytes, i.e., 64 KiB.
If a pipe fills up and there is no more room, then write() will block until space becomes
available, which happens when the reader process reads data from the pipe.
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
The read() function reads data from the file descriptor fd . It attempts to read up to count
bytes into the buffer pointed to by buf . On success, read() returns the number of bytes it
read. This number can be less than count if, for example, fewer bytes are available to
read; such a short read does not indicate an error. read() returns –1 on error and returns 0 on
encountering end-of-file.
#include <string.h>
char *strstr(const char *haystack, const char *needle);
The strstr() function finds the first occurrence of the substring needle in the string haystack .
If the substring is found then the function returns a pointer to the beginning of the located
substring. If the substring is not found then it returns NULL.
Closing Pipes
What happens if one of the processes closes its end of the pipe?
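If the reader process closes the read end of the pipe, so that no descriptor for the read end
remains open, then when the writer process tries to write to the pipe, write() will return –1
with errno set to EPIPE. Additionally, the writer process will be sent the SIGPIPE signal,
which we described in the exploration related to signals and whose default action is to
terminate the process.
If the writer process closes the write end of the pipe, so that no descriptor for the write end
remains open, then the process reading from the pipe will first read all remaining data in the
pipe, after which read() will return 0 to indicate end-of-file.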
FIFO
FIFO stands for First-in, First-out. FIFOs are very similar to pipes with the crucial difference that a
FIFO has a name. This allows FIFOs to be used for communication between two processes on the
same machine even if these processes are not related to each other. Because FIFOs have
names and they are similar to pipes, they are also called named pipes.
A FIFO is essentially a persistent pipe, which is represented by a special file. We create a FIFO
by calling the function mkfifo() . Once a FIFO is created, any process can open it with
open() and then use it just like a pipe. By default, when opening a FIFO, open() called by the
first process will block. The first process will unblock once a second process calls open() on
the other end of the FIFO.
The mkfifo() function
#include <sys/types.h>
#include <sys/stat.h>
int mkfifo(const char *pathname, mode_t mode);
Example: FIFO
In the following example, there are two programs, a reader program and a writer program. The
programs use @@ as a terminal indicator, i.e., to indicate that the message is complete.
Note: In the repl, the file 7_1_reader.c is the reader program, and the file 7_1_writer.c is
the writer program.
7_1_fifo.c @cs344
A FIFO can also be created from the shell by using the command mkfifo:
$ mkfifo my_fifo
$ ls -l my_fifo
prw-rw----. 1 chaudhrn upg11000 0 May 25 06:13 my_fifo
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
Pipes and FIFOs are discussed in Chapter 44 of The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:02 PM Exploration: Semaphores: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Semaphores
Introduction
We have already studied the use of Pthreads mutexes to synchronize access of
multiple threads in a process to a shared resource. POSIX semaphores provide another
mechanism that allows threads to synchronize their actions. However, in addition to threads,
semaphores can also be used by multiple processes to synchronize their actions. Additionally,
semaphores support the implementation of more complex synchronization between processes
or threads that goes beyond mutual exclusion.
Basic Concepts
A semaphore is an integer whose value is always 0 or more. To synchronize actions,
semaphores provide two fundamental operations:
1. Decrement the value of the semaphore by one. This is called locking the semaphore. The
operation succeeds immediately only if the current value of the semaphore is greater than 0.
If the current value of the semaphore is 0, then the decrement operation blocks until the
value becomes greater than 0.
2. Increment the value of the semaphore by one. This is called unlocking the semaphore.
Semaphore API
Semaphores can be named or unnamed. In this exploration, we will only look at unnamed
semaphores, which are used through the functions sem_init(), sem_destroy(), sem_wait(),
and sem_post().
Let us look at these functions. Note that although semaphores can be used by multiple threads
in a process or by multiple processes, to keep the explanation simple we will often use
only the term "process", or "thread", instead of repeatedly writing the whole term “process or
thread.”
sem_init()
#include <semaphore.h>
int sem_init(sem_t *sem, int pshared, unsigned int value);
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-semaphores?module_item_id=24111780 1/5
The value of the argument pshared specifies whether this semaphore will be shared
between multiple threads of a process, or between multiple processes. The value 0 means
that it will be shared between threads of a process, and a nonzero value such as 1 means it
will be shared between multiple processes, in which case the semaphore must be located
in a region of shared memory.
The argument value specifies the initial value of the semaphore.
If the function is successful, it returns 0; otherwise it returns -1.
sem_destroy()
#include <semaphore.h>
int sem_destroy(sem_t *sem);
sem_wait()
#include <semaphore.h>
int sem_wait(sem_t *sem);
If the semaphore’s current value is greater than 0, the function decrements the value and
returns immediately.
If the semaphore’s current value is 0, then any process that calls this function will block.
When the value of the semaphore again becomes greater than 0, then one process among all
the processes that are blocked on this function will be unblocked. This unblocked process
will decrement the value of the semaphore and continue its execution.
The function returns 0 on success, and -1 on error.
sem_post()
#include <semaphore.h>
int sem_post(sem_t *sem);
Binary Semaphores
Semaphores with an initial value of 1 are called binary semaphores.
Let us consider that a semaphore binary_sem is initialized with the value 1 and is shared
between threads of a process.
If two threads T1 and T2 call sem_wait() at about the same time, one of them, say T1, will
succeed in decrementing the value to 0 while T2 will be blocked at this call because the
value is now 0.
T1 can then proceed with its execution and can access or modify the shared resource.
When T1 is done, it will call sem_post() . This will cause the value of binary_sem to
increment to 1.
Now T2 will be unblocked. It will decrement the value to 0 and proceed with its execution.
If T1 again calls sem_wait() , it will be blocked until T2 calls sem_post() .
Note how this use of a semaphore with an initial value of 1 to enforce mutual exclusion is similar
to how we used pthread_mutex_lock and pthread_mutex_unlock to implement mutual exclusion
using Pthreads API in the previous module.
Exercise
The following program implements an incorrect counter because of a race condition between
the two threads when updating the shared variable counter . Modify the function
perform_work and add calls to sem_wait() and sem_post() to synchronize access to counter
so that the program always produces the correct result.
7_2_sem_exercise.c @cs344
Answer
Answer (https://repl.it/@cs344/72semthreadsc#main.c)
Counting Semaphores
A semaphore with an initial value greater than 1 is called a counting semaphore. Counting
semaphores are used where we want to allow more than 1 process to simultaneously access a
pool of shared resources, but still impose an upper limit on how many processes can
simultaneously access this pool of shared resources. This is done by setting the initial value of
the counting semaphore to the number of resources that can be simultaneously used.
We create a counting semaphore and set its initial value to POOL_SIZE, i.e., to the number
of resources in the pool.
Each thread calls sem_wait() on the semaphore before using the resource. If any resources
are available for use, the value of the semaphore will be decremented and the thread can
proceed with using the resource by calling use_resource() .
When the number of threads that are currently using a resource equals POOL_SIZE, then
the value of the semaphore will be 0. At this time, any thread which calls sem_wait() will be
blocked.
When a thread gets done using the resource, i.e., it returns from use_resource() it calls
sem_post() which will increment the value of the semaphore. If there are threads currently
blocked on the semaphore, one of them will be unblocked, the semaphore value will be
decremented, and the thread can proceed with calling use_resource() .
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
6/8/24, 2:02 PM Exploration: Network Programming: OPERATING SYSTEMS I (CS_374_400_S2024)
As a specific example, consider that we open a browser on our home computer to view the
home page of Oregon State University. The client in this case is our browser, and the server is
the Web Server set up by our university’s IT team. Leaving out details (such as caching), at a
high level the interaction between the browser and Oregon State’s web server corresponds to
the following steps:
In order for the client/server interaction to succeed, the client and the server need to follow a
common protocol, i.e., a standard way to communicate. In this module, we will primarily look at
network programming for the Transmission Control Protocol/Internet Protocol (TCP/IP).
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-network-programming?module_item_id=24111784 1/6
The Application Layer is where an application creates data that needs to be sent over the
network. The application layer uses the services of the transport layer to send the data from one
host machine to the other.
The Transport Layer is responsible for controlling how data is sent from one host all the
way to the other, irrespective of the number of nodes, hops, or intermediate networks the
data passes through. The transport layer does not interpret the bytes in the data it is
communicating. It is up to the application layer protocol to interpret the data. The two
protocols used at this layer are as follows:
1. Transmission Control Protocol (TCP):
TCP is the most commonly used protocol for transferring information across a
network.
It is a connection-oriented protocol that provides a byte-stream interface, much like
stdio.
TCP sends data across the network in units called segments (often loosely called
packets). TCP delivers the data to the receiving application reliably and in the order
in which it was sent, retransmitting lost segments as needed.
As an example, the application layer protocol HTTP uses TCP as its transport layer
protocol.
2. User Datagram Protocol (UDP):
UDP is a connectionless protocol that does not guarantee delivery of data. UDP
breaks up data into units called datagrams.
UDP does not guarantee that datagrams will be delivered in order, nor does it
guarantee that datagrams will not be dropped by the network.
However, UDP has lower overhead than TCP, which can make it faster when
reliable, ordered delivery is not required.
As an example, the application layer protocol DNS (Domain Name System) primarily uses
UDP as its transport layer protocol.
The Network Layer is responsible for addressing and organization of a particular set of
connected hosts. The Internet Protocol (IP) is the protocol used at this layer in TCP/IP. The
Internet Protocol provides support for addressing and naming of the host machines, and for
routing from host to host and across networks.
The Link Layer is responsible for getting data from just one node to the next neighboring
node. Protocols at this layer include Ethernet and 802.11, also known as Wi-Fi.
The Physical Layer is responsible for physically moving the data between two hosts. It is the
actual hardware, which may be copper wires or radios that enable wireless devices to
communicate.
For a programmer, the Internet is a collection of host machines where a process on one host
can communicate with a process on another host using the TCP/IP protocol suite. A process
running on one host specifies which host it needs to send data to by using the addressing and
naming scheme supported by TCP/IP. TCP/IP provides two fundamental facilities for addressing
and naming:
1. IP Address:
In IP Version 4 (IPv4), the set of hosts is mapped to a set of 32-bit addresses. Therefore,
IPv4 has 2^32 or 4,294,967,296 unique addresses.
Each interface of a host has its own IP address.
E.g., if a machine has wired Ethernet as well as wireless 802.11, then these two
interfaces will have their own IP addresses.
Thus, a host may have multiple IP addresses.
To keep up with the extremely rapid growth in the demand for addresses, a newer
version of IP, IP Version 6 (IPv6), uses a set of 128-bit addresses.
However, in our examples, we will use the 32-bit addresses provided by IPv4.
2. Internet Domain Name:
IP addresses are long numbers which can be difficult for humans to remember.
The Internet defines domain names, which are more human friendly, and supports the
mapping of a set of domain names to a set of IP addresses.
Let us now look at IP Addresses and Domain Names in more detail including the API Unix
provides to use them.
IP Addresses
IP addresses in Unix are structures of type in_addr defined as follows:
#include <netinet/in.h>
struct in_addr {
    unsigned int s_addr; /* 32-bit IPv4 address, stored in network byte order */
};
Example: Here is the same IP address represented in different number systems and in the
dotted-decimal notation:
Binary: 00100011.10100001.01100010.00110110
Decimal: 597778998
Hexadecimal: 0x23.0xa1.0x62.0x36
Dotted-Decimal Notation: 35.161.98.54
Unix provides functions that can convert between IP addresses and dotted-decimal notation in
both directions. The function inet_ntoa() (https://man7.org/linux/man-
The mapping between IP addresses and domain names is maintained in a distributed database
called the Domain Name System (DNS). Entries in DNS can be looked up both by domain names
and by IP addresses.
#include <netdb.h>
struct hostent {
    char *h_name;       /* official name of host */
    char **h_aliases;   /* an array of alternate names, terminated by a null pointer */
    int h_addrtype;     /* host address type. One of AF_INET or AF_INET6 at present. */
    int h_length;       /* length of address */
    char **h_addr_list; /* an array of addresses, terminated by a null pointer */
};
Example: In the following example, we use the function gethostbyname() to look up the DNS
entry for the given domain name. We print out the list of all aliases of this domain name, as well
as all the IP addresses corresponding to this domain name.
8_1_dns_by_name @cs344
Systems that store the most significant byte of a word at the smallest memory address and
the least significant byte at the largest address are called big-endian.
Systems that store the least significant byte of a word at the smallest memory address and
the most significant byte at the largest address are called little-endian.
TCP/IP uses big-endianness which means that the most significant byte is transmitted first. This
is termed network byte order and applies to all data items sent in a header across the network.
This means that in the struct in_addr the field s_addr stores the IP address in big-endian order.
Unix provides a number of functions (https://man7.org/linux/man-pages/man3/htonl.3.html)
to convert from host byte order to network byte order and vice versa. These
include:
htonl() to convert a 32-bit unsigned integer from the host byte order to the network byte order
htons() to convert a 16-bit unsigned integer from the host byte order to the network byte
order
ntohl() to convert a 32-bit unsigned integer from the network byte order to the host byte order
ntohs() to convert a 16-bit unsigned integer from the network byte order to the host byte
order
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
6/8/24, 2:02 PM Exploration: Sockets: OPERATING SYSTEMS I (CS_374_400_S2024)
Exploration: Sockets
Introduction
Unix provides the sockets API for writing applications that communicate over the
network. A socket is the endpoint of a communication link between two processes. A client and
a server communicate via a pair of sockets, one socket is at the client process and the other is
at the server process. The sockets API presents the socket for use as an open file with a
(socket) descriptor. Sockets can be used for communication between processes on the same
machine as well as for processes that are running on different machines.
Creating a Socket
To create a socket, we use the function socket() (https://man7.org/linux/man-
pages/man2/socket.2.html) .
#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
The argument domain specifies a communication domain which determines the protocol
suite which will be used for communication. We will pass the value AF_INET for this
argument which corresponds to IPv4. To specify IPv6, this argument’s value would be
AF_INET6 .
The argument type specifies the type of the socket. We will use the argument SOCK_STREAM
which is the type for TCP sockets. To create UDP sockets, this argument’s value would be
SOCK_DGRAM .
The argument protocol is useful when multiple protocols exist for a particular socket type
within a given protocol suite. We will pass the value 0 for this argument.
The function socket() creates an endpoint for communication and returns a socket descriptor
that refers to that endpoint. On error, the function returns the value –1. The socket descriptor
returned by socket() is added to the open file descriptor table, which includes entries for all
open files, including stdin , stdout , and stderr .
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-sockets?module_item_id=24111785 1/4
Open sockets are included in the file descriptor table: The file descriptor table for a
process includes descriptors for all open sockets in addition to descriptors for all open files,
such as stdin, stdout and stderr.
However, at this point, this socket is not yet ready to be used for reading and writing. Further
calls need to be made to completely open a socket for reading and writing, which we will study
in the next exploration. Here we look at socket addresses.
Address of a Socket
An IP address identifies an interface on a host machine. But many different processes can be
running on a host. For this reason, an IP address by itself is insufficient to uniquely identify a
process that has opened a socket for communication on the network. In order to uniquely
identify a specific process on a machine, a port number is used in addition to the IP address. A
port number is a 16-bit unsigned integer. Thus, the value of a port number ranges from 0 to
65,535.
Port numbers in the range 0–1023 are called well-known ports or system ports. These port
numbers are reserved for applications that implement common services, so that servers for that
application can receive client requests at these ports. Examples of well-known ports include
port 80 for HTTP servers and port 22 for SSH. (Port 8080, a common alternative for HTTP
servers, lies outside this range.)
Ports in the range 49152–65535 are called ephemeral ports or dynamic or private ports. These
port numbers cannot be reserved. A client socket is allocated an ephemeral port number by the
OS.
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
struct sockaddr_in {
sa_family_t sin_family; /* Address family: AF_INET */
in_port_t sin_port; /* Port number in network byte order */
struct in_addr sin_addr; /* IP address */
};
Note that the values of the address and the port are stored in network byte order. We will need
to call appropriate conversion functions, such as htons() , ntohs() , etc., to convert these values
between network byte order and host byte order.
Example
In the following example, we create a socket address corresponding to the hostname and the
port number passed in as arguments to the program.
8_2_socketaddress.c @cs344
Exercise
Add code to the above program to print the port number stored in the variable
address .
Answer
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
A very good free tutorial on network programming using sockets is provided in Beej's Guide
to Network Programming: Using Internet Sockets (http://beej.us/guide/bgnet/) .
Sockets are discussed in Chapter 59 of The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:02 PM Exploration: Client-Server Communication Via Sockets: OPERATING SYSTEMS I (CS_374_400_S2024)
The server program creates a listening socket, binds it to a port, and starts listening for
connections.
The server program then starts a loop in which it accepts connections using the accept()
system call, which creates a connected socket for communication with a client.
The client program creates a socket and then connects to the server socket using the
connect() system call.
Once the connection is established, the client sends data to the server using the send()
system call and receives data from the server using the recv() system call.
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-client-server-communication-via-sockets?module_item_id=24111786 1/8
The server receives data from the client using the recv() system call and sends data to the
client using the send() system call.
When communication with this client is complete, the server closes the connected socket
that was being used for communication with this client.
In this exploration, we will first discuss the client program and then the server program.
Note: You may not be able to run the client and the server programs in this exploration on
the repl.it website (https://repl.it).
Client Program
To communicate with a server process over a socket, the client program does the following:
1. Creates a socket endpoint for the client by calling the function socket().
2. Sets up a socket address structure sockaddr_in with the IP address and the port number for
the server socket.
3. Calls the function connect() to connect the client socket to the server socket. A successful
call to connect() makes the client socket ready for reading and writing.
4. Uses send() and recv(), or write() and read(), to send data to the server and receive
data from the server over the sockets.
5. Closes the client socket when done.
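Steps 1 to 3 above can be sketched as follows. This is a minimal illustration, not the course's client.c: the helper name connect_to_server is hypothetical, and it assumes the server is given as a dotted IPv4 address such as 127.0.0.1 (a hostname would first need to be resolved, for example with getaddrinfo()).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: connects to an IPv4 server and returns the
   connected socket, or -1 on failure. */
int connect_to_server(const char *ip, int port) {
    /* Step 1: create the client socket (TCP over IPv4). */
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* Step 2: fill in the server's IP address and port number. */
    struct sockaddr_in server;
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &server.sin_addr) != 1) {
        close(sockfd);
        return -1;
    }

    /* Step 3: connect the client socket to the server socket. */
    if (connect(sockfd, (struct sockaddr *)&server, sizeof(server)) < 0) {
        close(sockfd);
        return -1;
    }

    /* Steps 4 and 5 (send/recv, then close) are left to the caller. */
    return sockfd;
}
```

The caller would then use send() and recv() on the returned descriptor and call close() on it when done.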
Example
Here is an example client program that prompts the user for input, sends this input to the server,
and then prints the message received from the server.
Note: You can view the list of files in the repl by using the "Files" icon on the left side. The
file with the client code is named client.c.
8_3_client.c @cs344
client
The server must already be running before the client program is started. The client program
requires two arguments: the hostname where the server is running and the port where the
server process is listening.
E.g., if the server is running on the same machine as the client and the port number is 49123,
we run the client as follows:
./client localhost 49123
#include <sys/types.h>
#include <sys/socket.h>
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
The connect() system call connects the client socket referred to by the file descriptor sockfd to
the server socket whose address is specified by addr. The addrlen argument specifies the
size of the structure addr. If the connection succeeds, the function returns 0; otherwise it
returns -1.
#include <sys/types.h>
#include <sys/socket.h>
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
This function sends up to len bytes of data from the buffer buf over the socket sockfd. We are
going to set the flags argument to 0. Note that with flags set to 0, the send() system call is
equivalent to the write() system call.
If send() is successful, it returns the number of bytes sent. This number may be less than the
number of bytes we specified in the len argument because the kernel could not send the
data in one chunk. It is therefore important to check how many bytes were actually sent by
send() and to handle partial sends by calling send() repeatedly until all the remaining bytes
have been sent.
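The partial-send loop described above can be sketched as follows. The helper name send_all is hypothetical, and this minimal version treats any error from send() as fatal (for example, it does not retry after an interrupted call).

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keeps calling send() until all len bytes of
   buf have been sent. Returns 0 on success, -1 on error. */
int send_all(int sockfd, const void *buf, size_t len) {
    const char *p = buf;
    size_t sent = 0;

    while (sent < len) {
        /* Ask the kernel to send whatever is still unsent. */
        ssize_t n = send(sockfd, p + sent, len - sent, 0);
        if (n < 0)
            return -1;       /* errno describes the failure */
        sent += (size_t)n;   /* partial send: loop for the rest */
    }
    return 0;
}
```

A caller would use send_all(sockfd, message, strlen(message)) in place of a single send() call.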
#include <sys/types.h>
#include <sys/socket.h>
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
The recv() system call reads up to len bytes into the buffer buf from the socket sockfd. We are
going to set the flags argument to 0. Note that with flags set to 0, the recv() system call is
equivalent to the read() system call.
By default, the recv() function will block if the connection is open but no data is available. Data
may arrive in odd-sized chunks, and recv() returns only the data that has already arrived, up
to len bytes. We may therefore need to call the recv() function multiple times to read the
complete message sent on the socket, similar to how we need to handle partial sends on the
send side of the socket.
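When the message length is known in advance, the repeated calls to recv() can be sketched as follows. The helper name recv_all is hypothetical. The sketch relies on the fact that recv() returns 0 once the peer has closed the connection and no more data remains.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keeps calling recv() until len bytes have been
   read into buf. Returns the number of bytes read, which is less than
   len only if the peer closed the connection early, or -1 on error. */
ssize_t recv_all(int sockfd, void *buf, size_t len) {
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(sockfd, p + got, len - got, 0);
        if (n < 0)
            return -1;   /* error */
        if (n == 0)
            break;       /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```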
If the size of the data being sent is not known or can vary, we can use special control codes to
mark the termination of the message. This is similar to how we use @@ as the terminating
characters when using pipes for IPC in a previous module.
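Reading until a terminator can be sketched as follows. The helper name recv_until is hypothetical, and the sketch makes simplifying assumptions: the message is text without embedded NUL bytes, it fits in the caller's buffer, and any bytes that arrive after the terminator are discarded.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: reads from sockfd until the terminator string
   term (e.g. "@@") appears. On success, buf holds the message as a C
   string without the terminator and the message length is returned.
   Returns -1 on error, early close, or if buf fills up first. */
ssize_t recv_until(int sockfd, char *buf, size_t cap, const char *term) {
    size_t got = 0;

    if (cap == 0)
        return -1;
    buf[0] = '\0';

    while (got + 1 < cap) {
        /* Leave one byte free for the terminating NUL. */
        ssize_t n = recv(sockfd, buf + got, cap - 1 - got, 0);
        if (n <= 0)
            return -1;          /* error, or closed before terminator */
        got += (size_t)n;
        buf[got] = '\0';        /* keep buf a valid string for strstr() */

        char *end = strstr(buf, term);
        if (end != NULL) {
            *end = '\0';        /* cut the terminator off */
            return (ssize_t)(end - buf);
        }
    }
    return -1;                  /* buffer full, no terminator seen */
}
```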
Note: It is possible to use the fcntl() system call to make the socket non-blocking, so that
recv() does not block when there is no data.
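A minimal sketch of that idea, assuming the standard O_NONBLOCK file status flag. The helper names set_nonblocking and recv_would_block are hypothetical. On a non-blocking socket with no data available, recv() returns -1 and sets errno to EAGAIN or EWOULDBLOCK instead of blocking.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: switches a descriptor to non-blocking mode.
   Returns 0 on success, -1 on error. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);   /* read the current flags */
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: calls recv() once and stores its result in *n.
   Returns 1 if the call failed only because no data was available
   yet, 0 otherwise. */
int recv_would_block(int sockfd, void *buf, size_t len, ssize_t *n) {
    *n = recv(sockfd, buf, len, 0);
    return *n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```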
Server Program
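To service client requests over a socket connection, a server process does the following:
1. Creates a socket endpoint for the server by calling the function socket().
2. Sets up a socket address structure sockaddr_in with the port number at which the server
socket will listen for connections.
3. Calls bind() to associate the socket with the socket address.
4. Calls listen() on this socket to start listening for client connections. This listening socket
is then used to accept incoming connections.
5. Calls accept() in a loop. Each successful call creates a connected socket for communication
with one client.
6. Uses recv() and send() on the connected socket to receive data from the client and send
data to the client.
7. Closes the connected socket when communication with this client is complete.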
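The socket(), bind(), and listen() steps can be sketched as follows. This is an illustration under stated assumptions rather than the course's server.c: the helper name make_listening_socket is hypothetical, and the server is assumed to accept IPv4 connections on any local interface (INADDR_ANY).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket, binds it to the given
   port on all local interfaces, and starts listening.
   Returns the listening socket, or -1 on error. */
int make_listening_socket(int port, int backlog) {
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;

    /* Fill in the address the server socket will listen on. */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    /* Associate the socket with the address, then start listening. */
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, backlog) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}
```

For example, make_listening_socket(49123, 5) would listen on port 49123 with room for 5 pending connection requests.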
Example
Here is an example server program that accepts connection requests from clients, receives the
message sent by the client, prints out this message, and sends back a response to the client
that the message was received.
Note: You can view the list of files in the repl by using the "Files" icon on the left side. The
file with the server code is named server.c.
8_3_server.c @cs344
main.c
This server program requires one argument: the port where the server will listen for
connections.
./server $port
e.g.,
./server 49123
Note: Directions to run the client program and the server program from the same shell:
Add & at the end of the server startup command, i.e., ./server $port &
This will start the server process in the background.
Now we can start the client program from that shell as well.
#include <sys/types.h>
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
A socket created by calling the function socket() does not have an address assigned to it. We
use the bind() system call to assign the address addr to the socket sockfd. On success, this
function returns 0; otherwise it returns -1.
#include <sys/types.h>
#include <sys/socket.h>
int listen(int sockfd, int backlog);
The listen() system call marks the socket sockfd as a listening socket that will be used to
accept incoming connections using the accept() system call. The argument backlog specifies
the maximum number of pending connection requests for this socket that will be queued up.
Once the queue reaches this number, further client connection requests may receive an error
until the queue gets smaller than this number. On success, this function returns 0; otherwise it
returns -1.
#include <sys/types.h>
#include <sys/socket.h>
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
The accept() system call takes the first connection request off the listen queue for the
socket. If the queue is empty and the socket is marked as a blocking socket, then the accept()
call blocks the process until a connection request arrives. The accept() system call creates a
new connected socket for the connection established for this client and returns the file
descriptor of this connected socket. The original listening socket sockfd remains unaffected by
accept() and continues to be used to accept new client connection requests. On error,
accept() returns -1.
Receive the Data Using recv() & Send the Data Using send()
The system calls to receive and send the data in the server program are the same as those
used in the client program.
When the client request has been serviced, the server program closes the connected socket
that had been created for this client request. Note that the listening socket remains open
through the lifetime of the server.
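Servicing one client with accept(), recv(), send(), and close() can be sketched as follows. The helper name serve_one_client and the reply text are hypothetical, and the sketch assumes the whole message arrives in a single recv() call, which a real server should not rely on.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: accepts one client on listenfd, reads one
   message, sends a fixed reply, and closes the connected socket.
   Returns the number of bytes received, or -1 on error. */
ssize_t serve_one_client(int listenfd) {
    struct sockaddr_in client;
    socklen_t client_len = sizeof(client);

    /* Blocks until a connection request is available. */
    int connfd = accept(listenfd, (struct sockaddr *)&client, &client_len);
    if (connfd < 0)
        return -1;

    char buf[256];
    ssize_t n = recv(connfd, buf, sizeof(buf) - 1, 0);
    if (n > 0) {
        buf[n] = '\0';
        printf("SERVER: client on port %d sent \"%s\"\n",
               ntohs(client.sin_port), buf);
        const char *reply = "message received";
        /* Partial sends are not handled in this sketch. */
        send(connfd, reply, strlen(reply), 0);
    }

    /* Close only the connected socket; listenfd stays open. */
    close(connfd);
    return n;
}
```

An iterative server would call serve_one_client(listenfd) in a loop for as long as it runs.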
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
A very good free tutorial on network programming using sockets is provided in Beej's Guide
to Network Programming: Using Internet Sockets (http://beej.us/guide/bgnet/). The
tutorial includes sample code (http://beej.us/guide/bgnet/html/#sendall) to handle partial
sends.
Sockets are discussed in Chapter 59 of The Linux Programming Interface.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:02 PM Exploration: Server Design: OPERATING SYSTEMS I (CS_374_400_S2024)
Iterative Servers
We implemented a server in the previous exploration. The server in that example picks up the
client connection at the head of the queue of waiting client requests, services the request of this
client, and then picks up the next client. This server handles only one client at a time, while any
additional clients must wait for all previous requests to complete. Such servers are called
iterative servers.
Iterative servers are easy to design, implement, and maintain. However, if a service receives a
large number of concurrent client requests, then an iterative server for that service will provide
poor response time to the clients. Furthermore, an iterative server will not maximize CPU
utilization: it blocks whenever it makes an I/O request, so pending client connections go
unserviced even though the CPU is free to work on them until the I/O request completes.
Concurrent Servers
Concurrent servers service, or at least appear to service, multiple client connections at the
same time. Servers can provide concurrency in two ways:
1. Real Concurrency: A server that uses multiple threads or multiple processes to concurrently
service requests from multiple clients.
2. Apparent Concurrency: A server running a single thread of execution that gives the
appearance of concurrency by switching to servicing another client connection whenever an
I/O request blocks servicing one client connection.
Real Concurrency
A server can provide real concurrency by using either multiple threads or multiple processes to
service client connections. Since performance can deteriorate after too many concurrent
connections, such servers put an upper limit on concurrent connections.
Concurrent servers maximize CPU utilization, provide good response time and high throughput.
However, such servers are harder to design, implement, and maintain. There are four different
architectures for such servers, depending on whether they employ threads or processes, and
https://canvas.oregonstate.edu/courses/1971495/pages/exploration-server-design?module_item_id=24111787 1/5
whether they use a pool of available processes/threads or create a process/thread for each
client connection.
Process per Client Connection
Advantages: The server design and implementation is very simple. There is minimal shared
state between processes that we need to worry about.
Disadvantages: Process creation via fork() is slow. Context switching between processes
is also slow, although its cost is minor compared to fork().
Process Pool
Advantages: Since servicing a request does not require the overhead of forking a new
process, the response to client requests is rapid as long as there is an idle process
available.
Disadvantages: Managing the pool of processes can be complex. There is still the overhead
of context switching between processes.
Thread per Client Connection
Advantages: Creation of a new thread is a lot faster than creation of a new process. Context
switching between threads is also a lot faster than context switching between processes.
Disadvantages: Implementation of such a server is complex because code must be
thread-safe and must not inadvertently share data across different client requests.
Thread Pool
Advantages: Since servicing a request does not require creation of a new thread, the
response to client requests is rapid as long as there is an idle thread available.
Disadvantages: Implementing the server is complex because of the need for thread-safety,
guarding against inadvertent data sharing, and managing the thread pool.
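The design that creates a new process for each client connection can be sketched as follows. This is a minimal illustration with hypothetical names (handle_client, serve_in_child); the per-client work is assumed to be a simple echo of one message, and connfd is assumed to come from accept().

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-client work: echo one message back. */
static void handle_client(int connfd) {
    char buf[256];
    ssize_t n = recv(connfd, buf, sizeof(buf), 0);
    if (n > 0)
        send(connfd, buf, (size_t)n, 0);
}

/* Hands an already accepted connection to a new child process.
   Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t serve_in_child(int listenfd, int connfd) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: it never accepts connections, so it drops its copy
           of the listening socket, serves the client, and exits. */
        if (listenfd >= 0)
            close(listenfd);
        handle_client(connfd);
        close(connfd);
        _exit(0);
    }
    /* Parent (or failed fork): drop its copy of the connected socket
       and go back to accepting new connections. */
    close(connfd);
    return pid;
}
```

The parent would call this right after each accept(). It also needs to reap finished children, for example with waitpid(), so that they do not remain as zombie processes.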
Apparent Concurrency
A server can service multiple client requests concurrently by using I/O multiplexing. In this
design, when an I/O request may block, the server switches to servicing another client. At a
given time, there will be only one active process/thread of execution while other
processes/threads are blocked on I/O. While this approach increases CPU utilization and
throughput, it adds complexity because the server must track connections, detect calls that
would block, and choose what to run next.
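#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);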
The three parameters readfds, writefds, and exceptfds are pointers to sets of type fd_set. An
fd_set is a bit mask in which each bit refers to one file descriptor. Bit 0 is file descriptor 0,
bit 1 is file descriptor 1, and so on.
nfds: select() will check file descriptors with values from 0 up to nfds - 1. nfds should be set
to the highest-numbered file descriptor in any of the three sets, plus 1.
readfds: select() will watch this set to see if reading from any file descriptor in this set
will not block. This will be true if there is data available for reading or if the file descriptor is
already at end-of-file.
writefds: select() will watch this set to see if writing to any file descriptor in this set will
not block, i.e., if space is available for a write. However, a large write may still block.
exceptfds: select() will watch this set for exceptional conditions (we will not study this topic).
The select() call blocks until one of the following three conditions becomes true:
A file descriptor becomes ready,
the call is interrupted by a signal handler,
the timeout expires.
The select() call returns:
A positive integer, if any file descriptors are ready. This integer equals the number of ready
file descriptors,
0 if the timeout expired,
-1 if an error occurred.
The API provides the following macros to manipulate these bit masks:
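The standard macros for this are FD_ZERO, FD_SET, FD_CLR, and FD_ISSET. A minimal sketch of how they behave, wrapped in hypothetical helpers (make_set, drop_from_set, in_set) so that each macro's effect is visible:

```c
#include <sys/select.h>

/* Builds a set containing exactly the two given descriptors. */
void make_set(fd_set *set, int fd_a, int fd_b) {
    FD_ZERO(set);        /* clear every bit in the set */
    FD_SET(fd_a, set);   /* turn on the bit for fd_a */
    FD_SET(fd_b, set);   /* turn on the bit for fd_b */
}

/* Removes one descriptor from the set. */
void drop_from_set(int fd, fd_set *set) {
    FD_CLR(fd, set);     /* turn off the bit for fd */
}

/* Returns 1 if fd is in the set, 0 otherwise. */
int in_set(int fd, fd_set *set) {
    return FD_ISSET(fd, set) ? 1 : 0;
}
```

After select() returns, FD_ISSET is the macro used to check which of the watched descriptors are ready, because select() modifies the sets in place.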
Example
The following program uses the select() function to watch stdin for input.
8_4_select.c @cs344
Here are some comments on the arguments we pass to select() in our program:
We are only watching stdin, i.e., file descriptor 0. We set the value of nfds to 0 + 1, i.e., 1.
We are not interested in watching any file descriptors for writing or for exceptional
conditions, so we set the corresponding arguments to NULL.
We set the timeout value to 25 seconds. If no data is available in stdin for 25 seconds, the
select() function will time out.
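The arguments described above can be sketched in code as follows. This is not the course's 8_4_select.c; the helper name wait_for_input is hypothetical, and it generalizes the example to any single file descriptor and timeout.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits up to `seconds` for fd to become
   readable. Returns 1 if fd is ready, 0 on timeout, -1 on error. */
int wait_for_input(int fd, int seconds) {
    /* Watch only this one descriptor for reading. */
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval timeout;
    timeout.tv_sec = seconds;
    timeout.tv_usec = 0;

    /* nfds is the highest descriptor we watch, plus 1. The write and
       exception sets are NULL because we do not watch for those. */
    int ready = select(fd + 1, &readfds, NULL, NULL, &timeout);
    if (ready < 0)
        return -1;
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Watching stdin for 25 seconds, as in the example program, corresponds to wait_for_input(STDIN_FILENO, 25).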
We can run and test this program by clicking on “run” and then using the following three
commands from the command line:
mkfifo myfifo
./main < myfifo &
echo "text" > myfifo
These commands hook the output of an echo command to the input of our program through a
FIFO.
This causes the select call in our program to detect that stdin has data and our program is
unblocked.
Additional Resources
Here are some references to learn more about the topics we discussed in this exploration.
Server design is discussed in Chapter 60 of The Linux Programming Interface, while I/O
multiplexing is discussed in Chapter 63.
Kerrisk, M. (2010). The Linux programming interface : a Linux and UNIX system programming handbook. San
Francisco: No Starch Press.
6/8/24, 2:03 PM Review - Module 5: OPERATING SYSTEMS I (CS_374_400_S2024)
Review - Module 5
Key Take-Aways
At this point, you should be able to answer all of the following questions.
https://canvas.oregonstate.edu/courses/1971495/pages/review-module-5-2?module_item_id=24445919 1/1