CS2105 Assignment 1
School of Computing
CS2105 Assignment 1 Semester 2 10/11
Deadline
6 March, 2011 (Sunday), 11:59pm.
Objective
In this assignment, you will learn how a Web application works. In particular, you will see how
a Web client communicates with a Web-based application using Common Gateway Interface
(CGI) through a Web server.
Pre-requisite
You are expected to know the format of HTTP request and response messages, socket program-
ming with TCP, and how to write a simple Web server that supports GET.
The assignment will be done in a controlled UNIX environment. Familiarity with the UNIX
environment (how to copy/move/delete/edit files, how to compile and run programs, etc.) is
assumed.
You should use Firefox version 3.5 or above as your Web client for this assignment. You may
additionally choose to support other browsers. The “Live HTTP Headers” add-on for Firefox
might be useful for debugging.
Administrative Matters
This is a 2-person team assignment. You should register your team members via email to Ms.
Zhu Minhui [email protected] with the following exact email subject before 11:59pm, 11 February
2011 (failure to follow the format will result in a 1-point penalty for both team members):
CS2105 A1 Team: Matric Number 1 Matric Number 2
For instance, CS2105 A1 Team: U0987654H U0888888V
Only one email is needed per team.
An account has been set up for you on host cs2105-z.comp.nus.edu.sg. To access your
account, ssh to the host and log in using your SoC UNIX username and password from a SoC host
or through SoC VPN.
If you have any questions or encounter any problems with the steps discussed in the assign-
ment, please contact the teaching staff through CS2105’s blog.
Background
The earliest generation of the World Wide Web consisted only of static Web pages. Web clients
sent HTTP requests to Web servers, which then read the requested Web objects (HTML files,
images) and returned them to the Web clients in an HTTP response message.
Soon, people realized that the Web could be much more powerful: by allowing the Web server
to interface with back-end applications, users could use the Web to access and store information
on remote applications the same way they had been doing with applications on their local
machines. This new way of interacting with remote applications enabled Web-based applications
(eBay, Amazon, IVLE, CORS, etc.). The Web server can be viewed as a gateway to these back-end
applications. A standard called the Common Gateway Interface, or CGI, was established to
define how the client, the server, and the back-end applications (also termed CGI scripts)
communicate (see Figure 1).
Figure 1: Communication between Web client, server, and back-end application.
The Web server’s responsibilities are to (i) receive the HTTP request from the client, (ii)
decide which CGI script should handle the client’s request based on the URI supplied by the client,
(iii) transform the client request into a CGI request, (iv) execute the CGI script, and (v)
convert the CGI response into a response for the client.
Scripting languages such as Perl were a popular choice for writing CGI scripts. Several
frameworks, such as ASP, JSP, PHP, ColdFusion, and Ruby on Rails, were later developed to provide
rapid Web application development with better support for templating HTML code and database
access. Modern Web servers commonly integrate interpreters for these languages into the Web
server itself to improve performance.
Your Tasks
In this assignment, you will modify the simple Web server given in class to interface between
Web clients and a given CGI script. To simplify the assignment, you are not expected to fully
implement the HTTP and CGI standards as specified in the RFCs, but only enough for the given
CGI scripts to function. The details of what your Web server should support, and some Java
tips on how you can do it, are given below.
You only need to support HTTP 1.0 for this assignment.
http://www.google.com.sg/search?hl=en&q=CS2105+Networks&meta=
In the example above, search refers to the script that the server should run. The arguments
to the script (e.g., what to search for) are given in the format key=value. The ampersand &
separates the key-value pairs, and the question mark ? separates the name of the script from
the arguments.
Note that a value can be empty, and the arguments may need to be encoded due to restric-
tions on URL format. For instance, a plus “+” in the URL above represents a white space.
This encoding is called URL encoding. You are not required to decode URL encoded strings in
this assignment.
The section of a script URL after the question mark is called the query string. The query string
is one way a Web client can send information to the server.
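As an illustration of the URL format described above, the following is a minimal sketch (not part of the given Web server) of how a request URI could be split into the script name and the query string. The class name RequestTarget and its members are hypothetical. In line with the note above, values are left URL-encoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits a request URI such as "/search?hl=en&q=a+b&meta=". */
final class RequestTarget {
    final String scriptPath;   // everything before the first '?'
    final String queryString;  // everything after the first '?', or "" if there is none

    RequestTarget(String uri) {
        int question = uri.indexOf('?');
        if (question < 0) {
            scriptPath = uri;
            queryString = "";
        } else {
            scriptPath = uri.substring(0, question);
            queryString = uri.substring(question + 1);
        }
    }

    /** Splits the query string into key/value pairs without URL-decoding them. */
    Map<String, String> arguments() {
        Map<String, String> args = new LinkedHashMap<>();
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue; // no arguments at all, or a stray '&'
            }
            int equals = pair.indexOf('=');
            if (equals < 0) {
                args.put(pair, ""); // a key with no '=' is treated as an empty value
            } else {
                args.put(pair.substring(0, equals), pair.substring(equals + 1));
            }
        }
        return args;
    }
}
```

For this assignment the Web server only needs the unparsed query string; the arguments() method is included just to show how the key=value pairs are laid out.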
The other HTTP method we are interested in is the HTTP POST method. There are two forms
of POST requests – differentiated by their “content type”. To determine the content type of a
POST request, you should look at the “Content-Type:” line of the HTTP request header.
The first content type is application/x-www-form-urlencoded. This type of POST re-
quest is not much different from GET request. The only difference is that the query string,
instead of becoming part of the URL, is stored in the body of the HTTP request.
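To find the content type and the size of the body, the server has to look at the request header lines. The sketch below is one possible way to do that, assuming the header lines have already been read into a list (one "Name: value" string per line). The class HeaderFields and its methods are hypothetical names, not part of the given code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for looking up fields in an HTTP request header. */
final class HeaderFields {
    /**
     * Parses "Name: value" lines into a map. Names are stored in lower case
     * because HTTP header names are case-insensitive.
     */
    static Map<String, String> parse(List<String> headerLines) {
        Map<String, String> fields = new HashMap<>();
        for (String line : headerLines) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a header field (e.g. the request line)
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            fields.put(name, value);
        }
        return fields;
    }

    /** Returns the Content-Length value, or 0 if it is missing or malformed. */
    static int contentLength(Map<String, String> fields) {
        String value = fields.get("content-length");
        if (value == null) {
            return 0;
        }
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            return 0;
        }
    }
}
```

Treating a missing or malformed Content-Length as 0 is a simplification made for this sketch; a stricter server might reject such a POST request instead.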
Using query strings to pass information from the client to the server has its limitations. An
example is when the client needs to upload a file to the Web server (e.g., uploading a file to
IVLE Workbins). It is not feasible to encode the whole file inside the query string. For this
reason, there exists a second way of encoding information in a POST request.
The second content type is called multipart/form-data. Instead of using key-value pairs
to encode information, the body of the HTTP POST request looks something like this:
--LKJhl876x
Content-Disposition: form-data; name="q"
CS2105 Networks
--LKJhl876x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain
Since you are not required to parse the HTTP POST body, I will not go into details about
the format above.
query strings, uploaded files) to the script for processing. This section explains how your Web
server can do that.
There are two ways the Web server communicates with a CGI script. The first method
is through environment variables. (See Wikipedia’s entry on environment variables if you are
not familiar with this term.) The CGI standards call these environment variables
meta-variables. According to the standards, there are many meta-variables that should, or must, be
set for CGI scripts to run properly. For the purpose of this assignment, we are only interested
in four environment variables: REQUEST_METHOD, QUERY_STRING, CONTENT_TYPE,
and CONTENT_LENGTH. The Web server sets the environment variables to proper values
before calling the CGI script. When invoked, the CGI script reads the values of the environment
variables.
The second method to communicate with a CGI script is by writing directly into its standard
input. When the HTTP POST method is used, the CGI script expects to read data from its
standard input. By writing the data to the script’s standard input, the Web server can pass the
data to the script.
Which methods to use and which variables to set depend on the HTTP method. The environment
variable that you always have to set is REQUEST_METHOD. Your Web server should
always set REQUEST_METHOD to either POST or GET, depending on the HTTP request
received. If the REQUEST_METHOD is GET, your Web server should set QUERY_STRING
to the query string of the HTTP request. Remember to set QUERY_STRING to empty
when there is no query string in the URL.
When an HTTP POST request is received, your Web server must set CONTENT_TYPE and
CONTENT_LENGTH to their appropriate values according to the HTTP request header, and
then write the HTTP body of the POST request to the standard input of the CGI script. It
is important to set CONTENT_LENGTH correctly and send the corresponding number of
bytes to the CGI script.
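The rules above can be sketched as a small method that builds the environment for the CGI script. This is only an illustration: the class CgiEnvironment and its parameters are hypothetical. The result is in the name=value format accepted by the envp parameter of the Runtime.exec() overloads. Note that when envp is supplied, the script receives only these variables and does not inherit the environment of the Web server.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds the four meta-variables used in this assignment. */
final class CgiEnvironment {
    /**
     * Returns "NAME=value" strings suitable for the envp argument of Runtime.exec().
     * For GET, contentType and contentLength are ignored; for POST, queryString is ignored.
     */
    static String[] build(String method, String queryString, String contentType, int contentLength) {
        List<String> env = new ArrayList<>();
        env.add("REQUEST_METHOD=" + method); // always set
        if ("GET".equals(method)) {
            // An absent query string is passed as an empty value, not left unset.
            env.add("QUERY_STRING=" + (queryString == null ? "" : queryString));
        } else if ("POST".equals(method)) {
            env.add("CONTENT_TYPE=" + contentType);
            env.add("CONTENT_LENGTH=" + contentLength);
        }
        return env.toArray(new String[0]);
    }
}
```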
• A: Delete button. Click to delete the current to-do item, including description and notes.
Figure 2: To-Do List Interface
• B: Edit button. Click to edit the current to-do item, including description and notes. (This
brings the user to the next screen.)
• F: Add button. Add a new to-do item with the current description entered in the textbox.
Figure 3 shows the screen when the edit button (B) is pressed.
The following are the user interface elements on screen.
• J: Update button. When clicked, the description and notes will be updated. After the user
clicks “update”, the updated to-do list will be shown as in the previous screen.
Figure 3: To-Do List Interface
How to Do It in Java
Calling an external program
You can use the exec() method of the Runtime class in Java to execute an external program. To
execute a CGI script written in Perl, e.g. /home/o/ooiwt/a1/todo.pl, you can use the following
code:
Runtime.getRuntime().exec("/usr/bin/perl /home/o/ooiwt/a1/todo.pl");
The call returns a Process object. You will need to use the Process object later.
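The Process object gives access to the standard input and standard output of the script. The following is a rough sketch, under some simplifying assumptions, of how a request body could be passed to the script and its output collected. The class CgiRunner and its methods are hypothetical. It uses the array form of exec(), which takes the program and its arguments as separate strings, together with an envp array of NAME=value strings. It writes the whole body before reading any output and ignores the script's standard error, which may not be adequate for large uploads.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: runs a CGI script and relays data to and from it. */
final class CgiRunner {
    /** Copies exactly length bytes from in to out; fails if the input ends early. */
    static void copyExactly(InputStream in, OutputStream out, int length) throws IOException {
        byte[] buffer = new byte[4096];
        int remaining = length;
        while (remaining > 0) {
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                throw new IOException("body ended " + remaining + " bytes early");
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
    }

    /**
     * Runs the script, writes bodyLength bytes of the request body to its
     * standard input, and returns everything the script prints.
     * command is e.g. {"/usr/bin/perl", "/home/o/ooiwt/a1/todo.pl"}.
     */
    static byte[] run(String[] command, String[] env, InputStream body, int bodyLength)
            throws IOException {
        Process process = Runtime.getRuntime().exec(command, env);
        // Process.getOutputStream() is connected to the script's standard input.
        try (OutputStream scriptInput = process.getOutputStream()) {
            copyExactly(body, scriptInput, bodyLength);
        } // closing the stream tells the script there is no more input
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        // Process.getInputStream() is connected to the script's standard output.
        try (InputStream scriptOutput = process.getInputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = scriptOutput.read(buffer)) >= 0) {
                output.write(buffer, 0, n);
            }
        }
        return output.toByteArray();
    }
}
```

For a GET request there is no body, so bodyLength would be 0 and the script's standard input is simply closed.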
Your cs2105-z Account
An account on the server, cs2105-z.comp.nus.edu.sg, has been set up for you. From within SoC
(or through SoC-VPN), ssh to cs2105-z using your SoC UNIX id and password.
Copy the files prepared for you to your home directory, by executing:
cp -r ~sadm/a1 .
• TODO and data – file-based database to store to-do list items. If you find that the
database is corrupted, simply remove all contents from the file TODO and the directory
data to start over.
• edit.gif, del.gif, style.css – images and style file used by the to-do list application.
Note that you must put your files inside the directory a1 directly under your home directory.
This is the root of your Web directory.
You should ensure that the files in the directory of both team members are updated to the
latest version before the deadline. We will randomly pick the files from one of the team members
to grade.
File Permission
You will be responsible for the security of your own source code. Please be careful and set the
correct permissions for your files. They should not be readable by anyone except the owner
(chmod 600 *.java will ensure that).
Port Numbers
Your Web server must listen on a non-standard port number. Once your Web server is up and
running, you can connect to it through the Firefox browser on any SoC machine (or SoC-VPN
enabled machine) by specifying the port number as part of the URL. For example, if your Web
server is running on port 9090, use http://cs2105-z.comp.nus.edu.sg:9090/todo.pl to access the
file todo.pl under your $HOME/a1 directory. You should, of course, write your Web server in
such a way that it reads from $HOME/a1.
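One possible way to map the path in the URL to a file under the Web root is sketched below. The class WebRoot and its methods are hypothetical, and the check that rejects paths escaping the root (for example through "..") is an extra precaution that the assignment does not ask for. The sketch assumes the HOME environment variable is set, as it normally is on a UNIX host.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: maps the path part of a URL to a file under the Web root. */
final class WebRoot {
    /** Returns $HOME/a1, assuming the HOME environment variable is set. */
    static Path defaultRoot() {
        return Paths.get(System.getenv("HOME"), "a1");
    }

    /** Resolves e.g. "/todo.pl" against root, rejecting paths that leave the root. */
    static Path resolve(Path root, String scriptPath) {
        String relative = scriptPath.startsWith("/") ? scriptPath.substring(1) : scriptPath;
        Path normalizedRoot = root.normalize();
        Path resolved = normalizedRoot.resolve(relative).normalize();
        if (!resolved.startsWith(normalizedRoot)) {
            throw new IllegalArgumentException("path escapes the Web root: " + scriptPath);
        }
        return resolved;
    }
}
```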
To make it easy to use a different port number, your Web server must take in the port
number as a command line argument. For instance, to run your Web server on port 9090,
Note that all of you will be running your Web servers on the same host, and therefore each team must
use a different port number. To prevent collisions, you should avoid “nice” port numbers such
as 8000 or 8080.
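Reading the port number from the command line could look like the sketch below. The class PortArgument is a hypothetical name, and the accepted range of 1024 to 65535 is an assumption made here (ports below 1024 usually require special privileges on UNIX); the assignment itself only asks for a non-standard port.

```java
/** Hypothetical helper: reads the port number from the command line arguments. */
final class PortArgument {
    /** Returns the port given as the single argument, or throws if it is unusable. */
    static int parse(String[] args) {
        if (args.length != 1) {
            throw new IllegalArgumentException("expected exactly one argument: the port number");
        }
        int port;
        try {
            port = Integer.parseInt(args[0]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("port must be a number: " + args[0]);
        }
        // Assumed range: unprivileged ports only.
        if (port < 1024 || port > 65535) {
            throw new IllegalArgumentException("port out of range (1024-65535): " + port);
        }
        return port;
    }
}
```

The returned value can then be passed to the ServerSocket constructor, for example new ServerSocket(port).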
Submission and Grading
There is no need to submit the program by email or IVLE workbin. We will collect your
assignment from your home directory on cs2105-z.comp.nus.edu.sg when the deadline is over.
We will test your assignment automatically using a grading program. For this to work, you
must not modify todo.pl in any way. If you suspect that there is a bug in todo.pl, please contact
us by posting on the IVLE forum. In addition, there will be a short demo slot for both team
members to explain their program to the teaching staff.
You MUST name your Java program WebServer.java. We will only compile this file when
we grade. You MUST NOT implement additional classes in other *.java files.
Plagiarism Warning
You are free to discuss the assignment with your peers. But, ultimately, you should write your
own code. We employ a zero-tolerance policy against plagiarism. If you are caught copying from
another student, or letting another student copy your code, you will receive zero for this assignment.
Further disciplinary action may be taken by the school.
Grading
• 2 marks: Able to invoke CGI script and return output (when calling todo.pl without query
string).
We will deduct one mark for every failure to follow instructions (wrong email subject
when registering team members, wrong directory name, wrong filename for WebServer.java, not
taking in the port number as a command line argument, inconsistent copies in the directories of team
members, etc.).
THE END