CS 422                                                                                                       Spring 1999
                                                Lab 13: Building an HTTP server

Pre-reading for this Lab

    Before coming to the lab, carefully study the example server given in [1]. Also, familiarize yourself with the following functions in the socket API: getservbyname, getprotobyname, bind, listen, and accept.

Purpose of the Lab

    The objective of this lab is to implement an HTTP server that allows an HTTP client (a web browser such as Netscape or Internet Explorer) to connect to it and download files.

HTTP Protocol Overview

    An HTTP client issues a `GET' request to a server in order to retrieve a file. The general syntax of such a request is given below:

                                                GET <sp> <Document Requested> <sp> HTTP/1.0 <crlf>
                                                {<Other Header Information> <crlf>}*
                                                <crlf>

where : <sp> stands for a space character (ascii character 32) and,
             <crlf> stands for a carriage return-linefeed pair, i.e. a carriage return (ascii character 13) followed by a linefeed (ascii character 10).
             <Document Requested> gives us the name of the file requested by the client. As mentioned in the previous lab, this could be just a forward slash ( / ) if the client is requesting the default file on the server.
             {<Other Header Information> <crlf>}* contains useful (but not critical) information sent by a client. This can be ignored for this lab. Note that this part can be composed of several lines, each terminated by a <crlf>.
             Finally, observe that the client ends the request with two carriage return-linefeed pairs: <crlf> <crlf>
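
The request syntax above can be parsed with a few string operations. The following is a minimal sketch, not the required implementation: the function name parse_request and its return convention are assumptions made for illustration. It copies the <Document Requested> field into a caller-supplied buffer and ignores the other header lines, as the lab permits.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (name and interface are assumptions).
 * Extracts <Document Requested> from a request that starts with
 * "GET <sp> <Document Requested> <sp> HTTP/...".
 * Returns 0 on success, -1 if the request line is malformed or the
 * document name does not fit in path.
 */
int parse_request(const char *request, char *path, size_t pathlen)
{
    const char *start, *end;
    size_t len;

    if (strncmp(request, "GET ", 4) != 0)
        return -1;                      /* only GET is handled here */

    start = request + 4;
    end = strchr(start, ' ');           /* the <sp> before the version */
    if (end == NULL || end == start)
        return -1;

    if (strncmp(end + 1, "HTTP/", 5) != 0)
        return -1;                      /* version field must follow */

    len = (size_t)(end - start);
    if (len >= pathlen)
        return -1;                      /* name too long for the buffer */

    memcpy(path, start, len);
    path[len] = '\0';
    return 0;
}
```

A request for the default file leaves path set to "/"; mapping that to an actual file name is left to the caller.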

The function of an HTTP server is to parse the above request from a client, identify the file being requested and send the file across to the client. However, before sending the actual document, the HTTP server must send a response header to the client. The following shows a typical response from an HTTP server when the requested file is found on the server:

                                                HTTP/1.1 <sp> 200 <sp> Document <sp> follows <crlf>
                                                Server: <sp> <Server-Type> <crlf>
                                                Content-type: <sp> <Document-Type> <crlf>
                                                {<Other Header Information> <crlf>}*
                                                <crlf>
                                                <Document Data>

where : <Server-Type> identifies the manufacturer/version of the server. For this lab, you can set this to CS 422 Lab13.
             <Document-Type> tells the client the type of document being sent. This should be text/html for an html document, image/gif for a gif file, text/plain for plain text, etc.
             {<Other Header Information><crlf>}* as before, contains some additional useful header information for the client to use. These may be ignored for this lab.
             <Document Data> is the actual document requested. Observe that this is separated from the response headers by two carriage return-linefeed pairs: the one that ends the last header line and the one that forms the empty line.
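
The response header above can be framed with a single snprintf call. This is a sketch under stated assumptions: the function names content_type and frame_ok_header are hypothetical, the extension-to-type table covers only the three types named above, and anything unrecognized falls back to text/plain.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: picks a <Document-Type> from the file extension. */
const char *content_type(const char *path)
{
    const char *ext = strrchr(path, '.');

    if (ext == NULL)
        return "text/plain";
    if (strcmp(ext, ".html") == 0 || strcmp(ext, ".htm") == 0)
        return "text/html";
    if (strcmp(ext, ".gif") == 0)
        return "image/gif";
    return "text/plain";                /* assumed default */
}

/*
 * Hypothetical helper: writes the "200" response header for path into
 * buf. Returns the header length, or -1 if buf is too small.
 */
int frame_ok_header(char *buf, size_t buflen, const char *path)
{
    int n = snprintf(buf, buflen,
                     "HTTP/1.1 200 Document follows\r\n"
                     "Server: CS 422 Lab13\r\n"
                     "Content-type: %s\r\n"
                     "\r\n",            /* empty line ends the header */
                     content_type(path));

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

The document data is written to the connection immediately after these bytes, with nothing in between.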

If the requested file cannot be found on the server, the server must send a response header indicating the error. The following shows a typical response:

                                                HTTP/1.1 <sp> 404 <sp> File <sp> Not <sp> Found <crlf>
                                                Server: <sp> <Server-Type> <crlf>
                                                Content-type: <sp> <Document-Type> <crlf>
                                                <crlf>
                                                <Error Message>

where : <Document-Type> indicates the type of document (i.e. error message in this case) being sent. Since you are going to send a plain text message, this should be set to text/plain.
             <Error Message> is a human-readable description of the error in plain text or html format (e.g. Could not find the specified URL. The server returned an error).
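
The error response is short enough to build as one string, header and message together. The sketch below assumes a hypothetical function name, frame_not_found, and reuses the example error text from above as the message body.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes the complete "404" response (header,
 * empty line, and plain-text error message) into buf.
 * Returns the response length, or -1 if buf is too small.
 */
int frame_not_found(char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen,
                     "HTTP/1.1 404 File Not Found\r\n"
                     "Server: CS 422 Lab13\r\n"
                     "Content-type: text/plain\r\n"
                     "\r\n"
                     "Could not find the specified URL. "
                     "The server returned an error.\r\n");

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```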

Procedure and Algorithm Details

    The basic algorithm for the HTTP server would be:

    * Open Passive Socket.
    * Do Forever
        * Accept new TCP connection.
        * Read request from TCP connection and parse it.
        * Frame the appropriate response header depending on whether the URL requested is found on the server or not.
        * Write the response header to TCP connection.
        * Write requested document (if found) to TCP connection.
        * Close TCP connection.
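
The socket-related steps of this algorithm might look as follows. This is a rough sketch, not the example server from [1]: the names open_passive_socket, read_request, and serve_forever are assumptions, the port is passed as a number rather than looked up with getservbyname, and the per-connection work (parsing, framing, writing) is delegated to a caller-supplied function.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens a passive TCP socket on port (host byte order) with a listen
 * queue of qlen. Returns the descriptor, or -1 on failure. */
int open_passive_socket(unsigned short port, int qlen)
{
    struct sockaddr_in sin;
    int s, on = 1;

    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    /* Lets the server be restarted without waiting for old sockets. */
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(port);

    if (bind(s, (struct sockaddr *)&sin, sizeof sin) < 0 ||
        listen(s, qlen) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Reads until the <crlf><crlf> that ends the request has arrived, the
 * buffer is full, or the client closes. Returns bytes read, -1 on error. */
ssize_t read_request(int fd, char *buf, size_t buflen)
{
    size_t used = 0;
    ssize_t n;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';
    while (used < buflen - 1 && strstr(buf, "\r\n\r\n") == NULL) {
        n = read(fd, buf + used, buflen - 1 - used);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                      /* client closed the connection */
        used += (size_t)n;
        buf[used] = '\0';
    }
    return (ssize_t)used;
}

/* Iterative loop: one client at a time; others wait in the listen queue.
 * handle is a hypothetical callback that reads the request and writes
 * the response on ssock. */
void serve_forever(int msock, void (*handle)(int ssock))
{
    for (;;) {
        int ssock = accept(msock, NULL, NULL);
        if (ssock < 0)
            continue;
        handle(ssock);
        close(ssock);
    }
}
```

A main function would call open_passive_socket once and pass the result to serve_forever. Reading in a loop matters because a single read is not guaranteed to return the whole request.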

The server that you will implement for this lab will not be concurrent, i.e. it will not serve more than one client at a time (it queues the remaining requests while processing each request). You will add this functionality in the following lab by forking a process for each request received. Based on the example server given in [1] (or the DAYTIME server given in [2]), implement an HTTP server. The server should work as specified in the overview above.

Hints

You should implement your server in a single file. As in the previous lab, create a separate function to parse the request string. You can also have a separate function that frames the response header to be sent to a client. As before, you can group all the socket functionality into a single function.

Reading and References

[1] Chapter 23 in `Computer Networks and Internets' by Douglas E. Comer - "Example of a client and a server".

[2] Chapter 10 in `Internetworking with TCP/IP - Vol 3' by Douglas E. Comer and David L. Stevens - "Iterative, Connection Oriented Servers (TCP)".

[3] RFC 1945 defines the HTTP 1.0 protocol. You can access this by typing `rfc 1945' on your console.