Launching a browser and visiting a site on the internet involves at least two parties: the web server and the requestor (you). In this assignment, we will use pthreads and Berkeley sockets to implement a primitive web server that speaks HTTP to browsers over TCP/IP.
HTTP (HyperText Transfer Protocol) is a simple, text-based communication protocol by which a web browser requests documents from a web server, and the web server replies. For instance, if you visit a web site such as http://www.example.com/test.html, your browser does the following:
1. Connect to the IP address of www.example.com (obtained via DNS lookup) at port 80 (the standard web server port)
2. Send the HTTP request message:
GET /test.html HTTP/1.1
To which the server should reply (assuming it finds the file):
HTTP/1.1 200 OK
Date: Thu, 15 Nov 2007 00:24:55 GMT
Followed by a blank line and the contents of the file (which is 4892 bytes in this example). If the file is not found, it should return the infamous 404 error code like so:
HTTP/1.1 404 Not Found
Note that this message should also be followed by a blank line.
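The exchange above can be sketched in C roughly as follows. This is a minimal sketch, not a required design: parse_get_path and build_header are hypothetical helper names, the buffer sizes are arbitrary, and the header builder emits only the status line and the terminating blank line (a fuller reply would also carry headers such as Date). HTTP lines end in a carriage return plus line feed, which is why "\r\n" appears in the strings.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the path from a request line such as
 * "GET /test.html HTTP/1.1". Returns 0 on success, or -1 if the line
 * is not a GET request or the path does not fit in the caller's buffer. */
int parse_get_path(const char *request, char *path, size_t path_size)
{
    char method[8];
    char target[1024];

    /* The field widths keep sscanf from overrunning the two buffers. */
    if (sscanf(request, "%7s %1023s", method, target) != 2)
        return -1;
    if (strcmp(method, "GET") != 0)
        return -1;
    if (strlen(target) + 1 > path_size)
        return -1;
    strcpy(path, target);
    return 0;
}

/* Hypothetical helper: write the status line followed by the blank line
 * into buf. Returns the number of bytes written, or -1 if buf is too small. */
int build_header(char *buf, size_t buf_size, int found)
{
    const char *status = found ? "200 OK" : "404 Not Found";
    int n = snprintf(buf, buf_size, "HTTP/1.1 %s\r\n\r\n", status);

    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return n;
}
```

A worker could call parse_get_path on the first line it receives, try to open the named file, and then send the output of build_header followed by the file contents when the file exists.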
Your task is to use Berkeley sockets to accept GET requests for HTML pages over HTTP. Your main thread should wait for a connection to occur, spawn off a worker thread, and have that thread communicate with the requestor according to the above protocol.
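One possible shape for the main thread and its workers is sketched below. The names worker and serve_forever are illustrative assumptions, and the worker here sends a fixed placeholder reply rather than looking up a file. The connected socket is passed through a heap-allocated int so that each thread gets its own copy, and threads are detached because the main thread never joins them.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Worker thread: owns exactly one connection. The argument is a
 * heap-allocated int holding the connected socket; the worker frees
 * the int and closes the socket before returning. */
void *worker(void *arg)
{
    int client_fd = *(int *)arg;
    free(arg);

    char request[1024];
    ssize_t n = recv(client_fd, request, sizeof request - 1, 0);
    if (n > 0) {
        /* Placeholder reply (assumption): a real worker would parse the
         * request and send either the file or the 404 message. */
        const char *reply = "HTTP/1.1 404 Not Found\r\n\r\n";
        send(client_fd, reply, strlen(reply), 0);
    }
    close(client_fd);
    return NULL;
}

/* Main-thread loop: wait for a connection, then hand it to a new worker.
 * Returns -1 only if accept or malloc fails. */
int serve_forever(int listen_fd)
{
    for (;;) {
        int *client_fd = malloc(sizeof *client_fd);
        if (client_fd == NULL)
            return -1;

        *client_fd = accept(listen_fd, NULL, NULL);
        if (*client_fd < 0) {
            free(client_fd);
            return -1;
        }

        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, client_fd) != 0) {
            /* Could not start a thread: drop this connection, keep serving. */
            close(*client_fd);
            free(client_fd);
            continue;
        }
        pthread_detach(tid); /* nobody joins the worker, so detach it */
    }
}
```

Passing the address of a single stack variable to every thread instead would be a race, since the main thread may overwrite it with the next connection before the worker reads it.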
When the thread is done, it should add the request for that particular webpage to a file named stats.txt. Note that this file needs to be exclusively accessed, so you'll need to do some sort of synchronization.
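A pthread mutex is one way to get the exclusive access described above; other synchronization primitives would work too. In the sketch below, log_request is a hypothetical helper, and the one-page-per-line format is an assumption, since the assignment does not specify what a stats.txt entry looks like.

```c
#include <pthread.h>
#include <stdio.h>

/* One lock shared by all worker threads guards the stats file. */
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: append the requested page as one line of the
 * stats file. Returns 0 on success, -1 on any I/O error. */
int log_request(const char *stats_path, const char *page)
{
    int rc = -1;

    pthread_mutex_lock(&stats_lock);
    FILE *f = fopen(stats_path, "a"); /* "a" appends to the end of the file */
    if (f != NULL) {
        if (fprintf(f, "%s\n", page) >= 0)
            rc = 0;
        if (fclose(f) != 0)
            rc = -1;
    }
    pthread_mutex_unlock(&stats_lock);
    return rc;
}
```

A worker would call something like log_request("stats.txt", path) just before it exits. Holding the lock across the open, write, and close keeps two threads from interleaving their entries.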
Port 80 is the normal web server port, but we can't all use it at the same time. On the class website there is a list of names and your designated, personal port number. Please use this port and only this port. For the address of the machine, simply use localhost, or localhost's reserved IP address: 127.0.0.1
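Setting up the listening socket on 127.0.0.1 might look like the sketch below. The name open_listener, the backlog of 16, and the use of SO_REUSEADDR (which usually lets you restart the server quickly after it exits) are assumptions made for illustration; the port argument is where your assigned port number would go.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1 on the
 * given port and start listening. Returns the socket, or -1 on error. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow rebinding the port soon after a previous run has exited. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                    /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned descriptor is what the main thread would pass to accept while waiting for connections.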
thot.cs.pitt.edu is firewalled from the outside world, meaning that you will only be able to connect to your server from thot itself. Your best bet for testing is therefore to use a few programs that run on thot:
telnet is a terminal emulator. You can connect directly to your server, and anything you type will be sent. This means you can manually create a GET request, and you will see the reply of the server in plain text on your screen.
wget downloads files from the internet.
lynx is a text-mode web browser. No graphics, no tables, and minimal font support, but it's a real, working browser.
My advice is to work with two SSH windows open: one running your server, which prints error messages to stderr, and one in which you use one of the above programs to request pages.
You need to submit:
Make a tar.gz file as in the first assignment, named USERNAME-project5.tar.gz
Copy it to ~jrmst106/submit/449/ by the deadline for credit.