13.1 Updates:
  1. none

Simple Multi-threaded Web Server

Elliott

(Only for background reading this quarter. This is NOT THE CODING ASSIGNMENT.)

Overview:

MyWebServer Checklist
Firefox Browser tools (Quick: Ctrl-Shift-E to raise console. Network / Inspector tabs | drag top up for larger console window.)

All MyWebServer programs MUST communicate with the Firefox browser.

In this program you will follow through the steps of capturing the HTTP stream between existing clients and servers, and then write a web server that supports the same protocol. It builds on the JokeServer, which does much of the same work. While the text of the assignment is quite long, the application itself is straightforward, and you might be surprised at how easily it can be written.

There are four phases in the development process, plus an optional fifth:

  1. Capture the HTTP protocol first-hand by developing some hacking / debugging skills (hacking in the good sense).

  2. Return simple, static files on request from a browser client.

  3. Return dynamically created HTML (build a directory HTML page dynamically)

  4. Accept FORM input from the user and do back-end processing on the server to return computed values in (simple!) dynamically-created HTML.

  5. Add features of your own choosing, if you like.

See the MyWebServer Tips file for some suggestions once you get coding.

Run at port 2540 in the server directory!

In all cases the following specifications take precedence: The web server must run at port 2540 (http://localhost:2540). It must, by default, serve files from the directory in which the web server is started, including dog.txt and cat.html. The source code should be contained in a single, stand-alone file named MyWebServer.java, ready to compile and run. Subdirectories should be recursively traversed from the default directory in which the server is started.

Grading procedure:

  1. Run our various plagiarism checkers on your submission.

  2. Extract your zip file into a directory, and run a script file that:

    1. Executes > javac MyWebServer.java

    2. Populates the new directory with .txt files, .html files and .java files such as dog.txt, cat.html, MyWebServer.java and the file addnums.html (with an action statement that points to port 2540 on localhost), then creates subdirectories and populates those with .txt files and .html files.

    3. Executes "> java MyWebServer" to start your webserver at port 2540.

  3. In Firefox, read your directory listing for the directory where the server is running, using port 2540.

  4. Select checklist-mywebserver.html from your listing and read it.

  5. Browse the .txt, .java (treated like .txt), and .html files with which we have populated your directory.

  6. Select the addnums.html file and submit data through it.

  7. Select http-streams.txt and read it.

  8. Select serverlog.txt and read it.

  9. Select MyWebServer.java, read your source code, and look at the comments. Note: you should display .java files the same as .txt files by sending the data as text/plain.

  10. Navigate to the subdirectories and read .txt, .java and .html files there.

Special Security Note:

I expect that you will find that in its most basic form this is not a particularly difficult assignment. If so, you will soon have a viable, running webserver of your own creation. If you are developing on a machine that is also connected to the Internet this means that you might well expose all of the files on your local machine (or any remote machine where you might be running) to evil hackers from around the world who are anxious to steal information from your files. In the worst case this information would allow them write access to your disk, and/or put financial/personal information in their hands. So—be careful. Hard-code into your server that you only return files from your root server directory of unimportant files, keep your firewall on, etc. Be careful about the "../.." form of URLs, which would allow someone to retrieve files from above your server's directory. For particularly sensitive machines you can always simply unplug your Internet connection while running your server.
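
One way to guard against the "../.." form of URLs is to resolve every requested path against the server's root directory and refuse anything that lands outside it. The following is a minimal sketch, not a complete defense: the class name SafePaths and the method resolveUnderRoot are made up for illustration, and the sketch assumes the request path has already been URL-decoded and that symbolic links are not a concern.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of any assignment starter code).
class SafePaths {

    /**
     * Maps a request path such as "/sub-a/cat.html" onto the server root.
     * Returns null when the path would escape the root (e.g. "/../x").
     */
    static Path resolveUnderRoot(Path root, String requestPath) {
        Path base = root.toAbsolutePath().normalize();
        // Drop leading slashes so the request is treated as relative to root.
        String relative = requestPath.replaceFirst("^/+", "");
        try {
            // normalize() collapses "." and ".." segments before the check.
            Path candidate = base.resolve(relative).normalize();
            return candidate.startsWith(base) ? candidate : null;
        } catch (InvalidPathException e) {
            return null; // characters the file system cannot represent
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get("."); // assumption: serve the start directory
        System.out.println(resolveUnderRoot(root, "/dog.txt"));
        System.out.println(resolveUnderRoot(root, "/sub-a/../../secret.txt"));
    }
}
```

Path.startsWith compares whole path elements rather than characters, so a sibling directory whose name merely begins with the root's name is not mistaken for being inside the root.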

Server Directories

For this assignment your server must serve files from the directory where the server is started. Place all of your submission files in this same directory.

Administration:

Capturing HTTP:

MIME headers

For this assignment we will use two MIME types: Content-Type: text/plain and Content-Type: text/html. These must be followed by two cr/lf and then your data.

MIME types are determined by the server from the file extension of the files that are requested. .html will use text/html, and .txt and .java files will both use text/plain. (This is just a trick so we can view your java source code through your webserver.)
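
The extension-to-type rule above can be sketched as a single lookup method. The class name MimeTypes and the method name contentTypeFor are hypothetical, and falling back to text/plain for extensions the assignment does not mention is an assumption of this sketch.

```java
import java.util.Locale;

// Hypothetical helper; the name MimeTypes is chosen only for this example.
class MimeTypes {

    /** Picks the Content-Type value from the requested file's extension. */
    static String contentTypeFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".html")) {
            return "text/html";
        }
        // .txt and .java are both sent as text/plain. Falling back to
        // text/plain for any other extension is an assumption of this sketch.
        return "text/plain";
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("cat.html"));         // text/html
        System.out.println(contentTypeFor("MyWebServer.java")); // text/plain
    }
}
```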

Modify your MultiThreaded server so that it becomes a simple web server.
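
Before the server can return a file, it has to work out which file the browser asked for. A browser's request starts with a request line such as GET /dog.txt HTTP/1.1, and the sketch below pulls the path out of that line. The class and method names are hypothetical, and the sketch assumes that only GET requests need to be handled.

```java
// Hypothetical helper for illustration; not taken from the assignment.
class RequestLine {

    /**
     * Returns the requested path from a line like "GET /dog.txt HTTP/1.1",
     * or null if the line is not a GET request in that shape.
     */
    static String pathOf(String requestLine) {
        if (requestLine == null) {
            return null; // the client closed the connection without sending
        }
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("GET")) {
            return null;
        }
        return parts[1];
    }

    public static void main(String[] args) {
        System.out.println(pathOf("GET /sub-a/sub-b/cat.html HTTP/1.1"));
    }
}
```

In a worker thread, the line passed to pathOf would typically be the first line read from the socket's input stream.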

Goal: Your web server must correctly answer requests for files with the extensions .txt and .html (and also .java, which is treated the same as .txt). This means that it must return the correct MIME headers, that is, the Content-Length and Content-Type headers, with the last header followed by two cr/lf, as well as the data. This is a server that operates on static data.
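
As a rough illustration of the header layout, the sketch below assembles a complete response as bytes: a status line, the two headers, the blank line, and then the data. The class name HttpResponses is hypothetical, and the sketch only covers the success case (status 200).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; only the 200 OK case is sketched here.
class HttpResponses {

    /** Returns status line + headers + blank line + body as one byte array. */
    static byte[] build(String contentType, byte[] body) {
        String header = "HTTP/1.1 200 OK\r\n"
                + "Content-Length: " + body.length + "\r\n" // counted in bytes
                + "Content-Type: " + contentType + "\r\n"
                + "\r\n"; // second cr/lf: the blank line that ends the headers
        byte[] head = header.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[head.length + body.length];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(body, 0, out, head.length, body.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] body = "woof\n".getBytes(StandardCharsets.US_ASCII);
        byte[] response = build("text/plain", body);
        System.out.print(new String(response, StandardCharsets.US_ASCII));
    }
}
```

Working with the body as bytes rather than as a String keeps Content-Length equal to the number of bytes actually written to the socket.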

Extend your server to include directories:

Goal: Extend your server so that it sends back dynamically constructed data: in this case the HTML-formatted current contents of a directory. This will now be a server that operates on dynamic data.

[Intermediate step: If you are struggling with this assignment, you might want to first simply create some dynamically created HTML, by sending back a very simple HTML file with dynamic data in it, such as the current time. This way you can at least say you have written back dynamic HTML to the client. Then once you are getting the text/html MIME type working with dynamic data, go on to creating a directory listing.]
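
A directory page of this kind might be built along the following lines. This is only a sketch: the class and method names are hypothetical, the page layout is an arbitrary choice, the URL path is assumed to end with a slash, and file names are inserted without HTML escaping on the assumption that they contain no special characters.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the page layout is just one possible choice.
class DirectoryListing {

    /** Lists entry names in dir, marking subdirectories with a trailing "/". */
    static List<String> namesIn(File dir) {
        List<String> names = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir is not a directory
        if (entries != null) {
            for (File entry : entries) {
                names.add(entry.getName() + (entry.isDirectory() ? "/" : ""));
            }
        }
        Collections.sort(names);
        return names;
    }

    /**
     * Builds the HTML page. urlPath is the directory's URL path and is
     * assumed to end with "/", e.g. "/" or "/sub-a/".
     */
    static String toHtml(String urlPath, List<String> names) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body>\n");
        html.append("<h1>Index of ").append(urlPath).append("</h1>\n");
        for (String name : names) {
            // Each entry becomes a link the browser can follow.
            html.append("<a href=\"").append(urlPath).append(name)
                .append("\">").append(name).append("</a><br>\n");
        }
        html.append("</body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.print(toHtml("/", namesIn(new File("."))));
    }
}
```

Because subdirectory links end in a slash, following one sends the server a request it can recognize as a directory and answer with another listing.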

Server-Side scripting and program execution.

Goal: write simple code to run arbitrary program code on the server processing user input from the web, and send the results back to the web client.

In this section we add back-end programming capability to your server, or at least simulate it. We create a simple addnums web form, accept input from a user, pass this to our webserver, process the information, and return a computed response based on the input.
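
The back-end step can be simulated by parsing the form data out of the request and computing the answer directly in the server. The sketch below assumes the form is submitted with GET, so the values arrive in the query string. The field names person, num1 and num2, the request target used in main, and the wording of the reply are all assumptions made for illustration; use whatever names your addnums.html actually sends.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the field names person, num1 and num2 and the
// request target used in main are assumptions, not taken from the form.
class AddNums {

    /** Splits "path?a=1&b=2" into decoded name/value pairs. */
    static Map<String, String> parseQuery(String requestTarget) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = requestTarget.indexOf('?');
        if (q < 0) {
            return params; // no query string at all
        }
        for (String pair : requestTarget.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(decode(pair.substring(0, eq)),
                           decode(pair.substring(eq + 1)));
            }
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8"); // "+" and %XX back to text
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Minimal escaping so user input cannot inject markup into the page. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds the HTML body for the computed sum, or an error message. */
    static String respond(Map<String, String> params) {
        try {
            int a = Integer.parseInt(params.get("num1"));
            int b = Integer.parseInt(params.get("num2"));
            String person = escape(params.getOrDefault("person", "user"));
            return "<html><body><p>Dear " + person + ", the sum of " + a
                    + " and " + b + " is " + (a + b) + ".</p></body></html>\n";
        } catch (NumberFormatException e) {
            // Also reached when num1 or num2 is missing from the request.
            return "<html><body><p>num1 and num2 must be integers.</p></body></html>\n";
        }
    }

    public static void main(String[] args) {
        String target = "/cgi/addnums.fake-cgi?person=Ann&num1=4&num2=5";
        System.out.print(respond(parseQuery(target)));
    }
}
```

The returned string would then be sent with the text/html MIME type, in the same way as any other dynamically created page.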

For those who are more ambitious, you might look into Java's JNI, which allows us to call native code by loading it into the virtual machine and then running it. In this way we might write programs that actually run arbitrary scripts/programs under the web server.

Alternatively, for those writing in C, the "system()" function will execute any executable as a subprocess, making the running of programs and scripts trivial. Note: be very security conscious about running user-input shell commands with the "system()" call, because, e.g., a user might have you execute a command to erase all of your files!

Neither method is required. Instead, to keep the programming scope reasonable, we will only simulate the running of back-end scripts.

CGI (the Common Gateway Interface) has been around since the beginning of the web, so there are thousands of references on how to use it.

Ta-da! You have now built a multi-threaded web server that can handle files, directory traversals, and server-side scripting after getting input from the user through a web client. Good job!

What you turn in


Grading note:

You can assume we will not have any spaces in file names.

We MUST be able to retrieve files from the directory in which your MyWebServer program is running, that is, when the following files are together in the indicated directory and its subdirectories. You can assume we will put a trailing slash if we enter a directory name in the address bar of a browser. Your root directory should display if there is no further information beyond the port number.

 
/users/elliott/students/435/Web/
    MyWebServer.class
    dog.txt
    cat.html
    /sub-a
       /sub-b
          cat.html
We should be able to retrieve your files from:
http://localhost:2540/dog.txt
http://localhost:2540/cat.html
http://localhost:2540/sub-a/sub-b/cat.html

and
http://localhost:2540/ or...
http://localhost:2540
should show us:

addnums.html
checklist-mywebserver.html
dog.txt
http-streams.txt
cat.html
serverlog.txt
sub-a/
MyWebServer.class
MyWebServer.java

...or at least something similar.

As per the grading specifications above, we should be able to retrieve all your files through your webserver from this kind of directory listing.

Bragging rights (not required):

Side note: Unix (Apache) servers usually serve files from USERACCOUNT/public_html. For example, if I put dog.txt on this Unix/Apache machine as /condor/cscfclt/elliott/public_html/dog.txt, we would find it on the web as http://condor.depaul.edu/elliott/dog.txt or http://condor.depaul.edu/~elliott/dog.txt.