A HTTP Server

Introduction #

An embedded server offers quick access to variables or structures inside an application and can bring a completely different experience for user interface design.

What this server can do:

  • provide access to application’s data structures
  • serve HTML and other media files
  • access control through basic authentication

What it is missing:

  • no SSL support
  • no digest authentication
  • no logging

Background #

A short review of where we are now: we started with a sock object as a wrapper for Windows sockets. Based on it, the tcpserver object implements a multi-threaded TCP server that listens for new connections. When a client connects it creates a new thread that services that connection. The application doesn’t interact much with these connection threads; all the configuration is done through the main server object.

The connection threads implement the HTTP protocol and can retrieve variables through an SSI-type interface. They can also invoke specific application functions.

The httpd Object #

This is the server object that listens for new HTTP connections. It is derived from tcpserver and, when a new client connects, it creates an http_connection thread object.

The simplest application using this HTTP server is:

using namespace mlib;

int main (int argc, char **argv)
{
  httpd server (8080);
  server.start ();

  while (!_kbhit())
    ;

  server.terminate ();
}

The main application thread creates the server object and starts it. By default, the server listens on port 80 but in this case, it will listen on port 8080. The main thread waits for a key press and then, unceremoniously, kills the server.

Now let’s create a file index.html with the following content:

<html>
<body>Hello world!</body>
</html>

You can try connecting a browser to http://localhost:8080 and in response the server will send the content of the index.html file:

demo1.png

User variables #

You can make an application variable accessible to the server using the add_var function. To see how it works, first let’s modify our minimalist server:

using namespace mlib;

int main (int argc, char **argv)
{
  int answer = 42;
  httpd server (8080);
  server.add_var ("computer_answer", "%d", &answer);

  server.start ();

  while (!_kbhit())
    ;

  server.terminate ();
}

Now we create a file page1.shtml with the following content:

<html>
<body>
The answer to the "Ultimate Question of Life, the Universe, and Everything",<br/>
is <!--#echo var="computer_answer" -->.
</body>
</html>

Unsurprisingly, if you navigate to http://localhost:8080/page1.shtml you will see a page like this: demo2.png

The signature of the add_var function is:

void add_var (const char *name, const char *fmt, void *addr, double multiplier=1.);

where:

  • name is the external name of the variable as it appears in the SSI ‘var’ construct.

  • fmt is the printf-like format used to generate the SSI replacement string

  • addr is the address of the variable

  • multiplier is an optional parameter used for scaling floating-point values.
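To make the roles of fmt and multiplier more concrete, here is a minimal standalone sketch of how a value might be turned into an SSI replacement string. It is not mlib code: the helper render_var is hypothetical, and it assumes that scaling means multiplying the value by multiplier before formatting.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of mlib): scale a floating-point value
// and format it with a printf-like format string.
// Assumption: scaling means "value * multiplier".
std::string render_var (const char *fmt, double value, double multiplier = 1.)
{
  char buf[64];
  std::snprintf (buf, sizeof (buf), fmt, value * multiplier);
  return buf;
}
```

With this sketch, a value stored as a fraction (0.5) could be displayed as a percentage by using the format "%.1f" and a multiplier of 100.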

The http_connection object #

When a client connects to the server, the server creates an http_connection object. This object is derived from mlib::thread and implements the HTTP protocol. In many cases you don’t need to interact directly with these objects. However, you do need them when you create URI handler functions.

URI handlers #

Sometimes the simple SSI mechanism is not flexible enough. In this case you can directly register a function that will be invoked in response to a request for a URI. To add a handler, call the add_handler function of the server object.

For an example, let’s first add a button to another web page, page2.shtml:

<html>
<body>
The answer to the "Ultimate Question of Life, the Universe, and Everything",<br/>
is <!--#echo var="computer_answer" -->.
<br/>
<form method="post" action="author.cgi">
  <input type="submit" value="Who Said That?" />
</form>
</body>
</html>

If we access that page we should see this: demo3.png

Now we have to add the handler function to our program:

int author (const char *uri, http_connection& client, void*)
{
  // send response headers
  client.respond (200); // 200 = OK

  // send response body
  client.out() << "<html><body>Douglas Adams, The Hitchhiker's Guide to the Galaxy</body></html>";
  return 1;
}

int main (int argc, char **argv)
{
  int answer = 42;
  httpd server (8080);
  server.add_var ("computer_answer", "%d", &answer);
  server.add_handler ("author.cgi", author);
  server.start ();

  while (!_kbhit())
    ;

  server.terminate ();
}

The main program registers the handler by passing the URI and the handler function to add_handler. In response to the POST request to URI “http://localhost:8080/author.cgi”, the URI handler function is invoked and receives a reference to the http_connection object.

It first calls the respond() method of the connection object to send the appropriate response code and headers, and then streams out the HTML text of the response. The out() method returns the socket stream used to communicate with the client. The end result is this page: demo3a.png

A handler function can return 0 to indicate that it does not want to process the URI request. In this case normal request processing continues.

Anatomy of a HTTP exchange #

This is a detailed look at the different actions that take place when a client initiates a HTTP connection.

First the listening server spawns a new thread as a http_connection object. The thread receives a Windows socket that is immediately transformed into a sockstream object.

The connection thread starts reading characters until the first <CR><LF> sequence is encountered. If a request is properly formatted, at this point the buffer should contain:

method space uri space HTTP-version <CR><LF>
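A standard HTTP/1.1 request line consists of the method, the request URI and the protocol version, in that order, separated by spaces. As an illustration only, the check could look like the sketch below. The names request_line and parse_request_line are hypothetical and are not part of mlib, and this version is lenient about extra whitespace.

```cpp
#include <sstream>
#include <string>

// Hypothetical structure holding the three parts of a request line
struct request_line
{
  std::string method;
  std::string uri;
  std::string version;
};

// Split "GET /index.html HTTP/1.1\r\n" into its parts.
// Returns false unless the line has exactly three
// whitespace-separated fields.
bool parse_request_line (const std::string& line, request_line& out)
{
  std::istringstream in (line);
  std::string extra;
  if (!(in >> out.method >> out.uri >> out.version))
    return false;
  return !(in >> extra); // anything left over is a syntax error
}
```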

After verifying the request syntax, the thread keeps reading characters (the request headers) until it encounters an empty line. That signals the end of the request headers and the beginning of the request body. For POST or PUT requests, which can have a body, the thread reads as many characters as the Content-Length header indicates.
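The header and body reading step can be sketched in standalone C++ as follows. This is not the mlib implementation: the names request_rest and read_headers_and_body are hypothetical, the input is assumed to be everything that follows the request line, and header names are stored in lowercase because HTTP header names are case-insensitive.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for what follows the request line
struct request_rest
{
  std::map<std::string, std::string> headers; // names stored in lowercase
  std::string body;
};

request_rest read_headers_and_body (const std::string& raw)
{
  request_rest req;
  std::istringstream in (raw);
  std::string line;
  while (std::getline (in, line))
  {
    if (!line.empty () && line.back () == '\r')
      line.pop_back ();                // strip the <CR> of <CR><LF>
    if (line.empty ())
      break;                           // empty line: end of headers
    auto colon = line.find (':');
    if (colon == std::string::npos)
      continue;                        // skip malformed lines
    std::string name = line.substr (0, colon);
    std::transform (name.begin (), name.end (), name.begin (),
                    [] (unsigned char c) { return (char)std::tolower (c); });
    auto start = line.find_first_not_of (" \t", colon + 1);
    req.headers[name] = (start == std::string::npos) ? std::string () : line.substr (start);
  }

  // Read exactly Content-Length characters of body, if the header is present
  auto it = req.headers.find ("content-length");
  if (it != req.headers.end ())
  {
    size_t len = std::strtoul (it->second.c_str (), nullptr, 10);
    len = std::min (len, raw.size ()); // never read more than was received
    req.body.resize (len);
    in.read (&req.body[0], (std::streamsize)len);
    req.body.resize ((size_t)in.gcount ());
  }
  return req;
}
```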

It is now time to dispatch the request. First the connection thread checks the URI against the table of handler functions maintained by the parent httpd object. If a matching handler is found, it is invoked and the cycle repeats until the connection socket is closed.
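The handler lookup, together with the rule that a handler returning 0 lets normal processing continue, can be sketched as below. This is a simplified illustration rather than mlib code: the uri_handler type and the dispatch function are hypothetical, the real handlers also receive the http_connection object and a user pointer, and exact string matching of the URI is an assumption.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical, simplified handler type
using uri_handler = std::function<int (const std::string& uri)>;

// Returns true if a registered handler processed the request.
// A handler returning 0 declines the request, so the caller
// continues with normal processing (looking for a file).
bool dispatch (const std::map<std::string, uri_handler>& handlers,
               const std::string& uri)
{
  auto it = handlers.find (uri);
  if (it == handlers.end ())
    return false;                 // no handler registered for this URI
  return it->second (uri) != 0;
}
```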

If there is no handler function registered for the URI, the thread checks if it can find a file matching the requested URI. If a file is found, the next step is to check the file extension against a list of MIME-types maintained by the parent. By default, the list contains a few basic HTML, text and image types. For regular files, the file content is sent back to the client and that completes the request cycle.
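A lookup from file extension to MIME type might look like the following sketch. The function mime_type_for, the table content and the fallback type are illustrative assumptions and do not reproduce the actual list kept by the server. The comparison here is case-sensitive.

```cpp
#include <map>
#include <string>

// Return the Content-Type for a file name based on its extension.
// The table content and the fallback type are illustrative assumptions.
std::string mime_type_for (const std::string& path)
{
  static const std::map<std::string, std::string> types = {
    {"html", "text/html"},
    {"txt", "text/plain"},
    {"css", "text/css"},
    {"png", "image/png"},
    {"jpg", "image/jpeg"},
  };
  auto slash = path.find_last_of ("/\\");
  auto dot = path.find_last_of ('.');
  // No dot, or the last dot belongs to a directory name: no extension
  if (dot == std::string::npos || (slash != std::string::npos && dot < slash))
    return "application/octet-stream";
  auto it = types.find (path.substr (dot + 1));
  return it != types.end () ? it->second : "application/octet-stream";
}
```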

For SHTML files, the connection thread starts reading and parsing the content of the file looking for SSI ’echo’ constructs:

<!--#echo var="varname" -->

If it finds such a construct, it looks up the name in the variables table maintained by the parent, fetches the current value and formats it according to the associated printf format (floating-point values are also scaled). Once the file has been sent, the cycle repeats until the socket is closed.
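The substitution step can be illustrated with a small standalone function. This is only a sketch under simplifying assumptions: expand_ssi is a hypothetical name, the variable values are passed in already formatted as strings, the construct is expected exactly in the form used in page1.shtml, and unknown variables are replaced with an empty string.

```cpp
#include <map>
#include <string>

// Replace every <!--#echo var="name" --> construct with the value
// found in 'vars'. Unknown variables produce no output (an assumption).
std::string expand_ssi (const std::string& text,
                        const std::map<std::string, std::string>& vars)
{
  const std::string open = "<!--#echo var=\"";
  const std::string close = "-->";
  std::string out;
  size_t pos = 0;
  while (true)
  {
    size_t start = text.find (open, pos);
    if (start == std::string::npos)
      break;
    size_t name_start = start + open.size ();
    size_t name_end = text.find ('"', name_start);
    size_t end = (name_end == std::string::npos) ? name_end
                                                 : text.find (close, name_end);
    if (end == std::string::npos)
      break;                                 // unterminated construct: copy as-is
    out += text.substr (pos, start - pos);   // text before the construct
    auto it = vars.find (text.substr (name_start, name_end - name_start));
    if (it != vars.end ())
      out += it->second;
    pos = end + close.size ();
  }
  out += text.substr (pos);                  // remaining text
  return out;
}
```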

Objects Description #

For a better understanding of what you can achieve with this small HTTP server, here is a brief description of methods provided by httpd and http_connection objects. Detailed descriptions can be found in the Doxygen generated documentation.

Methods of httpd object #

When an httpd object is created, the constructor takes as an argument the port number on which the server will listen. The port cannot be changed once the server has started. The constructor also takes an optional parameter that sets the maximum number of concurrent connections. If left at its default value, the number of connections is unlimited:

  httpd (unsigned short port=HTTPD_DEFAULT_PORT, unsigned int maxconn=0);

The port number can be retrieved at any time and it can be changed before the server is started:

  • unsigned short port () returns the port number where the server is listening
  • void port (unsigned short portnum) changes the port number where the server is listening. This is effective only before the server is started.

A group of methods allows you to control the HTTP headers that are transmitted in each response:

  • void add_ohdr (const char *hdr, const char *value) adds a new header with the given value.
  • void remove_ohdr (const char *hdr) removes an existing response header

The origin of the files sent by the server (docroot) can be set or changed using the docroot functions:

  • void docroot (const char *path) sets the current origin
  • const char* docroot () const returns current origin

The name of the default document can be set using the function:

  • void default_uri (const char *name) sets the default filename that will be sent if the request doesn’t contain a filename. By default it is index.html

The file structure, as seen by connecting clients, can also be changed using aliases. The function:

void add_alias (const char* uri, const char* path);

creates an alias for a given URI segment. It works by replacing uri string in the incoming request with path. For instance, if docroot is set to c:\local_folder\, after calling:

  add_alias ("doc", "documentation");

a URI like /doc/project1/filename.html will be mapped to c:\local_folder\documentation\project1\filename.html.
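The mapping described above can be reproduced with a short standalone sketch. The function map_uri is hypothetical and is not the server's implementation. It assumes that an alias matches only the first path segment and that docroot ends with a backslash.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Map a request URI to a Windows file path using docroot and an alias table.
// Assumptions: aliases match the first path segment only, and docroot
// ends with a backslash.
std::string map_uri (const std::string& docroot,
                     const std::map<std::string, std::string>& aliases,
                     const std::string& uri)
{
  // Drop the leading '/' and isolate the first segment
  std::string rel = (!uri.empty () && uri[0] == '/') ? uri.substr (1) : uri;
  size_t slash = rel.find ('/');
  std::string first = rel.substr (0, slash);

  auto it = aliases.find (first);
  if (it != aliases.end ())
    rel = it->second + (slash == std::string::npos ? std::string () : rel.substr (slash));

  std::replace (rel.begin (), rel.end (), '/', '\\');
  return docroot + rel;
}
```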

The server maintains a table of MIME types used to map between file extensions and MIME types. In turn, the MIME types are used to populate the value of Content-Type header. This table can be modified using:

  • void add_mime_type (const char *ext, const char *type, bool shtml=false) to add an additional MIME type
  • void delete_mime_type (const char *ext) to delete a MIME type

If the shtml parameter of the add_mime_type function is true, the files with that extension will be parsed as SHTML files and scanned for SSI echo constructs.

We have seen before the function add_var that adds application variables that can be accessed through SSI constructs. To avoid race conditions, access to all variables is protected by a critical section:

  • void acquire_varlock () enters the critical section that protects all variables
  • void release_varlock () leaves the critical section
  • bool try_varlock () tries to enter the critical section and returns true if successful.
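When the application updates several related variables, it may be convenient to wrap the lock in a small RAII helper so that release_varlock is always called. The sketch below is not part of mlib: varlock_guard and set_position are hypothetical names, and the template works with any type that provides acquire_varlock() and release_varlock().

```cpp
// Hypothetical RAII helper (not part of mlib): acquires the variables lock
// in the constructor and releases it in the destructor, so the lock is
// released even if the code in between returns early or throws.
template <typename Server>
class varlock_guard
{
public:
  explicit varlock_guard (Server& s) : srv (s) { srv.acquire_varlock (); }
  ~varlock_guard () { srv.release_varlock (); }
  varlock_guard (const varlock_guard&) = delete;
  varlock_guard& operator= (const varlock_guard&) = delete;

private:
  Server& srv;
};

// Update two related variables together, so that a page
// never shows one new value next to one old value
template <typename Server>
void set_position (Server& server, double& x, double& y,
                   double new_x, double new_y)
{
  varlock_guard<Server> lock (server);
  x = new_x;
  y = new_y;
}
```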

As a method of access control, the server provides basic user authentication with various ‘realms’ where different users are allowed access:

  • void add_realm (const char *realm, const char *uri) adds a realm and specifies the URIs covered by the realm. Any URI that starts with uri string is considered part of the realm.
  • bool add_user (const char *realm, const char *username, const char *pwd) adds a user to the list of users with access to a realm
  • bool remove_user(const char *realm, const char *username) removes a user from the list of those with access to a realm.
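The prefix rule used to decide whether a URI belongs to a realm can be sketched as follows. The names realm_table and find_realm are hypothetical. The choice of the longest matching prefix when several realms match is an assumption, since the behavior in that case is not described here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Each entry pairs a realm name with the URI prefix it covers
using realm_table = std::vector<std::pair<std::string, std::string>>;

// Return the realm covering 'uri', or an empty string if the URI is
// not protected. Assumption: when several prefixes match, the longest
// one wins. Empty prefixes are ignored.
std::string find_realm (const realm_table& realms, const std::string& uri)
{
  std::string best_realm;
  size_t best_len = 0;
  for (const auto& r : realms)
  {
    const std::string& prefix = r.second;
    bool matches = uri.compare (0, prefix.size (), prefix) == 0;
    if (matches && prefix.size () > best_len)
    {
      best_realm = r.first;
      best_len = prefix.size ();
    }
  }
  return best_realm;
}
```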

Methods of http_connection object #

The http_connection object has methods for accessing different parts of the request:

  • const char* get_uri () returns the whole URI (e.g. http://localhost:8080/author.cgi)
  • const char* get_method () returns the HTTP verb of the request (‘GET’, ‘POST’, etc.)
  • const char* get_query () returns the query part of the URI (everything after ‘?’ and before ‘#’)
  • const char* get_body () returns the body of the query (for POST requests)

Request and response headers are also available through a number of methods:

  • const char* get_ihdr (const char *hdr) returns the content of an input (received) header
  • const char* get_ohdr (const char *hdr) returns the content of an output (sent) header
  • const char* get_all_ihdr () returns all input (received) headers
  • void add_ohdr (const char *hdr, const char *value) adds a new output (sent) header or modifies the value of an existing one

URL-encoded queries can be parsed using the following methods:

  • bool has_qparam (const char* key) returns true if the query contains the specified parameter.
  • const std::string& get_qparam (const char* key) returns the value of a URL-encoded query parameter

Similarly, a URL-encoded request body can be parsed using the methods:

  • bool has_bparam (const char* key) returns true if the request body contains the specified parameter.
  • const std::string& get_bparam (const char* key) returns the value of a URL-encoded request body parameter
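For readers unfamiliar with the URL-encoded format, the sketch below shows how a query string or request body of the form key1=value1&key2=value2 can be split and decoded. It is a standalone illustration, not the code behind the methods listed above. The names url_decode and parse_urlencoded are hypothetical.

```cpp
#include <cctype>
#include <cstdlib>
#include <map>
#include <string>

// Decode %XX escapes and '+' (space) in a URL-encoded component
std::string url_decode (const std::string& s)
{
  std::string out;
  for (size_t i = 0; i < s.size (); ++i)
  {
    if (s[i] == '+')
      out += ' ';
    else if (s[i] == '%' && i + 2 < s.size ()
             && std::isxdigit ((unsigned char)s[i + 1])
             && std::isxdigit ((unsigned char)s[i + 2]))
    {
      out += (char)std::strtol (s.substr (i + 1, 2).c_str (), nullptr, 16);
      i += 2;
    }
    else
      out += s[i];
  }
  return out;
}

// Split "key1=value1&key2=value2" into a map of decoded keys and values.
// A key without '=' gets an empty value.
std::map<std::string, std::string> parse_urlencoded (const std::string& text)
{
  std::map<std::string, std::string> params;
  size_t pos = 0;
  while (pos <= text.size ())
  {
    size_t amp = text.find ('&', pos);
    if (amp == std::string::npos)
      amp = text.size ();
    std::string item = text.substr (pos, amp - pos);
    if (!item.empty ())
    {
      size_t eq = item.find ('=');
      std::string key = url_decode (item.substr (0, eq));
      std::string value = (eq == std::string::npos)
                            ? std::string ()
                            : url_decode (item.substr (eq + 1));
      params[key] = value;
    }
    pos = amp + 1;
  }
  return params;
}
```

Note that the text is split on '&' before decoding, so an encoded ampersand (%26) inside a value is preserved.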

Conclusion #

You now have a small and flexible HTTP server that you can easily integrate in your applications. The last chapter of this series will show you the “JSON bridge” and how it can make it even easier to integrate web pages in your application.