Understand WSGI by building a microframework

When you’re learning web development in Python, it’s tempting to go straight for higher-level frameworks like Django and Flask that abstract the interactions between the web server and the application. While this is certainly a productive way of getting started, it’s a good idea to go back to the lower level at some point so you understand what these frameworks are doing for you. In this post, you will learn about the Web Server Gateway Interface (WSGI) – the standard interface between web servers like nginx and Apache and Python applications. You’ll do that by working from a simple “Hello, world!” WSGI example up to a microframework that supports Flask-like URL routing with decorators and templating, and lets you write your application logic inside simple Django-like controller functions.

The simplest WSGI application

Take a look at this “Hello, world!” code:
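A minimal sketch of such an application, assuming Python 3 (where the response body has to be bytes rather than str), could look like this. The serve helper and its default host and port are illustrative choices, not something WSGI requires.

```python
from wsgiref.simple_server import make_server


def hello_world(environ, start_response):
    """A complete WSGI application: a callable taking environ and start_response."""
    # Start the response by sending the status line and the headers.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    # The body is an iterable of byte strings (Python 3 requires bytes, not str).
    return [b"Hello, world!"]


def serve(host="localhost", port=8000):
    """Serve hello_world with the reference server until interrupted."""
    server = make_server(host, port, hello_world)
    server.serve_forever()
```

To start the server when the file is run as a script, add a call to serve() as the last line.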

The make_server function imported from wsgiref.simple_server is part of the reference WSGI implementation included in the Python standard library. It returns a web server instance that you can start by calling its serve_forever method. make_server takes three arguments: the hostname, the port and the WSGI application itself. In this case, the hello_world function is the WSGI application. WSGI applications have to be Python callables, i.e. either functions or instances of classes that implement a __call__ method.

Now let’s take a look at hello_world. You’ll notice that it has two arguments: environ and start_response. environ is a dictionary that holds information about the execution environment and the request, such as the path, the query string, the body contents and HTTP headers. start_response is a function that starts the HTTP response by writing out the status code and the headers.

Finally, the function returns a list with a single string in it (a byte string in Python 3). This is because the return value of a WSGI application must be an iterable. (Strings are iterable too, of course, but iterating over a string and writing it out one character at a time is pretty slow.)

Save this code in a file called helloworld.py and run it. Then open a browser and go to http://localhost:8000. You should see “Hello, world!”. (If you don’t, check that you don’t have anything else running on port 8000.)

What the framework will look like

So far, the application is very limited. In your browser, go to http://localhost:8000/helloworld/. You will see “Hello, world!” again. As it stands, the application returns the same response for every path you try. It would be much nicer to be able to write something like this (MicroFramework is the framework class, which you will see later):

You’ll learn about the route decorators in a moment. For now, just take a look at what is happening in the home and hello_world controller functions.

The controller functions take a Request object as a parameter and return a Response object. This is much cleaner than pulling everything out of the WSGI environ dictionary, calling start_response and setting headers manually every time.

Here is the Request class. It’s just a wrapper that extracts information from the environ dictionary and makes it accessible in a more convenient way:
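What follows is a minimal sketch of such a wrapper, assuming Python 3. The attribute names headers, params and GET come from this post; path, method and POST, the exact layout of the class and the hand-written environ at the end are assumptions made for illustration.

```python
import io
from urllib.parse import parse_qs


class Request:
    """Convenience wrapper around the WSGI environ dictionary."""

    def __init__(self, environ):
        self.environ = environ
        self.path = environ.get("PATH_INFO", "/")
        self.method = environ.get("REQUEST_METHOD", "GET").upper()
        # Request headers arrive as HTTP_* keys; strip the prefix.
        self.headers = {
            key[5:]: value
            for key, value in environ.items()
            if key.startswith("HTTP_")
        }
        # Filled in later with the capture groups of the matching route.
        self.params = []
        # parse_qs maps each key to a list of values.
        self.GET = parse_qs(environ.get("QUERY_STRING", ""))
        self.POST = {}
        if self.method == "POST":
            # CONTENT_LENGTH may be missing or empty; treat that as no body.
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            self.POST = parse_qs(body.decode("utf-8"))


# A hand-written environ standing in for what a WSGI server would pass.
environ = {
    "REQUEST_METHOD": "POST",
    "PATH_INFO": "/hello/",
    "QUERY_STRING": "lang=en",
    "CONTENT_LENGTH": "10",
    "HTTP_ACCEPT": "text/html",
    "wsgi.input": io.BytesIO(b"name=World"),
}
request = Request(environ)
print(request.path, request.method)  # /hello/ POST
print(request.headers)               # {'ACCEPT': 'text/html'}
print(request.GET, request.POST)     # {'lang': ['en']} {'name': ['World']}
```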

The Request class

Most of the code in this class is just extracting keys from environ, but a few lines deserve a special mention. HTTP headers sent with the request are stored in environ in keys with the “HTTP_” prefix, so you can use a dict comprehension to extract them and store them in self.headers:

GET and POST data

Parsing the query string to extract GET and POST data is also interesting. The urlparse module in the standard library (urllib.parse in Python 3) contains a function called parse_qs that takes a standard HTTP query string in the format key1=value1&key2=value2 (without the leading question mark) and converts it into a dictionary that maps each key to a list of values. To extract GET data and store it in self.GET, you can call parse_qs(environ["QUERY_STRING"]).

Extracting POST data is a bit more complicated as it is contained in the body of the HTTP request. First, you have to check if the HTTP method is POST, then read the content length, then get the POST query string by reading that number of bytes from the input stream. Finally, you call parse_qs on the query string.
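The two steps described above can be sketched as a pair of helper functions. This assumes Python 3, where parse_qs lives in urllib.parse and the request body is read as bytes. parse_get and parse_post are hypothetical names, and the environ is hand-written for the example.

```python
import io
from urllib.parse import parse_qs


def parse_get(environ):
    """Return the query-string parameters as a dict of lists."""
    return parse_qs(environ.get("QUERY_STRING", ""))


def parse_post(environ):
    """Return form fields from a POST body, or {} for other methods."""
    if environ.get("REQUEST_METHOD") != "POST":
        return {}
    # CONTENT_LENGTH may be missing or empty; treat that as an empty body.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = environ["wsgi.input"].read(length)
    return parse_qs(raw.decode("utf-8"))


environ = {
    "REQUEST_METHOD": "POST",
    "QUERY_STRING": "key1=value1&key2=value2&key1=other",
    "CONTENT_LENGTH": "9",
    "wsgi.input": io.BytesIO(b"name=Anna"),
}
print(parse_get(environ))   # {'key1': ['value1', 'other'], 'key2': ['value2']}
print(parse_post(environ))  # {'name': ['Anna']}
```

Note that a repeated key such as key1 keeps all of its values, which is why every value is a list.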

With the Request class in place, you can instantiate a request object from the environ dictionary, for example with request = Request(environ).

The Response class

What about Response? It follows a similar principle:
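A minimal sketch of what such a class could look like is given here. The wsgi_headers property and the “200 OK” default come from this post; the keyword argument names and the default Content-Type header are assumptions made for illustration.

```python
class Response:
    """An HTTP response: body, status line and headers."""

    def __init__(self, body="", status="200 OK", headers=None):
        self.body = body
        self.status = status
        # Headers are kept as a dict because that is easier to manipulate.
        self.headers = {"Content-Type": "text/html; charset=utf-8"}
        self.headers.update(headers or {})

    @property
    def wsgi_headers(self):
        """Headers as the list of (name, value) tuples start_response expects."""
        return list(self.headers.items())
```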

As in the Request class, most of the code here is just for convenience. You can assume that the default response code will be “200 OK” and allow the code to be set manually when it differs.

It is more natural to manipulate response headers as a dictionary, but they need to be passed to start_response as a list of tuples, so the wsgi_headers method, with the @property decorator, returns them in that structure. You will see how this is used when you take a look at the __call__ method in the framework class.

Now you can build different types of responses by passing different keyword arguments. How about a normal “200 OK” response?

What about a “404 Not Found”?

Or what if an error occurs?

And what if you want to issue a redirect? It is as simple as setting the “302” status code and adding a location header.
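The four cases above can be sketched as follows. The Response class here is a trimmed-down stand-in so that the snippet runs on its own, and the keyword argument names (body, status, headers), the page contents and the redirect target are assumptions.

```python
class Response:
    """Trimmed-down stand-in for the Response class described above."""

    def __init__(self, body="", status="200 OK", headers=None):
        self.body = body
        self.status = status
        self.headers = dict(headers or {})


# A normal response: the status defaults to "200 OK".
ok = Response(body="<h1>Hello, world!</h1>")

# The requested path does not exist.
not_found = Response(body="<h1>Not found</h1>", status="404 Not Found")

# Something went wrong while handling the request.
error = Response(body="<h1>Internal server error</h1>",
                 status="500 Internal Server Error")

# A redirect: a 302 status plus a Location header, no body needed.
redirect = Response(status="302 Found", headers={"Location": "/hello/"})
```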

The MicroFramework class

This is where it all comes together:
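Here is a minimal sketch of how the framework class could be put together from the pieces described in this post, assuming Python 3. Request and Response are trimmed-down stand-ins so that the listing runs on its own. Details such as compiling the regexes when routes are registered and the wording of the error pages are assumptions.

```python
import re
import traceback


class Request:
    """Trimmed-down stand-in: just the fields the dispatcher needs."""

    def __init__(self, environ):
        self.path = environ.get("PATH_INFO", "/")
        self.method = environ.get("REQUEST_METHOD", "GET")
        self.params = []  # regex capture groups, filled in by dispatch()


class Response:
    """Trimmed-down stand-in for the Response class."""

    def __init__(self, body="", status="200 OK", headers=None):
        self.body = body
        self.status = status
        self.headers = {"Content-Type": "text/html; charset=utf-8"}
        self.headers.update(headers or {})

    @property
    def wsgi_headers(self):
        return list(self.headers.items())


class MicroFramework:
    def __init__(self):
        # Maps compiled route regexes to controller functions.
        self.routes = {}

    def route(self, pattern):
        """Decorator that registers a controller for a path regex."""
        def decorator(controller):
            self.routes[re.compile(pattern)] = controller
            return controller
        return decorator

    def dispatch(self, request):
        """Call the first controller whose regex matches the request path."""
        for regex, controller in self.routes.items():
            match = regex.match(request.path)
            if match is None:
                continue
            request.params = list(match.groups())
            try:
                return controller(request)
            except Exception:
                # Print the stack trace and answer with a 500 page.
                traceback.print_exc()
                return self.internal_error()
        return self.not_found()

    @staticmethod
    def not_found():
        return Response("<h1>Not found</h1>", status="404 Not Found")

    @staticmethod
    def internal_error():
        return Response("<h1>Internal server error</h1>",
                        status="500 Internal Server Error")

    def __call__(self, environ, start_response):
        """WSGI entry point: build a Request, dispatch it, send the Response."""
        response = self.dispatch(Request(environ))
        start_response(response.status, response.wsgi_headers)
        return [response.body.encode("utf-8")]


app = MicroFramework()


@app.route(r"^/$")
def home(request):
    return Response("<h1>Home</h1>")
```

Because the instance defines __call__, app is itself a WSGI application and can be handed to make_server in place of a plain function.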

Look at the __call__ method. It gets back an instance of Response by first building a Request object based on environ and passing that to the framework’s dispatch method. Then it calls start_response with the status code and the headers from the response. Notice how wsgi_headers is used.

Dispatching requests to controllers

The next interesting thing in the framework class is the dispatch method. In the constructor, the self.routes dictionary is initialized. It contains a mapping from regular expressions that represent request paths to controller functions. The method iterates through the regular expressions until it finds one that matches the request path, then it calls the associated controller function and returns the response from it to __call__. If no route matches the path, it returns a “404” response generated by the not_found static method.

If an error occurs while executing the controller function, the framework grabs the stack trace, prints it, and returns a “500” response generated by the internal_error static method.

The route decorator

How do routes get into the self.routes dictionary in the first place? That’s where the route decorator comes in. All it does is add a mapping from the regex provided as an argument to the decorator to the controller function itself. The regexes can also contain capture groups that are stored in the request.params list and made available to controller functions, as in the following example:

In this route regex, the capture group is a sequence of one or more numeric characters. The question mark (?) after the last slash makes the slash optional.
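To see how a route regex of this kind behaves, here is a small sketch using the standard re module. The pattern and the match_route helper are hypothetical; they only mirror the description above (digits in a capture group, optional trailing slash).

```python
import re

# Hypothetical route: "/hello/" followed by digits, trailing slash optional.
PATTERN = re.compile(r"^/hello/(\d+)/?$")


def match_route(pattern, path):
    """Return the capture groups as a list, or None if the path does not match."""
    match = pattern.match(path)
    return list(match.groups()) if match else None


print(match_route(PATTERN, "/hello/42"))    # ['42']
print(match_route(PATTERN, "/hello/42/"))   # ['42']
print(match_route(PATTERN, "/hello/"))      # None
print(match_route(PATTERN, "/hello/abc/"))  # None
```

The list returned for a matching path is what a controller would find in request.params. Capture groups always come back as strings, so a controller that needs a number has to convert it with int.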

Integrating Jinja2

So far, the framework has no support for templating, but it is easy to integrate Jinja2 or any other template engine. Here is a simple example, which assumes that you have a template directory called “templates” with a file called “helloworld.html” in it.
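One possible way to wire this up is sketched here, assuming the jinja2 package is installed. make_renderer is a hypothetical helper, not part of Jinja2, and the name variable passed to the template is just an example.

```python
from jinja2 import Environment, FileSystemLoader


def make_renderer(template_dir="templates"):
    """Return a function that renders templates found in template_dir."""
    # One Environment is created up front and reused for every render call.
    env = Environment(loader=FileSystemLoader(template_dir), autoescape=True)

    def render(template_name, **context):
        # Look the template up by file name and fill in the context variables.
        return env.get_template(template_name).render(**context)

    return render


render = make_renderer("templates")
# Inside a controller function, the rendered string becomes the response body:
#     return Response(render("helloworld.html", name="World"))
```

With a helloworld.html containing, say, Hello, {{ name }}! the call in the comment would produce “Hello, World!” as the body.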

This post showed you how to build a super simple web framework in Python, but this is just the beginning. For instance, the framework doesn’t support cookies or sessions, and there is no database access facility. Caching, form handling and other niceties that Django and Flask provide either out of the box or as plugins would also need to be added to turn this into a fully featured framework.

Why don’t you try to implement some of these features yourself?
