Part 8 - Our own WSGI framework
(The changes introduced in this post start here.)
In our quest to replace all the parts of a Python web application with our own lackluster implementations, we have now come to the grand WSGI finale: a WSGI application. Or rather: a WSGI framework, which we’ll then build a WSGI application with.
We’ve seen a super simple WSGI application before. It can be as easy as a function that takes the environ dict and the start_response callable as inputs, creates some useful response, calls start_response with the response status and headers, and finally returns an iterable with the (optional) response body.
def application(environ, start_response):
    status = "200 OK"
    response_headers = [("Content-type", "text/plain")]
    start_response(status, response_headers)
    return [b"Hello World!"]
It’s not too hard to imagine how you could gradually extend that to build up quite complex web applications.
def application(environ, start_response):
    if environ["PATH_INFO"] == "/":
        status = "200 OK"
        headers = [("Content-type", "text/plain")]
        body = b"Hello from /"
    elif environ["PATH_INFO"] == "/create" and environ["REQUEST_METHOD"] == "POST":
        status = "200 OK"
        headers = [("Content-type", "text/plain")]
        data = environ["wsgi.input"].read()
        body = b"Hello from /create called with data " + data
    else:
        status = "404 NOT FOUND"
        headers = [("Content-type", "text/plain")]
        body = b""
    start_response(status, headers)
    return [body]
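A function like this can be tried out without starting a server at all: call it by hand with a made-up environ dict and a stand-in start_response. The following is only a minimal sketch. The call_app helper is hypothetical, and the environ it builds contains just the three keys this example reads, far fewer than a real WSGI server provides.

```python
import io


def application(environ, start_response):
    # Condensed version of the routing function shown above
    if environ["PATH_INFO"] == "/":
        status, body = "200 OK", b"Hello from /"
    elif environ["PATH_INFO"] == "/create" and environ["REQUEST_METHOD"] == "POST":
        data = environ["wsgi.input"].read()
        status, body = "200 OK", b"Hello from /create called with data " + data
    else:
        status, body = "404 NOT FOUND", b""
    start_response(status, [("Content-type", "text/plain")])
    return [body]


def call_app(app, method, path, data=b""):
    """Hypothetical helper: call a WSGI app directly and collect its response."""
    captured = {}

    def start_response(status, headers):
        # A real server would start writing the response here; we just record it
        captured["status"] = status
        captured["headers"] = headers

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "wsgi.input": io.BytesIO(data),
    }
    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "GET", "/"))
print(call_app(application, "POST", "/create", b"abc"))
print(call_app(application, "GET", "/nope"))
```

Calling the application this way makes the contract visible: the status and headers travel through start_response, while the body comes back as the return value.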
It’s also not too hard to see how this could gradually get you into trouble. First of all, you need to have intimate knowledge of the WSGI specification, which shouldn’t really be necessary for someone just wanting to build a web application. Also, you’ll want to have an easier way to access the request information and define endpoints. So, sooner or later you’ll define abstractions on top of this basic WSGI application pattern to make your life easier and keep the code maintainable.
The job of a WSGI framework
In order to prevent people from constantly reinventing the wheel, WSGI application frameworks exist. They help you focus on writing the actual business logic of your web application and abstract away the more low-level parts of interacting with a WSGI server.
The example app above, in a framework like flask, would look much simpler and more readable:
from flask import Flask, request

app = Flask(__name__)


@app.route("/")
def root():
    return "Hello from /"


@app.route("/create", methods=["POST"])
def create():
    return f"Hello from /create with body data {request.data}"
In that example you can already see some of the jobs the framework is doing:
- gives you an easy way to define path operations (as decorated functions)
  - a path operation is a behavior that should happen when calling a certain path/endpoint on the application
- lets you limit the allowed HTTP methods on a certain path operation
- gives you much easier access to the request information
- automatically converts the path operation function return to a correct HTTP response
  - uses a default status and headers if not specified otherwise
  - return value of the path operation function becomes response body
- returns a 404 NOT FOUND response when trying to call unavailable path operations
There are also some other things a WSGI framework usually does that can’t be seen in this tiny example, e.g.:
- provides other kinds of responses than just plain text and adjusts the headers accordingly
- lets you manually set the response status and headers
- lets you register error handlers, so that an exception raised in the application code results in a specific HTTP response being sent back to the client
- provides some way to modularize bigger codebases (e.g. via Blueprint in flask)
- provides abstractions for more advanced HTTP features like cookies and sessions or authentication flows
In this post we’ll build a tiny flask-like WSGI application framework that ticks off some, but not all, of these boxes.
Some preparations in the codebase
To start with, let’s move all server-related files into a package ./wsgi/server/ and add __init__.py files to both levels of that new module structure. In all server files we now also need to change the local imports to relative imports. We also turn the serve_forever() function in ./wsgi/server/server.py into a WSGIServer class with a serve_forever() method.
#./wsgi/server/server.py
...
class WSGIServer:
    def __init__(self, host: str, port: int, app):
        self.host = host
        self.port = port
        self.app = app

    def serve_forever(self):
        server_socket = socket.socket()
        server_socket.bind((self.host, self.port))
        server_socket.listen(1)
        while True:
            client_socket, address = server_socket.accept()
            print(f"Socket established with {address}.")
            session = Session(client_socket, address, self.app)
            t = threading.Thread(target=session.run)
            t.start()
As you can see, the WSGIServer now accepts an app input, which is how we’ll provide the WSGI application from now on.
We also remove the script part of ./wsgi/server/server.py that was used to start a test server and instead put that into a top-level ./run.py script. In there we also put what was previously in app.py (the test flask app) to have a combined script that defines a little WSGI application and starts serving it through our WSGI server.
#./run.py
from flask import Flask, request
from wsgi.server import WSGIServer

app = Flask(__name__)


@app.route("/", methods=["GET"])
def root():
    print("Called root endpoint.")
    return "hello from /"


@app.route("/create", methods=["POST"])
def create():
    print(f"Called create endpoint with data {request.data}.")
    return "hello from /create"


if __name__ == "__main__":
    server = WSGIServer("127.0.0.1", 5000, app)
    server.serve_forever()
The package can be installed via pip install -e . through a rudimentary new top-level ./setup.py.
#./setup.py
from setuptools import setup, find_packages

setup(
    name="wsgi",
    description="A tutorial implementation of a WSGI server and application.",
    version="0.0.1",
    packages=find_packages(),
)
After all those changes you can test that you get the previous behavior by running python run.py.
The details of this refactor are probably best followed in this commit.
Easily register path operations on the app
Now let’s start with actually creating the application framework. We create a new package ./wsgi/application/, containing an __init__.py and a new application.py file. The latter holds our framework skeleton:
from typing import Callable
from dataclasses import dataclass


class WSGIApplication:
    def __init__(self):
        self.path_operations = dict()

    def _register_path_operation(
        self, path: str, http_method: str, func: Callable
    ):
        po = PathOperation(path, http_method)
        self.path_operations[po] = func

    def _create_register_decorator(self, path: str, http_method: str):
        def decorator(func: Callable):
            self._register_path_operation(path, http_method, func)
            return func

        return decorator

    def get(self, path: str):
        return self._create_register_decorator(path, "GET")

    def post(self, path: str):
        return self._create_register_decorator(path, "POST")

    # enable the class instance to be used as a callable
    def __call__(self, environ, start_response):
        po = PathOperation(environ["PATH_INFO"], environ["REQUEST_METHOD"])
        func = self.path_operations.get(po)
        if func is None:
            status = "404 NOT FOUND"
            headers = [("Content-type", "text/plain")]
            body = b""
        else:
            status = "200 OK"
            headers = [("Content-type", "text/plain")]
            body = func().encode("utf-8")

        start_response(status, headers)
        return [body]


# frozen so that instances are hashable and can be used as keys in the path operation dict
@dataclass(frozen=True, eq=True)
class PathOperation:
    path: str
    http_method: str
In contrast to the super simple, function-based WSGI applications we’ve seen so far, this one is implemented as a class. You can see the familiar application interface in the __call__ magic method. You can register new path operation functions on the app using get and post as parametrized decorators (functions that return a decorator). This is, of course, slightly different from the flask way, where a single path operation function could be in charge of responding to all kinds of HTTP methods. But I like it better this way.
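If the parametrized decorator pattern is unfamiliar, it may help to look at it in isolation. The sketch below is not part of the framework; the registry dict and the register function are made-up names that only illustrate the mechanism.

```python
registry = {}


def register(name):
    """Parametrized decorator: calling register(name) returns the actual decorator."""

    def decorator(func):
        registry[name] = func
        # Return the function unchanged so it can still be called directly
        return func

    return decorator


@register("greet")
def greet():
    return "hi"


# The decorator syntax above is shorthand for: greet = register("greet")(greet)
print(registry["greet"]())
```

The outer call captures the parameters (here name, in the framework path and http_method), and the inner function receives the decorated function and stores it.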
On registering a new function, it is simply put into a dict that maps from a unique path+http_method combination to the actual function to be called. When the application is called, it checks whether the request path+http_method matches any registered path operation function. If not, it returns a 404 NOT FOUND response; if yes, it calls the path operation function and returns a 200 OK plain-text response with the function return as the body.
Technically we should be returning a 405 METHOD NOT ALLOWED error for a request to an existing path but with the wrong HTTP method. But we want to keep it very simple here, so just be aware that this is not 100% correct behavior.
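For the curious, telling the two cases apart does not take much. The following is a rough sketch and not part of the framework we are building: resolve is a hypothetical helper, and plain (path, method) tuples stand in for PathOperation to keep the example self-contained.

```python
def resolve(path_operations, path, http_method):
    """Pick a status line and handler, telling 404 and 405 apart."""
    func = path_operations.get((path, http_method))
    if func is not None:
        return "200 OK", func
    # The path is registered, just not for this method
    if any(known_path == path for known_path, _ in path_operations):
        return "405 METHOD NOT ALLOWED", None
    return "404 NOT FOUND", None


# Plain (path, method) tuples stand in for PathOperation here
operations = {
    ("/", "GET"): lambda: "root",
    ("/create", "POST"): lambda: "created",
}

print(resolve(operations, "/create", "POST")[0])
print(resolve(operations, "/create", "GET")[0])
print(resolve(operations, "/missing", "GET")[0])
```

A complete 405 response would also carry an Allow header listing the methods the path does support, which this sketch leaves out.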
We can now already replace flask in our test application with our own little mini-framework. Change the top-level ./run.py to the following:
#./run.py
from wsgi.server import WSGIServer
from wsgi.application import WSGIApplication

app = WSGIApplication()


@app.get("/")
def root():
    print("Called root endpoint.")
    return "hello from /"


@app.post("/create")
def create():
    print("Called /create endpoint.")
    return "hello from /create"


if __name__ == "__main__":
    server = WSGIServer("127.0.0.1", 5000, app)
    server.serve_forever()
This all looks very familiar. You can test the behavior using curl.
Bundling request data in a single object
One major thing that is not possible yet is to actually get access to the request data (e.g. query parameters, the request body, headers, …) in order to be able to act on it.
Let’s solve this with a new Request class in ./wsgi/application/request.py.
#./wsgi/application/request.py
from typing import Dict
from dataclasses import dataclass


@dataclass
class Request:
    query: Dict[str, str]
    body: bytes
    headers: Dict[str, str]

    @classmethod
    def from_environ(cls, environ: Dict):
        query = {}
        if environ["QUERY_STRING"]:
            qs = environ["QUERY_STRING"]
            # split on the first "=" only, so values may contain "=" themselves
            query = dict(entry.split("=", 1) for entry in qs.split("&"))
        body = environ["wsgi.input"].read()
        headers = {
            # strip only the leading "HTTP_" prefix from the key
            k.replace("HTTP_", "", 1): v
            for k, v in environ.items()
            if k.startswith("HTTP_")
        }
        return cls(query, body, headers)
Nothing overly fancy here. Just a little dataclass with a factory method that allows it to be instantiated based on an environ dict. It processes the query string, the body and the headers and saves them in the specified format.
An instance of this class is created in the application every time the application is called (i.e. on every new HTTP request to the server). The object is then passed into the path operation function. So we need some minor changes both in the application framework and in the path operation functions we register.
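The query string handling in from_environ is deliberately naive: it does no percent-decoding and fails on a parameter without an "=". As a hedged alternative, the standard library can do the parsing. The sketch below uses a hypothetical parse_environ helper rather than the Request class itself, and it assumes the server sets CONTENT_LENGTH, so that only that many bytes are read from the input stream.

```python
import io
from urllib.parse import parse_qsl


def parse_environ(environ):
    """Hypothetical helper: extract query, body and headers from a WSGI environ."""
    # parse_qsl percent-decodes values and tolerates parameters without "="
    query = dict(parse_qsl(environ.get("QUERY_STRING", ""), keep_blank_values=True))
    # Read only as many bytes as the client announced
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length) if length else b""
    headers = {
        k[len("HTTP_"):]: v for k, v in environ.items() if k.startswith("HTTP_")
    }
    return query, body, headers


# A made-up environ with just the keys used above
environ = {
    "QUERY_STRING": "a=1&b=x%3Dy&flag",
    "CONTENT_LENGTH": "5",
    "wsgi.input": io.BytesIO(b"hello, trailing bytes"),
    "HTTP_USER_AGENT": "curl/8.0",
}
query, body, headers = parse_environ(environ)
print(query, body, headers)
```

Note that collapsing the pairs into a dict keeps only the last value of a repeated parameter, which is the same simplification the Request class makes.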
The following needs to be changed in ./wsgi/application/application.py.
#./wsgi/application/application.py
@@ -1,5 +1,6 @@
 from typing import Callable
 from dataclasses import dataclass
+from .request import Request


 class WSGIApplication:
@@ -33,9 +34,10 @@ class WSGIApplication:
             headers = [("Content-type", "text/plain")]
             body = b""
         else:
+            request = Request.from_environ(environ)
             status = "200 OK"
             headers = [("Content-type", "text/plain")]
-            body = func().encode("utf-8")
+            body = func(request=request).encode("utf-8")

         start_response(status, headers)
         return [body]
And the path operation functions in ./run.py need to change like this:
#./run.py
@@ -1,19 +1,17 @@
 from wsgi.server import WSGIServer
-from wsgi.application import WSGIApplication
+from wsgi.application import WSGIApplication, Request

 app = WSGIApplication()


 @app.get("/")
-def root():
-    print("Called root endpoint.")
-    return "hello from /"
+def root(request: Request):
+    return f"hello from / with query {request.query}"


 @app.post("/create")
-def create():
-    print(f"Called /create endpoint.")
-    return "hello from /create"
+def create(request: Request):
+    return f"hello from /create with request body {request.body}"
As you can see, the request object is passed right into the path operation functions. This is a reasonable approach for our mini-framework that does the job in a comprehensible way. As a comparison: in flask, access to the request data is provided through a context-local request object. This is a seemingly global object (you import it directly from the flask module), but when you access the object it actually proxies to an instance that is unique to the specific concurrency unit (e.g. a thread) that your application is currently running in. I’ve always found that to be a bit too “magical” for my taste, so let’s stick with our approach.
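To make the idea of a context-local object a bit more tangible, here is a small sketch using the standard library’s contextvars module. It is not flask’s actual implementation, only an illustration of the concept; current_request, handle and view are made-up names.

```python
import threading
from contextvars import ContextVar

# Hypothetical stand-in for a context-local request object
current_request = ContextVar("current_request")

results = {}


def view():
    # Looks like global access, but resolves to the value set in the caller's context
    return current_request.get()


def handle(name):
    # Each thread sets its own value; other threads never see it
    current_request.set(f"request of {name}")
    results[name] = view()


threads = [threading.Thread(target=handle, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

The view function never receives the request as an argument, yet each thread reads back exactly the value it set itself. That convenience is also where the “magic” comes from.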
The request object stores information about the request query parameters, the request body and the request headers. One thing we haven’t implemented is a way to handle path parameters. Those are parameters in paths like /items/{id}, where {id} could be any value out of some permissible set. Implementing that would also make our path operation matching in the app a bit more complicated, so we’ll skip that.
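Just to give an impression of what that matching could involve, here is one possible approach based on regular expressions. This is a sketch under assumptions, not something our framework uses: compile_path and match_path are hypothetical helpers, and a parameter is assumed to match any non-empty run of characters without a slash.

```python
import re


def compile_path(template):
    """Turn a template like '/items/{id}' into a regex with named groups."""
    # re.split with one capture group alternates literal text and parameter names
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(pattern)


def match_path(template, path):
    """Return the extracted path parameters, or None if the path does not match."""
    m = compile_path(template).fullmatch(path)
    return m.groupdict() if m else None


print(match_path("/items/{id}", "/items/42"))
print(match_path("/items/{id}", "/items/42/extra"))
print(match_path("/users/{user}/posts/{post}", "/users/ann/posts/7"))
```

With templates like these, a plain dict lookup no longer works for finding the path operation: the app would have to try the registered templates one by one until one matches.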
Implementing different response types
One other thing that would be nice to have is a way to abstract the construction of the responses. Every response has a status, a couple of headers and (optionally) a response body. What differs between responses is usually the MIME type they declare for their body. We want a convenient way to define three different kinds of responses: a plain-text, an HTML and a JSON response.
Let’s do that in a new file ./wsgi/application/response.py.
#./wsgi/application/response.py
import json
from typing import List, Tuple, Optional, Any


class BaseResponse:
    def __init__(
        self,
        status: str = "200 OK",
        headers: Optional[List[Tuple[str, str]]] = None,
        body: Optional[Any] = None,
    ):
        self.status = status
        self.headers = headers if headers is not None else []
        self.body = self.body_conversion(body) if body is not None else b""
        self.add_content_type_and_content_length()

    def add_content_type_and_content_length(self):
        header_names = {name for name, value in self.headers}
        if "Content-Type" not in header_names:
            self.headers.append(("Content-Type", self.content_type))
        if self.body and "Content-Length" not in header_names:
            self.headers.append(("Content-Length", str(len(self.body))))


class PlainTextResponse(BaseResponse):
    content_type = "text/plain"

    @classmethod
    def body_conversion(cls, body):
        return body.encode("utf-8")


class HTMLResponse(BaseResponse):
    content_type = "text/html"

    @classmethod
    def body_conversion(cls, body):
        return body.encode("utf-8")


class JSONResponse(BaseResponse):
    content_type = "application/json"

    @classmethod
    def body_conversion(cls, body):
        return json.dumps(body).encode("utf-8")
They all derive from a common BaseResponse, with the only differences being the Content-Type header and how they convert their body to bytes. The add_content_type_and_content_length() method is in charge of automatically setting the Content-Type header and also determining the body length and setting it as the Content-Length header.
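One detail worth spelling out: the Content-Length has to be the number of bytes in the encoded body, not the number of characters in the original string. The two only coincide for pure ASCII text, as this small illustration (with a made-up body) shows.

```python
body_text = "héllo wörld"
body_bytes = body_text.encode("utf-8")

# "é" and "ö" each take two bytes in UTF-8, so the byte count is larger
print(len(body_text))   # number of characters
print(len(body_bytes))  # number of bytes, which is what Content-Length must state

headers = [
    ("Content-Type", "text/plain"),
    ("Content-Length", str(len(body_bytes))),
]
print(headers)
```

This is why the response classes convert the body first and only then compute the length from self.body.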
Now we can simplify our framework in ./wsgi/application/application.py.
#./wsgi/application/application.py
@@ -1,6 +1,7 @@
 from typing import Callable
 from dataclasses import dataclass
 from .request import Request
+from .response import PlainTextResponse, BaseResponse


 class WSGIApplication:
@@ -30,16 +31,16 @@ class WSGIApplication:
         po = PathOperation(environ["PATH_INFO"], environ["REQUEST_METHOD"])
         func = self.path_operations.get(po)
         if func is None:
-            status = "404 NOT FOUND"
-            headers = [("Content-type", "text/plain")]
-            body = b""
+            response = PlainTextResponse(status="404 NOT FOUND")
         else:
             request = Request.from_environ(environ)
-            status = "200 OK"
-            headers = [("Content-type", "text/plain")]
-            body = func(request=request).encode("utf-8")
-        start_response(status, headers)
-        return [body]
+            ret = func(request=request)
+            if isinstance(ret, BaseResponse):
+                response = ret
+            else:
+                response = PlainTextResponse(body=ret)
+        start_response(response.status, response.headers)
+        return [response.body]
Defining a correct response is a lot less fiddly and much more expressive now. By default, the return value of a path operation function is wrapped in a PlainTextResponse. But we can now use the response types directly in a path operation function if we don’t want the default type. Let’s do that in ./run.py and define a new path operation function that returns a JSONResponse.
#./run.py
from wsgi.server import WSGIServer
from wsgi.application import WSGIApplication, Request
from wsgi.application.response import JSONResponse

app = WSGIApplication()


@app.get("/")
def root(request: Request):
    return f"hello from / with query {request.query}"


@app.post("/create")
def create(request: Request):
    return f"hello from /create with request body {request.body}"


@app.post("/some_json")
def some_json(request: Request):
    return JSONResponse(body={"abc": "def"})


if __name__ == "__main__":
    server = WSGIServer("127.0.0.1", 5000, app)
    server.serve_forever()
You can test this with curl to see that you’re getting the correct payload and headers back.
$ curl localhost:5000/some_json -X POST -i
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 14

{"abc": "def"}
Notes
And that’s as far as we want to go with this. Our little framework is already pretty potent; adding any further features is left as an exercise to the reader.
This also concludes our treatment of WSGI. We’ve seen how we can implement a WSGI server all the way up from a pretty low-level socket-based TCP server. We’ve seen how threads can be used to add concurrency to the server. We’ve looked at the interface between a WSGI server and a WSGI application. And we’ve now even created a tiny framework that allows us to quickly build WSGI applications without really having to know what WSGI even is.
It might seem that this setup lets us build and deploy any imaginable web application. But if you remember the initial post in this series, you’ll recall that there is a successor specification to WSGI called ASGI. We’ll look at the motivations for it in the next post.