Sam Hooke

Django REST framework performance (part 1: profiling)

All notes in this series:

This is part 1 in a series about how to improve the performance of Django REST framework. I plan to publish a new note each Friday.

Introduction

As tempting as it may be, we cannot consider Django REST framework in isolation, but must also delve into the database being used as the backend. The database is not a dumb container that you can just shove data into and out of willy-nilly.

Well, you can, but sooner or later you’ll find that the performance of your REST API becomes an issue. While Django REST framework is a great piece of software which I am very grateful for, it does not magically use the database in the most efficient manner for your particular use case, and there are pitfalls that are all too easy to succumb to (spoiler alert: N+1 queries!).

While most examples in these notes will focus on using MySQL as the backend, the same points will generally translate to other database backends.

If you stop reading here, at least remember this:

If you treat your database like a dumb container, then you are not getting the most out of Django REST framework.

Let’s begin! 🚀

Profiling

Before making any code changes, we first need to talk about profiling.

If performance is important[1], then we cannot tell whether a change is helpful without profiling. To do so, we measure the performance before and after the change, and look at the delta.
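As a rough illustration of "measure before and after, then look at the delta", here is a minimal sketch using only the Python standard library. The functions before_change and after_change are hypothetical stand-ins for the same code path before and after your change; they are not from any real project. Taking the median of several runs is just one reasonable way to dampen noise.

```python
import statistics
import time


def median_seconds(func, repeats=5):
    """Call func `repeats` times and return the median wall-clock duration."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


# Hypothetical stand-ins for the code path before and after a change.
def before_change():
    return sorted(range(200_000), reverse=True)


def after_change():
    return list(range(199_999, -1, -1))


before = median_seconds(before_change)
after = median_seconds(after_change)
delta = after - before  # negative means the change made things faster
print(f"before={before:.4f}s after={after:.4f}s delta={delta:+.4f}s")
```

Whether the delta is meaningful depends on how noisy the measurements are, so it is worth running this more than once before drawing conclusions.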

It can be tempting to “save time” by skipping profiling, and just making a change that you are confident will help. This will only get you so far, and will eventually be counter-productive. Some performance improvements are surprising! Others are fiddly and complex. The result may not always be what you expect.

It saves time in the long run to profile each change as you go. Would you modify some code without checking that the relevant unit tests still pass? Then why would you try to improve performance without profiling?

Hopefully by now I have you on board with the idea that profiling is a good thing!

Django Debug Toolbar

The Django Debug Toolbar is a fantastic companion for Django REST framework. When the toolbar is installed, it shows a floating widget at the top right corner of all pages of your website. Click on the widget to open it, and under the “SQL” tab you can see all queries Django performed to load the page.
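For reference, installing the toolbar (after pip install django-debug-toolbar) mostly comes down to a few settings. The fragment below is a sketch for local development, not a complete settings file: the surrounding apps and middleware are placeholder assumptions, and in a real project you would add the toolbar entries to your existing lists rather than redefine them.

```python
# settings.py (fragment for local development only)
DEBUG = True  # the toolbar is only shown when DEBUG is on

INSTALLED_APPS = [
    "django.contrib.staticfiles",  # the toolbar's assets are served as static files
    "rest_framework",
    "debug_toolbar",
]

MIDDLEWARE = [
    # Place the toolbar middleware as early as possible in the list.
    "debug_toolbar.middleware.DebugToolbarMiddleware",
    "django.middleware.common.CommonMiddleware",
]

# By default the toolbar is only displayed to requests from these addresses.
INTERNAL_IPS = ["127.0.0.1"]
```

The toolbar's own URLs also need to be routed in urls.py, typically with path("__debug__/", include("debug_toolbar.urls")). Check the toolbar's installation documentation for the details that apply to your version.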

Only queries made by Django to render the view on each page are displayed. If your website uses JavaScript to call a REST API endpoint, then any queries triggered by that will not be shown in the widget. However, if that endpoint is part of your Django app, then you can open that URL directly in the browser to see the widget and queries.

For each query in the widget, click the “Expl” button to get the MySQL EXPLAIN output, which can provide more details that are helpful in determining why a query is slow. It is worth taking time to understand the output of EXPLAIN, at least in part. Even a basic understanding of whether the query is using the indexes or not is invaluable.
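To get a feel for what "is the query using an index?" looks like, here is a small self-contained sketch. Note the swap: it uses SQLite's EXPLAIN QUERY PLAN (via Python's built-in sqlite3 module) purely so that it runs without a database server. MySQL's EXPLAIN output is laid out differently: there, the key column names the chosen index (NULL if none), and a type of ALL indicates a full table scan. The book table and idx_book_author index are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO book (author_id, title) VALUES (?, ?)",
    [(i % 100, f"Book {i}") for i in range(1000)],
)

QUERY = "SELECT * FROM book WHERE author_id = 42"


def query_plan(connection, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


# Without an index on author_id, the plan is a scan of the whole table.
plan_without_index = query_plan(conn, QUERY)

# After adding an index, the plan becomes a search using that index.
conn.execute("CREATE INDEX idx_book_author ON book (author_id)")
plan_with_index = query_plan(conn, QUERY)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The same before-and-after habit applies in MySQL: run EXPLAIN, add or adjust an index, then run EXPLAIN again and compare.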

Run queries manually

If you have access to the database, you can open up a shell using django-admin dbshell and run queries directly against the database. By default MySQL prints out how long a query takes, which gives a basic way to compare performance.

Using the dbshell, rather than modifying your app’s Python code, typically allows for a much faster debug cycle when trying out different queries.

If your Django app is using the ORM rather than raw SQL, it is worth using the Django Debug Toolbar to view the generated SQL. Simply viewing the SQL can help shed light on how complex the query is, and help you consider whether to change your approach. Additionally, you can copy the generated SQL and paste it into the dbshell, then quickly iterate on changing the query. If you make a change you want to keep, then you’ll need to adjust your Django code so that the ORM generates the corresponding SQL, or switch to using raw SQL.

Be aware that some queries benefit from caching, and so may be much faster on subsequent runs, giving the illusion of a performance increase! For example, running SELECT COUNT(*) on a big table may take several seconds the first time, and then return instantly the second time.
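One way to avoid being fooled by this is to record the first ("cold") run separately from the later ("warm") runs, and only compare like with like. The sketch below simulates the effect in plain Python: slow_count is a hypothetical stand-in for a query, and functools.lru_cache plays the role of whatever caching the database does. It is not how MySQL caches anything; it only mimics the timing pattern.

```python
import functools
import statistics
import time


@functools.lru_cache(maxsize=None)
def slow_count(table_name):
    """Hypothetical stand-in for a query whose result gets cached."""
    time.sleep(0.05)  # pretend the first execution has to do real work
    return 1_000_000


def timed(func, *args):
    """Return how long a single call to func(*args) takes, in seconds."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


durations = [timed(slow_count, "big_table") for _ in range(5)]
cold = durations[0]                      # first run, nothing cached yet
warm = statistics.median(durations[1:])  # later runs hit the cache
print(f"cold={cold:.4f}s warm={warm:.6f}s")
```

If a "before" measurement was taken cold and the "after" measurement warm, the apparent speed-up says nothing about the change itself.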

For queries which output a single row, or for EXPLAIN, it can be helpful to terminate queries with \G rather than ;. This formats the output in a much clearer manner.

Automated performance tests

This heading could be filled with a whole book!

If you’ve already got unit tests running under CI, why not add some performance tests? They may need to run nightly rather than on every commit, since they are likely to be more time-consuming, but they will provide a valuable history of performance over time. This can effectively be a “regression test” to ensure that you have not accidentally done something to significantly break performance: have the test fail if there is a significant outlier in the result.

Decreases in performance are not the only thing to track. A sudden increase in “performance” may also be a red flag, for example, if a bug was introduced to a REST API endpoint meaning it no longer returns any data, and so returns much faster.
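As a sketch of what "fail on a significant outlier" could mean in practice, the function below flags a new timing that sits too far from the historical mean in either direction, so that suspiciously fast results are caught as well as slow ones. The three-standard-deviation threshold and the sample timings are arbitrary assumptions for illustration; a real suite would need to choose its own rule and store its own history.

```python
import statistics


def is_outlier(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` standard deviations
    away from the mean of `history`, in either direction.

    `history` needs at least two values for a standard deviation to exist.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > threshold


# Hypothetical response times (in seconds) from previous nightly runs.
nightly_seconds = [1.02, 0.98, 1.05, 1.00, 0.97, 1.03]

print(is_outlier(nightly_seconds, 1.01))  # False: within normal variation
print(is_outlier(nightly_seconds, 2.50))  # True: much slower than usual
print(is_outlier(nightly_seconds, 0.05))  # True: suspiciously fast
```

A performance test could then assert that is_outlier returns False for the latest measurement, and fail the build otherwise.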

Django REST framework provides a whole section in the documentation on testing, including multiple clients to help: APIRequestFactory (which extends Django’s RequestFactory class), APIClient (which extends Django’s Client class), and RequestsClient (which parallels the popular requests library). In short, the Django REST framework documentation has plenty of help on the subject.

What next?

The next note will be more practical, with numerous techniques to improve performance, and pitfalls to avoid.

[1] If performance isn’t important now, will it be in the future? How much data and how many users will your system have to deal with a few years down the line? There’s a healthy balance somewhere between “premature optimization” and “planning for the future”.
