Caching in Django to improve site performance

In this article, you will learn to make your Django website faster using caching. We will cover the core concepts of caching, cache stores, and different caching strategies in Django. We will briefly look at techniques for invalidating your cache.

Prerequisites

  • Basic knowledge of Django

LEVEL - 💻 Beginner - Intermediate - We will cover basic concepts and add some examples. This post includes some tips at the end for advanced users.

Caching and Dynamic Frameworks like Django

A cache is a temporary storage area. Caching is the process of storing data in a cache. The objective is to save the result of an expensive calculation so you don't have to compute it again later.
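To make the idea concrete, here is a minimal sketch in plain Python, without Django. The names expensive_square and cached_square are made up for this illustration, and a dictionary stands in for the cache store.

```python
import time

_cache = {}  # key -> (expires_at, value)
calls = {"count": 0}


def expensive_square(n):
    """Stand-in for a slow database query or template render."""
    calls["count"] += 1
    return n * n


def cached_square(n, timeout=60):
    """Return the stored result while it is fresh, else compute and store it."""
    now = time.monotonic()
    entry = _cache.get(n)
    if entry is not None and entry[0] > now:
        return entry[1]  # cache hit: skip the expensive work
    value = expensive_square(n)
    _cache[n] = (now + timeout, value)
    return value


first = cached_square(12)   # computed
second = cached_square(12)  # served from the dictionary
```

The second call never reaches expensive_square. Django's cache framework applies the same pattern, with a configurable store instead of a dictionary.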

A downside of dynamic frameworks like Django is that each time a user requests a page, the web server has to do a fair amount of work: database queries, template rendering, and other business logic. For small websites this is fine, but for bigger sites with higher traffic, it is important to reduce this work as much as possible to improve page load times for your users.

Fortunately, Django has a robust, built-in cache system that lets you control which pages, views or fragments should be saved so they do not have to be calculated for each request.

Where should cached data be stored?

Cached data can be stored in a database, on the filesystem, or directly in memory. Memory-based stores like Memcached and Redis will usually be the fastest and can handle high loads. You can also use database caching, but this works best if you have a fast, well-indexed database server. The choice of storage is important as it affects cache performance. See the Django docs for more information on this.
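As a rough sketch, here are three of the backends that ship with Django, configured side by side. The alias names db and files, the table name, and the directory are placeholders chosen for this example, not recommendations.

```python
# settings.py (sketch): a project needs at least the "default" alias
CACHES = {
    # Per-process memory; handy for development and tests
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    },
    # Database table; create it first with `python manage.py createcachetable`
    "db": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "my_cache_table",  # hypothetical table name
    },
    # Files on disk; the directory must be writable by the web server user
    "files": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        "LOCATION": "/var/tmp/django_cache",  # hypothetical path
    },
}
```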

Setting up the cache in Django

To set up the cache system in Django, add your cache preferences to the CACHES setting in your settings file. We will be using Redis as the cache database in this example.

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379',
    }
}

Notes

  • BACKEND - we are using the RedisCache backend, which is built into Django 4.0 and later. For earlier versions, you will need to install the django-redis library and use its backend instead.
  • LOCATION - the URL of the Redis server. This can be on the same instance as the app or a remote server.

How to use caching

Per site cache

The simplest way to use caching is to cache the entire site. This is done by adding the cache middleware to the MIDDLEWARE setting. This blanket approach may not be appropriate for all sites.

MIDDLEWARE = [
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
]

Notes

  • The order of the middleware matters because each middleware runs at a different point in the request/response cycle. During the request phase, middleware runs from top to bottom; this is when FetchFromCacheMiddleware runs. During the response phase, middleware runs from bottom to top, so UpdateCacheMiddleware must be listed first so that it runs last.
  • You can also set CACHE_MIDDLEWARE_ALIAS, CACHE_MIDDLEWARE_KEY_PREFIX, and CACHE_MIDDLEWARE_SECONDS in the same file.
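For reference, the three middleware settings could look something like this. The values are illustrative assumptions; tune them to your own site.

```python
# settings.py (sketch)
CACHE_MIDDLEWARE_ALIAS = "default"      # which entry in CACHES the middleware uses
CACHE_MIDDLEWARE_SECONDS = 60 * 10      # how long each page stays cached (10 minutes)
CACHE_MIDDLEWARE_KEY_PREFIX = "mysite"  # hypothetical prefix; avoids clashes when sites share a cache
```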

Per view caching

Use this to cache the output of individual views. For example, you could use this to cache the output of your landing pages such as the home page, pricing page, and other marketing pages where site speed is important.

views.py

from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # cache the response for 15 minutes
def pricing_view(request):
    ...  # some expensive calculations here, ending with a returned HttpResponse

urls.py

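from django.urls import path
from django.views.decorators.cache import cache_page
from django.views.generic import TemplateView

from .views import pricing_view

urlpatterns = [
    path('pricing/', pricing_view),
    # cached for 24 hours in the "marketing" cache, which must be defined in CACHES
    path('home/', cache_page(60 * 60 * 24, cache="marketing")(TemplateView.as_view(template_name="home.html"))),
]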

Notes

  • views.py cache_page - Django provides a cache_page decorator that will automatically cache the response for you. The pricing_view in this example is cached for 15 minutes.
  • urls.py - the decorator can also be applied in the URLconf, which is handy for class-based views.
  • options - the cache_page decorator accepts additional keyword arguments:
    • cache - the alias of the cache to use (it must be defined in CACHES).
    • key_prefix - a prefix to be added to the cache key.

Template fragment caching

Fragments of a template can also be cached using the cache template tag. The cache template tag caches the contents of a section for a given period of time.

{% load cache %}
{% cache 500 profile_card request.user.email %}
    .. profile card..
{% endcache %}

Notes

  • load cache - the template tag first has to be loaded
  • {% cache 500 .. %} - the cache template tag accepts at least two arguments: the timeout in seconds and the name of the fragment. Any further arguments, such as request.user.email here, make the cached fragment vary; in this example, the profile card will be cached separately for each user.

Low-level cache API

Django exposes a low-level cache API for when you need more control over what to cache, for example when you want to refresh the cache at specified intervals.

from django.core.cache import cache

def get_expensive_calc_data(force_refresh=False):
    data = cache.get('my_key')
    if data is None or force_refresh:
        data = calculate_expensive_thing()  # placeholder for your own expensive function
        cache.set('my_key', data, timeout=3000)
    return data

def my_view(request):
    data = get_expensive_calc_data()
    ...

Notes

  • get_expensive_calc_data - fetches the data from the cache if it exists; otherwise it does the calculation and saves the result to the cache for future requests.
  • force_refresh - set this flag to True to recompute the value and overwrite the cached result.
  • cache.set(.. - the key should be a string and the value can be any picklable Python object. The timeout argument is optional and defaults to the TIMEOUT option of the cache in the CACHES setting explained above (300 seconds if unset).

Invalidating or refreshing your cache

In this next section, we will look at techniques for refreshing your cached content. Please note that this only covers the server-side caches described above, i.e. data cached by your Django app, and not "downstream" caches like a proxy cache or the browser.

Automated refresh using the timeout flag

Django will automatically expire a cache entry once its timeout (in seconds) has passed, after which the next request recomputes it. A timeout can be specified at all levels of caching, i.e. site, view, template fragment, or low-level API.

Low level API and template fragments

To refresh the cache manually, you need to know the cache key beforehand. If you are using the low-level API, this is straightforward since you set the key yourself.

For template fragment caching, Django provides a method to retrieve a fragment key.

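from typing import List

from django.core.cache import cache
from django.core.cache.utils import make_template_fragment_key

def invalidate_cached_template_fragment(fragment_name: str, vary_on: List) -> bool:
    # Rebuild the key that the {% cache %} tag generated for this fragment
    key = make_template_fragment_key(fragment_name, vary_on)
    # cache.delete returns True if the key existed (Django 3.1+)
    return cache.delete(key)

# use it
invalidate_cached_template_fragment('profile_card', [user.email])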

Notes

  • invalidate_cached_template_fragment - in the spirit of keeping things DRY, we wrap this in a small function that can be called from anywhere to invalidate any template fragment. It returns True if the key was deleted and False if not.

Invalidating cached view content

This one is a bit more complex as cache keys for cached views are generated based on request URL, query, and headers. Django provides a method to get a key based on the request URL and query, but we need to construct a fake request object to fetch the correct key.

utils.py

from typing import Dict, Optional

from django.conf import settings
from django.core.handlers.wsgi import WSGIRequest


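def fake_request(path, vary_headers: Optional[Dict[str, str]] = None):
    """
    Construct a request object that can be used to generate a valid cache key.

    :param path: path of the cached view, e.g. '/home/'
    :param vary_headers: optional request.META entries (e.g. HTTP_ACCEPT_LANGUAGE)
        for views whose cached response varies on request headers
    :return: request
    """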
    request = WSGIRequest(
        {
            "PATH_INFO": path,
            "REQUEST_METHOD": "GET",
            "HTTP_HOST": settings.HOST,
            "wsgi.input": None,
            "wsgi.url_scheme": "http" if settings.DEBUG else "https"
        }
    )
    if vary_headers:
        request.META.update(vary_headers)
    return request

Notes

  • fake_request - constructs a fake request that mimics a real WSGIRequest object.
  • path - the path of the view, for example '/home/'. Include the leading slash so the rebuilt URL matches the one that was cached.
  • settings.HOST - this is a custom setting that you need to add to your settings file. Use your naked domain (e.g. test.com) or, in your dev environment, the host including the port (e.g. localhost:8000).
  • vary_headers - a Vary header defines which request headers a cache mechanism should take into account when building the cache key. Pass them in request.META form, e.g. HTTP_ACCEPT_LANGUAGE. The Django docs on caching cover Vary headers in more detail.

Now we can use Django's get_cache_key utility to look up the cache key for a view and invalidate it.

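from django.core.cache import cache
from django.utils.cache import get_cache_key

from .utils import fake_request


path = '/home/'
request = fake_request(path)
key = get_cache_key(request)  # None if nothing is cached for this request
if key is not None:
    cache.delete(key)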

Notes

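  • This part is simple: we construct the request using our helper from before, look up the cache key, and delete it. get_cache_key returns None if nothing has been cached for that request, so guard against that before deleting. If the view is cached in a non-default cache (like "marketing" above), pass that cache to get_cache_key via its cache argument and delete the key from the same cache.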

You may want to automate this last step. For example, you can wrap this logic in a management command and then trigger it via cron, a webhook, or some other event in your app.

Conclusion

That's it. In this tutorial, we have covered the basics of caching, cache stores, and different caching strategies in Django. We have also looked at techniques for refreshing the cache manually.

Further Reading

Django Docs on caching - Probably the only source you will need when getting started.

Redis docs - Our preferred cache database for production environments. The docs will be useful if you are installing Redis on a VPS for example or debugging.

Downstream caches - Read this for a more complete picture of how caching works downstream (downstream caching is not covered in this tutorial).

Copyright © 2022 www.advantch.com