Challenges when building high profile editorial sites

BUILDING HIGH PROFILE
EDITORIAL SITES
YANN MALET
2014.DJANGOCON.EU
MAY 2014

ABOUT THIS TALK
● It comes after
− Data Herding: How to Shepherd Your Flock Through
Valleys of Darkness (2010)
− Breaking down the process of building a custom
CMS (2010)
− Stop Tilting at Windmills - Spotting Bottlenecks
(2011)

AGENDA
● Foreword
● Multi layer cache to protect your database
● Image management on responsive site
● Devops

HIGH PERFORMANCE
Django is web scale...

VARNISH
● Varnish Cache is a web application accelerator
− aka caching HTTP reverse proxy
− 10 – 1000 times faster
● This is hard stuff; don't try to reinvent this wheel

VARNISH: TIPS AND TRICKS
● Strip cookies
● Saint Mode
● Custom error page is better than Guru Meditation

STRIP COOKIES
● Increasing hit rate is all about reducing
− Vary: on parameters
● Accept-Language
● Cookie

STRIP COOKIES
sub vcl_recv {
# unless sessionid/csrftoken is in the request,
# don't pass ANY cookies (referral_source, utm, etc)
if (req.request == "GET" &&
(req.url ~ "^/static" ||
(req.http.cookie !~ "sessionid" &&
req.http.cookie !~ "csrftoken"))) {
remove req.http.Cookie;
}
...
}
sub vcl_fetch {
# pass through for anything with a session/csrftoken set
if (beresp.http.set-cookie ~ "sessionid" ||
beresp.http.set-cookie ~ "csrftoken") {
return (pass);
} else {
return (deliver);
}
...
}
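
The cookie rule in vcl_recv above can be mirrored in plain Python so the decision can be unit-tested before it goes into Varnish. This is only a sketch of the same condition; the function name should_strip_cookies is hypothetical and not part of the talk.

```python
def should_strip_cookies(method, url, cookie_header):
    """Return True when the vcl_recv rule above would remove the Cookie header."""
    if method != "GET":
        return False
    # Static assets never need cookies
    if url.startswith("/static"):
        return True
    cookie_header = cookie_header or ""
    # VCL's !~ is a regex search, so a substring test behaves the same here
    return "sessionid" not in cookie_header and "csrftoken" not in cookie_header
```

Tracking cookies such as utm parameters are dropped, while any request carrying a session or CSRF cookie keeps its cookies and reaches the backend unchanged.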

VARNISH: SAINT MODE
● Varnish Saint Mode lets you serve stale content
from cache, even when your backend servers are
unavailable.
− http://lincolnloop.com/blog/varnish-saint-mode/

VARNISH: SAINT MODE 1/2
# /etc/varnish/default.vcl
backend default {
.host = "127.0.0.1";
.port = "8000";
.saintmode_threshold = 0;
.probe = { .url = "/"; .interval = 1s; .timeout = 1s;
.window = 5; .threshold = 3;}
}
sub vcl_recv {
if (req.backend.healthy) {
set req.grace = 1h;
set req.ttl = 5s;
} else {
# Accept serving stale object (extend TTL by 6h)
set req.grace = 6h;
}
}

VARNISH: SAINT MODE 2/2
sub vcl_fetch {
# keep all objects for 6h beyond their TTL
set beresp.grace = 6h;
# If we fetch a 500, serve stale content instead
if (beresp.status == 500 ||
beresp.status == 502 ||
beresp.status == 503) {
set beresp.saintmode = 30s;
return(restart);
}
}

VARNISH: SAINT MODE
.url: Format the default request with this URL.
.timeout: How fast the probe must finish, you must specify a time
unit with the number, such as “0.1 s”, “1230 ms” or even “1 h”.
.interval: How long to wait between polls, you must specify a
time unit here also. Notice that this is not a ‘rate’ but an ‘interval’. The
lowest poll rate is (.timeout + .interval).
.window: How many of the latest polls to consider when determining
if the backend is healthy.
.threshold: How many of the .window last polls must be good for
the backend to be declared healthy.
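
How .window and .threshold interact can be sketched in a few lines of Python. This is a simplified model, not Varnish's implementation; the class name ProbeWindow is hypothetical, and the defaults are the values from the backend definition shown earlier (.window = 5, .threshold = 3).

```python
from collections import deque


class ProbeWindow:
    """Simplified model of a backend health probe."""

    def __init__(self, window=5, threshold=3):
        self.threshold = threshold
        # Only the latest `window` poll results are kept
        self.results = deque(maxlen=window)

    def record(self, ok):
        """Store the outcome of one poll (True = good response)."""
        self.results.append(bool(ok))

    @property
    def healthy(self):
        # Healthy when at least `threshold` of the remembered polls were good
        return sum(self.results) >= self.threshold
```

With these values a backend needs three good polls before it is considered healthy, and three failures in a row among the last five polls mark it as sick again.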

VARNISH: CUSTOM ERROR PAGE
sub vcl_error {
...
# Otherwise, return the custom error page
set obj.http.Content-Type = "text/html; charset=utf-8";
synthetic std.fileread("/var/www/example_com/varnish_error.html");
return(deliver);
}
● Use a nicely formatted error page instead of the
default white Guru Meditation page

INEVITABLE QUOTE
„THERE ARE ONLY TWO HARD THINGS IN
COMPUTER SCIENCE:
CACHE INVALIDATION AND NAMING
THINGS, AND OFF-BY-ONE ERRORS.“
– PHIL KARLTON

CACHING STRATEGY
● Russian doll caching
● Randomize your cache invalidation for the
HTML cache
● Cache buster URL for your HTML cache
● Cache database queries
● More resilient cache backend

RUSSIAN DOLL CACHING
● Nested cache with increasing TTL as you walk down
{% cache MIDDLE_TTL "article_list"
request.GET.page last_article.id last_article.last_modified %}
{% include "includes/article/list_header.html" %}
<div class="article-list">
{% for article in article_list %}
{% cache LONG_TTL "article_list_teaser_" article.id article.last_modified %}
{% include "includes/article/article_teaser.html" %}
{% endcache %}
{% endfor %}
</div>
{% endcache %}
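
Why the nested keys invalidate themselves can be shown with a plain dict standing in for memcached. This is a sketch under assumptions: the key format is simplified (Django builds its real fragment keys differently), cached_fragment is a hypothetical helper, and "last article" is assumed to mean the most recently modified one.

```python
cache = {}    # stands in for memcached
renders = []  # records which fragments were actually rendered


def cached_fragment(name, vary_on, render):
    """Return the cached fragment, rendering it only on a cache miss."""
    key = ":".join([name] + [str(v) for v in vary_on])
    if key not in cache:
        renders.append(name)
        cache[key] = render()
    return cache[key]


def render_teaser(article):
    # Inner doll: the key changes whenever the article is modified
    return cached_fragment(
        "article_list_teaser_",
        (article["id"], article["last_modified"]),
        lambda: "<li>%s</li>" % article["title"],
    )


def render_list(page, articles):
    # Outer doll: keyed on the most recently modified article (assumption)
    newest = max(articles, key=lambda a: a["last_modified"])
    return cached_fragment(
        "article_list",
        (page, newest["id"], newest["last_modified"]),
        lambda: "<ul>%s</ul>" % "".join(render_teaser(a) for a in articles),
    )
```

When one article changes, the outer list is rebuilt, but every untouched teaser is still served from the cache, so the rebuild stays cheap.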

RUSSIAN DOLL CACHING
It gets faster as traffic increases

RANDOMIZED CACHE TTL
● Do not invalidate all the `X_TTL` at the same time
− Modify cache templatetag: TTL +/- 20%
● Fork the {% cache … %} templatetag
try:
    expire_time = int(expire_time)
    # randint() needs integer bounds, so truncate the +/- 20% window
    expire_time = randint(int(expire_time * 0.8), int(expire_time * 1.2))
except (ValueError, TypeError):
    raise TemplateSyntaxError(
        '"cache" tag got a non-integer timeout value: %r' % expire_time)
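
The same jitter can be written as a standalone function, which is easier to test than a forked template tag. A minimal sketch; the name randomized_ttl and the spread parameter are hypothetical.

```python
import random


def randomized_ttl(expire_time, spread=0.2):
    """Return expire_time moved by a random amount within +/- spread."""
    expire_time = int(expire_time)  # raises ValueError on non-numeric input
    low = int(round(expire_time * (1 - spread)))
    high = int(round(expire_time * (1 + spread)))
    # randint() is inclusive on both ends and requires integers
    return random.randint(low, high)
```

Fragments cached at the same moment with the same nominal TTL then expire at different times, which spreads the regeneration load instead of concentrating it.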

CENTRAL TTL DEFINITION
● Context processor to set TTL
− SHORT_TTL
− MIDDLE_TTL
− LONG_TTL
− FOR_EVER_TTL (* not really)
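
A context processor for these names can be very small. The sketch below assumes the values (in seconds), since the talk does not state them, and the function name cache_ttl is hypothetical.

```python
# Assumed values in seconds; the talk does not give the real numbers.
TTLS = {
    "SHORT_TTL": 60,
    "MIDDLE_TTL": 60 * 10,
    "LONG_TTL": 60 * 60,
    "FOR_EVER_TTL": 60 * 60 * 24 * 30,  # "for ever" is still finite
}


def cache_ttl(request):
    """Context processor: expose the TTL constants to every template."""
    # Return a copy so templates cannot mutate the shared definition
    return dict(TTLS)
```

Once the function is registered as a context processor in the Django settings, templates can write {% cache LONG_TTL ... %} and every TTL is tuned in one place.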

RESILIENT CACHE BACKEND
● Surviving node outages is not included
− Wrap the Django cache backend in try / except
− You might also want to report it in New Relic
● Fork Django cache backend
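
The try / except idea can be sketched as a wrapper around any object with get and set methods. This is an illustration only: the talk forks the Django backend rather than wrapping it, the class name ResilientCache is hypothetical, and a real version would catch the client library's specific exceptions and report them to monitoring.

```python
import logging

logger = logging.getLogger(__name__)


class ResilientCache:
    """Treat a cache outage as a cache miss instead of an error page."""

    def __init__(self, backend):
        self.backend = backend

    def get(self, key, default=None):
        try:
            return self.backend.get(key, default)
        except Exception:  # broad on purpose in this sketch
            logger.exception("cache get failed for %r", key)
            return default

    def set(self, key, value, timeout=None):
        try:
            self.backend.set(key, value, timeout)
            return True
        except Exception:
            logger.exception("cache set failed for %r", key)
            return False
```

When a cache node goes down, pages get slower because fragments are rendered again, but requests keep succeeding.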

CACHE BUSTER URL
● http://example.com/*/?PURGE_CACHE_HTML
● This URL
− traverses your stack
− purges the HTML cache fragment
− generates fresh one
● Fork the {% cache … %} templatetag
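
The purge-then-regenerate behaviour can be sketched with a dict as the cache. The helper name get_fragment is hypothetical, and how the forked template tag reads the query string is an assumption; only the parameter name PURGE_CACHE_HTML comes from the slide.

```python
PURGE_PARAM = "PURGE_CACHE_HTML"


def get_fragment(cache, key, render, query_params):
    """Return a cached fragment, regenerating it when the purge flag is set."""
    if PURGE_PARAM in query_params:
        # Drop the stale fragment so the next step renders a fresh one
        cache.pop(key, None)
    if key not in cache:
        cache[key] = render()
    return cache[key]
```

Because the fresh fragment is stored right away, the next regular visitor gets the new version from the cache.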

CACHING DB QUERIES
● Johnny cache
− It is a middleware, so there are surprising side effects
− If you change the DB outside the request / response cycle, call enable() first
# johnny/cache.py
def enable():
    """Enable johnny-cache, for use in scripts, management
    commands, async workers, or other code outside the Django
    request flow."""
    get_backend().patch()

MULTIPLE CACHE BACKENDS
CACHES = {
    'default': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'OPTIONS': cache_opts,
        'VERSION': 1},
    'html': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'TEMPLATETAG_CACHE': True,
        'VERSION': 1},
    'session': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'VERSION': 1,
        'OPTIONS': cache_opts,},
    'johnny': {
        'BACKEND': 'myproject.apps.core.backends.cache.JohnnyPyLibMCCache',
        'JOHNNY_CACHE': True,
        'VERSION': 1}
}

CACHED_DB SESSION
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
SESSION_CACHE_ALIAS = "session"

RESPONSIVE DESIGN IMPACTS
● 3x more image sizes
− Desktop
− Tablet
− Mobile

IMAGE MANAGEMENT
● Django-filer
● Easy-thumbnails
● Cloudfiles (cloud containers)
● Assumption of fast & reliable disk should be forgotten
− The software stack is not helping, a lot of work is left to you
● Forked
− Django-filer (fork)
− Easy-thumbnails (fork - very close to being able to drop it)
− Django-cumulus (81 Forks)
− Monkey patch pyrax
− ...
Heein!!!

DJANGO-CUMULUS
● The truth is much worse
− Log everything from the swiftclient
● Target 0 calls to the API and DB on a hot page
− The main repo is getting better ...
'loggers': {
    ...
    'django.db': {
        'handlers': ['console'],
        'level': 'DEBUG',
        'propagate': True,
    },
    'swiftclient': {
        'handlers': ['console'],
        'level': 'DEBUG',
        'propagate': True,
    },

DJANGO-CUMULUS
● Django storage backend for Cloudfiles from Rackspace
− Be straight to the point when talking to slow API
diff --git a/cumulus/storage.py b/cumulus/storage.py
@@ -201,6 +202,19 @@ class SwiftclientStorage(Storage):
...
+ def save(self, name, content):
+ """
+ Don't check for an available name before saving, just overwrite.
+ """
+ # Get the proper name for the file, as it will actually be saved.
+ if name is None:
+ name = content.name
+ name = self._save(name, content)
+ # Store filenames with forward slashes, even on Windows
+ return force_text(name.replace('\\', '/'))

DJANGO-CUMULUS
Trust unreliable API at scale
diff --git a/cumulus/storage.py b/cumulus/storage.py
def _get_object(self, name):
"""
Helper function to retrieve the requested Object.
"""
- if self.exists(name):
+ try:
return self.container.get_object(name)
+ except pyrax.exceptions.NoSuchObject as err:
+ pass
def exists(self, name):
"""
Returns True if a file referenced by the given name already
exists in the storage system, or False if the name is
available for a new file.
"""
- return name in self.container.get_object_names()
+ return bool(self._get_object(name))
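
The effect of this patch can be sketched with a fake container that counts API calls. Everything here is a stand-in: FakeContainer, NoSuchObject and Storage are hypothetical classes that only imitate the shape of the patched methods, not the real pyrax or django-cumulus code.

```python
class NoSuchObject(Exception):
    """Stand-in for pyrax.exceptions.NoSuchObject."""


class FakeContainer:
    """Imitates a remote container and counts the API round trips."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.api_calls = 0

    def get_object(self, name):
        self.api_calls += 1
        try:
            return self.objects[name]
        except KeyError:
            raise NoSuchObject(name)


class Storage:
    def __init__(self, container):
        self.container = container

    def _get_object(self, name):
        # Ask for the object directly instead of listing every name first
        try:
            return self.container.get_object(name)
        except NoSuchObject:
            return None

    def exists(self, name):
        return bool(self._get_object(name))
```

Each existence check costs exactly one API call, whether or not the object is there, instead of a listing of the whole container followed by a fetch.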

PATCH PYRAX
● Assume the best
− Reduce the auth attempts
− Reduce the connection timeout
def patch_things():
    # Automatically generate thumbnails for all aliases
    models.signals.post_save.connect(queue_thumbnail_generation)
    # Force the retries for pyrax to 1, to stop the request doubling
    pyrax.cf_wrapper.client.AUTH_ATTEMPTS = 1
    pyrax.cf_wrapper.client.CONNECTION_TIMEOUT = 2

GENERATE THE THUMBS
● Generate the thumbs as soon as possible
− post save signals that offload to a task
− easy-thumbnails
def queue_thumbnail_generation(sender, instance, **kwargs):
    """
    Iterate over the sender's fields, and if there
    is a FileField instance (or a subclass like
    MultiStorageFileField) send the instance to a
    task to generate all the thumbnails defined
    in settings.THUMBNAIL_ALIASES.
    """
    …

PICTUREFILL.JS
… A Responsive Images approach that you can
use today that mimics the proposed picture
element using spans...
− Old API demonstrated (version 1.2.1)
<span data-picture data-alt="A giant stone face in Angkor Thom, Cambodia">
<span data-src="small.jpg"></span>
<span data-src="medium.jpg" data-media="(min-width: 400px)"></span>
<span data-src="large.jpg" data-media="(min-width: 800px)"></span>
<span data-src="extralarge.jpg" data-media="(min-width: 1000px)"></span>

<noscript>
<img src="small.jpg" alt="A giant stone face in Angkor Thom, Cambodia">
</noscript>
</span>

PUTTING IT ALL TOGETHER 1/2
● Iterate through article_list
● Nested cache

{% extends "base.html" %}
{% load image_tags cache_tags pagination_tags %}
{% block content %}
{% cache MIDDLE_TTL "article_list_" category author tag request.GET.page all_pages %}
<div class="article-list archive-list ">
{% for article in object_list %}
{% cache LONG_TTL "article_teaser_" article.id article.modified %}
{% include "newsroom/includes/article_teaser.html" with columntype="categorylist" %}
{% endcache %}
{% endfor %}
</div>
{% endcache %}

PUTTING IT ALL TOGETHER 2/2

{% load image_tags %}
<section class="blogArticleSection">
{% if article.image %}
<a href="{{ article.get_absolute_url }}" class="thumbnail">
<span data-picture data-alt="{{ article.image.default_alt_text }}">
<span data-src="{{ article.image|thumbnail_url:"large" }}"></span>
<span data-src="{{ article.image|thumbnail_url:"medium" }}" data-media="(min-width: 480px)"></span>
<span data-src="{{ article.image|thumbnail_url:"small" }}" data-media="(min-width: 600px)"></span>
<noscript>
<img src="{{ article.image|thumbnail_url:"small" }}" alt="{{ article.image.default_alt_text }}">
</noscript>
</span>
</a>
{% endif %}
...
Use Picturefill to render your images

DEVOPS
● Configuration management
● Single command deployment for all
environments
● Settings parity

CONFIGURATION MANAGEMENT
● Pick one that fits your brain & skillset
− Puppet
− Chef
− Ansible
− Salt
● At Lincoln Loop we are using Salt
− One master per project
− Minion installed on all the cloud servers

SALT
● Provision & deploy a server role
● +X app servers to absorb a traffic spike
● Replace an unsupported OS
● Update a package
● Run a one-liner command
− Restart a service on all instances
● Varnish, memcached, ...
− Check the version

SINGLE COMMAND DEPLOYMENT
● One-liner or you will get it wrong
● Consistency for each role is critical
− Avoid endless debugging of pseudo random issue

SETTINGS PARITY
● Is the utopia you want to tend toward, but …
− There are some differences
● Avoid logic in settings.py
● Fetch data from external sources: .env

SETTINGS.PY READS FROM .ENV
import os
import ConfigParser  # Python 2 module name; it is configparser on Python 3

from superproject.settings.base import *

TEMPLATE_LOADERS = (
    ('django.template.loaders.cached.Loader', TEMPLATE_LOADERS),)

config = ConfigParser.ConfigParser()
config.read(os.path.abspath(VAR_ROOT + "/../.env"))

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': config.get("mysql", "mysql_name"),
        'USER': config.get("mysql", "mysql_user"),
        'PASSWORD': config.get("mysql", "mysql_password"),
        'HOST': config.get("mysql", "mysql_host"),
        'PORT': config.get("mysql", "mysql_port"),
    }
}
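
The .env file read above is an INI-style file with a [mysql] section. The sketch below shows a matching file and the same lookups, written for Python 3's configparser; the sample values and the helper name database_settings are hypothetical, and only the section and option names come from the slide.

```python
import configparser

# Hypothetical contents of the .env file; the values are placeholders.
ENV_EXAMPLE = """
[mysql]
mysql_name = newsroom
mysql_user = newsroom_app
mysql_password = change-me
mysql_host = 127.0.0.1
mysql_port = 3306
"""


def database_settings(env_text):
    """Build a DATABASES dict from the text of an INI-style .env file."""
    config = configparser.ConfigParser()
    config.read_string(env_text)
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": config.get("mysql", "mysql_name"),
            "USER": config.get("mysql", "mysql_user"),
            "PASSWORD": config.get("mysql", "mysql_password"),
            "HOST": config.get("mysql", "mysql_host"),
            "PORT": config.get("mysql", "mysql_port"),
        }
    }
```

The settings module stays identical across environments, and only the .env file differs from one server to the next.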

CONCLUSION
● Multi-layer Cache to protect your database
− Varnish
− Russian doll cache for the HTML fragments
● Smart key naming and invalidation condition
● Cache buster URL
● Image management
− Harder on high traffic responsive site
− Software stack not mature
● Devops
− Configuration management is a must
− Try to have settings parity between your environments

HIGH PERFORMANCE DJANGO
Kickstarter
http://lloop.us/hpd

A WORD ABOUT LEGACY MIGRATION
● This is often the hardest part to estimate
− Huge volume of data
− Often inconsistent
− Unknown implicit business logic
● At scale if something can go wrong it will
● It always takes longer

REUSING PUBLISHED
APPLICATIONS
● Careful review before adding an external requirement
− Read the code
● Best practice
● Security audit
− Can operate at your targeted scale
− In line with the rest of your project
● It is not a binary choice you can
− extract a very small part
− write your own version based on what you learned
